Class Mouth
java.lang.Object
org.openoffice.da.comp.w2lcommon.tex.tokenizer.Mouth
The Mouth is the main class of this package. It is a tokenizer for TeX files: According to "The TeXbook", the "eyes" and "mouth" of TeX are responsible for turning the input to TeX into a sequence of tokens. We are not going to reimplement TeX, but rather provide a service for parsing high-level languages based on TeX (e.g. LaTeX, ConTeXt). For this reason the tokenizer deviates slightly from TeX: We do not read a stream of bytes but rather a stream of characters (which makes no difference for ASCII files).
In tribute to Donald E. Knuth's digestive metaphors, we divide the process into four levels:
- The parser should provide a pair of glasses to translate the stream of bytes into a stream of characters
- The eyes see the stream of characters as a sequence of lines
- The mouth chews a bit on the characters to turn them into tokens
- The tongue reports the "taste" of the token to the parser
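The first two levels can be sketched with plain JDK classes. This is only an illustration of the idea, not the code used by this package: the class name GlassesAndEyesSketch and its methods glasses and eyes are hypothetical, and the choice of UTF-8 in main is an assumption made for the example.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not part of the tokenizer package.
class GlassesAndEyesSketch {

    // "Glasses": decode a stream of bytes into a stream of characters.
    static Reader glasses(InputStream bytes, Charset charset) {
        return new InputStreamReader(bytes, charset);
    }

    // "Eyes": view the stream of characters as a sequence of lines.
    static List<String> eyes(Reader chars) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(chars)) {
            String line;
            while ((line = in.readLine()) != null) {
                lines.add(line); // line terminators are already stripped
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        byte[] input = "\\section{Intro}\nHello world\n".getBytes(StandardCharsets.UTF_8);
        Reader chars = glasses(new ByteArrayInputStream(input), StandardCharsets.UTF_8);
        for (String line : eyes(chars)) {
            System.out.println("[" + line + "]");
        }
    }
}
```

In this picture the parser owns the glasses: it picks the character encoding and hands a ready-made character stream to the constructor of the Mouth.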
Constructor Summary
Mouth(reader): Construct a new Mouth based on a character stream
Method Summary
Modifier and Type | Method | Description
--- | --- | ---
 | getCatcodes() | Get the currently used catcode table
char | getEndlinechar() | Return the current value of the \endlinechar (the character added to the end of each input line)
 | getToken() | Get the next token
 | getTokenObject() | Return the object used to store the current token (the "tongue" of TeX).
void | setCatcodes(CatcodeTable catcodes) | Set the catcode table.
void | setEndlinechar(char c) | Set a new \endlinechar (the character added to the end of each input line).
Constructor Details
Mouth
Construct a new Mouth based on a character stream.
Parameters:
reader - the character stream to tokenize
Throws:
IOException - if we fail to read the character stream
Method Details
getCatcodes
Get the currently used catcode table.
Returns:
the table
setCatcodes
Set the catcode table. The catcode table can be changed at any time during tokenization.
Parameters:
catcodes - the table
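To see why changing the catcode table during tokenization matters, consider how the length of a control word depends on which characters currently count as letters. The sketch below is a simplified stand-in, not the CatcodeTable class of this package: the class CatcodeSketch, its methods and its defaults are hypothetical, and only the numeric category codes follow TeX's standard numbering.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a catcode table; the real CatcodeTable API may differ.
class CatcodeSketch {
    // A few of TeX's sixteen category codes (standard numbering).
    static final int ESCAPE = 0;
    static final int BEGIN_GROUP = 1;
    static final int END_GROUP = 2;
    static final int LETTER = 11;
    static final int OTHER = 12;
    static final int COMMENT = 14;

    private final Map<Character, Integer> codes = new HashMap<>();

    CatcodeSketch() {
        // Simplified defaults chosen for this sketch.
        codes.put('\\', ESCAPE);
        codes.put('{', BEGIN_GROUP);
        codes.put('}', END_GROUP);
        codes.put('%', COMMENT);
    }

    void set(char c, int catcode) {
        codes.put(c, catcode);
    }

    int get(char c) {
        Integer code = codes.get(c);
        if (code != null) {
            return code;
        }
        boolean asciiLetter = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
        return asciiLetter ? LETTER : OTHER;
    }

    // Count the letters of a control word; start is the index just after the escape character.
    static int controlWordLength(String line, int start, CatcodeSketch table) {
        int i = start;
        while (i < line.length() && table.get(line.charAt(i)) == LETTER) {
            i++;
        }
        return i - start;
    }

    public static void main(String[] args) {
        CatcodeSketch table = new CatcodeSketch();
        String line = "\\make@title";
        System.out.println(controlWordLength(line, 1, table)); // '@' is "other": name ends before it
        table.set('@', LETTER);                                // similar in effect to \makeatletter
        System.out.println(controlWordLength(line, 1, table)); // '@' is a letter: name spans the whole word
    }
}
```

A parser for LaTeX packages would make a change of this kind when it meets a command that redefines catcodes, which is why the table is not fixed at construction time.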
getEndlinechar
public char getEndlinechar()
Return the current value of the \endlinechar (the character added to the end of each input line).
Returns:
the character
setEndlinechar
public void setEndlinechar(char c)
Set a new \endlinechar (the character added to the end of each input line). The character can be changed at any time during tokenization.
Parameters:
c - the character
getTokenObject
Return the object used to store the current token (the "tongue" of TeX). The same object is reused for all tokens, so for convenience the parser can keep a reference to the object. If on the other hand the parser needs to store a token list, it must explicitly clone all tokens.
Returns:
the token
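The consequence of reusing one token object can be shown with a small self-contained sketch. Everything in it is hypothetical: TokenSketch and ReusingTokenizerSketch are stand-ins written for this example, the whitespace splitting is not how TeX tokenizes, and returning null at the end of input is an assumption of the sketch, not a statement about the Mouth.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical mutable token; the real token class and its methods may differ.
class TokenSketch implements Cloneable {
    String text = "";

    @Override
    public TokenSketch clone() {
        try {
            return (TokenSketch) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e); // cannot happen, we implement Cloneable
        }
    }
}

// Hypothetical tokenizer that overwrites one shared token on every call.
class ReusingTokenizerSketch {
    private final String[] words;
    private int next = 0;
    private final TokenSketch shared = new TokenSketch();

    ReusingTokenizerSketch(String input) {
        words = input.split(" ");
    }

    TokenSketch getTokenObject() {
        return shared;
    }

    // Returns the shared token, or null at end of input (an assumption of this sketch).
    TokenSketch getToken() {
        if (next >= words.length) {
            return null;
        }
        shared.text = words[next++];
        return shared;
    }

    // Collect all tokens into a list, with or without cloning, and report their texts.
    static List<String> collect(String input, boolean cloneTokens) {
        ReusingTokenizerSketch tokenizer = new ReusingTokenizerSketch(input);
        List<TokenSketch> stored = new ArrayList<>();
        TokenSketch token;
        while ((token = tokenizer.getToken()) != null) {
            stored.add(cloneTokens ? token.clone() : token);
        }
        List<String> texts = new ArrayList<>();
        for (TokenSketch t : stored) {
            texts.add(t.text);
        }
        return texts;
    }

    public static void main(String[] args) {
        System.out.println(collect("a b c", false)); // every entry is the same, last-written object
        System.out.println(collect("a b c", true));  // clones keep their own text
    }
}
```

Without cloning, every list entry refers to the same object and therefore shows the text of the last token read; with cloning, each entry keeps the text it had when it was stored.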
getToken
Get the next token.
Returns:
the token (for convenience; the same object is returned by getTokenObject()).
Throws:
IOException - if we fail to read the underlying stream