public abstract class Tokenizer extends TokenStream
This is an abstract class; subclasses must override TokenStream.incrementToken().
NOTE: Subclasses overriding TokenStream.incrementToken() must call AttributeSource.clearAttributes() before setting attributes. Subclasses overriding TokenStream.next(Token) must call Token.clear() before setting Token attributes.
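The reason for the clear-before-set rule can be illustrated with a small self-contained model. This is a sketch only, not Lucene code: `ToyAttributes`, `ToyWhitespaceTokenizer`, `ToyTokenizerDemo`, and the `clearFirst` flag are hypothetical stand-ins invented for illustration. The toy tokenizer sets its type attribute only when a token is numeric, so skipping the clear step lets values from the previous token leak into the next one.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a stream's attribute state (not a Lucene class).
class ToyAttributes {
    final StringBuilder term = new StringBuilder();
    String type = "word"; // default value restored by clear()

    void clear() {
        term.setLength(0);
        type = "word";
    }
}

// Hypothetical whitespace tokenizer modelled on the contract described above.
class ToyWhitespaceTokenizer {
    final ToyAttributes attrs = new ToyAttributes();
    private final Reader input;
    private final boolean clearFirst; // assumption: a switch to show both behaviours

    ToyWhitespaceTokenizer(Reader input, boolean clearFirst) {
        this.input = input;
        this.clearFirst = clearFirst;
    }

    boolean incrementToken() throws IOException {
        if (clearFirst) {
            attrs.clear(); // the step the NOTE requires before setting attributes
        }
        int c;
        while ((c = input.read()) != -1 && Character.isWhitespace(c)) {
            // skip leading whitespace
        }
        if (c == -1) {
            return false; // end of input
        }
        do {
            attrs.term.append((char) c);
        } while ((c = input.read()) != -1 && !Character.isWhitespace(c));
        if (allDigits(attrs.term)) {
            attrs.type = "num"; // only set for numeric tokens
        }
        return true;
    }

    private static boolean allDigits(CharSequence s) {
        for (int i = 0; i < s.length(); i++) {
            if (!Character.isDigit(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }
}

class ToyTokenizerDemo {
    // Collects "term/type" for every token in the text.
    static List<String> tokens(String text, boolean clearFirst) {
        try {
            ToyWhitespaceTokenizer t =
                new ToyWhitespaceTokenizer(new StringReader(text), clearFirst);
            List<String> out = new ArrayList<>();
            while (t.incrementToken()) {
                out.add(t.attrs.term + "/" + t.attrs.type);
            }
            return out;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(tokens("42 foo", true));  // prints [42/num, foo/word]
        System.out.println(tokens("42 foo", false)); // prints [42/num, 42foo/num]
    }
}
```

In the second call the term text accumulates and the stale "num" type survives, which is the kind of leak that clearing first is meant to prevent.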
Nested classes/interfaces inherited from class AttributeSource: AttributeSource.AttributeFactory, AttributeSource.State
Modifier and Type | Field and Description |
---|---|
protected java.io.Reader | input: The text source for this Tokenizer. |
Modifier | Constructor and Description |
---|---|
protected | Tokenizer(): Construct a tokenizer with null input. |
protected | Tokenizer(AttributeSource.AttributeFactory factory): Construct a tokenizer with null input using the given AttributeFactory. |
protected | Tokenizer(AttributeSource.AttributeFactory factory, java.io.Reader input): Construct a token stream processing the given input using the given AttributeFactory. |
protected | Tokenizer(AttributeSource source): Construct a tokenizer with null input using the given AttributeSource. |
protected | Tokenizer(AttributeSource source, java.io.Reader input): Construct a token stream processing the given input using the given AttributeSource. |
protected | Tokenizer(java.io.Reader input): Construct a token stream processing the given input. |
Modifier and Type | Method and Description |
---|---|
void | close(): By default, closes the input Reader. |
protected int | correctOffset(int currentOff): Return the corrected offset. |
void | reset(java.io.Reader input): Expert: Reset the tokenizer to a new reader. |
Methods inherited from class TokenStream: end, getOnlyUseNewAPI, incrementToken, next, next, reset, setOnlyUseNewAPI
Methods inherited from class AttributeSource: addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, restoreState, toString
protected Tokenizer()
protected Tokenizer(java.io.Reader input)
protected Tokenizer(AttributeSource.AttributeFactory factory)
protected Tokenizer(AttributeSource.AttributeFactory factory, java.io.Reader input)
protected Tokenizer(AttributeSource source)
protected Tokenizer(AttributeSource source, java.io.Reader input)
public void close() throws java.io.IOException
Overrides: close in class TokenStream
Throws: java.io.IOException
protected final int correctOffset(int currentOff)
Return the corrected offset. If input is a CharStream subclass, this method calls CharStream.correctOffset(int); otherwise it returns currentOff.
Parameters: currentOff - offset as seen in the output
See Also: CharStream.correctOffset(int)
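The dispatch described for correctOffset can be sketched in a self-contained form. This is a model under stated assumptions, not the Lucene source: `ToyCharStream` stands in for CharStream, and `PrefixStrippedStream` and `OffsetDemo` are hypothetical names. The example assumes a filter that removed a fixed-length prefix, so every offset seen in the output has to be shifted back by that length.

```java
import java.io.FilterReader;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical stand-in for CharStream: a Reader that can map an offset in its
// output back to an offset in the original text.
abstract class ToyCharStream extends FilterReader {
    ToyCharStream(Reader in) {
        super(in);
    }

    abstract int correctOffset(int currentOff);
}

// Assumed example filter: models a stream whose first `stripped` characters of
// original text were removed before the tokenizer saw it.
class PrefixStrippedStream extends ToyCharStream {
    private final int stripped;

    PrefixStrippedStream(Reader in, int stripped) {
        super(in);
        this.stripped = stripped;
    }

    @Override
    int correctOffset(int currentOff) {
        return currentOff + stripped;
    }
}

class OffsetDemo {
    // Mirrors the documented rule: delegate when the input knows how to
    // correct offsets, otherwise return the offset unchanged.
    static int correctOffset(Reader input, int currentOff) {
        if (input instanceof ToyCharStream) {
            return ((ToyCharStream) input).correctOffset(currentOff);
        }
        return currentOff;
    }

    public static void main(String[] args) {
        Reader plain = new StringReader("abc");
        Reader filtered = new PrefixStrippedStream(new StringReader("abc"), 4);
        System.out.println(correctOffset(plain, 3));    // prints 3
        System.out.println(correctOffset(filtered, 3)); // prints 7
    }
}
```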
public void reset(java.io.Reader input) throws java.io.IOException
Throws: java.io.IOException
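Reusing one instance across several readers via reset might look like the following. This is a minimal sketch under assumptions: `ReusableCounter` and `ReuseDemo` are hypothetical stand-ins rather than Lucene classes, and the model assumes that reset only swaps the input reader while close closes the current one.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical stand-in (not a Lucene class): counts whitespace-separated tokens.
class ReusableCounter {
    private Reader input; // plays the role of the protected `input` field

    ReusableCounter(Reader input) {
        this.input = input;
    }

    // Assumption: resetting simply points the same instance at a new reader.
    void reset(Reader input) {
        this.input = input;
    }

    // Closes the current input reader.
    void close() throws IOException {
        input.close();
    }

    int countTokens() throws IOException {
        int count = 0;
        boolean inToken = false;
        int c;
        while ((c = input.read()) != -1) {
            boolean ws = Character.isWhitespace(c);
            if (!ws && !inToken) {
                count++; // first character of a new token
            }
            inToken = !ws;
        }
        return count;
    }
}

class ReuseDemo {
    // Processes two texts with a single counter instance.
    static int[] countBoth(String first, String second) {
        try {
            ReusableCounter t = new ReusableCounter(new StringReader(first));
            int a = t.countTokens();
            t.close();
            t.reset(new StringReader(second)); // reuse instead of constructing anew
            int b = t.countTokens();
            t.close();
            return new int[] {a, b};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] counts = countBoth("a b c", "d e");
        System.out.println(counts[0] + " " + counts[1]); // prints 3 2
    }
}
```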
Copyright © 2000-2018 Apache Software Foundation. All Rights Reserved.