public final class LowerCaseTokenizer extends LetterTokenizer
Note: this does a decent job for most European languages, but does a terrible job for some Asian languages, where words are not separated by spaces.
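The limitation noted above can be illustrated with a minimal plain-JDK sketch. It is not the Lucene implementation: the class name `LowerCaseTokenizerSketch` and its `tokenize` method are hypothetical, and it assumes that a token character is any character for which `Character.isLetter(char)` is true, lower-cased with `Character.toLowerCase(char)` in the same pass.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in, not part of Lucene: approximates the described
// behaviour (split at non-letters, lower-case each letter) with the JDK only.
public class LowerCaseTokenizerSketch {

    /** Splits text at non-letters and lower-cases each letter in one pass. */
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetter(c)) {
                // Assumed token test: letters extend the current token.
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                // Any non-letter ends the current token.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Space- and punctuation-separated text splits into words.
        System.out.println(tokenize("The Quick-Brown FOX"));
        // Four CJK ideographs with no separators: every char is a letter,
        // so the whole run comes back as a single token.
        System.out.println(tokenize("\u6211\u7231\u5317\u4eac"));
    }
}
```

Under these assumptions the first call yields four tokens and the second yields one, which is why text in languages written without spaces needs a different tokenizer.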
Nested classes/interfaces inherited from class AttributeSource: AttributeSource.AttributeFactory, AttributeSource.State
Constructor and Description |
---|
LowerCaseTokenizer(AttributeSource.AttributeFactory factory, java.io.Reader in) Construct a new LowerCaseTokenizer using a given AttributeSource.AttributeFactory. |
LowerCaseTokenizer(AttributeSource source, java.io.Reader in) Construct a new LowerCaseTokenizer using a given AttributeSource. |
LowerCaseTokenizer(java.io.Reader in) Construct a new LowerCaseTokenizer. |
Modifier and Type | Method and Description |
---|---|
protected char | normalize(char c) Converts char to lower case using Character.toLowerCase(char). |
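Methods inherited from class LetterTokenizer: isTokenChar
Methods inherited from class CharTokenizer: end, incrementToken, next, next, reset
Methods inherited from class Tokenizer: close, correctOffset
Methods inherited from class TokenStream: getOnlyUseNewAPI, reset, setOnlyUseNewAPI
Methods inherited from class AttributeSource: addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, restoreState, toString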
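public LowerCaseTokenizer(java.io.Reader in)
Construct a new LowerCaseTokenizer.
public LowerCaseTokenizer(AttributeSource source, java.io.Reader in)
Construct a new LowerCaseTokenizer using a given AttributeSource.
public LowerCaseTokenizer(AttributeSource.AttributeFactory factory, java.io.Reader in)
Construct a new LowerCaseTokenizer using a given AttributeSource.AttributeFactory.
protected char normalize(char c)
Converts char to lower case using Character.toLowerCase(char).
Overrides: normalize in class CharTokenizer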
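Because normalize works on one char at a time through `Character.toLowerCase(char)`, its result does not depend on the default locale. The sketch below shows this with the JDK alone. The class `NormalizeSketch` is hypothetical, and its `normalize` method only mirrors the documented delegation rather than reproducing Lucene's source.

```java
import java.util.Locale;

// Hypothetical illustration, not Lucene code: mirrors the documented
// delegation of normalize(char) to Character.toLowerCase(char).
public class NormalizeSketch {

    static char normalize(char c) {
        return Character.toLowerCase(c);
    }

    public static void main(String[] args) {
        // Letters are lower-cased; characters without a lower-case form
        // are returned unchanged.
        System.out.println(normalize('Q'));
        System.out.println(normalize('7'));

        // Character.toLowerCase(char) ignores the locale, whereas
        // String.toLowerCase(Locale) applies locale rules: in Turkish,
        // 'I' lower-cases to dotless i (U+0131).
        Locale turkish = Locale.forLanguageTag("tr");
        System.out.println(normalize('I') == 'i');
        System.out.println("I".toLowerCase(turkish).equals("\u0131"));
    }
}
```

One consequence is that tokens produced at index time and at query time are lower-cased identically whatever the JVM's locale settings.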
Copyright © 2000-2018 Apache Software Foundation. All Rights Reserved.