Package | Description |
---|---|
org.apache.lucene.analysis | API and code to convert text into indexable/searchable tokens. |
org.apache.lucene.analysis.standard | A fast grammar-based tokenizer constructed with JFlex. |
org.apache.lucene.index | Code to maintain and access indices. |
org.apache.lucene.util | Some utility classes. |
Modifier and Type | Class and Description |
---|---|
class | ASCIIFoldingFilter: Converts alphabetic, numeric, and symbolic Unicode characters that are outside the 128-character ASCII range (the "Basic Latin" Unicode block) into their ASCII equivalents, where one exists. |
class | CachingTokenFilter: This class can be used if the token attributes of a TokenStream are intended to be consumed more than once. |
class | CharTokenizer: An abstract base class for simple, character-oriented tokenizers. |
class | ISOLatin1AccentFilter: Deprecated in favor of ASCIIFoldingFilter, which covers a superset of Latin 1. This class will be removed in Lucene 3.0. |
class | KeywordTokenizer: Emits the entire input as a single token. |
class | LengthFilter: Removes words that are too long or too short from the stream. |
class | LetterTokenizer: A tokenizer that divides text at non-letters. |
class | LowerCaseFilter: Normalizes token text to lower case. |
class | LowerCaseTokenizer: Performs the function of LetterTokenizer and LowerCaseFilter together. |
class | NumericTokenStream: Expert: Provides a TokenStream for indexing numeric values that can be used by NumericRangeQuery or NumericRangeFilter. |
class | PorterStemFilter: Transforms the token stream as per the Porter stemming algorithm. |
class | SinkTokenizer: Deprecated. Use TeeSinkTokenFilter instead. |
class | StopFilter: Removes stop words from a token stream. |
class | TeeSinkTokenFilter: This TokenFilter provides the ability to set aside attribute states that have already been analyzed. |
static class | TeeSinkTokenFilter.SinkTokenStream |
class | TeeTokenFilter: Deprecated. Use TeeSinkTokenFilter instead. |
class | TokenFilter: A TokenFilter is a TokenStream whose input is another TokenStream. |
class | Tokenizer: A Tokenizer is a TokenStream whose input is a Reader. |
class | TokenStream |
class | WhitespaceTokenizer: A tokenizer that divides text at whitespace. |
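The splitting rules described for LetterTokenizer, LowerCaseTokenizer, WhitespaceTokenizer, and LengthFilter can be illustrated with a small plain-Java sketch. It does not use any Lucene classes: `TokenRuleSketch` and its methods are hypothetical names introduced only for this illustration. Using `java.lang.Character` to decide what counts as a letter or as whitespace, and treating the length bounds as inclusive, are assumptions rather than documented behavior.

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-Java model of the documented splitting rules; hypothetical, not a Lucene class. */
final class TokenRuleSketch {

    /** Collects maximal runs of token characters from the text. */
    private static List<String> split(String text, boolean lettersOnly, boolean lowerCase) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            // Assumption: "letter" and "whitespace" follow java.lang.Character.
            boolean tokenChar = lettersOnly ? Character.isLetter(c) : !Character.isWhitespace(c);
            if (tokenChar) {
                current.append(lowerCase ? Character.toLowerCase(c) : c);
            } else if (current.length() > 0) {
                // A non-token character ends the current token.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    /** Divides text at non-letters; optionally lower-cases, as LowerCaseTokenizer is described. */
    static List<String> letterTokens(String text, boolean lowerCase) {
        return split(text, true, lowerCase);
    }

    /** Divides text at whitespace. */
    static List<String> whitespaceTokens(String text) {
        return split(text, false, false);
    }

    /** Drops tokens shorter than min or longer than max (bounds assumed inclusive). */
    static List<String> lengthFilter(List<String> tokens, int min, int max) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            if (token.length() >= min && token.length() <= max) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        String text = "Hello, World-42 foo";
        System.out.println(letterTokens(text, false));
        System.out.println(letterTokens(text, true));
        System.out.println(whitespaceTokens(text));
        System.out.println(lengthFilter(letterTokens(text, true), 4, 10));
    }
}
```

For the input `Hello, World-42 foo`, splitting at non-letters yields the three tokens Hello, World, and foo, while splitting at whitespace keeps the punctuation and digits attached to their tokens.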
Modifier and Type | Method and Description |
---|---|
abstract boolean | TeeSinkTokenFilter.SinkFilter.accept(AttributeSource source): Returns true if and only if the current state of the passed-in AttributeSource should be stored in the sink. |
Constructor and Description |
---|
CharTokenizer(AttributeSource source, java.io.Reader input) |
KeywordTokenizer(AttributeSource source, java.io.Reader input, int bufferSize) |
LetterTokenizer(AttributeSource source, java.io.Reader in): Construct a new LetterTokenizer using a given AttributeSource. |
LowerCaseTokenizer(AttributeSource source, java.io.Reader in): Construct a new LowerCaseTokenizer using a given AttributeSource. |
NumericTokenStream(AttributeSource source, int precisionStep): Expert: Creates a token stream for numeric values with the specified precisionStep using the given AttributeSource. |
Tokenizer(AttributeSource source): Construct a token stream processing the given input using the given AttributeSource. |
Tokenizer(AttributeSource source, java.io.Reader input): Construct a token stream processing the given input using the given AttributeSource. |
TokenStream(AttributeSource input): A TokenStream that uses the same attributes as the supplied one. |
WhitespaceTokenizer(AttributeSource source, java.io.Reader in): Construct a new WhitespaceTokenizer using a given AttributeSource. |
Modifier and Type | Class and Description |
---|---|
class | StandardFilter: Normalizes tokens extracted with StandardTokenizer. |
class | StandardTokenizer: A grammar-based tokenizer constructed with JFlex. |
Constructor and Description |
---|
StandardTokenizer(AttributeSource source, java.io.Reader input, boolean replaceInvalidAcronym): Deprecated. |
StandardTokenizer(Version matchVersion, AttributeSource source, java.io.Reader input): Creates a new StandardTokenizer with a given AttributeSource. |
Modifier and Type | Method and Description |
---|---|
AttributeSource | FieldInvertState.getAttributeSource() |
Modifier and Type | Method and Description |
---|---|
AttributeSource | AttributeSource.cloneAttributes(): Clones all AttributeImpl instances and returns them in a new AttributeSource instance. |
Constructor and Description |
---|
AttributeSource(AttributeSource input): An AttributeSource that uses the same attributes as the supplied one. |
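The difference between a source that "uses the same attributes as the supplied one" and the independent copy returned by cloneAttributes() can be shown with a toy model. The sketch below is not Lucene's implementation: `MiniAttributeSource` is a hypothetical class, attributes are keyed by a plain string name, and a `StringBuilder` stands in for an attribute instance. It is only meant to show sharing by reference versus copying.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model of attribute sharing versus cloning; hypothetical, not Lucene's implementation. */
final class MiniAttributeSource {
    // Attribute name -> mutable holder (a StringBuilder stands in for an attribute instance).
    private final Map<String, StringBuilder> attributes;

    /** Creates a source with its own, initially empty, attribute map. */
    MiniAttributeSource() {
        this.attributes = new LinkedHashMap<>();
    }

    /** Shares the attribute map of the supplied source: both see the same instances. */
    MiniAttributeSource(MiniAttributeSource input) {
        this.attributes = input.attributes;
    }

    /** Returns the holder registered under the name, creating it on first use. */
    StringBuilder addAttribute(String name) {
        StringBuilder holder = attributes.get(name);
        if (holder == null) {
            holder = new StringBuilder();
            attributes.put(name, holder);
        }
        return holder;
    }

    /** Returns a new source holding independent copies of every attribute. */
    MiniAttributeSource cloneAttributes() {
        MiniAttributeSource copy = new MiniAttributeSource();
        for (Map.Entry<String, StringBuilder> entry : attributes.entrySet()) {
            copy.attributes.put(entry.getKey(), new StringBuilder(entry.getValue().toString()));
        }
        return copy;
    }

    public static void main(String[] args) {
        MiniAttributeSource original = new MiniAttributeSource();
        original.addAttribute("term").append("lucene");

        MiniAttributeSource shared = new MiniAttributeSource(original);
        MiniAttributeSource cloned = original.cloneAttributes();

        // A later change to the original is visible through the shared source only.
        original.addAttribute("term").append("!");
        System.out.println(shared.addAttribute("term"));
        System.out.println(cloned.addAttribute("term"));
    }
}
```

In this model a change made through the original after construction is visible through the shared source, while the cloned source keeps the state it had at the time of the copy.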
Copyright © 2000-2019 Apache Software Foundation. All Rights Reserved.