public class ShingleAnalyzerWrapper
extends org.apache.lucene.analysis.Analyzer
A ShingleAnalyzerWrapper wraps a ShingleFilter around another Analyzer.
A shingle is another name for a token-based n-gram.
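As a rough illustration of what "token-based n-gram" means, the following plain-Java sketch builds shingles from a list of already-tokenized words. It does not use Lucene and is not the ShingleFilter implementation. The class name `ShingleSketch`, the method `shingles`, the single-space separator, and the output order are assumptions made for this example. The parameters `maxShingleSize` and `outputUnigrams` are named after the fields documented on this page.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: a hypothetical helper, not Lucene's ShingleFilter.
class ShingleSketch {

    // Returns every run of 2..maxShingleSize adjacent tokens, joined by a space.
    // When outputUnigrams is true, the single tokens are emitted as well.
    static List<String> shingles(List<String> tokens, int maxShingleSize, boolean outputUnigrams) {
        List<String> out = new ArrayList<>();
        for (int start = 0; start < tokens.size(); start++) {
            StringBuilder sb = new StringBuilder();
            // Grow the shingle one token at a time from the current start position.
            for (int size = 1; size <= maxShingleSize && start + size <= tokens.size(); size++) {
                if (size > 1) {
                    sb.append(' ');
                }
                sb.append(tokens.get(start + size - 1));
                if (size > 1 || outputUnigrams) {
                    out.add(sb.toString());
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("please", "divide", "this", "sentence");
        System.out.println(shingles(tokens, 2, true));
        System.out.println(shingles(tokens, 2, false));
    }
}
```

With a maximum size of 2, the tokens "please divide this" yield the two-token shingles "please divide" and "divide this". Enabling `outputUnigrams` in this sketch additionally keeps "please", "divide", and "this".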
Modifier and Type | Field and Description |
---|---|
protected org.apache.lucene.analysis.Analyzer | defaultAnalyzer |
protected int | maxShingleSize |
protected boolean | outputUnigrams |
Constructor and Description |
---|
ShingleAnalyzerWrapper(): Wraps StandardAnalyzer. |
ShingleAnalyzerWrapper(org.apache.lucene.analysis.Analyzer defaultAnalyzer) |
ShingleAnalyzerWrapper(org.apache.lucene.analysis.Analyzer defaultAnalyzer, int maxShingleSize) |
ShingleAnalyzerWrapper(int nGramSize) |
Modifier and Type | Method and Description |
---|---|
int | getMaxShingleSize(): The max shingle (n-gram) size. |
boolean | isOutputUnigrams() |
org.apache.lucene.analysis.TokenStream | reusableTokenStream(java.lang.String fieldName, java.io.Reader reader) |
void | setMaxShingleSize(int maxShingleSize): Set the maximum size of output shingles. |
void | setOutputUnigrams(boolean outputUnigrams): Shall the filter pass the original tokens (the "unigrams") to the output stream? |
org.apache.lucene.analysis.TokenStream | tokenStream(java.lang.String fieldName, java.io.Reader reader) |
protected org.apache.lucene.analysis.Analyzer defaultAnalyzer
protected int maxShingleSize
protected boolean outputUnigrams
public ShingleAnalyzerWrapper(org.apache.lucene.analysis.Analyzer defaultAnalyzer)
public ShingleAnalyzerWrapper(org.apache.lucene.analysis.Analyzer defaultAnalyzer, int maxShingleSize)
public ShingleAnalyzerWrapper()
Wraps StandardAnalyzer.
public ShingleAnalyzerWrapper(int nGramSize)
public int getMaxShingleSize()
public void setMaxShingleSize(int maxShingleSize)
Parameters: maxShingleSize - max shingle size
public boolean isOutputUnigrams()
public void setOutputUnigrams(boolean outputUnigrams)
Parameters: outputUnigrams - Whether or not the filter shall pass the original tokens to the output stream
public org.apache.lucene.analysis.TokenStream tokenStream(java.lang.String fieldName, java.io.Reader reader)
tokenStream in class org.apache.lucene.analysis.Analyzer
public org.apache.lucene.analysis.TokenStream reusableTokenStream(java.lang.String fieldName, java.io.Reader reader) throws java.io.IOException
reusableTokenStream in class org.apache.lucene.analysis.Analyzer
Throws: java.io.IOException
Copyright © 2000-2018 Apache Software Foundation. All Rights Reserved.