public final class NumericUtils
extends java.lang.Object
To quickly execute range queries in Apache Lucene, a range is divided recursively into multiple intervals for searching: The center of the range is searched only with the lowest possible precision in the trie, while the boundaries are matched more exactly. This reduces the number of terms dramatically.
This class generates terms to achieve this: first, the numerical integer values need to be converted to strings. To do so, integer values (32-bit or 64-bit) are made unsigned and their bits are converted to ASCII chars, 7 bits per char. The resulting string sorts in the same order as the original integer value. Each value is also prefixed (in the first char) by the shift value (the number of bits removed) used during encoding.
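The encoding described above can be sketched in plain Java. This is a minimal illustration, not the library's code: the class name PrefixCodingSketch, the method encode and the marker offset SHIFT_START = 0x20 are assumptions made for this example.

```java
final class PrefixCodingSketch {
    // Assumed offset for the shift marker char; illustrative only.
    static final char SHIFT_START = 0x20;

    /** Encodes val after dropping its lowest `shift` bits. */
    static String encode(long val, int shift) {
        if (shift < 0 || shift > 63) {
            throw new IllegalArgumentException("shift must be 0..63");
        }
        int nChars = (63 - shift) / 7 + 1;      // payload chars needed for the remaining bits
        char[] buf = new char[nChars + 1];
        buf[0] = (char) (SHIFT_START + shift);  // first char records the shift
        // Flip the sign bit so that signed order becomes unsigned order.
        long bits = (val ^ 0x8000000000000000L) >>> shift;
        // Fill from the right, 7 bits per char, so every payload char is plain ASCII.
        for (int i = nChars; i >= 1; i--) {
            buf[i] = (char) (bits & 0x7f);
            bits >>>= 7;
        }
        return new String(buf);
    }

    public static void main(String[] args) {
        long[] sorted = {Long.MIN_VALUE, -42L, -1L, 0L, 1L, 1000L, Long.MAX_VALUE};
        for (int i = 1; i < sorted.length; i++) {
            // String order should agree with numeric order for every adjacent pair.
            boolean ordered = encode(sorted[i - 1], 0).compareTo(encode(sorted[i], 0)) < 0;
            System.out.println(sorted[i - 1] + " < " + sorted[i] + " as strings: " + ordered);
        }
    }
}
```

With a non-zero shift, all values that share the same high bits collapse to the same term, which is what lets the middle of a range be matched at low precision.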
To also index floating point numbers, this class supplies two methods that convert them to integer values by changing their bit layout: doubleToSortableLong(double) and floatToSortableInt(float). Converting floating point numbers to integers and back loses no precision (the integer form is only meaningful for comparison, not as a numeric value). Other data types such as dates can easily be converted to longs or ints (e.g. date to long: Date.getTime()).
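A lossless, order-preserving conversion of this kind can be sketched as follows. The class SortableDoubleSketch and its method names are hypothetical, and the bit flip shown is one common way to obtain a sortable layout; it is an assumption, not a statement about the library's internals.

```java
import java.util.Date;

final class SortableDoubleSketch {
    /** Maps a double to a long whose signed order matches the double's order. */
    static long toSortableLong(double val) {
        long bits = Double.doubleToLongBits(val);
        // Negative doubles: flip every bit except the sign so larger magnitudes sort lower.
        if (bits < 0) {
            bits ^= 0x7fffffffffffffffL;
        }
        return bits;
    }

    /** Inverse of toSortableLong; applying the same flip again undoes it. */
    static double fromSortableLong(long val) {
        if (val < 0) {
            val ^= 0x7fffffffffffffffL;
        }
        return Double.longBitsToDouble(val);
    }

    public static void main(String[] args) {
        double d = -3.75;
        // Round trip: no precision is lost.
        System.out.println(fromSortableLong(toSortableLong(d)) == d);
        // Order is preserved across the sign boundary.
        System.out.println(toSortableLong(-3.75) < toSortableLong(2.5));
        // A date needs no bit tricks: its millisecond timestamp is already a long.
        long millis = new Date(0L).getTime();
        System.out.println(millis);
    }
}
```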
For easy usage, the trie algorithm is implemented for indexing inside NumericTokenStream, which can index int, long, float, and double. For querying, NumericRangeQuery and NumericRangeFilter implement the query part for the same data types.
This class can also be used to generate lexicographically sortable (according to String.compareTo(String)) representations of numeric data types for other usages (e.g. sorting).
NOTE: This API is experimental and might change in incompatible ways in the next release.
Modifier and Type | Class and Description |
---|---|
static class | NumericUtils.IntRangeBuilder - Expert: Callback for splitIntRange(org.apache.lucene.util.NumericUtils.IntRangeBuilder, int, int, int). |
static class | NumericUtils.LongRangeBuilder - Expert: Callback for splitLongRange(org.apache.lucene.util.NumericUtils.LongRangeBuilder, int, long, long). |
Modifier and Type | Field and Description |
---|---|
static int | BUF_SIZE_INT - Expert: The maximum term length (used for char[] buffer size) for encoding int values. |
static int | BUF_SIZE_LONG - Expert: The maximum term length (used for char[] buffer size) for encoding long values. |
static int | PRECISION_STEP_DEFAULT - The default precision step used by NumericField, NumericTokenStream, NumericRangeQuery, and NumericRangeFilter. |
static char | SHIFT_START_INT - Expert: Integers are stored at lower precision by shifting off lower bits. |
static char | SHIFT_START_LONG - Expert: Longs are stored at lower precision by shifting off lower bits. |
Modifier and Type | Method and Description |
---|---|
static java.lang.String | doubleToPrefixCoded(double val) - Convenience method: this just returns longToPrefixCoded(doubleToSortableLong(val)). |
static long | doubleToSortableLong(double val) - Converts a double value to a sortable signed long. |
static java.lang.String | floatToPrefixCoded(float val) - Convenience method: this just returns intToPrefixCoded(floatToSortableInt(val)). |
static int | floatToSortableInt(float val) - Converts a float value to a sortable signed int. |
static java.lang.String | intToPrefixCoded(int val) - This is a convenience method that returns prefix coded bits of an int without reducing the precision. |
static java.lang.String | intToPrefixCoded(int val, int shift) - Expert: Returns prefix coded bits after reducing the precision by shift bits. |
static int | intToPrefixCoded(int val, int shift, char[] buffer) - Expert: Returns prefix coded bits after reducing the precision by shift bits. |
static java.lang.String | longToPrefixCoded(long val) - This is a convenience method that returns prefix coded bits of a long without reducing the precision. |
static java.lang.String | longToPrefixCoded(long val, int shift) - Expert: Returns prefix coded bits after reducing the precision by shift bits. |
static int | longToPrefixCoded(long val, int shift, char[] buffer) - Expert: Returns prefix coded bits after reducing the precision by shift bits. |
static double | prefixCodedToDouble(java.lang.String val) - Convenience method: this just returns sortableLongToDouble(prefixCodedToLong(val)). |
static float | prefixCodedToFloat(java.lang.String val) - Convenience method: this just returns sortableIntToFloat(prefixCodedToInt(val)). |
static int | prefixCodedToInt(java.lang.String prefixCoded) - Returns an int from prefixCoded characters. |
static long | prefixCodedToLong(java.lang.String prefixCoded) - Returns a long from prefixCoded characters. |
static float | sortableIntToFloat(int val) - Converts a sortable int back to a float. |
static double | sortableLongToDouble(long val) - Converts a sortable long back to a double. |
static void | splitIntRange(NumericUtils.IntRangeBuilder builder, int precisionStep, int minBound, int maxBound) - Expert: Splits an int range recursively. |
static void | splitLongRange(NumericUtils.LongRangeBuilder builder, int precisionStep, long minBound, long maxBound) - Expert: Splits a long range recursively. |
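public static final int PRECISION_STEP_DEFAULT
The default precision step used by NumericField, NumericTokenStream, NumericRangeQuery, and NumericRangeFilter.
public static final char SHIFT_START_LONG
Expert: Longs are stored at lower precision by shifting off lower bits. The shift count is stored as SHIFT_START_LONG+shift in the first character.
public static final int BUF_SIZE_LONG
Expert: The maximum term length (used for char[] buffer size) for encoding long values.
public static final char SHIFT_START_INT
Expert: Integers are stored at lower precision by shifting off lower bits. The shift count is stored as SHIFT_START_INT+shift in the first character.
public static final int BUF_SIZE_INT
Expert: The maximum term length (used for char[] buffer size) for encoding int values.
public static int longToPrefixCoded(long val, int shift, char[] buffer)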
Expert: Returns prefix coded bits after reducing the precision by shift bits. This method is used by NumericTokenStream.
val - the numeric value
shift - how many bits to strip from the right
buffer - receives the encoded chars; its length must be at least BUF_SIZE_LONG
public static java.lang.String longToPrefixCoded(long val, int shift)
Expert: Returns prefix coded bits after reducing the precision by shift bits. This method is used by NumericUtils.LongRangeBuilder.
val - the numeric value
shift - how many bits to strip from the right
public static java.lang.String longToPrefixCoded(long val)
This is a convenience method that returns prefix coded bits of a long without reducing the precision. To decode, use prefixCodedToLong(java.lang.String).
public static int intToPrefixCoded(int val, int shift, char[] buffer)
Expert: Returns prefix coded bits after reducing the precision by shift bits. This method is used by NumericTokenStream.
val - the numeric value
shift - how many bits to strip from the right
buffer - receives the encoded chars; its length must be at least BUF_SIZE_INT
public static java.lang.String intToPrefixCoded(int val, int shift)
Expert: Returns prefix coded bits after reducing the precision by shift bits. This method is used by NumericUtils.IntRangeBuilder.
val - the numeric value
shift - how many bits to strip from the right
public static java.lang.String intToPrefixCoded(int val)
This is a convenience method that returns prefix coded bits of an int without reducing the precision. To decode, use prefixCodedToInt(java.lang.String).
public static long prefixCodedToLong(java.lang.String prefixCoded)
Returns a long from prefixCoded characters.
Throws: java.lang.NumberFormatException - if the supplied string is not correctly prefix encoded.
See also: longToPrefixCoded(long)
public static int prefixCodedToInt(java.lang.String prefixCoded)
Returns an int from prefixCoded characters.
Throws: java.lang.NumberFormatException - if the supplied string is not correctly prefix encoded.
See also: intToPrefixCoded(int)
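How decoding and its error case might work can be sketched with a small int round trip. Everything here is illustrative: the class IntPrefixCodecSketch, its methods, and the marker offset SHIFT_START = 0x60 are assumptions for this example rather than the library's implementation.

```java
final class IntPrefixCodecSketch {
    // Assumed offset for the shift marker char of int terms; illustrative only.
    static final char SHIFT_START = 0x60;

    static String encode(int val, int shift) {
        if (shift < 0 || shift > 31) {
            throw new IllegalArgumentException("shift must be 0..31");
        }
        int nChars = (31 - shift) / 7 + 1;
        char[] buf = new char[nChars + 1];
        buf[0] = (char) (SHIFT_START + shift);
        int bits = (val ^ 0x80000000) >>> shift;   // sign flip, then drop low bits
        for (int i = nChars; i >= 1; i--) {
            buf[i] = (char) (bits & 0x7f);
            bits >>>= 7;
        }
        return new String(buf);
    }

    static int decode(String coded) {
        if (coded.isEmpty()) {
            throw new NumberFormatException("empty term");
        }
        int shift = coded.charAt(0) - SHIFT_START;
        if (shift < 0 || shift > 31) {
            throw new NumberFormatException("invalid shift marker");
        }
        int bits = 0;
        for (int i = 1; i < coded.length(); i++) {
            char ch = coded.charAt(i);
            if (ch > 0x7f) {
                throw new NumberFormatException("non-ASCII char at index " + i);
            }
            bits = (bits << 7) | ch;               // rebuild the value 7 bits at a time
        }
        // Restore the dropped low bits as zeros and undo the sign flip.
        return (bits << shift) ^ 0x80000000;
    }
}
```

Decoding a term that was encoded with a non-zero shift returns the original value with its lowest shift bits cleared, since those bits were never stored.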
public static long doubleToSortableLong(double val)
Converts a double value to a sortable signed long. The value is converted by taking its IEEE 754 floating-point "double format" bit layout and then swapping some bits so that the result can be compared as a long. This does not reduce the precision, and the value can easily be used as a long.
See also: sortableLongToDouble(long)
public static java.lang.String doubleToPrefixCoded(double val)
Convenience method: this just returns longToPrefixCoded(doubleToSortableLong(val)).
public static double sortableLongToDouble(long val)
Converts a sortable long back to a double.
See also: doubleToSortableLong(double)
public static double prefixCodedToDouble(java.lang.String val)
Convenience method: this just returns sortableLongToDouble(prefixCodedToLong(val)).
public static int floatToSortableInt(float val)
Converts a float value to a sortable signed int. The value is converted by taking its IEEE 754 floating-point "float format" bit layout and then swapping some bits so that the result can be compared as an int. This does not reduce the precision, and the value can easily be used as an int.
See also: sortableIntToFloat(int)
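To illustrate why a sortable int form is useful, the sketch below sorts floats through int keys and converts them back. The class SortableFloatSketch and its methods are hypothetical, and the particular bit flip is an assumed way of producing such a layout.

```java
import java.util.Arrays;

final class SortableFloatSketch {
    /** Maps a float to an int whose signed order matches the float's order. */
    static int toSortableInt(float val) {
        int bits = Float.floatToIntBits(val);
        // For negative floats, invert the exponent and mantissa bits but keep the sign bit.
        if (bits < 0) {
            bits ^= 0x7fffffff;
        }
        return bits;
    }

    /** Inverse of toSortableInt. */
    static float fromSortableInt(int val) {
        if (val < 0) {
            val ^= 0x7fffffff;
        }
        return Float.intBitsToFloat(val);
    }

    public static void main(String[] args) {
        float[] values = {3.5f, -0.25f, 0f, -1000f, Float.MAX_VALUE};
        int[] keys = new int[values.length];
        for (int i = 0; i < values.length; i++) {
            keys[i] = toSortableInt(values[i]);
        }
        Arrays.sort(keys);                         // an ordinary int sort
        for (int key : keys) {
            System.out.println(fromSortableInt(key));  // floats come back in ascending order
        }
    }
}
```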
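public static java.lang.String floatToPrefixCoded(float val)
Convenience method: this just returns intToPrefixCoded(floatToSortableInt(val)).
public static float sortableIntToFloat(int val)
Converts a sortable int back to a float.
See also: floatToSortableInt(float)
public static float prefixCodedToFloat(java.lang.String val)
Convenience method: this just returns sortableIntToFloat(prefixCodedToInt(val)).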
public static void splitLongRange(NumericUtils.LongRangeBuilder builder, int precisionStep, long minBound, long maxBound)
Expert: Splits a long range recursively. You may implement a builder that adds clauses to a BooleanQuery for each call to its NumericUtils.LongRangeBuilder.addRange(String,String) method. This method is used by NumericRangeQuery.
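The splitting idea (exact matching at the edges, coarser matching in the middle) can be sketched as below. This is an illustration under stated assumptions: the class SplitRangeSketch, the callback RangeSink and the helper collect are hypothetical names, and the callback receives numeric bounds plus a shift for readability, whereas the builder documented here receives prefix-coded strings.

```java
import java.util.ArrayList;
import java.util.List;

final class SplitRangeSketch {
    /** Hypothetical callback: inclusive bounds plus the shift they would be encoded with. */
    interface RangeSink {
        void addRange(long min, long max, int shift);
    }

    static void split(RangeSink sink, int precisionStep, long minBound, long maxBound) {
        if (precisionStep < 1) {
            throw new IllegalArgumentException("precisionStep must be >= 1");
        }
        if (minBound > maxBound) {
            return;
        }
        for (int shift = 0; ; shift += precisionStep) {
            long diff = 1L << (shift + precisionStep);
            long mask = ((1L << precisionStep) - 1L) << shift;
            boolean hasLower = (minBound & mask) != 0L;    // lower edge not aligned to the next level
            boolean hasUpper = (maxBound & mask) != mask;  // upper edge not aligned to the next level
            long nextMin = (hasLower ? minBound + diff : minBound) & ~mask;
            long nextMax = (hasUpper ? maxBound - diff : maxBound) & ~mask;
            boolean wrapped = nextMin < minBound || nextMax > maxBound;
            if (shift + precisionStep >= 64 || nextMin > nextMax || wrapped) {
                // No coarser level is available: emit what is left and stop.
                emit(sink, minBound, maxBound, shift);
                break;
            }
            if (hasLower) {
                emit(sink, minBound, minBound | mask, shift);
            }
            if (hasUpper) {
                emit(sink, maxBound & ~mask, maxBound, shift);
            }
            minBound = nextMin;   // continue with the aligned middle part
            maxBound = nextMax;
        }
    }

    private static void emit(RangeSink sink, long min, long max, int shift) {
        // At a given shift the low bits are not stored, so the upper bound covers all of them.
        sink.addRange(min, max | ((1L << shift) - 1L), shift);
    }

    /** Collects the emitted sub-ranges as {min, max, shift} triples. */
    static List<long[]> collect(int precisionStep, long min, long max) {
        List<long[]> out = new ArrayList<>();
        split((lo, hi, shift) -> out.add(new long[] {lo, hi, shift}), precisionStep, min, max);
        return out;
    }
}
```

Each emitted sub-range would then be matched against terms indexed with the same shift, so the wide middle part costs only a few low-precision terms.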
public static void splitIntRange(NumericUtils.IntRangeBuilder builder, int precisionStep, int minBound, int maxBound)
Expert: Splits an int range recursively. You may implement a builder that adds clauses to a BooleanQuery for each call to its NumericUtils.IntRangeBuilder.addRange(String,String) method. This method is used by NumericRangeQuery.
Copyright © 2000-2018 Apache Software Foundation. All Rights Reserved.