public class Syns2Index
extends java.lang.Object
Converts a WordNet prolog file into a Lucene index for looking up synonyms (see SynExpand.expand(...)).
This has been tested with WordNet 2.0.
The index has fields named "word" (F_WORD) and "syn" (F_SYN).
The source word (such as 'big') can be looked up in the "word" field, and if it is present, the matching document has one field named "syn" for every synonym. The tricky part is that a word with multiple synonyms produces multiple fields with the same name. That's not a problem with Lucene: you just use Document.getValues(java.lang.String) to retrieve all of them.
While the WordNet file distinguishes groups of synonyms with related meanings, we don't do that here.
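The flattening described above can be illustrated with a minimal, standard-library-only sketch. It is not the code of this class. It assumes that each line of the prolog file has the form s(synset_id,w_num,'word',ss_type,sense_number,tag_count). as in the WordNet prolog distribution. The class name SynonymGroupingSketch, its methods, and the sample lines (including the synset ids) are invented for illustration. The resulting map plays the role of the index: one entry per "word", with a String[] of synonyms analogous to what Document.getValues(java.lang.String) returns for "syn".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not the implementation of Syns2Index.
public class SynonymGroupingSketch {

    // Returns {synsetId, word} for a line such as s(100001740,1,'entity',n,1,11).
    // Returns null if the line does not look like an s(...) fact.
    static String[] parseLine(String line) {
        line = line.trim();
        if (!line.startsWith("s(")) {
            return null;
        }
        int comma = line.indexOf(',');
        int open = line.indexOf('\'');
        int close = line.lastIndexOf('\'');
        if (comma < 0 || open < 0 || close <= open) {
            return null;
        }
        String synsetId = line.substring(2, comma);
        // Prolog escapes a single quote inside a quoted atom by doubling it.
        String word = line.substring(open + 1, close).replace("''", "'");
        return new String[] { synsetId, word };
    }

    // Maps every word to the union of its synonyms across all synsets,
    // so groups with different meanings are merged rather than kept apart.
    static Map<String, Set<String>> buildSynonymMap(List<String> prologLines) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (String line : prologLines) {
            String[] parsed = parseLine(line);
            if (parsed != null) {
                groups.computeIfAbsent(parsed[0], k -> new ArrayList<>()).add(parsed[1]);
            }
        }
        Map<String, Set<String>> synonyms = new LinkedHashMap<>();
        for (List<String> group : groups.values()) {
            for (String word : group) {
                for (String other : group) {
                    if (!other.equals(word)) {
                        synonyms.computeIfAbsent(word, k -> new LinkedHashSet<>()).add(other);
                    }
                }
            }
        }
        return synonyms;
    }

    public static void main(String[] args) {
        // Invented sample data: "big" appears in two different synsets.
        List<String> lines = Arrays.asList(
            "s(1,1,'big',a,1,0).",
            "s(1,2,'large',a,1,0).",
            "s(2,1,'big',a,2,0).",
            "s(2,2,'great',a,1,0).");
        Map<String, Set<String>> synonyms = buildSynonymMap(lines);
        // Analogous to reading every "syn" value of the document whose "word" is "big".
        String[] values = synonyms.getOrDefault("big", new LinkedHashSet<>()).toArray(new String[0]);
        System.out.println(Arrays.toString(values)); // prints [large, great]
    }
}
```

In this sketch "large" and "great" come from different synsets but end up in the same list, which mirrors the statement that groups of related meanings are not distinguished.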
Building the index can take 4 minutes on a "fast" system, and the index takes up almost 3 MB.

Modifier and Type | Field and Description |
---|---|
static java.lang.String | F_SYN |
static java.lang.String | F_WORD |
Constructor and Description |
---|
Syns2Index() |
Modifier and Type | Method and Description |
---|---|
static void | main(java.lang.String[] args) Takes arg of prolog file name and index directory. |
public static final java.lang.String F_SYN
public static final java.lang.String F_WORD
Copyright © 2000-2019 Apache Software Foundation. All Rights Reserved.