Comment #2 on issue 132 by
tristan...@nothingisreal.com: Use
Why use a separate lemmatizer at all? The UBY WordNet module already
depends on the extJWNL library, which contains a morphological analyzer
that can recover lemmas from an inflected form. Here's a short program
which, for every synonym of every synset, prints each example sentence
that contains an inflected form of that synonym:
Dictionary wn = Dictionary.getInstance(
    new URL("file:///path/to/WordNet31/properties_file.xml").openStream());
MorphologicalProcessor mp = wn.getMorphologicalProcessor();
for (POS pos : POS.values()) {
  Iterator<Synset> synsetIterator = wn.getSynsetIterator(pos);
  while (synsetIterator.hasNext()) {
    Synset synset = synsetIterator.next();
    String[] examples = synset.getGloss().split(";");
    for (Word word : synset.getWords()) {
      for (int i = 1; i < examples.length; i++) {
        for (String exampleWord : examples[i].toLowerCase()
            .replaceAll("[^a-zA-Z ]", " ").split("\\s+")) {
          // Dummy lookup to work around Issue 6
          mp.lookupAllBaseForms(pos, exampleWord)
              .contains(word.getLemma());
          if (mp.lookupAllBaseForms(pos, exampleWord)
              .contains(word.getLemma())) {
            System.out.println("Synonym " + word.getLemma()
                + " of synset " + synset.getOffset()
                + pos.getKey() + " has example"
                + examples[i]);