Smalltalk model for the WordNet 3.0 based lex db now committed

Klaus D. Witzel

Dec 20, 2008, 6:54:33 AM
to mindlog-dev, nman...@gmail.com
Hi list,

the project repository is now connected to the list and will
automatically post commits and issues. I have nevertheless set the
list to moderated, because the first commit message carried 100% of
the source code in its body and I do not want to flood the list with
messages that large.

The source code is here (click the + for a gist)

- http://code.google.com/p/mindlog/source/detail?r=170

The .cs file lists the installation steps (only the WordNet raw data
files need to be downloaded from Princeton.edu). On my dual-core
P[L]entium at ~2 GHz, class initialization takes ~1.5 minutes to parse
the files and instantiate the model.

The instantiated model occupies ~15.5 MB and holds all nodes and
relations of the WordNet hypergraph except those noted under *) below.
Each edge, i.e. each relation, is stored as two arcs, and there is not
much pressure on the GC since the arcs are kept in WordArrays.
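
For readers who do not have the image at hand, here is a minimal sketch of the "two arcs per relation, packed into a word array" idea. It is written in Python and is not the committed Smalltalk code; the class name ArcStore, the triple layout, and the relation codes and node ids are all assumptions made for illustration.

```python
from array import array


class ArcStore:
    """Toy relation store: every relation is kept as two directed arcs."""

    def __init__(self):
        # One flat array of unsigned integers instead of one object per arc.
        # Layout (an assumption): (source, relation, target) triples, back to back.
        self.words = array("I")

    def add_relation(self, a, kind, inverse_kind, b):
        # Forward arc a -> b, followed by the inverse arc b -> a.
        self.words.extend((a, kind, b))
        self.words.extend((b, inverse_kind, a))

    def arcs(self):
        w = self.words
        return [(w[i], w[i + 1], w[i + 2]) for i in range(0, len(w), 3)]

    def targets(self, source, kind):
        # Linear scan; good enough for a sketch, a real model would index this.
        return [t for s, k, t in self.arcs() if s == source and k == kind]


HYPERNYM, HYPONYM = 1, 2  # hypothetical relation codes
DOG, CANINE = 10, 11      # hypothetical node ids

store = ArcStore()
store.add_relation(DOG, HYPERNYM, HYPONYM, CANINE)
```

Because the arcs live in one array of plain integers, the garbage collector sees a single object however many arcs there are, which is the effect described above.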

In class WordNetTests, method #testIntegrity takes ~30 seconds to
validate the integrity of all relations (arcs and nodes).
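
As a rough illustration of what such an integrity check can look like, here is a small Python sketch. It is not the body of #testIntegrity; it assumes that "integrity" means every arc ends in a known node and every arc has its inverse arc, and the function name, the INVERSE table, and the sample data are hypothetical.

```python
# Hypothetical subset of relation kinds and their inverses.
INVERSE = {"hypernym": "hyponym", "hyponym": "hypernym"}


def integrity_errors(nodes, arcs):
    """Return a list of problems: dangling endpoints or missing inverse arcs."""
    arc_set = set(arcs)
    errors = []
    for source, kind, target in arcs:
        # Both endpoints of an arc must be nodes of the model.
        if source not in nodes or target not in nodes:
            errors.append(("dangling", (source, kind, target)))
        # Every arc must be matched by the inverse arc in the other direction.
        if (target, INVERSE[kind], source) not in arc_set:
            errors.append(("no inverse", (source, kind, target)))
    return errors


nodes = {"dog", "canine"}
good = [("dog", "hypernym", "canine"), ("canine", "hyponym", "dog")]
bad = good + [("dog", "hypernym", "animal")]  # "animal" is not a node
```

Calling integrity_errors(nodes, good) returns an empty list, while the extra arc in bad is reported twice: once for the unknown node and once for the missing inverse.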

The next milestone is the protocol for evaluating Mindlog expressions,
for use by the parser. Let me know if you guys want to do something
with/for the plain English parser; all help and feedback is
appreciated :)

Cheers,
Klaus

*) the relations "see also" and "gloss" are not important at the
moment; both will be added later for use by the GUI.

--
"If at first, the idea is not absurd, then there is no hope for it".
Albert Einstein