
htdig: Foreign dictionaries and word stemming ?


Alexander Bergolth

Jan 20, 1999
On Wed, 20 Jan 1999, Stephan Gilbert wrote:

> Has anyone in the community experimented with foreign language
> dictionaries? I downloaded the German "ispell"-compatible
> dictionary and affix file for htfuzzy. I ran htfuzzy on it
> but stopped it after 3 days (200 MHz Pentium running Linux 2.0.34).
> The database file grew from approx. 2 MBytes to 6. How can
> you test whether it actually accomplishes something sensible?

I think the DB formats are not binary compatible between different
architectures, but maybe a dumped version of the German root2word and
word2root DBs will help you.

I have dumped my databases (building them took several weeks) using
db_dump from the Berkeley DB distribution. They are available at

http://leo.wu-wien.ac.at/htdig/

(or http://strike.wu-wien.ac.at/~leo/htdig/)

The dumped files are gzipped. You should be able to create the databases
with

  gzip -cd root2word.dump.gz | db_load root2word.db
  gzip -cd word2root.dump.gz | db_load word2root.db

I tried that and the resulting database was smaller than the original.
I dumped the result again and the two dumps are identical.
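
The round trip described above could be scripted roughly as follows. This is only a sketch: dumps_match and roundtrip_ok are hypothetical helper names, and it assumes db_load and db_dump are on the PATH. If the original dump was made with db_dump -p, the re-dump would need the same option for the comparison to be meaningful.

```shell
# Compare a gzipped reference dump with a plain-text dump file.
# Returns 0 when the contents are byte-identical.
dumps_match() {
    gzip -cd "$1" | cmp -s - "$2"
}

# Load a gzipped dump into a database file, dump that database
# again, and compare the new dump with the original one.
# Needs db_load and db_dump from the Berkeley DB distribution.
roundtrip_ok() {
    dump_gz=$1
    db_file=$2
    tmp_dump=$(mktemp) || return 1
    gzip -cd "$dump_gz" | db_load "$db_file" &&
        db_dump "$db_file" > "$tmp_dump" &&
        dumps_match "$dump_gz" "$tmp_dump"
    status=$?
    rm -f "$tmp_dump"
    return $status
}
```

Something like "roundtrip_ok root2word.dump.gz root2word.db && echo same" would then report whether the reloaded database dumps back to the same text. A smaller database file after loading is not necessarily a problem as long as the dumps match.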

Could someone check whether the DBs are OK?
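
One quick sanity check that needs no Berkeley DB tools is to count the records in a dump. The sketch below assumes the usual db_dump text layout (header lines ending in "HEADER=END", then alternating key and value lines, then "DATA=END"); count_records is a hypothetical helper name, and older dump formats may differ.

```shell
# Count key/value records in a gzipped db_dump file.
# Assumption: data lines sit between "HEADER=END" and "DATA=END",
# with one line per key followed by one line per value.
count_records() {
    gzip -cd "$1" | awk '
        /^HEADER=END$/ { in_data = 1; next }
        /^DATA=END$/   { in_data = 0 }
        in_data        { lines++ }
        END            { print int(lines / 2) }
    '
}
```

Running "count_records root2word.dump.gz" and "count_records word2root.dump.gz" gives a rough idea of whether the databases hold a plausible number of entries for the dictionary that was used.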

Cheers,
Leo

-----------------------------------------------------------------------
Alexander (Leo) Bergolth l...@leo.wu-wien.ac.at
WU-Wien - Zentrum fuer Informatikdienste http://leo.wu-wien.ac.at
Info Center
In a world without walls and fences, who needs windows and gates?

----------------------------------------------------------------------
To unsubscribe from the htdig mailing list, send a message to
htdig-...@sdsu.edu containing the single word "unsubscribe" in
the body of the message.
