Hello,
Could you point me to any German-related NLP tools within
NLTK (term extraction, POS tagging, ontologies, etc.)?
I've found this Google doc, which describes usage of the proprietary
GermaNet (technically fine in my case since I'd be using it for
academic research):
https://docs.google.com/document/d/1rdn0hOnJNcOBWEZgipdDfSyjJdnv_sinuAUSDSpiQns/edit?hl=en
There also seem to be some German stemmers mentioned here:
http://nltk.org/api/nltk.stem.html
a POS tagger (based on the TIGER corpus) here:
https://github.com/ptnplanet/NLTK-Contributions
and a Europarl corpus that includes German here:
http://nltk.googlecode.com/svn/trunk/doc/api/nltk.corpus.europarl_raw-module.html#german
So I'd like to hear about any alternatives or suggested use cases.
I'll most likely try to at least do term extraction (based on term
frequency, stemming, stop words, maybe even synonyms), POS tagging
(perhaps even as part of term extraction) and later on ontology
construction (based on the previous steps and perhaps relying on
existing general lexical databases such as GermaNet or BabelNet).
I don't expect all of these steps to be solvable with an existing
NLTK tool, but some probably are.
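To make the term extraction step more concrete, here is a rough sketch
of what I have in mind. It assumes NLTK's SnowballStemmer with the
"german" language. The stop word list, the sample sentence and the
extract_terms helper are placeholders I made up for illustration, and
the regex tokenizer is only a stand-in for a proper one:

```python
import re
from collections import Counter

try:
    # SnowballStemmer is part of NLTK and needs no extra data download.
    from nltk.stem.snowball import SnowballStemmer
    _stem = SnowballStemmer("german").stem
except ImportError:
    # Fallback so the sketch still runs without NLTK: no stemming at all.
    def _stem(word):
        return word

# Tiny hand-picked stop word list, for illustration only (an assumption,
# not a real resource).
STOP_WORDS = {"der", "die", "das", "dem", "und", "ist", "ein", "eine",
              "in", "im", "auf"}

def extract_terms(text, top_n=5):
    """Return the top_n most frequent stemmed, non-stop-word tokens."""
    # Crude tokenizer: lowercase runs of German letters.
    tokens = re.findall(r"[a-zäöüß]+", text.lower())
    # Drop stop words first, then stem what is left.
    stems = [_stem(t) for t in tokens if t not in STOP_WORDS]
    return Counter(stems).most_common(top_n)

sample = "Die Katze ist im Haus und die Katze schläft auf dem Sofa."
terms = extract_terms(sample)
print(terms)
```

Synonym handling and POS-based filtering are left out here, since
those are exactly the parts I'm unsure how to cover.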
Thanks in advance!
Gajo Petrovic