I finally trained a tagger for Spanish, based on the NLTK CESS_ESP corpus. Many Spanish words are missing from this corpus. I know where the corpus is saved, and I have opened several of its files. Is there a guide, or some other way, to add more words to the existing data trees of this corpus?
nltk_data/corpora/cess_esp
Example of a tree from one of the files:
(sn-SUJ
  (espec.mp
    (da0mp0 los el))
  (grup.nom.mp
    (ncmp000 abogados abogado)
    (sp
      (prep
        (sps00 de de))
      (sn
        (espec.fs
          (da0fs0 la el))
        (grup.nom.fs
          (ncfs000 empresa empresa))))))
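
To make clear what I mean by "adding words", here is a minimal sketch in plain Python (no NLTK calls) that reads the leaves of the tree above and formats a new leaf in the same shape. It assumes every leaf has the form (tag word lemma), as in the example. The helper names `extract_leaves` and `format_leaf` are hypothetical, and the new word "juez" with tag `ncms000` is only an illustration. This is not NLTK's own corpus reader, and I do not know whether appending leaves like this is the supported way to extend the corpus.

```python
import re

# The example tree from this post, as it appears in the corpus file.
TREE = """(sn-SUJ
(espec.mp
(da0mp0 los el))
(grup.nom.mp
(ncmp000 abogados abogado)
(sp
(prep
(sps00 de de))
(sn
(espec.fs
(da0fs0 la el))
(grup.nom.fs
(ncfs000 empresa empresa))))))"""

# Assumption: a leaf is a bracket holding exactly three tokens
# (tag, word, lemma) and no nested brackets.
LEAF_RE = re.compile(r"\(([^\s()]+)\s+([^\s()]+)\s+([^\s()]+)\)")


def extract_leaves(tree_text):
    """Return a list of (tag, word, lemma) tuples found in the tree text."""
    return LEAF_RE.findall(tree_text)


def format_leaf(tag, word, lemma):
    """Format a new leaf in the same (tag word lemma) shape (hypothetical helper)."""
    return f"({tag} {word} {lemma})"


leaves = extract_leaves(TREE)
known_words = {word for _tag, word, _lemma in leaves}

# Hypothetical new entry: a word that does not occur in the example tree.
new_leaf = format_leaf("ncms000", "juez", "juez")

for tag, word, lemma in leaves:
    print(tag, word, lemma)
print("new leaf:", new_leaf)
```

A leaf on its own is probably not enough for the tagger: each word would also need to sit inside a syntactically valid tree such as the one above, which is the part I am unsure how to do.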