Custom Entity NLTK


Wei Lai

Oct 23, 2016, 4:12:27 AM
to nltk-users
Hi, I'm new to NLTK. I have been studying it for 4 or 5 months.

For my university I have to create an entity detector for natural language. My problem is with the entities: I don't know how to add more entity types. Do I have to create a new tagged corpus?

I know the corpora are tagged in IOB format, so suppose I write:

Santiago N B-LOC

I think this is the correct format. But what if I create a new tag?

Santiago N B-TOL

TOL here stands for TOOL. It may sound silly, but it is only an example.

I hope I explained my question :)

If I am using the IOB format incorrectly, please explain.

Dimitriadis, A. (Alexis)

Oct 23, 2016, 8:35:29 AM
to nltk-...@googlegroups.com
Yes, it’s possible to train a recognizer for whatever entity types you need. First you must create your own training corpus, then you can train a classifier on it that will recognize the entities you need.
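To illustrate that a new entity type is just another label in the annotation, here is a minimal sketch in plain Python (no NLTK calls). The helper name iob_to_spans and the toy rows are hypothetical, and TOL is the made-up tag from the question. It only shows that B-TOL / I-TOL decode exactly like B-LOC.

```python
def iob_to_spans(rows):
    """Collect (entity_type, text) spans from (word, pos, iob_tag) rows."""
    spans = []
    current_type, current_words = None, []
    for word, _pos, tag in rows:
        prefix, _, etype = tag.partition("-")
        # A "B" tag, or an "I" tag of a different type, opens a new span.
        starts_new = prefix == "B" or (prefix == "I" and etype != current_type)
        if prefix == "O" or starts_new:
            if current_words:
                spans.append((current_type, " ".join(current_words)))
            current_type, current_words = None, []
        if prefix in ("B", "I"):
            current_type = etype
            current_words.append(word)
    if current_words:
        spans.append((current_type, " ".join(current_words)))
    return spans


# Hypothetical hand-annotated tokens: (word, POS, IOB tag).
rows = [
    ("Use", "V", "O"),
    ("a", "D", "O"),
    ("torque", "N", "B-TOL"),   # custom TOOL entity, first token
    ("wrench", "N", "I-TOL"),   # custom TOOL entity, continuation
    ("in", "P", "O"),
    ("Santiago", "N", "B-LOC"),
]

spans = iob_to_spans(rows)
print(spans)
```

Nothing in the format itself restricts which entity names may follow the B- or I- prefix; what matters is that the training corpus uses them consistently.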

NLTK does not provide any facilities for annotating a new corpus, unless you count the existing POS tagger and named entity recognizer, which you can use to pre-process a corpus before you correct the results and annotate your own categories by hand. Once you have a suitable training corpus in IOB format, take a look at chapters 6 and 7 of the NLTK book. Chapter 7 shows you in detail how to train and use a recognizer for a chunking task. In chapter 6, you’ll read about how to select and apply features for any statistical classifier.
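As a rough sketch of that workflow (not the exact code from the book), the following reads a tiny IOB-annotated sample with NLTK and trains a per-token classifier on it. The two sentences, the TOL tag, and the token_features function are made up for illustration; a real corpus would need far more data and better features.

```python
import nltk
from nltk.chunk import conllstr2tree, tree2conlltags

# Hypothetical hand-annotated sentences, one "word POS IOB-tag" per line.
SENTENCES = [
    "\n".join([
        "Use V O",
        "a D O",
        "hammer N B-TOL",
        "in P O",
        "Santiago N B-LOC",
    ]),
    "\n".join([
        "The D O",
        "torque N B-TOL",
        "wrench N I-TOL",
        "broke V O",
    ]),
]


def token_features(tokens, i):
    """Toy feature set for token i of a list of (word, pos) pairs."""
    word, pos = tokens[i]
    prev_word = tokens[i - 1][0].lower() if i > 0 else "<START>"
    return {
        "word": word.lower(),
        "pos": pos,
        "prev_word": prev_word,
        "is_title": word.istitle(),
    }


train_set = []
for text in SENTENCES:
    # chunk_types=None keeps every chunk type, including custom ones like TOL.
    tree = conllstr2tree(text, chunk_types=None)
    tagged = tree2conlltags(tree)  # list of (word, pos, iob_tag) triples
    tokens = [(w, p) for w, p, _ in tagged]
    for i, (_, _, iob) in enumerate(tagged):
        train_set.append((token_features(tokens, i), iob))

# Each IOB tag, custom or not, is simply a class label for the classifier.
classifier = nltk.NaiveBayesClassifier.train(train_set)
labels = sorted(classifier.labels())
print(labels)
```

Naive Bayes is used here only because it needs no extra dependencies; any classifier that accepts feature dictionaries could be substituted.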

Alexis
