How to Train NLTK NE CHUNKER


Agin das M

Dec 7, 2017, 7:21:26 AM
to nltk-users
Hi,
I would like to train the NLTK NE chunker.
How can I do that?

Denzil Correa

Dec 7, 2017, 7:51:56 AM
to nltk-users
Did you try something out yourself? Regardless, there are many simple tutorials out there, including the NLTK book.



--Regards,
Denzil
http://correa.in

Dimitriadis, A. (Alexis)

Dec 7, 2017, 7:57:14 AM
to nltk-...@googlegroups.com
Hi Agin,

Take a look at the nltk book, free at http://nltk.org/book. It explains in detail how to train an NP chunker. For a named entity chunker you follow the same process, just with a corpus that annotates named entities rather than NPs. 

Alexis

Dr. Alexis Dimitriadis | Assistant Professor and Senior Research Fellow | Utrecht Institute of Linguistics OTS | Utrecht University | Trans 10, 3512 JK Utrecht, room 2.33 | +31 30 253 65 68 | a.dimi...@uu.nl | www.hum.uu.nl/medewerkers/a.dimitriadis

Jay Pratap Pandey

Dec 15, 2017, 5:35:47 PM
to nltk-users
Hi,
But I am still unable to train ne_chunk for Indian names.
How can I solve this?

Denzil Correa

Dec 16, 2017, 4:55:35 AM
to nltk-users
It will help if you specifically post (with code) what you did.

--Regards,
Denzil
http://correa.in

Jay Pratap Pandey

Dec 18, 2017, 6:58:52 AM
to nltk-users
import nltk
from textblob.classifiers import NaiveBayesClassifier as NBC
from textblob import TextBlob

# Train on the CSV of names (textblob reads an open file handle, not a string)
with open('Indian_names.csv', 'r') as infile:
    model = NBC(infile, format='csv')

# Evaluate on the same CSV file
with open('Indian_names.csv', 'r') as infile1:
    print("Accuracy:", model.accuracy(infile1, format='csv'))

Dimitriadis, A. (Alexis)

Dec 18, 2017, 7:21:08 AM
to nltk-...@googlegroups.com
This code uses textblob, not the nltk. Textblob provides wrappers around various nltk functions, but you cannot expect nltk users to know how to use it. I don’t.

The textblob code you included has nothing to do with the chapter of the nltk book that you say you read, or with named entity recognition for that matter. I recommend actually reading chapter 7 of the nltk book, and some prior chapters to help you understand it. It’s all laid out pretty clearly.

However, be warned that you can’t do named entity recognition with a list of names; you need an annotated corpus. Otherwise you’re just doing trivial word lookup.
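To make the distinction concrete, here is a small sketch of what one sentence of an annotated corpus might look like in the IOB format that nltk's chunking utilities read. The sentence and the entity labels (PERSON, GPE) are invented for illustration. Unlike a bare list of names, each token carries its context and a tag saying whether it begins (B-), continues (I-), or is outside (O) an entity.

```python
import nltk

# One hypothetical annotated sentence: (word, POS tag, IOB entity tag)
annotated = [
    ("Rahul", "NNP", "B-PERSON"),
    ("Sharma", "NNP", "I-PERSON"),
    ("moved", "VBD", "O"),
    ("to", "TO", "O"),
    ("Pune", "NNP", "B-GPE"),
]

# Convert the flat IOB triples into the chunk tree that nltk chunkers produce
tree = nltk.chunk.conlltags2tree(annotated)
print(tree)

# The conversion is reversible, which is how training data is usually prepared
round_trip = nltk.chunk.tree2conlltags(tree)
```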

Alexis

PS. It’s a terrible idea to “test” with the same data you trained with.
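A minimal sketch of holding out test data, assuming the corpus is a list of annotated sentences. The helper name `split_sentences`, the 80/20 ratio, and the fixed seed are illustrative choices, not anything prescribed by nltk.

```python
import random


def split_sentences(sentences, test_fraction=0.2, seed=0):
    """Shuffle a copy of the sentences and split off a held-out test set."""
    shuffled = list(sentences)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Placeholder strings stand in for annotated sentences
sentences = [f"sentence {i}" for i in range(10)]
train, test = split_sentences(sentences)
print(len(train), "training sentences,", len(test), "test sentences")
```

The model is then trained only on `train` and scored only on `test`, so the reported accuracy reflects sentences it has never seen.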


