Part of speech tagging and Named Entity Recognition with song titles


Martin Breidenbach

Feb 7, 2017, 3:11:11 PM
to nltk-users
Hello ladies and gentlemen,

I am facing a small problem and I hope that some of you can enlighten me.

We want to check requirements for ambiguities but, as you most likely know, a lot of variables and methods have names that describe them best, so a variable may be called 'tea cooking hot', without quotation marks or anything else that might help detect it with regular-expression-based approaches. So I decided to train a model for these names (I haven't figured out yet how to do this precisely, but I hope I can handle it).

The problem for now is that I first need to POS-tag them and then NE-chunk them, but the POS tagger might have decided differently on the probability of a tag if it had known that a word was not a verb but part of a named entity. Or does the context not affect the likelihood of a tag?

Best regards and thank you,

Martin

Dimitriadis, A. (Alexis)

Feb 8, 2017, 6:29:31 AM
to nltk-...@googlegroups.com
Hi Martin,

I’m not sure I understand your question correctly, but I think you are asking about tagging song titles. Don’t worry about the chicken-and-egg problem of named entities; POS tagging is a low-level probabilistic process, and it will be a sufficient foundation for the next step.
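
To illustrate how the tag of the preceding word can shift the tag chosen for an ambiguous word, here is a minimal plain-Python sketch of a bigram tagger with unigram backoff. It is not NLTK's implementation; the toy training data, the tags assigned to it, and the names `TRAIN`, `train`, and `tag` are assumptions made up for the example.

```python
from collections import Counter, defaultdict

# Hypothetical hand-tagged toy data (Penn Treebank-style tags), not a real corpus.
TRAIN = [
    [("she", "PRP"), ("is", "VBZ"), ("cooking", "VBG"), ("tea", "NN")],
    [("he", "PRP"), ("is", "VBZ"), ("cooking", "VBG"), ("rice", "NN")],
    [("the", "DT"), ("cooking", "NN"), ("is", "VBZ"), ("hot", "JJ")],
    [("tea", "NN"), ("is", "VBZ"), ("hot", "JJ")],
]

def train(sentences):
    """Count tags per word (unigram) and per (previous tag, word) pair (bigram)."""
    unigram = defaultdict(Counter)
    bigram = defaultdict(Counter)
    for sentence in sentences:
        previous_tag = "<s>"  # sentence-start marker
        for word, word_tag in sentence:
            unigram[word][word_tag] += 1
            bigram[(previous_tag, word)][word_tag] += 1
            previous_tag = word_tag
    return unigram, bigram

def tag(words, unigram, bigram, default="NN"):
    """Tag left to right: bigram context first, then unigram, then a default tag."""
    tagged, previous_tag = [], "<s>"
    for word in words:
        if (previous_tag, word) in bigram:
            best = bigram[(previous_tag, word)].most_common(1)[0][0]
        elif word in unigram:
            best = unigram[word].most_common(1)[0][0]
        else:
            best = default
        tagged.append((word, best))
        previous_tag = best
    return tagged

unigram, bigram = train(TRAIN)
print(tag(["the", "cooking"], unigram, bigram))         # "cooking" after a determiner
print(tag(["is", "cooking"], unigram, bigram))          # "cooking" after a verb
print(tag(["tea", "cooking", "hot"], unigram, bigram))  # unseen context, falls back to unigram counts
```

With this toy data, "cooking" is tagged NN after "the" and VBG after "is"; in the unseen context "tea cooking hot" the sketch falls back to the most frequent tag for each word.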

Titles in general have a distinct syntax in English (fewer determiners, for example), so in principle you'd get better mileage if you could train a POS tagger on song titles only. However, for this you’d need a quality tagged corpus of titles to train with. I’d suggest you use the tools you have, evaluate the quality of the result, and then decide if there’s a problem that needs solving.
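
One way to do the evaluation step is to tag a small hand-checked sample and count per-token agreement. The following is only a sketch under stated assumptions: the gold tags in `GOLD` are hypothetical hand annotations, and `baseline_tagger` is a stand-in that would be replaced by whatever tagger is actually in use (any callable that takes a list of words and returns `(word, tag)` pairs, for instance a thin wrapper around `nltk.pos_tag`).

```python
from collections import Counter

# Hypothetical gold standard: a few titles tagged by hand (Penn Treebank-style tags).
GOLD = [
    [("yellow", "JJ"), ("submarine", "NN")],
    [("let", "VB"), ("it", "PRP"), ("be", "VB")],
]

def baseline_tagger(words):
    """Stand-in tagger that labels every word NN; swap in the real tagger here."""
    return [(word, "NN") for word in words]

def evaluate(tagger, gold_sentences):
    """Return per-token accuracy and a count of (gold tag, predicted tag) errors."""
    correct, total, errors = 0, 0, Counter()
    for sentence in gold_sentences:
        words = [word for word, _ in sentence]
        predicted = tagger(words)
        for (_, gold_tag), (_, predicted_tag) in zip(sentence, predicted):
            total += 1
            if gold_tag == predicted_tag:
                correct += 1
            else:
                errors[(gold_tag, predicted_tag)] += 1
    accuracy = correct / total if total else 0.0
    return accuracy, errors

accuracy, errors = evaluate(baseline_tagger, GOLD)
print(f"accuracy: {accuracy:.2f}")
for (gold_tag, predicted_tag), count in errors.most_common():
    print(f"gold {gold_tag} tagged as {predicted_tag}: {count}x")
```

The error counts show which tags get confused most often, which helps decide whether the existing tagger is good enough or whether a title-specific one is worth training.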

Best,

Alexis

Dr. Alexis Dimitriadis | Assistant Professor and Senior Research Fellow | Utrecht Institute of Linguistics OTS | Utrecht University | Trans 10, 3512 JK Utrecht, room 2.33 | +31 30 253 65 68 | a.dimi...@uu.nl | www.hum.uu.nl/medewerkers/a.dimitriadis


Martin Breidenbach

Feb 9, 2017, 4:40:27 AM
to nltk-users
Thank you Alexis,

I wasn't aware of the low-level probabilistic process. I believe I can now simply move on.

Thank you