problem using nltk.RegexpTagger

Yosr Eman

Dec 22, 2014, 2:32:37 AM
to nltk-...@googlegroups.com
Hello,

I'm using nltk.RegexpTagger(patterns) and wrote my patterns as follows:
#========================================================
patterns = [
#=========================adjectives=====================
(r'.*ful$', 'JJ'),
(r'.*ious$', 'JJ'),
(r'.*ble$', 'JJ'),
(r'.*ic$', 'JJ'),
(r'.*ive$', 'JJ'),
(r'.*est$', 'JJ'),
#=========================article====================
# .... some other patterns
#===========================Noun==========================
(r'.*\'s$', 'NN$'),               # possessive nouns (must come before the plural rule)
(r'.*s$', 'NNS'),                 # plural nouns
(r'.*ation$', 'NN'),
(r'.*ism$', 'NN'),  # capitalism
(r'.*ment$', 'NN'), # assignment
(r'.*ness$', 'NN'), # sadness
(r'.*ance$', 'NN'), # acceptance
(r'.*ful$', 'NN'),  # never reached: the '.*ful$' JJ rule above matches first
#==========================================================
(r'^-?[0-9]+(\.[0-9]+)?$', 'CD'), # cardinal numbers (escaped dot matches only a decimal point)
(r'.*', 'NN')                     # nouns (default)
]
It works well, but my problem is setting a pattern that depends on the previous word. For example: if the previous word is tagged "AT", the next word should be tagged "NN". Could you please help me with this?

Thanks for your time.

Alexis Dimitriadis

Dec 27, 2014, 10:19:30 AM
to nltk-...@googlegroups.com
Hi Yosr,

The regexp tagger only looks at the word itself. To take the previous word into account, you need one of the other taggers, e.g. the bigram tagger. You can chain together a bigram tagger and a regexp tagger (and more). See chapter 5 of the NLTK book. (You can also train them automatically instead of writing the rules by hand.)
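
A minimal sketch of the chaining described above, assuming a shortened pattern list and a tiny hand-made training set (both hypothetical, chosen only for illustration; real use would train on a tagged corpus). The bigram tagger is tried first and backs off to the regexp tagger when it has nothing to say:

```python
import nltk

# Hand-written fallback rules (a shortened, illustrative pattern list).
patterns = [
    (r'^(the|a|an)$', 'AT'),  # articles
    (r'^to$', 'TO'),          # infinitive marker
    (r'.*s$', 'NNS'),         # plural nouns
    (r'.*', 'NN'),            # nouns (default)
]
regexp_tagger = nltk.RegexpTagger(patterns)

# Tiny hypothetical training set; real use would train on a tagged corpus.
train_sents = [
    [('the', 'AT'), ('dress', 'NN')],
    [('to', 'TO'), ('dress', 'VB')],
]

# The bigram tagger is consulted first and falls back to the regexp tagger
# whenever it has no entry for (previous tag, current word).
tagger = nltk.BigramTagger(train_sents, backoff=regexp_tagger)

print(tagger.tag(['the', 'dress']))  # [('the', 'AT'), ('dress', 'NN')]
print(tagger.tag(['to', 'dress']))   # [('to', 'TO'), ('dress', 'VB')]
print(tagger.tag(['the', 'cats']))   # [('the', 'AT'), ('cats', 'NNS')]
```

Note that the bigram tagger keys on the previous tag together with the current word, so it only overrides the regexp rules for words it has seen in training; unseen words such as "cats" still get their tag from the regexp tagger.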

Alexis

Dr. Alexis Dimitriadis | Assistant Professor and Senior Research Fellow | Utrecht Institute of Linguistics OTS | Utrecht University | Trans 10, 3512 JK Utrecht, room 2.33 | +31 30 253 65 68 | a.dimi...@uu.nl | www.hum.uu.nl/medewerkers/a.dimitriadis


Yosr Eman

Dec 30, 2014, 5:34:27 AM
to nltk-...@googlegroups.com
Thanks, Alexis.