[nltk-users] How to run the Brill tagger, which is built in Python?

fareena

May 1, 2010, 10:09:25 AM

to nltk-users

Hi everyone,
I am an MS student doing my research on POS tagging for the Urdu language
using the Brill (Transformation-Based Learning) approach. I found out
that it is already implemented in Python, so I am trying to run it for
English first, just to see how it works. But I am new to Python (I only
know how to run and save a program), so I am facing a lot of problems
and am stuck. Kindly help me with the steps of the code for the Brill
tagger, so that I can understand how it works for English. I am short
of time, so please help.

--
You received this message because you are subscribed to the Google Groups "nltk-users" group.
To post to this group, send email to nltk-...@googlegroups.com.
To unsubscribe from this group, send email to nltk-users+...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/nltk-users?hl=en.

Jacob Perkins

May 1, 2010, 11:37:15 AM

to nltk-users

Hi,

There's some examples at http://streamhacker.com/2008/12/03/part-of-speech-tagging-with-nltk-part-3/
Also take a look at the demo function in the source:
http://nltk.googlecode.com/svn/trunk/doc/api/nltk.tag.brill-pysrc.html

Hope that helps,
Jacob
---
http://streamhacker.com
http://twitter.com/japerk

fari

May 5, 2010, 12:47:52 PM

to nltk-users

Hi,
I tried the first link, but I am getting an error: 'backoff_tagger' is not
defined.
Let me paste it as is:
IDLE 2.6.2
>>> import nltk.tag
>>> from nltk.tag import brill
>>> raubt_tagger = backoff_tagger(train_sents, [nltk.tag.AffixTagger,nltk.tag.UnigramTagger, nltk.tag.BigramTagger, nltk.tag.TrigramTagger],backoff=nltk.tag.RegexpTagger(word_patterns))

Traceback (most recent call last):
  File "<pyshell#2>", line 1, in <module>
    raubt_tagger = backoff_tagger(train_sents, [nltk.tag.AffixTagger,nltk.tag.UnigramTagger, nltk.tag.BigramTagger, nltk.tag.TrigramTagger],backoff=nltk.tag.RegexpTagger(word_patterns))
NameError: name 'backoff_tagger' is not defined

I am using Python 2.6.2. Kindly tell me why backoff_tagger is not accessible.

On May 1, 8:37 pm, Jacob Perkins <jap...@gmail.com> wrote:
> Hi,
>
> There's some examples at http://streamhacker.com/2008/12/03/part-of-speech-tagging-with-nltk-p...
> Also take a look at the demo function in the source:
> http://nltk.googlecode.com/svn/trunk/doc/api/nltk.tag.brill-pysrc.html
>
> Hope that helps,
> Jacob
> ---
> http://streamhacker.com
> http://twitter.com/japerk

fari

May 5, 2010, 4:56:25 PM

to nltk-users

Hi Jacob,
I am waiting for a response, please, so that I can proceed.

Steven Bird

May 5, 2010, 6:12:43 PM

to nltk-users

This error means that you didn't define a backoff tagger. Somewhere
earlier in your code you need a line:

>>> backoff_tagger = ...

Only then can you use it on the right-hand side:

>>> raubt_tagger = backoff_tagger(...)

Please see chapter 5 of the NLTK book for detailed information about
running taggers.
http://nltk.org/book

Jacob Perkins

May 6, 2010, 11:08:32 AM

to nltk-users

Hi fari,

backoff_tagger is a helper function I defined in part 1:
http://streamhacker.com/2008/11/03/part-of-speech-tagging-with-nltk-part-1/
It's simply a way to construct a backoff tagger with a list of tagger
class names. You can also construct one manually as detailed in the
NLTK Book.
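
Roughly, such a helper could look like the sketch below. It is an illustration and may differ from the code in the blog post; it only assumes that each tagger class takes the training sentences as its first argument and accepts a backoff keyword, which is how NLTK's sequential taggers (AffixTagger, UnigramTagger, BigramTagger, TrigramTagger) are constructed.

```python
def backoff_tagger(train_sents, tagger_classes, backoff=None):
    """Build a chain of taggers in which each one backs off to the previous.

    train_sents    -- list of tagged sentences, e.g. [[("the", "DT"), ...], ...]
    tagger_classes -- tagger classes to train, in order of increasing priority
    backoff        -- optional tagger used as the last resort
    """
    for cls in tagger_classes:
        # Train this tagger and let it fall back to the chain built so far.
        backoff = cls(train_sents, backoff=backoff)
    # The last tagger trained is the one that gets asked first.
    return backoff
```

Once a function like this is defined in the session (along with train_sents and word_patterns), the backoff_tagger(...) call from the earlier message no longer raises a NameError.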

Jacob
---
http://streamhacker.com
http://twitter.com/japerk

fari

May 16, 2010, 4:57:29 PM

to nltk-users

Hi all,
I am still confused. I have now run the Brill tagger for English, but my
research work is on the Urdu language (a South Asian, agglutinative
language). NLTK has support for Hindi, English, Marathi, Telugu, etc., but
not for Urdu, and this Brill tagger seems to be designed specifically for
English, even though the literature and research papers everywhere say
that it is a language-independent model. If that is so, can we run the
templates that Brill defined for English on Urdu? Would they work? And how
do I design a learning algorithm of my own?
I am really confused about the third step: after getting temporarily
tagged data from the initial annotator, how are the templates and rules
generated by the algorithm? Will NLTK's functions and APIs be sufficient
for this job?
I am done with the literature review, but for the practical work I am stuck
on this step. What should I do? Please help!
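
On the third step asked about above, here is a rough sketch in plain Python of how a transformation-based learner can turn a template into rules. It is a simplified illustration, not NLTK's implementation: it assumes a single template ("change tag OLD to NEW when the previous tag is PREV"), and the names learn_rules and apply_rule and the toy tags are made up for the example. Nothing in it depends on the language of the corpus.

```python
def apply_rule(tags, rule):
    """Return a copy of tags with the rule applied at every matching position."""
    old, new, prev = rule
    return [new if i > 0 and tag == old and tags[i - 1] == prev else tag
            for i, tag in enumerate(tags)]


def learn_rules(gold, initial, max_rules=5):
    """Greedily pick the rule that fixes the most errors, apply it, repeat."""
    current = [list(sent) for sent in initial]  # tags from the initial annotator
    rules = []
    for _ in range(max_rules):
        # 1. Instantiate the template at every position that is still wrong.
        scores = {}
        for cur, ref in zip(current, gold):
            for i in range(1, len(cur)):
                if cur[i] != ref[i]:
                    scores.setdefault((cur[i], ref[i], cur[i - 1]), 0)
        # 2. Score each candidate: tags it corrects minus tags it would break.
        for rule in scores:
            old, new, prev = rule
            for cur, ref in zip(current, gold):
                for i in range(1, len(cur)):
                    if cur[i] == old and cur[i - 1] == prev:
                        if ref[i] == new:
                            scores[rule] += 1
                        elif ref[i] == old:
                            scores[rule] -= 1
        if not scores:
            break  # no errors left that this template can describe
        best = max(sorted(scores), key=scores.get)
        if scores[best] <= 0:
            break  # no rule gives a net improvement
        # 3. Keep the best rule and re-tag the corpus with it.
        rules.append(best)
        current = [apply_rule(sent, best) for sent in current]
    return rules


# Toy data: the initial annotator tagged the word after "to" as NN.
gold = [["PRP", "VBP", "TO", "VB"], ["NN", "VBZ", "TO", "VB"]]
initial = [["PRP", "VBP", "TO", "NN"], ["NN", "VBZ", "TO", "NN"]]
print(learn_rules(gold, initial))  # [('NN', 'VB', 'TO')]
```

The templates only say which context to look at (here, the previous tag); the concrete rules come out of the errors found in the training data, which is why the same procedure can be run on a tagged corpus in any language.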

On May 6, 8:08 pm, Jacob Perkins <jap...@gmail.com> wrote:
> Hi fari,
>
> backoff_tagger is a helper function I defined in part 1:
> http://streamhacker.com/2008/11/03/part-of-speech-tagging-with-nltk-p...
> It's simply a way to construct a backoff tagger with a list of tagger
> class names. You can also construct one manually as detailed in the
> NLTK Book.
>
> Jacob

fari

May 19, 2010, 4:44:09 PM

to nltk-users

Hi all,
I would like to run the Brill demo function on Urdu POS-tagged data. I have
an Urdu corpus reader and a POS-tagged file. Can you help me?

1) How and where do I need to define templates for Urdu? Are the templates
that are given with NLTK (brill) compatible with Urdu?
2) If I want to run the Brill demo function on Urdu tags, what should I do?
Waiting for a reply, please.
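
For both questions, one possible starting point is sketched below. It assumes a recent NLTK 3.x release, whose Brill API differs from the 2010-era version discussed in this thread, and it uses a few made-up English sentences as stand-in data. The templates refer only to neighbouring tags and words, so nothing in them is specific to English; whether a given set works well for Urdu is something to check on held-out data.

```python
from nltk.tag import DefaultTagger, UnigramTagger
from nltk.tag.brill import Pos, Word
from nltk.tag.brill_trainer import BrillTaggerTrainer
from nltk.tbl.template import Template

# Toy stand-in data: replace with the tagged sentences from your own
# corpus reader (a list of sentences, each a list of (word, tag) pairs).
train_sents = [
    [("I", "PRP"), ("want", "VBP"), ("to", "TO"), ("run", "VB")],
    [("the", "DT"), ("run", "NN"), ("ended", "VBD")],
    [("they", "PRP"), ("like", "VBP"), ("to", "TO"), ("run", "VB")],
    [("a", "DT"), ("run", "NN"), ("helps", "VBZ")],
]

# Templates describe which context a rule may look at.
templates = [
    Template(Pos([-1])),   # tag of the previous token
    Template(Pos([1])),    # tag of the next token
    Template(Word([-1])),  # the previous word itself
]

# Initial annotator: unigram tagger, falling back to a default tag.
initial = UnigramTagger(train_sents, backoff=DefaultTagger("NN"))

# Learn transformation rules that correct the initial tagger's mistakes.
trainer = BrillTaggerTrainer(initial, templates, trace=0)
tagger = trainer.train(train_sents, max_rules=10, min_score=1)

for rule in tagger.rules():
    print(rule)
print(tagger.tag(["they", "want", "to", "run"]))
```

With so little data the learned rules are not meaningful; the point is only where the templates, the initial tagger, and the tagged sentences plug in.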
