RegexpParser Error: Transformation generated invalid chunkstring?

Chadobado

Apr 15, 2013, 4:14:00 PM

to nltk-...@googlegroups.com

I'm having what I'm sure is a "bonehead problem", but I'm struggling to find a way to deal with ValueErrors while parsing sentences in my custom corpus. I would expect nltk to continue normally when it can't parse a sentence, whether because the sentence doesn't contain the pattern, is malformed, etc. Instead, it apparently stops the show.

Is there a way to suppress errors with a debug_level? Or am I missing the correct way to handle this? I see that ChunkString supports debug_level, but RegexpParser does not.

Ex:

The jury's still out on the Freestyle.

ValueError: Transformation generated invalid chunkstring:

Code ex:

from nltk.chunk import RegexpParser

cp = RegexpParser('''
NP: {<DT>? <JJ>* <NN>*} # NP
P: {<IN>} # Preposition
V: {<V.*>} # Verb
PP: {<P> <NP>} # PP -> P NP
VP: {<V> <NP|PP>*} # VP -> V (NP|PP)*
''')

... #moving through sents in the corpus
result = cp.parse(pos_tagged)
print result

Thanks in advance,

Chad

Steven Bird

Apr 15, 2013, 4:59:42 PM

to nltk-users

Hi Chad,

You could set the debug_level to zero (http://nltk.org/api/nltk.chunk.html#nltk.chunk.regexp.ChunkString), or you could catch the ValueError and ignore it (use try..except syntax).

The chunk string is invalid because it contains an empty item <>. Could this be due to an empty tag ("") in your input to the chunker?

-Steven Bird
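
For anyone finding this thread later, here is a minimal sketch of the two suggestions in the reply above: catching the ValueError so the loop keeps going, and checking the input for empty tags. The helper names `parse_sentences` and `find_empty_tags` are made up for illustration and are not part of nltk. The sketch assumes each sentence is a list of (token, tag) pairs and that the parser object has a `.parse()` method, like the `cp` in the original post.

```python
def find_empty_tags(tagged_sent):
    """Return (index, token) pairs whose POS tag is empty or None.

    Hypothetical helper: assumes tagged_sent is a list of (token, tag) pairs.
    """
    return [(i, tok) for i, (tok, tag) in enumerate(tagged_sent) if not tag]


def parse_sentences(parser, tagged_sents):
    """Parse each tagged sentence, skipping any that raise ValueError.

    Hypothetical helper: `parser` is any object with a .parse(sentence)
    method. Returns (trees, failures), where failures holds
    (sentence, error message) pairs for later inspection.
    """
    trees, failures = [], []
    for sent in tagged_sents:
        try:
            trees.append(parser.parse(sent))
        except ValueError as err:
            # Record the failure instead of stopping the whole run.
            failures.append((sent, str(err)))
    return trees, failures
```

Usage would look something like `trees, failures = parse_sentences(cp, tagged_corpus)`, where `tagged_corpus` stands in for however you iterate over your own corpus. Running `find_empty_tags` on each failed sentence should show whether an empty tag is what produced the invalid chunk string.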
