RegexpParser Error: Transformation generated invalid chunkstring?

Chadobado

Apr 15, 2013, 4:14:00 PM
to nltk-...@googlegroups.com
I'm having what I'm sure is a "bonehead problem", but I'm struggling to find a way to deal with ValueErrors raised while parsing sentences in my custom corpus.  I would expect nltk to continue normally when it can't parse a sentence (for example, when the sentence doesn't contain the pattern or is malformed), but apparently it stops the show.

Is there a way to suppress errors with a debug_level?  Or perhaps I'm missing the correct way to handle this?  I see ChunkString supports debug_level, but RegexpParser does not.

Ex:

The jury's still out on the Freestyle.

ValueError: Transformation generated invalid chunkstring:
  <DET><N><><ADV><PRO><P><DET>{<NN>}<.>

Code ex:

from nltk.chunk import RegexpParser

cp = RegexpParser('''
  NP: {<DT>? <JJ>* <NN>*} # NP
  P: {<IN>}           # Preposition
  V: {<V.*>}          # Verb
  PP: {<P> <NP>}      # PP -> P NP
  VP: {<V> <NP|PP>*}  # VP -> V (NP|PP)*
  ''')

... #moving through sents in the corpus

result = cp.parse(pos_tagged)
print(result)



Thanks in advance,

Chad

Steven Bird

Apr 15, 2013, 4:59:42 PM
to nltk-users
Hi Chad,

You could set the debug_level to zero (http://nltk.org/api/nltk.chunk.html#nltk.chunk.regexp.ChunkString), or you could catch the ValueError and ignore it (use try..except syntax).
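
A minimal sketch of the try/except route, assuming only that the parser object has a parse() method that raises ValueError on a bad chunkstring. The helper name safe_parse is made up for illustration and is not part of nltk:

```python
def safe_parse(parser, tagged_sent):
    """Return parser.parse(tagged_sent), or None if it raises ValueError.

    `parser` is any object with a parse() method, e.g. a RegexpParser.
    (safe_parse is a hypothetical helper, not an nltk function.)
    """
    try:
        return parser.parse(tagged_sent)
    except ValueError as err:
        # Report the problem and carry on instead of stopping the whole run.
        print("Skipping sentence: %s" % err)
        return None
```

With the parser from the question, this could be called as safe_parse(cp, pos_tagged) inside the loop over sentences, skipping any sentence for which it returns None.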

The chunk string is invalid because it contains an empty item <>. Could this be due to an empty tag ("") in your input to the chunker?
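
One way to check for that, sketched in plain Python. The function names and the "UNK" placeholder are made up for illustration, and the sample sentence is only a guess at how the original input was tokenized and tagged, based on the chunkstring in the error message:

```python
def empty_tag_positions(tagged_sent):
    """Return the indices of tokens whose tag is empty or None."""
    return [i for i, (word, tag) in enumerate(tagged_sent) if not tag]


def replace_empty_tags(tagged_sent, placeholder="UNK"):
    """Return a copy of the sentence with empty tags replaced.

    "UNK" is an arbitrary placeholder chosen for this sketch.
    """
    return [(word, tag if tag else placeholder) for word, tag in tagged_sent]


# Hypothetical reconstruction of the failing sentence; the real
# tokenization and tags may differ.
sample = [("The", "DET"), ("jury", "N"), ("'s", ""), ("still", "ADV"),
          ("out", "PRO"), ("on", "P"), ("the", "DET"),
          ("Freestyle", "NN"), (".", ".")]

print(empty_tag_positions(sample))    # [2]
print(replace_empty_tags(sample)[2])  # ("'s", 'UNK')
```

Running a check like this over the corpus before chunking would show whether the tagger is producing empty tags, and for which tokens.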

-Steven Bird



