I'm having what I'm sure is a "bonehead problem", but I'm struggling to find a way to deal with ValueErrors while parsing sentences in my custom corpus. I would expect nltk to carry on normally when it can't parse a sentence, for example because the sentence doesn't contain the pattern or is malformed. Instead, the exception apparently stops the whole run.
Is there a way to suppress these errors, perhaps with a debug_level? Or am I missing the correct way to handle this? I see that ChunkString supports debug_level, but RegexpParser does not.
Example sentence and the error it produces:
The jury's still out on the Freestyle.
ValueError: Transformation generated invalid chunkstring:
<DET><N><><ADV><PRO><P><DET>{<NN>}<.>
Code example:
from nltk.chunk import RegexpParser
cp = RegexpParser('''
NP: {<DT>? <JJ>* <NN>*} # NP
P: {<IN>} # Preposition
V: {<V.*>} # Verb
PP: {<P> <NP>} # PP -> P NP
VP: {<V> <NP|PP>*} # VP -> V (NP|PP)*
''')
... # moving through the sents in the corpus
result = cp.parse(pos_tagged)
print(result)
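For reference, the blunt workaround would be to wrap each parse call in a try/except so that one bad sentence does not end the run. This is only a minimal sketch: it assumes the failure always surfaces as a ValueError from cp.parse (as in the error above) and that skipping the sentence is acceptable. The helper name parse_all and its arguments are hypothetical, not part of nltk.

```python
def parse_all(parser, tagged_sents):
    """Parse each POS-tagged sentence, skipping any that raise ValueError.

    `parser` is any object with a .parse(tagged_sentence) method, such as
    the RegexpParser instance `cp` above. This helper is a hypothetical
    illustration, not an nltk function.
    """
    parsed = []   # successful parse results, in corpus order
    failed = []   # (index, sentence, error message) for skipped sentences
    for index, sent in enumerate(tagged_sents):
        try:
            parsed.append(parser.parse(sent))
        except ValueError as err:
            # Record the failure and move on to the next sentence.
            failed.append((index, sent, str(err)))
    return parsed, failed
```

It would be called as parsed, failed = parse_all(cp, tagged_sents), where tagged_sents stands for the corpus's list of tagged sentences. Keeping the failed list makes it possible to inspect the offending sentences afterwards; the empty <> in the chunkstring above suggests that a token with an empty tag may be involved, though that is a guess.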
Thanks in advance,
Chad