Hello,
I recently cloned Pattern from GitHub on a MacBook running OS X 10.9.5.
There seems to be a bug in the way the parsetree() function handles punctuation at the end of a sentence. Here is what I observe:
from pattern.en import parsetree
print parsetree("Word1 Word2 Word3")
This works as expected with the output:
[Sentence('Word1/NNP/B-NP/O Word2/NNP/I-NP/O Word3/NNP/I-NP/O')]
But when I add a period at the very end of the sentence:
print parsetree("Word1 Word2 Word3.")
This raises a "list index out of range" error:
  File "/pattern/pattern/text/__init__.py", line 1115, in find_tokens
    and tokens[j] in ("...", ".", "!", "?", EOS) or tokens[j] in quotes:
IndexError: list index out of range
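The traceback would be consistent with a look-ahead index (j) running past the end of the token list when the last token is punctuation. Below is a minimal sketch of that failure mode, under the assumption that the scan lacks a bounds check. The function names and the SENTENCE_END tuple are hypothetical and this is not Pattern's actual find_tokens code:

```python
# Hypothetical illustration only; NOT Pattern's actual find_tokens code.
SENTENCE_END = ("...", ".", "!", "?")

def trailing_punct_unguarded(tokens, i):
    """Count sentence-ending tokens starting at index i, with no bounds check."""
    j = i
    while tokens[j] in SENTENCE_END:  # IndexError once j reaches len(tokens)
        j += 1
    return j - i

def trailing_punct_guarded(tokens, i):
    """Same scan, but stop when j reaches the end of the list."""
    j = i
    while j < len(tokens) and tokens[j] in SENTENCE_END:
        j += 1
    return j - i

tokens = ["Word1", "Word2", "Word3", "."]
```

With these tokens, the unguarded version fails only when the final token is punctuation, which matches the behaviour above: the sentence without a trailing period parses fine, and the one with it raises IndexError.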
I did not have this problem two weeks ago, although back then I was using a Linux machine.
Thank you in advance for your time.
Cheers,
ilker