Dear GF friends,
I assembled a simple web service which makes it easier for you to test
the first prototype of the robust statistical parser. This is a
combination of the English resource grammar and a statistical model
which I trained on the Penn Treebank. The work is still at a very early
stage, but there is at least something working. The web interface is
here:
http://www.grammaticalframework.org/demos/robust-parser/
Some of the known limitations are:
- The tokenizer and the named entity recognizer are very primitive.
For better results you should start the sentence with a lowercase letter
unless the first word is a proper name or one of the English adjectives
that are written with a capital letter (English, Swedish, South, etc.).
Do not terminate the sentence with a period.
- The statistical model is not lexicalized. This means that you
should not expect to get the right PP attachment. Currently a PP is
always attached to the verb, since this is more common in
the treebank.
- The parser is still slow and memory hungry for sentences that
are outside the scope of the grammar. Be prepared for slow responses and
sometimes even failures.
- It is still very new and may have a lot of bugs.
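
If you want to prepare sentences automatically before pasting them in, the input rules from the first point above can be sketched roughly as follows. This is only an illustration and not part of the parser or the web service: the function name prepare_sentence and the KEEP_CAPITALIZED set are hypothetical, and the set holds just the three adjectives mentioned above, so you would have to add your own proper names to it.

```python
# Hypothetical helper, not part of the parser or the web service.
# The set is only an illustrative sample; extend it with the proper
# names and capitalized adjectives that occur in your own sentences.
KEEP_CAPITALIZED = {"English", "Swedish", "South"}


def prepare_sentence(sentence, keep_capitalized=KEEP_CAPITALIZED):
    """Normalize a sentence according to the input rules listed above."""
    text = sentence.strip()

    # Rule: do not terminate the sentence with a period.
    if text.endswith("."):
        text = text[:-1].rstrip()
    if not text:
        return text

    # Rule: start with a lowercase letter unless the first word
    # is one that should stay capitalized.
    first, sep, rest = text.partition(" ")
    if first not in keep_capitalized:
        first = first[0].lower() + first[1:]
    return first + sep + rest


if __name__ == "__main__":
    print(prepare_sentence("The cat sleeps on the mat."))
    print(prepare_sentence("English is a language."))
```

Note that a sentence starting with a proper name that is not in the set will be lowercased as well, so the set has to be maintained by hand.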
Best Regards,
Krasimir