Are you just trying to determine whether an input string is a
question? What form does "proper" take?
Determining whether an input string is a question or a statement is
not very difficult in English. A trailing question mark and certain
leading words (what/where/when/how) are pretty clear indicators.
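As a rough sketch of that heuristic (not anything link-grammar itself does), the check could look like the following. The function name and the word list are assumptions for illustration; the list holds only the four words named above and would need extending for real use.

```python
import re

# Assumed word list: only the four leading words mentioned above.
QUESTION_WORDS = {"what", "where", "when", "how"}


def looks_like_question(text: str) -> bool:
    """Guess whether text is a question from its surface form alone."""
    stripped = text.strip()
    if not stripped:
        return False
    # Indicator 1: a trailing question mark.
    if stripped.endswith("?"):
        return True
    # Indicator 2: the first word is a known question word.
    match = re.match(r"[A-Za-z']+", stripped)
    return bool(match) and match.group(0).lower() in QUESTION_WORDS
```

This only looks at the surface of the string; it says nothing about whether the question is grammatical.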
Raising the min/max_null_count is entirely appropriate when you aren't
getting a valid parse, but that leaves open the question of what you are
after with regard to a "proper" question.
--
Can you tell me more about this project? Is the problem that link-grammar
is giving valid parses for invalid sentences, or that too many valid sentences
are not parsed?
I've tried to make the parser coverage broader (i.e. have it accept a larger
number of "good" sentences), but the cost of this is that it now accepts a
far higher rate of "bad" sentences as well (too many people want to use lg
to parse tweets).
--linas
--
I fixed the first sentence, and checked the changes into the svn repo;
they'll appear in version 4.7.1.
There are two approaches to fixes. The first is to handle them, case by
case, in the dictionary file. This ranges from easy to sometimes quite
difficult, and is often rather tedious. But this is the only practical,
short-term approach.
I have ideas for long-term solutions, but no time to pursue them; these
ideas involve certain complex collections of graphs gleaned from text.
I'd love to do this work, but am unfortunately employed doing something
completely different.
-- Linas
I'm rather surprised by this low figure. I've got several batches of test
sentences, about 3000 total, and accuracy hovers around 90%, despite
my regularly adding new, bad sentences to the batches.
Hmm. But I now realize that you must be parsing "One Flew Over
the Cuckoo's Nest". Link-grammar fails horribly on dialogue, because it
has no idea where quotations start and end. It will not do well on
novels, and maybe only a little better on screenplays.
One way to fix this is to pre-process text, to convert input such as
    Then John said "Mary, please go now"
into a pair of sentences:
    Then John said X
and
    Mary, please go now
I think that could raise percentages significantly.
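A minimal sketch of such a pre-processing step, assuming quotations are marked with plain straight double quotes and never nest. The function name is hypothetical, and the placeholder "X" simply follows the example above; curly quotes, unbalanced quotes and quotes spanning several sentences are not handled.

```python
import re

# Assumption: quotations use straight double quotes and do not nest.
QUOTED = re.compile(r'"([^"]*)"')


def split_dialogue(sentence: str, placeholder: str = "X") -> list[str]:
    """Return the framing sentence followed by each quoted sentence."""
    # Collect the quoted spans, without their quote marks.
    quotes = [q.strip() for q in QUOTED.findall(sentence)]
    # Replace every quoted span in the framing sentence with the placeholder.
    frame = QUOTED.sub(placeholder, sentence).strip()
    return [frame] + [q for q in quotes if q]


for part in split_dialogue('Then John said "Mary, please go now"'):
    print(part)
# Then John said X
# Mary, please go now
```

Each returned string could then be handed to the parser as a separate sentence.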
--linas
--