--
You received this message because you are subscribed to the Google Groups "link-grammar" group.
To post to this group, send email to link-g...@googlegroups.com.
To unsubscribe from this group, send email to link-grammar...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/link-grammar?hl=en.
Yes. LG has the BSD license, so you can do pretty
much anything with it. The regex2.dll is probably
GPL, but regex sources are widely available, so it's not
a problem. I assume that msvcr100d.dll has the usual
Microsoft license that allows redistribution.
> This was the first step that I needed to do in order to create
> a Python wrapper for Link Grammar that would run under Windows.
> The Python wrapper linked from the AbiSource site is aimed
> at Fedora Linux
Be sure to contact the python wrapper maintainer, and send him
your fixes/updates. If he doesn't respond, then I might be able
to host it as a part of the main LG source distribution. Let me
know.
BTW, LG should have a link-grammar.dll as well as a
link-parser.exe that contains only the command-line client,
right? The exe would load the various DLLs.
--linas
I found the two files so far.
Sounds like sending it compiled is OK :)
> I tried for kicks a run on sentence that is maybe 64 words in length
Parse time goes roughly as N^3, where N = number of words.
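As a rough illustration of that cubic scaling (the exponent is from the message above; the 16-word baseline is a made-up figure, only the ratio matters):

```python
# Rough illustration of O(N^3) parse-time scaling, where N is the
# number of words. The 16-word baseline is hypothetical; the point
# is the relative slowdown, not any absolute timing.
def relative_cost(n_words, baseline_words=16):
    """Predicted slowdown relative to a baseline sentence length."""
    return (n_words / baseline_words) ** 3

# A 64-word sentence is 4x longer than a 16-word one, so it should
# take roughly 4^3 = 64 times as long to parse.
print(relative_cost(64))  # → 64.0
```

This is why one very long sentence can feel dramatically slower than many short ones of the same total length.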
> Got it up one time though.... Does this have the capability to open a
> file and output to another by any chance?
Try !var and !help at the command line.
> That may be in a faq that I
> missed?
Read the README file.
--linas
> C:\Grammar>link-grammar
> link-grammar: Warning: locale was not UTF-8; force-setting to
> en_US.UTF-8
[...]
> link-grammar: Warning: The word "â?" found near line 8445 of en
> \4.0.dict matches
> the following words:
> â?
> This word will be ignored.
Despite trying to set a UTF-8 locale, somehow Windows still didn't
actually do so. The dictionaries contain UTF-8 symbols, e.g.
for the Euro, the British pound symbol, a few miscellaneous
parenthesis types used in Asian countries, etc. There's also
the occasional accented word. If the locale isn't UTF-8, then
the string compares will fail for these words, the dictionary
loader will get confused, and you won't be able to correctly
parse text that contains these symbols.
--linas
On 4 July 2011 01:47, KokomoJ0 <mys...@wi.rr.com> wrote:
> Apparently it's not set up to recognize compounded words like "land-
> trustee-council" and [al]lienble, and things like dropped sections
> ". . . .", ;- and "-----" run-on/together etc...
>
> If I were to try and add features like that what would be my best
> approach and would it gobble up too many resources?
The problem is not one of gobbling resources. The problem is of
writing rules that are too loose, which results in a combinatoric
explosion of possible parses. It's the combinatoric explosion that
causes resource-consumption problems. (BTW, I know of a way
of fixing this, but it would be a rather long and involved project.)
The correct way of extending the dictionary is to first determine
what the correct parse should be. This can be done by exploring
sentences with similar constructions that do parse correctly.
Then try to figure out how to create a new rule that does what
you want. This is not easy, and takes some practice. Start
with easy things first, then move on to harder things.
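For instance, a hyphenated compound could in principle be given its own dictionary entry. The fragment below is purely hypothetical: the connector expression is illustrative only, and the right connectors should be copied from whatever similar nouns already parse correctly:

```
% Hypothetical 4.0.dict entry (connectors illustrative, not tested):
% treat the compound as an ordinary singular noun -- optional
% adjectives, a determiner, and subject/object/preposition links.
land-trustee-council: {@A-} & Ds- & (Ss+ or Os- or Js-);
```

Again, the safe way to get the expression right is to find a word that behaves the way you want, and reuse its definition.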
If you are trying to add new, non-verbal elements, such as
strings of dashes, or ellipses, make sure that you assemble
a large collection of example sentences to work with. You
will need this to get a stronger idea of the kinds of constructions
that are allowed, and what's prohibited. You need to write
narrow rules that only parse what's allowed -- it's very, very
easy to write rules that are general and broad, and parse just
about any nonsense sentence you type in. Parsing nonsense
is not really a goal.
-- Linas