New kid just starting with LG

KokomoJ0

Jun 29, 2011, 2:38:04 PM
to link-grammar
Hello everyone, I have just become acquainted with LinkGrammar.

I am using the Windows OS. I have tried some text in the introductory web
version and found it impressive. Presuming it's more robust, I
would like to try the latest 4.7....n version, if possible in the
Windows OS (if an executable exists), on long, paragraph-length
sentences to get a better idea of its capabilities. Presently I do
not have a compiler installed (I haven't programmed since the Turbo C
3.0 days), so a Windows executable, if one exists, and a bit of
documentation on proper pathing for hooks would be greatly
appreciated. If not, any other direction or advice on the best way
to kick the tires under these circumstances (or otherwise) would be
very welcome.
tia
JJ

Bill Hayes

Jun 29, 2011, 6:28:10 PM
to link-g...@googlegroups.com
Hi JJ,
I have compiled the link grammar 4.7.4 code under MS VC++ 2010
and generated an .exe file which works under Vista and WinXP.
It needs access to two additional non link grammar files: 
regex2.dll and msvcr100d.dll.

Can I legally post my compiled .exe file online?
For example to a LaunchPad.net account?

I can post a link to a Microsoft page that provides a program 
to install the msvcr100d.dll file, and I can post the
regex2.dll file (or a link to a SourceForge page for it).

I'm happy to share the file if that's legal.

This was the first step that I needed to do in order to create
a Python wrapper for Link Grammar that would run under Windows.
The Python wrapper linked from the AbiSource site is aimed
at Fedora Linux.

Regards,
Bill Hayes






--
You received this message because you are subscribed to the Google Groups "link-grammar" group.
To post to this group, send email to link-g...@googlegroups.com.
To unsubscribe from this group, send email to link-grammar...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/link-grammar?hl=en.


KokomoJ0

Jun 29, 2011, 9:19:58 PM
to link-grammar
As best I can gather, it is fine as long as the license agreement is
included. I think this is applicable?

http://www.link.cs.cmu.edu/link/license.html

Link Grammar License

Below is the license for using the Link Grammar system. It was
modified in January 2005 to make it compatible with the GPL. Roughly
what it says is that the code and data can be used, free of charge for
all commercial and non-commercial purposes, with a very mild
requirement that this license be included with further distributions.

Copyright (c) 2003-2004 Daniel Sleator, David Temperley, and John
Lafferty. All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:

1. Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.

2. Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in
the documentation and/or other materials provided with the
distribution.

3. The names "Link Grammar" and "Link Parser" must not be used to
endorse or promote products derived from this software without
prior written permission. To obtain permission, contact
sle...@cs.cmu.edu

THIS SOFTWARE IS PROVIDED BY DANIEL SLEATOR, DAVID TEMPERLEY, JOHN
LAFFERTY AND OTHER CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED
WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDERS OR CONTRIBUTORS
BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY,
OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT
OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Daniel Sleator
Last modified: Sat Jan 15 17:19:52 EST 2005

Linas Vepstas

Jun 29, 2011, 10:17:21 PM
to link-g...@googlegroups.com
On 29 June 2011 17:28, Bill Hayes <bhay...@gmail.com> wrote:
> Hi JJ,
> I have compiled the link grammar 4.7.4 code under MS VC++ 2010
> and generated an .exe file which works under Vista and WinXP.
> It needs access to two additional non link grammar files:
> regex2.dll and msvcr100d.dll.
> Can I legally post my compiled .exe file online?

Yes. LG has the BSD license, so you can do pretty
much anything with it. The regex2.dll is probably
GPL, but regex sources are widely available so it's not
a problem. I assume that msvcr100d.dll has the usual
Microsoft license that allows redistribution.

> This was the first step that I needed to do in order to create
> a Python wrapper for Link Grammar that would run under Windows.
> The Python wrapper linked from the AbiSource site is aimed
> at Fedora Linux

Be sure to contact the python wrapper maintainer, and send him
your fixes/updates. If he doesn't respond, then I might be able
to host it as a part of the main LG source distribution. Let me
know.

BTW, LG should have a link-grammar.dll as well as a
link-parser.exe that contains only the command-line client,
right? The exe would load the various dll's.

--linas

KokomoJ0

Jun 30, 2011, 12:59:47 AM
to link-grammar
I found the 2 files so far.

Sounds like sending it compiled is ok :)

Bill Hayes

Jun 30, 2011, 1:38:58 AM
to link-g...@googlegroups.com
@KokomoJ0
I've temporarily put the link-grammar.exe file in my Google Docs here:
until I can create a hosting site and post the new link here.
The regex2.dll and msvcr100d.dll need to be in the same folder as the .exe.
I copied the link-grammar 'en' folder (which includes a 'words' folder) into the
folder which contains the .exe and .dll's.

@Linas
Re your question about the .exe calling dll's ...
I created a console project which generates a single
.exe file which includes all of the link-grammar code; the only dll's
required are the regex2 and msvcr100d.

Re a python wrapper, my goal would be to create a Windows link-grammar.dll
and a python module that accesses it, so this would be a Windows alternative
to Mario Ceresa's  Linux solution, not a fix or update.  
However if I can create a python module with the same
interface as Mario's I will do so.  I'll post a message here if I conquer SWIG
and get it all to work!

Regards
Bill



KokomoJ0

Jun 30, 2011, 3:05:35 AM
to link-grammar
Thanks Bill!

Got as far as a 4.0.regex open error.....

It seems that file may not have been included with the release?

Bill Hayes

Jun 30, 2011, 4:09:38 AM
to link-g...@googlegroups.com
KokomoJ0,
Your unzipped files should include the link-grammar\data\en folders.
The 'en' folder contains 11 files named '4.0.xxxx',
one of which is 4.0.regex, and also contains a 'words' folder.

So as long as 'link-grammar.exe' and the two dll's are in
the folder that contains the 'en' folder (named 'data' here) it should
find the en/4.0.regex file.

If that's not working, can you send me the output line
that starts: "Info: data_dir=C:\ ..."
The folder at the end of that name is where it expects to find
the .exe and two dll files and the 'en' folder which has the '4.0.' files.
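
A quick way to check the layout described above is a small script. This is only a sketch: the file names are the ones mentioned in this thread, and `missing_files` is a hypothetical helper, not part of Link Grammar.

```python
from pathlib import Path

# Files and folders named in this thread; adjust if your build differs.
EXPECTED = [
    "link-grammar.exe",
    "regex2.dll",
    "msvcr100d.dll",
    "en/4.0.dict",
    "en/4.0.affix",
    "en/4.0.regex",
    "en/words",
]

def missing_files(data_dir):
    """Return the expected entries that do not exist under data_dir."""
    root = Path(data_dir)
    return [name for name in EXPECTED if not (root / name).exists()]
```

Calling `missing_files(r"C:\Grammar")` would return an empty list when everything is in place, or the names that still need to be copied.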

If you can't get that to work I'll try uploading a zip of all of the 
needed files to Google Docs and posting the link to it.

Regards,
Bill






KokomoJ0

Jun 30, 2011, 9:29:19 PM
to link-grammar
Thanks!

OK, I took my time, checked everything over, and I think I sort of have it
working.

I opened up a command prompt. It went through and found everything it needed
to hook into, came back with an a7 missing, ignored statement, then gave
me linkage at the prompt.

For kicks I tried a run-on sentence that is maybe 64 words in length
by pasting it into the DOS window. It cranked away for some
time and went into a panic mode.

It seemed to label everything, so I thought I would try another
sentence. This time it dumped me into the root directory, and I have
not been able to get it to go to the linkage prompt again.

That is the play-by-play of where I am at with it. I am not sure what caused
that, and at this point I have not gone any further with it.

Got it up one time though.... Does this have the capability to open a
file and output to another by any chance? That may be in a FAQ that I
missed? I will try to see what happened and why it no longer
functions a bit later. I did check to see that nothing was left in
RAM, so it's not trying to run a mirror of itself. That's as far as I
got....

Bill Hayes

Jun 30, 2011, 9:55:23 PM
to link-g...@googlegroups.com
The program should not be modifying any of the files that it needs
in order to run, so your results seem strange.
(The .exe and .dll's are almost certainly read-only files.)
I would suggest deleting and re-installing all of the files.

Re long sentences, you might try entering the sentence into the 
to see what it generates.
I think I would have trouble composing a semantically correct 64-word
sentence, so I would not be surprised if that taxed the program's limits.

Bill




Linas Vepstas

Jun 30, 2011, 10:13:17 PM
to link-g...@googlegroups.com
On 30 June 2011 20:29, KokomoJ0 <mys...@wi.rr.com> wrote:

> I tried for kicks a run on sentence that is maybe 64 words in length

Parse time goes roughly as N^3, where N is the number of words.
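
To get a feel for what cubic growth means, here is a small arithmetic sketch. It assumes the cost scales exactly as N^3, which is only the rough rule of thumb given here; real timings also depend on the dictionary and on how ambiguous the sentence is.

```python
def relative_cost(n_words, baseline_words=10):
    """Parse cost relative to a baseline sentence, assuming cost ~ N**3."""
    return (n_words / baseline_words) ** 3

for n in (10, 20, 40, 64):
    print(f"{n:3d} words -> about {relative_cost(n):.0f}x the 10-word cost")
```

Under that assumption, doubling the sentence length multiplies the work by 8, and a 64-word sentence costs roughly 260 times as much as a 10-word one.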

> Got it up one time though....  Does this have the capability to open a
> file and output to another by any chance?

Try !var and !help at the command line.

> That may be in a faq that I
> missed?

Read the README file.
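
One way to get file-in, file-out behaviour from outside the program is to drive it from a script. The sketch below assumes the executable reads sentences from standard input, one per line, and writes its diagrams to standard output; `parse_file` and the file names are hypothetical, and the command is passed in by the caller so nothing about the parser's own options is assumed.

```python
import subprocess

def parse_file(command, in_path, out_path, timeout_s=600):
    """Feed in_path to `command` on stdin and save its stdout to out_path.

    `command` is a list such as ["link-grammar.exe"] (assumed to read
    sentences from standard input). Returns the process exit code.
    """
    with open(in_path, "rb") as src, open(out_path, "wb") as dst:
        result = subprocess.run(command, stdin=src, stdout=dst,
                                stderr=subprocess.STDOUT, timeout=timeout_s)
    return result.returncode
```

A call might look like `parse_file(["link-grammar.exe"], "sentences.txt", "parses.txt")`, run from the folder that holds the .exe and the 'en' folder.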

--linas

KokomoJ0

Jul 1, 2011, 1:44:20 AM
to link-grammar
OK, I got the new install to duplicate the original.

I took the files out of the 4.7.4 data\en and data\en\words folders and just
copied them into a directory on the C drive.

Here are the results:



C:\Grammar>link-grammar
link-grammar: Warning: locale was not UTF-8; force-setting to en_US.UTF-8
found ModuleHandle for current program so try to get Filename for current program
link-grammar: Info: GetModuleFileName=C:\Grammar\link-grammar.exe
found dir for current prog C:\Grammar
link-grammar: Info: data_dir=C:\Grammar
link-grammar: Info: object_open() trying C:\Grammar\en\4.0.dict
link-grammar: Info: Dictionary found at C:\Grammar\en\4.0.dict
link-grammar: Warning: The word "â?" found near line 8445 of en\4.0.dict matches
the following words:
â?
This word will be ignored.
link-grammar: Warning: The word "â?" found near line 8445 of en\4.0.dict matches
the following words:
â?
This word will be ignored.
link-grammar: Warning: The word "â?" found near line 21 of en\4.0.affix matches
the following words:
â?
This word will be ignored.
link-grammar: Warning: The word "â?" found near line 21 of en\4.0.affix matches
the following words:
â?
This word will be ignored.
link-grammar: Info: Dictionary version 4.7.4.
link-grammar: Info: Library version link-grammar-4.7.4. Enter "!help" for help.
linkparser>

KokomoJ0

Jul 1, 2011, 2:17:24 AM
to link-grammar
I know this is a really tough sentence, but I am testing its limits :)

Here is the sentence I used that blew it away last time. I tried it
again and it does not blow it away anymore, but it does kick out with
combinatorial explosion, timeout, panic, etc.

It was contended, however, in argument, that, "though originally" the
first ten amendments were adopted as limitations on federal power,
yet, in so far as they secure and recognize fundamental rights—common-
law rights—of the man, they make them privileges and immunities of the
man as a citizen of the United States, and cannot now be abridged by a
state under the fourteenth amendment.


Maybe split that into a few parts?

Bill Hayes

Jul 1, 2011, 3:01:03 AM
to link-g...@googlegroups.com
On initial startup I get exactly the same output lines that you listed
(well, the roots of the directory paths are different of course, but all else
is the same).
I have not tried to find out why the "â" character always generates 
the five warnings.

Below I list three versions of your sentence: nearly your original and two
shorter versions with the following output.  I think Linas' point is correct that
combinatorial explosion of phrases in long sentences overwhelms the program.
You might need to read up on its suggestion to 'set the max allowed disjunct cost lower'.
(I assume that's in a config file - I haven't gotten into LG that deeply.)


Output from link grammar for sentence:
"It was contended, however, in argument, that, though originally the first ten amendments were adopted as limitations on federal power, yet, in so far as they secure and recognize fundamental rights — common-law rights — of the man, they make them privileges and immunities of the man as a citizen of the United States, and cannot now be abridged by a state under the fourteenth amendment."

No complete linkages found.
Timer is expired!
Entering "panic" mode...
link-grammar: WARNING: Combinatorial explosion! nulls=16 cnt=292571136
Consider retrying the parse with the max allowed disjunct cost set lower.
Found 292571136 linkages (0 of 100 random linkages had no P.P. violations) at null count 16

----------------
However, when I trimmed the sentence down to this:
"It was contended, however, in argument, that, though originally the first ten amendments were adopted as limitations on federal power,they make them privileges and immunities of the man as a citizen of the United States, and cannot now be abridged by a state under the fourteenth amendment."

Link Grammar output:

No complete linkages found.
Timer is expired!
Entering "panic" mode...
Found 11520 linkages (50 of 100 random linkages had no P.P. violations) at null count 10
        Linkage 1, cost vector = (UNUSED=8 DIS=9 FAT=0 AND=0 LEN=38)

      +--------------MVx--------------+
      +----------MVa---------+        +------------Xc-----------+
 +-Ss-+                +--Xd-+-Xca-+Xd+---------Js---------+    |
 |    |                |     |     |  |                    |    |
it was.v-d [contended] , however.e , in [argument] [,] that.j-p , [though]


                    +-----------------S-----------------+
                    +-----------B**t----------+         |
      +------Ca-----+------Rn------+          |         |
      |       +--DD-+     +---Dmc--+----Spx---+         +---MVs--+--
      |       |     |     |        |          |         |        |
originally.e the first.a ten amendments.n were.v-d adopted.v-d as.e


      +-------------------Sp------------------+
      |        +--------Jp--------+           +-----------Opn----------+----
-Cs---+---Mp---+     +------A-----+           +--Ox-+        +---SJlp--+----
      |        |     |            |           |     |        |         |
limitations.n on federal.a power,they[?].n make.v them privileges.n and.j-n


------Mp---------+---Js--+     +---Js---+      +---Js---+
SJrp---+         |  +-Ds-+--Mp-+  +--Ds-+--Mp--+  +--DG-+      +-------Ss----
       |         |  |    |     |  |     |      |  |     |      |
immunities[!].n of the man.n as.e a citizen.n of the United States [,] [and]


                        +---------MVp---------+------------Js------
   +-----Ix----+        |       +--Js--+      |    +---------Ds----
---+     +--E--+---Pv---+--MVp--+ +-Ds-+      |    |        +-----A
   |     |     |        |       | |    |      |    |        |
cannot now.r be.v abridged.v-d by a state.n under the fourteenth.a


-----+
-----+
-----+
     |
amendment.s [.]

Press RETURN for the next linkage.
---------------

When I trimmed the sentence some more, it still reports No complete linkages found, but
it reports finding 50256 linkages.

Trimmed sentence:
"It was contended, however, that, though originally the first ten amendments were adopted as limitations on federal power,they make them privileges and immunities of the man as a citizen of the United States."

Output:

No complete linkages found.
Found 50256 linkages (129 of 1000 random linkages had no P.P. violations) at null count 1
        Linkage 1, cost vector = (UNUSED=1 DIS=4 FAT=0 AND=0 LEN=63)

                                                    +---------------------
      +---------------------MVs---------------------+-------------------Cs
      +----------------Ost---------------+          |          +----------
      +----------EBm---------+           |          |          |       +--
 +-Ss-+                +--Xd-+--Xc-+     |    +--Xd-+          |       +--
 |    |                |     |     |     |    |     |          |       |
it was.v-d [contended] , however.e , that.j-p , though.c originally.e the


------------------------------------------------------------------------------
------------------+
---COd------------+                                      +-------------------S
---DD----+        |                                      |        +--------Jp-
L--+     +---Dmc--+----Spx---+----Pv---+---MVs--+---Cs---+---Mp---+     +-----
   |     |        |          |         |        |        |        |     |
first.a ten amendments.n were.v-d adopted.v-d as.e limitations.n on federal.a


--------Xc--------------------------------------------------------------------
p------------------+                        +-----------------Mp--------------
-------+           +-----------Opn----------+----------Mp---------+---Js--+
-A-----+           +--Ox-+        +---SJlp--+----SJrp---+         |  +-Ds-+
       |           |     |        |         |           |         |  |    |
power,they[?].n make.v them privileges.n and.j-n immunities[!].n of the man.n


--------------------------------------+
                                      |
--+               +-------Js------+   |
  +---Js---+      |  +-----DG-----+   |
  |  +--Ds-+--Mp--+  |     +---G--+   |
  |  |     |      |  |     |      |   |
as.e a citizen.n of the United States .

Press RETURN for the next linkage.


Regards,
Bill



KokomoJ0

Jul 1, 2011, 10:35:58 AM
to link-grammar
Thanks,

I am also poking around to see if I can find that timer control. It
seems to be the trigger, so I would like to set the timer longer.

I did get it to take a 63 word sentence after adjusting the grammar.

The most memory usage was 200k and it survived without kicking out;
once it hit 217k it kicked out. That may be irrelevant.... I just took
note of it.

Linas Vepstas

Jul 1, 2011, 4:17:11 PM
to link-g...@googlegroups.com
On 1 July 2011 00:44, KokomoJ0 <mys...@wi.rr.com> wrote:

> C:\Grammar>link-grammar
> link-grammar: Warning: locale  was not UTF-8; force-setting to
> en_US.UTF-8

[...]


> link-grammar: Warning: The word "â?" found near line 8445 of en
> \4.0.dict matches
>  the following words:
>        â?
>        This word will be ignored.

Despite trying to set a UTF-8 locale, somehow Windows still didn't
actually do so. The dictionaries contain UTF-8 symbols, e.g.
for the Euro, the British pound symbol, a few miscellaneous
parenthesis types used in Asian countries, etc. There's also
the random accented word. If the locale isn't UTF-8, then
the string compares will fail for these words, the dictionary
loader will get confused, and you won't be able to correctly
parse text that contains these symbols.
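
The effect can be reproduced outside the parser. This sketch takes the Euro sign as an example of a UTF-8 dictionary entry and decodes its bytes with the Windows-1252 code page, which is only an assumption about what the console was using; the point is that the misread bytes no longer compare equal to the intended character.

```python
euro = "\u20ac"                      # the Euro sign
utf8_bytes = euro.encode("utf-8")    # three bytes: E2 82 AC

# Misreading those bytes as a single-byte code page (cp1252 assumed here)
# yields three unrelated characters, the first of which is "â".
misread = utf8_bytes.decode("cp1252")

print(utf8_bytes.hex())    # e282ac
print(len(misread))        # 3
print(misread == euro)     # False
```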

--linas

KokomoJ0

Jul 4, 2011, 2:47:33 AM
to link-grammar
Nice, thanks everyone. It's working and I am having fun with it. So
far it seems to handle up to about 65 words if the sentence is
grammatically proper.

Apparently it's not set up to recognize compounded words like
"land-trustee-council" and [al]lienble and things like dropped sections
". . . .", ;- and "-----"runon/together etc...

If I were to try and add features like that what would be my best
approach and would it gobble up too many resources?

Any thoughts or suggestions?

Bill Hayes

Jul 4, 2011, 10:14:52 PM
to link-g...@googlegroups.com
First, the general rule in software is to only optimize after functionality is
done.  Don't worry about resources up front.

Second, my first thought would be to write a pre-processing script
which just removes the dashes in compound words (maybe after
confirming they aren't in the dictionary files), delete the '...' or replace
it with 'etc.' or something similar in meaning, and remove square
brackets. 
I would consider 'runontogether' an infrequent error. I would think
spelling errors would be more common. However, if you had to
solve it you could tag it as a word not found in the dictionaries,
and see if chopping off longer and longer pieces returns a real word
and, if so, whether the remainder is one or more real words.
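
The pre-processing steps described above could be sketched roughly as follows. The function names, the regular expressions, and the `known_words` set (standing in for a lookup against the dictionary files) are all assumptions made for illustration; the run-on splitter tries short prefixes first and is only meant for short tokens.

```python
import re

def preprocess(text, known_words=frozenset()):
    """Apply the suggested clean-up steps to one sentence."""
    # Remove square brackets but keep what is inside them.
    text = re.sub(r"[\[\]]", "", text)
    # Replace a run of three or more dots (spaced or not) with "etc."
    text = re.sub(r"\.(?:\s*\.){2,}", " etc.", text)

    def join_compound(match):
        word = match.group(0)
        # Leave the hyphens alone if the compound is already a known word.
        return word if word.lower() in known_words else word.replace("-", "")

    # Hyphenated compounds: runs of letters joined by single hyphens.
    text = re.sub(r"[A-Za-z]+(?:-[A-Za-z]+)+", join_compound, text)
    return re.sub(r"\s+", " ", text).strip()

def split_runon(token, known_words):
    """Split an unknown token into known words, or return None."""
    if token == "":
        return []
    # Chop off longer and longer prefixes until one is a real word,
    # then check that the remainder also splits into real words.
    for end in range(1, len(token) + 1):
        head = token[:end]
        if head in known_words:
            rest = split_runon(token[end:], known_words)
            if rest is not None:
                return [head] + rest
    return None
```

For example, `split_runon("runontogether", {"run", "on", "together"})` gives `["run", "on", "together"]`, while a token that cannot be fully covered by known words gives `None`.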

Again, decide up front if this is one of the important tasks you need
to do right away.  Personally, my big concern in using Link Grammar
is how to pick one of the many parses it returns for a sentence.
Note that 'Bill saw Bob' returns three parses!
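
One crude way to choose is to rank linkages by the cost vector the parser prints. The sketch below parses lines of the form shown earlier in this thread and picks the lowest (UNUSED, DIS, AND, LEN) tuple; that ordering is an assumption made for illustration, `parse_cost_vector` and `rank_key` are hypothetical helpers, and the two sample lines are copied from the outputs above.

```python
import re

COST_LINE = re.compile(r"cost vector = \(([^)]*)\)")

def parse_cost_vector(line):
    """Turn 'cost vector = (UNUSED=8 DIS=9 ...)' into a dict of ints."""
    match = COST_LINE.search(line)
    if match is None:
        return None
    fields = (field.split("=") for field in match.group(1).split())
    return {key: int(value) for key, value in fields}

def rank_key(costs):
    """Lower is better: fewer unused words first, then lower disjunct cost."""
    return (costs["UNUSED"], costs["DIS"], costs["AND"], costs["LEN"])

lines = [
    "Linkage 1, cost vector = (UNUSED=8 DIS=9 FAT=0 AND=0 LEN=38)",
    "Linkage 1, cost vector = (UNUSED=1 DIS=4 FAT=0 AND=0 LEN=63)",
]
best = min((parse_cost_vector(line) for line in lines), key=rank_key)
print(best)
```

This only reorders what the parser already reports; it does not add any semantic knowledge about which reading is intended.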

Bill

Linas Vepstas

Jul 5, 2011, 7:54:48 PM
to link-g...@googlegroups.com
Hi,

On 4 July 2011 01:47, KokomoJ0 <mys...@wi.rr.com> wrote:

> Apparently its not set up to recognize compounded words like "land-
> trustee-council" and [al]lienble and things like dropped sections
> ". . . .",  ;- and "-----"runon/together etc...
>
> If I were to try and add features like that what would be my best
> approach and would it gobble up too many resources?

The problem is not one of gobbling resources. The problem is one of
writing rules that are too loose, which results in a combinatoric
explosion of possible parses. It's the combinatoric explosion that
causes resource consumption problems (BTW, I know of a way
of fixing this, but this would be a rather long and involved project).

The correct way of extending the dictionary is to first determine
what the correct parse should be. This can be done by exploring
sentences with similar constructions that do parse correctly.
Then try to figure out how to create a new rule that does what
you want. This is not easy, and takes some practice. Start
with easy things first, move on to harder things.

If you are trying to add new, non-verbal elements, such as
strings of dashes, or ellipsis, make sure that you collect
a large collection of example sentences to work with. You
will need this to get a stronger idea of the kinds of constructions
that are allowed, and what's prohibited. You need to write
narrow rules that only parse what's allowed. It's very, very
easy to write rules that are general and broad, and parse just
about any nonsense sentence you type in. Parsing nonsense
is not really a goal.

-- Linas

KokomoJ0

Jul 9, 2011, 4:17:42 PM
to link-grammar
Several things;

Has anyone tried any of these compilers?

http://www.freebyte.com/programming/cpp/

I have the old Borland Turbo C++ 3.0 and 3.1, but I am having difficulty
locating them. I am guessing grabbing a free compiler may do the job
well enough?

-----------------------

As a side note; I noticed brackets are being used in the parsed
output. I tend to use brackets according to the GPO to denote
something informational that is to be omitted from the page content.
Here are a few samples



Use ( ) for parenthetical phrases or sentences and to inclose inserted
words following the name, "Q.," or "A." If an entire sentence is in
( ) or [ ], the closing period should be within the ( ) or [ ].

The following examples illustrate the use of brackets, colons, and
parentheses:

At end of sentence [Laughter.]; within a sentence [laughter].

The paper was as follows [reads]:

I do not know. [Continues reading:]

The CHAIRMAN (to Mr. Smith).

Mr. KELLEY (to the chairman).

SEVERAL VOICES. Order !

The WITNESS. He did it that way [indicating].

Q. (By Mr. SMITH) Do you know these men? [handing witness a list].

(Objected to.)

A. (After examining list.) Yes; I do.

Q (Continuing.) A. (Reads:)

A. (Interrupting.)

If necessary to spell "Q." and "A.," the words in parentheses should
be lower-cased, the punctuation being outside the last parenthesis, as
follows:

Question (continuing). Answer (reads): [2 leads.]

DDQBv the COMMISSIONER: [Head.] "



I tend to use that style when I write anything formal. What are your
thoughts?

Here is a link to a text version of the GPO Styles Manual:

http://www.archive.org/stream/stylemanualofgov00unitrich/stylemanualofgov00unitrich_djvu.txt

There are a lot of errors because it's an OCR version.


--------------------------

Then there are the situations where words are hyphenated together to form
a single compound word, for example:

lower-cased

or more extreme:

correct-sentence-structure-communication-parse-syntax-grammar.

from a guy by the name of Miller, who argues that it would be a single
word creating or attempting to create a noun, etc....

I am not a grammar whiz kid by any means, and reading this code and
its definitions is proving to be an awesome grammar teacher, so I would
enjoy hearing everyone's opinions on these issues.

I am curious if anyone has written anything to dump the output to an
HTML page where a person could wave the mouse over the codes (Bt, Fl, Aw
etc.) and bring up the corresponding definitions and examples?
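
As a rough idea of what such a tool might look like, the sketch below wraps each link label in an HTML span whose `title` attribute shows a gloss on mouse-over. The `GLOSSES` table holds only a few short illustrative glosses for base link types and would need to be filled in from the Link Grammar documentation; `gloss_for` and `links_to_html` are hypothetical names.

```python
import html
import re

# A few short glosses for base link types; illustrative only.
GLOSSES = {
    "S": "subject to finite verb",
    "D": "determiner to noun",
    "J": "preposition to its object",
    "A": "adjective to following noun",
}

def gloss_for(label):
    """Look up a label by its leading uppercase letters (e.g. 'Ss' -> 'S')."""
    match = re.match(r"[A-Z]+", label)
    base = match.group(0) if match else ""
    return GLOSSES.get(base, "no gloss available")

def links_to_html(labels):
    """Render link labels as spans whose tooltip shows the gloss."""
    spans = [
        '<span title="{}">{}</span>'.format(
            html.escape(gloss_for(label), quote=True), html.escape(label))
        for label in labels
    ]
    return " ".join(spans)

print(links_to_html(["Ss", "Ds", "MVp"]))
```

Labels whose base type is not in the table, such as MVp here, fall back to a placeholder gloss rather than guessing.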
JJ

Bill Hayes

Jul 9, 2011, 4:33:14 PM
to link-g...@googlegroups.com
JJ,
Microsoft Visual C++ 2010 Express is free.
It has one of the most advanced IDEs available.

/bill
