
Ruby link grammar parses


Claus Spitzer

Jan 20, 2005, 2:14:55 PM
Greetings!
I've stumbled upon a little dilemma: I'm working on an analogy-based
lexico-semantic disambiguator. I'll spare you the details, but if
you're really interested, it's related to this:
http://portal.acm.org/citation.cfm?id=992694&dl=GUIDE&coll=Portal&CFID=36359232&CFTOKEN=49186190

I'm basically attempting to reproduce that work. Now to the real
problem: I am in need of a parser to extract verb-subject/object pairs
from texts. One of the professors here directed me to Link Grammar (
http://www.link.cs.cmu.edu/link/ ), which does what I would need...
except that it is written in C. I've meddled with the API a bit, but
to my dismay I discovered that (mostly) all my knowledge of C
programming has waned, to the point of not even being able to get
their examples to work (in my defense, I'm _really_ good with Ruby and
Smalltalk now :-D ). Still, I am in need of a parser. Now, the forum
for Link Grammar indicates (
http://hartford.lti.cs.cmu.edu/linkparser/phorum/read.php?1,7 ) that
there is a Perl interface available for it, so I said to myself, "Hey,
why not see if there is something similar available in Ruby?" I looked
at RAA, but could not find anything related (perhaps I overlooked some
category?). So the question at hand is: Is there any English text
parser available in Ruby that would (somewhat) match my needs? I
_could_ write my programs in Perl, but I'd much rather use the one
language I've come to love: Ruby.
Cheers...
C.W.S.


Florian Gross

Jan 20, 2005, 5:04:50 PM
Claus Spitzer wrote:

> Greetings!

Moin.

> I am in need of a parser to extract verb-subject/object pairs
> from texts. One of the professors here directed me to Link Grammar (
> http://www.link.cs.cmu.edu/link/ ), which does what I would need...
> except that it is written in C. I've meddled with the API a bit, but
> to my dismay I discovered that (mostly) all my knowledge of C
> programming has waned, to the point of not even being able to get
> their examples to work [...] Is there any English text
> parser available in Ruby that would (somewhat) match my needs?

While I can't give you a definite answer as to whether something like
that is already available, Ruby/DL might help a lot with writing an
interface to the C library in pure Ruby. It's part of standard Ruby and
there's quite a lot of documentation available on the web.
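
As a rough illustration of the approach: in current Ruby versions Ruby/DL has been removed and Fiddle takes its place, so the minimal sketch below uses Fiddle rather than DL. It binds libc's `strlen` purely as a stand-in; it does not touch Link Grammar, and it assumes a Linux or macOS Ruby where `Fiddle.dlopen(nil)` exposes the C library's symbols.

```ruby
require "fiddle"

# Open the symbols already loaded into the running process
# (assumption: this includes libc, as on typical Linux/macOS builds).
libc = Fiddle.dlopen(nil)

# Describe the C prototype: size_t strlen(const char *s);
strlen = Fiddle::Function.new(
  libc["strlen"],
  [Fiddle::TYPE_VOIDP],
  Fiddle::TYPE_SIZE_T
)

# Ruby strings are passed as C char pointers.
puts strlen.call("link grammar")
```

Binding a real library would follow the same pattern: open the shared object by path instead of `nil`, then declare one `Fiddle::Function` per C function you need.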

Kaspar Schiess

Jan 21, 2005, 6:33:44 AM
(In response to news:bb1334190501...@mail.gmail.com by Claus
Spitzer)

> Is there any English text
> parser available in Ruby that would (somewhat) match my needs? I
> _could_ write my programs in Perl, but I'd much rather use the one
> language I've come to love: Ruby.

Hello Claus,

I have hacked together a small proof of concept extension that binds to
LinkParser. It is by far not complete, but you can download it from here:

www.tua.ch/ruby/link/050121-link-4.1b.tar.gz

I would like to maintain this library, although a first version will not
be out before next month. Please send me whatever changes you make via
'darcs' or as a unified diff.

To get started, run 'make' in the link package itself. Then go to the obj
directory and run 'ar r liblink.a *.o'; copy the resulting library into
your library path.

Then change to the /ext directory and run 'ruby extconf.rb', then 'make'
and 'make install'.

You should now be able to run tc_linkparser.rb in /tests.

Don't expect too much, it's only a base for what is to be. But look on
the bright side, you get to decide on the API ;). Plus it gets you
started with C again.

Hope this helps,
kaspar

hand manufactured code - www.tua.ch/ruby

Claus Spitzer

Jan 21, 2005, 1:20:41 PM
Thanks Kaspar! Last night I found LinkParser (
http://raa.ruby-lang.org/list.rhtml?name=linkparser ), but I will look
into your code as well. Again, many thanks for the quick assistance.

Kaspar Schiess

Jan 21, 2005, 5:59:44 PM
(In response to news:bb13341905012...@mail.gmail.com by Claus
Spitzer)

> Again, many thanks for the quick assistance.

Tell me if you need it; the API took shape quickly after another few hours of
'hack mode' just before the we..

Kaspar Schiess

Jan 28, 2005, 6:14:02 PM
After some more hacking, downloading this tgz will give you full access to
the link structure. Just look at the test case ;).

Note that whatever I've written is under the Ruby license, but the other
stuff is under a GPL license.

www.tua.ch/ruby/link/050128-link-4.1b-for-ruby.tgz

This is probably going to be a real release sometime soon.

best regards,

Claus Spitzer

Jan 31, 2005, 1:28:24 PM
Thanks for continuing to work on this. Currently my progress with Link
Grammar has been running into a few roadblocks, the most important one
being performance with the original Link Grammar tool. As for your
implementation, I can't say that I've been able to successfully run
the tests, but I suspect that is due to the several different versions
of the linkparser library lying around on my machine. I'll give it
another try once I get access to a clean machine.
Cheers!
-CWS

Kaspar Schiess

Feb 1, 2005, 6:17:26 AM
(In response to news:bb13341905013...@mail.gmail.com by Claus
Spitzer)

> I'll give it
> another try once I get access to a clean machine.

Hey, just tell me if you need help with this. You are the only user as far as
I can tell (that is, until I figure out something useful for this tool).

Performance issues? Isn't the thing rather well optimized? I wonder what
you are running through Lingua::LinkParser to get these issues...

Claus Spitzer

Feb 1, 2005, 9:34:08 AM
Thanks for the offer of help, I'll keep it in mind. I mainly ran into
problems with the tests...

----8<----

baron@daedalus:~/tmp/link-4.1b-ruby/tests$ ruby tc_linkparser.rb
Loaded suite tc_linkparser
Started
Opening ./4.0.dict
F Opening ./4.0.dict
E
Finished in 0.010642 seconds.

1) Failure:
test_basic(TestLinkParser) [tc_linkparser.rb:30]:
Exception raised:
Class: <StandardError>
Message: <"Could not find dictionary.">
---Backtrace---
tc_linkparser.rb:16:in `initialize'
tc_linkparser.rb:16:in `new'
tc_linkparser.rb:16:in `initialize'
tc_linkparser.rb:10:in `new'
tc_linkparser.rb:10:in `new'
/usr/lib/ruby/1.8/singleton.rb:95:in `instance'
/usr/lib/ruby/1.8/singleton.rb:84:in `instance'
tc_linkparser.rb:20:in `dict'
tc_linkparser.rb:31:in `test_basic'
tc_linkparser.rb:30:in `assert_nothing_raised'
tc_linkparser.rb:30:in `test_basic'
---------------

2) Error:
test_link_each(TestLinkParser):
StandardError: Could not find dictionary.
tc_linkparser.rb:16:in `initialize'
tc_linkparser.rb:16:in `new'
tc_linkparser.rb:16:in `initialize'
tc_linkparser.rb:10:in `new'
tc_linkparser.rb:10:in `new'
/usr/lib/ruby/1.8/singleton.rb:95:in `instance'
/usr/lib/ruby/1.8/singleton.rb:84:in `instance'
tc_linkparser.rb:20:in `dict'
tc_linkparser.rb:65:in `test_link_each'

2 tests, 1 assertions, 1 failures, 1 errors
baron@daedalus:~/tmp/link-4.1b-ruby/tests$ ls
4.0.affix 4.0.constituent-knowledge 4.0.knowledge tiny.dict
. 4.0.batch 4.0.dict tc_linkparser.rb words
baron@daedalus:~/tmp/link-4.1b-ruby/tests$

----8<----

I have two guesses as to what the problem is:
- I copied liblink.a to the wrong directory(ies), or
- 4.0.dict needs to be in the same directory as the library.
I haven't investigated further on it due to unrelated personal reasons, though.

The performance issues themselves are not with Lingua::LinkParser, but
with Link Grammar itself, when running it on the sentences in
http://notpublic.wrong.button.com/sent.txt . Those were extracted
by a friend from a NYT article. Some time out, others just fail, and
on one Link Grammar just refuses to work because the sentence is
longer than 70 words! Fortunately for me, the first phase of my work
does not require _all_ sentences to be parsed - just enough to extract
a sufficient number of VS/VO pairs (~40000).
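
Since not every sentence has to be parsed, one option is to drop over-long sentences before they ever reach the parser. The following is only a sketch: the 70-word limit is the one reported above, while `word_count` and `partition_by_length` are hypothetical helper names, and counting whitespace-separated tokens is an assumption about how Link Grammar counts words.

```ruby
# Limit reported in this thread: sentences longer than 70 words are refused.
MAX_WORDS = 70

# Hypothetical helper: counts whitespace-separated tokens
# (an approximation of the parser's own tokenization).
def word_count(sentence)
  sentence.split.size
end

# Hypothetical helper: returns [sentences to try, sentences to skip].
def partition_by_length(sentences, max_words = MAX_WORDS)
  sentences.partition { |s| word_count(s) <= max_words }
end

sentences = [
  "The cat chased the mouse.",
  (["word"] * 71).join(" ") + "."
]
parseable, skipped = partition_by_length(sentences)
puts "#{parseable.size} to parse, #{skipped.size} skipped"
```

Filtering up front avoids spending parser time on input that is known to be rejected; timeouts and other failures would still need handling around the actual parse call.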

Lastly: I've been in contact with Martin Chase, who worked on the
Linguistics and LinkParser ports
(http://raa.ruby-lang.org/list.rhtml?name=linkparser and
http://www.deveiate.org/code/linguistics-overview.html - a Ruby gem is
also available for the latter). I plan to use Linguistics because I
also need to make use of WordNet. So far LinkParser is unoptimized (to
the point that pruning, and probably memoization too, is a
necessity), and needs the words in the sentence to be tagged if used
with 4.0.dict . The plan I have so far is to use your Link Grammar
wrapper instead of LinkParser for the Linguistics module, since it is
considerably faster.
Cheers!
-CWS

Kaspar Schiess

Feb 1, 2005, 3:53:17 PM
(In response to news:bb1334190502...@mail.gmail.com by Claus
Spitzer)

> 1) Failure:
> test_basic(TestLinkParser) [tc_linkparser.rb:30]:
> Exception raised:
> Class: <StandardError>
> Message: <"Could not find dictionary.">

Yeah, this looks as though the dictionary could not be found. I usually run
the tests directly from the tests directory, where I also have all those
dict files. If that does not work (and it certainly seems not to), there is
the option of setting the DICTPATH environment variable.

I was also looking at how to give Dictionary.new an optional argument
that would be the path to the dictionaries.
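
One way such a lookup could work is sketched below. This is not the extension's actual code: `resolve_dict_dir` is a hypothetical helper, and the precedence it uses (explicit argument, then DICTPATH, then the current directory) is an assumption, as is treating DICTPATH as a single directory.

```ruby
require "tmpdir"

# Hypothetical helper, not part of the actual extension: returns the first
# directory that contains 4.0.dict, trying the explicit argument, then
# DICTPATH, then the current directory.
def resolve_dict_dir(explicit_dir = nil, env = ENV)
  candidates = [explicit_dir, env["DICTPATH"], Dir.pwd].compact
  found = candidates.find { |dir| File.file?(File.join(dir, "4.0.dict")) }
  raise StandardError, "Could not find dictionary." unless found
  found
end

# Demonstration with a throwaway directory holding an empty 4.0.dict.
Dir.mktmpdir do |dir|
  File.write(File.join(dir, "4.0.dict"), "")
  puts resolve_dict_dir(dir) == dir
end
```

A Dictionary.new that accepted an optional path could call a helper like this first and pass the result on to the C library.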

Sure hope we can sort this out. And yeah, the Ruby version seems to be very
slow - I don't really understand why this should be rewritten.
