Running multiple threads in Link Grammar Parser?

Peter Szolovits

Jun 18, 2011, 4:48:47 PM
to link-g...@googlegroups.com
I wonder if someone can enlighten me on the possibility of using multi-core systems to run multiple simultaneous threads to parse several sentences in parallel. I vaguely remember a comment on this mailing list that the code is reentrant, which would imply that it should be easy, but I can't find the old message. I am running the parser by foreign function call from Allegro CommonLisp, and my thought is that I can

a. Load up the parser, dictionaries, parse-options, etc, once in a parent process.

b. Spawn a handful of separate processes, each of which uses those resources to create a sentence (via sentence_create), then sentence_split, and then sentence_parse, and finally extract the data of interest from the result of the parse.

I have done a poor man's equivalent of this by starting numerous shells and running a complete parser in each, but it would be more elegant (and perhaps more efficient) to load the parser once and simply parallelize the parsing task.

Is this feasible?

Thank you. --Pete Szolovits

Linas Vepstas

Jun 18, 2011, 10:49:51 PM
to link-g...@googlegroups.com
Hi Peter,

On 18 June 2011 15:48, Peter Szolovits <p...@mit.edu> wrote:
> I wonder if someone can enlighten me on the possibility of using multi-core systems to run multiple simultaneous threads to parse several sentences in parallel.  I vaguely remember a comment on this mailing list that the code is reentrant, which would imply that it should be easy, but I can't find the old message.

The README file says more.

I believe that it should "just work", as long as you hold off using
multiple threads until after the dictionaries are fully loaded (i.e. a
few seconds). However:

-- It is very lightly tested. In theory, it should work flawlessly,
because the only remaining global variables are read-only.
However, it is the nature of software to be buggy,
so ... I'll take bug reports ...

-- There may be *some* performance benefit, since sharing
one dictionary among all threads makes it more likely that
everything fits in the CPU cache. However, I'd be surprised if
this was more than a 5% effect ... but hey, stranger things
have happened.

> I am running the parser by foreign function call from Allegro CommonLisp, and my thought is that I can
>
> a. Load up the parser, dictionaries, parse-options, etc, once in a parent process.
>
> b. Spawn a handful of separate processes,

I am confused when you say "spawn a handful of separate
processes", since processes are not threads (or, at least, that
isn't the usual name for them, but maybe in Allegro ... ???)

> each of which uses those resources to create a sentence (via sentence_create), then sentence_split, and then sentence_parse, and finally extract the data of interest from the result of the parse.

should "just work"
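
For illustration, the pattern discussed in this thread (load the shared dictionary once, then let each thread build and parse its own sentence) can be sketched as below. This is only a rough sketch in Python, not Link Grammar code: `load_dictionary`, `parse_sentence`, and `parse_all` are hypothetical stand-ins, and the "dictionary" is just a frozen set of words. The point it tries to show is the ordering: the read-only resource is fully built before any worker thread starts, and all per-sentence state stays local to the call.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical stand-ins: none of these are real Link Grammar bindings.
def load_dictionary():
    """Stand-in for the slow, one-time dictionary load."""
    return frozenset({"the", "a", "cat", "dog", "chased"})


def parse_sentence(dictionary, text):
    """Stand-in for the sentence_create / sentence_split / sentence_parse steps.

    The dictionary is only read here; everything created is local to this call.
    """
    words = text.lower().rstrip(".").split()               # "split" step
    unknown = [w for w in words if w not in dictionary]    # "parse" step
    return {"text": text, "words": words, "unknown": unknown}


def parse_all(sentences, workers=4):
    # Finish loading the shared resource before any thread is started.
    dictionary = load_dictionary()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the input sentences.
        return list(pool.map(lambda s: parse_sentence(dictionary, s), sentences))


results = parse_all(["The cat chased a dog.", "The dog chased a bird."])
```

With the real parser, the stand-in body would presumably be replaced by foreign function calls to the library, with each thread creating and freeing its own sentence object.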

--linas
