ANNOUNCE: Link Grammar version 5.9.0


Linas Vepstas

Apr 25, 2021, 6:44:56 PM
to link-grammar, opencog
I'm proud to announce that a new version 5.9.0 of the Link Grammar system is now available. It contains a fairly long list of fixes, and one rather notable new feature: it can now generate sentences!

Sentence generation proceeds through a fill-in-the-blanks procedure: given a sentence template with wild-cards in it, the dictionary is combed for any words that could be placed there to produce grammatical sentences. This is being actively used in the language-learning project to generate new random corpora that contain grammatically valid sentences according to the specified grammar. In this way, the grammar that is learned can be compared to the grammar used to generate the corpus.
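To make the fill-in-the-blanks idea concrete, here is a minimal toy sketch in Python. It is not the link-generator code and does not use the Link Grammar API: the dictionary, the part-of-speech tags, the single valid pattern, and the function names are all hypothetical stand-ins, and every word in the template must appear in the toy dictionary.

```python
from itertools import product

# Hypothetical toy dictionary: word -> tag. This is NOT the Link Grammar
# dictionary format; it only stands in for "the grammar" in this sketch.
TOY_DICT = {"the": "D", "a": "D", "cat": "N", "dog": "N",
            "runs": "V", "sleeps": "V"}

# Hypothetical toy grammar: the set of tag sequences considered grammatical.
VALID_PATTERNS = {("D", "N", "V")}


def is_grammatical(words):
    """Return True if the tag sequence of `words` is a valid pattern."""
    return tuple(TOY_DICT[w] for w in words) in VALID_PATTERNS


def fill_blanks(template, wildcard="*"):
    """Try every dictionary word in every wildcard slot (brute force)
    and keep only the candidates that the toy grammar accepts."""
    slots = [i for i, w in enumerate(template) if w == wildcard]
    results = []
    for choice in product(sorted(TOY_DICT), repeat=len(slots)):
        candidate = list(template)
        for i, word in zip(slots, choice):
            candidate[i] = word
        if is_grammatical(candidate):
            results.append(" ".join(candidate))
    return results


if __name__ == "__main__":
    print(fill_blanks(["the", "*", "sleeps"]))
    # ['the cat sleeps', 'the dog sleeps']
```

The brute-force loop is only for illustration; it tries every combination of words, which would be hopeless for a real dictionary.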

This new sentence generation utility is experimental. It's not available on all platforms; it may be buggy; it may not have the features you want; the API may change at any time for any reason. For those who once looked at OpenCog's SuReal and microplanning systems for sentence generation, this might offer a faster alternative.

Here's the ChangeLog for this release:

Version 5.9.0 (25 April 2021)
 * Use #define for custom configuration in dictionaries. #1128
 * Panic-mode fixes and extensions. In link-parser see !help panic_variables.
 * English dict: fix silly mistake with "I love cats and dogs".
 * Disable maintainer-mode in `configure.ac`.
 * Fix very rare crash/corruption introduced in v.5.8.1 #1142
 * English dict: fix problems with "just/only".
 * English dict: work on hesitation markers.
 * Fix multi-threading mem-leak. #1149
 * Provide emscripten javascript wrapper for the command-line parser.
 * Public API shared library entry points exported automatically. #1182
 * Provide bindings for the Vala programming language.
 * Increase number of allowed idiom expressions. #1187
 * Replace O(n^2) idiom loading algo by an O(n log n) algo. #1194
 * Disable SAT solver by default.
 * New tool: Sentence generator! This is an experimental prototype.

You can download link-grammar from
http://www.abisource.com/downloads/link-grammar/current/

The website is here:
https://www.abisource.com/projects/link-grammar/

WHAT IS LINK GRAMMAR?
The Link Grammar Parser is a syntactic parser of English (and other
languages as well), based on Link Grammar, an original theory of English
syntax. Given a sentence, the system assigns to it a syntactic structure,
which consists of a set of labeled links connecting pairs of words.

See the Wikipedia page for more info:
https://en.wikipedia.org/wiki/Link_grammar

--linas


Linas Vepstas

Apr 27, 2021, 10:53:46 AM
to opencog, link-grammar
Hi Jacques,

`man link-generator` will show the man page for it, and `link-generator --help` will print some additional help.

It's experimental; I'm not sure it works for English, as the English dict is very large.  I've used it only for small artificial languages.

--linas


On Mon, Apr 26, 2021 at 11:58 AM Jacques Basaldúa <jacques...@gmail.com> wrote:
Thanks a lot. It compiled on first try and I have been playing with the interactive link-parser, copy/pasting sentences and I am impressed. I will carry on reading to understand how it works and see if I can end up using it. 

Please, could you give us a link to the text generation code you mentioned and what is the idea behind it?

Jacques.


--
Patrick: Are they laughing at us?
Sponge Bob: No, Patrick, they are laughing next to us.
 

Jacques Basaldúa

Apr 27, 2021, 11:11:46 AM
to ope...@googlegroups.com, link-grammar
Thanks a lot. It compiled on first try and I have been playing with the interactive link-parser, copy/pasting sentences and I am impressed. I will carry on reading to understand how it works and see if I can end up using it. 

Please, could you give us a link to the text generation code you mentioned and what is the idea behind it?

Jacques.

ami...@gmail.com

Apr 27, 2021, 9:15:24 PM
to link-grammar
Hi Jacques,

The link-generator program uses the link-grammar library. By default, it creates a sentence using wild-card words (denoted by "\*") according to the required sentence length. It then provides it to the parse function of the library.
Such a wild-card word represents all the words in the dictionary, so the library uses for it an expression that combines all the (unique) expressions in the dictionary.
For the English dictionary, this expression generates about 20 million disjuncts. After duplicate elimination, about 3 million disjuncts per word remain.
Each such disjunct points to the list of words from which it was derived. The library then performs a normal parse (like it does for sentences that you input through link-parser).
For non-trivial dictionaries and more than a few words, the parse result includes a huge number of linkages, which are sampled by the library. For each sampled linkage, the link-generator program randomly selects words for each of its disjuncts (from the list of words each disjunct points to).
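
A rough Python sketch of the bookkeeping described above, under loud assumptions: the toy dictionary, the simplified disjunct strings, and the function names are hypothetical, the parse step is skipped entirely, and the "linkage" is written by hand. It only shows how each unique disjunct can point back to the words it came from, and how words are then picked at random for a sampled linkage.

```python
import random
from collections import defaultdict

# Hypothetical toy dictionary: word -> list of simplified disjunct strings.
# Real Link Grammar dictionaries and disjuncts are far richer than this.
TOY_DICT = {
    "cat": ["D- S+"],
    "dog": ["D- S+", "D- O-"],
    "the": ["D+"],
    "a": ["D+"],
    "runs": ["S-"],
    "chases": ["S- O+"],
}


def build_wildcard_index(dictionary):
    """Merge the disjuncts of all words, dropping duplicates, so that each
    unique disjunct points to the list of words it was derived from."""
    index = defaultdict(list)
    for word, disjuncts in dictionary.items():
        for disjunct in disjuncts:
            if word not in index[disjunct]:
                index[disjunct].append(word)
    return dict(index)


def realize(linkage, index, rng):
    """Given one disjunct per word position (a hand-written stand-in for a
    sampled linkage), pick a random word for each disjunct."""
    return [rng.choice(index[disjunct]) for disjunct in linkage]


if __name__ == "__main__":
    index = build_wildcard_index(TOY_DICT)
    linkage = ["D+", "D- S+", "S-"]  # assumed result of parsing "\* \* \*"
    print(" ".join(realize(linkage, index, random.Random())))
```

Because the word choice happens after parsing, many sentences can be drawn from one linkage without parsing again, which matches the observation that parsing dominates the run time.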

Currently, with a fast CPU and a lot of memory, generating English sentences of 7 words takes about 2 hours. The parsing takes most of that time; once it is done, a large number of sentences can be generated relatively quickly. The parsing time increases exponentially with the number of words (the growth is especially steep when the number of words is small). So for the English dictionary, this is not a practical way to generate long random sentences. I have some ideas to make the generation much faster, but for now, they don't reduce the complexity of the parsing algorithm.

Amir


Linas Vepstas

Apr 28, 2021, 12:41:16 PM
to opencog, link-grammar
Hi Jacques,

On Wed, Apr 28, 2021 at 1:44 AM Jacques Basaldúa <jacques...@gmail.com> wrote:
Hi Linas,

Thanks a lot for the explanation. I also joined the link-grammar google group.

I am starting to understand the whole idea better. The multiplicity of solutions for any sentence

The goal of parsing is to not have a lot of different parses: ideally, only one, maybe two for truly ambiguous sentences. In practice, this is difficult to achieve, and so multiple parses are listed. These are always ranked from most to least likely, and have an associated "cost."  It is convenient to think of the "cost" as (minus) the logarithm of the probability: the higher the cost, the less likely that parse is correct.
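
As a small worked illustration of the "cost as minus log probability" picture: the sketch below assumes natural logarithms and normalizes over the listed parses only. Both choices are assumptions made for this example, not something the parser is documented to do, and the function name is hypothetical.

```python
import math


def cost_to_relative_probability(costs):
    """Interpret each cost as -ln(p), up to a shared constant, and
    normalize so the listed parses sum to 1."""
    weights = [math.exp(-cost) for cost in costs]
    total = sum(weights)
    return [weight / total for weight in weights]


if __name__ == "__main__":
    # Three hypothetical parses with costs 0, 1 and 2.
    for cost, prob in zip([0.0, 1.0, 2.0],
                          cost_to_relative_probability([0.0, 1.0, 2.0])):
        print(f"cost={cost:.1f}  relative probability={prob:.3f}")
```

Under these assumptions, each extra unit of cost makes a parse a factor of e less likely than its neighbor, and parses with equal cost come out equally likely.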

Usually, the first parse is correct.  If it is not, please report that as a bug.

Sometimes, the first few parses have an identical cost; these are then sorted according to the total length of the links.  If this is the same, then which one is printed first is random.
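
The ordering rule above can be sketched in a few lines of Python. The record fields (`name`, `cost`, `link_length`) and the function name are hypothetical, chosen only for this illustration; this is not how the library stores or sorts linkages internally.

```python
import random


def rank_parses(parses, rng):
    """Order parses by cost, then by total link length. Shuffling first and
    relying on Python's stable sort leaves exact ties in random order."""
    shuffled = list(parses)
    rng.shuffle(shuffled)
    return sorted(shuffled, key=lambda p: (p["cost"], p["link_length"]))


if __name__ == "__main__":
    parses = [
        {"name": "A", "cost": 1.0, "link_length": 7},
        {"name": "B", "cost": 0.0, "link_length": 9},
        {"name": "C", "cost": 0.0, "link_length": 5},
    ]
    print([p["name"] for p in rank_parses(parses, random.Random())])
    # ['C', 'B', 'A']
```

Here B and C tie on cost, so the shorter total link length puts C first; A has the highest cost and comes last.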

-- Linas
