Re: 100 sentences for GC

Linas Vepstas

Mar 27, 2019, 2:24:38 PM
to Anton Kolonin @ Gmail, lang-learn, link-grammar, opencog, Ivan Vodišek
Hi Anton,

I've cc'ed the link-grammar mailing list, because I describe below some concepts for word-sense disambiguation. I'm also cc'ing the opencog mailing list and ivan vodisek, because after studying hilbert systems, I think he's ready to think about how knowledge extraction can be done generically, and not just on language.

-- Linas

On Mon, Mar 25, 2019 at 1:39 AM Anton Kolonin @ Gmail <akol...@gmail.com> wrote:

Hi Linas,

>I'd call it "interesting", but maybe not "golden"

These are randomly selected sentences from "Gutenberg Children" corpus:

http://langlearn.singularitynet.io/data/cleaned/English/Gutenberg-Children-Books/lower_LGEng_token/

"Gutenberg Children silver standard" consists of the LG-English parses:

http://langlearn.singularitynet.io/data/parses/English/Gutenberg-Children-Books/test/GCB-LG-English-clean.ull

"Gutenberg Children gold standard" is a subset of the "silver standard": a semi-random selection of sentences, skipping direct speech, with manual verification of the links.

So as long as we are training on "Gutenberg Children" corpus, having the test on the same "Gutenberg Children" seems reasonable, right?


Yes. You still need to verify that each word in the "golden" corpus occurs at least N=10 or 20 times in the training corpus. The dependency of accuracy on N is not generally known, but it is very clear that if a word occurs only N=3 times in the training corpus, then whatever is learned about it will be very low quality.
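A rough sketch of that frequency check, in Python. The whitespace tokenization, the function name `rare_words` and the toy sentences are assumptions made for illustration; none of this is taken from the ULL pipeline.

```python
from collections import Counter

def rare_words(gold_sentences, training_sentences, min_count=10):
    """Return gold-corpus words seen fewer than min_count times in training."""
    counts = Counter(
        word
        for sentence in training_sentences
        for word in sentence.lower().split()  # assumption: whitespace tokens
    )
    gold_vocab = {
        word for sentence in gold_sentences for word in sentence.lower().split()
    }
    return {w: counts[w] for w in sorted(gold_vocab) if counts[w] < min_count}

# Toy data, invented for this example only.
training = ["the old horse ran"] * 12 + ["the lama snuffed blandly"] * 3
gold = ["the lama ran"]

print(rare_words(gold, training, min_count=10))  # {'lama': 3}
```

Any gold sentence containing a word reported here would be a candidate for exclusion, or at least for separate scoring, since little can have been learned about that word.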
 

But thanks, we may have put more effort into removing ancient constructions and words, even if these are present in the corpus.

If you consistently train on 19th century literature, and then evaluate 19th-century literature comprehension, that's fine.  Just don't expect it to work for 21st century blog posts.

The strongest effect will be the N=number of observations effect.
 

>Anyway -- you only indicate pair-wise word-links. Is the omission of disjuncts intentional?

If you have all links in the sentence, you can construct all of the disjuncts with no ambiguity, correct?

No, but only because you did not indicate the link-type.  The whole point of a clustering step is to obtain a link-type; if you discard it, you will never get  better-than-MST results. The link-type is critical for obtaining the word-classes.  The whole point of learning is to learn the word-classes; you've learned very little, if you know only word-pairs.
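To illustrate the difference, here is a minimal sketch that assembles disjuncts from the links of one parse. The function name, the `(left_index, right_index, link_type)` triple format, the connector ordering and the placeholder type `ANY` are all assumptions for this example; the link-type names are borrowed from the "saw wood" example in this message.

```python
def disjuncts_from_links(words, links):
    """Build one disjunct per word from (left_index, right_index, link_type) triples."""
    connectors = [[] for _ in words]
    for left, right, link_type in links:
        connectors[left].append((right, link_type + "+"))   # links to a word on the right
        connectors[right].append((left, link_type + "-"))   # links to a word on the left
    result = []
    for i, word in enumerate(words):
        # Ordering is a choice made for this sketch: left connectors first,
        # and within each direction the nearest word first.
        lefts = sorted((c for c in connectors[i] if c[0] < i), reverse=True)
        rights = sorted(c for c in connectors[i] if c[0] > i)
        result.append((word, " & ".join(name for _, name in lefts + rights)))
    return result

words = ["I", "saw", "some", "wood"]
typed_links = [(0, 1, "observer"), (1, 3, "viewable-thing"), (2, 3, "quantity-determiner")]
# The same parse with the link-types erased, as in a bare word-pair listing.
untyped_links = [(i, j, "ANY") for i, j, _ in typed_links]

typed = dict(disjuncts_from_links(words, typed_links))
untyped = dict(disjuncts_from_links(words, untyped_links))
print(typed["saw"])    # observer- & viewable-thing+
print(untyped["saw"])  # ANY- & ANY+
```

With the types erased, "I" and "some" end up with the identical disjunct `ANY+`, so nothing in the disjunct distinguishes a pronoun from a determiner.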

Consider this example:

I saw wood
I saw some wood

A solution that would be "almost perfect" (or "golden") would be this:

saw: {performer-of-actions}- & {sculptable-mass}+;
saw: {observer}-  & {viewable-thing}+;

These disambiguate the two different senses of the word "saw".  It's impossible to have word-sense disambiguation without actually having these disjuncts.  The word-pairs alone are not sufficient to report the link-type connecting the words.  Clustering gives the other dictionary entries:

I: {performer-of-actions}+ or {observer}+;
wood: {sculptable-mass}- or ({quantity-determiner}- & {viewable-thing}-);
some: {quantity-determiner}+;

Thus, the pronoun "I" also belongs to two different word-sense categories: performers and observers.  Compare to:

"The chainsaw saws wood"  -- a "chainsaw" can be  a "performer of actions" but cannot be an "observer".
"The dog saw some wood" -- dogs can be observers. They can perform some actions; like run, jump, but they cannot saw, hammer, cut, stab.

The link-type is absolutely crucial to understanding a word.  The language-learning project is all about learning the link-types. Without correct link-type assignments, you cannot have correct parses.
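As a sketch of how these dictionary entries select a word sense, the following brute-force check tries every combination of disjuncts and keeps the ones in which all connectors pair up. It is not how the link-grammar parser works: it ignores connector ordering and the no-crossing rule, and the names `DICTIONARY`, `links_up` and `consistent_senses` are hypothetical.

```python
from itertools import product

# The toy dictionary from the example above, one list of connectors per disjunct.
DICTIONARY = {
    "I": [["performer-of-actions+"], ["observer+"]],
    "saw": [["performer-of-actions-", "sculptable-mass+"],
            ["observer-", "viewable-thing+"]],
    "wood": [["sculptable-mass-"], ["quantity-determiner-", "viewable-thing-"]],
    "some": [["quantity-determiner+"]],
}

def links_up(choice):
    """True if every '+' connector pairs with a same-type '-' connector further right."""
    plus = [(i, c[:-1]) for i, d in enumerate(choice) for c in d if c.endswith("+")]
    minus = [(i, c[:-1]) for i, d in enumerate(choice) for c in d if c.endswith("-")]
    if len(plus) != len(minus):
        return False

    def match(remaining_plus, remaining_minus):
        if not remaining_plus:
            return True
        (i, kind), rest = remaining_plus[0], remaining_plus[1:]
        for k, (j, other) in enumerate(remaining_minus):
            if other == kind and j > i:
                if match(rest, remaining_minus[:k] + remaining_minus[k + 1:]):
                    return True
        return False

    return match(plus, minus)

def consistent_senses(sentence):
    """All ways of picking one disjunct per word so that every connector is used."""
    words = sentence.split()
    options = product(*(DICTIONARY[w] for w in words))
    return [choice for choice in options if links_up(choice)]

for sentence in ("I saw wood", "I saw some wood"):
    for choice in consistent_senses(sentence):
        print(sentence, "->", " & ".join(choice[1]))
```

Under these assumptions each sentence admits exactly one combination: "I saw wood" selects the cutting sense of "saw", and "I saw some wood" selects the seeing sense.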

... which is 100% of the problem with MST.  The problem with MST is not so much that "it's not accurate". Sure, it is not terribly accurate. But even if MST or some MST-replacement were 100% accurate, it would still be "wrong" because it fails to indicate the link-type.  If you want to understand a sentence, you MUST know the link-types!

Otherwise, you just have "green ideas sleep furiously", which parses, but only because the link types have been erased, or made stupid.  Here's a stupid grammar:

ideas:  {adjective}- & {verb}+;
green: {adjective}+;

which allows "green ideas" to parse.  But of course, this is wrong; it should have been:

ideas: {noospheric-modifier}- & {concept-manipulating-verb}+;
green: {physical-object-modifier}+;

and now it is clear that "green ideas" cannot parse, because the link-types clash.
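The clash can be checked mechanically. A small sketch, with the two grammars above written as Python sets; the names `COARSE`, `FINE` and `can_link` are made up for this example:

```python
COARSE = {
    "green": {"adjective+"},
    "ideas": {"adjective-", "verb+"},
}
FINE = {
    "green": {"physical-object-modifier+"},
    "ideas": {"noospheric-modifier-", "concept-manipulating-verb+"},
}

def can_link(grammar, left_word, right_word):
    """True if some '+' connector on the left word matches a '-' on the right word."""
    right_facing = {c[:-1] for c in grammar[left_word] if c.endswith("+")}
    left_facing = {c[:-1] for c in grammar[right_word] if c.endswith("-")}
    return bool(right_facing & left_facing)

print(can_link(COARSE, "green", "ideas"))  # True
print(can_link(FINE, "green", "ideas"))    # False
```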

* If you cluster down to 5 or 6 clusters (adjective, verb, noun ...) you will get very low quality grammars.

* If you cluster to 200 or 300 clusters, you get sort-of-OK grammars. This is what deep-learning/neural-nets do: this is why the deep-learning systems seem to give nice results: 200 or 300 features is enough to start having adequate functional distinctions (e.g. the famous "king - male+female=queen" example, or "paris-france+germany=berlin" example)

* If you cluster to 3K to 8K clusters, you start having a quite decent model of language 

* Note that wordnet has 117K "synsets".

Note that in the above example:
wood: {sculptable-mass}- or ({quantity-determiner}- & {viewable-thing}-);

the things in the curly-braces are effectively "synsets".

The next set of goal-posts is to have disjuncts, of maybe low-medium quality, and use these to extract ontologies.  e.g.
{sculptable-mass} is-a {mass} is-a {physical-thing} is-a {thing}

You can try to do this by clustering but there are probably better ways of discovering ontology. 
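Whatever method discovers it, the resulting ontology is easy to represent. A minimal sketch, assuming each category has a single parent (a real ontology may well need several); the table `IS_A` and both function names are hypothetical:

```python
# The is-a chain from the example above, as a child -> parent table.
IS_A = {
    "sculptable-mass": "mass",
    "mass": "physical-thing",
    "physical-thing": "thing",
}

def ancestors(category):
    """Walk up the is-a chain and return every more general category."""
    chain = []
    while category in IS_A:
        category = IS_A[category]
        chain.append(category)
    return chain

def is_a(category, ancestor):
    """True if category equals ancestor or lies below it in the chain."""
    return category == ancestor or ancestor in ancestors(category)

print(ancestors("sculptable-mass"))  # ['mass', 'physical-thing', 'thing']
print(is_a("sculptable-mass", "physical-thing"))  # True
```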

 

>Also -- no hint of any word-classes or part-of-speech tagging? This is surely important to evaluate as well, or is this to be done in some other way?  i.e. to evaluate if "Pivi" was correctly clustered with other given names?  Or that lama/llama was clustered with other four-legged animals?

We don't have that in MST-Parsing, right? We need this corpus to assess the quality of the MST-Parsing so we don't need part-of-speech information for that.

But we know that MST parsing is shit.  Stop wasting time on MST or trying to "improve" it. We already know that it is close to a high-entropy path to structure; trying to squeeze a few more percent of entropy is not worth the effort, not at this time.  Focus on finding a high-entropy structure extraction algorithm, don't waste time on MST.

You should be focusing on extracting disjuncts, word-classes, word-senses, and trying to improve the quality of those.  If you obtain a high-entropy path to these structures, the quality of your parses will automatically improve.  Focus on the entropy numbers. Try to maximize that.

The clustering is able to do that anyway - see the graphs in the end of the last year report:

https://docs.google.com/document/d/1gxl-hIqPQCYPb9NNkyA3sBYUyfwvJFvT1hZ5ZpXsaPc/edit#heading=h.twoiv52o0tou

>Also -- I can't tell -- is it free of loops, or are loops allowed?  Allowing loops tends to provide stronger, more accurate parses.  Loops act as constraints.

Loops and crossing links are not allowed in the MST-Parser now. If we allow them in the test corpus, how would that make the assessment of MST-Parses better?

Note that we ARE working with MST-Parses now, according to Ben's directions.


Not to say bad things about Ben, but I'm certain he has not actually thought about this problem very much. He is very very busy doing other things; he is not thinking about this stuff.  I have repeatedly tried to explain the issues to him, and it's quite clear that he is far away from understanding them, from working at the level that I would like to have you and your team work at.   

I'm trying to have you make small, quantified baby-steps, to verify the accuracy of your methods and data.  What I'm seeing is that you are attempting to make giant-steps, without verification, and then getting low-quality results, without understanding the root causes for them.  You can't dig yourself out of a ditch, and digging harder and more furiously won't raise the accuracy of the parse results.

--linas

We have your MST-Parser-less idea on the map but we are NOT trying it now:

https://github.com/singnet/language-learning/issues/170

We may try it after we explore accounting for costs:

https://github.com/singnet/language-learning/issues/183   

Thanks,

-Anton

24.03.2019 9:24, Linas Vepstas wrote:
Also, BTW, link-grammar cannot parse "I just stood there, my hand on the knob, trembling like a leaf." correctly. It is one of a class of sentences it does not know about.  Which is maybe OK, because ideally, the learned grammar will be able to do this. But today, LG cannot.

--linas

On Sat, Mar 23, 2019 at 9:12 PM Linas Vepstas <linasv...@gmail.com> wrote:
Anton, 

It's certainly an unusual corpus, and it might give you rather low scores. I'd call it "interesting", but maybe not "golden". Although I suppose it depends on your training corpus.  Here are some problems that pop out:

First sentence -- 
"the old beast was whinnying on his shoulder" -- the word "whinnying" is a fairly rare English verb -- you could read half-a-million wikipedia articles, and not see it once. You could read lots of 19th-century or early-20th century cowboy/adventure novels, (like what you'd find on Project Gutenberg) and maybe see it some fair amount. Even then -- to "whinny on a shoulder" seems bizarre.. I guess he's hugging the horse? How often does that happen, in any cowboy novel? "to whinny on something" is an extremely rare construction.  It will work only if you've correctly categorized "whinny" as a verb that can take a preposition.  Are your clustering algos that good, yet, to correctly cluster rare words into appropriate verb categories?

Second sentence .. "Jims" is a very uncommon name. Frankly, I've never heard of it as a name before.  Your training data is going to be extremely slim on this. And lack of training data means poor statistics, which means low scores.  Unless -- again, your clustering code is good enough to place "Jims" in a "proper name" cluster...

"the lama snuffed blandly" -- "snuffed" is a very uncommon, almost archaic verb. These days, everyone spells llama with two l's, not one. Unless you're talking about Buddhist monks, it's a typo.  

"you understand?"  is .. awkward. Common in speech, uncommon in writing. Unlikely that you'll have enough training data for this.

"Willard" is an uncommon name. Does your training corpus have a sufficient number of mentions of Willard? Do you have clustering working well enough to stick "Willard" into a cluster with other names?

"it is so with Sammy Jay" is clearly archaic English.

"he hasn't any relations here" is clearly archaic, an olde-fashioned construction.

"Pivi said not one word" - again, a clearly old-fashioned construction. Does the training set contain enough examples of "Pivi" to recognize it as a name? Are names clustering correctly? 

Any sentence with an inversion is going to sound old-fashioned. All of the sentences in that corpus sound old-fashioned. Which maybe is OK if you are training on 19th century Gutenberg texts .. but it's certainly not modern English.  Even when I was a child, and I read those old crumbly-yellow paper adventure books, part of the fun was that no one actually talked that way -- not at school, not at home, not on TV. It was clearly from a different time and place -- an adventure.

Anyway -- you only indicate pair-wise word-links. Is the omission of disjuncts intentional? Also -- no hint of any word-classes or part-of-speech tagging? This is surely important to evaluate as well, or is this to be done in some other way?  i.e. to evaluate if "Pivi" was correctly clustered with other given names?  Or that lama/llama was clustered with other four-legged animals?

Also -- I can't tell -- is it free of loops, or are loops allowed?  Allowing loops tends to provide stronger, more accurate parses.  Loops act as constraints.

-- Linas

On Thu, Mar 21, 2019 at 11:09 PM Anton Kolonin @ Gmail <akol...@gmail.com> wrote:

Hi Linas, Andes and whoever understands LG and English well enough both.

Attached are first 100 sentences for GC "gold standard" - manually checked based on LG parses.

We are expecting more to come in the next two weeks.

To enable that, please give the corpus a cursory review and let us know if corrections are still needed, so that your corrections can be used as a reference to fix the rest and keep going.

Thank you,

-Anton


--
You received this message because you are subscribed to the Google Groups "lang-learn" group.
To unsubscribe from this group and stop receiving emails from it, send an email to lang-learn+...@googlegroups.com.
To post to this group, send email to lang-...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/lang-learn/bde76364-a578-4ab8-8ac5-2f49f794072b%40gmail.com.
For more options, visit https://groups.google.com/d/optout.


--
cassette tapes - analog TV - film cameras - you


-- 
-Anton Kolonin
skype: akolonin
cell: +79139250058
akol...@aigents.com
https://aigents.com
https://www.youtube.com/aigents
https://www.facebook.com/aigents
https://medium.com/@aigents
https://steemit.com/@aigents
https://golos.blog/@aigents
https://vk.com/aigents



Ivan V.

Mar 28, 2019, 11:22:06 AM
to Anton Kolonin @ Gmail, linasv...@gmail.com, b...@goertzel.org, lang-learn, link-grammar, opencog
Linas Vepstas wrote:

>... knowledge extraction can be done generically, and not just on language.

If link grammar would be Turing complete, this might be possible right away. But somehow, I suspect... Isn't this why OpenCog has "unified rule engine" (URE) instead of link grammar at its core, and with URE things get much more complicated. I'm sorry, but that is still a Gordian knot to me, considering all of my modest knowledge. On the other hand, if someone really smart would provide automatic grammar extraction by means of unrestricted grammar, I believe that would be it.

Thank you,
Ivan V.


On Thu, Mar 28, 2019 at 07:58, Anton Kolonin @ Gmail <akol...@gmail.com> wrote:

Ben, Linas,

>But we know that MST parsing is shit.  Stop wasting time on MST or trying to "improve" it.

I think that sounds like a kind of support for the concept of "dumb explosive parsing" that was advocated over a year ago:

https://docs.google.com/document/d/14MpKLH5_5eVI39PRZuWLZHa1aUS73pJZNZzgigCWwWg/edit#heading=h.aqo9bumb3doy

I also agree with Linas's other reasoning in this thread. I would consider giving it a try starting next month if we don't have a breakthrough with DNN-MI-milking-based-MST-Parsing by that time.

> can be done generically, and not just on language

I think everyone in bio-informatics dreams of extracting secrets of "dark side of the genome" with something like that ;-)

Cheers,

-Anton


28.03.2019 1:24, Linas Vepstas wrote:


Ivan V.

Mar 28, 2019, 12:31:16 PM
to Anton Kolonin @ Gmail, linasv...@gmail.com, link-grammar, opencog
P.S.

Just to adjust the pitch of my reply, automatic extraction of link grammar rules is more than excellent achievement already. Congratulations on this important piece of the puzzle!

Keep up the good work,
Ivan V.

Linas Vepstas

Mar 31, 2019, 10:18:14 PM
to Ivan V., Anton Kolonin @ Gmail, b...@goertzel.org, lang-learn, link-grammar, opencog
On Thu, Mar 28, 2019 at 10:22 AM Ivan V. <ivan....@gmail.com> wrote:
Linas Vepstas wrote:

>... knowledge extraction can be done generically, and not just on language.

If link grammar would be Turing complete, this might be possible right away.
 
In my experience, thinking about Turing completeness is unproductive and a distraction.

But somehow, I suspect... Isn't this why OpenCog has "unified rule engine" (URE) instead of link grammar at its core,

No. It has the rule-engine because back then, I did not understand sheaves.  I'm starting to think that the rule engine is a strategic mistake. The original idea is that rule-application is the main conceptual abstraction of term-rewriting.  One rewrites, or proves theorems by applying sequences of rules.  It turns out that discovering the right sequence is hard. Finding correct long sequences is hard - a combinatorial explosion.

The openpsi system addresses some of these issues. Unfortunately, its current implementation is a tangle of rule-selection mechanisms, and theories of human psychology. It's probably better than the URE, but is currently not as powerful.

I'm trying to place a theory of sheaves as a replacement for URE, and as the natural generalization of openpsi, but I've successfully self-sabotaged myself in these efforts. 
 
and with URE things get much more complicated. I'm sorry, but that is still a Gordian knot to me, considering all of my modest knowledge.

We all have modest knowledge. That is the nature of the human condition.
 
On the other hand, if someone really smart would provide automatic grammar extraction by means of unrestricted grammar, I believe that would be it.

Yes, that is the goal of the language-learning project.  However, as noted in my last email (on the link-grammar list) it is not enough to just learn a semi-Thue system, declare victory, and go home.  The example I gave there: 

  "I think that you should give that car a second look" 
  "you should really give that song a second listen" 
  "maybe you should give Sue a second chance".

Learning to parse these "set phrases" or phrasemes is equivalent to learning a semi-Thue system; however, it's not enough to realize that all three are forms of advice-giving, having "conserved" or "fixed" regions "x YOU SHOULD y GIVE z SECOND w" where z is very highly variable, having millions of variations, and w only has a few dozen allowed variations.  Note that the words "fixed", "conserved", "variable" are words used in genetics and proteomics and antibody structure. It's the same idea.

The goal of learning lexical functions (LF's) is to learn that all three are advice-giving forms, and also to learn what is, and what can be plugged in for x,y,z,w.   So, although a super-whiz-bang grammar learner capable of learning context-sensitive languages should be able to learn "x YOU SHOULD y GIVE z SECOND w", it still will not know the *meaning* of this phrase.  To know the *meaning*, you have to know the acceptable ranges (as fuzzy-sets) of x,y,z,w. 
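As a small illustration of the fixed and variable regions, a regular expression can pull x, y, z and w out of the three sentences. This is only a hand-written sketch: the pattern treats the article "a" before "second" as part of the fixed region, which is an assumption, and a learner would have to discover the pattern rather than be handed it.

```python
import re
from collections import Counter

# Fixed regions: "you should", "give", "a second". Variable slots: x, y, z, w.
PATTERN = re.compile(
    r"^(?P<x>.*?)\s*\byou should\b\s*(?P<y>.*?)\s*\bgive\b\s+"
    r"(?P<z>.+?)\s+a second\s+(?P<w>\w+)\.?$",
    re.IGNORECASE,
)

SENTENCES = [
    "I think that you should give that car a second look",
    "you should really give that song a second listen",
    "maybe you should give Sue a second chance",
]

slots = [PATTERN.match(s).groupdict() for s in SENTENCES]
for filled in slots:
    print(filled)

# Tally the observed fillers of w; over a large corpus this tally would stay
# small (a few dozen values), while the tally for z would keep growing.
w_fillers = Counter(filled["w"] for filled in slots)
print(w_fillers)
```

Knowing the slots is still only the syntactic half: the tallies say which fillers occur, not why "look", "listen" and "chance" belong together as a fuzzy set.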

To conclude, thinking about Turing-completeness is a waste of time, because Turing completeness only tells you that "x YOU SHOULD y GIVE z SECOND w" is recursively enumerable; it does not tell you what it actually means. 

Put another way:  having a universal Turing machine is not the same as knowing how some particular program works. Automagically learning a context-sensitive grammar is not enough to know what that grammar is "saying/doing". 

-- Linas

Ben Goertzel

Mar 31, 2019, 11:53:33 PM
to Anton Kolonin @ Gmail, Linas Vepstas, Ivan V., Alexey Potapov, Oleg Baskov, lang-learn, link-grammar, opencog
I suspect these cannot be common garden-variety milk machines, and the
milk machines need to embody some understanding of the
grammatical/semantic structures they are working to extract...

-- Ben

On Mon, Apr 1, 2019 at 12:51 PM Anton Kolonin @ Gmail
<akol...@gmail.com> wrote:
>
> Hi Linas, I like this thread more and more :-)
>
> >But somehow, I suspect... Isn't this why OpenCog has "unified rule engine" (URE) instead of link grammar at its core,
>
> Linas, the approach to the "extraction of phrasemes" goal was discussed in exactly these terms, MST->GL->URE, in the Hong Kong discussion last fall: https://docs.google.com/document/d/13YyqtGud0GAbVaFcc94kAd2LhGf7jTr5XDYgiuC294c/edit
>
> That is:
>
> 1) Do MST-parsing to get word links proto-disjuncts
>
> 2) Do Grammar Learning to cluster and conclude word categories and rules with disjuncts
>
> 3) Do URE-kind-of-thing to build the rules into "phrasemes" or "sections" or "patterns".
>
> However, your current discourse and our current results just show that "no one is able to do reasonable MST-parsing" so the above is just a waste of time, correct?
>
> As we speak, Ben, Alexey, Sergey and Asuares are trying to use DNN/BERT magic to do trick 1. To my mind, that may become possible only if the DNN/BERT magic does the trick with steps 2 and 3 done under the hood. In that case, we don't need to do 2 and 3 after we have the DNN/BERT-based model, because we can simply "milk" the grammar rules out of the DNN/BERT mycelium. And we don't need the ULL either, by the way, because we just need DNN/BERT and rows of different sorts of milk machines around it.
>
> So, instead of solving the problem of constructing the pipeline for learning grammar from raw text we need to solve the problem of milking the grammar out of DNN/BERT model trained on these texts, right?
>
> However, either way, we need to understand the algorithmic machinery of how the links assemble into disjuncts and disjuncts assemble into sections, through the universe-scale combinatorial explosion. And I agree that clustering and categorizing words and links (and then disjuncts and sections, right?) is part of the process - explicitly in the ULL pipeline or implicitly deep in DNN/BERT darkness.
>
> Cheers,
>
> -Anton
>
>
> 01.04.2019 9:17, Linas Vepstas:
Ben Goertzel, PhD
http://goertzel.org

"Listen: This world is the lunatic's sphere, / Don't always agree
it's real. / Even with my feet upon it / And the postman knowing my
door / My address is somewhere else." -- Hafiz

Linas Vepstas

Apr 1, 2019, 1:07:25 AM
to Anton Kolonin @ Gmail, Ivan V., Alexey Potapov, Ben Goertzel, Oleg Baskov, b...@goertzel.org, lang-learn, link-grammar, opencog
On Sun, Mar 31, 2019 at 10:51 PM Anton Kolonin @ Gmail <akol...@gmail.com> wrote:

Hi Linas, I like this thread more and more :-)

I don't. I use a lot of CAPITALIZED WORDS below.  There is a deep and dark fundamental misunderstanding, and I am sometimes at my wits' end trying to figure out why, and how to explain things in an understandable fashion.

>But somehow, I suspect... Isn't this why OpenCog has "unified rule engine" (URE) instead of link grammar at its core,

Linas, the approach to the "extraction of phrasemes" goal was discussed in exactly these terms, MST->GL->URE, in the Hong Kong discussion last fall: https://docs.google.com/document/d/13YyqtGud0GAbVaFcc94kAd2LhGf7jTr5XDYgiuC294c/edit

That is:

1) Do MST-parsing to get word links proto-disjuncts

2) Do Grammar Learning to cluster and conclude word categories and rules with disjuncts

3) Do URE-kind-of-thing to build the rules into "phrasemes" or "sections" or "patterns".

Yes. 

However, your current discourse and our current results just show that "no one is able to do reasonable MST-parsing" so the above is just a waste of time, correct?

No. Very much no.  I'm saying the opposite of that. You can replace MST by almost *ANYTHING* else, and the quality of your results WILL NOT CHANGE! 

If the quality of your results depends on the quality of MST, you are DOING SOMETHING WRONG!

I'm utterly flabbergasted. I don't know how many more times I can say this: stop wasting time on this unimportant step!

As we speak, Ben, Alexey, Sergey and Asuares are trying to use DNN/BERT magic to do trick 1.

I want to call this "a complete waste of time". It will almost surely not improve the quality of the results!  I don't understand why four smart people think that replacing MST by BERT will make any difference at all!  It should not matter!  Nothing depends on this step! Anything at all, anything with a probability better than random chance, is sufficient!  Why isn't this obvious?

If Ben is reading this: I recall talking to Ben about this in an ice-cream shop in Berlin, for an AGI conference, and he seemed to understand back then.  I have no idea why he changed his mind.  I really do not understand why everyone spends so much time obsessing about MST. Is this a "color of the bike shed" problem?  https://en.wikipedia.org/wiki/Law_of_triviality  

MST-vs.-BERT==color-of-bike-shed

Just use MST. It's simple. It works. It gives good results.  Stop trying to improve it.  The interesting problems are elsewhere!  Just use MST, and move on to the good stuff!  
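For readers who have not seen it, plain MST parsing fits in a few lines. The sketch below is an illustration only, not the project's MST-Parser: it counts ordered word pairs per sentence, scores them by pointwise mutual information, and builds a maximum spanning tree with Prim's algorithm. Unlike the parser discussed in this thread, it does not forbid crossing links. The names `pair_mi`, `mst_parse` and the three-sentence corpus are made up.

```python
import math
from collections import Counter
from itertools import combinations

def pair_mi(sentences):
    """Pointwise MI (in bits) for ordered word pairs co-occurring in a sentence."""
    pair_counts = Counter()
    for sentence in sentences:
        pair_counts.update(combinations(sentence.split(), 2))  # (left, right)
    total = sum(pair_counts.values())
    left, right = Counter(), Counter()
    for (a, b), n in pair_counts.items():
        left[a] += n
        right[b] += n
    return {
        (a, b): math.log2(n * total / (left[a] * right[b]))
        for (a, b), n in pair_counts.items()
    }

def mst_parse(sentence, mi, default=-1e9):
    """Prim's algorithm: maximum spanning tree over word positions, scored by MI."""
    words = sentence.split()
    in_tree = {0}
    links = []
    while len(in_tree) < len(words):
        # Best-scoring link between a word in the tree and a word outside it.
        score, i, j = max(
            (mi.get((words[min(a, b)], words[max(a, b)]), default), min(a, b), max(a, b))
            for a in in_tree
            for b in range(len(words))
            if b not in in_tree
        )
        links.append((i, j))
        in_tree.add(i if i not in in_tree else j)
    return sorted(links)

corpus = ["I saw wood", "I saw some wood", "the dog saw some wood"]
mi = pair_mi(corpus)
links = mst_parse("I saw some wood", mi)
print(links)
```

The output is a list of untyped `(left_index, right_index)` links, which is exactly the limitation discussed earlier in the thread: the tree carries no link-types.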

To my mind, that may become possible only if the DNN/BERT magic does the trick with steps 2 and 3 done under the hood. In that case, we don't need to do 2 and 3 after we have the DNN/BERT-based model, because we can simply "milk" the grammar rules out of the DNN/BERT mycelium. And we don't need the ULL either, by the way, because we just need DNN/BERT and rows of different sorts of milk machines around it.

So why are you bothering to work on ULL?  

So, instead of solving the problem of constructing the pipeline for learning grammar from raw text we need to solve the problem of milking the grammar out of DNN/BERT model trained on these texts, right?

Because I don't think that you know how to milk lexical functions out of DNN/BERT -- We've wasted more than a year talking about MST.  Instead of endlessly talking about MST, you could have  JUST USED IT, WITHOUT ANY MODIFICATIONS, gotten good results, and spent the year working on something interesting!

Again: replacing MST by DNN/BERT or by something else will NOT IMPROVE the accuracy!  You'll have exactly the same accuracy as before, and if your accuracy improves, it is because you are doing something wrong!

However, either way, we need to understand the algorithmic machinery of how the links assemble into disjuncts and disjuncts assemble into sections, through the universe-scale combinatorial explosion.

No. That is the OPPOSITE of what ACTUALLY HAPPENS!!!! 

And I agree that clustering and categorizing words and links (and then disjuncts and sections, right?) is part of the process - explicitly in the ULL pipeline or implicitly deep in DNN/BERT darkness. 

It is NOT DEEP AND DARK.  I wrote not one but TWO PAPERS on this, CASTING LIGHT ON THAT DARKNESS

I'm frustrated to the 43rd degree on why I cannot seem to have a reasonable conversation with any other human being about any of this.  

-- Linas

Cheers,

-Anton


01.04.2019 9:17, Linas Vepstas:

Ben Goertzel

Apr 1, 2019, 1:15:03 AM
to Linas Vepstas, Anton Kolonin @ Gmail, Ivan V., Alexey Potapov, Oleg Baskov, lang-learn, link-grammar, opencog
"Replacing MST by DNN/BERT" is a strange way to put it...

DNN/BERT builds a pretty complex and comprehensive language model,
much beyond what is done by calculation of MI values and similar

The extraction of a parse dag satisfying syntactic constraints (no
links cross, covering all words in the sentence, connected graph) is a
conceptually simple step, and nobody is spending much time on this
step indeed...
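The three constraints named here are simple to state in code. A minimal sketch, assuming a parse is given as a list of `(i, j)` word-position pairs; the function name `is_valid_parse` is hypothetical. Cycles are accepted, since the constraints as listed do not exclude them.

```python
def is_valid_parse(num_words, links):
    """Check: no crossing links, every word linked, one connected component."""
    links = [(min(i, j), max(i, j)) for i, j in links]
    # Two links cross when one starts strictly inside the other and ends outside it.
    for a, b in links:
        for c, d in links:
            if a < c < b < d:
                return False
    # Connectivity via union-find over word positions. A single connected
    # component also implies that every word is covered by some link.
    parent = list(range(num_words))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in links:
        parent[find(i)] = find(j)
    return len({find(i) for i in range(num_words)}) == 1

print(is_valid_parse(4, [(0, 1), (1, 3), (2, 3)]))  # True
print(is_valid_parse(4, [(0, 2), (1, 3), (2, 3)]))  # False: (0,2) crosses (1,3)
print(is_valid_parse(4, [(0, 1), (2, 3)]))          # False: not connected
```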

The question of how to assign a quantitative weight to the relation
btw two word-instances in a sentence, taking into account the specific
context in that sentence, but also the history of co-utilization of
those words (or other similar words), is less conceptually simple and
this is one place I think DNN language models can help

Using MST or similar parsing based on numbers exported from DNN
language models is one way of extracting symbolic-ish structured
knowledge from these big messy subsymbolic probabilistic language
models...

The DNNs in use now like BERT do not really satisfy me on a
theoretical or conceptual level, but they have been tuned to work
pretty nicely and they have been implemented pretty efficiently on
multi-GPU hardware -- so, given this and given the quality of the
recent practical results obtained with them -- I consider it well
worth exploring how to use them as tools in our pursuits for grammar
and semantics learning

-- Ben

Nil Geisweiller

Apr 1, 2019, 2:51:03 AM
to ope...@googlegroups.com, Linas Vepstas, Ivan V., Anton Kolonin @ Gmail, b...@goertzel.org, lang-learn, link-grammar
Hi,

On 4/1/19 5:17 AM, Linas Vepstas wrote:
> But somehow, I suspect... Isn't this why OpenCog has "unified rule
> engine" (URE) instead of link grammar at its core,
>
>
> No. It has the rule-engine because back then, I did not understand
> sheaves. I'm starting to think that the rule engine is a strategic
> mistake. The original idea is that rule-application is the main
> conceptual abstraction of term-rewriting. One rewrites, or proves
> theorems by applying sequences of rules. It turns out that discovering
> the right sequence is hard. Finding correct long sequences is hard - a
> combinatorial explosion.

So is writing programs, yet MOSES still manages to do that and
produce useful models. BTW, writing inference trees (which the URE
does) is almost exactly equivalent to writing programs.

> The openpsi system addresses some of these issues. Unfortunately, its
> current implementation is a tangle of rule-selection mechanisms, and
> theories of human psychology. It's probably better than the URE, but is
> currently not as powerful.

OpenPsi and the URE are 2 different systems, doing different
things. OpenPsi is an action selection mechanism to fulfill urges and
create plans, the URE is an inference tree builder. Though actually
both may need each other. The inference control mechanism in the URE
uses a specialized re-implementation of OpenPsi, and OpenPsi could use
the URE to build plans.

> I'm trying to place a theory of sheaves as a replacement for URE, and as
> the natural generalization of openpsi, but I've successfully
> self-sabotaged myself in these efforts.

Linas, what is a good starting point to understand what you're trying
to accomplish?

Here?
https://github.com/opencog/atomspace/blob/master/opencog/sheaf/README.md

Nil

>
> and with URE things get much more complicated. I'm sorry, but that
> is still a Gordian knot to me, considering all of my modest knowledge.
>
>
> We all have modest knowledge. That is the nature of the human condition.
>
> On the other hand, if someone really smart would provide automatic
> grammar extraction by means of unrestricted grammar
> <https://en.wikipedia.org/wiki/Unrestricted_grammar>, I believe that
> <mailto:akol...@gmail.com>> napisao je:
>>> <linasv...@gmail.com <mailto:linasv...@gmail.com>>
>>> <mailto:akol...@gmail.com>> wrote:
>>>
>>> Hi Linas, Andes and whoever understands LG and
>>> English well enough both.
>>>
>>> Attached are first 100 sentences for GC "gold
>>> standard" - manually checked based on LG parses.
>>>
>>> We are expecting more to come in the next two weeks.
>>>
>>> To enable that, please have cursory review of the
>>> corpus and let us know if there are corrections
>>> still needed so your corrections will be used as
>>> a reference to fix the rest and keep going further.
>>>
>>> Thank you,
>>>
>>> -Anton
>>>
>>>
>>> --
>>> You received this message because you are
>>> subscribed to the Google Groups "lang-learn" group.
>>> To unsubscribe from this group and stop receiving
>>> emails from it, send an email to
>>> lang-learn+...@googlegroups.com
>>> <mailto:lang-learn+...@googlegroups.com>.
>>> To post to this group, send email to
>>> lang-...@googlegroups.com
>>> <mailto:lang-...@googlegroups.com>.
>>> <https://groups.google.com/d/msgid/lang-learn/bde76364-a578-4ab8-8ac5-2f49f794072b%40gmail.com?utm_medium=email&utm_source=footer>.
>>> For more options, visit
>>> https://groups.google.com/d/optout.
>>>
>>>
>>>
>>> --
>>> cassette tapes - analog TV - film cameras - you
>>>
>>>
>>>
>>> --
>>> cassette tapes - analog TV - film cameras - you
>>
>> --
>> -Anton Kolonin
>> skype: akolonin
>> cell: +79139250058
>> akol...@aigents.com <mailto:akol...@aigents.com>
>> https://aigents.com
>> https://www.youtube.com/aigents
>> https://www.facebook.com/aigents
>> https://medium.com/@aigents
>> https://steemit.com/@aigents
>> https://golos.blog/@aigents
>> https://vk.com/aigents
>>
>>
>>
>> --
>> cassette tapes - analog TV - film cameras - you
>> --
>> You received this message because you are subscribed to the
>> Google Groups "lang-learn" group.
>> To unsubscribe from this group and stop receiving emails from
>> it, send an email to lang-learn+...@googlegroups.com
>> <mailto:lang-learn+...@googlegroups.com>.
>> To post to this group, send email to
>> lang-...@googlegroups.com <mailto:lang-...@googlegroups.com>.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/lang-learn/CAHrUA36dE5ihtcCaqPv_q4qgmbEy-yX6kTkUHyLZmjk6d4VfOg%40mail.gmail.com
>> <https://groups.google.com/d/msgid/lang-learn/CAHrUA36dE5ihtcCaqPv_q4qgmbEy-yX6kTkUHyLZmjk6d4VfOg%40mail.gmail.com?utm_medium=email&utm_source=footer>.
>> For more options, visit https://groups.google.com/d/optout.
>
> --
> You received this message because you are subscribed to the Google
> Groups "opencog" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to opencog+u...@googlegroups.com
> <mailto:opencog+u...@googlegroups.com>.
> To post to this group, send email to ope...@googlegroups.com
> <mailto:ope...@googlegroups.com>.
> Visit this group at https://groups.google.com/group/opencog.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/opencog/CAHrUA36URqnNdjG-qjAScr-serD%3DoT%2B-%2BHfWkdZZxsKUZXvR8A%40mail.gmail.com
> <https://groups.google.com/d/msgid/opencog/CAHrUA36URqnNdjG-qjAScr-serD%3DoT%2B-%2BHfWkdZZxsKUZXvR8A%40mail.gmail.com?utm_medium=email&utm_source=footer>.

Nil Geisweiller

Apr 1, 2019, 2:51:06 AM4/1/19
to ope...@googlegroups.com, Linas Vepstas, Ivan V., Anton Kolonin @ Gmail, b...@goertzel.org, lang-learn, link-grammar
Hi,

On 4/1/19 5:17 AM, Linas Vepstas wrote:
> But somehow, I suspect... Isn't this why OpenCog has "unified rule
> engine" (URE) instead of link grammar at its core,
>
>
> No. It has the rule-engine because back then, I did not understand
> sheaves. I'm starting to think that the rule engine is a strategic
> mistake. The original idea is that rule-application is the main
> conceptual abstraction of term-rewriting. One rewrites, or proves
> theorems by applying sequences of rules. It turns out that discovering
> the right sequence is hard. Finding correct long sequences is hard - a
> combinatorial explosion.

So is writing programs, yet MOSES still manages to do that and
produce useful models. BTW, writing inference trees (which the URE
does) is almost exactly equivalent to writing programs.

> The openpsi system addresses some of these issues. Unfortunately, its
> current implementation is a tangle of rule-selection mechanisms, and
> theories of human psychology. It's probably better than the URE, but is
> currently not as powerful.

OpenPsi and the URE are 2 different systems, doing different
things. OpenPsi is an action selection mechanism to fulfill urges and
create plans, the URE is an inference tree builder. Though actually
both may need each other. The inference control mechanism in the URE
uses a specialized re-implementation of OpenPsi, and OpenPsi could use
the URE to build plans.

> I'm trying to place a theory of sheaves as a replacement for URE, and as
> the natural generalization of openpsi, but I've successfully
> self-sabotaged myself in these efforts.

Linas, what is a good starting point to understand what you're trying
to accomplish?

Here?
https://github.com/opencog/atomspace/blob/master/opencog/sheaf/README.md

Nil


Linas Vepstas

Apr 1, 2019, 6:39:11 PM4/1/19
to Ben Goertzel, Anton Kolonin @ Gmail, Ivan V., Alexey Potapov, Oleg Baskov, lang-learn, link-grammar, opencog
OK, there's clearly a lot of work happening in linguistics these days that I have fallen behind on reading.

The nature of the conversations here has been frustrating, because so far, it sounds like an attempt to evade the  "central limit theorem" -- https://en.wikipedia.org/wiki/Central_limit_theorem

There are two related ideas I'm trying to get across: one is that if you make enough observations of a phenomenon, eventually, the central-limit theorem kicks in, and smooths over random variations.  Specifically, I claim that, despite MST being imperfect, a large number of observations should smooth over the imperfections. I believe this to be true, (but I could be wrong).

The other idea is that the golden test corpus must avoid accidentally testing disjuncts far away from the central limit -- to avoid, as it were, making statements analogous to "Well, I flipped the coin three times, and I did not get 50-50 odds, therefore the theory doesn't work". You have to flip the coin at least N times, for some large N.  Here, for MST, we don't know how big N has to be, and we don't have a good plan for determining N. It's worse, because everything is Zipfian, aka 1/f noise. It is possible that BERT or other approaches allow smaller values of N to work, but this is also not clear.
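
The coin-flip point can be checked with a few lines of Python. This is only a sketch of the general statistical effect, not of MST or disjunct counting; the function names and the seed are made up for illustration. With three flips the observed fraction of heads can only be 0, 1/3, 2/3 or 1, so it can never equal 0.5, while the expected error shrinks like 1/sqrt(N) as N grows.

```python
import random


def heads_fraction(n_flips, seed=0):
    """Fraction of heads observed in n_flips tosses of a simulated fair coin."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = sum(rng.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips


def standard_error(n_flips, p=0.5):
    """Theoretical standard deviation of the observed fraction of heads."""
    return (p * (1 - p) / n_flips) ** 0.5


if __name__ == "__main__":
    # Print N, the observed fraction, and the expected size of the error.
    for n in (3, 30, 300, 3000, 30000):
        print(n, heads_fraction(n), round(standard_error(n), 4))
```

For a word seen only N=3 times, whatever is counted about it is in the same position as the three-flip coin.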

It's also not clear that BERT would converge to a different limit than MST - the central-limit theorem says there is only one limit -- not two. But perhaps I'm misapplying it, perhaps I'm neglecting some important effect.  Without measurements, it's hard to guess what that effect is (if it even exists).

Anyway, I have a backlog of half-a-dozen important unread papers, so I'll try to get around to that "real soon now".

--linas


Linas Vepstas

Apr 2, 2019, 12:55:24 AM4/2/19
to Nil Geisweiller, opencog, Ivan V., Anton Kolonin @ Gmail, b...@goertzel.org, lang-learn, link-grammar
On Mon, Apr 1, 2019 at 1:51 AM Nil Geisweiller <ngei...@googlemail.com> wrote:
Hi,

On 4/1/19 5:17 AM, Linas Vepstas wrote:
... 
MOSES, URE and OpenPsi ... 

Linas, what is a good starting point to understand what you're trying
to accomplish?

Here?
https://github.com/opencog/atomspace/blob/master/opencog/sheaf/README.md

Yes, although perhaps, by now, that might be "naive and out of date". There are two PDFs: one I'd like to call "casual, without formulas", and another "with formulas".  As always: 



But maybe what I'm really proposing is "comparative AI": compare two different approaches/solutions/algorithms, and identify what they have in common, and ask: is there a general algo, of which the first two are a special-case? 

Most fruitful today might be to pick your favorite symbolic-AI algo/theory, and find some neural-net that solves a similar problem, and then try to compare/contrast/unify those two.   The skippy.pdf paper tries to explicitly show that the symbolic-AI-of-link-grammar-style is "just like" a neural net -- two sides of the same coin. I then try to show both sides of that coin, at once, thus "explaining" how/why neural nets work, and at the same time, showing how deep-learning style algos can be transferred over to symbolic AI.  More generally, how deep learning algos are an example from a larger class of algos, all of which "solve the same kinds of problems".  It's just not generally recognized that they are in the same class, because they arose in different niches, each using their own vocabulary, their own notation, formulas, conventions, which obscures the fact that they are all variations of one another.  So the idea is -- provide a rosetta stone, translation across the different disciplines. 

Both of those PDFs are in some almost-but-not-quite finished state; I need to finish them, and then get the next set of ideas written down. I've been procrastinating.

--linas

Linas Vepstas

Apr 2, 2019, 1:09:31 PM4/2/19
to Anton Kolonin @ Gmail, Ben Goertzel, Ivan V., Alexey Potapov, Oleg Baskov, lang-learn, link-grammar, opencog
On Tue, Apr 2, 2019 at 12:57 AM Anton Kolonin @ Gmail <akol...@gmail.com> wrote:

Hi Linas,

Are you saying that "while ULL team has found strong linear correlation between A) quality (F1) on input parses and B) quality (F1) of the output parses based on the grammar learned from the input parses, this phenomenon is due to the fact that they test on the entire input corpus so this phenomenon should go away once they test on gold standard corpus consisting only of sentences with high-frequency words"?


I am saying that I have not seen any evidence at all that you actually constructed or counted disjuncts, or that you clustered disjuncts, or that you controlled or managed counting in any way.

So -- you did something ... but I don't understand what that "something" is, and, based on these conversations, that "something" does not match up with what I had hoped that you would be doing. 

It's not just high-frequency words.  It's also how you perform clustering.  Are you using MI for that? Or cosines for that?  Are you handling word-sense disambiguation, or not? How did you handle WSD? Through orthogonalization of cosines? Through maximization of MI? By computing a Markov vector? Some other way?  Did you perform any data cuts during orthogonalization/maximization? What kind of cuts were they? How do the cuts affect the F1-score?
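
To make the MI-versus-cosine question concrete, here is a minimal Python sketch of the two measures computed from (word, disjunct) counts. Everything in it is an assumption made for illustration: the toy counts, the disjunct strings and the function names are invented, and this is not the ULL pipeline's code or data.

```python
import math
from collections import Counter

# Hypothetical toy counts of (word, disjunct) observations; not real data.
counts = Counter({
    ("cat", "D- & S+"): 8,
    ("cat", "D- & O-"): 4,
    ("dog", "D- & S+"): 6,
    ("dog", "D- & O-"): 5,
    ("ran", "S-"): 9,
})


def vector(word):
    """Sparse disjunct-count vector for one word."""
    return {d: n for (w, d), n in counts.items() if w == word}


def cosine(a, b):
    """Cosine similarity between the disjunct vectors of words a and b."""
    va, vb = vector(a), vector(b)
    dot = sum(va[d] * vb[d] for d in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(x * x for x in va.values()))
    norm_b = math.sqrt(sum(x * x for x in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def pmi(word, disjunct):
    """Pointwise mutual information (base 2) of a (word, disjunct) pair."""
    total = sum(counts.values())
    joint = counts[(word, disjunct)]
    if joint == 0:
        return float("-inf")  # the pair was never observed
    p_word = sum(n for (w, _), n in counts.items() if w == word) / total
    p_disj = sum(n for (_, d), n in counts.items() if d == disjunct) / total
    return math.log2((joint / total) / (p_word * p_disj))
```

With these toy numbers "cat" and "dog" come out close in cosine terms, while "ran" shares no disjunct with either. Both measures are only as reliable as the counts behind them, which is why the data cuts matter.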

All of these things deserve "instrumental verification". Without them, I don't know how to assign any meaning to F1-scores (or ROC curves, which you haven't shown - and even if you did show them, I would not know what they mean, until the above questions are resolved.)
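
For reference, this is the kind of F1-score being discussed, sketched in Python under the assumption that a parse is scored as a set of undirected links between word positions and compared against a gold-standard link set. The function name and the scoring convention are assumptions made here; the ULL team's actual scoring may differ.

```python
def link_f1(predicted, gold):
    """F1 of predicted links against gold links, ignoring link direction."""
    pred = {frozenset(link) for link in predicted}
    ref = {frozenset(link) for link in gold}
    hits = len(pred & ref)  # links present in both parses
    if hits == 0:
        return 0.0
    precision = hits / len(pred)
    recall = hits / len(ref)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    gold = [(0, 1), (1, 2), (2, 3)]
    guess = [(0, 1), (0, 2), (2, 3)]
    print(link_f1(guess, gold))
```

A single number like this says nothing about how the disjuncts behind the parse were counted or clustered, which is the point of the questions above.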

So I've got this ball of questions, and I'm getting unclear, confusing answers to them.

-- Linas

Best regards,

-Anton


02.04.2019 5:38, Linas Vepstas wrote:


Linas Vepstas

Apr 2, 2019, 1:30:37 PM4/2/19
to Anton Kolonin @ Gmail, Ben Goertzel, Ivan V., Alexey Potapov, Oleg Baskov, lang-learn, link-grammar, opencog
Anton,

I feel like I am torturing you, and for that, I apologize. But when I read your attachments, I see things like this:  Page 2:

> Obstacles
> The following problems have been encountered during the research. 
>
> 1. The MST-parses quality seems to be the main blocking issue preventing learning the reasonable and useful Link Grammar rules.

To me, this claim sounds absurd and outlandish -- it is a direct violation of the central limit theorem.  The quality of the MST parses should have no effect (minimal effect) on the learned grammar.  That is what the central limit theorem says. So, as bullet/obstacle #1, you claim that a cornerstone, a foundation-stone of probability theory is false? Incorrect? Inapplicable?  What evidence do you provide that your claim is true?  Well -- none, so far.

Now, perhaps the central limit theorem does not apply to language-learning. Perhaps it does not work the way that I think it would work.  Perhaps it's invalid for Zipfian distributions. Perhaps it requires more iterations. We can be creative and invent many "perhaps". Which of these is it?

You cannot just discard one of the cornerstone theorems of probability without explaining how/why. Extraordinary claims require extraordinary proof. Provide that proof.

-- Linas




On Tue, Apr 2, 2019 at 12:57 AM Anton Kolonin @ Gmail <akol...@gmail.com> wrote:

Hi Linas,

Are you saying that "while ULL team has found strong linear correlation between A) quality (F1) on input parses and B) quality (F1) of the output parses based on the grammar learned from the input parses, this phenomenon is due to the fact that they test on the entire input corpus so this phenomenon should go away once they test on gold standard corpus consisting only of sentences with high-frequency words"?

If so, I hope we will have this premise verified instrumentally.

Best regards,

-Anton


02.04.2019 5:38, Linas Vepstas wrote:
OK, there's clearly a lot of work happening in linguistics these days that I have fallen behind on reading.


Linas Vepstas

Apr 2, 2019, 1:35:12 PM4/2/19
to Anton Kolonin @ Gmail, Ben Goertzel, Ivan V., Alexey Potapov, Oleg Baskov, lang-learn, link-grammar, opencog
Anton, that paper does not address, answer, talk about or mention any of the questions I have posed to you. I do not understand why we are feuding about this all the time. --linas

On Tue, Apr 2, 2019 at 12:32 PM Anton Kolonin @ Gmail <akol...@gmail.com> wrote:

>I don't understand what that "something"

Hi Linas, last year's paper is here:

http://langlearn.singularitynet.io/data/docs/

This year's paper draft is attached.

Cheers,

-Anton

03.04.2019 0:09, Linas Vepstas wrote:


Ben Goertzel

Apr 2, 2019, 7:51:20 PM4/2/19
to Linas Vepstas, Anton Kolonin @ Gmail, Ivan V., Alexey Potapov, Oleg Baskov, lang-learn, link-grammar, opencog
On Wed, Apr 3, 2019 at 2:31 AM Linas Vepstas <linasv...@gmail.com> wrote:
>
> Anton,
>
> I feel like I am torturing you, and for that, I apologize. But when I read your attachments, I see things like this: Page 2:
>
> > Obstacles
> > The following problems have been encountered during the research.
> >
> > 1. The MST-parses quality seems to be the main blocking issue preventing learning the reasonable and useful Link Grammar rules.
>
> To me, this claim sounds absurd and outlandish -- it is a direct violation of the central limit theorem. The quality of the MST parses should have no effect (minimal effect) on the learned grammar. That is what the central limit theorem says. So, as bullet/obstacle #1, you claim that a cornerstone, a foundation-stone of probability theory is false? Incorrect? Inapplicable? What evidence do you provide that your claim is true? Well -- none, so far.


Linas, is it possible for you to spell out in an email what is your
argument that {the central limit theorem implies the MST parses should
have minimal effect on the learned grammar} ?

I don't really get why you believe this, and I have read your various
documents and also obviously spent a fair bit of time thinking about
the unsupervised language learning methods we are playing with...

It seems to me that the MST parsing step in our pipeline is where some
key grammatical constraints are imposed (e.g. no links cross, complete
connectivity etc.), and thus doing this step should impact the learned
grammar even in the large...
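
The two constraints named above (no links cross, complete connectivity) can be written down as checks on a candidate parse. The Python below is a rough sketch under the assumption that a parse is a list of links between 0-based word positions; the function names are invented here and this is not the pipeline's MST parser.

```python
def links_cross(a, b):
    """True if links a and b cross when both are drawn above the sentence."""
    i, j = sorted(a)
    k, l = sorted(b)
    return i < k < j < l or k < i < l < j


def is_planar(links):
    """True if no pair of links in the parse crosses."""
    links = list(links)
    return not any(
        links_cross(links[x], links[y])
        for x in range(len(links))
        for y in range(x + 1, len(links))
    )


def is_connected(n_words, links):
    """True if every word is reachable from word 0 by following links."""
    if n_words == 0:
        return True
    adjacency = {w: set() for w in range(n_words)}
    for i, j in links:
        adjacency[i].add(j)
        adjacency[j].add(i)
    seen, stack = {0}, [0]
    while stack:  # depth-first walk over the link graph
        for nxt in adjacency[stack.pop()] - seen:
            seen.add(nxt)
            stack.append(nxt)
    return len(seen) == n_words
```

A parser that only ever outputs link sets passing both checks never feeds the counting stage anything outside that set, however many sentences are observed.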

But I would like to understand what you're thinking better...

thanks
Ben