PCA "neural" classifier/pattern miner for grammatical classification


Linas Vepstas

Jun 6, 2017, 2:13:39 AM
to Ben Goertzel, Ruiting Lian, Amen Belayneh, Curtis M. Faith, opencog, link-grammar, Shujing Ke
Ben,

The attached PDF describes the algorithm I plan to implement for performing the actual clustering. As of right now, I really like it: it's simple, it's straightforward, and I believe it will work well. It might be a real CPU burner, though, and so blue skies might bring tears.

I like to think of it as a kind-of "pattern miner", as it can be made completely generic; it works for any correlation matrix.  I suspect that it is totally different from what Shujing does, which is still on my list of things to study in greater detail.

--linas
classifier.pdf
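
The attached PDF is not reproduced in this thread, so the following is only a minimal sketch of the general idea of clustering from a correlation matrix by "linear algebra plus thresholding" (the phrase used in the reply below). It is not the algorithm from the PDF. The function names, the toy similarity matrix and the cutoff value are all assumptions made up for illustration.

```python
import math

def principal_component(matrix, iterations=200):
    """Approximate the dominant eigenvector of a symmetric matrix
    by plain power iteration (hypothetical helper)."""
    n = len(matrix)
    vec = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            break
        vec = [x / norm for x in nxt]
    return vec

def threshold_cluster(labels, vec, cutoff):
    """Keep the labels whose component is at least `cutoff` in magnitude."""
    return [label for label, x in zip(labels, vec) if abs(x) >= cutoff]

# Toy, made-up similarity matrix: two nouns that correlate strongly,
# two verbs that correlate weakly with everything else.
words = ["cat", "dog", "run", "walk"]
similarity = [
    [1.0, 0.9, 0.1, 0.0],
    [0.9, 1.0, 0.0, 0.1],
    [0.1, 0.0, 1.0, 0.3],
    [0.0, 0.1, 0.3, 1.0],
]

component = principal_component(similarity)
cluster = threshold_cluster(words, component, cutoff=0.5)
print(cluster)
```

Nothing in the sketch is specific to words: any symmetric correlation matrix can be passed in, which is the sense in which such a classifier is generic.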

Ben Goertzel

Jun 6, 2017, 8:27:43 AM
to link-grammar, Ruiting Lian, Amen Belayneh, Curtis M. Faith, opencog, Shujing Ke
Interesting!

Clearly this is doing the right sorts of things, so it should do
something in the vicinity of what's needed...

Still, though: my intuition remains that a more fully nonlinear NN
approach might do better than a "linear algebra plus thresholding"
approach like this... Put differently, I think we need some more
powerful learning method like evolutionary learning or backprop in
there, to capture the nonlinear dependencies between word tuples...

But this seems worth trying and who knows, maybe it will be awesome...
it will be good to compare different approaches...

ben
> --
> You received this message because you are subscribed to the Google Groups
> "link-grammar" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to link-grammar...@googlegroups.com.
> To post to this group, send email to link-g...@googlegroups.com.
> Visit this group at https://groups.google.com/group/link-grammar.
> For more options, visit https://groups.google.com/d/optout.



--
Ben Goertzel, PhD
http://goertzel.org

"I am God! I am nothing, I'm play, I am freedom, I am life. I am the
boundary, I am the peak." -- Alexander Scriabin

Linas Vepstas

Jun 7, 2017, 12:04:57 AM
to link-grammar, Ruiting Lian, Amen Belayneh, Curtis M. Faith, opencog, Shujing Ke
Maybe.

Note, though, that there is something really, really important that happens when "word tuples" get replaced by disjuncts. It's hard to talk about because it's both "obvious" and totally obscure...

The point is that if you have some complex pattern of things connected to other things, you have this problem of trying to figure out how to count it, which led to our arguments about "surprisingness" earlier. The whole point of the disjunct is that it replaces the pattern by a kind-of snapshot of it, a building-block. And this simplification also makes the complexities of tuples and patterns "go away", or rather, converts them into something manageable.

Again: it decomposes a pattern into the building-blocks of a pattern. I think this means that you can use linear tools on these building blocks, and when you re-assemble the whole pattern, the non-linearity re-emerges.

Perhaps one way to think of this is .. well, if you recall what an atlas is, in topology: it's a set of flat maps that you can glue together to get a non-flat manifold. Quite literally: the earth is round, but maps are flat; you glue them together to get a round earth.

So same here: the disjuncts are the flattened parts of a pattern. You can glue them together to get the whole complex pattern, but by working with the flattened pieces, everything becomes much, much simpler. So your word-tuples are undoubtedly non-linear, and that is the point: don't work with word-tuples. They are a difficult, bad representation.

--linas
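
To make "linear tools on the building blocks" concrete, here is a small sketch, assuming each word is represented by a vector of counts over the disjuncts it was observed with. The counts, the simplified disjunct strings ("-" links to the left, "+" to the right) and the `cosine` helper are all made up for illustration; they are not taken from the attached PDF or from any real dataset.

```python
import math
from collections import Counter

# Hypothetical observation counts: word -> how often each disjunct was seen.
# The disjunct strings are a simplified, illustrative notation.
observations = {
    "cat":  Counter({"D- S+": 8, "D- O-": 5}),
    "dog":  Counter({"D- S+": 6, "D- O-": 7}),
    "runs": Counter({"S-": 9, "S- O+": 2}),
}

def cosine(a, b):
    """Cosine similarity of two sparse count vectors (a purely linear tool)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

noun_noun = cosine(observations["cat"], observations["dog"])
noun_verb = cosine(observations["cat"], observations["runs"])
print(round(noun_noun, 3), noun_verb)
```

The comparison itself is just a dot product. Any non-linearity lives in how the disjuncts are later glued back together into a full parse, not in the per-word vectors.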




Ben Goertzel

Jun 7, 2017, 2:50:45 AM
to link-grammar, Ruiting Lian, Amen Belayneh, Curtis M. Faith, opencog, Shujing Ke
On Wed, Jun 7, 2017 at 12:04 PM, Linas Vepstas <linasv...@gmail.com> wrote:
> So same here: the disjuncts are the flattened parts of a pattern. you can
> glue them together to get the whole complex pattern, but by working with the
> flattened pieces, everything becomes much much simpler. So your word-tuples
> are undoubtedly non-linear: and that is the point: don't work with
> word-tuples. They are a difficult, bad representation.


Yes, that much is very clear ... one big advantage of our approach is
that we're NOT just acting on word tuples and doing statistics; we're
alternating steps of

-- forming parse trees

-- gathering statistics based on abstractions from these parse trees

... ben

Linas Vepstas

Jun 7, 2017, 5:53:15 PM
to link-grammar, Ruiting Lian, Amen Belayneh, Curtis M. Faith, opencog, Shujing Ke
Well, I'm trying to say something more than that. They're not just "statistics based on abstraction", and the abstractions have nothing to do with natural language; they can be formed for any pattern whatsoever.

The "abstractions" are the decomposed parts or components of a pattern.  They describe how one part of a pattern fits with another part, how it interacts with another part.

Back in the day, you had a fascination with combinators, and the reason for that was legit: combinators represent a problem in certain nice ways that lambda cannot.

What I'm saying is that disjuncts are kind-of-like combinators. They abstract away the connection into a connector, so that the details of the connection no longer matter.

The deal was that combinators were not typed (because lambda calculus is not typed). I claim that the disjuncts are like combinators, but they carry the types along with them. Also, they carry directional information, unlike the combinators: the combinators were hard to use, because they carried implicit positional information: you had to write them in the correct order. With disjuncts, you don't have to write them in any order, you only have to connect them in the correct order.

It is this disentanglement from order and relation dependencies that makes them powerful. You no longer have to deal with the complexities of the full pattern, or the string-limitations of lambda.

--linas
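
A rough sketch of the "typed, directional connector" idea, under simplifying assumptions: each connector carries a type and a direction, a disjunct is an unordered set of connectors, and two connectors link only when the types agree and the directions face each other. The `Connector` class, the `links_between` helper and the three-word toy lexicon are hypothetical names invented here; real Link Grammar dictionaries are richer than this.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Connector:
    link_type: str  # the type carried along, e.g. "S" for a subject link
    direction: str  # "+" seeks a partner to the right, "-" to the left

def can_link(left, right):
    """A connector on the left-hand word mates with one on the right-hand
    word when the types agree and the directions face each other."""
    return (left.link_type == right.link_type
            and left.direction == "+"
            and right.direction == "-")

def links_between(left_disjunct, right_disjunct):
    """Types of all links that can form between two disjuncts."""
    return sorted(l.link_type
                  for l in left_disjunct
                  for r in right_disjunct
                  if can_link(l, r))

# Toy lexicon (illustrative only). Each disjunct is a frozenset, so the
# order in which its connectors are written down carries no meaning.
the = frozenset({Connector("D", "+")})
cat = frozenset({Connector("S", "+"), Connector("D", "-")})
runs = frozenset({Connector("S", "-")})

print(links_between(the, cat))   # determiner on the left of the noun
print(links_between(cat, runs))  # subject on the left of the verb
print(links_between(runs, cat))  # same words, connected the wrong way round
```

The lexicon entries can be listed in any order; only the left/right placement at connection time decides whether a link forms.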

Ben Goertzel

Jun 7, 2017, 11:10:34 PM
to link-grammar, Ruiting Lian, Amen Belayneh, Curtis M. Faith, opencog, Shujing Ke
Yeah, I see how the disjuncts can be viewed as combinators, and that
mapping becomes pretty clear in the "pregroup grammar" formulation of
link grammar, it seems...


