Generalizing the Pattern Miner implementation to make it encompass the MOSES algorithm

Ben Goertzel

Dec 9, 2016, 1:26:57 PM
to opencog, Nil Geisweiller, Linas Vepstas, Shujing Ke
As we've discussed many times, having MOSES implemented outside the
Atomspace is awkward for various reasons...

It seems to me that one could use the Pattern Miner infrastructure,
with only fairly modest modification, to implement a variety of MOSES
that operates in the Atomspace. This would have various advantages,
including enabling the hybridization of pattern mining and MOSES in
various ways.

PRELIMINARIES

Observe the basic parallel:

* the Pattern Miner works by taking an ensemble of patterns, and
growing them into larger patterns by adding more Atoms to them

* MOSES works by taking an ensemble of program trees (the deme
exemplars) and growing them into larger program trees by adding more
nodes to them

The key difference is that in the PM case, the new Atoms being added
are drawn from the Atomspace (with a goal of making the pattern one
that is highly frequent, or surprising, across the Atomspace).
Whereas in the MOSES case, the new nodes being added are “made up”
with a goal of making the program maximize a certain externally given
fitness function. (However, in an EDA-based version of MOSES, the
new nodes added are drawn from a probability distribution mined by
analysis of the program tree ensemble, with a view toward finding
patterns of nodes that lead to high fitness.)

If we view frequency/surprisingness as one fitness function among
many; and “forming patterns already existing in the Atomspace” as one
heuristic for pattern-expansion among many … then we can see the PM
and MOSES as aspects of the same basic dynamic.

A representational difference is that in the PM, all the patterns in
the ensemble are represented as parts of the same data structure, the
H-Graph (a collection of Atoms in a special Atomspace). In MOSES the
programs in the ensemble are stored as distinct objects, with no
attempt to factor out the redundant subprograms in the representation
(though redundant subprograms are identified in the EDA step).

In MOSES, Reduct is used to do program normalization. This concept
does not exist in the current PM. However, it would enhance the PM
considerably, by enabling semantically identical but syntactically
different instances of a pattern to be recognized as identical.

EXPANDED PATTERN MINER ALGORITHM

Incorporating these ideas, the unified/expanded pattern miner
algorithm would look like:

* Start with an ensemble of simple hypergraphs, constructed according
to template T (which is itself represented as a hypergraph)

* Iterate through the hypergraphs H in the ensemble, expanding each
hypergraph H into a set of larger hypergraphs, by applying an
expansion template T1 to some Atoms within H. T1 is a pattern with
variables, and some of the variables in T1 are bound to Atoms in H
whereas others are bound to other Atoms not in H.

* The non-H Atoms filling in the variable slots of T1 are chosen from
a certain probability distribution D, the nature of which is
pre-specified as part of the configuration of the algorithm. For
instance, D could draw from the Atomspace, meaning that only
expansions of H that already exist somewhere in the Atomspace would be
chosen. Or D could be proportional to the estimated fitness of the
expansion of H, according to some fitness estimation method (which
could be based on pattern-analysis of the growing pattern-ensemble
itself; more on this below). Crossover is another possible expansion
distribution.

(Note that the previous two steps could be implemented using SampleLink, i.e.

SampleLink
application of constraint T1 in the context of H
distribution D
)

* Once a particular expansion of H, call it H*, is formed, then its
quality is evaluated by a pre-specified quality measure. One possible
quality measure is the frequency or surprisingness of H* in the
Atomspace (this is the PM approach). Another possible quality measure
is the evaluation of H* according to some externally specified fitness
function (this is the MOSES case).

* Expansions leading to relatively poor-quality H* are pruned.

* Etc.

This process encompasses both PM and MOSES and other possibilities.
To run the process one specifies a template pattern for the initial
population, a template pattern and distribution for pattern-expansion,
and a quality measure for expanded-pattern assessment.
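The generic loop above can be sketched in a few lines of Python. This is a minimal sketch, not the actual Pattern Miner code; the names (`mine`, `expand`, `quality`, `keep_top`) are all hypothetical stand-ins for the three configurable pieces: initial template, expansion scheme, and quality measure.

```python
def mine(initial_ensemble, expand, quality, keep_top=50, iterations=10):
    """Generic expand/evaluate/prune loop covering both the PM and MOSES cases.

    initial_ensemble: hypergraphs built from template T
    expand:  H -> list of candidate expansions H* (template T1 + distribution D)
    quality: H* -> score (frequency/surprisingness for PM, fitness for MOSES)
    """
    ensemble = list(initial_ensemble)
    for _ in range(iterations):
        candidates = []
        for h in ensemble:
            candidates.extend(expand(h))
        # Evaluate every expansion and prune the relatively poor-quality ones.
        scored = sorted(candidates, key=quality, reverse=True)
        ensemble = scored[:keep_top]
    return ensemble

# Toy instantiation: "hypergraphs" are tuples of ints, expansion appends
# a digit, and quality is just the sum (a stand-in for fitness).
best = mine([(0,)], lambda h: [h + (i,) for i in range(3)], sum,
            keep_top=5, iterations=3)
```

Swapping in an Atomspace-frequency quality measure gives the PM; swapping in an external fitness function gives the MOSES case.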

In some uses of MOSES, different demes involve different “feature
sets.” This logic would go into the distribution for
pattern-expansion: each feature-set is a different pattern-expansion
distribution. The “feature selection inside MOSES” logic comprises a
scheme wherein the feature-sets leading to successful expansion
operations on H are tweaked, and the tweaked versions are then used
for a percentage of the new expansion operations on H.
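The tweaking step might look like the following sketch, where a successful feature-set is perturbed by swapping one feature. The function name and the swap-one-feature rule are illustrative assumptions, not the actual MOSES feature-selection code.

```python
import random

def tweak_feature_set(fs, all_features, rng=random):
    """Hypothetical sketch of the tweaking scheme: a feature-set that led
    to successful expansions of H is perturbed by swapping one of its
    features for one outside it; the tweaked set can then drive a
    percentage of the new expansion operations on H."""
    fs = set(fs)
    outside = sorted(set(all_features) - fs)
    if fs and outside:
        fs.remove(rng.choice(sorted(fs)))   # drop one current feature
        fs.add(rng.choice(outside))         # pull in a fresh one
    return frozenset(fs)
```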

Finally, an elegant aspect of this implementation of MOSES would be
that the EDA distribution-mining step would itself be implemented
using pattern mining on the evolving H-Graph (the program tree
ensemble). Pattern mining would be applied to the ensemble to find
patterns (expressed as Atoms) that occur surprisingly frequently in
highly “fit” (high quality) patterns in the H-Graph. These patterns
would then be used to constitute the pattern-expansion distribution of
the H-Graph.
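A toy version of that EDA step, as a sketch (all names hypothetical; a real version would mine structured subpatterns of the H-Graph, not just individual elements):

```python
from collections import Counter

def mine_expansion_distribution(ensemble, quality, top_frac=0.5):
    """Sketch of the EDA step: rank the ensemble by quality, count which
    elements occur in the top fraction, and normalize the counts into a
    probability distribution used to bias future pattern expansions."""
    ranked = sorted(ensemble, key=quality, reverse=True)
    elite = ranked[:max(1, int(len(ranked) * top_frac))]
    counts = Counter(x for h in elite for x in h)
    total = sum(counts.values())
    return {x: c / total for x, c in counts.items()}
```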

Reduction could be implemented as an operation occurring on a
newly-expanded pattern H, immediately after H’s quality is evaluated
(unless the choice is made to prune H immediately). As in MOSES,
reduction would make pattern mining on the H-Graph work better. In a
PM application, reduction could either be done in this way or done on
the Atomspace as a whole, in advance.
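To illustrate what reduction buys you, here is a toy normalizer, assuming patterns are nested tuples with AndLink/OrLink connectives; this is not the real Reduct, just a sketch of the canonicalization idea:

```python
def reduce_pattern(p):
    """Toy Reduct-style normalizer (hypothetical, not the actual Reduct):
    recursively sort the arguments of commutative connectives, so that
    semantically identical but syntactically different patterns collapse
    to a single canonical form and duplicates become detectable."""
    if isinstance(p, tuple):
        op, *args = p
        args = [reduce_pattern(a) for a in args]
        if op in ("AndLink", "OrLink"):   # commutative: argument order is irrelevant
            args.sort(key=repr)
        return (op, *args)
    return p
```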

EFFICIENCY OPTIMIZATION

The “plumbing” of this MOSES implementation would be slower than that
of the current MOSES, because the evaluation of each program-tree
(each pattern) would use the Atom formalism. However, this could be
optimized, as a later implementation step, if it appears necessary.
The way to optimize it would be to maintain a faster, non-Atom-based
version of the H-Graph, which is optimized for rapid evaluation of the
sub-hypergraphs in the H-Graph. When an expanded pattern is added to
the H-Graph, a corresponding expanded pattern must be added to the
optimized H-Graph. There would be some overhead in this scheme, but
if the fitness function involved (e.g.) repeated execution of patterns
against some dataset, the overhead would be worth it. The formalism
of linking executable subhypergraphs of the Atomspace to optimized
versions would be more broadly useful. (Possibly, the optimized
H-Graph structure could simply be built using the VertexTrees used in
the current MOSES implementation.)
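The sync-on-add scheme might be structured like this sketch (class and method names are hypothetical; the fast mirror here is a plain Python closure standing in for something like a VertexTree):

```python
class MirroredHGraph:
    """Sketch of the proposed optimization: every pattern added to the
    Atom-level H-Graph also gets a 'compiled' counterpart kept in a
    parallel fast structure, so repeated fitness evaluations never
    touch the slow Atom representation."""

    def __init__(self):
        self.atoms = {}      # slow, Atom-style representation
        self.compiled = {}   # fast mirror, keyed by the same pattern id

    def add(self, pid, pattern, compile_fn):
        # Keep both representations in sync on every expansion.
        self.atoms[pid] = pattern
        self.compiled[pid] = compile_fn(pattern)

    def evaluate(self, pid, row):
        # Fitness evaluation only ever hits the fast mirror.
        return self.compiled[pid](row)

# Toy use: a pattern is a set of required features, compiled to a
# subset test over data rows.
hg = MirroredHGraph()
hg.add("p1", {"a", "b"}, lambda feats: (lambda row: feats <= row))
```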

Among many other applications: if one wished to use MOSES as a
fancier version of pattern mining, able to make non-greedy leaps from
small patterns to significantly larger patterns, this would be doable
very directly in this framework. (This application is actually what
originally triggered the above mode of thinking. I think this sort of
hybridization of MOSES and pattern mining will be needed to make
pattern mining really impactful on biology Atomspaces, for example.)

IMPLEMENTATION STRATEGY

The reason I mention this now is that we are shortly going to embark
on some improvements to the Pattern Miner implementation. One
possible way to do this would be to

* implement SampleLink

* modify the Pattern Miner to use SampleLink, and to use the abstract
workflow implied in the above “expanded Pattern Miner” description

* then, in parallel with using the refactored Pattern Miner for
pattern mining, build a new MOSES on top of the expanded Pattern
Miner infrastructure

* finally, do the MOSES efficiency optimization suggested above, if
some other efficiency optimization isn’t concocted


And then, as the Aussies say, Bob's your Uncle... ;)

--
Ben Goertzel, PhD
http://goertzel.org

“I tell my students, when you go to these meetings, see what direction
everyone is headed, so you can go in the opposite direction. Don’t
polish the brass on the bandwagon.” – V. S. Ramachandran

Nil Geisweiller

Dec 11, 2016, 11:19:40 PM
to Ben Goertzel, opencog, Nil Geisweiller, Linas Vepstas, Shujing Ke


On 12/09/2016 08:26 PM, Ben Goertzel wrote:
> EXPANDED PATTERN MINER ALGORITHM
>
> Incorporating these ideas, the unified/expanded pattern miner
> algorithm would look like:
>
> * Start with an ensemble of simple hypergraphs, constructed according
> to template T (which is itself represented as a hypergraph)
>
> * Iterate through the hypergraphs H in the ensemble, expanding each
> hypergraph H into a set of larger hypergraphs, by applying an
> expansion template T1 to some Atoms within H. T1 is a pattern with
> variables, and some of the variables in T1 are bound to Atoms in H
> whereas others are bound to other Atoms not in H.

Interesting. So the way I see it is that you start from a general
pattern T. You find atoms that meet that pattern; based on these atoms
you refine T into T1, or perhaps rather an ensemble of T1s, right?

You do need a scheme that defines how you go from T to T1 though, what
is called knob/representation building in MOSES, right? It would be
cool if such a scheme could be coded as a BindLink or something like that.

> EFFICIENCY OPTIMIZATION
>
> The “plumbing” of this MOSES implementation would be slower than that
> of the current MOSES, because the evaluation of each program-tree
> (each pattern) would use the Atom formalism. However, this could be
> optimized, as a later implementation step, if it appears necessary.
> The way to optimize it would be to maintain a faster, non-Atom-based
> version of the H-Graph, which is optimized for rapid evaluation of the
> sub-hypergraphs in the H-Graph. When an expanded pattern is added to
> the H-Graph, a corresponding expanded pattern must be added to the
> optimized H-Graph. There would be some overhead in this scheme, but
> if the fitness function involved (e.g.) repeated execution of patterns
> against some dataset, the overhead would be worth it. The formalism
> of linking executable subhypergraphs of the Atomspace to optimized
> versions would be more broadly useful. (Possibly, the optimized
> H-Graph structure could simply be built using the VertexTrees used in
> the current MOSES implementation.)

It was clear to me whether using the AtomSpace as working DB for the
evolving trees would speed up or slow down MOSES. One thing that could
speed it up is enabling memoization at the subtree level, as opposed to
the tree level as in MOSES. Seemingly ProtoAtoms could be used as a
cache memory over subtrees. Who knows what kind of speed up that may yield?
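The subtree-level memoization idea can be sketched as follows (hypothetical, not an actual ProtoAtom API): evaluating a boolean program tree caches every subtree's value per environment, so a subtree shared by two candidate programs is only computed once.

```python
def make_evaluator():
    """Build an evaluator for boolean program trees with a cache shared
    across calls, keyed by (subtree, environment)."""
    cache = {}

    def ev(tree, env):
        key = (tree, frozenset(env.items()))
        if key in cache:                   # subtree already computed once
            return cache[key]
        if isinstance(tree, str):          # leaf: a variable lookup
            val = env[tree]
        else:                              # internal node: (op, *children)
            op, *args = tree
            vals = [ev(a, env) for a in args]
            val = all(vals) if op == "and" else any(vals)
        cache[key] = val
        return val

    return ev, cache
```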

Nil

Nil Geisweiller

Dec 11, 2016, 11:34:43 PM
to Nil Geisweiller, Ben Goertzel, opencog, Linas Vepstas, Shujing Ke


On 12/12/2016 06:19 AM, Nil Geisweiller wrote:
> It was clear to me whether using the AtomSpace as working DB for the

I forgot to insert the word "never" in here.

Nil

Ben Goertzel

Dec 12, 2016, 3:07:57 AM
to Nil Geisweiller, opencog, Linas Vepstas, Shujing Ke
Nil,

> Interesting. So the way I see it is that you start from a general pattern T.
> You find atoms that meet that pattern, based on these atoms you refine T
> into T1, or perhaps rather an ensemble of T1s, right?
>
> You do need a scheme that defines how you go from T to T1 though, what is
> called knob/representation building in MOSES, right? Would be cool if such
> scheme can be coded as a BindLink or something like that.

yes...

The Pattern Miner contains one such scheme for going from T to T1,
which is hard-wired into its C++ code...

In the proposed refactoring I'm suggesting, this code would be
abstracted so that the basic "incremental growth of an ensemble of
sub-hypergraphs" process can be done according to any specified scheme
for going from T to T1 ... and indeed such a scheme should ideally be
represented in Atomese as well...


> It was [never] clear to me whether using the AtomSpace as working DB for the
> evolving trees would speed up or slow down MOSES. One thing that could speed
> it up is enabling memoization at the subtree level, as opposed to the tree
> level as in MOSES. Seemingly ProtoAtoms could be used as a cache memory over
> subtrees. Who knows what kind of speed up that may yield?

Yeah, for some fitness functions, subtree-level memoization or other
Atomspacey tricks could speed up MOSES evaluation ... and for others,
the raw speed increase of VertexTrees over the Atomspace will be
dominant ...

I would imagine that for numerical-data-analysis type applications,
e.g. financial prediction, the current MOSES would be faster, because
memoization probably isn't going to get you that far in that context,
and ultimately you just want to evaluate a wide variety of different
Boolean combinations on a wide variety of datasets really fast...

However, for applications involving more abstract programming
constructs, a smarter Reduct is what's going to give huge
improvements, and we'd rather implement that in OpenCog ...

Obviously we will just need to implement the Pattern Miner-y version
of MOSES and then profile it on different types of problems, and look
at the best routes to optimization when we get there... premature
optimization yadda yadda .... As the Atomspace is not *that*
inefficient now, I don't think the Pattern Miner-y MOSES is gonna be
so slow as to be a non-starter...

-- Ben