As we've discussed many times, having MOSES implemented outside the
Atomspace is awkward for various reasons...
It seems to me that one could use the Pattern Miner infrastructure,
with only fairly modest modification, to implement a variant of MOSES
that operates in the Atomspace. This would have several advantages,
including enabling the hybridization of pattern mining and MOSES in
various ways.
PRELIMINARIES
Observe the basic parallel:
* the Pattern Miner works by taking an ensemble of patterns, and
growing them into larger patterns by adding more Atoms to them
* MOSES works by taking an ensemble of program trees (the deme
exemplars) and growing them into larger program trees by adding more
nodes to them
The key difference is that in the PM case, the new Atoms being added
are drawn from the Atomspace (with a goal of making the pattern one
that is highly frequent, or surprising, across the Atomspace).
Whereas in the MOSES case, the new nodes being added are “made up”
with a goal of making the program maximize a certain externally given
fitness function. (However, in an EDA-based version of MOSES, the
new nodes added are drawn from a probability distribution mined by
analysis of the program tree ensemble, with a view toward finding
patterns of nodes that lead to high fitness.)
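The EDA flavor can be sketched in a few lines of toy Python (all names
here are hypothetical stand-ins, with tuples of labels standing in for
program trees): mine node frequencies from the fitter part of the
ensemble, then sample new nodes from the resulting distribution.

```python
import random
from collections import Counter

def mine_node_distribution(ensemble, fitness, top_fraction=0.5):
    """Count which nodes occur in the fitter programs; use the counts
    as an (unnormalized) sampling distribution for new nodes."""
    ranked = sorted(ensemble, key=fitness, reverse=True)
    elite = ranked[: max(1, int(len(ranked) * top_fraction))]
    return Counter(node for program in elite for node in program)

def sample_new_node(distribution, rng=random):
    """Draw a node proportionally to its frequency among fit programs."""
    nodes = list(distribution)
    weights = [distribution[n] for n in nodes]
    return rng.choices(nodes, weights=weights, k=1)[0]

# Toy ensemble: "programs" are just tuples of node labels.
ensemble = [("plus", "x", "y"), ("times", "x", "x"), ("plus", "x", "x")]
fitness = lambda p: p.count("x")   # stand-in fitness function
dist = mine_node_distribution(ensemble, fitness)
new_node = sample_new_node(dist)
```

In a real implementation the "nodes" would be Atoms and the fitness
would come from program evaluation, but the shape of the step is the same.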
If we view frequency/surprisingness as one fitness function among
many, and “forming patterns already existing in the Atomspace” as one
heuristic for pattern-expansion among many, then we can see the PM
and MOSES as aspects of the same basic dynamic.
A representational difference is that in the PM, all the patterns in
the ensemble are represented as parts of the same data structure, the
H-Graph (a collection of Atoms in a special Atomspace). In MOSES the
programs in the ensemble are stored as distinct objects, with no
attempt to factor out the redundant subprograms in the representation
(though redundant subprograms are identified in the EDA step).
In MOSES, Reduct is used to do program normalization. This concept
does not exist in the current PM. However, it would enhance the PM
considerably, by enabling semantically identical but syntactically
different instances of a pattern to be recognized as identical.
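A minimal sketch of what such normalization does (hypothetical names;
real Reduct applies a much richer rule set): canonicalize the argument
order of commutative operators, so syntactically different but
semantically identical patterns compare equal.

```python
def reduce_pattern(expr):
    """Toy Reduct-style normalizer: recursively sort the arguments of
    commutative operators into a single canonical syntactic form."""
    COMMUTATIVE = {"And", "Or", "Plus"}
    if isinstance(expr, tuple):
        op, *args = expr
        args = [reduce_pattern(a) for a in args]
        if op in COMMUTATIVE:
            args = sorted(args, key=repr)
        return (op, *args)
    return expr

# Two syntactically different forms of the same pattern:
p1 = ("And", ("Inherits", "$X", "cat"), ("Eats", "$X", "fish"))
p2 = ("And", ("Eats", "$X", "fish"), ("Inherits", "$X", "cat"))
assert reduce_pattern(p1) == reduce_pattern(p2)
```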
EXPANDED PATTERN MINER ALGORITHM
Incorporating these ideas, the unified/expanded pattern miner
algorithm would look like:
* Start with an ensemble of simple hypergraphs, constructed according
to template T (which is itself represented as a hypergraph)
* Iterate through the hypergraphs H in the ensemble, expanding each
hypergraph H into a set of larger hypergraphs, by applying an
expansion template T1 to some Atoms within H. T1 is a pattern with
variables, and some of the variables in T1 are bound to Atoms in H
whereas others are bound to other Atoms not in H.
* The non-H Atoms filling in the variable slots of T1 are chosen from
a certain probability distribution D, the nature of which is
pre-specified as part of the configuration of the algorithm. For
instance, D could draw from the Atomspace, meaning that only
expansions of H that already exist somewhere in the Atomspace would be
chosen. Or D could be proportional to the estimated fitness of the
expansion of H, according to some fitness estimation method (which
could be based on pattern-analysis of the growing pattern-ensemble
itself; more on this below). Crossover is another possible expansion
distribution.
(Note that the previous two steps could be implemented using SampleLink, i.e.
SampleLink
application of constraint T1 in the context of H
distribution D
)
* Once a particular expansion of H, call it H*, is formed, then its
quality is evaluated by a pre-specified quality measure. One possible
quality measure is the frequency or surprisingness of H* in the
Atomspace (this is the PM approach). Another possible quality measure
is the evaluation of H* according to some externally specified fitness
function (this is the MOSES case).
* Expansions leading to relatively poor-quality H* are pruned.
* Etc.
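The loop above can be sketched generically, with the expansion
distribution and the quality measure as pluggable parameters (toy
Python, hypothetical names; frozensets of labels stand in for
hypergraphs). Instantiating the two parameters differently yields
PM-like or MOSES-like behavior.

```python
import random

def expand_and_prune(ensemble, expand, quality, keep=10, rounds=3, rng=random):
    """Generic expanded-pattern-miner loop: expand each hypergraph via
    a pluggable expansion sampler (template T1 + distribution D),
    score the results with a pluggable quality measure, and prune."""
    for _ in range(rounds):
        candidates = []
        for h in ensemble:
            candidates.extend(expand(h, rng))   # expansions of H
        candidates.extend(ensemble)             # parents may survive too
        ensemble = sorted(candidates, key=quality, reverse=True)[:keep]
    return ensemble

# PM-flavored toy instantiation: expansion adds one atom from a fixed
# vocabulary; quality is frequency in a toy "Atomspace" of observations.
ATOMSPACE = [frozenset({"a", "b", "c"}), frozenset({"a", "b"}), frozenset({"b", "c"})]
VOCAB = ["a", "b", "c", "d"]

def expand(h, rng):
    return [h | {rng.choice(VOCAB)} for _ in range(3)]

def quality(h):
    return sum(1 for obs in ATOMSPACE if h <= obs)

best = expand_and_prune([frozenset({"b"})], expand, quality, keep=4)
```

Swapping `quality` for an external fitness function and `expand` for a
mined EDA distribution gives the MOSES-flavored case.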
This process encompasses both PM and MOSES and other possibilities.
To run the process one specifies a template pattern for the initial
population, a template pattern and distribution for pattern-expansion,
and a quality measure for expanded-pattern assessment.
In some uses of MOSES, different demes involve different “feature
sets.” This logic would go into the distribution for
pattern-expansion: each feature-set is a different pattern-expansion
distribution. The “feature selection inside MOSES” logic becomes a
scheme wherein the feature-sets leading to successful expansion
operations on H are tweaked, and the tweaked versions are then used
for a fraction of the new expansion operations on H.
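A toy version of that tweak move (hypothetical names; the real logic
would live inside the pattern-expansion distribution): perturb a
successful feature-set by swapping one feature for a fresh one.

```python
import random

def tweak_feature_set(features, vocab, rng=random):
    """Perturb a feature-set that led to successful expansions:
    drop one feature and add one drawn from the vocabulary. The
    tweaked set then drives some of the next expansion operations."""
    features = set(features)
    if features:
        features.remove(rng.choice(sorted(features)))
    candidates = [f for f in vocab if f not in features]
    if candidates:
        features.add(rng.choice(candidates))
    return frozenset(features)

tweaked = tweak_feature_set({"age", "weight"},
                            vocab=["age", "weight", "height", "dose"])
```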
Finally, an elegant aspect of this implementation of MOSES would be
that: the EDA distribution-mining aspect of MOSES would be implemented
using pattern mining on the evolving H-Graph (the program tree
ensemble). Pattern mining would be applied to the ensemble to find
patterns (expressed as Atoms) that occur surprisingly frequently in
highly “fit” (high quality) patterns in the H-Graph. These patterns
would then be used to constitute the pattern-expansion distribution of
the H-Graph.
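A rough sketch of that EDA-as-pattern-mining step (toy Python,
hypothetical names): mine atom pairs that co-occur in the high-quality
patterns of the H-Graph, and return them as a bias for future
expansions.

```python
from collections import Counter
from itertools import combinations

def mine_expansion_bias(hgraph, quality, top_k=2):
    """Pattern-mine the H-Graph itself: find atom pairs that co-occur
    often in the high-quality patterns, to bias future expansions."""
    ranked = sorted(hgraph, key=quality, reverse=True)
    fit = ranked[: max(1, len(ranked) // 2)]
    pair_counts = Counter()
    for pattern in fit:
        for pair in combinations(sorted(pattern), 2):
            pair_counts[pair] += 1
    return [pair for pair, _ in pair_counts.most_common(top_k)]

# Toy H-Graph: patterns are sets of atoms; quality is hand-assigned.
hgraph = [frozenset({"a", "b", "c"}), frozenset({"a", "b"}),
          frozenset({"c", "d"})]
scores = {hgraph[0]: 3.0, hgraph[1]: 2.5, hgraph[2]: 0.1}
bias = mine_expansion_bias(hgraph, scores.get)
```

In the full scheme, the mined patterns would themselves be Atoms, and
"surprisingly frequent" would replace the raw co-occurrence count here.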
Reduction could be implemented as an operation occurring on a
newly-expanded pattern H, immediately after H’s quality is evaluated
(unless the choice is made to prune H immediately). As in MOSES,
reduction would make pattern mining on the H-Graph work better. In a
PM application, reduction could either be done in this way or done on
the Atomspace as a whole, in advance.
EFFICIENCY OPTIMIZATION
The “plumbing” of this MOSES implementation would be slower than that
of the current MOSES, because the evaluation of each program-tree
(each pattern) would use the Atom formalism. However, this could be
optimized, as a later implementation step, if it appears necessary.
The way to optimize it would be to maintain a faster, non-Atom-based
version of the H-Graph, which is optimized for rapid evaluation of the
sub-hypergraphs in the H-Graph. When an expanded pattern is added to
the H-Graph, a corresponding expanded pattern must be added to the
optimized H-Graph. There would be some overhead in this scheme, but
if the fitness function involved (e.g.) repeated execution of patterns
against some dataset, the overhead would be worth it. The formalism
of linking executable subhypergraphs of the Atomspace to optimized
versions would be more broadly useful. (Possibly, the optimized
H-Graph structure could simply be built using the VertexTrees used in
the current MOSES implementation.)
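The mirrored-representation idea can be sketched as follows (toy
Python; the class, the tuple encoding, and the "compiler" are all
hypothetical stand-ins for Atoms and an optimized evaluator):

```python
class MirroredHGraph:
    """Each pattern lives both in a slow, Atom-style form (a nested
    tuple here) and a fast compiled form (a Python closure here),
    kept in sync whenever a pattern is added."""

    def __init__(self):
        self.atoms = []      # Atomspace-style representations
        self.compiled = []   # fast mirror, optimized for evaluation

    def _compile(self, expr):
        # Hypothetical compiler: ("plus", "x", 1) becomes a callable.
        if isinstance(expr, tuple):
            op, *args = expr
            fns = [self._compile(a) for a in args]
            ops = {"plus": lambda xs: sum(xs),
                   "times": lambda xs: xs[0] * xs[1]}
            return lambda env, fns=fns, f=ops[op]: f([g(env) for g in fns])
        if isinstance(expr, str):                 # variable reference
            return lambda env, name=expr: env[name]
        return lambda env, c=expr: c              # constant

    def add(self, expr):
        self.atoms.append(expr)                    # slow representation
        self.compiled.append(self._compile(expr))  # fast twin

hg = MirroredHGraph()
hg.add(("plus", ("times", "x", "x"), 1))
value = hg.compiled[-1]({"x": 3})   # evaluate via the fast mirror
```

The overhead is one compilation per added pattern; it pays off when
the fitness function evaluates each pattern many times over a dataset.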
Among many other applications: if one wished to use MOSES as a
fancier version of pattern mining, able to make non-greedy leaps from
small patterns to significantly larger patterns, this would be doable
very directly in this framework. (This application is actually what
originally triggered the above mode of thinking. I think this sort of
hybridization of MOSES and pattern mining will be needed to make
pattern mining really impactful on biology Atomspaces, for example.)
IMPLEMENTATION STRATEGY
The reason I mention this now is that we are shortly going to embark
on some improvements to the Pattern Miner implementation. One
possible way to do this would be to
* implement SampleLink
* modify the Pattern Miner to use SampleLink, and to use the abstract
workflow implied in the above “expanded Pattern Miner” description
* then, in parallel with using the refactored Pattern Miner for
pattern mining, build a new MOSES on top of the expanded Pattern
Miner infrastructure
* finally, do the MOSES efficiency optimization suggested above, if
some other efficiency optimization isn’t concocted
And then, as the Aussies say, Bob's your Uncle... ;)
--
Ben Goertzel, PhD
http://goertzel.org
“I tell my students, when you go to these meetings, see what direction
everyone is headed, so you can go in the opposite direction. Don’t
polish the brass on the bandwagon.” – V. S. Ramachandran