> I believe Moshe implemented some variant of Eric's Hayek system, some
> time ago... as I recall he got it to work, but it was slow and
> difficult to tune...
>
I was inspired by
Market-Based Reinforcement Learning in Partially Observable Worlds
Kwee, Hutter, and Schmidhuber
http://arxiv.org/PS_cache/cs/pdf/0105/0105025v1.pdf
which could not reproduce Baum & Durdanovic's results, but changed
some details. I tried to get all the details right, and managed a 2/5
success rate on the blocksworld problem (whereas Baum's own
implementation got 100% success, and Kwee et al.'s was 0%).
> What makes this sort of comparison hard is that, when you dig into the
> details, both MOSES/Plop and Hayek turn out to be moderately broad
> "families of algorithms", so that choosing only one alg from each
> family for comparisons is very hard and winds up being a research
> project in improving/fleshing-out each approach...
>
Right, and additionally also choosing which problems to test on would
be tricky.
One major difference between Hayek and MOSES approaches is that in
Hayek one explicitly learns a collection of rules, whereas in MOSES
one learns individual program trees. So there is a nontrivial language
gap to be overcome. Did you have particular test problems in mind,
Abram? One possibility would be to simply see if MOSES could learn to
solve the Hayek Blocksworld task: I wrote up some notes on how to
present the problem a long time ago at:
http://code.google.com/p/moses/wiki/BlocksWorldProblem
Plop does not have action/perception capabilities, yet, so you would
need to add these. Also, recent plop improvements that I have made
have been done to an internal Google fork of plop for a proprietary
project, so if you do end up undertaking this project, drop me a line
and I will backport these improvements to the open source version... I
plan on doing this at some point anyway, but it is not a priority
unless there is someone outside Google (like you) who wants to do
serious work on the codebase.
> Is there any
> more documentation on this than that available in syntax.lisp? (I have
> not yet examined the whole codebase.)
>
Semantics.lisp and some other files might have more comments, but
these are not up-to-date or complete overall, sorry! On a
non-implementation level, the paper that Ben & I wrote for AGI-09
(http://www.agi-09.org/papers/paper_69.pdf) has some relevant bits..
Cheers,
Moshe
Abram, I would think you know that a collection of programs is another
program ;-)
>> Plop does not have action/perception capabilities, yet, so you would
>> need to add these.
>
> How difficult do you think this would be?
>
Not especially difficult to do poorly, difficult to do well. The
problem is that representation-building and distance estimation
between programs would need to be worked out carefully and tuned in
order to obtain decent performance...
- Moshe
P.S. Regarding collections of programs vs. individual programs there
is of course no difference for a language where the individual
programs are turing complete but when they are not (as in Hayek) a
collection can be more expressive (i.e. correspond to a broader
computational class).