Addressing problem selection for GP

17 views
Skip to first unread message

Forrest Bennett

unread,
Apr 6, 2008, 9:14:51 PM4/6/08
to Field Guide to Genetic Programming
I had a local group of software professions read the tutorial that
overlaps with the book. People seemed to find the explanation of GP
itself very clear - there were literally no questions on that. 50% of
the questions centered around issues of problem selection.

Looking through your section 7.11 titled, "Where can we Expect GP to
Do Well?", I don't think that it quite answers the kinds of questions
I was getting. They wanted to know whether evolving a program to
compute the score for bowling would be a good first problem to get
their feet wet - the point being that it wasn't obvious to them
whether it would be or not. They asked me about a whole series of
problems, where the question was simply, could GP be used on something
like problem X?

For example, reading section 7.11 doesn't make it immediately obvious
that one difficulty in evolving a bowling scorer might be that there
are about 10**20 possible different games. You do hint at the issue
with your mention of approximate results. I talked a little about what
I could vaguely remember of (I think it was) Tina Yu's report of
evolving exact solutions to N-parity for arbitrary N and other
problems, without using exhaustive testing. I also explained how an
MDL component of fitness could help in this regard.

Of course you gave lots of examples, but they seemed to be more in
search of principles on which to base problem selection.

Forrest

Nic McPhee

unread,
Apr 7, 2008, 5:53:42 AM4/7/08
to Field Guide to Genetic Programming
For those who are wondering what "tutorial" Forrest is referring to,
the book started life as a chapter in a forthcoming handbook on
computational intelligence. (There's more on this in the Preface.) A
version of that chapter was also released as <a href="http://
www.essex.ac.uk/dces/research/publications/technicalreports/2007/ces475.pdf">a
technical report</a>, and that's what Forrest's group was using.
There were a lot of changes between that report and the final book,
including changing sections into chapters and moving a lot of material
around. In particular, the Section 7.11 that Forrest talks about
turned into Section 12.1 in the book, and along the way it changed
from being a simple bullet list to being an actual sequence of
paragraphs.

I'd obviously hope that the additional text might clarify some of
these issues, but I suspect the underlying issue still remains: When
is an evolutionary algorithm (in particular GP) the "right" (or at
least a promising) tool for a particular problem? The book provides a
zillion examples of applications of GP in Chapter 12, and some general
principles in 12.1, but that all still leaves the reader with a
substantial induction problem when faced with a new domain.

And this is clearly a common and important question; it came up from
the audience more than once at the panel discussion at EuroGP 2008 two
weeks ago in Naples. One answer from the audience was effectively
"Use GP when you don't know what else to try, and you don't mind
magic". There were people (including at least one of my co-authors)
who disagreed with the "method of last resort" tone, but I think
there's a kernel of truth in it. If there are well-behaved, well
understood deterministic solutions to your problem, then you probably
want to use them rather than using GP to evolve some mysterious
beast. On the other hand, I wouldn't recommend spending 6 months
trying to develop a "traditional" solution if a week with GP might
have turned up something entirely workable.

All of which leaves the question of which problems might GP be good
for, which I don't think has a simple answer. As experience
practitioners know, there's a certain amount of black magic and
experience that goes into getting an evolutionary algorithm (or a
neural net or most machine learning systems) to successfully address a
hard problem. Applying such tools is still more craft than science
(and changing that would be extremely valuable). In the bowling
score, for example, I suspect there are clever ways to pose the
problem and organize the large number of test cases that would make
the problem fairly tractable. Similarly, there are also ways of
framing the problem that are almost certainly guaranteed to fail.

I'm not sure how much this helps, but I think it's an important
question and look forward to continuing the conversation.

Nic
Reply all
Reply to author
Forward
0 new messages