What's new in the FAQ?

David Beasley

Sep 20, 2000, 3:00:00 AM

Last-Modified: September 20, 2000
Issue: 8.2

Hello, and welcome to the new edition of the comp.ai.genetic FAQ,
"The Hitch-Hiker's Guide to Evolutionary Computation".

As usual, here's a description of the differences in content between
this and the previous version. There are no major changes. Details of
minor changes are given below.

The last edition was posted on March 28 2000. There was no June edition.
The next edition is due out on December 20th, 2000. Every second
Monday, a reminder is posted to comp.ai.genetic telling people how
to pick up copies on the WWW, etc.

As always, if you have information which could usefully go into the
guide, send it to me at the address below. (NOTE: Do NOT send any
email to the cs.cf.ac.uk address: it will be ignored.) Please include
in the Subject line: HHGTEC. If you find any information is
inaccurate, please let me know (if possible, with the _accurate_
information). All contributions/help welcome.

If you have any questions which you'd like to be answered in the FAQ,
post them to the newsgroup first. If any sensible answers are
forthcoming, let me know, and I'll include the information in the FAQ.

The ASCII version of the FAQ comes in 6 parts, with the following broad
categories:

1. Intro to the FAQ and the newsgroup        24k
2. Descriptions of different types of EAs    56k
3. Practice and theory                       55k
4. References and information sources        81k
5. EA software packages                      93k
6. Other topics and glossary                 74k

CHANGES

Q2   : Updated "checkers" to "checkers (draughts)" for Chellapilla and
       Fogel's work.
Q4.1 : Noted that the FTP address for Larry Yaeger's Polyworld is now
       unknown (does anyone know where it has gone to?)
Q12  : Updated conference information.
Q15.3: Deleted reference to Purdue site, since it has now gone.
Q20.1: Revised entry for Gannet.
Q99  : Elitism - fixed typo.

AVAILABILITY

The HTML version is available from various places, including:

Germany: EUnet Deutschland GmbH:
http://surf.de.uu.net/encore/www/

Spain: The University of Granada:
http://krypton.ugr.es/~encore/www/

The University of Oviedo:
http://www.etsimo.uniovi.es/ftp/pub/EC/FAQ/www/

UK: The University of Birmingham
http://www.cs.bham.ac.uk/Mirrors/ftp.de.uu.net/EC/clife/www/

USA: The Santa Fe Institute:
http://alife.santafe.edu/~joke/encore/www/

Purdue University, West Lafayette, IN:
http://www.cs.purdue.edu/coast/archive/clife/FAQ/www/

Hong Kong: The Chinese University of Hong Kong:
http://www.cs.cuhk.hk/pub/EC/FAQ/www/top.htm

David Beasley, Bath, UK                         __o
david....@iee.org                               \<,
-comp.ai.genetic FAQ Editor- ___________________()/ ()___
--


Archive-name: ai-faq/genetic/part1
Last-Modified: 9/20/00
Issue: 8.2

Important note: Do NOT send email to the cs.cf.ac.uk address above: it will
be ignored. Corrections and other correspondence should be sent to
david....@iee.org

The

Hitch-Hiker's

Guide to

Evolutionary Computation

(FAQ for comp.ai.genetic)

edited by

Joerg Heitkoetter
UUnet Deutschland GmbH
Sebrathweg 20
D-44149 Dortmund, Germany
<jo...@de.uu.net>
or <jo...@santafe.edu>

and

David Beasley
ingenta ltd
BUCS Building,
University of Bath,
Bath, United Kingdom BA2 7AY
<david....@iee.org>

PLEASE:
Search this document first if you have a question
and
If someone posts a question to the newsgroup which is answered in here
DON'T POST THE ANSWER TO THE NEWSGROUP:
POINT THE ASKER TO THIS FAQ
and finally

DON'T PANIC!

Copyright (c) 1993-2000 by Joerg Heitkoetter and David Beasley, all
rights reserved.

This FAQ may be posted to any USENET newsgroup, on-line service, or
BBS as long as it is posted in its entirety and includes this
copyright statement. This FAQ may not be distributed for financial
gain. This FAQ may not be included in commercial collections or
compilations without express permission from the author.

FAQ /F-A-Q/ or /fak/ [USENET] n. 1. A Frequently Asked Question.
2. A compendium of accumulated lore, posted periodically to
high-volume newsgroups in an attempt to forestall such
questions. Some people prefer the term `FAQ list' or `FAQL'
/fa'kl/, reserving `FAQ' for sense 1.

RTFAQ
/R-T-F-A-Q/ [USENET: primarily written, by analogy with RTFM]
imp. Abbrev. for `Read the FAQ!', an exhortation that the person
addressed ought to read the newsgroup's FAQ list before posting
questions.

RTFM /R-T-F-M/ [UNIX] imp. Acronym for `Read The Fucking Manual'. 1.
Used by gurus to brush off questions they consider trivial or
annoying. Compare Don't do that, then! 2. Used when reporting
a problem to indicate that you aren't just asking out of
randomness. "No, I can't figure out how to interface UNIX to my
toaster, and yes, I have RTFM." Unlike sense 1, this use is
considered polite. ...

--- "The on-line hacker Jargon File, version 3.0, 29 July
1993"

PREFACE
This guide is intended to help, provide basic information, and serve
as a first straw for individuals, i.e. uninitiated hitch-hikers, who
are stranded in the mindboggling universe of Evolutionary Computation
(EC); that in turn is only a small footpath to an even more
mindboggling scientific universe, that, incorporating Fuzzy Systems,
and Artificial Neural Networks, is sometimes referred to as
Computational Intelligence (CI); that in turn is only part of an even
more advanced scientific universe of mindparalysing complexity, that
incorporating Artificial Life, Fractal Geometry, and other Complex
Systems Sciences might someday be referred to as Natural Computation
(NC).

Over the course of the past years, GLOBAL OPTIMIZATION algorithms
imitating certain principles of nature have proved their usefulness
in various domains of applications. Especially worth copying are
those principles where nature has found "stable islands" in a
"turbulent ocean" of solution possibilities. Such phenomena can be
found in annealing processes, central nervous systems and biological
EVOLUTION, which in turn have led to the following OPTIMIZATION
methods: Simulated Annealing (SA), Artificial Neural Networks (ANNs)
and the field of Evolutionary Computation (EC).

EC may currently be characterized by the following pathways: Genetic
Algorithms (GA), Evolutionary Programming (EP), Evolution Strategies
(ES), Classifier Systems (CFS), Genetic Programming (GP), and several
other problem solving strategies, that are based upon biological
observations, that date back to Charles Darwin's discoveries in the
19th century: the means of natural selection and the survival of the
fittest, and theories of evolution. The inspired algorithms are thus
termed Evolutionary Algorithms (EA).

Moreover, this guide is intended to help those who are just beginning
to read the comp.ai.genetic newsgroup, and those who are new "on"
USENET. It shall help to avoid lengthy discussions of questions that
usually arise for beginners of one or the other kind, and which are
boring to read again and again by comp.ai.genetic "old-timers."

You will see this guide popping up periodically in the Usenet
newsgroup comp.ai.genetic (and also comp.answers, and news.answers,
where it should be locatable at any time).

ORIGIN
This guide was produced by Joerg Heitkoetter (known as Joke) in early
1993, using material from many sources (see Acknowledgements), mixed
with his own brand of humour. Towards the end of 1993, Joerg handed
over editorial responsibility to David Beasley. He reorganised the
guide in various ways, and generally attempted to inject his own
brand of orderliness. Thus, any jokes are the work of Joke. The
mundane bits are David's responsibility.

The guide is kept up to date, as far as possible, and new versions
are issued several times a year. However, we do rely on useful
information being sent to us for inclusion in the guide (we don't
always have time to read comp.ai.genetic, for example).
Contributions, additions, corrections, cash, etc. are therefore
always welcome. Send e-mail to the address at the beginning of the
guide.

DISCLAIMER
This periodic posting is not meant to discuss any topic exhaustively,
but should be thought of as a list of reference pointers, instead.
This posting is provided on an "as is" basis, NO WARRANTY whatsoever
is expressed or implied, especially, NO WARRANTY that the information
contained herein is up-to-date, correct or useful in any way,
although all this is intended.

Moreover, please note that the opinions expressed herein do not
necessarily reflect those of the editors' institutions or employers,
neither as a whole, nor in part. They are just the amalgamation of
the editors' collections of ideas, and contributions gleaned from
other sources.

NOTE: some portions of this otherwise rather dry guide are intended
to be satirical. If you do not recognize it as such, consult your
local doctor or a professional comedian.

HOW TO USE THIS GUIDE
HITCH-HIKING THE FAQNIVERSE
This guide is big. Really big. You just won't believe how hugely,
vastly, mindbogglingly big it is. That's why it has been split into a
"trilogy" -- which, like all successful trilogies, eventually ends up
consisting of more than three parts.

Searching for answers
To find the answer to question number x, just search for the string
"Qx:". (So the answer to question 42 is at "Q42:"!)

What does [xxxx99] mean?
Some books are referenced again and again; that's why they have this
kind of "tag", which an experienced hitch-hiker will search for in the
list of books (see Q10 and Q12 and other places) to solve the
riddle. Here, they have a ":" appended, thus you can search for the
string "[ICGA85]:" for example.

Why all this UPPERCASING in running text?
Words written in all uppercase letters are cross-references to
entries in the Glossary (see Q99). Again, they have a ":" appended,
thus if you find, say EVOLUTION, you can search for the string
"EVOLUTION:" in the Glossary.

FTP and HTTP naming conventions
A file available on an FTP server will be specified as: <ftp-site-
name>/<the-complete-filename> So for example, the file bar.tar.gz in
the directory /pub/foo on the ftp server ftp.certain.site would be
specified as: ftp.certain.site/pub/foo/bar.tar.gz

A specification ending with a "/" is a reference to a whole
directory, e.g. ftp.certain.site/pub/foo/

HTTP files are specified in a similar way, but with the prefix:
http://

WHERE TO FIND THIS GUIDE
Between postings to comp.ai.genetic, this FAQ is available on the
World Wide Web. Get it from any ENCORE site (See Q15.3). The
following Encore sites can be accessed by HTTP. If you use the one
closest to you, you should get the best speed of service. (Note,
however, that some sites are not always up to date. The guide is
normally issued every 3 months.)

o The Chinese University of Hong Kong (China):
http://www.cs.cuhk.hk/pub/EC/FAQ/www/top.htm

o Ecole Polytechnique (France):
http://www.eark.polytechnique.fr/EC/FAQ/www/top.htm

o UUnet Deutschland GmbH (Germany):
http://surf.de.uu.net/encore/www/

o The University of Granada (Spain):
http://krypton.ugr.es/~encore/www/

o The University of Oviedo (Spain):
http://www.etsimo.uniovi.es/ftp/pub/EC/FAQ/www/

o The University of Birmingham (UK)
http://www.cs.bham.ac.uk/Mirrors/ftp.de.uu.net/EC/clife/www/

o The Santa Fe Institute (USA):
http://alife.santafe.edu/~joke/encore/www/

Other Encore sites can be accessed by FTP, and the FAQ can be found
in the file FAQ/www/top.htm or something similar. The FAQ is also
available in plain text format on Encore, and from
rtfm.mit.edu/pub/usenet/news.answers/ai-faq/genetic/ as the files:
part1 to part6. The FAQ may also be retrieved by e-mail from <mail-
ser...@rtfm.mit.edu>. Send a message to the mail-server with "help"
and "index" in the body on separate lines for more information.

A PostScript version is also available. This looks really crisp
(using boldface, italics, etc.), and is available for those who
prefer offline reading. Get it from Encore in file FAQ/hhgtec.ps.gz
(the plain text versions are in the same directory too).

"As a net is made up of a series of ties, so everything in this
world is connected by a series of ties. If anyone thinks that the
mesh of a net is an independent, isolated thing, he is mistaken. It
is called a net because it is made up of a series of interconnected
meshes, and each mesh has its place and responsibility in relation to
other meshes."

--- Buddha

Referencing this Guide
If you want to reference this guide it should look like:

Heitkoetter, Joerg and Beasley, David, eds. (2000) "The Hitch-
Hiker's Guide to Evolutionary Computation: A list of Frequently Asked
Questions (FAQ)", USENET: comp.ai.genetic. Available via anonymous
FTP from rtfm.mit.edu/pub/usenet/news.answers/ai-faq/genetic/ About
110 pages.

Or simply call it "the Guide", or "HHGTEC" for acronymaniacs.

The ZEN Puzzle
For some weird reason this guide contains some puzzles which can only
be solved by cautious readers who have (1) a certain amount of a
certain kind of humor, (2) a certain amount of patience and time, (3)
a certain amount of experience in ZEN NAVIGATION, and (4) a certain
amount of books of a certain author.

Usually, puzzles search either for certain answers (more often, ONE
answer) to a question; or, for the real smartasses, sometimes an
answer is presented, and a certain question is searched for. ZEN
puzzles are even more challenging: you have to come up with an answer
to a question, both of which are not explicitly, rather implicitly
stated somewhere in this FAQ. Thus, you are expected to give an
answer AND a question!

To give an impression of what this is all about, consider the following,
submitted by Craig W. Reynolds. The correct question is: "Why is
Fisher's `improbability quote' (cf EPILOGUE) included in this FAQ?",
Craig's correct answer is: `This is a GREAT quotation, it sounds like
something directly out of a turn of the century Douglas Adams:
Natural Selection: the original "Infinite Improbability Drive"' Got
the message? Well, this was easy and very obvious. The other puzzles
are more challenging...

However, all this is just for fun (mine and hopefully yours); there
is nothing like the $100 prize that some big shots in computer
science, e.g. Don Knuth, usually offer. All there is is an honorable
mention of the ZEN navigator, including the puzzle s/he solved.
It's thus like in real life: don't expect to make money from your
time being a scientist, it's all just for the fun of it...

Enjoy the trip!

TABLE OF CONTENTS
Part1

Q0: How about an introduction to comp.ai.genetic?

Part2

Q1: What are Evolutionary Algorithms (EAs)?
Q1.1: What's a Genetic Algorithm (GA)?
Q1.2: What's Evolutionary Programming (EP)?
Q1.3: What's an Evolution Strategy (ES)?
Q1.4: What's a Classifier System (CFS)?
Q1.5: What's Genetic Programming (GP)?

Part3

Q2: What applications of EAs are there?

Q3: Who is concerned with EAs?

Q4: How many EAs exist? Which?
Q4.1: What about Alife systems, like Tierra and VENUS?

Q5: What about all this Optimization stuff?

Part4

Q10: What introductory material on EAs is there?
Q10.1: Suitable background reading for beginners?
Q10.2: Textbooks on EC?
Q10.3: The Classics?
Q10.4: Introductory Journal Articles?
Q10.5: Introductory Technical Reports?
Q10.6: Not-quite-so-introductory Literature?
Q10.7: Biological Background Readings?
Q10.8: On-line bibliography collections?
Q10.9: Videos?
Q10.10: CD-ROMs?
Q10.11: How do I get a copy of a dissertation?

Q11: What EC related journals and magazines are there?

Q12: What are the important conferences/proceedings on EC?

Q13: What Evolutionary Computation Associations exist?

Q14: What Technical Reports are available?

Q15: What information is available over the net?
Q15.1: What digests are there?
Q15.2: What mailing lists are there?
Q15.3: What online information repositories are there?
Q15.4: What relevant newsgroups and FAQs are there?
Q15.5: What about all these Internet Services?

Part5

Q20: What EA software packages are available?
Q20.1: Free software packages?
Q20.2: Commercial software packages?
Q20.3: Current research projects?

Part6

Q21: What are Gray codes, and why are they used?

Q22: What test data is available?

Q42: What is Life all about?
Q42b: Is there a FAQ to this group?

Q98: Are there any patents on EAs?

Q99: A Glossary on EAs?

----------------------------------------------------------------------

Subject: Q0: How about an introduction to comp.ai.genetic?

Certainly. See below.

What is comp.ai.genetic all about?
The newsgroup comp.ai.genetic is intended as a forum for people who
want to use or explore the capabilities of Genetic Algorithms (GA),
Evolutionary Programming (EP), Evolution Strategies (ES), Classifier
Systems (CFS), Genetic Programming (GP), and some other, less well-
known problem solving algorithms that are more or less loosely
coupled to the field of Evolutionary Computation (EC).

How do I get started? What about USENET documentation?
The following guidelines present the essentials of the USENET online
documentation that is posted each month to news.announce.newusers.
If you are already familiar with "netiquette" you can skip to the end
of this answer; if you don't know what the hell this is all about,
proceed as follows: (1) carefully read the following paragraphs, (2)
read all the documents in news.announce.newusers before posting any
article to USENET. At least you should give the introductory stuff a
try, i.e. files "news-answers/introduction" and "news-answers/news-
newusers-intro". Both are survey articles, that provide a short and
easy way to get an overview of the interesting parts of the online
docs, and thus can help to prevent you from drowning in the megabytes
to read. Both can be received either by subscribing to news.answers,
or sending the following message to <mail-...@rtfm.mit.edu>:

send usenet/news.answers/introduction
send usenet/news.answers/news-newusers-intro
quit

Netiquette
"Usenet is a convention, in every sense of the word."

Although USENET is usually characterized as "an anarchy, with no laws
and no one in charge" there have "emerged" several rules over the
past years that shall facilitate life within newsgroups. Thus, you
will probably find the following types of articles:

1. Requests
Requests are articles of the form "I am looking for X" where X is
something public like a book, an article, a piece of software.

If multiple different answers can be expected, the person making the
request should prepare to make a summary of the answers he/she got
and announce this with a phrase like "Please e-mail, I'll
summarize" at the end of the posting.

The Subject line of the posting should then be something like
"Request: X"

2. Questions
As opposed to requests, questions are concerned with something so
specific that general interest cannot readily be assumed. If the
poster thinks that the topic is of some general interest, he/she
should announce a summary (see above).

The Subject line of the posting should be something like "Question:
this-and-that" (Q: this-and-that) or have the form of a question
(i.e., end with a question mark)

3. Answers
These are reactions to questions or requests. As a rule of thumb
articles of type "answer" should be rare. Ideally, in most cases
either the answer is too specific to be of general interest (and
should thus be e-mailed to the poster) or a summary was announced
with the question or request (and answers should thus be e-mailed to
the poster).

The subject lines of answers are automatically adjusted by the news
software.

4. Summaries
In all cases of requests or questions the answers for which can be
assumed to be of some general interest, the poster of the request or
question shall summarize the answers he/she received. Such a summary
should be announced in the original posting of the question or
request with a phrase like "Please answer by e-mail, I'll summarize"

In such a case answers should NOT be posted to the newsgroup but
instead be mailed to the poster who collects and reviews them. After
about 10 to 20 days from the original posting, its poster should make
the summary of answers and post it to the net.

Some care should be invested into a summary:

a) simple concatenation of all the answers might not be enough;
instead redundancies, irrelevances, verbosities and errors should
be filtered out (as well as possible),

b) the answers shall be separated clearly

c) the contributors of the individual answers shall be identifiable
(unless they requested to remain anonymous [eds note: yes, that
happens])

d) the summary shall start with the "quintessence" of the answers, as
seen by the original poster

e) A summary should, when posted, clearly be indicated to be one by
giving it a Subject line starting with "Summary:"

Note that a good summary is pure gold for the rest of the newsgroup
community, so summary work will be most appreciated by all of us.
(Good summaries are more valuable than any moderator!)

5. Announcements
Some articles never need any public reaction. These are called
announcements (for instance for a workshop, conference or the
availability of some technical report or software system).

Announcements should be clearly indicated to be such by giving them a
subject line of the form "Announcement: this-and-that", or just "A:
this-and-that".

Due to common practice, conference announcements usually carry a
"CFP:" in their subject line, i.e. "call for papers" (or: "call for
participation").

6. Reports
Sometimes people spontaneously want to report something to the
newsgroup. This might be special experiences with some software,
results of own experiments or conceptual work, or especially
interesting information from somewhere else.

Reports should be clearly indicated to be such by giving them a
subject line of the form "Report: this-and-that"

7. Discussions
An especially valuable possibility of USENET is of course that of
discussing a certain topic with hundreds of potential participants.
All traffic in the newsgroup that can not be subsumed under one of
the above categories should belong to a discussion.

If somebody explicitly wants to start a discussion, he/she can do so
by giving the posting a subject line of the form "Start discussion:
this-and-that" (People who react to this, please remove the "Start
discussion: " label from the subject line of your replies).

It is quite difficult to keep a discussion from drifting into chaos,
but, unfortunately, as many other newsgroups show there seems to be
no secure way to avoid this. On the other hand, comp.ai.genetic has
not had many problems with this effect, yet, so let's just go and
hope...

Thanks in advance for your patience!

The Internet
For information on internet services, see Q15.5.

------------------------------

Copyright (c) 1993-2000 by J. Heitkoetter and D. Beasley, all rights
reserved.

This FAQ may be posted to any USENET newsgroup, on-line service, or
BBS as long as it is posted in its entirety and includes this
copyright statement. This FAQ may not be distributed for financial
gain. This FAQ may not be included in commercial collections or
compilations without express permission from the author.

End of ai-faq/genetic/part1
***************************

--


Archive-name: ai-faq/genetic/part2
Last-Modified: 9/20/00
Issue: 8.2

Important note: Do NOT send email to the cs.cf.ac.uk address above: it will
be ignored. Corrections and other correspondence should be sent to
david....@iee.org

TABLE OF CONTENTS OF PART 2

Q1: What are Evolutionary Algorithms (EAs)?
Q1.1: What's a Genetic Algorithm (GA)?
Q1.2: What's Evolutionary Programming (EP)?
Q1.3: What's an Evolution Strategy (ES)?
Q1.4: What's a Classifier System (CFS)?
Q1.5: What's Genetic Programming (GP)?

----------------------------------------------------------------------

Subject: Q1: What are Evolutionary Algorithms (EAs)?

Evolutionary algorithm is an umbrella term used to describe computer-
based problem solving systems which use computational models of some
of the known mechanisms of EVOLUTION as key elements in their design
and implementation. A variety of EVOLUTIONARY ALGORITHMs have been
proposed. The major ones are: GENETIC ALGORITHMs (see Q1.1),
EVOLUTIONARY PROGRAMMING (see Q1.2), EVOLUTION STRATEGIEs (see Q1.3),
CLASSIFIER SYSTEMs (see Q1.4), and GENETIC PROGRAMMING (see Q1.5).
They all share a common conceptual base of simulating the evolution
of INDIVIDUAL structures via processes of SELECTION, MUTATION, and
REPRODUCTION. The processes depend on the perceived PERFORMANCE of
the individual structures as defined by an ENVIRONMENT.

More precisely, EAs maintain a POPULATION of structures, that evolve
according to rules of selection and other operators, that are
referred to as "search operators", (or GENETIC OPERATORs), such as
RECOMBINATION and mutation. Each individual in the population
receives a measure of its FITNESS in the environment. Reproduction
focuses attention on high fitness individuals, thus exploiting (cf.
EXPLOITATION) the available fitness information. Recombination and
mutation perturb those individuals, providing general heuristics for
EXPLORATION. Although simplistic from a biologist's viewpoint, these
algorithms are sufficiently complex to provide robust and powerful
adaptive search mechanisms.

--- "An Overview of Evolutionary Computation" [ECML93], 442-459.

BIOLOGICAL BASIS
To understand EAs, it is necessary to have some appreciation of the
biological processes on which they are based.

Firstly, we should note that EVOLUTION (in nature or anywhere else)
is not a purposive or directed process. That is, there is no
evidence to support the assertion that the goal of evolution is to
produce Mankind. Indeed, the processes of nature seem to boil down to
a haphazard GENERATION of biologically diverse organisms. Some of
evolution is determined by natural SELECTION or different INDIVIDUALs
competing for resources in the ENVIRONMENT. Some are better than
others. Those that are better are more likely to survive and
propagate their genetic material.

In nature, we see that the encoding for genetic information (GENOME)
is done in a way that admits asexual REPRODUCTION. Asexual
reproduction typically results in OFFSPRING that are genetically
identical to the PARENT. (Large numbers of organisms reproduce
asexually; this includes most bacteria which some biologists hold to
be the most successful SPECIES known.)

Sexual reproduction allows some shuffling of CHROMOSOMEs, producing
offspring that contain a combination of information from each parent.
At the molecular level what occurs (wild oversimplification alert!)
is that a pair of almost identical chromosomes bump into one another,
exchange chunks of genetic information and drift apart. This is the
RECOMBINATION operation, which is often referred to as CROSSOVER
because of the way that biologists have observed strands of
chromosomes crossing over during the exchange.

Recombination happens in an environment where the selection of who
gets to mate is largely a function of the FITNESS of the individual,
i.e. how good the individual is at competing in its environment. Some
"luck" (random effect) is usually involved too. Some EAs use a simple
function of the fitness measure to select individuals
(probabilistically) to undergo genetic operations such as crossover
or asexual reproduction (the propagation of genetic material
unaltered). This is fitness-proportionate selection. Other
implementations use a model in which certain randomly selected
individuals in a subgroup compete and the fittest is selected. This
is called tournament selection and is the form of selection we see in
nature when stags rut to vie for the privilege of mating with a herd
of hinds.
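
The two selection schemes just described can be sketched in a few lines of Python. This is only an illustration, not code from any particular EA package: the function names, the `k` parameter and the `rng` argument are assumptions made for the example, and fitness values are assumed to be non-negative with at least one of them positive.

```python
import random


def fitness_proportionate_select(population, fitnesses, rng=random):
    """Pick one individual with probability proportional to its fitness.

    Assumes non-negative fitness values, at least one of them positive.
    """
    total = sum(fitnesses)
    pick = rng.random() * total  # a point on the "roulette wheel"
    running = 0.0
    for individual, fitness in zip(population, fitnesses):
        running += fitness
        if pick < running:
            return individual
    return population[-1]  # guard against floating-point round-off


def tournament_select(population, fitnesses, k=2, rng=random):
    """Draw k distinct contestants at random and return the fittest one."""
    contestants = rng.sample(range(len(population)), k)
    winner = max(contestants, key=lambda i: fitnesses[i])
    return population[winner]
```

Note that tournament selection only compares fitness values, so it needs a ranking rather than a scale; fitness-proportionate selection depends on the actual magnitudes.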

Much EA research has assumed that the two processes that most
contribute to evolution are crossover and fitness based
selection/reproduction. Evolution, by definition, absolutely
requires diversity in order to work. In nature, an important source
of diversity is MUTATION. In an EA, a large amount of diversity is
usually introduced at the start of the algorithm, by randomising the
GENEs in the POPULATION. The importance of mutation, which
introduces further diversity while the algorithm is running,
therefore continues to be a matter of debate. Some refer to it as a
background operator, simply replacing some of the original diversity
which may have been lost, while others view it as playing the
dominant role in the evolutionary process.

It cannot be stressed too strongly that an EVOLUTIONARY ALGORITHM (as
a SIMULATION of a genetic process) is not a random search for a
solution to a problem (highly fit individual). EAs use stochastic
processes, but the result is distinctly non-random (better than
random).

PSEUDO CODE
Algorithm EA is

     // start with an initial time
     t := 0;

     // initialize a usually random population of individuals
     initpopulation P (t);

     // evaluate fitness of all initial individuals in population
     evaluate P (t);

     // test for termination criterion (time, fitness, etc.)
     while not done do

          // increase the time counter
          t := t + 1;

          // select sub-population for offspring production
          P' := selectparents P (t);

          // recombine the "genes" of selected parents
          recombine P' (t);

          // perturb the mated population stochastically
          mutate P' (t);

          // evaluate its new fitness
          evaluate P' (t);

          // select the survivors from actual fitness
          P := survive P,P' (t);
     od
end EA.
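
For readers who prefer something runnable, the loop above might be rendered in Python roughly as follows. This is a sketch under stated assumptions, not a reference implementation: `evolve` and `demo` are hypothetical names, the operators are passed in as plain functions, and the toy problem (finding the real number closest to 3.0) together with its operator choices (binary tournaments, averaging recombination, Gaussian mutation, keep-the-best survival) was picked only to make the example self-contained.

```python
import random


def evolve(init, fitness, select_parents, recombine, mutate, survive,
           generations=50):
    """Generic EA loop: evaluate, select, recombine, mutate, survive."""
    population = init()
    scores = [fitness(ind) for ind in population]
    for _ in range(generations):  # termination criterion: a time limit
        parents = select_parents(population, scores)
        offspring = mutate(recombine(parents))
        offspring_scores = [fitness(ind) for ind in offspring]
        population, scores = survive(population, scores,
                                     offspring, offspring_scores)
    return population, scores


def demo(seed=0, size=20, generations=50):
    """Toy run: individuals are real numbers, fitness peaks at x = 3."""
    rng = random.Random(seed)

    def init():
        return [rng.uniform(-10, 10) for _ in range(size)]

    def fitness(x):
        return -(x - 3.0) ** 2

    def select_parents(pop, scores):
        chosen = []
        for _ in range(len(pop)):  # one binary tournament per slot
            a, b = rng.sample(range(len(pop)), 2)
            chosen.append(pop[a] if scores[a] >= scores[b] else pop[b])
        return chosen

    def recombine(parents):
        # average each parent with a randomly chosen mate
        return [(p + rng.choice(parents)) / 2 for p in parents]

    def mutate(offspring):
        return [x + rng.gauss(0, 0.3) for x in offspring]

    def survive(pop, scores, off, off_scores):
        # keep the best individuals from parents and offspring together
        ranked = sorted(zip(scores + off_scores, pop + off), reverse=True)
        ranked = ranked[:len(pop)]
        return [ind for _, ind in ranked], [s for s, _ in ranked]

    population, scores = evolve(init, fitness, select_parents, recombine,
                                mutate, survive, generations)
    return population[0], scores[0]  # best individual and its fitness
```

Calling `demo()` returns the best individual found and its fitness. Swapping in different operator functions changes the flavour of EA without touching the loop itself.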

------------------------------

Subject: Q1.1: What's a Genetic Algorithm (GA)?

The GENETIC ALGORITHM is a model of machine learning which derives
its behavior from a metaphor of some of the mechanisms of EVOLUTION
in nature. This is done by the creation within a machine of a
POPULATION of INDIVIDUALs represented by CHROMOSOMEs, in essence a
set of character strings that are analogous to the base-4 chromosomes
that we see in our own DNA. The individuals in the population then
go through a process of simulated "evolution".

Genetic algorithms are used for a number of different application
areas. An example of this would be multidimensional OPTIMIZATION
problems in which the character string of the chromosome can be used
to encode the values for the different parameters being optimized.
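
As an illustration of such an encoding, the sketch below maps a flat bit string onto several real-valued parameters. The function name `decode`, the equal number of bits per parameter and the linear scaling into each range are all assumptions made for this example; it uses plain binary coding rather than the Gray codes discussed in Q21.

```python
def decode(bits, bounds, bits_per_param):
    """Map a flat list of 0/1 genes onto real-valued parameters.

    bounds is a list of (low, high) pairs, one per parameter; each
    parameter occupies bits_per_param consecutive genes.
    """
    if len(bits) != bits_per_param * len(bounds):
        raise ValueError("chromosome length does not match the encoding")
    max_int = 2 ** bits_per_param - 1
    params = []
    for i, (low, high) in enumerate(bounds):
        chunk = bits[i * bits_per_param:(i + 1) * bits_per_param]
        value = int("".join(str(b) for b in chunk), 2)  # plain binary
        params.append(low + (high - low) * value / max_int)
    return params
```

With 4 bits per parameter, for instance, an all-zero chunk decodes to the lower bound of its range and an all-one chunk to the upper bound.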

In practice, therefore, we can implement this genetic model of
computation by having arrays of bits or characters to represent the
chromosomes. Simple bit manipulation operations allow the
implementation of CROSSOVER, MUTATION and other operations. Although
a substantial amount of research has been performed on variable-
length strings and other structures, the majority of work with
genetic algorithms is focussed on fixed-length character strings. We
should focus on both this aspect of fixed-lengthness and the need to
encode the representation of the solution being sought as a character
string, since these are crucial aspects that distinguish genetic
algorithms from GENETIC PROGRAMMING, which does not have a fixed
length representation and typically has no encoding of the problem.

When the genetic algorithm is implemented it is usually done in a
manner that involves the following cycle: Evaluate the FITNESS of
all of the individuals in the population. Create a new population by
performing operations such as crossover, fitness-proportionate
REPRODUCTION and mutation on the individuals whose fitness has just
been measured. Discard the old population and iterate using the new
population.

One iteration of this loop is referred to as a GENERATION. There is
no theoretical reason for this as an implementation model. Indeed,
we do not see this punctuated behavior in populations in nature as a
whole, but it is a convenient implementation model.

The first generation (generation 0) of this process operates on a
population of randomly generated individuals. From there on, the
genetic operations, in concert with the fitness measure, operate to
improve the population.

PSEUDO CODE
Algorithm GA is

// start with an initial time
t := 0;

// initialize a usually random population of individuals
initpopulation P (t);

// evaluate fitness of all initial individuals of population
evaluate P (t);

// test for termination criterion (time, fitness, etc.)
while not done do
// increase the time counter
t := t + 1;

// select a sub-population for offspring production
P' := selectparents P (t);

// recombine the "genes" of selected parents
recombine P' (t);

// perturb the mated population stochastically
mutate P' (t);

// evaluate its new fitness
evaluate P' (t);

// select the survivors from actual fitness
P := survive P,P' (t);
od
end GA.
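
The pseudo code above can be turned into a small runnable sketch.
The version below is only one possible reading: the toy fitness
function (count of 1-bits), the "+ 1" added to the selection weights
(so that an all-zero population can still be sampled), and the
survival scheme (offspring plus the single best parent) are
assumptions for illustration, not part of the original algorithm
description.

```python
import random

def fitness(ind):
    """Toy fitness: number of 1-bits in the chromosome."""
    return sum(ind)

def run_ga(n_bits=20, pop_size=30, generations=40, p_mut=0.02, seed=1):
    rng = random.Random(seed)
    # generation 0: a random population
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best_history = [max(map(fitness, pop))]
    for _ in range(generations):
        # select parents with probability proportional to (fitness + 1)
        weights = [fitness(ind) + 1 for ind in pop]
        parents = rng.choices(pop, weights=weights, k=pop_size)
        # recombine pairs of parents by one-point crossover
        offspring = []
        for a, b in zip(parents[0::2], parents[1::2]):
            cut = rng.randint(1, n_bits - 1)
            offspring += [a[:cut] + b[cut:], b[:cut] + a[cut:]]
        # mutate: flip each bit with probability p_mut
        offspring = [[1 - g if rng.random() < p_mut else g for g in ind]
                     for ind in offspring]
        # survive: keep the best pop_size of offspring plus the best parent
        elite = max(pop, key=fitness)
        pop = sorted(offspring + [elite], key=fitness, reverse=True)[:pop_size]
        best_history.append(fitness(pop[0]))
    return pop[0], best_history

best, history = run_ga()
print(fitness(best), history[0], history[-1])
```

Because the best parent always competes for survival, the best
fitness in this sketch never decreases from one generation to the
next.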

------------------------------

Subject: Q1.2: What's Evolutionary Programming (EP)?

Introduction
EVOLUTIONARY PROGRAMMING, originally conceived by Lawrence J. Fogel
in 1960, is a stochastic OPTIMIZATION strategy similar to GENETIC
ALGORITHMs, but instead places emphasis on the behavioral linkage
between PARENTs and their OFFSPRING, rather than seeking to emulate
specific GENETIC OPERATORs as observed in nature. Evolutionary
programming is similar to EVOLUTION STRATEGIEs, although the two
approaches developed independently (see below).

Like both ES and GAs, EP is a useful method of optimization when
other techniques such as gradient descent or direct, analytical
discovery are not possible. Combinatoric and real-valued FUNCTION
OPTIMIZATION in which the optimization surface or FITNESS landscape
is "rugged", possessing many locally optimal solutions, are well
suited for evolutionary programming.

History
The 1966 book, "Artificial Intelligence Through Simulated Evolution"
by Fogel, Owens and Walsh is the landmark publication for EP
applications, although many other papers appear earlier in the
literature. In the book, finite state automata were evolved to
predict symbol strings generated from Markov processes and non-
stationary time series. Such evolutionary prediction was motivated
by a recognition that prediction is a keystone to intelligent
behavior (defined in terms of adaptive behavior, in that the
intelligent organism must anticipate events in order to adapt
behavior in light of a goal).

In 1992, the First Annual Conference on evolutionary programming was
held in La Jolla, CA. Further conferences have been held annually
(See Q12). The conferences attract a diverse group of academic,
commercial and military researchers engaged in both developing the
theory of the EP technique and in applying EP to a wide range of
optimization problems, both in engineering and biology.

Rather than list and analyze the sources in detail, several
fundamental sources are listed below which should serve as good
pointers to the entire body of work in the field.

The Process
For EP, like GAs, there is an underlying assumption that a fitness
landscape can be characterized in terms of variables, and that there
is an optimum solution (or multiple such optima) in terms of those
variables. For example, if one were trying to find the shortest path
in a Traveling Salesman Problem, each solution would be a path. The
length of the path could be expressed as a number, which would serve
as the solution's fitness. The fitness landscape for this problem
could be characterized as a hypersurface proportional to the path
lengths in a space of possible paths. The goal would be to find the
globally shortest path in that space, or more practically, to find
very short tours very quickly.
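
To make the Traveling Salesman example concrete, the fitness of a
candidate solution can be computed as the length of its tour. The
sketch below assumes cities given as 2-D coordinates and a closed
tour (returning to the start); both are assumptions for this example.
Note that here a smaller value means a fitter solution.

```python
import math

def tour_length(tour, cities):
    """Length of the closed tour visiting `cities` in the order given by `tour`."""
    return sum(
        math.dist(cities[tour[i]], cities[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )

cities = [(0, 0), (0, 1), (1, 1), (1, 0)]  # corners of a unit square
print(tour_length([0, 1, 2, 3], cities))   # tour around the perimeter
print(tour_length([0, 2, 1, 3], cities))   # tour that crosses itself
```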

The basic EP method involves 3 steps (Repeat until a threshold for
iteration is exceeded or an adequate solution is obtained):

(1) Choose an initial POPULATION of trial solutions at random. The
number of solutions in a population is highly relevant to the
speed of optimization, but no definite answers are available as
to how many solutions are appropriate (other than >1) and how
many solutions are just wasteful.

(2) Each solution is replicated into a new population. Each of
these offspring solutions is mutated according to a
distribution of MUTATION types, ranging from minor to extreme
with a continuum of mutation types between. The severity of
mutation is judged on the basis of the functional change imposed
on the parents.

(3) Each offspring solution is assessed by computing its fitness.
Typically, a stochastic tournament is held to determine N
solutions to be retained for the population of solutions,
although this is occasionally performed deterministically.
There is no requirement that the population size be held
constant, however, nor that only a single offspring be generated
from each parent.

It should be pointed out that EP typically does not use any CROSSOVER
as a genetic operator.

EP and GAs
There are two important ways in which EP differs from GAs.

First, there is no constraint on the representation. The typical GA
approach involves encoding the problem solutions as a string of
representative tokens, the GENOME. In EP, the representation follows
from the problem. A neural network can be represented in the same
manner as it is implemented, for example, because the mutation
operation does not demand a linear encoding. (In this case, for a
fixed topology, real-valued weights could be coded directly as their
real values and mutation operates by perturbing a weight vector with
a zero mean multivariate Gaussian perturbation. For variable
topologies, the architecture is also perturbed, often using Poisson
distributed additions and deletions.)

Second, the mutation operation simply changes aspects of the solution
according to a statistical distribution which weights minor
variations in the behavior of the offspring as highly probable and
substantial variations as increasingly unlikely. Further, the
severity of mutations is often reduced as the global optimum is
approached. There is a certain tautology here: if the global optimum
is not already known, how can the spread of the mutation operation be
damped as the solutions approach it? Several techniques have been
proposed and implemented which address this difficulty, the most
widely studied being the "Meta-Evolutionary" technique in which the
variance of the mutation distribution is subject to mutation by a
fixed variance mutation operator and evolves along with the solution.
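
A minimal sketch of this idea follows, under stated assumptions: each
individual carries one variance per variable, the variances are
perturbed first by a Gaussian with fixed spread (meta_sigma), and a
small floor (min_var) keeps them positive. The names, the order of
the two steps and the floor are illustrative choices, not taken from
a particular publication.

```python
import random

def meta_ep_mutate(x, variances, rng, meta_sigma=0.1, min_var=1e-6):
    """Return a mutated (solution, variances) pair."""
    # mutate the variances with a fixed-variance Gaussian operator
    new_vars = [max(min_var, v + rng.gauss(0, meta_sigma)) for v in variances]
    # mutate the solution using the freshly mutated variances
    new_x = [xi + rng.gauss(0, v ** 0.5) for xi, v in zip(x, new_vars)]
    return new_x, new_vars

rng = random.Random(5)
child, child_vars = meta_ep_mutate([1.0, 2.0, 3.0], [0.5, 0.5, 0.5], rng)
print(child, child_vars)
```

Offspring whose variances suit the local landscape tend to survive
selection, so the spread of mutation adapts without knowing the
optimum in advance.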

EP and ES
The first communication between the evolutionary programming and
EVOLUTION STRATEGY groups occurred in early 1992, just prior to the
first annual EP conference. Despite their independent development
over 30 years, they share many similarities. When implemented to
solve real-valued function optimization problems, both typically
operate on the real values themselves (rather than any coding of the
real values as is often done in GAs). Multivariate zero mean Gaussian
mutations are applied to each parent in a population and a SELECTION
mechanism is applied to determine which solutions to remove (i.e.,
"cull") from the population. The similarities extend to the use of
self-adaptive methods for determining the appropriate mutations to
use -- methods in which each parent carries not only a potential
solution to the problem at hand, but also information on how it will
distribute new trials (offspring). Most of the theoretical results on
convergence (both asymptotic and velocity) developed for ES or EP
also apply directly to the other.

The main differences between ES and EP are:

1. Selection: EP typically uses stochastic selection via a
tournament. Each trial solution in the population faces
competition against a preselected number of opponents and
receives a "win" if it is at least as good as its opponent in
each encounter. Selection then eliminates those solutions with
the least wins. In contrast, ES typically uses deterministic
selection in which the worst solutions are purged from the
population based directly on their function evaluation.

2. RECOMBINATION: EP is an abstraction of EVOLUTION at the level of
reproductive populations (i.e., SPECIES) and thus no
recombination mechanisms are typically used because
recombination does not occur between species (by definition: see
Mayr's biological species concept). In contrast, ES is an
abstraction of evolution at the level of INDIVIDUAL behavior.
When self-adaptive information is incorporated this is purely
genetic information (as opposed to phenotypic) and thus some
forms of recombination are reasonable and many forms of
recombination have been implemented within ES. Again, the
effectiveness of such operators depends on the problem at hand.

References
Some references which provide an excellent introduction (by no means
extensive) to the field, include:

Artificial Intelligence Through Simulated Evolution [Fogel66]
(primary)

Fogel DB (1995) "Evolutionary Computation: Toward a New Philosophy of
Machine Intelligence," IEEE Press, Piscataway, NJ.

Proceedings of the first [EP92], second [EP93] and third [EP94] Annual
Conference on Evolutionary Programming (primary) (See Q12).

PSEUDO CODE
Algorithm EP is

// start with an initial time
t := 0;

// initialize a usually random population of individuals
initpopulation P (t);

// evaluate fitness of all initial individuals of population
evaluate P (t);

// test for termination criterion (time, fitness, etc.)
while not done do

// perturb the whole population stochastically
P'(t) := mutate P (t);

// evaluate its new fitness
evaluate P' (t);

// stochastically select the survivors from actual fitness
P(t+1) := survive P(t),P'(t);

// increase the time counter
t := t + 1;

od
end EP.
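
For readers who prefer something runnable, here is a hedged Python
sketch of the loop above for real-valued minimization. The objective
(sphere), the fixed mutation spread sigma, the number of opponents q,
and the tie-break by raw score are all assumptions made for this
example; opponents are drawn from the combined population, so an
individual may occasionally meet itself.

```python
import random

def sphere(x):
    """Toy objective to minimize (an assumption, not from the FAQ)."""
    return sum(v * v for v in x)

def ep_minimize(f, dim=3, pop_size=20, q=5, sigma=0.3, generations=100, seed=2):
    rng = random.Random(seed)
    pop = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(pop_size)]
    history = [min(f(x) for x in pop)]
    for _ in range(generations):
        # every parent produces one offspring by Gaussian mutation only
        offspring = [[v + rng.gauss(0, sigma) for v in parent] for parent in pop]
        union = pop + offspring
        scores = [f(x) for x in union]
        # stochastic tournament: a "win" for each of q random opponents
        # that the solution is at least as good as
        wins = [sum(s <= o for o in rng.sample(scores, q)) for s in scores]
        # keep the pop_size solutions with most wins (ties broken by score)
        ranked = sorted(range(len(union)), key=lambda i: (-wins[i], scores[i]))
        pop = [union[i] for i in ranked[:pop_size]]
        history.append(min(f(x) for x in pop))
    return pop[0], history

best, history = ep_minimize(sphere)
print(sphere(best), history[0])
```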

[Eds note: An extended version of this introduction is available from
ENCORE (see Q15.3) in /FAQ/supplements/what-is-ep ]

------------------------------

Subject: Q1.3: What's an Evolution Strategy (ES)?

In 1963 two students at the Technical University of Berlin (TUB) met
and were soon to collaborate on experiments which used the wind
tunnel of the Institute of Flow Engineering. During the search for
the optimal shapes of bodies in a flow, which was then a matter of
laborious intuitive experimentation, the idea was conceived of
proceeding strategically. However, attempts with the coordinate and
simple gradient strategies (cf Q5) were unsuccessful. Then one of
the students, Ingo Rechenberg, now Professor of Bionics and
Evolutionary Engineering, hit upon the idea of trying random changes
in the parameters defining the shape, following the example of
natural MUTATIONs. The EVOLUTION STRATEGY was born. A third
student, Peter Bienert, joined them and started the construction of
an automatic experimenter, which would work according to the simple
rules of mutation and SELECTION. The second student, Hans-Paul
Schwefel, set about testing the efficiency of the new methods with
the help of a Zuse Z23 computer; for there were plenty of objections
to these "random strategies."

In spite of an occasional lack of financial support, the Evolutionary
Engineering Group which had been formed held firmly together. Ingo
Rechenberg received his doctorate in 1970 (Rechenberg 73). It
contains the theory of the two membered EVOLUTION strategy and a
first proposal for a multimembered strategy which in the nomenclature
introduced here is of the (m+1) type. In the same year financial
support from the Deutsche Forschungsgemeinschaft (DFG, Germany's
National Science Foundation) enabled the work that was concluded, at
least temporarily, in 1974 with the thesis "Evolutionsstrategie und
numerische Optimierung" (Schwefel 77).

Thus, EVOLUTION STRATEGIEs were invented to solve technical
OPTIMIZATION problems (TOPs) such as constructing an optimal
flashing nozzle, and until recently ES were only known to civil
engineering folks, as an alternative to standard solutions. Usually
no closed form analytical objective function is available for TOPs
and hence no applicable optimization method exists, other than the
engineer's intuition.

The first attempts to imitate principles of organic evolution on a
computer still resembled the iterative optimization methods known up
to that time (cf Q5): In a two-membered or (1+1) ES, one PARENT
generates one OFFSPRING per GENERATION by applying NORMALLY
DISTRIBUTED mutations, i.e. smaller steps are more likely than big
ones, until a child performs better than its ancestor and takes its
place. Because of this simple structure, theoretical results for
STEPSIZE control and CONVERGENCE VELOCITY could be derived. The ratio
of successful mutations to all mutations should come to 1/5: the
so-called 1/5 SUCCESS RULE was discovered. This first algorithm, using
mutation only, was then enhanced to a (m+1) strategy which
incorporated RECOMBINATION, since several (i.e. m) parents are
available. The mutation scheme and the exogenous stepsize control
were taken across unchanged from (1+1) ESs.
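
A (1+1) ES with the 1/5 success rule might be sketched as follows.
The objective function, the observation window of 20 mutations and
the adjustment factor 0.85 are assumptions chosen for illustration;
the text above only fixes the target success ratio of 1/5.

```python
import random

def sphere(x):
    """Toy objective to minimize (an assumption, not from the FAQ)."""
    return sum(v * v for v in x)

def one_plus_one_es(f, x, sigma=1.0, iterations=500, window=20,
                    factor=0.85, seed=3):
    rng = random.Random(seed)
    fx = f(x)
    successes = 0
    for t in range(1, iterations + 1):
        # one parent generates one offspring by normally distributed mutation
        child = [xi + rng.gauss(0, sigma) for xi in x]
        fc = f(child)
        if fc < fx:  # the child replaces the parent only if it is better
            x, fx = child, fc
            successes += 1
        if t % window == 0:
            rate = successes / window
            if rate > 0.2:
                sigma /= factor  # too many successes: take larger steps
            elif rate < 0.2:
                sigma *= factor  # too few successes: take smaller steps
            successes = 0
    return x, fx, sigma

start = [3.0, -2.0]
x, fx, sigma = one_plus_one_es(sphere, start)
print(fx, sigma)
```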

Schwefel later generalized these strategies to the multimembered ES
now denoted by (m+l) and (m,l) which imitates the following basic
principles of organic evolution: a POPULATION, leading to the
possibility of recombination with random mating, mutation and
selection. These strategies are termed PLUS STRATEGY and COMMA
STRATEGY, respectively: in the plus case, the parental generation is
taken into account during selection, while in the comma case only the
offspring undergoes selection, and the parents die off. m (usually a
lowercase mu) denotes the population size, and l (usually a lowercase
lambda) denotes the number of offspring generated per generation. Or
to put it in an utterly insignificant and hopelessly outdated
language:

(define (evolution-strategy population)
  (if (terminate? population)
      population
      (evolution-strategy
        (select
          (cond (plus-strategy?
                  (union (mutate
                           (recombine population))
                         population))
                (comma-strategy?
                  (mutate
                    (recombine population))))))))

However, dealing with ES is sometimes seen as "strong tobacco," for
it takes a decent amount of probability theory and applied STATISTICS
to understand the inner workings of an ES, while it navigates through
the hyperspace of the usually n-dimensional problem space, by
throwing hyperellipses into the deep...

Luckily, this medium doesn't allow for much mathematical ballyhoo;
the author therefore has to come up with a simple but brilliantly
intriguing explanation to save the reader from falling asleep during
the rest of this section, so here we go:

Imagine a black box. A large black box. As large as, say for example,
a Coca-Cola vending machine. Now, [..] (to be continued)

A single INDIVIDUAL of the ES' population consists of the following
GENOTYPE representing a point in the SEARCH SPACE:

OBJECT VARIABLES
Real-valued x_i have to be tuned by recombination and mutation
such that an objective function reaches its global optimum.
Referring to the metaphor mentioned previously, the x_i
represent the regulators of the alien Coca-Cola vending machine.

STRATEGY VARIABLEs
Real-valued s_i (usually denoted by a lowercase sigma) or mean
stepsizes determine the mutability of the x_i. They represent
the STANDARD DEVIATION of a (0, s_i) GAUSSIAN DISTRIBUTION (GD)
being added to each x_i as an undirected mutation. With an
"expectancy value" of 0 the parents will produce offspring
similar to themselves on average. In order to make a doubling
and a halving of a stepsize equally probable, the s_i mutate
log-normally distributed, i.e. exp(GD), from generation to
generation. These stepsizes hide the internal model the
population has made of its ENVIRONMENT, i.e. a SELF-ADAPTATION
of the stepsizes has replaced the exogenous control of the (1+1)
ES.

This concept works because selection sooner or later prefers
those individuals having built a good model of the objective
function, thus producing better offspring. Hence, learning takes
place on two levels: (1) at the genotypic, i.e. the object and
strategy variable level and (2) at the phenotypic level, i.e.
the FITNESS level.

Depending on an individual's x_i, the resulting objective
function value f(x), where x denotes the vector of object
variables, serves as the PHENOTYPE (fitness) in the selection
step. In a plus strategy, the m best of all (m+l) individuals
survive to become the parents of the next generation. Using the
comma variant, selection takes place only among the l offspring.
The second scheme is more realistic and therefore more
successful, because no individual may survive forever, which
could at least theoretically occur using the plus variant.
Untypical for conventional optimization algorithms and lavish at
first sight, a comma strategy allowing intermediate
deterioration performs better! Only by forgetting highly fit
individuals can a permanent adaptation of the stepsizes take
place and avoid long stagnation phases due to misadapted s_i's.
This means that these individuals have built an internal model
that is no longer appropriate for further progress, and thus
should better be discarded.

By choosing a certain ratio m/l, one can determine the
convergence property of the evolution strategy: If one wants a
fast, but local convergence, one should choose a small ratio (HARD
SELECTION), e.g. (5,100), but when looking for the global
optimum, one should favour a softer selection, e.g. (15,100).
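
The pieces described so far (object variables, self-adapted
stepsizes, plus and comma selection) can be combined in a short
sketch. Everything beyond what the text states is an assumption for
illustration: the objective (sphere), the learning rate tau, discrete
recombination of the x_i and intermediate recombination of the s_i.

```python
import math
import random

def sphere(x):
    """Toy objective to minimize (an assumption, not from the FAQ)."""
    return sum(v * v for v in x)

def evolution_strategy(f, dim=5, mu=15, lam=100, plus=False,
                       generations=60, seed=4):
    rng = random.Random(seed)
    tau = 1.0 / math.sqrt(dim)  # assumed learning rate for the stepsizes
    # an individual is a pair (object variables x_i, stepsizes s_i)
    pop = [([rng.uniform(-5, 5) for _ in range(dim)], [1.0] * dim)
           for _ in range(mu)]
    pop.sort(key=lambda ind: f(ind[0]))
    history = [f(pop[0][0])]
    for _gen in range(generations):
        offspring = []
        for _child in range(lam):
            (x1, s1), (x2, s2) = rng.choice(pop), rng.choice(pop)
            x = [rng.choice(pair) for pair in zip(x1, x2)]  # discrete
            s = [(a + b) / 2 for a, b in zip(s1, s2)]       # intermediate
            # stepsizes mutate log-normally, then drive the x_i mutation
            s = [si * math.exp(tau * rng.gauss(0, 1)) for si in s]
            x = [xi + si * rng.gauss(0, 1) for xi, si in zip(x, s)]
            offspring.append((x, s))
        # plus: parents compete with offspring; comma: offspring only
        pool = offspring + pop if plus else offspring
        pop = sorted(pool, key=lambda ind: f(ind[0]))[:mu]
        history.append(f(pop[0][0]))
    return pop[0], history

comma_best, comma_history = evolution_strategy(sphere, plus=False)
plus_best, plus_history = evolution_strategy(sphere, plus=True)
print(comma_history[-1], plus_history[-1])
```

In the plus variant the best value can never get worse between
generations, whereas the comma variant may temporarily deteriorate,
which is the behaviour discussed above.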

Self-adaptation within ESs depends on the following agents
(Schwefel 87):

Randomness:
One cannot model mutation as a purely random process. This would
mean that a child is completely independent of its parents.

Population size:
The population has to be sufficiently large. Not only the
current best should be allowed to reproduce, but a set of
good individuals. Biologists have coined the term "requisite
variety" to mean the genetic variety necessary to prevent a
SPECIES from becoming poorer and poorer genetically and
eventually dying out.

COOPERATION:
In order to exploit the effects of a population (m > 1), the
individuals should recombine their knowledge with that of others
(cooperate) because one cannot expect the knowledge to
accumulate in the best individual only.

Deterioration:
In order to allow better internal models (stepsizes)
to provide better progress in the future, one should accept
deterioration from one generation to the next. A limited life-
span in nature is not a sign of failure, but an important means
of preventing a species from freezing genetically.

ESs prove to be successful when compared to other iterative
methods on a large number of test problems (Schwefel 77). They
are adaptable to nearly all sorts of problems in optimization,
because they need very little information about the problem,
especially no derivatives of the objective function. For a list
of some 300 applications of EAs, see the SyS-2/92 report (cf
Q14). ESs are capable of solving high dimensional, multimodal,
nonlinear problems subject to linear and/or nonlinear
constraints. The objective function can also be, e.g., the
result of a SIMULATION; it does not have to be given in a closed
form. This also holds for the constraints which may represent
the outcome of, e.g. a finite elements method (FEM). ESs have
been adapted to VECTOR OPTIMIZATION problems (Kursawe 92), and
they can also serve as a heuristic for NP-complete combinatorial
problems like the TRAVELLING SALESMAN PROBLEM or problems with a
noisy or changing response surface.

References

Kursawe, F. (1992) "Evolution strategies for vector
optimization", Taipei, National Chiao Tung University, 187-193.

Kursawe, F. (1994) "Evolution strategies: Simple models of
natural processes?", Revue Internationale de Systemique, France
(to appear).

Rechenberg, I. (1973) "Evolutionsstrategie: Optimierung
technischer Systeme nach Prinzipien der biologischen Evolution",
Stuttgart: Frommann-Holzboog.

Schwefel, H.-P. (1977) "Numerische Optimierung von
Computermodellen mittels der Evolutionsstrategie", Basel:
Birkhaeuser.

Schwefel, H.-P. (1987) "Collective Phaenomena in Evolutionary
Systems", 31st Annu. Meet. Inter'l Soc. for General System
Research, Budapest, 1025-1033.

------------------------------

Subject: Q1.4: What's a Classifier System (CFS)?

The name of the Game
First, a word on naming conventions is due, for no other paradigm of
EC has undergone more changes to its name space than this one.
Initially, Holland called his cognitive models "Classifier Systems"
abbreviated as CS, and sometimes CFS, as can be found in [GOLD89].

Then Riolo came into play in 1986, and Holland added a reinforcement
component to the overall design of a CFS that emphasized its ability
to learn. So, the word "learning" was prepended to the name, to make:
"Learning Classifier Systems" (abbreviated as LCS). On October 6-9, 1992
the "1st Inter'l Workshop on Learning Classifier Systems" took place
at the NASA Johnson Space Center, Houston, TX. A summary of this
"summit" of all leading researchers in LCS can be found on ENCORE
(See Q15.3) in file CFS/papers/lcs92.ps.gz

Today, the story continues, LCSs are sometimes subsumed under a "new"
machine learning paradigm called "Evolutionary Reinforcement
Learning" or ERL for short, incorporating LCSs, "Q-Learning", devised
by Watkins (1989), and a paradigm of the same name, devised by Ackley
and Littman [ALIFEIII].

And then, this latter statement is really somewhat confusing, as
Marco Dorigo has pointed out in a letter to editors of this guide,
since Q-Learning has no evolutionary component. So please let the
Senior Editor explain: When I wrote this part of the guide, I just
had in mind that Q-Learning would make for a pretty good replacement
of Holland's bucket-brigade algorithm, so I used this little
speculation to see what comes out of it; in early December 95, almost
two years later, it has finally caught Marco's attention. But
meanwhile, I have been proven right: Wilson has suggested using
Q-Learning in CLASSIFIER SYSTEMs (Wilson 1994) and Dorigo & Bersini
(1994) have shown that Q-Learning and the bucket-brigade are truly
equivalent concepts.

We would therefore be allowed to call a CFS that uses Q-Learning for
rule discovery, rather than a bucket-brigade, a Q-CFS, Q-LCS, or
Q-CS; in any case the result would be subsumed under the term ERL,
even if Q-Learning itself is not an EVOLUTIONARY ALGORITHM!

Interestingly, Wilson called his system ZCS (apparently no "Q"
inside), while Dorigo & Bersini called their system a D-Max-VSCS, or
"discounted max very simple classifier system" (and if you know Q-
Learning at least the D-Max part of the name will remind you of that
algorithm). The latter paper can be found on Encore (see Q15.3) in
file CFS/papers/sab94.ps.gz

And by the way in [HOLLAND95] the term "classifier system" is not
used anymore. You cannot find it in the index. It's gone! Holland
now stresses the adaptive component of his invention, and simply
calls the resulting systems adaptive agents. These agents are then
implemented within the framework of his recent invention called ECHO.

(See http://www.santafe.edu/projects/echo for more.)

On Schema Processors and ANIMATS
So, to get back to the question above, "What are CFSs?", we first
might answer, "Well, there are many interpretations of Holland's
ideas...what do you like to know in particular?" And then we'd start
with a recitation from [HOLLAND75], [HOLLAND92], and explain all the
SCHEMA processors, the broadcast language, etc. But, we will take a
more comprehensive, and intuitive way to understand what CLASSIFIER
SYSTEMs are all about.

The easiest road to explore the very nature of classifier systems, is
to take the animat (ANIMAl + ROBOT = ANIMAT) "lane" devised by Booker
(1982) and later studied extensively by Wilson (1985), who also
coined the term for this approach. Work continues on animats but is
often regarded as ARTIFICIAL LIFE rather than EVOLUTIONARY
COMPUTATION. This thread of research has even its own conference
series: "Simulation of Adaptive Behavior (SAB)" (cf Q12), including
other notions from machine learning, connectionist learning,
evolutionary robotics, etc. [NB: the latter is obvious, if an animat
lives in a digital microcosm, it can be put into the real world by
implantation into an autonomous robot vehicle, that has
sensors/detectors (camera eyes, whiskers, etc.) and effectors
(wheels, robot arms, etc.); so all that's needed is to use our
algorithm as the "brain" of this vehicle, connecting the hardware
parts with the software learning component. For a fascinating intro
to the field see, e.g. Braitenberg (1984)]

Classifier systems, however, are yet another offspring of John
Holland's aforementioned book, and can be seen as one of the early
applications of GAs, for CFSs use this EVOLUTIONARY ALGORITHM to
adapt their behavior toward a changing ENVIRONMENT, as is explained
below in greater detail.

Holland envisioned a cognitive system capable of classifying the
goings on in its environment, and then reacting to these goings on
appropriately. So what is needed to build such a system? Obviously,
we need (1) an environment; (2) receptors that tell our system about
the goings on; (3) effectors, that let our system manipulate its
environment; and (4) the system itself, conveniently a "black box" in
this first approach, that has (2) and (3) attached to it, and "lives"
in (1).

In the animat approach, (1) usually is an artificially created
digital world, e.g. in Booker's Gofer system, a 2 dimensional grid
that contains "food" and "poison". And the Gofer itself, that walks
across this grid and tries (a) to learn to distinguish between these
two items, and (b) survive well fed.

Much the same for Wilson's animat, called "*". Yes, it's just an
asterisk, and a "Kafka-esque naming policy" is one of the sign posts
of the whole field; e.g. the first implementation by Holland and
Reitmann 1978 was called CS-1, (cognitive system 1); Smith's Poker
player LS-1 (1980) followed this "convention". Riolo's 1988 LCS
implementations on top of his CFS-C library (cf Q20), were dubbed
FSW-1 (Finite State World 1), and LETSEQ-1 (LETter SEQuence predictor
1).

So from the latter paragraph we can conclude that "environment" can
also mean something completely different than in the animat approach
(e.g. an infinite stream of letters, time series, etc.); but anyway,
we'll stick to it, and go on.

Imagine a very simple animat, e.g. a simplified model of a frog.
Now, we know that frogs live in (a) Muppet Shows, or (b) little
ponds; so we choose the latter as our demo environment (1); and the
former for a non-Kafka-esque name of our model, so let's dub it
"Kermit".

Kermit has eyes, i.e. sensorial input detectors (2); hands and legs,
i.e. environment-manipulating effectors (3); is a spicy-fly-
detecting-and-eating device, i.e. a frog (4); so we got all the 4
pieces needed.

Inside the Black Box
The most primitive "black box" we can think of is a computer. It has
inputs (2), and outputs (3), and a message passing system in between,
that converts (i.e., computes), certain input messages into output
messages, according to a set of rules, usually called the "program"
of that computer. From the theory of computer science, we now borrow
the simplest of all program structures, that is something called
"production system" or PS for short. A PS has been shown to be
computationally complete by Post (1943), that's why it is sometimes
called a "Post System", and later by Minsky (1967). Although it
merely consists of a set of if-then rules, it still resembles a full-
fledged computer.

We now term a single "if-then" rule a "classifier", and choose a
representation that makes it easy to manipulate these, for example by
encoding them into binary strings. We then term the set of
classifiers, a "classifier population", and immediately know how to
breed new rules for our system: just use a GA to generate new
rules/classifiers from the current POPULATION!

All that's left are the messages floating through the black box.
They should also be simple strings of zeroes and ones, and are to be
kept in a data structure, we call "the message list".

With all this given, we can imagine the goings on inside the black
box as follows: The input interface (2) generates messages, i.e., 0/1
strings, that are written on the message list. Then these messages
are matched against the condition-part of all classifiers, to find
out which actions are to be triggered. The message list is then
emptied, and the encoded actions, themselves just messages, are
posted to the message list. Then, the output interface (3) checks
the message list for messages concerning the effectors. And the cycle
restarts.

Note, that it is possible in this set-up to have "internal messages",
because the message list is not emptied after (3) has checked; thus,
the input interface messages are added to the initially empty list.
(cf Algorithm CFS, LCS below)

The general idea of the CFS is to start from scratch, i.e., from
tabula rasa (without any knowledge) using a randomly generated
classifier population, and let the system learn its program by
induction (cf Holland et al. 1986). This reduces the input stream to
recurrent input patterns, that must be repeated over and over again,
to enable the animat to classify its current situation/context and
react to the goings on appropriately.

What does it need to be a frog?
Let's take a look at the behavior emitted by Kermit. It lives in its
digital microwilderness where it moves around randomly. [NB: This
seemingly "random" behavior is not that random at all; for more on
instinctive, i.e., innate behavior of non-artificial animals see,
e.g. Tinbergen (1951)]

Whenever a small flying object appears, that has no stripes, Kermit
should eat it, because it's very likely a spicy fly, or other flying
insect. If it has stripes, the insect is better left alone, because
Kermit had better not bother with wasps, hornets, or bees. If Kermit
encounters a large, looming object, it immediately uses its effectors
to jump away, as far as possible.

So, part of these behavior patterns within the "pond world", in AI
sometimes called a "frame," from traditional knowledge representation
techniques, Rich (1983), can be expressed in a set of "if <condition>
then <action>" rules, as follows:

IF small, flying object to the left THEN send @
IF small, flying object to the right THEN send %
IF small, flying object centered THEN send $
IF large, looming object THEN send !
IF no large, looming object THEN send *
IF * and @ THEN move head 15 degrees left
IF * and % THEN move head 15 degrees right
IF * and $ THEN move in direction head pointing
IF ! THEN move rapidly away from direction head pointing

Now, this set of rules has to be encoded for use within a CLASSIFIER
SYSTEM. A possible encoding of the above rule set in CFS-C (Riolo)
classifier terminology is given below. The condition part consists
of two conditions that are combined with a logical AND, so both must
be met to trigger the associated action. This structure needs a NOT
operation (so we get NAND, and know from hardware design that we
can build any computer solely with NANDs); in CFS-C this is denoted
by the `~' prefix character (rule #5).

   IF               THEN
 0000,  00 00      00 00
 0000,  00 01      00 01
 0000,  00 10      00 10
 1111,  01 ##      11 11
~1111,  01 ##      10 00
 1000,  00 00      01 00
 1000,  00 01      01 01
 1000,  00 10      01 10
 1111,  ## ##      01 11

Obviously, string `0000' denotes small, and `00' in the first part of
the second column, denotes flying. The last two bits of column #2
encode the direction of the object approaching, where `00' means
left, `01' means right, etc.

In rule #4 the "don't care symbol" `#' is used, that matches `1'
and `0', i.e., the position of the large, looming object, is
completely arbitrary. A simple fact, that can save Kermit's
(artificial) life.
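
The matching of conditions against messages can be sketched in a few
lines of Python. This is not CFS-C code; the tuple layout
(condition 1, condition 2, action), the 4-bit messages and the
reading of a `~' condition as "no message on the list matches" are
assumptions made for this illustration.

```python
def pattern_matches(pattern, message):
    """True if every position is equal or the pattern has '#' (don't care)."""
    return len(pattern) == len(message) and all(
        p in ("#", m) for p, m in zip(pattern, message)
    )

def condition_satisfied(condition, message_list):
    """A plain condition needs some matching message; '~' needs none."""
    if condition.startswith("~"):
        return not any(pattern_matches(condition[1:], m) for m in message_list)
    return any(pattern_matches(condition, m) for m in message_list)

def fires(classifier, message_list):
    """Both conditions are ANDed; the action itself is not used here."""
    cond1, cond2, _action = classifier
    return (condition_satisfied(cond1, message_list)
            and condition_satisfied(cond2, message_list))

rule4 = ("1111", "01##", "1111")   # rule #4 from the table above
rule5 = ("~1111", "01##", "1000")  # rule #5, with a negated condition

messages = ["1111", "0110"]
print(fires(rule4, messages), fires(rule5, messages))
```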

PSEUDO CODE (Non-Learning CFS)
Algorithm CFS is

     // start with an initial time
     t := 0;

     // an initially empty message list
     initMessageList ML (t);

     // and a randomly generated population of classifiers
     initClassifierPopulation P (t);

     // test for cycle termination criterion (time, fitness, etc.)
     while not done do

          // increase the time counter
          t := t + 1;

          // 1. detectors check whether input messages are present
          ML := readDetectors (t);

          // 2. compare ML to the classifiers and save matches
          ML' := matchClassifiers ML,P (t);

          // 3. process new messages through output interface
          ML := sendEffectors ML' (t);
     od
end CFS.
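
The cycle above can be sketched in a few lines of Python using the
symbolic frog rules rather than bit strings. This is only a toy under
stated assumptions: the `@' and `%' detector rules (object to the
left or right) are assumed, since only their use appears in the rules
above, and all function and key names are hypothetical.

```python
# Detector rules: (test on the sensed situation, message to post).
DETECTOR_RULES = [
    (lambda s: s.get("small_flying") == "left", "@"),    # assumed rule
    (lambda s: s.get("small_flying") == "right", "%"),   # assumed rule
    (lambda s: s.get("small_flying") == "center", "$"),
    (lambda s: s.get("large_looming", False), "!"),
    (lambda s: not s.get("large_looming", False), "*"),
]

# Effector rules: (messages that must all be present, action).
EFFECTOR_RULES = [
    ({"*", "@"}, "move head 15 degrees left"),
    ({"*", "%"}, "move head 15 degrees right"),
    ({"*", "$"}, "move in direction head pointing"),
    ({"!"}, "move rapidly away from direction head pointing"),
]


def read_detectors(senses):
    """Step 1: turn the sensed situation into a message list."""
    return {msg for test, msg in DETECTOR_RULES if test(senses)}


def match_classifiers(messages):
    """Step 2: collect the actions of all rules whose conditions hold."""
    return [action for needed, action in EFFECTOR_RULES if needed <= messages]


def run_cfs(sense_stream):
    """Run one detect/match/act cycle per time step."""
    actions = []
    for senses in sense_stream:
        messages = read_detectors(senses)
        actions.append(match_classifiers(messages))  # step 3: output
    return actions


if __name__ == "__main__":
    print(run_cfs([{"small_flying": "left"},
                   {"small_flying": "center", "large_looming": True}]))
```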

To convert the previous, non-learning CFS into a learning CLASSIFIER
SYSTEM, LCS, as has been proposed in Holland (1986), it takes two
steps:

(1) the major cycle has to be changed such that the activation of
each classifier depends on some additional parameter, that can
be modified as a result of experience, i.e. reinforcement from
the ENVIRONMENT;

(2) and/or change the contents of the classifier list, i.e.,
generate new classifiers/rules, by removing, adding, or
combining condition/action-parts of existing classifiers.

The algorithm thus changes accordingly:

PSEUDO CODE (Learning CFS)
Algorithm LCS is

     // start with an initial time
     t := 0;

     // an initially empty message list
     initMessageList ML (t);

     // and a randomly generated population of classifiers
     initClassifierPopulation P (t);

     // test for cycle termination criterion (time, fitness, etc.)
     while not done do

          // increase the time counter
          t := t + 1;

          // 1. detectors check whether input messages are present
          ML := readDetectors (t);

          // 2. compare ML to the classifiers and save matches
          ML' := matchClassifiers ML,P (t);

          // 3. highest bidding classifier(s) collected in ML' win the
          // "race" and post their message(s)
          ML' := selectMatchingClassifiers ML',P (t);

          // 4. tax bidding classifiers, reduce their strength
          ML' := taxPostingClassifiers ML',P (t);

          // 5. effectors check new message list for output msgs
          ML := sendEffectors ML' (t);

          // 6. receive payoff from environment (REINFORCEMENT)
          C := receivePayoff (t);

          // 7. distribute payoff/credit to classifiers (e.g. BBA)
          P' := distributeCredit C,P (t);

          // 8. Eventually (depending on t), an EA (usually a GA) is
          // applied to the classifier population
          if criterion then
               P := generateNewRules P' (t);
          else
               P := P'
          fi
     od
end LCS.
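
Steps 2 to 7 of this cycle can be sketched as follows. This is a
deliberately simplified illustration, not Holland's bucket brigade:
only the single highest bidder acts, it pays its own bid as tax, and
it alone receives the payoff. The bid formula (bid ratio times
strength times specificity), the default strength and all names are
assumptions made for the example; step 8 (rule discovery by a GA) is
left out.

```python
from dataclasses import dataclass


@dataclass
class Classifier:
    condition: str          # ternary string over '0', '1', '#'
    action: str
    strength: float = 10.0  # hypothetical initial strength

    def specificity(self) -> float:
        """Fraction of positions that are not don't-care symbols."""
        return sum(ch != "#" for ch in self.condition) / len(self.condition)


def matches(condition: str, message: str) -> bool:
    return len(condition) == len(message) and all(
        c == "#" or c == m for c, m in zip(condition, message))


def lcs_step(population, message, payoff_fn, bid_ratio=0.1):
    """One simplified cycle: match, bid, tax the winner, pay it off."""
    matched = [c for c in population if matches(c.condition, message)]
    if not matched:
        return None

    def bid(c):
        return bid_ratio * c.strength * c.specificity()

    winner = max(matched, key=bid)                 # step 3: highest bidder
    winner.strength -= bid(winner)                 # step 4: tax
    winner.strength += payoff_fn(winner.action)    # steps 6-7: payoff
    return winner.action                           # step 5: output


if __name__ == "__main__":
    pop = [Classifier("1#", "A"), Classifier("11", "B")]
    print(lcs_step(pop, "11", lambda a: 5.0 if a == "B" else 0.0), pop)
```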

What's the problem with CFSs?
Just to list the currently known problems that come with CFSs, would
take some additional pages; therefore only some interesting papers
dealing with unresolved riddles are listed; probably the best paper
containing most of these is the aforementioned summary of the LCS
Workshop:

Smith, R.E. (1992) "A report on the first Inter'l Workshop on LCSs"
avail. from ENCORE (See Q15.3) in file CFS/papers/lcs92.ps.gz

Other noteworthy critiques on LCSs include:

Wilson, S.W. (1987) "Classifier Systems and the Animat Problem"
Machine Learning, 2.

Wilson, S.W. (1988) "Bid Competition and Specificity Reconsidered"
Complex Systems, 2(5):705-723.

Wilson, S.W. & Goldberg, D.E. (1989) "A critical review of classifier
systems" [ICGA89], 244-255.

Goldberg, D.E., Horn, J. & Deb, K. (1992) "What makes a problem hard
for a classifier system?" (containing the Goldberg citation below)
is also available from Encore (See Q15.3) in file
CFS/papers/lcs92-2.ps.gz

Dorigo, M. (1993) "Genetic and Non-genetic Operators in ALECSYS"
Evolutionary Computation, 1(2):151-164. The technical report on which
the journal article is based is avail. from Encore (See Q15.3) in file
CFS/papers/icsi92.ps.gz

Smith, R.E., Forrest, S. & Perelson, A.S. (1993) "Searching for
Diverse, Cooperative POPULATIONs with Genetic Algorithms"
Evolutionary Computation, 1(2):127-149.

Conclusions?
Generally speaking: "There's much to do in CFS research!"

No other notion of EC provides more space to explore, and if you are
interested in a PhD in the field, you might want to take a closer
look at CFS. However, be warned! To quote Goldberg: "classifier
systems are a quagmire---a glorious, wondrous, and inventive
quagmire, but a quagmire nonetheless."

References

Booker, L.B. (1982) "Intelligent behavior as an adaptation to the task
environment" PhD Dissertation, Univ. of Michigan, Logic of Computers
Group, Ann Arbor, MI.

Braitenberg, V. (1984) "Vehicles: Experiments in Synthetic
Psychology" Cambridge, MA: MIT Press.

Dorigo M. & H. Bersini (1994). "A Comparison of Q-Learning and
Classifier Systems." Proceedings of From Animals to Animats, Third
International Conference on SIMULATION of Adaptive Behavior (SAB94),
Brighton, UK, D.Cliff, P.Husbands, J.-A.Meyer and S.W.Wilson (Eds.),
MIT Press, 248-255.
http://iridia.ulb.ac.be/dorigo/dorigo/conferences/IC.11-SAB94.ps.gz

Holland, J.H. (1986) "Escaping Brittleness: The possibilities of
general-purpose learning algorithms applied to parallel rule-based
systems". In: R.S. Michalski, J.G. Carbonell & T.M. Mitchell (eds),
Machine Learning: An Artificial Intelligence approach, Vol II,
593-623, Los Altos, CA: Morgan Kaufmann.

Holland, J.H., et al. (1986) "Induction: Processes of Inference,
Learning, and Discovery", Cambridge, MA: MIT Press.

Holland, J.H. (1992) "Adaptation in natural and artificial systems"
Cambridge, MA: MIT Press.

Holland, J.H. (1995) "Hidden Order: How adaptation builds complexity"
Reading, MA: Addison-Wesley. [HOLLAND95]:

Holland, J.H. & Reitman, J.S. (1978) "Cognitive Systems based on
Adaptive Algorithms" In D.A. Waterman & F.Hayes-Roth, (eds) Pattern-
directed inference systems. NY: Academic Press.

Minsky, M.L. (1961) "Steps toward Artificial Intelligence"
Proceedings IRE, 49, 8-30. Reprinted in E.A. Feigenbaum & J. Feldman
(eds) Computers and Thought, 406-450, NY: McGraw-Hill, 1963.

Minsky, M.L. (1967) "Computation: Finite and Infinite Machines"
Englewood Cliffs, NJ: Prentice-Hall.

Post, Emil L. (1943) "Formal reductions of the general combinatorial
decision problem" American Journal of Mathematics, 65, 197-215.

Rich, E. (1983) "Artificial Intelligence" NY: McGraw-Hill.

Tinbergen, N. (1951) "The Study of Instinct" NY: Oxford Univ. Press.

Watkins, C. (1989) "Learning from Delayed Rewards" PhD Dissertation,
Department of Psychology, Cambridge Univ., UK.

Wilson, S.W. (1985) "Knowledge growth in an artificial animal" in
[ICGA85], 16-23.

Wilson, S.W. (1994) "ZCS: a zeroth level classifier system" in EC
2(1), 1-18.

------------------------------

Subject: Q1.5: What's Genetic Programming (GP)?

GENETIC PROGRAMMING is the extension of the genetic model of learning
into the space of programs. That is, the objects that constitute the
POPULATION are not fixed-length character strings that encode
possible solutions to the problem at hand, they are programs that,
when executed, "are" the candidate solutions to the problem. These
programs are expressed in genetic programming as parse trees, rather
than as lines of code. Thus, for example, the simple program "a + b
* c" would be represented as:

            +
           / \
          a   *
             / \
            b   c

or, to be precise, as suitable data structures linked together to
achieve this effect. Because this is a very simple thing to do in the
programming language Lisp, many GPers tend to use Lisp. However, this
is simply an implementation detail. There are straightforward methods
to implement GP using a non-Lisp programming environment.
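
As an illustration that no Lisp is needed, here is a minimal Python
sketch of the parse tree above. The nested-tuple representation and
the names `FUNCTIONS` and `evaluate` are choices made for this
example, not part of any particular GP package.

```python
import operator

# A tiny function set; a real one is chosen to suit the problem domain.
FUNCTIONS = {"+": operator.add, "*": operator.mul}


def evaluate(tree, env):
    """Evaluate a parse tree given variable bindings in env.

    Internal nodes are tuples (function, left, right); leaves are
    either variable names (str) or numeric constants.
    """
    if isinstance(tree, tuple):
        op, left, right = tree
        return FUNCTIONS[op](evaluate(left, env), evaluate(right, env))
    if isinstance(tree, str):
        return env[tree]
    return tree


# The program "a + b * c" from the text, as a parse tree.
program = ("+", "a", ("*", "b", "c"))

if __name__ == "__main__":
    print(evaluate(program, {"a": 1, "b": 2, "c": 3}))
```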

The programs in the population are composed of elements from the
FUNCTION SET and the TERMINAL SET, which are typically fixed sets of
symbols selected to be appropriate to the solution of problems in the
domain of interest.

In GP the CROSSOVER operation is implemented by taking randomly
selected subtrees in the INDIVIDUALs (selected according to FITNESS)
and exchanging them.
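
A minimal sketch of such a subtree exchange, assuming programs are
stored as nested tuples of the form (function, child, child, ...)
with plain strings or numbers as terminals. Parent selection by
fitness is left out, and every name below is hypothetical.

```python
import random


def paths(tree, prefix=()):
    """Yield the index path to every node in the tree (root is ())."""
    yield prefix
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from paths(child, prefix + (i,))


def get_subtree(tree, path):
    """Follow an index path down from the root."""
    for i in path:
        tree = tree[i]
    return tree


def replace_subtree(tree, path, new):
    """Return a copy of tree with the node at path replaced by new."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (replace_subtree(tree[i], path[1:], new),) + tree[i + 1:]


def crossover(parent1, parent2, rng=random):
    """Swap one randomly chosen subtree between two parents."""
    path1 = rng.choice(list(paths(parent1)))
    path2 = rng.choice(list(paths(parent2)))
    sub1 = get_subtree(parent1, path1)
    sub2 = get_subtree(parent2, path2)
    return (replace_subtree(parent1, path1, sub2),
            replace_subtree(parent2, path2, sub1))


if __name__ == "__main__":
    p1 = ("+", "a", ("*", "b", "c"))
    p2 = ("-", "x", "y")
    print(crossover(p1, p2, random.Random(1)))
```

Because whole subtrees are swapped, the offspring are always
syntactically valid trees, although their size and depth may differ
from those of their parents.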

It should be pointed out that GP usually does not use any MUTATION as
a GENETIC OPERATOR.

More information is available in the GP mailing list FAQ. (See
Q15.2) and from http://smi-web.stanford.edu/people/koza/

------------------------------

Copyright (c) 1993-2000 by J. Heitkoetter and D. Beasley, all rights
reserved.

This FAQ may be posted to any USENET newsgroup, on-line service, or
BBS as long as it is posted in its entirety and includes this
copyright statement. This FAQ may not be distributed for financial
gain. This FAQ may not be included in commercial collections or
compilations without express permission from the author.

End of ai-faq/genetic/part2
***************************

Archive-name: ai-faq/genetic/part3
Last-Modified: 9/20/00
Issue: 8.2

Important note: Do NOT send email to the cs.cf.ac.uk address above: it will
be ignored. Corrections and other correspondence should be sent to
david....@iee.org

TABLE OF CONTENTS OF PART 3

Q2: What applications of EAs are there?

Q3: Who is concerned with EAs?

Q4: How many EAs exist? Which?
Q4.1: What about Alife systems, like Tierra and VENUS?

Q5: What about all this Optimization stuff?

----------------------------------------------------------------------

Subject: Q2: What applications of EAs are there?

In principle, EAs can compute any computable function, i.e.
everything a normal digital computer can do.

But EAs are especially badly suited for problems where efficient ways
of solving them are already known, (unless these problems are
intended to serve as benchmarks). Special purpose algorithms, i.e.
algorithms that have a certain amount of problem domain knowledge
hard coded into them, will usually outperform EAs, so there is no
black magic in EC. EAs should be used when there is no other known
problem solving strategy, and the problem domain is NP-complete.
That's where EAs come into play: heuristically finding solutions
where all else fails.

Following is an incomplete (sic!) list of successful EA
applications:

BIOCOMPUTING
Biocomputing, or Bioinformatics, is the field of biology dedicated to
the automatic analysis of experimental data (mostly sequencing data).
Several approaches to specific biocomputing problems have been
described that involve the use of GA, GP and simulated annealing.
General information about biocomputing (software, databases, misc.)
can be found on the server of the European Bioinformatics Institute:
http://www.ebi.ac.uk/ebi_home.html
ENCORE has a good selection of pointers related to this subject. VSCN
provides a detailed online course on bioinformatics:
http://www.techfak.uni-bielefeld.de/bcd/Curric/welcome.html

There are three main domains to which GAs have been applied in
Bioinformatics: protein folding, RNA folding, sequence alignment.

Protein Folding

Proteins are one of the essential components of any form of life.
They are made of twenty different types of amino acid. These amino
acids are chained together to form the protein, which can contain
from a few to several thousand residues. In most cases, the
properties and the function of a protein are a result of its
three-dimensional structure. It seems that in many cases this
structure is a direct consequence of the sequence. Unfortunately, it
is still very difficult, if not impossible, to deduce the
three-dimensional structure knowing only the sequence. A part of the
VSCN on-line bioinformatics course is dedicated to the use of GAs in
Protein Folding Prediction. It contains an extensive bibliography and
a detailed presentation of the subject with LOTS of explanations and
on-line papers. The URL is:
http://www.techfak.uni-bielefeld.de/bcd/Curric/ProtEn/contents.html

Koza [KOZA92] gives one example of GP applied to Protein Folding.
Davis [DAVIS91] gives an example of DNA conformation prediction (a
closely related problem) in his Handbook of GAs.

RNA Folding

Describing the tertiary structure of an RNA molecule is about as
hard as for a protein, but describing the intermediate structure
(secondary structure) is somewhat easier because RNA molecules use
the same pairing rules as DNA (Watson and Crick base pairing).
There exist deterministic algorithms that, given a set of constraints
(rules), compute the most stable structure, but: (a) their time and
memory requirements increase quadratically or more with the length of
the sequences, and (b) they require simplified rules. Lots of effort
has recently been put into applying GAs to this problem, and several
papers can be found (on-line if your institute subscribes to these
journals):

A Genetic Algorithm Based Molecular Modelling Technique For RNA
Stem-loop Structures. H. Ogata, Y. Akiyama and M. Kanehisa, Nucleic
Acids Research, 1995, vol 23, 3, 419-426

An Annealing Mutation Operator in the GA for RNA Folding. B.A. Shapiro
and J.C. Wu, CABIOS, 1996, vol 12, 3, 171-180

The Computer Simulation of RNA Folding Pathway Using a Genetic
Algorithm. A.P. Gultyaev, F.D.H. van Batenburg and C.W.A. Pleij,
Journal of Molecular Biology, 1995, vol 250, 37-51

Simulated Annealing has also been applied successfully to this
problem:

Description of RNA Folding by SA. M. Schmitz and G. Steger, Journal
of Molecular Biology, 1995, 255, 245-266

Sequence Alignments

Sequence Alignment is another important problem of Bioinformatics.
The aim is to align together several related sequences (from two to
hundreds) given a cost function. For the most widely used cost
functions, the problem has been shown to be NP-complete. Several
attempts have been made using SA:

Multiple Sequence Alignment Using SA. J. Kim, Sakti Pramanik and M.J.
Chung, CABIOS, 1994, vol 10, 4, 419-426

Multiple Sequence Alignment by Parallel SA. M. Isshikawa, T. Koya et
al., CABIOS, 1993, vol 9, 3, 267-273

SAM, software which uses Hidden Markov Models for Multiple Sequence
Alignment, can use SA to train the model. Several papers have been
published on SAM. The software, documentation and an extensive
bibliography can be found in:
http://www.cse.ucsc.edu/research/compbio/sam.html

More recently, various software using different methods like Gibbs
sampling or GAs has been developed:

A Gibbs Sampling Strategy for Multiple Alignment. C.E. Lawrence, S.F.
Altschul et al., Science, October 1993, vol 262, 208-214

SAGA: Sequence Alignment by Genetic Algorithm. C. Notredame and D.G.
Higgins, Nucleic Acids Research, 1995, vol 24, 8, 1515-1524

A beta release of SAGA (along with the paper) is available on the
European Bioinformatics Institute anonymous FTP server:
ftp.ebi.ac.uk/pub/software/unix/saga.tar.Z
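
To make the notion of a cost function for an alignment concrete, here
is a sketch of a simple sum-of-pairs score, one common way of rating
a multiple alignment. It is not claimed to be the function used by
any of the programs cited above, and the match, mismatch and gap
values are arbitrary illustrative choices. A GA or SA would search
for the gap placement that optimizes such a score.

```python
from itertools import combinations


def sum_of_pairs(alignment, match=1, mismatch=-1, gap=-2):
    """Score an alignment given as equal-length strings with '-' gaps."""
    if len({len(row) for row in alignment}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    score = 0
    # Compare every pair of sequences, column by column.
    for seq_a, seq_b in combinations(alignment, 2):
        for x, y in zip(seq_a, seq_b):
            if x == "-" and y == "-":
                continue            # gap against gap is ignored
            if x == "-" or y == "-":
                score += gap
            elif x == y:
                score += match
            else:
                score += mismatch
    return score


if __name__ == "__main__":
    print(sum_of_pairs(["AC-T", "ACGT", "A-GT"]))
```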

CELLULAR PROGRAMMING: Evolution of Parallel Cellular Machines
Nature abounds in systems involving the actions of simple, locally-
interacting components, that give rise to coordinated global
behavior. These collective systems have evolved by means of natural
SELECTION to exhibit striking problem-solving capacities, while
functioning within a complex, dynamic ENVIRONMENT. Employing simple
yet versatile parallel cellular models, coupled with EVOLUTIONARY
COMPUTATION techniques, cellular programming is an approach for
constructing man-made systems that exhibit characteristics such as
those manifest by their natural counterparts.

Parallel cellular machines hold potential both scientifically, as
vehicles for studying phenomena of interest in areas such as complex
adaptive systems and ARTIFICIAL LIFE, as well as practically,
enabling the construction of novel systems, endowed with
evolutionary, reproductive, regenerative, and learning capabilities.

Web site: http://lslwww.epfl.ch/~moshes/cp.html

References:

Sipper, M. (1997) "Evolution of Parallel Cellular Machines: The
Cellular Programming Approach", Springer-Verlag, Heidelberg.

Sipper, M. (1996) "Co-evolving Non-Uniform Cellular Automata to
Perform Computations", Physica D, 92, 193-208.

Sipper, M. and Ruppin, E. (1997) "Co-evolving architectures for
cellular machines", Physica D, 99, 428-441.

Sipper, M. and Tomassini, M. (1996) "Generating Parallel Random
Number Generators By Cellular Programming", International Journal of
Modern Physics C, 7(2), 181-190.

Sipper, M. (1997) "Evolving Uniform and Non-uniform Cellular Automata
Networks", in Annual Reviews of Computational Physics, D. Stauffer
(ed)

Evolvable Hardware
The idea of evolving machines, whose origins can be traced to the
cybernetics movement of the 1940s and the 1950s, has recently
resurged in the form of the nascent field of bio-inspired systems and
evolvable hardware. The field draws on ideas from the EVOLUTIONARY
COMPUTATION domain as well as on novel hardware innovations.
Recently, the term evolware has been used to describe such evolving
ware, with current implementations centering on hardware, while
raising the possibility of using other forms in the future, such as
bioware. The inaugural workshop, Towards Evolvable Hardware, took
place in Lausanne, in October 1995, followed by the First
International Conference on Evolvable Systems: From Biology to
Hardware (ICES96) held in Japan, in October 1996. Another major event
in the field, ICES98, was held in Lausanne, Switzerland, in September
1998.

References:

Sipper, M. et al (1997) "A Phylogenetic, Ontogenetic, and Epigenetic
View of Bio-Inspired Hardware Systems", IEEE Transactions on
Evolutionary Computation, 1(1).

Sanchez, E. and Tomassini, M. (eds) (1996) "Towards Evolvable
Hardware", Springer-Verlag, Lecture Notes in Computer Science, 1062.

Higuchi, T. et al (1997) "Proceedings of First International
Conference on Evolvable Systems: From Biology to Hardware (ICES96)",
Springer-Verlag, Lecture Notes in Computer Science.

GAME PLAYING
GAs can be used to evolve behaviors for playing games. Work in
evolutionary GAME THEORY typically surrounds the EVOLUTION of a
POPULATION of players who meet randomly to play a game in which they
each must adopt one of a limited number of moves (Maynard-Smith
1982). Let's suppose it is just two moves, X and Y. The players
receive a reward, analogous to Darwinian FITNESS, depending on which
combination of moves occurs and which move they adopted. In more
complicated models there may be several players and several moves.

The players iterate such a game a series of times, and then move on
to a new partner. At the end of all such moves, the players will have
a cumulative payoff, their fitness. This fitness can then be used to
generate a new population.

The real key in using a GA is to come up with an encoding to
represent players' strategies, one that is amenable to CROSSOVER and
to MUTATION. One possibility is to suppose that at each iteration a
player adopts X with some probability (and Y with one minus that
probability). A player can thus be represented as a real number, or a
bit-string suitably interpreted as a probability.
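
A small sketch of how such probabilistic players could accumulate
their cumulative payoff over an iterated game. The payoff table is a
made-up Prisoner's-Dilemma-like example (X read as cooperate, Y as
defect), and the function name and parameters are assumptions made
for illustration only.

```python
import random

# Hypothetical payoffs: PAYOFF[(my_move, opponent_move)] -> my reward.
PAYOFF = {("X", "X"): 3, ("X", "Y"): 0, ("Y", "X"): 5, ("Y", "Y"): 1}


def play_rounds(p_a, p_b, rounds, rng):
    """Let two players meet for a number of rounds.

    Each player is just a probability of choosing move X.
    Returns the cumulative payoffs (fitness) of both players.
    """
    total_a = total_b = 0
    for _ in range(rounds):
        move_a = "X" if rng.random() < p_a else "Y"
        move_b = "X" if rng.random() < p_b else "Y"
        total_a += PAYOFF[(move_a, move_b)]
        total_b += PAYOFF[(move_b, move_a)]
    return total_a, total_b


if __name__ == "__main__":
    print(play_rounds(0.8, 0.3, rounds=20, rng=random.Random(42)))
```

The totals returned would then serve as fitness values when a new
population of probabilities is generated.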

An alternative characterisation is to model the players as Finite
State Machines, or Finite Automata (FA). These can be thought of as a
simple flow chart governing behaviour in the "next" play of the game
depending upon previous plays. For example:

     100 Play X
     110 If opponent plays X go to 100
     120 Play Y
     130 If opponent plays X go to 100 else go to 120

represents a strategy that does whatever its opponent did last, and
begins by playing X, known as "Tit-For-Tat" (Axelrod 1982). Such
machines can readily be encoded as bit-strings. Consider the encoding
"1 0 1 0 0 1" to represent TFT. The first three bits, "1 0 1", are
state 0. The first bit, "1", is interpreted as "Play X." The second
bit, "0", is interpreted as "if opponent plays X go to state 0," the
third bit, "1", is interpreted as "if the opponent plays Y, go to
state 1." State 1, "0 0 1", has a similar interpretation: play Y, go
to state 0 if the opponent plays X, and stay in state 1 otherwise.
Crossing over such bit-strings always yields valid strategies.
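
The decoding of such a bit-string can be sketched as follows. The
sketch assumes two-state machines with three bits per state (move,
next state if the opponent plays X, next state if the opponent plays
Y), with the second and third bits read directly as the index of the
next state; the function names are invented for this example.

```python
def decode(bits):
    """Split a bit list into states of (move, next_if_X, next_if_Y)."""
    states = []
    for i in range(0, len(bits), 3):
        move_bit, next_x, next_y = bits[i:i + 3]
        states.append(("X" if move_bit == 1 else "Y", next_x, next_y))
    return states


def play(bits, opponent_moves):
    """Return the machine's moves against a fixed sequence of opponent moves."""
    states = decode(bits)
    state = 0                       # the machine starts in state 0
    history = []
    for opp in opponent_moves:
        move, next_x, next_y = states[state]
        history.append(move)
        state = next_x if opp == "X" else next_y
    return history


TIT_FOR_TAT = [1, 0, 1, 0, 0, 1]

if __name__ == "__main__":
    # Opens with X, then repeats the opponent's previous move.
    print(play(TIT_FOR_TAT, "XYYXY"))
```

Since every bit pattern of the right length decodes to some machine,
any crossover point gives a playable offspring, which is the property
noted above.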

SIMULATIONs of these machines playing the Prisoner's Dilemma have
been undertaken (Axelrod 1987, Fogel 1993, Miller 1989).

Alternative representations of game players include CLASSIFIER
SYSTEMs (Marimon, McGrattan and Sargent 1990, [GOLD89]), and neural
networks (Fogel and Harrald 1994), though not necessarily with a GA
(Fogel 1993 and Fogel and Harrald 1994 use an Evolutionary
Program). Chellapilla and Fogel (1999) have evolved a neural network
which can play checkers (draughts) at near expert level.

Other methods of evolving a population can be found in Lindgren 1991,
Glance and Huberman 1993 and elsewhere.

A GA for playing the game "Mastermind" has been produced. See
http://kal-el.ugr.es/mastermind

References.

Axelrod, R. (1987) ``The Evolution of Strategies in the Repeated
Prisoner's Dilemma,'' in [DAVIS91]

Axelrod, R (?) ``The Complexity of Cooperation'' (See the web site,
which includes code to implement tournaments:
http://pscs.physics.lsa.umich.edu/Software/ComplexCoop.html )

Chellapilla, K. and Fogel, D.B. (1999) ``Evolution, neural networks,
games, and intelligence'' , Proc. IEEE, Sept., pp. 1471-1496.

Miller, J.H. (1989) ``The Coevolution of Automata in the Repeated
Prisoner's Dilemma'' Santa Fe Institute Working Paper 89-003.

Marimon, Ramon, Ellen McGrattan and Thomas J. Sargent (1990) ``Money
as a Medium of Exchange in an Economy with Artificially Intelligent
Agents'' Journal of Economic Dynamics and Control 14, pp. 329--373.

Maynard-Smith, (1982) Evolution and the Theory of Games, CUP.

Lindgren, K. (1991) ``Evolutionary Phenomena in Simple Dynamics,'' in
[ALIFEI].

Holland, J.H and John Miller (1990) ``Artificially Adaptive Agents in
Economic Theory,'' American Economic Review: Papers and Proceedings
of the 103rd Annual Meeting of the American Economics Association:
365--370.

Huberman, Bernardo, and Natalie S. Glance (1993) "Diversity and
Collective Action" in H. Haken and A. Mikhailov (eds.)
Interdisciplinary Approaches to Nonlinear Systems, Springer.

Fogel (1993) "Evolving Behavior in the Iterated Prisoner's Dilemma"
Evolutionary Computation 1:1, 77-97

Fogel, D.B. and Harrald, P. (1994) ``Evolving Complex Behaviour in
the Iterated Prisoner's Dilemma,'' Proceedings of the Fourth Annual
Meetings of the Evolutionary Programming Society, L.J. Fogel and A.W.
Sebald eds., World Science Press.

Lindgren, K. and Nordahl, M.G. "Cooperation and Community Structure
in Artificial Ecosystems", Artificial Life, vol 1:1&2, 15-38

Stanley, E.A., Ashlock, D. and Tesfatsion, L. (1994) "Iterated
Prisoner's Dilemma with Choice and Refusal of Partners" in [ALIFEIII]
131-178

JOB-SHOP SCHEDULING
The Job-Shop Scheduling Problem (JSSP) is a very difficult
NP-complete problem which, so far, seems best addressed by
sophisticated branch and bound search techniques. GA researchers,
however, are continuing to make progress on it. (Davis 85) started
off GA research on the JSSP; (Whitley 89) reports on using the edge
RECOMBINATION operator (designed initially for the TSP) on JSSPs too.
More recent work includes (Nakano 91), (Yamada & Nakano 92) and (Fang
et al. 93). These three report increasingly better results on
using GAs on fairly large benchmark JSSPs (from Muth & Thompson 63);
none consistently outperforms branch & bound search yet, but they
seem well on the way. A crucial aspect of such work (as with any GA
application) is the method used to encode schedules. An important
aspect of some of the recent work on this is that better results have
been obtained by rejecting the conventional wisdom of using binary
representations (as in (Nakano 91)) in favor of more direct
encodings. In (Yamada & Nakano 92), for example, a GENOME directly
encodes operation completion times, while in (Fang et al. 93) genomes
represent implicit instructions for building a schedule. The success
of these latter techniques, especially since their applications are
very important in industry, should eventually spawn advances in GA
theory.

Concerning the point of using GAs at all on hard job-shop scheduling
problems, the same goes here as suggested above for `Timetabling':
The GA approach enables relatively arbitrary constraints and
objectives to be incorporated painlessly into a single OPTIMIZATION
method. It is unlikely that GAs will outperform specialized
knowledge-based and/or conventional OR-based approaches to such
problems in terms of raw solution quality; however, GAs offer much
greater simplicity and flexibility, and so, for example, may be the
best method for finding quick high-quality solutions, rather than
the best possible solution at any cost. Also, of course, hybrid
methods will have a lot to offer, and GAs are far easier to
parallelize than typical knowledge-based/OR methods.

Similar to the JSSP is the Open Shop Scheduling Problem (OSSP).
(Fang et al. 93) reports an initial attempt at using GAs for this.
Ongoing results from the same source show reliable achievement of
results within 0.23% of optimal on moderately large OSSPs (so far,
up to 20x20), including an improvement on the previously best known
solution for a benchmark 10x10 OSSP. A simpler form of job shop
problem is the Flow-Shop Sequencing problem; recent successful work
on applying GAs to this includes (Reeves 93).

Archive-name: ai-faq/genetic/part4
Last-Modified: 9/20/00
Issue: 8.2

Important note: Do NOT send email to the cs.cf.ac.uk address above: it will
be ignored. Corrections and other correspondence should be sent to
david....@iee.org

TABLE OF CONTENTS OF PART 4

Q10: What introductory material on EAs is there?
Q10.1: Suitable background reading for beginners?
Q10.2: Textbooks on EC?
Q10.3: The Classics?
Q10.4: Introductory Journal Articles?
Q10.5: Introductory Technical Reports?
Q10.6: Not-quite-so-introductory Literature?
Q10.7: Biological Background Readings?
Q10.8: On-line bibliography collections?
Q10.9: Videos?
Q10.10: CD-ROMs?
Q10.11: How do I get a copy of a dissertation?

Q11: What EC related journals and magazines are there?

Q12: What are the important conferences/proceedings on EC?

Q13: What Evolutionary Computation Associations exist?

Q14: What Technical Reports are available?

Q15: What information is available over the net?
Q15.1: What digests are there?
Q15.2: What mailing lists are there?
Q15.3: What online information repositories are there?
Q15.4: What relevant newsgroups and FAQs are there?
Q15.5: What about all these Internet Services?

----------------------------------------------------------------------

Subject: Q10: What introductory material on EAs is there?

There are many sources of introductory material on evolutionary
algorithms: background books (see Q10.1), textbooks (see Q10.2),
classical works (see Q10.3), journal articles (see Q10.4), technical
reports (see Q10.5), more advanced literature (see Q10.6), biological
background reading (see Q10.7), bibliography collections (see Q10.8),
videos (see Q10.9) and CD-ROMs (Q10.10). Information on how to get
dissertations is also given below (see Q10.11).

Conference proceedings (see Q12) are also a good source of up-to-date
(and sometimes introductory) material.

------------------------------

Subject: Q10.1: Suitable background reading for beginners?

These books give a "flavor" of what the subject is about.

Dawkins, R. (1976, 1989 2nd ed) "The Selfish Gene", Oxford: Oxford
University Press. [The 2nd edition includes two new chapters]

Dawkins, R. (1982) "The Extended Phenotype: The Gene as a Unit of
Selection", Oxford: Oxford University Press.

Dawkins, R. (1986) "The Blind Watchmaker", New York: W.W. Norton.

Fogel, D. (1998) "Evolutionary Computation: The Fossil Record," IEEE
Press. Chronicles the history of simulated evolution from the early
1950s. http://www.natural-selection.com/people/dbf.html

Gonick, L. (1983) "The Cartoon Guide to Computer Science", New York:
Barnes & Noble. [eds note: features an interesting chapter on Charles
Babbage in conjunction with "horse racing forecasting"; if you want
to use EAs to fulfill this task, better read this section first]

Gonick, L. (1983) "The Cartoon Guide to Genetics", New York: Barnes &
Noble.

Regis, E. (1987) "Who got Einstein's Office? Eccentricity and Genius
at the Institute for Advanced Study", Reading, MA: Addison Wesley
[eds note: chapters 5, 10 and 12]

Levy, S. (1992) "Artificial Life: The Quest for a new Creation", New
York, NY: Pantheon. [LEVY92]: [eds note: read this and you will have
the urge to work in this field]

Sigmund, K. (1993) "Games of Life: Explorations in Ecology, Evolution
and Behaviour", Oxford: Univ. Press. 252 pp. Hard/Softcover avail.

------------------------------

Subject: Q10.2: Textbooks on EC?

These books go into the "nuts and bolts" of EC.

Goldberg, D.E. (1989) "Genetic Algorithms in Search, Optimization,
and Machine Learning", Addison-Wesley. [GOLD89]: (Probably the most
widely referenced book in the field!)

Davis, L. (ed) (1991) "Handbook of Genetic Algorithms", Van Nostrand
Reinhold, New York, NY. [DAVIS91]:

Michalewicz, Z. (1992) "Genetic Algorithms + Data Structures =
Evolution Programs", Springer-Verlag, New York, NY. [MICHALE92]:
Also second, extended edition (1994) with index. [MICHALE94]:

Koza, J.R. (1992) "Genetic Programming: On the Programming of
Computers by means of Natural Selection", Cambridge, MA: MIT Press.
[KOZA92]:

Langdon, W.B. (1998) "Genetic Programming and Data Structures",
Hingham, MA: Kluwer. [LANG98]:
http://www.wkap.nl/book.htm/0-7923-8135-1

------------------------------

Subject: Q10.3: The Classics?

Mostly older works which have helped to shape the field.

Charles Darwin (1859), "The Origin of Species", London: John Murray.
(Penguin Classics, London, 1985; New American Library, Mentor
Paperback)

Box, G.E.P. (1957) "Evolutionary operation: a method of increasing
industrial productivity", Applied Statistics, 6, 81-101.

Fraser, A.S. (1957) "Simulation of genetic systems by automatic
digital computers", Australian Journal of Biological Sciences, 10,
484-491.

Friedman, G.J. (1959) "Digital simulation of an evolutionary
process", General Systems Yearbook, 4:171-184.

Bremermann, H.J. (1962) "Optimization through evolution and
recombination". In M.C. Yovits, et al, (eds) Self-Organizing Systems.
Washington, DC: Spartan Books.

Holland, J.H. (1962) "Outline for a logical theory of adaptive
systems", JACM, 3, 297-314.

Samuel, A.L. (1963) "Some Studies in Machine Learning using the Game
of Checkers", in Computers and Thought, E.A. Feigenbaum and J.
Feldman (eds), New York: McGraw-Hill.

Walter, W.G. (1963) "The Living Brain", New York: W.W. Norton.

Fogel, L.J., Owens, A.J. & Walsh, M.J. (1966) "Artificial
Intelligence through Simulated Evolution", New York: Wiley.
[Fogel66]:

Rosen, R. (1967) "Optimality Principles in Biology", London:
Butterworths.

Rechenberg, I. (1973, 1993 2nd edn) "Evolutionsstrategie: Optimierung
technischer Systeme nach Prinzipien der biologischen Evolution",
Stuttgart: Frommann-Holzboog. (Evolution Strategy: Optimization of
technical systems by means of biological evolution)

Holland, J.H. (1975) "Adaptation in natural and artificial systems",
Ann Arbor, MI: The University of Michigan Press. [HOLLAND75]: 2nd
edn. (1992) [HOLLAND92]:

De Jong, K.A. (1975) "An analysis of the behavior of a class of
genetic adaptive systems", Doctoral thesis, Dept. of Computer and
Communication Sciences, University of Michigan, Ann Arbor.

Schwefel, H.-P. (1977) "Numerische Optimierung von Computer-Modellen
mittels der Evolutionsstrategie", Basel: Birkhaeuser.

Schwefel, H.-P. (1981) "Numerical Optimization of Computer Models",
Chichester: Wiley. [eds note: English translation of the previous
entry; a reworked edition is currently in preparation for 1994]

Axelrod, R. (1984) "The evolution of cooperation", NY: Basic Books.

Cramer, N.L. (1985) "A Representation for the Adaptive Generation of
Simple Sequential Programs" [ICGA85], 183-187.

Baeck, T., Hoffmeister, F. & Schwefel, H.-P. (1991) "A Survey of
Evolution Strategies" [ICGA91], 2-9.

------------------------------

Subject: Q10.4: Introductory Journal Articles?

Baeck, T. & Schwefel, H.-P. (1993) "An Overview of Evolutionary
Algorithms for Parameter Optimization", Evolutionary Computation,
1(1), 1-23.

Baeck, T., Rudolph, G. & Schwefel, H.-P. (1993) "Evolutionary
Programming and Evolution Strategies: Similarities and Differences",
[EP93], 11-22.

Baeck, T., Hammel, U. and Schwefel, H.-P. (1997) "Evolutionary
computation: Comments on the history and current state," IEEE Trans.
Evolutionary Computation, Vol. 1:1, pp. 3-17

Beasley, D., Bull, D.R., & Martin, R.R. (1993) "An Overview of
Genetic Algorithms: Part 1, Fundamentals", University Computing,
15(2) 58-69. Available by ftp from ENCORE (See Q15.3) in file:
GA/papers/over93.ps.gz or from
ralph.cs.cf.ac.uk/pub/papers/GAs/ga_overview1.ps

Beasley, D., Bull, D.R., & Martin, R.R. (1993) "An Overview of
Genetic Algorithms: Part 2, Research Topics", University Computing,
15(4) 170-181. Available by ftp from ENCORE (See Q15.3) in file:
GA/papers/over93-2.ps.gz or from
ralph.cs.cf.ac.uk/pub/papers/GAs/ga_overview2.ps

Brooks, R.A. (1991) "Intelligence without Reason", MIT AI Memo No.
1293. Appeared in "Computers and Thought", IJCAI-91.

Dawkins, R. (1987) "The Evolution of Evolvability", [ALIFEI],
201-220.

Fogel, D.B. (1994) "An introduction to simulated evolutionary
optimization," IEEE Trans. Neural Networks, Vol. 5:1, pp. 3-14.

Goldberg, D.E. (1986) "The Genetic Algorithm: Who, How, and What
Next?". In Kumpati S. Narendra, ed., Adaptive and Learning Systems,
Plenum, New York, NY.

Goldberg, D. (1994), "Genetic and Evolutionary Algorithms Come of
Age", Communications of the ACM, 37(3), 113-119.

Hillis, W.D. (1987) "The Connection Machine", Scientific American,
255(6).

Hillis, W.D. (1992) "Massively Parallel Computing" Daedalus, winter,
121(1), 1-29. [HILLIS92]:

Holland, J.H. (1989) "Using Classifier Systems to Study Adaptive
Nonlinear Networks". In: Lectures in the Science of Complexity, SFI
Studies in the Science of Complexity, D. Stein, (ed), Addison Wesley.

Holland, J.H. (1992) "Genetic Algorithms", Scientific American,
267(1), 66-72.

Holland, J.H. (1992) "Complex Adaptive Systems" Daedalus, winter,
121(1), 17-30.

Mitchell, M. & Forrest, S. (1993) "Genetic Algorithms and Artificial
Life", Artificial Life, 1(1). Also avail. as SFI Working Paper
31-11-072.

Sims, K. (1991) "Artificial Evolution for Computer Graphics",
Computer Graphics, 25(4), 319-328

Sipper, M. (1996) "A Brief Introduction to Genetic Algorithms",
unpublished guide, available from
http://lslwww.epfl.ch/~moshes/ga.html

Spears, W.M., DeJong, K.A., Baeck, T., Fogel, D. & de Garis, H.
(1993) "An Overview of Evolutionary Computation", [ECML93], 442-459.

Peter Wayner (1991), "Genetic Algorithms: Programming takes a
valuable tip from nature", BYTE, January, 361-368.

------------------------------

Subject: Q10.5: Introductory Technical Reports?

Archive-name: ai-faq/genetic/part5
Last-Modified: 9/20/00
Issue: 8.2

Important note: Do NOT send email to the cs.cf.ac.uk address above: it will
be ignored. Corrections and other correspondence should be sent to
david....@iee.org

TABLE OF CONTENTS OF PART 5

Q20: What EA software packages are available?
Q20.1: Free software packages?
Q20.2: Commercial software packages?
Q20.3: Current research projects?

----------------------------------------------------------------------

Subject: Q20: What EA software packages are available?

This gives a list of all known EA software packages available to the
public. The list was originally maintained by Nici Schraudolph. In
June '93 it was agreed that it would be incorporated into this FAQ
and the responsibility for maintenance taken over by the FAQ editor.

Copies of most of the packages described below are kept at ENCORE
(See Q15.3), available by anonymous FTP.

Most GENETIC PROGRAMMING software is available by FTP in:
ftp.io.com/pub/genetic-programming/ There are subdirectories
containing papers related to GP, archives of the mailing list, as
well as a suite of programs for implementing GP. These programs
include the Lisp code from Koza's "Genetic Programming" [KOZA92], as
well as implementations in C and C++, as for example SGPC: Simple
Genetic Programming in C by Walter Alden Tackett and Aviram Carmi
<g...@ipld01.hac.com>.

A survey paper entitled "Genetic Algorithm Programming Environments",
written by Filho, Alippi and Treleaven of University College, London,
UK, was published in the June 1994 issue of IEEE Computer. It's
available by FTP as bells.cs.ucl.ac.uk/papagena/game/docs/gasurvey.ps
(file size: 421k).

PLEASE NOTE
For many of these software packages, specific ordering instructions
are given in the descriptions below (see Q20.1 - Free Software
packages, Q20.2 - Commercial Software Packages, Q20.3 - Research
Projects). Please read and follow them before unnecessarily
bothering the listed author or contact! Also note that these
programs haven't been independently tested, so there are no
guarantees of their quality.

A major revision was undertaken in August 1994, when all authors were
contacted, and asked to confirm the accuracy of the information
contained here. A few authors did not respond to the request for
information. These are noted below by: (Unverified 8/94). In these
cases, FTP addresses were checked by the FAQ editor, to confirm that
this information (at least) is correct. In two cases, email to the
author bounced back as "undeliverable" -- these are noted below.

Legend
Type (this is a very ad-hoc classification)
GE: generational GA
SS: steady-state GA
PA: (pseudo) parallel GA
ES: evolution strategy
OO: object-oriented
XP: expert system
ED: educational/demo
CF: classifier system

OS Operating System; X11 implies Unix; "Win" means Microsoft
Windows 3.x/NT (PC); "DOS" means MS-DOS or compatibles.

Lang Programming Language; in parentheses: source code not included;
"OPas" = MPW Object Pascal
Price (circa 1994)
(1) free to government contractors, $221 otherwise, (2)
educational discount available, (3) available as addendum to a
book, (4) single 1850 DM, site license 5200 DM, (5) single 200
DM, site license 500 DM, (6) free for academic and educational
use.

Author or Contact
Name of creator/maintainer. For internet e-mail addresses, refer
to the details of the specific package.

ES/GA/XP System Implementations:

=========================================================================
Name            Type      OS          Lang       Price   Author/Contact
=========================================================================

BUGS            GE,       X11,        C          free    Joshua Smith
                ED        Suntools

Computer-       ED,       Win         ?          free    Scott Kennedy
Ants            GA

DGenesis        GE,       Unix        C          free    Erick Cantu-Paz
                PA,ED

DOUGAL          SS,       DOS         Turbo      free    Brett Parker
                GE                    Pascal

Ease            GE,       Unix        Tcl        free    Joachim Sprave
                ES

ESCaPaDE        ES        Unix        C          free    Frank Hoffmeister

Evolution       GE,       DOS         C          free    Hans-Michael Voigt and
Machine         ES                                       Joachim Born

Evolutionary    GE,       Unix        C++        free    JJ Merelo
Objects         OO

GAC             GE        Unix        C          free    Bill Spears
GAL             GE        Unix        Lisp       free    Bill Spears

GAGA            GE        Unix        C          free    Jon Crowcroft

GAGS            GE,       Unix,       C++        free    JJ Merelo
                SS,OO     DOS

GAlib           GA        Unix,       C++        free    Matthew Wall
                          Mac,DOS

GALOPPS         GE,       Unix,       C          free    Erik Goodman
                PA        DOS

GAMusic         ED        Win         (VB)       $10     Jason H. Moore

GANNET          GE,       Unix        C          free    Darrell Duane
                NN

GAucsd          GE        Unix        C          free    Nici Schraudolph

GA              GE,       DOS         (C++)      free    Mark Hughes
Workbench       ED

GECO            GE,       Unix,       Lisp       free    George P. W. Williams, Jr.
                OO,ED     MacOS

Genesis         GE,       Unix,       C          free    John Grefenstette
                ED        DOS

GENEsYs         GE        Unix        C          free    Thomas Baeck

GenET           SS,       Unix,       C          free    Cezary Z. Janikow
                ES,ED     X, etc.

Genie           GE        Mac         Think      free    Lance Chambers
                                      Pascal

Genitor         SS        Unix        C          free    Darrell Whitley

GENlib          SS        Unix,       C          (6)     Jochen Ruhland
                          DOS

GENOCOP         GE        Unix        C          free    Zbigniew Michalewicz

GIGA            SS        Unix        C          free    Joe Culberson

GPEIST          GP        Win,        Smalltalk  free    Tony White
                          OS/2

Imogene         GP        Win         C++        free    Harley Davis

JAG             GA        -           Java       free    Stephen Hartley

LibGA           GE,       Unix/DOS    C          free    Art Corcoran
                SS,ED     NeXT/Amiga

LICE            ES        Unix,       C          free    Joachim Sprave
                          DOS

Matlab-GA       GE        ?           Matlab     free    Andy Potvin

mGA             GE        Unix        C,         free    Dave Goldberg
                                      Lisp

PARAGenesis     PA,       CM          C*         free    Michael van Lent
                GE

PGA             PA,       Unix,       C          free    Peter Ross
                SS,GE     etc.

PGAPack         GA,       any         C          free    David Levine
                PA

REGAL           GA                    C          free    Filippo Neri

SGA-C,          GE        Unix        C          free    Robert E. Smith
SGA-Cube                  nCube

Splicer         GE        Mac,        C          (1)     Steve Bayer
                          X11

TOLKIEN         OO,       Unix,       C++        free    Anthony Yiu-Cheung Tang
                GE        DOS

Trans-Dimensional
Learning        NN        Win         ?          free    Universal Problem Solvers

WOLF            SS        Unix        C          free    David Rogers

XGenetic        GA,       Win         ActiveX    free    Jeff Goslin
                OO,ED                            demo

=========================================================================

Classifier System Implementations:

=========================================================================
Name            Type      OS          Lang       Price   Author/Contact
=========================================================================

CFS-C           CF,       Unix/DOS    C          free    Rick Riolo
                ED

SCS-C           CF,       Unix/DOS    C          free    Joerg Heitkoetter
                ED        Atari TOS
=========================================================================

Commercial Packages:

=========================================================================
Name            Type      OS          Lang       Price   Author/Contact
=========================================================================

ActiveGA        GA        Win         (ActiveX)  $99     Brightwater Software

EnGENEer        OO,       X11         C          ?       George Robbins,
                GA                                       Logica Cambridge Ltd.

EvoFrame/       OO,       Mac,        C++/       (4,2)   Optimum Software
REALizer        ES        DOS         OPas       (5,2)

Evolver         GE        DOS,        (C,        UKP350  Palisade
                          Mac         Pascal)

FlexTool        GA        Win         Matlab     ?       Flexible Intelligence Group

GAME            OO,       X11         C++        (3)     Jose R. Filho
                GA

GeneHunter      GA        Win,        (VB)       $369    Ward Systems
                          Excel

Generator       GE,SS     Win,        (C++)      $379    Steve McGrew, New Light Industries
                ES,OO,ED  Excel

Genetic         GE,SS     Win         (ActiveX)  ?       NeuroDimension Inc.
Server/Library                        (C++)

MicroGA/        OO,       Mac,        C++        $249    Emergent Behavior, Inc.
Galapagos       SS        Win                    (2)

Omega           ?         DOS         ?          ?       David Barrow, KiQ Ltd.

OOGA            OO,       Mac,        Lisp       $60     Lawrence Davis
                GE        DOS

PC/Beagle       XP        DOS         ?          69UKP   Richard Forsyth

XpertRule/      XP        DOS         (Think     995UKP  Attar Software
GenAsys                               Pascal)

XYpe            SS        Mac         (C)        $725    Ed Swartz, Virtual Image Inc.
=========================================================================

------------------------------

Subject: Q20.1: Free software packages?

BUGS:
BUGS (Better to Use Genetic Systems) is an interactive program for
demonstrating the GENETIC ALGORITHM and is written in the spirit of
Richard Dawkins' celebrated Blind Watchmaker software. The user can
play god (or `GA FITNESS function,' more accurately) and try to
evolve lifelike organisms (curves). Playing with BUGS is an easy way
to get an understanding of how and why the GA works. In addition to
demonstrating the basic GENETIC OPERATORs (SELECTION, CROSSOVER, and
MUTATION), it allows users to easily see and understand phenomena
such as GENETIC DRIFT and premature convergence. BUGS is written in C
and runs under Suntools and X Windows.

BUGS was written by Joshua Smith <j...@media.mit.edu> at Williams
College and is available from
www.aic.nrl.navy.mil/pub/galist/src/BUGS.tar.Z Note that it is
unsupported software, copyrighted but freely distributable. Address:
Room E15-492, MIT Media Lab, 20 Ames Street, Cambridge, MA 02139.
(Unverified 8/94).

ComputerAnts:
ComputerAnts is a free Windows program that teaches principles of
GENETIC ALGORITHMs by breeding a colony of ants on your computer
screen. Users create ants, food, poison, and set CROSSOVER and
MUTATION rates. Then they watch the colony slowly evolve. Includes
extensive on-line help and tutorials on genetic algorithms. For
further information or to download, see the download section under
http://www.bitstar.com

DGenesis:
DGenesis is a distributed implementation of a Parallel GA. It is
based on Genesis 5.0. It runs on a network of UNIX workstations. It
has been tested with DECstations, microVAXes, Sun Workstations and
PCs running 386BSD 0.1. Each subpopulation is handled by a UNIX
process and the communication between them is accomplished using
Berkeley sockets. The system is programmed in C and is available free
of charge by anonymous FTP from lamport.rhon.itam.mx:/ and from
ftp.aic.nrl.navy.mil/pub/galist/src/ga/dgenesis-1.0.tar.Z

DGenesis allows the user to set the MIGRATION interval, the migration
rate and the topology between the SUB-POPULATIONs. There has not
been much work investigating the effect of the topology on the
PERFORMANCE of the GA; DGenesis was written specifically to encourage
experimentation in this area. It still needs many refinements, but
some may find it useful.
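
The three settings named above (migration interval, migration rate
and topology) can be illustrated with a minimal single-process sketch
in Python. This is not DGenesis code: DGenesis is written in C and
runs one UNIX process per subpopulation. The function names, the
OneMax toy fitness, the tournament selection and the ring topology
are all assumptions chosen for illustration.

```python
import random

def onemax(bits):
    """Toy fitness (an assumption for this sketch): number of 1 bits."""
    return sum(bits)

def step(pop, rng, pmut=0.02):
    """One generation: binary tournament, 1-point crossover, bit-flip mutation."""
    def pick():
        a, b = rng.sample(pop, 2)
        return a if onemax(a) >= onemax(b) else b
    new = []
    while len(new) < len(pop):
        p1, p2 = pick(), pick()
        cut = rng.randrange(1, len(p1))
        child = p1[:cut] + p2[cut:]
        new.append([1 - g if rng.random() < pmut else g for g in child])
    return new

def migrate(islands, topology, rate):
    """Copy the `rate` best individuals of each island over the worst
    individuals of every island it is connected to in `topology`."""
    emigrants = [sorted(isl, key=onemax, reverse=True)[:rate] for isl in islands]
    for src, dests in topology.items():
        for dst in dests:
            islands[dst].sort(key=onemax)  # worst individuals first
            islands[dst][:rate] = [list(ind) for ind in emigrants[src]]

def run(n_islands=4, pop_size=20, length=32, generations=40,
        interval=5, rate=2, seed=0):
    rng = random.Random(seed)
    islands = [[[rng.randint(0, 1) for _ in range(length)]
                for _ in range(pop_size)] for _ in range(n_islands)]
    # Ring topology: island i sends migrants to island i+1 (hypothetical choice).
    ring = {i: [(i + 1) % n_islands] for i in range(n_islands)}
    for gen in range(1, generations + 1):
        islands = [step(isl, rng) for isl in islands]
        if gen % interval == 0:  # migration interval, in generations
            migrate(islands, ring, rate)
    return islands
```

Changing the `ring` dictionary (for example, to a fully connected
graph) is the kind of topology experiment the package description
refers to.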

Contact Erick Cantu-Paz <eca...@lamport.rhon.itam.mx> at the
Instituto Tecnologico Autonomo de Mexico (ITAM).

Dougal:
DOUGAL is a demonstration program for solving the TRAVELLING SALESMAN
PROBLEM using GAs. The system guides the user through the GA,
allowing them to see the results of altering parameters relating to
CROSSOVER, MUTATION etc. The system demonstrates graphically the
OPTIMIZATION of the route. The options open to the user to
experiment with include percentage CROSSOVER and MUTATION, POPULATION
size, steady state or generational replacement, FITNESS technique
(linear normalisation, "fitness is evaluation", etc.).
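
The difference between the two replacement schemes mentioned above
can be sketched in a few lines of Python. This is not DOUGAL code
(DOUGAL is written in Turbo Pascal); the function names and the
"replace the worst" policy for the steady-state case are assumptions
made for illustration, and other steady-state policies exist.

```python
def generational_step(pop, breed):
    """Generational replacement: every individual is replaced by a new offspring."""
    return [breed(pop) for _ in pop]

def steady_state_step(pop, fitness, breed):
    """Steady-state replacement: one offspring replaces the current worst
    individual; the rest of the population carries over unchanged."""
    child = breed(pop)
    worst = min(range(len(pop)), key=lambda i: fitness(pop[i]))
    new_pop = list(pop)
    new_pop[worst] = child
    return new_pop
```

Here `breed` stands for whatever selection, crossover and mutation
pipeline is in use; it takes the current population and returns one
new individual.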

DOUGAL requires an IBM compatible PC with a VGA monitor. The
software is free; however, I would appreciate feedback on what you
think of the software.

Dougal is available by FTP from ENCORE (see Q15.3) in file
EC/GA/src/dougal.zip It's pkzipped and contains executable, vga
driver, source code and full documentation. It is important to place
the vga driver (egavga.bgi) in the same directory as DOUGAL. Author:
Brett Parker, 7 Glencourse, East Boldon, Tyne + Wear, NE36 0LW,
England. <b.s.p...@durham.ac.uk>

Ease:
Ease - Evolutionary Algorithms Scripting Environment - is an
extension to the Tcl scripting language, providing commands to
create, modify, and evaluate POPULATIONs of INDIVIDUALs represented
by real number vectors and/or bit strings. With Ease, a standard ES
or GA can be written in less than 20 lines of code.

Ease is available as source code for Linux and Solaris under the GNU
Public License. Tcl version 8.0 or higher is required. If you know
how to generate DLLs, you may be able to use it on Win9x/NT, as well.

The URL is http://ls11-www.cs.uni-dortmund.de/~joe/Ease/Ease.html .
Written by Joachim Sprave <spr...@LS11.cs.uni-dortmund.de>.

ESCaPaDE:
ESCaPaDE is a sophisticated software environment for running
experiments with EVOLUTIONARY ALGORITHMs, such as an EVOLUTION STRATEGY.
The main support for experimental work is provided by two internal
tables: (1) a table of objective functions and (2) a table of so-
called data monitors, which allow easy implementation of functions
for monitoring all types of information inside the Evolutionary
Algorithm under experiment.

ESCaPaDE 1.2 comes with the KORR implementation of the evolution
strategy by H.-P. Schwefel which offers simple and correlated
MUTATIONs. KORR is provided as a FORTRAN 77 subroutine, and its
cross-compiled C version is used internally by ESCaPaDE.

An extended version of the package has been used for several
investigations so far and has proven to be very reliable. The
software and its documentation are fully copyrighted, although they
may be freely used for scientific work; the package requires 5-6 MB
of disk space.

In order to obtain ESCaPaDE, please send a message to the e-mail
address below. The SUBJECT line should contain 'help' or 'get
ESCaPaDE'. (If the subject line is invalid, your mail will be
ignored!). For more information contact: Frank Hoffmeister, Systems
Analysis Research Group, LSXI, Department of Computer Science,
University of Dortmund, D-44221 Dortmund, Germany. Net:
<hoffm...@ls11.informatik.uni-dortmund.de>

Evolution Machine:
The Evolution Machine (EM) is universally applicable to continuous
(real-coded) OPTIMIZATION problems. In the EM we have coded
fundamental EVOLUTIONARY ALGORITHMs (GENETIC ALGORITHMs and EVOLUTION
STRATEGIEs), and added some of our approaches to evolutionary search.

The EM includes extensive menu techniques with:

o Default parameter setting for inexperienced users.

o Well-defined entries for EM-control by freaks of the EM, who
want to leave the standard process control.

o Data processing for repeated runs (with or without change of the
strategy parameters).

o Graphical presentation of results: online presentation of the
EVOLUTION progress, one-, two- and three-dimensional graphic
output to analyse the FITNESS function and the evolution process.

o Integration of calling MS-DOS utilities (Turbo C).

We provide the EM-software in object code, which can be run on PC's
with MS-DOS and Turbo C v2.0 or Turbo C++ v1.01. The Manual to
the EM is included in the distribution kit.

The EM software is available by FTP from
ftp-bionik.fb10.tu-berlin.de/pub/software/Evolution-Machine/ This
directory contains the
compressed files em_tc.exe (Turbo C), em_tcp.exe (Turbo C++) and
em_man.exe (the manual). There is also em-man.ps.Z, a compressed
PostScript file of the manual. If you do not have FTP access, please
send us either 5 1/4 or 3 1/2 MS-DOS compatible disks. We will return
them with the compressed files (834 kB).

Official contact information: Hans-Michael Voigt or Joachim Born,
Technical University Berlin, Bionics and Evolution Techniques
Laboratory, Bio- and Neuroinformatics Research Group, Ackerstrasse
71-76 (ACK1), D-13355 Berlin, Germany. Net:
<vo...@fb10.tu-berlin.de>, <bo...@fb10.tu-berlin.de> (Unverified 8/94).

EVOLUTIONARY OBJECTS:
EO (Evolutionary Objects) is a C++ library written and designed to
allow a variety of evolutionary algorithms to be constructed easily.
It is intended to be an "Open source" effort to create the definitive
EC library. It has: a mailing list, anon-CVS access, frequent
snapshots and other features. For details, see http://fast.to/EO

Maintained by J.J. Merelo, Grupo Geneura, Univ. Granada
<jmerelo@kal-el.ugr.es>

GA Workbench:
A mouse-driven interactive GA demonstration program aimed at people
wishing to show GAs in action on simple FUNCTION OPTIMIZATIONs and to
help newcomers understand how GAs operate. Features: problem
functions drawn on screen using mouse, run-time plots of GA
POPULATION distribution, peak and average FITNESS. Useful population
STATISTICS displayed numerically, GA configuration (population size,
GENERATION gap etc.) performed interactively with mouse.
Requirements: MS-DOS PC, mouse, EGA/VGA display.
Available by FTP from the simtel20 archive mirrors, e.g.
wsmr-simtel20.army.mil/pub/msdos/neurlnet/gaw110.zip or
wuarchive.wustl.edu: or oak.oakland.edu: Produced by Mark Hughes
<m...@i2ltd.demon.co.uk>. A Windows version is in preparation.

GAC, GAL:
Bill Spears <spe...@aic.nrl.navy.mil> writes: These are packages I've
been using for a few years. GAC is a GA written in C. GAL is my
Common Lisp version. They are similar in spirit to John
Grefenstette's Genesis, but they don't have all the nice bells and
whistles. Both versions currently run on Sun workstations. If you
have something else, you might need to do a little modification.

Both versions are free: All I ask is that I be credited when it is
appropriate. Also, I would appreciate hearing about improvements!
This software is the property of the US Department of the Navy.

The code will be in a "shar" format that will be easy to install.
This code is "as is", however. There is a README and some
documentation in the code. There is NO user's guide, though (nor am I
planning on writing one at this time). I am interested in hearing
about bugs, but I may not get around to fixing them for a while.
Also, I will be unable to answer many questions about the code, or
about GAs in general. This is not due to a lack of interest, but due
to a lack of free time!

Available by FTP from
ftp.aic.nrl.navy.mil/pub/galist/src/ga/GAC.shar.Z and GAL.shar.Z .
PostScript versions of some papers are under "/pub/spears". Feel
free to browse.

GAGA:
GAGA (GA for General Application) is a self-contained, re-entrant
procedure which is suitable for the minimization of many "difficult"
cost functions. Originally written in Pascal by Ian Poole, it was
rewritten in C by Jon Crowcroft. GAGA can be obtained by request from
the author: Jon Crowcroft <j...@cs.ucl.ac.uk>, University College
London, Gower Street, London WC1E 6BT, UK, or by FTP from
ftp://cs.ucl.ac.uk/darpa/gaga.shar

GAGS:
GAGS (Genetic Algorithms from Granada, Spain) is a library and
companion programs written and designed to take the heat out of
designing a GENETIC ALGORITHM. It features a class library for
genetic algorithm programming, but, from the user point of view, is a
genetic algorithm application generator. Just write the function you
want to optimize, and GAGS surrounds it with enough code to have a
genetic algorithm up and running, compiles it, and runs it. GAGS is
written in C++, so it can be compiled on any platform that has the
GNU C++ compiler. It has been tested on various machines.
Documentation is available.

GAGS includes:

o Steady-state, roulette-wheel, tournament and elitist SELECTION.

o FITNESS evaluation using training files.

o Graphics output through gnuplot.

o Uniform and 2-point CROSSOVER, and bit-flip and gene-transposition
MUTATION.

o Variable length CHROMOSOMEs and related operators.
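
Three of the operators listed above (2-point CROSSOVER, uniform
CROSSOVER and bit-flip MUTATION) can be sketched on bit strings as
follows. This is a generic Python illustration, not GAGS code (GAGS
is a C++ class library); the function names and parameters are
hypothetical.

```python
import random

def two_point_crossover(a, b, rng):
    """Exchange the segment between two randomly chosen cut points."""
    i, j = sorted(rng.sample(range(len(a) + 1), 2))
    return a[:i] + b[i:j] + a[j:], b[:i] + a[i:j] + b[j:]

def uniform_crossover(a, b, rng, p_swap=0.5):
    """Exchange each gene position independently with probability p_swap."""
    c, d = list(a), list(b)
    for k in range(len(a)):
        if rng.random() < p_swap:
            c[k], d[k] = d[k], c[k]
    return c, d

def bit_flip(bits, rng, rate):
    """Flip each bit independently with probability `rate`."""
    return [1 - g if rng.random() < rate else g for g in bits]
```

Both crossover functions return two children, so every gene of the
two parents ends up in exactly one of them.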

The application generator gags.pl is written in perl, so this
language must also be installed before GAGS. Available from:
http://kal-el.ugr.es/GAGS The programmer's manual is in the file
gagsprogs.ps.gz. GAGS is also available from ENCORE (see Q15.3) in
file EC/GA/src/gags-0.92.tar.gz (there may be a more recent version)
with documentation in EC/GA/docs/gagsprog.ps.gz

Maintained by J.J. Merelo, Grupo Geneura, Univ. Granada
<jmerelo@kal-el.ugr.es>

GAlib:
GAlib is a C++ library that provides the application programmer with
a set of GENETIC ALGORITHM objects. With GAlib you can add GA
OPTIMIZATION to your program using any data representation and
standard or custom SELECTION, CROSSOVER, MUTATION, scaling,
replacement, and termination methods. View the documentation on-line
at http://lancet.mit.edu/ga/ There you will find a complete
description of the programming interface, features, and examples.

The canonical source for this library is the FTP site:
lancet.mit.edu/pub/ga/ This directory contains UNIX (.tar.gz), MacOS
(.sea.hqx), and DOS (.zip) versions of the GA library. Once you have
downloaded the file, uncompress and extract it. It will expand to
its own directory. If you extract the DOS version be sure to use the
-d option to keep everything in one directory.

GAlib requires a cfront 3.0 compatible C++ compiler. It has been
used on the following systems: SGI IRIX 4.0.x (Cfront); SGI IRIX 5.x
(DCC 1.0, g++ 2.6.8, 2.7.0); IBM RSAIX 3.2 (g++ 2.6.8, 2.7.0); DEC
MIPS ultrix 4.2 (g++ 2.6.8, 2.7.0); SUN SOLARIS 5.3 (g++ 2.6.8,
2.7.0); HP-UX (g++); MacOS (MetroWerks CodeWarrior 5); MacOS
(Symantec THINK C++ 7.0); DOS/Windows (Borland Turbo C++ 3.0).

Maintained by: Matthew Wall <mbw...@mit.edu>

GALOPPS:
GALOPPS (Genetic Algorithm Optimized for Portability and Parallelism)
is a general-purpose parallel GENETIC ALGORITHM system, written in
'C', organized like Goldberg's "Simple Genetic Algorithm". User
defines objective function (in template furnished) and any callback
functions desired (again, filling in template); can run one or many
subpopulations, on one or many PC's, workstations, Mac's, MPP. Runs
interactively (GUI or answering questions) or from files, makes file
and/or graphical output. Runs easily interrupted and restarted, and
a PVM version for Unix networks even moves processes automatically
when workstations become busy. (Note: optional GUI requires Tcl/Tk.)
14 example problems included (De Jong Functions, Royal Road, BTSP,
etc. )

User may choose:

o problem type (permutation or value-type)

o field sizes (arbitrary, possibly unequal, heeded by CROSSOVER,
MUTATION)

o among 7 crossover types and 4 mutation types (or define own)

o among 6 SELECTION types, including an "automatic" option based on
Boltzmann scaling and the statistical mechanics work of Shapiro
and Pruegel-Bennett

o operator probabilities, FITNESS scaling, amount of output,
MIGRATION frequency and patterns,

o stopping criteria (using "standard" convergence STATISTICS, etc.)

o the GGA (Grouping Genetic Algorithm) REPRODUCTION and operators of
Falkenauer

GALOPPS allows and supports:

o use of a different representation in each subpopulation, with
transformation of migrants

o INVERSION on level of subpopulations, with automatic handling of
differing field sizes, migrants

o control over replacement by OFFSPRING, including DeJong crowding
or random replacement or SGA-like replacement of PARENTs

o mate selection, using incest reduction

o migrant selection, using incest reduction, and/or DeJong crowding
into receiving subpopulation

o optional ELITISM
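
As an illustration of one replacement option in the list above,
De Jong-style crowding can be sketched as follows: a new OFFSPRING
replaces the most similar member of a small random sample of the
population. This Python sketch is not GALOPPS code (GALOPPS is
written in C); the function names, the use of Hamming distance on
bit strings, and the default crowding factor are assumptions.

```python
import random

def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

def crowding_replace(pop, child, rng, crowding_factor=3):
    """Draw `crowding_factor` individuals at random and overwrite the one
    closest to `child`. Returns the index that was replaced."""
    candidates = rng.sample(range(len(pop)), crowding_factor)
    victim = min(candidates, key=lambda i: hamming(pop[i], child))
    pop[victim] = child
    return victim
```

Because an offspring displaces something that resembles it, dissimilar
individuals tend to survive, which helps preserve diversity.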

Generic (Unix) GALOPPS 3.2 (includes 80-pp. manual) is available on
ENCORE. For PVM GALOPPS, PC version (different line endings,
makefiles), Threaded GALOPPS, and GALOPPS-based 2-level adaptive
system, see the MSU GARAGe web site: http://GARAGe.cps.msu.edu/ .

Contact: Erik D. Goodman, <goo...@egr.msu.edu>, MSU GARAGe, Case
Center, 112 Engineering Building, MSU, East Lansing, MI 48824 USA.

GAMusic:
GAMusic 1.0 is a user-friendly interactive demonstration of a simple
GA that evolves musical melodies. Here, the user is the FITNESS
function. Melodies from the POPULATION can be played and then
assigned a fitness. Iteration, RECOMBINATION frequency and MUTATION
frequency are all controlled by the user. This program is intended
to provide an introduction to GAs and may not be of interest to the
experienced GA programmer.

GAMusic was programmed with Microsoft Visual Basic 3.0 for Windows
3.1x. No special sound card is required. GAMusic is distributed as
shareware (cost $10) and can be obtained by FTP from
wuarchive.wustl.edu/pub/MSDOS_UPLOADS/GenAlgs/gamusic.zip or from
fly.bio.indiana.edu/science/ibmpc/gamusic.zip The program is also
available from the America Online archive.

Contact: Jason H. Moore <j...@superh.hg.med.umich.edu> or
<jason...@aol.com>

GANNET:
GANNET (Genetic Algorithm / Neural NETwork) is a software package
written by Jason Spofford in 1990 which allows one to evolve binary
valued neural networks. It offers a variety of configuration options
related to rates of the GENETIC OPERATORs. GANNET evolves nets based
upon three FITNESS functions: Input/Output Accuracy, Output
'Stability', and Network Size.

The evolved neural network presently has a binary input and binary
output format, with neurodes that have either 2 or 4 inputs and
weights ranging from -3 to +4. GANNET allows for up to 250 neurons
in a net. Research using GANNET is continuing.

GANNET 2.0 is available at http://www.duane.com/~dduane/gannet .
In addition to the software, the master's thesis that used this
program and a paper are available in this directory.

The major enhancement of version 2.0 is the ability to recognize
variable length binary strings, such as those that would be generated
by a finite automaton. Included is code for calculating the
Effective Measure Complexity (EMC) of finite automata as well as code
for generating test data.

A mailing list has been established for discussing uses and problems
with the GANNET software. To subscribe, send a message to:
<majo...@duane.com> On the first line of the message (not the
subject) type: subscribe gannet

Contact: Darrell Duane <ddu...@duane.com> or Dr. Kenneth Hintz
<khi...@gmu.edu>, George Mason University, Dept. of Electrical &
Computer Engineering, Mail Stop 1G5, 4400 University Drive, Fairfax,
VA 22033-4444 USA.

GAucsd:
GAucsd is a Genesis-based GA package incorporating numerous bug fixes
and user interface improvements. Major additions include a wrapper
that simplifies the writing of evaluation functions, a facility to
distribute experiments over networks of machines, and Dynamic
Parameter Encoding, a technique that improves GA PERFORMANCE in
continuous SEARCH SPACEs by adaptively refining the genomic
representation of real-valued parameters.

GAucsd was written in C for Unix systems, but the central GA engine
is easily ported to other platforms. The entire package can be ported
to systems where implementations of the Unix utilities "make", "awk"
and "sh" are available.

GAucsd is available by FTP from
ftp.cs.ucsd.edu/pub/GAucsd/GAucsd14.sh.Z or from
ftp.aic.nrl.navy.mil/pub/galist/src/GAucsd14.sh.Z To be added to a
mailing list for bug reports, patches and updates, send "add GAucsd"
to <list...@cs.ucsd.edu>.

Cognitive Computer Science Research Group, CSE Department, UCSD 0114,
La Jolla, CA 92093-0114, USA. Net: <GAucsd-...@cs.ucsd.edu>

GECO:
GECO (Genetic Evolution through Combination of Objects) is an
extensible, object-oriented framework for prototyping GENETIC
ALGORITHMs in Common Lisp. GECO makes extensive use of CLOS, the
Common Lisp Object System, to implement its functionality. The
abstractions provided by the classes have been chosen with the intent
both of being easily understandable to anyone familiar with the
paradigm of genetic algorithms, and of providing the algorithm
developer with the ability to customize all aspects of its operation.
It comes with extensive documentation, in the form of a PostScript
file, and some simple examples are also provided to illustrate its
intended use.

GECO Version 2.0 is available by FTP. See the file
ftp.aic.nrl.navy.mil/pub/galist/src/ga/GECO-v2.0.README for more
information.

George P. W. Williams, Jr., 1334 Columbus City Rd., Scottsboro, AL
35768. Net: <geo...@hsvaic.hv.boeing.com>.

Genesis:
Genesis is a generational GA system written in C by John Grefenstette
<gr...@aic.nrl.navy.mil>. As the first widely available GA program,
Genesis has been very influential in stimulating the use of GAs, and
several other GA packages are based on it. Genesis is available
together with OOGA (see below), or by FTP from
ftp.aic.nrl.navy.mil/pub/galist/src/genesis.tar.Z (Unverified 8/94).

GENEsYs:
GENEsYs is a Genesis-based GA implementation which includes
extensions and new features for experimental purposes, such as
SELECTION schemes like linear ranking, Boltzmann, (mu,
lambda)-selection, and general extinctive selection variants,
CROSSOVER operators like n-point and uniform crossover as well as
discrete and intermediate RECOMBINATION. SELF-ADAPTATION of MUTATION
rates is also possible.
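
Some of the operators named above have compact textbook definitions,
sketched here in Python. This is not GENEsYs code (GENEsYs is written
in C); the function names are hypothetical, (mu, lambda)-selection is
shown for a minimization problem, and the linear ranking formula is
the common one with expected-value parameter eta_max between 1 and 2,
which may differ in detail from what the package implements.

```python
import random

def discrete_recombination(x, y, rng):
    """Each component is copied from one of the two parents, chosen at random."""
    return [rng.choice(pair) for pair in zip(x, y)]

def intermediate_recombination(x, y):
    """Each component is the arithmetic mean of the parents' components."""
    return [(a + b) / 2.0 for a, b in zip(x, y)]

def comma_selection(offspring, cost, mu):
    """(mu, lambda)-selection: keep the mu best of the lambda offspring.
    The parents are discarded; lower cost is better in this sketch."""
    return sorted(offspring, key=cost)[:mu]

def linear_ranking_probs(n, eta_max=1.5):
    """Selection probabilities for ranks 0 (best) to n-1 (worst), n >= 2.
    Probabilities fall linearly from eta_max/n to (2 - eta_max)/n."""
    eta_min = 2.0 - eta_max
    return [(eta_max - (eta_max - eta_min) * i / (n - 1)) / n for i in range(n)]
```

With n = 3 and eta_max = 1.5, the best individual is selected with
probability 1/2, the second with 1/3 and the worst with 1/6.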

A set of objective functions is provided, including De Jong's
functions, complicated continuous functions, a TSP-problem, binary
functions, and a fractal function. There are also additional data-
monitoring facilities such as recording average, variance and skew of
OBJECT VARIABLES and mutation rates, or creating bitmap-dumps of the
POPULATION.

GENEsYs 1.0 is available via FTP from
lumpi.informatik.uni-dortmund.de/pub/GA/src/GENEsYs-1.0.tar.Z The
documentation alone is available as /pub/GA/docs/GENEsYs-1.0-doc.tar.Z

For more information contact: Thomas Baeck, Systems Analysis Research
Group, LSXI, Department of Computer Science, University of Dortmund,
D-44221 Dortmund, Germany. Net:
<ba...@ls11.informatik.uni-dortmund.de> (Unverified 8/94).

GenET:
GenET is a "generic" GA package. It is generic in the sense that all
problem independent mechanisms have been implemented and can be used
regardless of application domain. Using the package forces (or
allows, however you look at it) concentration on the problem: you
have to suggest the best representation, and the best operators for
such a space that utilize your problem-specific knowledge. You do not
have to think about possible GA models or their implementation.

The package, in addition to allowing for fast implementation of
applications and being a natural tool for comparing different models
and strategies, is intended to become a depository of representations
and operators. Currently, only a floating-point representation is
implemented in the library, with a few operators.

The algorithm provides a wide selection of models and choices. For
example, POPULATION models range from generational GA, through
steady-state, to (n,m)-EP and (n,n+m)-EP models (for arbitrary
problems, not just parameter OPTIMIZATION). (Some are not finished
at the moment). Choices include automatic adaptation of operator
probabilities and a dynamic ranking mechanism, etc.

Even though the implementation is far from optimal, it is quite
efficient - implemented in ATT's C++ (3.0) (functional design) and
also tested on gcc. Along with the package you will get two
examples. They illustrate how to implement problems with
heterogeneous and homogeneous structures, with explicit rep/opers and
how to use the existing library (FP). Very soon I will add another
example - our GENOCOP operators for linearly constrained
optimization. One more example soon to appear illustrates how to
deal with complex structures and non-stationary problems - this is a
fuzzy rule-based controller optimized using the package and some
specific rep/operators.

If you start using the package, please send evaluations (especially
bugs) and suggestions for future versions to the author.

GenET Version 1.00 is available by FTP from
radom.umsl.edu/var/ftp/GenET.tar.Z To learn more, you may get the
User's Manual, available in compressed postscript in
"/var/ftp/userMan.ps.Z". It also comes bundled with the complete
package.

Cezary Z. Janikow, Department of Math and CS, CCB319, St. Louis, MO
63121, USA. Net: <jan...@radom.umsl.edu>

Genie:
Genie is a GA-based modeling/forecasting system that is used for
long-term planning. One can construct a model of an ENVIRONMENT and
then view the forecasts of how that environment will evolve into the
future. It is then possible to alter the future picture of the
environment so as to construct a picture of a desired future (I will
not enter into arguments of who is or should be responsible for
designing a desired or better future). The GA is then employed to
suggest changes to the existing environment so as to cause the
desired future to come about.

Genie is available free of charge via e-mail or on 3.5'' disk from:
Lance Chambers, Department of Transport, 136 Stirling Hwy, Nedlands,
West Australia 6007. Net: <pst...@yarrow.wt.uwa.edu.au> It is also
available by FTP from hiplab.newcastle.edu.au/pub/Genie&Code.sea.Hqx

Genitor:
"Genitor is a modular GA package containing examples for floating-
point, integer, and binary representations. Its features include many
sequencing operators as well as subpopulation modeling.

The Genitor Package has code for several order based CROSSOVER
operators, as well as example code for doing some small TSPs to
optimality.

We are planning to release a new and improved Genitor Package this
summer (1993), but it will mainly be additions to the current package
that will include parallel island models, cellular GAs, delta coding,
perhaps CHC (depending on the legal issues) and some other things we
have found useful."

Genitor is available from Colorado State University Computer Science
Department by FTP from ftp.cs.colostate.edu/pub/GENITOR.tar

Please direct all comments and questions to
<math...@cs.colostate.edu>. If these fail to work, contact: L.
Darrell Whitley, Dept. of Computer Science, Colorado State
University, Fort Collins, CO 80523, USA. Net:
<whi...@cs.colostate.edu> (Unverified 8/94).

GENlib:
GENlib is a library of functions for GENETIC ALGORITHMs. Included
are two applications of this library to the field of neural networks.
The first one, called "cosine", uses a genetic algorithm to train a
simple three-layer feed-forward network to compute the cosine function.
A backprop algorithm finds this task very difficult to learn, while
the genetic algorithm produces good results. The second one, called
"vartop", develops a neural network to perform the XOR function.
This is done with two genetic algorithms: the first one develops the
topology of the network, the second one adjusts the weights.

GENlib may be obtained by FTP from ftp.neuro.informatik.uni-
kassel.de/pub/NeuralNets/GA-and-NN/

Author: Jochen Ruhland, FG Neuronale Netzwerke / Uni Kassel,
Heinrich-Plett-Str. 40, D-34132 Kassel, Germany.
<joc...@neuro.informatik.uni-kassel.de>

GENOCOP:
This is a GA-based OPTIMIZATION package that has been developed by
Zbigniew Michalewicz and is described in detail in his book Genetic
Algorithms + Data Structures = Evolution Programs [MICHALE94].

GENOCOP (Genetic Algorithm for Numerical Optimization for COnstrained
Problems) optimizes a function with any number of linear constraints
(equalities and inequalities).

The second version of the system is available by FTP from
ftp.uncc.edu/coe/evol/genocop2.tar.Z

Zbigniew Michalewicz, Dept. of Computer Science, University of North
Carolina, Charlotte, NC, USA. Net: <zby...@uncc.edu>

GIGA:
GIGA is designed to propagate information through a POPULATION, using
CROSSOVER as its operator. A discussion of how it propagates BUILDING
BLOCKs, similar to those found in Royal Road functions by John
Holland, is given in the DECEPTION section of: "Genetic Invariance: A
New Paradigm for Genetic Algorithm Design." University of Alberta
Technical Report TR92-02, June 1992. See also: "GIGA Program
Description and Operation" University of Alberta Computing Science
Technical Report TR92-06, June 1992

These can be obtained, along with the program, by FTP from
ftp.cs.ualberta.ca/pub/TechReports/ in the subdirectories TR92-02/
and TR92-06/ .

Also, the paper "Mutation-Crossover Isomorphisms and the Construction
of Discriminating Functions" gives a more in-depth look at the
behavior of GIGA. It is available from
ftp.cs.ualberta.ca/pub/joe/Preprints/xoveriso.ps.Z

Joe Culberson, Department of Computer Science, University of Alberta,
CA. Net: <j...@cs.ualberta.ca>

GPEIST:
The GENETIC PROGRAMMING ENVIRONMENT in Smalltalk (GPEIST) provides a
framework for the investigation of Genetic Programming within a
ParcPlace VisualWorks 2.0 development system. GPEIST provides
program, POPULATION, chart and report browsers and can be run on
HP/Sun/PC (OS/2 and Windows) machines. It is possible to distribute
the experiment across several workstations - with subpopulation
exchange at intervals - in this release 4.0a. Experiments,
populations and INDIVIDUAL genetic programs can be saved to disk for
subsequent analysis and experimental statistical measures exchanged
with spreadsheets. Postscript printing of charts, programs and
animations is supported. An implementation of the Ant Trail problem
is provided as an example of the use of the GPEIST environment.

GPEIST is available from ENCORE (see Q15.3) in file:
EC/GP/src/GPEIST4.tar.gz

Contact: Tony White, Bell-Northern Research Ltd., Computer Research
Lab - Gateway, 320 March Road, Suite 400, Kanata, Ontario, Canada,
K2K 2E3. tel: (613) 765-4279 <ar...@bnr.ca>

Imogene:
Imogene is a Windows 3.1 shareware program which generates pretty
images using GENETIC PROGRAMMING. The program displays GENERATIONs
of 9 images, each generated using a formula applied to each pixel.
(The formulae are initially randomly computed). You can then select
those images you prefer. In the next generation, the nine images are
generated by combining and mutating the formulae for the most-
preferred images in the previous generation. The result is a
SIMULATION of natural SELECTION in which images evolve toward your
aesthetic preferences.

Imogene supports different color maps, palette animation, saving
images to .BMP files, changing the wallpaper to nice images, printing
images, and several other features. Imogene works only in 256 color
mode and requires a floating point coprocessor and a 386 or better
CPU.

Imogene is based on work originally done by Karl Sims at
(ex-)Thinking Machines for the CM-2 massively parallel computer - but
you can use it on your PC. You can get Imogene from:
http://www.aracnet.com/~wwir/software.html

Contact: Harley Davis, ILOG S.A., 2 Avenue Gallini, BP 85, 94253
Gentilly Cedex, France. tel: +33 1 46 63 66 66 <da...@ilog.fr>

JAG:
This Java program implements a simple GENETIC ALGORITHM where the
FITNESS function takes non-negative values only. It employs ELITISM.
The Java code was derived from the C code in the Appendix of Genetic
Algorithms + Data Structures = Evolution Programs, [MICHALE94].
Other ideas and code were drawn from GAC by Bill Spears.

Four sample problems are contained in the code: three with bit GENEs
and one with double genes. To use this program, modify the class
MyChromosome to include your problem, which you have coded in some
class, say YourChromosome. All changes to the sGA.java file to run
your problem are confined to your class YourChromosome. This is what
object-oriented programming is all about! The sGA.java source code
file has a big comment at the end containing some sample runs.

Available by FTP from ftp.mcs.drexel.edu/pub/shartley/simpleGA.tar.gz
. Further information from Stephen J. Hartley
<shar...@mcs.drexel.edu>, http://www.mcs.drexel.edu/~shartley .
Drexel University, Math and Computer Science Department Philadelphia,
PA 19104 USA. +1-215-895-2678

LibGA:
LibGA is a library of routines written in C for developing GENETIC
ALGORITHMs. It is fairly simple to use, with many knobs to turn.
Most GA parameters can be set or changed via a configuration file,
with no need to recompile. (E.g., operators, pool size and even the
data type used in the CHROMOSOME can be changed in the configuration
file.) Function pointers are used for the GENETIC OPERATORs, so they
can easily be manipulated on the fly. Several genetic operators are
supplied and it is easy to add more. LibGA runs on many
systems/architectures. These include Unix, DOS, NeXT, and Amiga.
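
The idea of selecting genetic operators through function pointers can
be sketched in a few lines of C. This is only an illustration of the
technique, not LibGA's actual API: the names mutation_fn, ga_config,
flip_bit, swap_with_next and apply_mutation are all hypothetical.

```c
#include <stddef.h>

/* Hypothetical operator signature: mutate genes[0..len-1] at position pos. */
typedef void (*mutation_fn)(unsigned char *genes, size_t len, size_t pos);

/* Flip a 0/1 gene in place. */
void flip_bit(unsigned char *genes, size_t len, size_t pos)
{
    if (pos < len)
        genes[pos] = !genes[pos];
}

/* Swap gene pos with its right-hand neighbour, wrapping at the end. */
void swap_with_next(unsigned char *genes, size_t len, size_t pos)
{
    if (len >= 2 && pos < len) {
        size_t next = (pos + 1) % len;
        unsigned char tmp = genes[pos];
        genes[pos] = genes[next];
        genes[next] = tmp;
    }
}

/* Hypothetical configuration record holding the current operator. */
typedef struct {
    mutation_fn mutate;
} ga_config;

/* Call whichever operator is currently installed, if any. */
void apply_mutation(const ga_config *cfg, unsigned char *genes,
                    size_t len, size_t pos)
{
    if (cfg != NULL && cfg->mutate != NULL)
        cfg->mutate(genes, len, pos);
}
```

Assigning a different function to cfg.mutate between two calls changes
the operator without recompiling anything else, which is the property
the description above refers to.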

LibGA Version 1.00 is available by FTP from
ftp.aic.nrl.navy.mil/pub/galist/src/ga/libga100.tar.Z or by email
request to its author, Art Corcoran <corc...@penguin.mcs.utulsa.edu>
(Unverified 8/94).

LICE:
LICE is a parameter OPTIMIZATION program based on EVOLUTION
STRATEGIEs (ES). In contrast to classic ES, LICE has a local
SELECTION scheme to prevent premature stagnation. Details and results
were presented at the EP'94 conference in San Diego. LICE is written
in ANSI-C (more or less), and has been tested on Sparc-stations and
Linux-PCs. If you want plots and graphics, you need X11 and gnuplot.
If you want a nice user interface to create parameter files, you also
need Tk/Tcl.

LICE-1.0 is available as source code by FTP from
lumpi.informatik.uni-dortmund.de/pub/ES/src/LICE-1.0.tar.gz

Author: Joachim Sprave <j...@ls11.informatik.uni-dortmund.de>

Matlab-GA:
The MathWorks FTP site has some Matlab GA code in the directory
ftp.mathworks.com/pub/contrib/v4/optim/genetic It's a bunch of .m
files that implement a basic GA. Contact: Andrew Potvin,
<pot...@mathworks.com> for information.

mGA:
mGA is an implementation of a messy GA as described in TCGA report
No. 90004. Messy GAs overcome the linkage problem of simple GENETIC
ALGORITHMs by combining variable-length strings, GENE expression,
messy operators, and a nonhomogeneous phasing of evolutionary
processing. Results on a number of difficult deceptive test
functions have been encouraging with the messy GA always finding
global optima in a polynomial number of function evaluations.

See TCGA reports 89003, 90005, 90006, and 91004, and IlliGAL report
91008 for more information on messy GAs (See Q14). The C language
version is available by FTP from IlliGAL in the directory
gal4.ge.uiuc.edu/pub/src/messyGA/C/

Contact: Dave Goldberg <gold...@vmd.cso.uiuc.edu>

PARAGenesis:
PARAGenesis is the result of a project implementing Genesis on the
CM-200 in C*. It is an attempt to improve PERFORMANCE as much as
possible without changing the behavior of the GENETIC ALGORITHM.
Unlike the punctuated equilibria and local SELECTION models,
PARAGenesis doesn't modify the genetic algorithm to be more
parallelizable as these modifications can drastically alter the
behavior of the algorithm. Instead each member is placed on a
separate processor allowing initialization, evaluation and MUTATION
to be completely parallel. The costs of global control and
communication in selection and CROSSOVER are present but minimized as
much as possible. In general PARAGenesis on an 8k CM-200 seems to run
10-100 times faster than Genesis on a Sparc 2 and finds equivalent
solutions.

PARAGenesis includes all the features of serial Genesis plus some
additions. The additions include the ability to collect timing
STATISTICS, probabilistic selection (as opposed to Baker's stochastic
universal sampling), uniform crossover and local or neighborhood
selection. Anyone familiar with the serial implementation of Genesis
and C* should have little problem using PARAGenesis.
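
For readers unfamiliar with Baker's stochastic universal sampling,
mentioned above, here is a minimal serial sketch. It is not taken from
Genesis or PARAGenesis; the name sus_select is hypothetical, and the
random offset is passed in as the parameter start (an assumption made
so that the example is deterministic).

```c
#include <stddef.h>

/* Choose n individuals from pop candidates with non-negative fitness.
   n equally spaced pointers are laid over the cumulative fitness line;
   start must lie in [0, total/n) and would normally be drawn at random.
   Indices of the chosen individuals go to chosen[0..n-1].
   Returns 0 on success, -1 on invalid input. */
int sus_select(const double *fitness, size_t pop, size_t n,
               double start, size_t *chosen)
{
    double total = 0.0, step, cum;
    size_t i, k;

    if (pop == 0 || n == 0)
        return -1;
    for (i = 0; i < pop; i++) {
        if (fitness[i] < 0.0)
            return -1;
        total += fitness[i];
    }
    if (total <= 0.0)
        return -1;
    step = total / (double)n;
    if (start < 0.0 || start >= step)
        return -1;

    i = 0;
    cum = fitness[0];
    for (k = 0; k < n; k++) {
        double pointer = start + (double)k * step;
        /* advance to the individual whose fitness slice holds the pointer */
        while (pointer >= cum && i + 1 < pop) {
            i++;
            cum += fitness[i];
        }
        chosen[k] = i;
    }
    return 0;
}
```

Because a single random number places all n pointers, each individual
is chosen a number of times close to its expected value. Probabilistic
(roulette) selection draws a fresh random number for every choice.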

PARAGenesis is available by FTP from
ftp.aic.nrl.navy.mil/pub/galist/src/ga/paragenesis.tar.Z

DISCLAIMER: PARAGenesis is fairly untested at this point and may
contain some bugs.

Michael van Lent, Advanced Technology Lab, University of Michigan,
1101 Beal Av., Ann Arbor, MI 48109, USA. Net:
<van...@eecs.umich.edu>.

PGA:
PGA is a simple testbed for basic explorations in GENETIC ALGORITHMs.
Command line arguments control a range of parameters, and there are a
number of built-in problems for the GA to solve. The current set
includes:

o maximize the number of bits set in a CHROMOSOME

o De Jong's functions DJ1, DJ2, DJ3, DJ5

o binary F6, used by Schaffer et al

o a crude 1-d knapsack problem; you specify a target and a set of
numbers in an external file, GA tries to find a subset that sums
as closely as possible to the target

o the `royal road' function(s); a chromosome is regarded as a set of
consecutive blocks of size K, and scores K for each block entirely
filled with 1s, etc; a range of parameters.

o max contiguous bits, you choose the ALLELE range.

o timetabling, with various smart MUTATION options; capable of
solving a good many real-world timetabling problems (has done so)
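
Two of the simpler problems in the list above can be sketched as
fitness functions in C. This is not PGA's own code. It assumes the
chromosome is a character array of '0' and '1' characters, and it
covers only the basic royal road rule quoted above (score K for each
block of K consecutive 1s), not PGA's further parameters. The function
names are hypothetical.

```c
#include <stddef.h>

/* "Maximize the number of bits set": count the '1' characters. */
size_t count_ones(const char *chrom, size_t len)
{
    size_t ones = 0, i;

    for (i = 0; i < len; i++)
        if (chrom[i] == '1')
            ones++;
    return ones;
}

/* Basic royal road: split the chromosome into consecutive blocks of
   size k and score k for every block that consists entirely of '1's.
   A trailing partial block is ignored. */
size_t royal_road(const char *chrom, size_t len, size_t k)
{
    size_t score = 0, start, i;

    if (k == 0)
        return 0;
    for (start = 0; start + k <= len; start += k) {
        for (i = 0; i < k && chrom[start + i] == '1'; i++)
            ;
        if (i == k)
            score += k;
    }
    return score;
}
```

With K = 4 the string 11110111 scores 4 on the royal road function,
although 7 of its 8 bits are set: only complete blocks are rewarded.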

Lots of GA options: rank, roulette, tournament, marriage-tournament,
spatially-structured SELECTION; one-point, two-point, uniform or no
CROSSOVER; fixed or adaptive mutation; one child or two; etc.

Default output is curses-based, with optional output to file; can be
run non-interactively too for batched series of experiments.

It's easy to add your own problems. Chromosomes are represented as
character arrays, so you are not (quite) stuck with bit-string
problem encodings.

PGA has been used for teaching for a couple of years now, and has
been used as a starting point by a fair number of people for their
own projects. So it's reasonably reliable. However, if you find bugs,
or have useful contributions to make, Tell Me! It is available by FTP
from ftp.dai.ed.ac.uk/pub/pga/pga-3.1.tar.gz (see the file pga.README
in the same directory for more information)

Peter Ross, Department of AI, University of Edinburgh, 80 South
Bridge, Edinburgh EH1 1HN, UK. Net: <pe...@aisb.ed.ac.uk>

PGAPack:
PGAPack is a general-purpose, data-structure-neutral parallel GENETIC
ALGORITHM library. It is intended to provide most capabilities
desired in a genetic algorithm library, in an integrated, seamless,
and portable manner.

Features include:

o Callable from Fortran or C.

o Runs on uniprocessors, parallel computers, and workstation
networks.

o Binary-, integer-, real-, and character-valued native data
types

o Full extensibility to support custom operators and new data types.

o Easy-to-use interface for novice and application users.

o Multiple levels of access for expert users.

o Extensive debugging facilities.

o Large set of example problems.

o Detailed users guide

o Parameterized POPULATION replacement.

o Multiple choices for SELECTION, CROSSOVER, and MUTATION operators

o Easy integration of hill-climbing heuristics.

Availability: PGAPack is freely available and may be obtained by FTP
from info.mcs.anl.gov/pub/pgapack/pgapack.tar.Z or from
http://www.mcs.anl.gov/pgapack.html

Further Information from David Levine, Mathematics and Computer
Science Division, Argonne National Laboratory, Argonne, Illinois
60439, (708)-252-6735 <lev...@mcs.anl.gov>
http://www.mcs.anl.gov/home/levine

REGAL:
REGAL (RElational Genetic Algorithm Learner) is a distributed GA-
based system, designed for learning multi-modal First Order Logic
concept descriptions from examples. REGAL is based on a SELECTION
operator, called Universal Suffrage operator, provably allowing the
POPULATION to asymptotically converge, on average, to an equilibrium
state, in which several SPECIES coexist. REGAL makes use of PVM 3.3
and Tcl/Tk. This version of REGAL is provided with a graphical user
interface developed in Tcl/Tk language.

REGAL has been jointly developed by: Attilio Giordana
<att...@di.unito.it> http://www.di.unito.it/~attilio/ and Filippo
Neri <ne...@di.unito.it> http://www.di.unito.it/~neri/ at the
University of Torino, Dipartimento di Informatica, Italy.

Archive-name: ai-faq/genetic/part6
Last-Modified: 9/20/00
Issue: 8.2

Important note: Do NOT send email to the cs.cf.ac.uk address above: it will
be ignored. Corrections and other correspondence should be sent to
david....@iee.org

TABLE OF CONTENTS OF PART 6

Q21: What are Gray codes, and why are they used?

Q22: What test data is available?

Q42: What is Life all about?
Q42b: Is there a FAQ to this group?

Q98: Are there any patents on EAs?

Q99: A Glossary on EAs?

----------------------------------------------------------------------

Subject: Q21: What are Gray codes, and why are they used?

The correct spelling is "Gray"---not "gray", "Grey", or "grey"---
since Gray codes are named after the Frank Gray who patented their
use for shaft encoders in 1953 [1]. Gray codes actually have a
longer history, and the inquisitive reader may want to look up the
August, 1972, issue of Scientific American, which contains two
articles of interest: one on the origin of binary codes [2], and
another by Martin Gardner on some entertaining aspects of Gray
codes [3]. Other references containing descriptions of Gray codes
and more modern, non-GA, applications include the second edition of
Numerical Recipes [4], Horowitz and Hill [5], Kozen [6], and
Reingold [7].

A Gray code represents each number in the sequence of integers
{0...2^N-1} as a binary string of length N in an order such that
adjacent integers have Gray code representations that differ in only
one bit position. Marching through the integer sequence therefore
requires flipping just one bit at a time. Some call this defining
property of Gray codes the "adjacency property" [8].

Example (N=3): The binary coding of {0...7} is {000, 001, 010, 011,
100, 101, 110, 111}, while one Gray coding is {000, 001, 011, 010,
110, 111, 101, 100}. In essence, a Gray code takes a binary sequence
and shuffles it to form some new sequence with the adjacency
property. There exist, therefore, multiple Gray codings for
any given N. The example shown here belongs to a class of Gray
codes that goes by the fancy name "binary-reflected Gray codes".
These are the most commonly seen Gray codes, and one simple
scheme for generating such a Gray code sequence says, "start with
all bits zero and successively flip the right-most bit that produces
a new string."
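
The generation rule quoted above can be tried out directly. The
following is a minimal sketch, not an efficient generator: it keeps a
table of the strings already produced and, at each step, flips the
right-most bit that leads to an unseen string. The function name
gray_by_flipping and the limit of 16 bits are assumptions made for
this example.

```c
#include <stdlib.h>

/* Fill seq[0 .. 2^nbits - 1] with the Gray code sequence obtained by
   starting at zero and repeatedly flipping the right-most bit that
   produces a string not seen before. Bit 0 is the right-most bit.
   Returns 0 on success, -1 on bad input or allocation failure. */
int gray_by_flipping(unsigned nbits, unsigned *seq)
{
    unsigned long count, k;
    unsigned char *seen;
    unsigned cur = 0, bit;

    if (nbits == 0 || nbits > 16 || seq == NULL)
        return -1;
    count = 1UL << nbits;
    seen = calloc(count, 1);
    if (seen == NULL)
        return -1;

    seq[0] = cur;
    seen[cur] = 1;
    for (k = 1; k < count; k++) {
        for (bit = 0; bit < nbits; bit++) {
            unsigned cand = cur ^ (1u << bit);
            if (!seen[cand]) {
                cur = cand;
                break;
            }
        }
        if (bit == nbits) {         /* no flip gives a new string */
            free(seen);
            return -1;
        }
        seen[cur] = 1;
        seq[k] = cur;
    }
    free(seen);
    return 0;
}
```

For nbits = 3 this yields 0, 1, 3, 2, 6, 7, 5, 4, that is {000, 001,
011, 010, 110, 111, 101, 100}, the Gray coding shown in the example.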

Hollstien [9] investigated the use of GAs for optimizing functions of
two variables and claimed that a Gray code representation worked
slightly better than the binary representation. He attributed this
difference to the adjacency property of Gray codes. Notice in the
above example that the step from three to four requires the flipping
of all the bits in the binary representation. In general, adjacent
integers in the binary representation often lie many bit flips apart.
This fact makes it less likely that a MUTATION operator can effect
small changes for a binary-coded INDIVIDUAL.

A Gray code representation seems to improve a mutation operator's
chances of making incremental improvements, and a close examination
suggests why. In a binary-coded string of length N, a single
mutation in the most significant bit (MSB) alters the number by
2^(N-1). In a Gray-coded string, fewer mutations lead to a change
this large. The user of Gray codes does, however, pay a price for
this feature: those "fewer mutations" lead to much larger changes.
In the Gray code illustrated above, for example, a single mutation of
the left-most bit changes a zero to a seven and vice-versa, while the
largest change a single mutation can make to a corresponding binary-
coded individual is always four. One might still view this aspect of
Gray codes with some favor: most mutations will make only small
changes, while the occasional mutation that effects a truly big
change may initiate EXPLORATION of an entirely new region in the
space of CHROMOSOMEs.
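
The trade-off described above can be checked by brute force for small
N. The sketch below assumes the binary-reflected Gray code from the
example and treats a chromosome as an unsigned integer bit pattern;
the function names are hypothetical. One function finds the largest
change in decoded value that any single bit flip can cause, the other
counts how many decoded values v can reach v+1 by a single bit flip.

```c
/* Decode a binary-reflected Gray code word to its integer value. */
static unsigned gray_decode(unsigned g)
{
    unsigned b = g;

    while (g >>= 1)
        b ^= g;
    return b;
}

/* Value represented by a bit pattern under the chosen coding. */
static unsigned decode(unsigned pattern, int use_gray)
{
    return use_gray ? gray_decode(pattern) : pattern;
}

/* Largest |change in value| caused by one bit flip (nbits < 16). */
unsigned max_flip_change(unsigned nbits, int use_gray)
{
    unsigned best = 0, p, bit;

    for (p = 0; p < (1u << nbits); p++)
        for (bit = 0; bit < nbits; bit++) {
            unsigned a = decode(p, use_gray);
            unsigned b = decode(p ^ (1u << bit), use_gray);
            unsigned d = a > b ? a - b : b - a;
            if (d > best)
                best = d;
        }
    return best;
}

/* Number of values v for which some single bit flip yields v + 1. */
unsigned count_reachable_successors(unsigned nbits, int use_gray)
{
    unsigned count = 0, p, bit;

    for (p = 0; p < (1u << nbits); p++) {
        unsigned a = decode(p, use_gray);
        for (bit = 0; bit < nbits; bit++)
            if (decode(p ^ (1u << bit), use_gray) == a + 1) {
                count++;
                break;
            }
    }
    return count;
}
```

For N = 3 the largest single-flip change is 4 under binary coding and
7 under Gray coding, as stated above. In exchange, all 7 values below
the maximum can step to their successor with one flip under Gray
coding, against only 4 (the even values) under binary coding.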

The algorithm for converting between the binary-reflected Gray code
described above and the standard binary code turns out to be
surprisingly simple to state. First label the bits of a binary-coded
string B[i], where larger i's represent more significant bits, and
similarly label the corresponding Gray-coded string G[i]. We convert
one to the other as follows: Copy the most significant bit. Then
for each smaller i do either G[i] = XOR(B[i+1], B[i])---to convert
binary to Gray---or B[i] = XOR(B[i+1], G[i])---to convert Gray to
binary.

One may easily implement the above algorithm in C. Imagine you do
something like

typedef unsigned short allele;

and then use type "allele" for each bit in your chromosome. The
following two functions will then convert between binary and Gray
code representations. You must pass them the address of the
high-order bit of each of the two strings, as well as n, the index
of the high-order bit (each string holds n+1 bits). (See the comment
statements for examples.) NB: These functions assume a chromosome
arranged as shown in the following illustration.

index:    C[9]                                                  C[0]
        *-----------------------------------------------------------*
Char C: |  1  |  1  |  0  |  0  |  1  |  0  |  1  |  0  |  0  |  0  |
        *-----------------------------------------------------------*
         ^^^^^                                                 ^^^^^
    high-order bit                                         low-order bit

C CODE
/* Gray <==> binary conversion routines */
/* written by Dan T. Abell, 7 October 1993 */
/* please send any comments or suggestions */
/* to dab...@quark.umd.edu */

/* convert chromosome of length n+1 */
/* from Gray code Cg[0...n] */
/* to binary code Cb[0...n] */
/* call as: gray_to_binary(&Cg[n], &Cb[n], n) */
void gray_to_binary(allele *Cg, allele *Cb, int n)
{
    int j;

    *Cb = *Cg;                  /* copy the high-order bit */
    for (j = 0; j < n; j++) {
        Cb--; Cg--;             /* for the remaining bits */
        *Cb = *(Cb+1) ^ *Cg;    /* do the appropriate XOR */
    }
}

/* convert chromosome of length n+1 */
/* from binary code Cb[0...n] */
/* to Gray code Cg[0...n] */
/* call as: binary_to_gray(&Cb[n], &Cg[n], n) */
void binary_to_gray(allele *Cb, allele *Cg, int n)
{
    int j;

    *Cg = *Cb;                  /* copy the high-order bit */
    for (j = 0; j < n; j++) {
        Cg--; Cb--;             /* for the remaining bits */
        *Cg = *(Cb+1) ^ *Cb;    /* do the appropriate XOR */
    }
}
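
When a whole chromosome fits in one machine word, the same XOR rule
can be applied to all bits at once. The sketch below is not part of
the routines above: the function names are hypothetical, and it
assumes the binary-reflected Gray code described in this answer, with
bit 0 of the word as the low-order bit.

```c
/* Binary to Gray: G[i] = XOR(B[i+1], B[i]) for every bit at once.
   Shifting right by one lines B[i+1] up with B[i]; the top bit is
   XORed with zero and is therefore copied unchanged. */
unsigned binary_to_gray_word(unsigned b)
{
    return b ^ (b >> 1);
}

/* Gray to binary: B[i] = XOR(B[i+1], G[i]), so each binary bit is
   the XOR of all Gray bits at or above its position. */
unsigned gray_to_binary_word(unsigned g)
{
    unsigned b = g;

    while (g >>= 1)
        b ^= g;
    return b;
}
```

For example, binary 100 (four) becomes Gray 110, and Gray 100 decodes
to binary 111 (seven), in agreement with the N=3 table given earlier.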

References

[1] F. Gray, "Pulse Code Communication", U. S. Patent 2 632 058,
March 17, 1953.

[2] F. G. Heath, "Origins of the Binary Code", Scientific American
v.227,n.2 (August, 1972) p.76.

[3] Martin Gardner, "Mathematical Games", Scientific American
v.227,n.2 (August, 1972) p.106.

[4] William H. Press, et al., Numerical Recipes in C, Second Edition
(Cambridge University Press, 1992).

[5] Paul Horowitz and Winfield Hill, The Art of Electronics, Second
Edition (Cambridge University Press, 1989).

[6] Dexter Kozen, The Design and Analysis of Algorithms (Springer-
Verlag, New York, NY, 1992).

[7] Edward M. Reingold, et al., Combinatorial Algorithms (Prentice
Hall, Englewood Cliffs, NJ, 1977).

[8] David E. Goldberg, Genetic Algorithms in Search, Optimization,
and Machine Learning (Addison-Wesley, Reading, MA, 1989).

[9] R. B. Hollstien, Artificial Genetic Adaptation in Computer
Control Systems (PhD thesis, University of Michigan, 1971).

[10] Albert Nijenhuis and Herbert S. Wilf, Combinatorial Algorithms,
(Academic Press, Inc., New York, San Francisco, London 1975).

------------------------------

Subject: Q22: What test data is available?

TSP DATA
There is a TSP library (TSPLIB) available which has many solved and
semi-solved TSPs and different variants. The library is maintained by
Gerhard Reinelt <rei...@ares.iwr.Uni-Heidelberg.de>. It is available
from various FTP sites, including:
softlib.cs.rice.edu/pub/tsplib/tsplib.tar

OPERATIONAL RESEARCH DATA
Information about Operational Research test problems in a wide
variety of areas can be obtained by emailing <o.rli...@ic.ac.uk>
with the body of the email message being just the word "info". The
files in OR-Library are also available via anonymous FTP from
mscmga.ms.ic.ac.uk/pub/ A WWW page is also available at URL:
http://mscmga.ms.ic.ac.uk/info.html Instructions on how to use OR-
Library can be found in the file "paper.txt", or in the article:
J.E.Beasley, "OR-Library: distributing test problems by electronic
mail", Journal of the Operational Research Society 41(11) (1990)
pp1069-1072.

The following is a list of some of the topics covered.
File                Problem area

assigninfo.txt      Assignment problem
deainfo.txt         Data envelopment analysis
gapinfo.txt         Generalised assignment problem
mipinfo.txt         Integer programming
lpinfo.txt          Linear programming
scpinfo.txt         Set covering
sppinfo.txt         Set partitioning
tspinfo.txt         Travelling salesman problem
periodtspinfo.txt   Period travelling salesman problem
netflowinfo.txt     Network flow problem

Location:
capmstinfo.txt      capacitated minimal spanning tree
capinfo.txt         capacitated warehouse location
pmedinfo.txt        p-median
uncapinfo.txt       uncapacitated warehouse location
mknapinfo.txt       Multiple knapsack problem
qapinfo.txt         Quadratic assignment problem
rcspinfo.txt        Resource constrained shortest path
phubinfo.txt        p-hub location problem

Scheduling:
airlandinfo.txt     Aircraft Landing Problem
cspinfo.txt         Crew scheduling
flowshopinfo.txt    flow shop
jobshopinfo.txt     job shop
openshopinfo.txt    open shop
tableinfo.txt       timetabling problem

Steiner:
esteininfo.txt      Euclidean Steiner problem
rsteininfo.txt      Rectilinear Steiner problem
steininfo.txt       Steiner problem in graphs

Two-dimensional cutting:
assortinfo.txt      assortment problem
cgcutinfo.txt       constrained guillotine
ngcutinfo.txt       constrained non-guillotine
gcutinfo.txt        unconstrained guillotine

Vehicle routing:
areainfo.txt        fixed areas
fixedinfo.txt       fixed routes
periodinfo.txt      period routing
vrpinfo.txt         single period
multivrpinfo.txt    multiple depot vehicle routing problem

OTHER DATA
William Spears <spe...@aic.nrl.navy.mil> maintains a WWW page titled:
Test Functions for Evolutionary Algorithms which contains links to
various sources of test functions.
http://www.aic.nrl.navy.mil:80/~spears/functs.html

ENCORE (see Q15.3) also contains some test data. See directories
under /etc/data/

------------------------------

Subject: Q42: What is Life all about?

42

References
Adams, D. (1979) "The Hitch Hiker's Guide to the Galaxy", London: Pan
Books.

Adams, D. (1980) "The Restaurant at the End of the Universe", London:
Pan Books.

Adams, D. (1982) "Life, the Universe and Everything", London: Pan
Books.

Adams, D. (1984) "So long, and thanks for all the Fish", London: Pan
Books.

Adams, D. (1992) "Mostly Harmless", London: Heinemann.

------------------------------

Subject: Q42b: Is there a FAQ to this group?

Yes.

------------------------------

Subject: Q98: Are there any patents on EAs?

Process patents have been issued both for the Bucket Brigade
Algorithm in CLASSIFIER SYSTEMs: U.S. patent #4,697,242: J.H. Holland
and A. Burks, "Adaptive computing system capable of learning and
discovery", 1985, issued Sept 29 1987; and for GP: U.S. patent
#4,935,877 (to John Koza).

This FAQ does not attempt to provide legal advice. However, use of
the Lisp code in the book [KOZA92] is freely licensed for academic
use. Although those wishing to make commercial use of any process
should obviously consult any patent holders in question, it is pretty
clear that it's not in anyone's best interests to stifle GA/GP
research and/or development. Commercial licenses much like those used
for CAD software can presumably be obtained for the use of these
processes where necessary.

Jarmo Alander's massive bibliography of GAs (see Q10.8) includes a
(probably) complete list of all currently known patents. There is
also a periodic posting on comp.ai.neural-nets by Gregory Aharonian
<src...@world.std.com> about patents on Artificial Neural Networks
(ANNs).

------------------------------

Subject: Q99: A Glossary on EAs?

A very good glossary of genetics terminology can be found at
http://helios.bto.ed.ac.uk/bto/glossary

1
1/5 SUCCESS RULE:
Derived by I. Rechenberg, the suggestion that when Gaussian
MUTATIONs are applied to real-valued vectors in searching for
the minimum of a function, a rule-of-thumb to attain good rates
of error convergence is to adapt the STANDARD DEVIATION of
mutations to generate one superior solution out of every five
attempts.
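
A minimal sketch of how such a rule is commonly applied: after a batch
of mutations, count how many produced a superior solution, then widen
the standard deviation if more than one in five succeeded and narrow
it if fewer did. The function name adapt_sigma and the adjustment
factor 0.85 are assumptions made for this example; they are not
prescribed by the definition above.

```c
/* Return the adapted standard deviation after observing `successes`
   improving mutations out of `trials` attempts.
   The factor c = 0.85 is an assumed, commonly quoted choice. */
double adapt_sigma(double sigma, unsigned successes, unsigned trials)
{
    const double c = 0.85;

    if (trials == 0)
        return sigma;               /* nothing observed, no change */
    if (5 * successes > trials)
        return sigma / c;           /* success rate above 1/5: widen */
    if (5 * successes < trials)
        return sigma * c;           /* success rate below 1/5: narrow */
    return sigma;                   /* exactly 1/5: leave unchanged */
}
```

The integer comparison 5 * successes against trials avoids any
floating-point test for a rate of exactly one fifth.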

A
ADAPTIVE BEHAVIOUR:
"...underlying mechanisms that allow animals, and potentially,
ROBOTs to adapt and survive in uncertain environments" --- Meyer
& Wilson (1991), [SAB90]

AI: See ARTIFICIAL INTELLIGENCE.

ALIFE:
See ARTIFICIAL LIFE.

ALLELE :
(biol) Each GENE is able to occupy only a particular region of a
CHROMOSOME, its locus. At any given locus there may exist, in
the POPULATION, alternative forms of the gene. These alternatives
are called alleles of one another.

(EC) The value of a gene. Hence, for a binary representation,
each gene may have an ALLELE of 0 or 1.

ARTIFICIAL INTELLIGENCE:
"...the study of how to make computers do things at which, at
the moment, people are better" --- Elaine Rich (1988)

ARTIFICIAL LIFE:
Term coined by Christopher G. Langton for his 1987 [ALIFEI]
conference. In the preface of the proceedings he defines ALIFE
as "...the study of simple computer generated hypothetical life
forms, i.e. life-as-it-could-be."

B
BUILDING BLOCK:
(EC) A small, tightly clustered group of GENEs which have co-
evolved in such a way that their introduction into any
CHROMOSOME will be likely to give increased FITNESS to that
chromosome.

The "building block hypothesis" [GOLD89] states that GAs find
solutions by first finding as many BUILDING BLOCKs as possible,
and then combining them together to give the highest fitness.

C
CENTRAL DOGMA:
(biol) The dogma that nucleic acids act as templates for the
synthesis of proteins, but never the reverse. More generally,
the dogma that GENEs exert an influence over the form of a body,
but the form of a body is never translated back into genetic
code: acquired characteristics are not inherited. cf LAMARCKISM.

(GA) The dogma that the behaviour of the algorithm must be
analysed using the SCHEMA THEOREM.

(life in general) The dogma that this all is useful in a way.

"You guys have a dogma. A certain irrational set of beliefs.
Well, here's my irrational set of beliefs. Something that
works."
--- Rodney A. Brooks, [LEVY92]

CFS: See CLASSIFIER SYSTEM.

CHROMOSOME:
(biol) One of the chains of DNA found in cells. CHROMOSOMEs
contain GENEs, each encoded as a subsection of the DNA chain.
Chromosomes are usually present in all cells in an organism,
even though only a minority of them will be active in any one
cell.

(EC) A datastructure which holds a `string' of task parameters,
or genes. This may be stored, for example, as a binary bit-
string, or an array of integers.

CLASSIFIER SYSTEM:
A system which takes a (set of) inputs, and produces a (set of)
outputs which indicate some classification of the inputs. An
example might take inputs from sensors in a chemical plant, and
classify them in terms of: 'running ok', 'needs more water',
'needs less water', 'emergency'. See Q1.4 for more information.

COMBINATORIAL OPTIMIZATION:
Some tasks involve combining a set of entities in a specific way
(e.g. the task of building a house). A general combinatorial
task involves deciding (a) the specifications of those entities
(e.g. what size, shape, material to make the bricks from), and
(b) the way in which those entities are brought together (e.g.
the number of bricks, and their relative positions). If the
resulting combination of entities can in some way be given a
FITNESS score, then COMBINATORIAL OPTIMIZATION is the task of
designing a set of entities, and deciding how they must be
configured, so as to give maximum fitness. cf ORDER-BASED
PROBLEM.

COMMA STRATEGY:
Notation originally proposed in EVOLUTION STRATEGIEs, when a
POPULATION of "mu" PARENTs generates "lambda" OFFSPRING and the
mu parents are discarded, leaving only the lambda INDIVIDUALs to
compete directly. Such a process is written as a (mu,lambda)
search. A process in which only the offspring compete is thus a
"comma strategy." cf. PLUS STRATEGY.

CONVERGED:
A GENE is said to have CONVERGED when 95% of the CHROMOSOMEs in
the POPULATION all contain the same ALLELE for that gene. In
some circumstances, a population can be said to have converged
when all genes have converged. (However, this is not true of
populations containing multiple SPECIES, for example.)

Most people use "convergence" fairly loosely, to mean "the GA
has stopped finding new, better solutions". Of course, if you
wait long enough, the GA will *eventually* find a better
solution (unless you have already found the global optimum).
What people really mean is "I'm not willing to wait for the GA
to find a new, better solution, because I've already waited
longer than I wanted to and it hasn't improved in ages."

An interesting discussion on convergence by Michael Vose can be
found in GA-Digest v8n22, available from
ftp.aic.nrl.navy.mil/pub/galist/digests/v8n22
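
The 95% criterion for a converged gene can be checked mechanically. The following is only a sketch: it assumes chromosomes are equal-length sequences (strings or lists), and the function names are illustrative, not taken from any GA package.

```python
from collections import Counter

def converged_genes(population, threshold=0.95):
    """Return one boolean per gene position: True if that gene has converged."""
    if not population:
        raise ValueError("population must not be empty")
    size = len(population)
    flags = []
    # zip(*population) groups the alleles found at each gene position.
    for alleles in zip(*population):
        top_count = Counter(alleles).most_common(1)[0][1]
        flags.append(top_count >= threshold * size)
    return flags

def population_converged(population, threshold=0.95):
    """True when every gene has converged (single-species case only)."""
    return all(converged_genes(population, threshold))
```

As noted above, the whole-population test is not meaningful when the population deliberately contains multiple SPECIES.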

CONVERGENCE VELOCITY:
The rate of error reduction.

COOPERATION:
The behavior of two or more INDIVIDUALs acting to increase the
gains of all participating individuals.

CROSSOVER:
(EC) A REPRODUCTION OPERATOR which forms a new CHROMOSOME by
combining parts of each of two `parent' chromosomes. The
simplest form is single-point CROSSOVER, in which an arbitrary
point in the chromosome is picked. All the information from
PARENT A is copied from the start up to the crossover point,
then all the information from parent B is copied from the
crossover point to the end of the chromosome. The new chromosome
thus gets the head of one parent's chromosome combined with the
tail of the other. Variations exist which use more than one
crossover point, or combine information from parents in other
ways.
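
Single-point crossover as described above might be sketched as follows. This assumes list-based chromosomes of equal length; the function name and the optional `rng` argument are illustrative choices.

```python
import random

def single_point_crossover(parent_a, parent_b, rng=random):
    """Return (child, point): head of parent_a joined to tail of parent_b."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have the same length")
    if len(parent_a) < 2:
        raise ValueError("need at least two genes to cut between")
    # Choose a cut strictly inside the chromosome so both parents contribute.
    point = rng.randint(1, len(parent_a) - 1)
    child = parent_a[:point] + parent_b[point:]
    return child, point
```

Many implementations also build a second child from the complementary pieces (head of B, tail of A); that is omitted here for brevity.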

(biol) A complicated process which typically takes place as
follows: chromosomes, while engaged in the production of
GAMETEs, exchange portions of genetic material. The result is
that an almost infinite variety of gametes may be produced.
Subsequently, during sexual REPRODUCTION, male and female
gametes (i.e. sperm and ova) fuse to produce a new DIPLOID cell
with a pair of chromosomes.

In [HOLLAND92] the sentence "When sperm and ova fuse, matching
chromosomes line up with one another their length, thus swapping
genetic material" is thus wrong, since these two activities
occur in different parts of the life cycle. [eds note: If
sexual reproduction (the Real Thing) worked like in GAs, then
Holland would be right, but as we all know, it's not the
case. We just encountered a Freudian slip of a Grandmaster.
BTW: even the German translation of this article has this
"bug", although it's well-hidden by the translator.]

CS: See CLASSIFIER SYSTEM.

D
DARWINISM:
(biol) Theory of EVOLUTION, proposed by Darwin, that evolution
comes about through random variation of heritable
characteristics, coupled with natural SELECTION (survival of the
fittest). A physical mechanism for this, in terms of GENEs and
CHROMOSOMEs, was discovered many years later. DARWINISM was
combined with the selectionism of Weismann and the genetics of
Mendel to form the Neo-Darwinian Synthesis during the
1930s-1950s by T. Dobzhansky, E. Mayr, G. Simpson, R. Fisher, S.
Wright, and others. cf LAMARCKISM.

The talk.origins FAQ contains more details (See Q10.7). Also,
the "Dictionary of Darwinism and of Evolution" (Ed. by Patrick
Tort) was published in early 1996. It contains a vast amount of
information about what Darwinism is and (perhaps more
importantly) is not. Further information from
http://www.planete.net/~ptort/darwin/evolengl.html (in various
languages).

(EC) Theory which inspired all branches of EC.

DECEPTION:
The condition where the combination of good BUILDING BLOCKs
leads to reduced FITNESS, rather than increased fitness.
Proposed by [GOLD89] as a reason for the failure of GAs on many
tasks.

DIPLOID:
(biol) This refers to a cell which contains two copies of each
CHROMOSOME. The copies are homologous i.e. they contain the
same GENEs in the same sequence. In many sexually reproducing
SPECIES, the genes in one of the sets of chromosomes will have
been inherited from the father's GAMETE (sperm), while the genes
in the other set of chromosomes are from the mother's gamete
(ovum).

DNA: (biol) Deoxyribonucleic Acid, a double stranded macromolecule of
helical structure (comparable to a spiral staircase). Both
single strands are linear, unbranched nucleic acid molecules
built up from alternating deoxyribose (sugar) and phosphate
molecules. Each deoxyribose part is coupled to a nucleotide
base, which is responsible for establishing the connection to
the other strand of the DNA. The 4 nucleotide bases Adenine
(A), Thymine (T), Cytosine (C) and Guanine (G) are the alphabet
of the genetic information. The sequence of these bases in the
DNA molecule determines the building plan of any organism. [eds
note: suggested reading: James D. Watson (1968) "The Double
Helix", London: Weidenfeld and Nicolson]

(literature) Douglas Noel Adams, contemporary Science Fiction
comedy writer. Published "The Hitch-Hiker's Guide to the Galaxy"
when he was 25 years old, which made him one of the currently
most successful British authors. [eds note: interestingly
Watson was also 25 years old when he co-discovered the structure of DNA; both
events are probably not interconnected; you might also want to
look at: Neil Gaiman's (1987) "DON'T PANIC -- The Official
Hitch-Hiker's Guide to the Galaxy companion", and of course get
your hands on the wholly remarkable FAQ in alt.fan.douglas-adams
]

DNS: (biol) Desoxyribonukleinsaeure, German for DNA.

(comp) The Domain Name System, a distributed database system for
translating computer names (e.g. lumpi.informatik.uni-
dortmund.de) into numeric Internet, i.e. IP-addresses
(129.217.36.140) and vice-versa. DNS allows you to hook into
the net without remembering long lists of numeric references,
unless your system administrator has incorrectly set-up your
site's system.

E
EA: See EVOLUTIONARY ALGORITHM.

EC: See EVOLUTIONARY COMPUTATION.

ELITISM:
ELITISM (or an elitist strategy) is a mechanism which is
employed in some EAs which ensures that the CHROMOSOMEs of the
most highly fit member(s) of the POPULATION are passed on to the
next GENERATION without being altered by GENETIC OPERATORs.
Using elitism ensures that the best FITNESS in the population
can never get worse from one generation to the next. Elitism
usually brings about a more rapid convergence of the population.
In some applications elitism improves the chances of locating an
optimal INDIVIDUAL, while in others it reduces it.
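
A minimal sketch of an elitist generation step, assuming fitness is to be maximised. `make_offspring` is a hypothetical callback standing in for whatever selection and GENETIC OPERATORs the EA uses; it is not part of any particular package.

```python
def next_generation(population, fitness, make_offspring, n_elite=1):
    """Build a new population of the same size, keeping the n_elite best unchanged."""
    if not 0 <= n_elite <= len(population):
        raise ValueError("n_elite must be between 0 and the population size")
    ranked = sorted(population, key=fitness, reverse=True)
    elites = ranked[:n_elite]  # copied as-is, never altered by operators
    children = [make_offspring(population)
                for _ in range(len(population) - n_elite)]
    return elites + children
```

Because the elites are carried over untouched, the best individual of the new generation is at least as fit as the best of the old one, whatever `make_offspring` produces.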

ENCORE:
The EvolutioNary Computation REpository Network. A collection
of FTP servers/World Wide Web sites holding all manner of
interesting things related to EC. See Q15.3 for more
information.

ENVIRONMENT:
(biol) That which surrounds an organism. Can be 'physical'
(abiotic), or biotic. In both, the organism occupies a NICHE
which influences its FITNESS within the total ENVIRONMENT. A
biotic environment may present frequency-dependent fitness
functions within a POPULATION, that is, the fitness of an
organism's behaviour may depend upon how many others are also
doing it. Over several GENERATIONs, biotic environments may
foster co-evolution, in which fitness is determined with
SELECTION partly by other SPECIES.

EP: See EVOLUTIONARY PROGRAMMING.

EPISTASIS:
(biol) A "masking" or "switching" effect among GENEs. A biology
textbook says: "A gene is said to be epistatic when its presence
suppresses the effect of a gene at another locus. Epistatic
genes are sometimes called inhibiting genes because of their
effect on other genes which are described as hypostatic."

(EC) When EC researchers use the term EPISTASIS, they are
generally referring to any kind of strong interaction among
genes, not just masking effects. A possible definition is:

Epistasis is the interaction between different genes in a
CHROMOSOME. It is the extent to which the contribution to
FITNESS of one gene depends on the values of other genes.

Problems with little or no epistasis are trivial to solve
(hillclimbing is sufficient). But highly epistatic problems are
difficult to solve, even for GAs. High epistasis means that
BUILDING BLOCKs cannot form, and there will be DECEPTION.

ES: See EVOLUTION STRATEGY.

EVOLUTION:
That process of change which is assured given a reproductive
POPULATION in which there are (1) varieties of INDIVIDUALs, with
some varieties being (2) heritable, of which some varieties (3)
differ in FITNESS (reproductive success). (See the talk.origins
FAQ for discussion on this (See Q10.7).)

"Don't assume that all people who accept EVOLUTION are atheists"

--- Talk.origins FAQ

EVOLUTION STRATEGIE:
EVOLUTION STRATEGY:
A type of EVOLUTIONARY ALGORITHM developed in the early 1960s in
Germany. It employs real-coded parameters, and in its original
form, it relied on MUTATION as the search operator, and a
POPULATION size of one. Since then it has evolved to share many
features with GENETIC ALGORITHMs. See Q1.3 for more
information.

EVOLUTIONARILY STABLE STRATEGY:
A strategy that does well in a POPULATION dominated by the same
strategy. (cf Maynard Smith, 1974) Or, in other words, "An
'ESS' ... is a strategy such that, if all the members of a
population adopt it, no mutant strategy can invade." (Maynard
Smith "Evolution and the Theory of Games", 1982).

EVOLUTIONARY ALGORITHM:
An algorithm designed to perform EVOLUTIONARY COMPUTATION.

EVOLUTIONARY COMPUTATION:
Encompasses methods of simulating EVOLUTION on a computer. The
term is relatively new and represents an effort to bring together
researchers who have been working in closely related fields but
following different paradigms. The field is now seen as
including research in GENETIC ALGORITHMs, EVOLUTION STRATEGIEs,
EVOLUTIONARY PROGRAMMING, ARTIFICIAL LIFE, and so forth. For a
good overview see the editorial introduction to Vol. 1, No. 1 of
"Evolutionary Computation" (MIT Press, 1993). That, along with
the papers in the issue, should give you a good idea of
representative research.

EVOLUTIONARY PROGRAMMING:
An EVOLUTIONARY ALGORITHM developed in the mid 1960s. It is a
stochastic OPTIMIZATION strategy, which is similar to GENETIC
ALGORITHMs, but dispenses with both "genomic" representations
and with CROSSOVER as a REPRODUCTION OPERATOR. See Q1.2 for
more information.

EVOLUTIONARY SYSTEMS:
A process or system which employs the evolutionary dynamics of
REPRODUCTION, MUTATION, competition and SELECTION. The specific
forms of these processes are irrelevant to a system being
described as "evolutionary."

EXPECTANCY:
Or expected value. For a continuous random variable X with
density function f, the expected value is:
E(X) = INTEGRAL(-inf, inf) [X f(X) dX].
The discrete expectation takes a similar form using a summation
instead of an integral.
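
For the discrete case, the summation is E(X) = SUM over i of [x_i * p(x_i)]. A small sketch follows; the function name and the tolerance used to check the probabilities are assumptions made for illustration.

```python
def expected_value(values, probabilities):
    """Discrete expectation: sum of each value times its probability."""
    if len(values) != len(probabilities):
        raise ValueError("need one probability per value")
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(x * p for x, p in zip(values, probabilities))

# Example: a fair six-sided die.
die_mean = expected_value([1, 2, 3, 4, 5, 6], [1 / 6] * 6)
```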

EXPLOITATION:
When traversing a SEARCH SPACE, EXPLOITATION is the process of
using information gathered from previously visited points in the
search space to determine which places might be profitable to
visit next. An example is hillclimbing, which investigates
adjacent points in the search space, and moves in the direction
giving the greatest increase in FITNESS. Exploitation
techniques are good at finding local maxima.
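
As an illustration of pure exploitation, here is a sketch of a steepest-ascent hillclimber on bit-strings. "Adjacent points" are taken to be the strings that differ in exactly one bit, which is an assumption; other neighbourhoods are equally valid.

```python
def hill_climb(start, fitness):
    """Move to the best one-bit neighbour until no neighbour improves FITNESS."""
    current = list(start)
    while True:
        # All points adjacent to the current one: flip each bit in turn.
        neighbours = [current[:i] + [1 - current[i]] + current[i + 1:]
                      for i in range(len(current))]
        if not neighbours:
            return current
        best = max(neighbours, key=fitness)
        if fitness(best) <= fitness(current):
            return current  # a local maximum: nothing adjacent is better
        current = best
```

The loop stops at the first local maximum it reaches, which is exactly the weakness that EXPLORATION is meant to address.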

EXPLORATION:
The process of visiting entirely new regions of a SEARCH SPACE,
to see if anything promising may be found there. Unlike
EXPLOITATION, EXPLORATION involves leaps into the unknown.
Problems which have many local maxima can sometimes only be
solved by this sort of random search.

F
FAQ: Frequently Asked Questions. See definition given before the main
table of contents.

FITNESS:
(biol) Loosely: adaptedness. Often measured as, and sometimes
equated to, relative reproductive success. Also proportional to
expected time to extinction. "The fit are those who fit their
existing ENVIRONMENTs and whose descendants will fit future
environments." (J. Thoday, "A Century of Darwin", 1959).
Accidents of history are relevant.

(EC) A value assigned to an INDIVIDUAL which reflects how well
the individual solves the task in hand. A "fitness function" is
used to map a CHROMOSOME to a FITNESS value. A "fitness
landscape" is the hypersurface obtained by applying the fitness
function to every point in the SEARCH SPACE.

FUNCTION OPTIMIZATION:
For a function which takes a set of N input parameters, and
returns a single output value, F, FUNCTION OPTIMIZATION is the
task of finding the set(s) of parameters which produce the
maximum (or minimum) value of F. Function OPTIMIZATION is a type
of VALUE-BASED PROBLEM.

FTP: File Transfer Protocol. A system which allows the retrieval of
files stored on a remote computer. Basic FTP requires a password
before access can be gained to the remote computer. Anonymous
FTP does not require a special password: after giving
"anonymous" as the user name, any password will do (typically,
you give your email address at this point). Files available by
FTP are specified as <ftp-site-name>:<the-complete-filename>. See
Q15.5.

FUNCTION SET:
(GP) The set of operators used in GP. These functions label the
internal (non-leaf) points of the parse trees that represent the
programs in the POPULATION. An example FUNCTION SET might be
{+, -, *}.

G
GA: See GENETIC ALGORITHM.

GAME THEORY:
A mathematical theory originally developed for human games, and
generalized to human economics and military strategy, and to
EVOLUTION in the theory of EVOLUTIONARILY STABLE STRATEGY. GAME
THEORY comes into its own wherever the optimum policy is not
fixed, but depends upon the policy which is statistically most
likely to be adopted by opponents.

GAMETE:
(biol) Cells which carry genetic information from their PARENTs
for the purposes of sexual REPRODUCTION. In animals, male
GAMETEs are called sperm, female gametes are called ova. Gametes
have a HAPLOID number of CHROMOSOMEs.

GAUSSIAN DISTRIBUTION:
See NORMALLY DISTRIBUTED.

GENE:
(EC) A subsection of a CHROMOSOME which (usually) encodes the
value of a single parameter.

(biol) The fundamental unit of inheritance, comprising a segment
of DNA that codes for one or several related functions and
occupies a fixed position (locus) on the chromosome. However,
the term may be defined in different ways for different
purposes. For a fuller story, consult a book on genetics (See
Q10.7).

GENE-POOL:
The whole set of GENEs in a breeding POPULATION. The metaphor
on which the term is based de-emphasizes the undeniable fact
that genes actually go about in discrete bodies, and emphasizes
the idea of genes flowing about the world like a liquid.

Everybody out of the gene-pool, now!

--- Author prefers to be anonymous

GENERATION:
(EC) An iteration of the measurement of FITNESS and the creation
of a new POPULATION by means of REPRODUCTION OPERATORs.

GENETIC ALGORITHM:
A type of EVOLUTIONARY COMPUTATION devised by John Holland
[HOLLAND92]. A model of machine learning that uses a
genetic/evolutionary metaphor. Implementations typically use
fixed-length character strings to represent their genetic
information, together with a POPULATION of INDIVIDUALs which
undergo CROSSOVER and MUTATION in order to find interesting
regions of the SEARCH SPACE. See Q1.1 for more information.

GENETIC DRIFT:
Changes in gene/allele frequencies in a POPULATION over many
GENERATIONs, resulting from chance rather than SELECTION.
Occurs most rapidly in small populations. Can lead to some
ALLELEs becoming `extinct', thus reducing the genetic
variability in the population.

GENETIC PROGRAMMING:
GENETIC ALGORITHMs applied to programs. GENETIC PROGRAMMING is
more expressive than fixed-length character string GAs, though
GAs are likely to be more efficient for some classes of
problems. See Q1.5 for more information.

GENETIC OPERATOR:
A search operator acting on a coding structure that is analogous
to a GENOTYPE of an organism (e.g. a CHROMOSOME).

GENOTYPE:
The genetic composition of an organism: the information
contained in the GENOME.

GENOME:
The entire collection of GENEs (and hence CHROMOSOMEs) possessed
by an organism.

GLOBAL OPTIMIZATION:
The process by which a search is made for the extremum (or
extrema) of a functional which, in EVOLUTIONARY COMPUTATION,
corresponds to the FITNESS or error function that is used to
assess the PERFORMANCE of any INDIVIDUAL.

GP: See GENETIC PROGRAMMING.

H
HAPLOID:
(biol) This refers to a cell which contains a single CHROMOSOME or
set of chromosomes, each consisting of a single sequence of
GENEs. An example is a GAMETE. cf DIPLOID.

In EC, it is usual for INDIVIDUALs to be HAPLOID.

HARD SELECTION:
SELECTION acts on competing INDIVIDUALs. When only the best
available individuals are retained for generating future
progeny, this is termed "hard selection." In contrast, "soft
selection" offers a probabilistic mechanism for maintaining
individuals to be PARENTs of future progeny despite possessing
relatively poorer objective values.

I
INDIVIDUAL:
A single member of a POPULATION. In EC, each INDIVIDUAL
contains a CHROMOSOME (or, more generally, a GENOME) which
represents a possible solution to the task being tackled, i.e. a
single point in the SEARCH SPACE. Other information is usually
also stored in each individual, e.g. its FITNESS.

INVERSION:
(EC) A REORDERING operator which works by selecting two cut
points in a CHROMOSOME, and reversing the order of all the GENEs
between those two points.
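
A sketch of the inversion operator on a list-based chromosome. The two cut points are drawn at random; the function name and the `rng` argument are illustrative.

```python
import random

def inversion(chromosome, rng=random):
    """Return (child, (i, j)): genes between cut points i and j are reversed."""
    if len(chromosome) < 1:
        raise ValueError("chromosome must not be empty")
    # Cut points lie between genes, so there are len + 1 possible positions.
    i, j = sorted(rng.sample(range(len(chromosome) + 1), 2))
    child = chromosome[:i] + chromosome[i:j][::-1] + chromosome[j:]
    return child, (i, j)
```

Note that inversion only changes the order of genes, so it is meaningful mainly when each gene carries its own identity (or in ORDER-BASED PROBLEMs).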

L
LAMARCKISM:
Theory of EVOLUTION which preceded Darwin's. Lamarck believed
that evolution came about through the inheritance of acquired
characteristics. That is, the skills or physical features which
an INDIVIDUAL acquires during its lifetime can be passed on to
its OFFSPRING. Although Lamarckian inheritance does not take
place in nature, the idea has been usefully applied by some in
EC. cf DARWINISM.

LCS: See LEARNING CLASSIFIER SYSTEM.

LEARNING CLASSIFIER SYSTEM:
A CLASSIFIER SYSTEM which "learns" how to classify its inputs.
This often involves "showing" the system many examples of input
patterns, and their corresponding correct outputs. See Q1.4 for
more information.

M
MIGRATION:
The transfer of (the GENEs of) an INDIVIDUAL from one SUB-
POPULATION to another.

MOBOT:
MOBile ROBOT. cf ROBOT.

MUTATION:
(EC) a REPRODUCTION OPERATOR which forms a new CHROMOSOME by
making (usually small) alterations to the values of GENEs in a
copy of a single, PARENT chromosome.
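
For binary chromosomes, a common form of mutation flips each bit independently with a small probability. The sketch below assumes a list of 0/1 genes; the name `bit_flip_mutation` and the `rate` parameter are illustrative.

```python
import random

def bit_flip_mutation(parent, rate, rng=random):
    """Copy parent, flipping each bit independently with probability rate."""
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be a probability")
    # A new list is built, so the parent chromosome is left untouched.
    return [1 - gene if rng.random() < rate else gene for gene in parent]
```

For real-valued genes the analogous operator usually adds a small random perturbation instead (see SEARCH OPERATORS).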

N
NFL: See NO FREE LUNCH.

NICHE:
(biol) In natural ecosystems, there are many different ways in
which animals may survive (grazing, hunting, on the ground, in
trees, etc.), and each survival strategy is called an
"ecological niche." SPECIES which occupy different NICHEs (e.g.
one eating plants, the other eating insects) may coexist side by
side without competition, in a stable way. But if two species
occupying the same niche are brought into the same area, there
will be competition, and eventually the weaker of the two
species will be made (locally) extinct. Hence diversity of
species depends on them occupying a diversity of niches (or on
geographical separation).

(EC) In EC, we often want to maintain diversity in the
POPULATION. Sometimes a FITNESS function may be known to be
multimodal, and we want to locate all the peaks. We may consider
each peak in the fitness function as analogous to a niche. By
applying techniques such as fitness sharing (Goldberg &
Richardson, [ICGA87]), the population can be prevented from
converging on a single peak, and instead stable SUB-POPULATIONs
form at each peak. This is analogous to different species
occupying different niches. See also SPECIES, SPECIATION.
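
Fitness sharing can be sketched as follows, using the usual sharing function sh(d) = 1 - (d/sigma_share)^alpha for d < sigma_share, and 0 otherwise. The one-dimensional distance measure and the parameter names are assumptions made to keep the example short.

```python
def shared_fitness(points, raw_fitness, sigma_share, alpha=1.0):
    """Divide each raw fitness by its niche count (crowding within sigma_share)."""
    shared = []
    for xi, fi in zip(points, raw_fitness):
        niche_count = 0.0
        for xj in points:
            d = abs(xi - xj)
            if d < sigma_share:
                niche_count += 1.0 - (d / sigma_share) ** alpha
        # niche_count >= 1 because every individual is at distance 0 from itself.
        shared.append(fi / niche_count)
    return shared
```

Individuals crowded onto the same peak see their fitness reduced, so less-populated peaks become relatively more attractive.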

NO FREE LUNCH:
Cocktail party definition:

For any pair of search algorithms, there are "as many" problems
for which the first algorithm outperforms the second as for
which the reverse is true. One consequence of this is that if we
don't put any domain knowledge into our algorithm, it is as
likely to perform worse than random search as it is likely to
perform better. This is true for all algorithms including
GENETIC ALGORITHMs.

More detailed description:

The NFL work of Wolpert and Macready is a framework that
addresses the core aspects of search, focusing on the connection
between FITNESS functions and effective search algorithms. The
central importance of this connection is demonstrated by the No
Free Lunch theorem which states that averaged over all problems,
all search algorithms perform equally. This result implies that
if we are comparing a genetic algorithm to some other algorithm
(e.g., simulated annealing, or even random search) and the
genetic algorithm performs better on some class of problems,
then the other algorithm necessarily performs better on problems
outside the class. Thus it is essential to incorporate knowledge
of the problem into the search algorithm.

The NFL framework also does the following: it provides a
geometric interpretation of what it means for an algorithm to be
well matched to a problem; it provides information theoretic
insight into the search procedure; it investigates time-varying
fitness functions; it proves that independent of the fitness
function, one cannot (without prior domain knowledge)
successfully choose between two algorithms based on their
previous behavior; it provides a number of formal measures of
how well an algorithm performs; and it addresses the difficulty
of OPTIMIZATION problems from a viewpoint outside of traditional
computational complexity.

NORMALLY DISTRIBUTED:
A random variable is NORMALLY DISTRIBUTED if its density
function is described as
f(x) = 1/sqrt(2*pi*sqr(sigma)) * exp(-0.5*(x-mu)*(x-mu)/sqr(sigma))
where mu is the mean of the random variable x and sigma is the
STANDARD DEVIATION.
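
The density function above translates directly into code. This is a sketch using only the standard library; the function name is illustrative.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normally distributed variable with mean mu and std. dev. sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    variance = sigma * sigma
    return math.exp(-0.5 * (x - mu) * (x - mu) / variance) / math.sqrt(2 * math.pi * variance)
```

To draw samples rather than evaluate the density, the standard library's `random.gauss(mu, sigma)` can be used.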

O
OBJECT VARIABLES:
Parameters that are directly involved in assessing the relative
worth of an INDIVIDUAL.

OFFSPRING:
An INDIVIDUAL generated by any process of REPRODUCTION.

OPTIMIZATION:
The process of iteratively improving the solution to a problem
with respect to a specified objective function.

ORDER-BASED PROBLEM:
A problem where the solution must be specified in terms of an
arrangement (e.g. a linear ordering) of specific items, e.g.
TRAVELLING SALESMAN PROBLEM, computer process scheduling.
ORDER-BASED PROBLEMs are a class of COMBINATORIAL OPTIMIZATION
problems in which the entities to be combined are already
determined. cf VALUE-BASED PROBLEM.

ONTOGENESIS:
Refers to a single organism, and means the life span of an
organism from its birth to death. cf PHYLOGENESIS.

P
PANMICTIC POPULATION:
(EC, biol) A mixed POPULATION. A population in which any
INDIVIDUAL may be mated with any other individual with a
probability which depends only on FITNESS. Most conventional
EVOLUTIONARY ALGORITHMs have PANMICTIC POPULATIONs.

The opposite is a population divided into groups known as SUB-
POPULATIONs, where individuals may only mate with others in the
same sub-population. cf SPECIATION.

PARENT:
An INDIVIDUAL which takes part in REPRODUCTION to generate one
or more other individuals, known as OFFSPRING, or children.

PERFORMANCE:
cf FITNESS.

PHENOTYPE:
The expressed traits of an INDIVIDUAL.

PHYLOGENESIS:
Refers to a POPULATION of organisms. The life span of a
population of organisms from pre-historic times until today. cf
ONTOGENESIS.

PLUS STRATEGY:
Notation originally proposed in EVOLUTION STRATEGIEs. When a
POPULATION of "mu" PARENTs generates "lambda" OFFSPRING and all
mu and lambda INDIVIDUALs compete directly, the process is
written as a (mu+lambda) search. The process of competing all
parents and offspring then is a "plus strategy." cf. COMMA
STRATEGY.
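
The difference between (mu+lambda) and (mu,lambda) survivor selection can be shown in a few lines. This sketch assumes fitness is to be maximised and that survivors are chosen deterministically; the function name and the `plus` flag are illustrative.

```python
def select_survivors(parents, offspring, fitness, plus=False):
    """Pick the mu best individuals: from offspring only (comma) or from both (plus)."""
    mu = len(parents)
    pool = list(parents) + list(offspring) if plus else list(offspring)
    if len(pool) < mu:
        raise ValueError("comma selection needs at least mu offspring")
    return sorted(pool, key=fitness, reverse=True)[:mu]
```

With parents `[10, 9]` and offspring `[1, 2, 3]` (fitness equal to the value itself), the plus strategy keeps the parents, while the comma strategy discards them and keeps the two best offspring.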

POPULATION:
A group of INDIVIDUALs which may interact together, for example
by mating, producing OFFSPRING, etc. Typical POPULATION sizes in
EC range from 1 (for certain EVOLUTION STRATEGIEs)
to many thousands (for GENETIC PROGRAMMING). cf SUB-
POPULATION.

R
RECOMBINATION:
cf CROSSOVER.

REORDERING:
(EC) A REORDERING operator is a REPRODUCTION OPERATOR which
changes the order of GENEs in a CHROMOSOME, with the hope of
bringing related genes closer together, thereby facilitating the
production of BUILDING BLOCKs. cf INVERSION.

REPRODUCTION:
(biol, EC) The creation of a new INDIVIDUAL from two PARENTs
(sexual REPRODUCTION). Asexual reproduction is the creation of
a new individual from a single parent.

REPRODUCTION OPERATOR:
(EC) A mechanism which influences the way in which genetic
information is passed on from PARENT(s) to OFFSPRING during
REPRODUCTION. REPRODUCTION OPERATORs fall into three broad
categories: CROSSOVER, MUTATION and REORDERING operators.

REQUISITE VARIETY:
In GENETIC ALGORITHMs, when the POPULATION fails to have a
"requisite variety" CROSSOVER will no longer be a useful search
operator because it will have a propensity to simply regenerate
the PARENTs.

ROBOT:
"The Encyclopedia Galactica defines a ROBOT as a mechanical
apparatus designed to do the work of man. The marketing division
of the Sirius Cybernetics Corporation defines a robot as `Your
Plastic Pal Who's Fun To Be With'."

--- Douglas Adams (1979)

S
SAFIER:
An EVOLUTIONARY COMPUTATION FTP Repository, now defunct.
Superseded by ENCORE.

SCHEMA:
A pattern of GENE values in a CHROMOSOME, which may include
`dont care' states. Thus in a binary chromosome, each SCHEMA
(plural schemata) can be specified by a string of the same
length as the chromosome, with each character one of {0, 1, #}.
A particular chromosome is said to `contain' a particular schema
if it matches the schema (e.g. chromosome 01101 matches schema
#1#0#).

The `order' of a schema is the number of non-dont-care positions
specified, while the `defining length' is the distance between
the furthest two non-dont-care positions. Thus #1##0# is of
order 2 and defining length 3.
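
The schema definitions above can be checked with a short sketch. Chromosomes and schemata are represented as strings, with `#` as the wildcard; the function names are illustrative.

```python
def matches(chromosome, schema):
    """True if the chromosome contains the schema ('#' matches either bit)."""
    return len(chromosome) == len(schema) and all(
        s == "#" or s == c for c, s in zip(chromosome, schema))

def order(schema):
    """Number of fixed (non-'#') positions."""
    return sum(1 for s in schema if s != "#")

def defining_length(schema):
    """Distance between the outermost fixed positions (0 if fewer than two)."""
    fixed = [i for i, s in enumerate(schema) if s != "#"]
    return fixed[-1] - fixed[0] if fixed else 0
```

These reproduce the two examples in the text: 01101 matches #1#0#, and #1##0# has order 2 and defining length 3.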

SCHEMA THEOREM:
Theorem devised by Holland [HOLLAND92] to explain the behaviour
of GAs. In essence, it says that a GA gives exponentially
increasing reproductive trials to above average schemata.
Because each CHROMOSOME contains a great many schemata, the rate
of SCHEMA processing in the POPULATION is very high, leading to
a phenomenon known as implicit parallelism. This gives a GA with
a population of size N a speedup by a factor of N cubed,
compared to a random search.

SEARCH SPACE:
If the solution to a task can be represented by a set of N real-
valued parameters, then the job of finding this solution can be
thought of as a search in an N-dimensional space. This is
referred to simply as the SEARCH SPACE. More generally, if the
solution to a task can be represented using a representation
scheme, R, then the search space is the set of all possible
configurations which may be represented in R.

SEARCH OPERATORS:
Processes used to generate new INDIVIDUALs to be evaluated.
SEARCH OPERATORS in GENETIC ALGORITHMs are typically based on
CROSSOVER and point MUTATION. Search operators in EVOLUTION
STRATEGIEs and EVOLUTIONARY PROGRAMMING typically follow from
the representation of a solution and often involve Gaussian or
lognormal perturbations when applied to real-valued vectors.

SELECTION:
The process by which some INDIVIDUALs in a POPULATION are chosen
for REPRODUCTION, typically on the basis of favoring individuals
with higher FITNESS.
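
One common scheme is fitness-proportionate ("roulette wheel") selection, sketched below. It is only one of several possibilities (tournament and rank-based selection are others), and it assumes non-negative fitness values; the function name is illustrative.

```python
import random

def roulette_select(population, fitness, n, rng=random):
    """Choose n parents, each with probability proportional to its fitness."""
    weights = [fitness(ind) for ind in population]
    if not weights or min(weights) < 0 or sum(weights) <= 0:
        raise ValueError("need non-negative fitness values with a positive sum")
    # Sampling is with replacement, so fit individuals may be picked repeatedly.
    return rng.choices(population, weights=weights, k=n)
```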

SELF-ADAPTATION:
The inclusion of a mechanism not only to evolve the OBJECT
VARIABLES of a solution, but simultaneously to evolve
information on how each solution will generate new OFFSPRING.

SIMULATION:
The act of modeling a natural process.

SOFT SELECTION:
The mechanism which allows inferior INDIVIDUALs in a POPULATION
a non-zero probability of surviving into future GENERATIONs.
See HARD SELECTION.

SPECIATION:
(biol) The process whereby a new SPECIES comes about. The most
common cause of SPECIATION is that of geographical isolation. If
a SUB-POPULATION of a single species is separated geographically
from the main POPULATION for a sufficiently long time, their
GENEs will diverge (either due to differences in SELECTION
pressures in different locations, or simply due to GENETIC
DRIFT). Eventually, genetic differences will be so great that
members of the sub-population must be considered as belonging to
a different (and new) species.

Speciation is very important in evolutionary biology. Small sub-
populations can evolve much more rapidly than a large population
(because genetic changes don't take long to become fixed in the
population). Sometimes, this EVOLUTION will produce superior
INDIVIDUALs which can outcompete, and eventually replace the
species of the original, main population.

(EC) Techniques analogous to geographical isolation are used in
a number of GAs. Typically, the population is divided into sub-
populations, and individuals are only allowed to mate with
others in the same sub-population. (A small amount of MIGRATION
is performed.)

This produces many sub-populations which differ in their
characteristics, and may be referred to as different "species".
This technique can be useful for finding multiple solutions to a
problem, or simply maintaining diversity in the SEARCH SPACE.

Most biology/genetics textbooks contain information on
speciation. A more detailed account can be found in "Genetics,
Speciation and the Founder Principle", L.V. Giddings, K.Y.
Kaneshiro and W.W. Anderson (Eds.), Oxford University Press
1989.

SPECIES:
(biol) There is no universally-agreed firm definition of a
SPECIES. A species may be roughly defined as a collection of
living creatures, having similar characteristics, which can
breed together to produce viable OFFSPRING similar to their
PARENTs. Members of one species occupy the same ecological
NICHE. (Members of different species may occupy the same, or
different niches.)

(EC) In EC the definition of "species" is less clear, since
generally it is always possible for a pair of INDIVIDUALs to breed
together. It is probably safest to use this term only in the
context of algorithms which employ explicit SPECIATION
mechanisms.

(biol) The existence of different species allows different
ecological niches to be exploited. Furthermore, the existence of
a variety of different species itself creates new niches, thus
allowing room for further species. Thus nature bootstraps itself
into almost limitless complexity and diversity.

Conversely, the domination of one, or a small number of species
reduces the number of viable niches, leads to a decline in
diversity, and a reduction in the ability to cope with new
situations.

"Give any one species too much rope, and they'll fuck it up"

--- Roger Waters, "Amused to Death", 1992

STANDARD DEVIATION:
A measurement for the spread of a set of data; a measurement for
the variation of a random variable.

STATISTICS:
Descriptive measures of data; a field of mathematics that uses
probability theory to gain insight into systems' behavior.

STEPSIZE:
Typically, the average distance in the appropriate space between
a PARENT and its OFFSPRING.

STRATEGY VARIABLE:
Evolvable parameters that relate the distribution of OFFSPRING
from a PARENT.

SUB-POPULATION:
A POPULATION may be sub-divided into groups, known as SUB-
POPULATIONs, where INDIVIDUALs may only mate with others in the
same group. (This technique might be chosen for parallel
processors). Such sub-divisions may markedly influence the
evolutionary dynamics of a population (e.g. Wright's 'shifting
balance' model). Sub-populations may be defined by various
MIGRATION constraints: islands with limited arbitrary migration;
stepping-stones with migration to neighboring islands;
isolation-by-distance in which each individual mates only with
near neighbors. cf PANMICTIC POPULATION, SPECIATION.
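
A stepping-stone arrangement with MIGRATION to the neighbouring island might be sketched like this. The policy shown (each island sends copies of its best individuals to the next island in a ring, where they replace the worst) is an assumption; many other migration policies are in use.

```python
def migrate_ring(islands, fitness, n_migrants=1):
    """Return new islands after one round of ring migration (best replace worst)."""
    # Copies of the best individuals of each island are the emigrants.
    emigrants = [sorted(isl, key=fitness, reverse=True)[:n_migrants]
                 for isl in islands]
    new_islands = []
    for k, isl in enumerate(islands):
        incoming = emigrants[(k - 1) % len(islands)]  # from the previous island
        keep = max(0, len(isl) - len(incoming))
        survivors = sorted(isl, key=fitness, reverse=True)[:keep]
        new_islands.append(survivors + incoming)
    return new_islands
```

Between migration rounds, each island would evolve on its own, with mating restricted to individuals on the same island.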

SUMMERSCHOOL:
(USA) One of the most interesting things in the US educational
system: class work during the summer break.

T
TERMINAL SET:
(GP) The set of terminal (leaf) nodes in the parse trees
representing the programs in the POPULATION. A terminal might
be a variable, such as X, a constant value, such as 42, (cf Q42)
or a function taking no arguments, such as (move-north).
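
To show how the FUNCTION SET and TERMINAL SET fit together, here is a sketch of a parse-tree evaluator. The tree encoding (nested tuples for function nodes, strings for variables, numbers for constants) is an assumption chosen for brevity, not the representation used by any particular GP system.

```python
import operator

# Function set {+, -, *}: each entry labels an internal node with two children.
FUNCTION_SET = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def evaluate(tree, env):
    """Evaluate a parse tree; env maps variable names to values."""
    if isinstance(tree, tuple):            # internal node: (op, left, right)
        op, left, right = tree
        return FUNCTION_SET[op](evaluate(left, env), evaluate(right, env))
    if isinstance(tree, str):              # terminal: a variable such as "X"
        return env[tree]
    return tree                            # terminal: a constant such as 42

# The program X*X + 42, evaluated at X = 3.
result = evaluate(("+", ("*", "X", "X"), 42), {"X": 3})
```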

TRAVELLING SALESMAN PROBLEM:
The travelling salesperson has the task of visiting a number of
clients, located in different cities. The problem to solve is:
in what order should the cities be visited in order to minimise
the total distance travelled (including returning home)? This is
a classical example of an ORDER-BASED PROBLEM.
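
The quantity to be minimised is the length of the closed tour. A sketch of the evaluation, assuming cities are given as (x, y) coordinates and a candidate solution is a permutation of city indices (requires Python 3.8 or later for `math.dist`):

```python
import math

def tour_length(cities, order):
    """Total Euclidean length of visiting cities in the given order and returning home."""
    n = len(order)
    return sum(math.dist(cities[order[i]], cities[order[(i + 1) % n]])
               for i in range(n))

# Four cities on the corners of a unit square.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
perimeter = tour_length(square, [0, 1, 2, 3])
crossing = tour_length(square, [0, 2, 1, 3])
```

Here going round the perimeter is shorter than the tour that crosses the diagonals, so an EA minimising `tour_length` should prefer the first ordering.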

TSP: See TRAVELLING SALESMAN PROBLEM.

U
USENET:
"Usenet is like a herd of performing elephants with diarrhea --
massive, difficult to redirect, awe-inspiring, entertaining, and
a source of mind-boggling amounts of excrement when you least
expect it."

--- Gene Spafford (1992)

V
VALUE-BASED PROBLEM:
A problem where the solution must be specified in terms of a set
of real-valued parameters. FUNCTION OPTIMIZATION problems are
of this type. cf SEARCH SPACE, ORDER-BASED PROBLEM.

VECTOR OPTIMIZATION:
Typically, an OPTIMIZATION problem wherein multiple objectives
must be satisfied.
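
One widely used way to compare solutions when there are several
objectives is Pareto dominance; it is shown here as an illustration,
not as the definition this entry intends. The sketch assumes every
objective is to be minimised, and the names dominates and pareto_front
are made up for the example.

```python
def dominates(a, b):
    """True if a is no worse than b everywhere and strictly better somewhere."""
    no_worse = all(x <= y for x, y in zip(a, b))
    better = any(x < y for x, y in zip(a, b))
    return no_worse and better


def pareto_front(points):
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical objective vectors (cost, weight) for five candidate solutions.
candidates = [(1, 5), (2, 2), (3, 4), (5, 1), (4, 4)]
front = pareto_front(candidates)
```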

Z
ZEN NAVIGATION:
A methodology with a tremendous propensity to get lost during a
hike from A to B. Zen Navigation simply consists of finding
something that looks as if it knows where it is going, and
following it. The results are often more surprising than
successful, but it's usually worth using for the sake of the few
occasions when it is both.

Sometimes Zen Navigation is referred to as "doing scientific
research," where A is a state of mind considered as being pre-
PhD, and B is a (usually different) state of mind, known as
post-PhD. Your time spent in state C, somewhere in between A and
B, is usually referred to as "being a nobody."

ACKNOWLEDGMENTS
Finally, credit where credit is due. I'd like to thank all the people
who helped in assembling this guide, and their patience with my
"variations on English grammar". In the order I received their
contributions, thanks to:

Contributors,
Lutz Prechelt (University of Karlsruhe) the comp.ai.neural-nets
FAQmeister, for letting me strip several ideas from "his" FAQ.
Ritesh "peace" Bansal (CMU) for lots of comments and references.
David Beasley (University of Wales) for a valuable list of
publications (Q12), and many further additions. David Corne, Peter
Ross, and Hsiao-Lan Fang (University of Edinburgh) for their
TIMETABLING and JSSP entries. Mark Kantrowitz (CMU) for mocking
about this-and-that, and being a "mostly valuable" source concerning
FAQ maintenance; parts of Q11 have been stripped from "his" ai-
faq/part4 FAQ; Mark also contributed the less verbose archive server
infos. The texts of Q1.1, Q1.5, Q98 and some entries of Q99 are
courtesy of James Rice (Stanford University), stripped from his
genetic-programming FAQ (Q15). Jonathan I. Kamens (MIT) provided
infos on how-to-hook-into the USENET FAQ system. Una Smith (Yale
University) contributed the better parts of the Internet resources
guide (Q15.5). Daniel Polani (Gutenberg University, Mainz)
"contributed" the ALIFE II Video proceedings info. Jim McCoy
(University of Texas) reminded me of the GP archive he maintains
(Q20). Ron Goldthwaite (UC Davis) added definitions of Environment,
EVOLUTION, Fitness, and Population to the glossary, and some thoughts
why Biologists should take note of EC (Q3). Joachim Geidel
(University of Karlsruhe) sent a diff of the current "navy server"
contents and the software survey, pointing to "missing links" (Q20).
Richard Dawkins' "Glossary" section of "The Extended Phenotype" served
for many new entries, too numerous to mention here (Q99). Mark Davis
(New Mexico State University) wrote the part on EVOLUTIONARY
PROGRAMMING (Q1.2). Dan Abell (University of Maryland) contributed
the section on efficient greycoding (Q21). Walter Harms (University
of Oldenburg) commented on introductory EC literature. Lieutenant
Colonel J.S. Robertson (USMA, West Point), for providing a home for
this subversive posting on their FTP server
euler.math.usma.edu/pub/misc/GA. Rosie O'Neill for suggesting the PhD
thesis entry (Q10.11). Charlie Pearce (University of Nottingham) for
critical remarks concerning "tables"; well, here they are! J.
Ribeiro Filho (University College London) for the pointer to the IEEE
Computer GA Software Survey; the PeGAsuS description (Q20) was
stripped from it. Paul Harrald for the entry on game playing (Q2).
Laurence Moran (Uni Toronto) for corrections to some of the
biological information in Q1 and Q99. Marco Dorigo (Uni Libre
Bruxelles) gets the award for reading the guide more thoroughly than
anyone (including the editors). He suggested additions to Q1.4, Q2, Q4 and
Q22, and pointed out various typos. Bill Macready (SFI) for the two
definitions of the NFL theorem in Q99. Cedric Notredame (EBI) for
providing information about applications of EC in biology (Q2).
Fabio Pichierri (The Institute of Physical and Chemical Research) for
information on the relevance of EC to chemists (Q3). Moshe Sipper
(Swiss Federal Institute of Technology) for details of applications
in Cellular Automata and Evolvable Hardware (Q2). Hugh Sasse
(DeMontfort University) for tracking down missing/outdated URLs in
Q1.5 and Q15.2.

Reviewers,
Robert Elliott Smith (The University of Alabama) reviewed the TCGA
infos (Q14), and Nici Schraudolph (UCSD) first unconsciously, later
consciously, provided about 97% of Q20* answers. Nichael Lynn Cramer
(BBN) adjusted my historic view of GP genesis. David Fogel (Natural
SELECTION, Inc.) commented and helped on this-and-that (where this-
and-that is closely related to EP), and provided many missing entries
for the glossary (Q99). Kazuhiro M. Saito (MIT) and Mark D. Smucker
(Iowa State) caught my favorite typo(s). Craig W. Reynolds was the
first who solved one of the well-hidden puzzles in the FAQ, and also
added some valuable stuff. Joachim Born (TU Berlin) updated the
EVOLUTION Machine (EM) entry and provided the pointer to the Bionics
technical report FTP site (Q14). Pattie Maes (MIT Media Lab)
reviewed the ALIFE IV additions to the list of conferences (Q12).
Scott D. Yelich (Santa Fe Institute) reviewed the SFI connectivity
entry (Q15). Rick Riolo (MERIT) reviewed the CFS-C entry (Q20).
Davika Seunarine (Acadia Univ.) for smoothing out this and that.
Paul Field (Queen Mary and Westfield College) for correcting typos,
and providing insights into the blindfold pogo-sticking nomads of the
Himalayas.

and Everybody...
Last but not least I'd like to thank Hans-Paul Schwefel, Thomas Baeck,
Frank Kursawe, Guenter Rudolph for their contributions, and the rest
of the Systems Analysis Research Group for wholly remarkable patience
and almost incredible unflappability during my various extravagances
and ego-trips during my time (1990-1993) with this group.

It was a tremendously worthwhile experience. Thanks!
--- The Editor, Joerg Heitkoetter (1993)

EPILOGUE
"Natural selection is a mechanism for generating
an exceedingly high degree of improbability."

--- Sir Ronald Aylmer Fisher (1890-1962)

This is a GREAT quotation, it sounds like something directly out of a
turn of the century Douglas Adams: Natural selection: the original
"Infinite Improbability Drive"

--- Craig Reynolds (1993), on reading the previous quote

`The Babel fish,' said The Hitch Hiker's Guide to the Galaxy quietly,
`is small, yellow and leech-like, and probably the oddest thing in
the Universe. It feeds on brainwave energy received not from its own
carrier but from those around it. It absorbs all unconscious mental
frequencies from this brainwave energy to nourish itself with. It
then excretes into the mind of its carrier a telepathic matrix formed
by combining the conscious thought frequencies with nerve signals
picked up from the speech centers of the brain which has supplied
them. The practical upshot of all this is that if you stick a Babel
fish in your ear you can instantly understand anything said to you in
any form of language. The speech patterns you actually hear decode
the brainwave matrix which has been fed into your mind by your Babel
fish. `Now it is such a bizarrely improbable coincidence that
anything so mindbogglingly useful could have evolved purely by chance
that some thinkers have chosen to see it as a final and clinching
proof of the non-existence of God. `The argument goes something like
this: ``I refuse to prove that I exist,'' says God, ``for proof
denies faith, and without faith I am nothing.'' ``But,'' says Man,
``The Babel fish is a dead giveaway isn't it? It could not have
evolved by chance. It proves you exist, and so therefore, by your own
arguments, you don't. QED.'' ``Oh dear,'' says God, ``I hadn't
thought of that,'' and promptly vanishes in a puff of logic. ``Oh,
that was easy,'' says Man, and for an encore goes on to prove that
black is white and gets himself killed on the next zebra crossing.

--- Douglas Adams (1979)

"Well, people; I really wish this thingie to turn into a paper babel-
fish for all those young ape-descended organic life forms on this
crazy planet, who don't have any clue about what's going on in this
exciting "new" research field, called EVOLUTIONARY COMPUTATION.
However, this is just a start, I need your help to increase the
usefulness of this guide, especially its readability for natively
English speaking folks; whatever it is: I'd like to hear from
you...!"

--- The Editor, Joerg Heitkoetter (1993)

"Parents of young organic life forms should be warned, that
paper babel-fishes can be harmful, if stuck too deep into the ear."

--- Encyclopedia Galactica

"The meeting of these guys was definitely the best bang since the big
one."

--- Encyclopedia Galactica

ABOUT THE EDITORS
Joerg Heitkoetter,
was born in 1965 in Recklinghausen, a small but beautiful 750 year
old town at the northern rim of the Ruhrgebiet, Germany's coal mining
and steel belt. He was educated at Hittorf-Gymnasium,
Recklinghausen, Ruhruniversitaet Bochum (RUB) and Universitaet
Dortmund (UNIDO), where he read theoretical medicine, psychology,
biology, philosophy and (for whatever reason) computer sciences.

He volunteered as a RA in the Biomathematics Research Group from 1987
to 1989, at the former ``Max-Planck-Institute for Nutrition
Physiology,'' in Dortmund (since March 1, 1993 renamed to ``MPI for
Molecular Physiology''), and spent 3 years at the ``Systems Analysis
Research Group,'' at the Department of Computer Science of UniDO,
where he wrote a particularly unsuccessful thesis on LEARNING
CLASSIFIER SYSTEMs. In 1995, after 22 semesters, he finally gave up
trying to break Chris Langton's semester record, and dropped out of
the academic circus. Amazingly, he's the R&D and Security manager of
UUNET Deutschland GmbH, currently working on various interesting
things in parallel. You may visit his homepage for a mostly complete
list at http://alife.santafe.edu/~joke/ or
http://surf.de.uu.net/people/joke

His electronic publications range from a voluntary job as senior
editor of the FAQ in Usenet's comp.ai.genetic newsgroup, entitled The
Hitch-Hiker's Guide to Evolutionary Computation, over many other
projects he helped bootstrap, for example Howard Gutowitz' FAQ on
Cellular Automata, available on USENET via comp.theory.cell-automata,
to about a dozen so-called ``multimediagrams'' written in HTML,
the language that builds the World-Wide Web. The most useful one is
ENCORE, the Evolutionary Computation Repository Network that
today, after several years of weekend hacking, is accessible
world-wide. The latest addition is Zooland, the definitive collection
of pointers to ARTIFICIAL LIFE resources on the 'net.

With Adam Gaffin, a former senior newspaper reporter from Middlesex
News, Boston, MA, who is now with Networks World, he edited the most
read book on the Internet, which was launched by a joint venture of
Mitch Kapor's Electronic Frontier Foundation (EFF) and the Apple
Computer Library. Initially called Big Dummy's Guide to the Internet,
it was later renamed EFF's (Extended) Guide to the Internet: A round
trip through Global Networks, Life in Cyberspace, and Everything...

http://www.eff.org/

Since a very special event, he has had severe problems taking life
seriously, and consequently started signing everything with
``-joke'', while developing a liquid fixation on all flavours of
whiskey. He continues to write short stories, novels and works on a
diary-like lyrics collection of questionable content, entitled A
Pocketful of Eloquence, which recently was renamed to Heartland, and
published on the web as: http://surf.de.uu.net/heartland/

He likes Mickey Rourke's movies (especially Rumblefish and Barfly),
Edmund Spenser's medieval poetry, the music of QUEEN, KANSAS, and
MARILLION, McDonald's Hamburgers, diving into the analysis of complex
systems of any kind, (but prefers the long-legged ones) and the books
by Erasmus of Rotterdam, Robert Sheckley, Alexei Panshin, and, you
name it, Douglas Adams.

Due to circumstances he led a life on the edge, until he finally
found the perfect match, which has changed many things dramatically:
he is not single anymore, and now has his first child (he definitely
knows of); on 28 January 2000 Daniel Tobias H. jumped into this
world. He even got married on November 5th 1999. If you like this
kind of stuff, have a look at the wedding pictures at
http://surf.de.uu.net/people/joke/wedding/

Well, so far so good. He is still known to reject job offers that
come bundled with Porsches and still doesn't own a BMW Z3 roadster,
for he recently purchased a red 1996 Ford Probe Medici, enjoying life
at 230 kph, while listening to the formidable 1975 KANSAS song ``Born
On Wings Of Steel.''

He still doesn't live in Surrey, but in Dortmund in a knight's
castle, which was built in the 16th century and rebuilt in the early
90s. The building with its tower, park and pond is known as
Rittergut ``Haus Soelde''.

NOTABLE WRITINGS
Nothing really worth listing here.

David Beasley,
was born in London, England in 1961. He was educated at Southampton
University where he read (for good reasons) Electronic Engineering.

After spending several years at sea, he went to the Department of
Computing Mathematics of the University of Wales, Cardiff, where he
studied ARTIFICIAL INTELLIGENCE for a year. He then went on to write
a thesis on GAs applied to Digital Signal Processing, and tried to
break Joke's publications record.

Since a very special event, he has taken over writing this FAQ, and
consequently started signing everything with ``The FAQmaster''. (He's
had severe problems taking life seriously for some time before that,
however.) He likes Woody Allen's movies, English clothing of medieval
times, especially Marks and Spencer, hates McDonald's Hamburgers, but
occasionally dives into the analysis of complex systems of any kind,
(but prefers those with pedals and handlebars) and the books by (of
course) Douglas Adams.

He is not married, has no children, and also doesn't live in
Surrey.

He spent several years working for a (mostly interesting) software
company, Praxis in Bath, England. He left after it became clear that
the new owners, Deloitte and Touche, had no interest in software
engineering. He now works for ingenta, a company which provides on-
line access to learned publications and other on-line services to
academic users around the world. This includes the long-established
BIDS reference services. ingenta ( http://www.ingenta.com/ ) are
based at Bath University, England.

NOTABLE WRITINGS
A number of publications related to GENETIC ALGORITHMs. The most
notable ones being:

A Sequential Niche Technique for Multimodal Function Optimization,
Evolutionary Computation, 1(2) pp 101-125, 1993. Available from
ralph.cs.cf.ac.uk/pub/papers/GAs/seq_niche.ps

Reducing Epistasis in Combinatorial Problems by Expansive Coding, in
S. Forrest (ed), Proceedings of the Fifth International Conference on
Genetic Algorithms, Morgan-Kaufmann, pp 400-407, 1993. Available
from ralph.cs.cf.ac.uk/pub/papers/GAs/expansive_coding.ps

An Overview of Genetic Algorithms: Part 1, Fundamentals, University
Computing, 15(2) pp 58-69, 1993. Available from ENCORE (See Q15.3)
in file: GA/papers/over93.ps.gz or from
ralph.cs.cf.ac.uk/pub/papers/GAs/ga_overview1.ps

An Overview of Genetic Algorithms: Part 2, Research Topics,
University Computing, 15(4) pp 170-181, 1993. Available from ENCORE
(See Q15.3) in file: GA/papers/over93-2.ps.gz or from
ralph.cs.cf.ac.uk/pub/papers/GAs/ga_overview2.ps

THAT'S ALL FOLKS!

"And all our yesterdays have lighted fools the way to dusty death;
out, out brief candle; life's but a walking shadow;
a poor player that struts and frets his hour upon the stage;
and then is heard no more;
it is a tale; told by an idiot,
full of sound and fury,
signifying nothing."

--- Shakespeare, Macbeth

------------------------------

This FAQ may be posted to any USENET newsgroup, on-line service, or
BBS as long as it is posted in its entirety and includes this
copyright statement. This FAQ may not be distributed for financial
gain. This FAQ may not be included in commercial collections or
compilations without express permission from the author.

End of ai-faq/genetic/part6
***************************

--

Tim Tyler

Sep 21, 2000

David Beasley <David....@cs.cf.ac.uk> wrote:

: Q4.1 : Noted that FTP address for Larry Yaegers Polyworld is now unknown
: (does anyone know where it has gone to?)

ftp://alife.santafe.edu/pub/SOFTWARE/Polyworld/ works for me...
--
__________ Lotus Artificial Life http://alife.co.uk/ t...@cryogen.com
|im |yler The Mandala Centre http://mandala.co.uk/ UART what UEAT.