Evolutionary Evaluation


Dan Homan

Sep 9, 1997

I've been thinking for quite some time about writing an automated
evaluation tuner. Has anyone done this sort of thing? With
what kind of success?

My idea is to make the tuner evolutionary. Essentially, it will
behave very similarly to biological evolution. The idea is to
make a random (but usually small) change in one of the evaluation
parameters (chosen at random) and see if this change improves
the play of the program. If so, the change is kept - if not, the
change is discarded.
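The loop described here is (1+1) hill climbing. A minimal sketch in Python, assuming some external `fitness` measure of playing strength that the caller supplies (the function names are illustrative, not from any real program):

```python
import random

def tune(params, fitness, rounds=100, step=5):
    """(1+1) hill climbing: mutate one randomly chosen parameter by a
    small random amount; keep the change only if fitness improves."""
    best = fitness(params)
    for _ in range(rounds):
        i = random.randrange(len(params))         # pick a parameter at random
        old = params[i]
        params[i] += random.randint(-step, step)  # small random change
        new = fitness(params)
        if new > best:
            best = new       # improvement: keep the change
        else:
            params[i] = old  # no improvement: discard the change
    return params, best
```

Here `fitness` stands for whatever measure of playing strength problem (1) below settles on.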

In principle this is simple, but there are several problems, two
of which are...

1) How to decide if the program is playing better? Even 10
games is not really enough to tell if the change is a small
improvement. Maybe use a test suite? This has problems as
well. Ideally I'd like the testing method to be fast and
correlate well with over the board play.

2) Two (or more) changes together might be an improvement, but
any of them individually might not be an improvement. This suggests
that I should let the program (upon occasion) make more than one change
to the evaluation parameters without an improvement in play. The problem
with this is the increased complexity in evaluation of the changes.

I have two models for my idea now. The first is to write the
program such that it makes a change in parameters (one or more)
between every N games. If play (score) is improved from one set of
N games to the next set of N games, these changes made between
are kept.

The second model is to acquire a very large set of quiet positions,
each scored as a clear/marginal win or loss positionally. Then I
write a simple program that compares the result
of my positional evaluation function to the win/loss result
associated with the position. Changes in evaluation parameters are
only made between runs through the entire set of positions. The
decision to keep these changes is based on improvement (or lack
thereof) over the entire set of positions.
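The fitness measure for this second model could be as simple as the fraction of labeled positions on which the evaluation's sign agrees with the win/loss label. A sketch (the position and label encoding are invented for illustration):

```python
def agreement(evaluate, labeled_positions):
    """Fraction of labeled quiet positions on which the sign of the
    static evaluation matches the positional win(+1)/loss(-1) label."""
    hits = sum(1 for pos, label in labeled_positions
               if (evaluate(pos) > 0) == (label > 0))
    return hits / len(labeled_positions)
```

A tuner in this model would keep a parameter change only if `agreement` rises over the whole set.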

The second model has the advantage of being *very* fast - allowing
a very short "evolutionary" period for the evaluation function. The
advantage of the first method is that it will certainly correlate better
with over the board play.

- Dan

---------------------------------------------------------------------
Dan Homan d...@vlbi.astro.brandeis.edu
Physics Department d...@quasar.astro.brandeis.edu
Brandeis University ho...@binah.cc.brandeis.edu
---------------------------------------------------------------------
http://quasar.astro.brandeis.edu/BRAG/people/dch.html
---------------------------------------------------------------------


Robert Hyatt

Sep 9, 1997

Dan Homan (d...@vlbi.astro.brandeis.edu) wrote:

: I've been thinking for quite some time about writing an automated

: - Dan


Sounds similar to Samuel's "learning" checker program from many years
ago. He simply varied the weights (coefficients) on the various
evaluation terms, and let it play and learn which worked best...
there are *lots* of problems. For example, in Crafty there are
way over 500 things that could be adjusted like this. That would be quite
a task...

Jay Scott

Sep 9, 1997

Dan Homan (d...@vlbi.astro.brandeis.edu) wrote:
>I've been thinking for quite some time about writing an automated
>evaluation tuner. Has anyone done this sort of thing? With
>what kind of success?
>
>My idea is to make the tuner evolutionary. Essentially, it will
>behave very similarly to biological evolution. The idea is to
>make a random (but usually small) change in one of the evaluation
>parameters (chosen at random) and see if this change improves
>the play of the program. If so, the change is kept - if not, the
>change is discarded.

I have heard of some unsuccessful attempts, but they seemed to be
of the "let's see if this works" type rather than the serious
"I'm gonna make this work!" kind.

See <http://forum.swarthmore.edu/~jay/learn-game/methods/chess-ga.html>.

>1) How to decide if the program is playing better? Even 10
>games is not really enough to tell if the change is a small
>improvement. Maybe use a test suite? This has problems as
>well. Ideally I'd like the testing method to be fast and
>correlate well with over the board play.

I don't know any way that's both fast and accurate. I think
the previous attempts didn't get anywhere because they neglected
the accuracy, though there may be other causes too. I expect
that you're going to burn a lot of CPU time.

See <http://forum.swarthmore.edu/~jay/learn-game/inspire/tdga.html>
for one of my ideas about this.

>I have two models for my idea now. The first is to write the
>program such that it makes a change in parameters (one or more)
>between every N games. If play (score) is improved from one set of
>N games to the next set of N games, these changes made between
>are kept.

This sounds like a hillclimbing method which is about the same as
the one Arthur Samuel used in his famous checkers program. It can
work, but more sophisticated methods are known today.

There's a big literature on evolutionary algorithms. Here are a
couple starting points if you don't happen to have read up on
it already:

ftp://alife.santafe.edu/pub/USER-AREA/EC/Welcome.html
http://www.aic.nrl.navy.mil/galist/

If you prefer the hillclimbing method, read the paper on HC-Gammon
<http://forum.swarthmore.edu/~jay/learn-game/systems/hc-gammon.html>
for some valuable ideas. You might want to read that paper anyway--
it's a good one. The authors are Jordan Pollack, Alan Blair and Mark
Land.

>The second model is to acquire a very large set of quiet positions.
>With each of these positions scored as a clear/marginal win or loss
>positionally. Then I write a simple program that compares the result
>of my positional evaluation function to the win/loss result
>associated with the position. Changes in evaluation parameters are
>only made between runs through the entire set of positions. The
>decision to keep these changes is based on improvement (or lack
>thereof) over the entire set of positions.

You'll need an enormous test set if you want to avoid overfitting.
For the same reason, you might prefer to test on randomly chosen
subsets of the test set.

My guess is that this method won't easily produce good play, but
I invite you to prove me wrong. :-)

Here's another idea: using version 0 of your program (perhaps
something very simple like material+mobility), evaluate a large
number of positions (perhaps randomly chosen from games), using
perhaps 6-ply search. Then train version 1 of the program to
reproduce those evaluations as closely as possible while using
only 4-ply search. For the next round, create a new test set and
evaluate it using version 1 with 6-ply search, and so on. There are
many variations, but you get the idea.

This is similar to the plan used by the backgammon program Motif.
In artificial intelligence terms, you might say it's a batch method
of temporal difference learning, rather than a traditional
incremental method from reinforcement learning.
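If the evaluation is linear in its features, one round of this batch scheme is essentially a regression problem: fit the new version's weights so its shallow evaluation reproduces the previous version's deeper scores. A rough sketch in Python, ignoring the search itself and using invented feature vectors in place of positions:

```python
def fit_weights(features, targets, lr=0.05, epochs=1000):
    """One round of the batch scheme: fit a linear evaluation (dot
    product of weights and features) to the deeper search's scores,
    by plain stochastic gradient descent on squared error."""
    w = [0.0] * len(features[0])
    for _ in range(epochs):
        for x, t in zip(features, targets):
            # error of the shallow evaluation against the deep target
            err = sum(wi * xi for wi, xi in zip(w, x)) - t
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w
```

Each round would regenerate the targets with the newly fitted weights and a deeper search, as described above.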


Jay Scott <j...@forum.swarthmore.edu>

Machine Learning in Games:
http://forum.swarthmore.edu/~jay/learn-game/index.html

Dan Thies

Sep 10, 1997

On 9 Sep 1997 19:43:32 GMT, j...@forum.swarthmore.edu (Jay Scott)
wrote:

>I have heard of some unsuccessful attempts, but they seemed to be
>of the "let's see if this works" type rather than the serious
>"I'm gonna make this work!".
>
>See <http://forum.swarthmore.edu/~jay/learn-game/methods/chess-ga.html>.

Do you still have that stuff on your page from when I was playing
with it? I don't know if it was completely unsuccessful, I mean it
did do something, but I didn't start with reasonable parameters and
evolve from there, I started with a bad eval to see if it would move
on its own.

We ran it for a LONG time. The play did improve, but on a 68030 at
25 MHz, it took too long to play the games. I might try to find the
source and compile it on my P166+, and see what happens with about
10 times as many games getting played.

Dan

Vincent Diepeveen

Sep 10, 1997


>
>I've been thinking for quite some time about writing an automated
>evaluation tuner. Has anyone done this sort of thing? With
>what kind of success?

I'm currently taking a course on Artificial Neural Networks.
Very interesting.

For Diep I have about 15,000 adjustable parameters which can be tuned.
To express it in numbers of neurons: 40 layers, and my first
estimate would be 50,000 neurons (because of the huge number of
conditions a single adjustable parameter might have).

I almost forgot to say: you also need a few neurons which feed back
their output (so loops).

To test a program with a set of parameters you need about a hundred games.

A few hundred games on an autoplayer takes a long time, even for a
program that runs without problems on 5 autoplayers at the same time
(so 10 computers): at 60 moves in 3 minutes, that is about 5 games
every 6 hours.

Let's take the minimum: 20 games against each of 5 programs, so 100
games in total per test.
100 games = 6*100 = 600 hours.

To adjust 1 parameter you want to test it

a) a little higher
b) a little lower

In case of a considerable change in score at the autoplayer
you need another few tests.

Furthermore you want to change several parameters at the same time.

If you change open-file bonuses, you might want to change
half-open-file bonuses too.

I guess you see the problem.

Automated optimization is far too hard.

Even the most simplistic test is far too difficult.

15,000 parameters * 600 hours of computing time = 9,000,000 hours.
I probably can add a few zeros to that, as you need more than one
test per parameter.

Suppose you buy 1000 K6-233 computers and 500 autoplayers.

9,000,000 / 500 = 18,000 hours of computing time
for the basic optimization test (just trying to optimize
1 parameter, then the next, etc., without backtracking).

This computing time problem makes ALL artificial tuners worthless.

You can better do it by hand. A human is very good at
discriminating which parameter could be important if the program is
getting mated because it forgot to protect its king.

Sorry to post this, but for tuning you cannot use something clever.
Nor will it solve chess. It is hopeless...

I would love to make a neural network that plays chess, even if only
evaluating positions based on some knowledge I can put into it (using
patterns from my chess program).

But the main problem I already pointed out: the learning algorithm
needs too much system time.

After this calculation I was very disappointed in neural networks.
A neural network, and all random evolutionary tuners, just shift the
real problem to the computing time needed to adjust the parameters.

>My idea is to make the tuner evolutionary. Essentially, it will
>behave very similarly to biological evolution. The idea is to
>make a random (but usually small) change in one of the evaluation
>parameters (chosen at random) and see if this change improves
>the play of the program. If so, the change is kept - if not, the
>change is discarded.

>In principle this is simple, but there are several problems, two
>of which are...

>1) How to decide if the program is playing better? Even 10
>games is not really enough to tell if the change is a small
>improvement. Maybe use a test suite? This has problems as
>well. Ideally I'd like the testing method to be fast and
>correlate well with over the board play.

This is very easy. Play 20 games against each of 5 chess programs:
Rebel, Genius, Hiarcs, Kallisto, Nimzo.

If you score considerably better, you can be quite sure that you
have improved your program.

>2) Two (or more) changes together might be an improvement, but
>any of them individually might not be an improvement. This suggests
>that I should let the program (upon occasion) make more than one change
>to the evaluation parameters without an improvement in play. The problem
>with this is the increased complexity in evaluation of the changes.

This is also a hard problem.

>I have two models for my idea now. The first is to write the
>program such that it makes a change in parameters (one or more)
>between every N games. If play (score) is improved from one set of
>N games to the next set of N games, these changes made between
>are kept.

So I give you: N=100 is sufficient.

>The second model is to acquire a very large set of quiet positions.

This is a stupid idea. Base everything on the score the program gets
out of the games.

The last version of Diep (from the day before yesterday), 1.57.21,
had a bluff thing in it.

It reduces nodes everywhere, taking care that it searches deeper,
so it solves more positions. It plays terribly, however.

This evening I delete the bluff function forever.

Now you try to make a random evolutionary tuner that figures out,
like me but automated, that the bluff function, containing several
tens of parameters, is the cause of all the trouble, after only the
nearly 100 games that Jan Louwman played in just a few days!

>With each of these positions scored as a clear/marginal win or loss
>positionally. Then I write a simple program that compares the result
>of my positional evaluation function to the win/loss result
>associated with the position. Changes in evaluation parameters are
>only made between runs through the entire set of positions. The
>decision to keep these changes is based on improvement (or lack
>thereof) over the entire set of positions.

>The second model has the advantage of being *very* fast - allowing
>a very short "evolutionary" period for the evaluation function. The
>advantage of the first method is that it will certainly correlate better
>with over the board play.

Good luck with a 4 in a row program.

> - Dan
>
>---------------------------------------------------------------------
>Dan Homan d...@vlbi.astro.brandeis.edu
>Physics Department d...@quasar.astro.brandeis.edu
>Brandeis University ho...@binah.cc.brandeis.edu
>---------------------------------------------------------------------
> http://quasar.astro.brandeis.edu/BRAG/people/dch.html
>---------------------------------------------------------------------

Vincent
--
+----------------------------------------------------+
| Vincent Diepeveen email: vdie...@cs.ruu.nl |
| http://www.students.cs.ruu.nl/~vdiepeve/ |
+----------------------------------------------------+

Anders Holtsberg

Sep 10, 1997

>>I've been thinking for quite some time about writing
>>an automated evaluation tuner. Has anyone done this sort
>>of thing?

I think not. But I thought about the problem myself some time ago.

> ... 15000 adjustable parameters
> ... numbers of neurons: 40 layers
> ... 50,000 neurons

>After this calculation i was very dissappointed in neural networks.
>A neural network and all random evolutionary tuners shift the real
>problem to the necessary computing time needed to adjust it.

Yup. Let's do the simplest thing possible instead:
an experimental plan for a quadratic response surface.

>Furthermore you want to change several parameters at the same time.

Yes, an experimental design!

Let's calculate:

n parameters p(i) to vary. Quadratic response surface:

performance = const + sum_i a(i)*p(i) + sum_{i,j} a(i,j)*p(i)*p(j)

This needs at least 1 + n + n(n+1)/2 experiments.
Now the problem is that the experimental outcome is only
0/1 (=win/lose) so to predict a probability of winning we
need a bunch of games for any accurate response estimate,
as was said earlier:

> let's take the minimum: 20 games against 5 programs so 100 games totally
> a test. 100 games = 6*100 = 600 hours.

I would also guess that something like 100 is needed for a
logistic regression type analysis.

Let us use two tricks to bring the problem down to earth:

1) Use blitz games and hope that the parameters optimised
at search depth 4 are at least nearly optimal at search depth 8.

2) Let the program play against itself. We get results for
two experimental points at the same time and the outcome
is a difference between performances, and we don't need the
constant in the quadratic response surface. It also simplifies
the task of automating the computer runs. In order to reasonably
distribute the games while reducing randomness we start each run
of 100 games on the same 50 common openings and run two games for
each opening: black/white and white/black, so that we get rid of
the asymmetry.

So how many blitz games are needed?

n (n + n(n+1)/2)*100/2

10 3250
50 66250
100 257500
500 6287500
1000 25075000
5000 625375000

So if each game takes 5 minutes, then for 50 parameters
66250*5 minutes is about 230 days on one machine. For ten
parameters (3250*5 minutes) about 11 days should suffice.
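The counts above can be checked mechanically. A small sketch reproducing the experiment and game totals (assuming, as in the post, 100 games per design point and the halving from self-play):

```python
def design_points(n):
    """Minimum experiments to fit a full quadratic response surface in
    n parameters: 1 constant + n linear + n(n+1)/2 quadratic terms."""
    return 1 + n + n * (n + 1) // 2

def blitz_games(n, games_per_point=100):
    """Total self-play blitz games: self-play halves the game count,
    and the constant term drops out of the difference."""
    return (n + n * (n + 1) // 2) * games_per_point // 2
```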

Well, unfortunately we need at least a factor of 10 more
runs to get any estimate of the experimental errors,
but in our computer room we have 20 machines unused
over Christmas...

Perhaps the real value of the experiment would be that
we can say that some of the parameters could just as well
be zero so that the code belonging to those parameters
can be thrown out!

A pilot study with 10 important parameters and the rest fixed
would anyway be a rather interesting study in experimental design
and chess program parameter fine tuning. But as noted
before: even this little problem needs a LOT of computer time!


-- Anders Holtsberg

Komputer Korner

Sep 10, 1997

Again Vincent has made a BIG statement. Is it possible that he could
have 15,000 adjustable parameters? Assuming that it would take a day to
debug and test a new parameter fully, Vincent has put in 41 years just
for the parameter part of his program. Vincent isn't that old. If he
sleeps 4 hours a night, then in his 20 hours per day he could test and
debug, let us be optimistic here, 5 parameters per day; then Vincent
would only need 8 years of parameter building, but this gives him no
time to eat. Can we have comments from the programmers of some of the
more knowledge-laden programs? HOW MANY PARAMETERS DO THEIR PROGRAMS
HAVE?
--
- -
Komputer Korner

The inkompetent komputer

If you see a 1 in my email address, take it out before replying.
Please do not email both me and the r.g.c.c. at the same time. I read all
the postings on r.g.c.c.

Robert Hyatt

Sep 10, 1997

Komputer Korner (kor...@netcom.ca) wrote:
: Again Vincent has made a BIG statement. Is this possible that he could have

: 15,000 adjustable parameters? Assuming that it would take a day to debug
: and test a new parameter fully, then Vincent has put in 41 years just for
: the parameter part of his program. Vincent isn't that old. If he sleeps 4
: hours a night then for his 20 hours per day maybe he could test and debug
: let us be optimistic here, 5 parameters per day, so then Vincent would only
: need 8 years of parameter building, but this gives him no time to eat. Can
: we have comments from some of the more knowledge laden programs'
: programmers? HOW MANY PARAMETERS DO THEIR PROGRAMS HAVE?
: --
: - -
: Komputer Korner

15,000 is pure hyperbole. I might buy it if you count silly things like
each square of a piece/square table as one value, which makes little
sense... or each pattern you recognize as something (i.e. this pawn can
run)... I thought DB's 1,500 or whatever the number was quite large...

15,000 eval terms would mean at least 30,000 lines of eval code, again
unless we are talking about large arrays used to scale things. Or if
you count by saying I have 100 in this array, and 100 in that array,
and I multiply them together based on different things, so I really
have 10,000 terms. That is a tad optimistic of course. :)


: The inkompetent komputer

Marcel van Kervinck

Sep 10, 1997

Komputer Korner (kor...@netcom.ca) wrote:
> Again Vincent has made a BIG statement. Is this possible that he could have
> 15,000 adjustable parameters? Assuming that it would take a day to debug
> and test a new parameter fully, then Vincent has put in 41 years just for
> the parameter part of his program. Vincent isn't that old. If he sleeps 4
> hours a night then for his 20 hours per day maybe he could test and debug
> let us be optimistic here, 5 parameters per day, so then Vincent would only
> need 8 years of parameter building, but this gives him no time to eat. Can
> we have comments from some of the more knowledge laden programs'
> programmers? HOW MANY PARAMETERS DO THEIR PROGRAMS HAVE?

This question cannot be answered. What one considers 1 parameter,
another will implement slightly differently and count as 64 parameters.
One example: say you have piece values Q=9 R=5 B=3 N=3 P=1; that
would make 5 parameters. Another programmer decides to speed up the
computation of the material balance and introduces a giant lookup
table, indexed by some number that represents the set of pieces (or
'bag', for that matter). (This is an old idea; Belle did it in
hardware.) He has just created thousands of parameters to tune...
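To make the point concrete, here is a sketch of how a handful of piece values become hundreds of table entries once material balance is precomputed. The index scheme is invented for illustration, not Belle's actual hardware scheme:

```python
from itertools import product

# Piece values for one side: the 5 "parameters" (hypothetical caps
# keep the lookup table small).
VALUES = {'Q': 9, 'R': 5, 'B': 3, 'N': 3, 'P': 1}
CAPS   = {'Q': 1, 'R': 2, 'B': 2, 'N': 2, 'P': 8}

def material_index(counts):
    """Pack capped piece counts into a single table index (mixed radix)."""
    idx = 0
    for piece in 'QRBNP':
        idx = idx * (CAPS[piece] + 1) + min(counts[piece], CAPS[piece])
    return idx

# Precompute material for every reachable index: 5 piece values have
# become a table of 2*3*3*3*9 = 486 individually tunable entries.
TABLE = {}
for q, r, b, n, p in product(range(2), range(3), range(3), range(3), range(9)):
    counts = {'Q': q, 'R': r, 'B': b, 'N': n, 'P': p}
    TABLE[material_index(counts)] = sum(VALUES[k] * counts[k] for k in VALUES)
```

Whether that table counts as 5 parameters or 486 is exactly the ambiguity being described.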

Search depth can be measured and compared, but that doesn't mean
much. Nps can be measured and compared, but that means less.
Number of parameters can be measured and compared, but that's just
hot air.

Marcel
-- _ _
_| |_|_|
|_ |_ Marcel van Kervinck
|_| mar...@stack.nl

Vincent Diepeveen

Sep 11, 1997

In <01bcbe1f$21f80fe0$865ab5cf@ALAN> "Komputer Korner" <kor...@netcom.ca> writes:

>Again Vincent has made a BIG statement. Is this possible that he could have
>15,000 adjustable parameters? Assuming that it would take a day to debug
>and test a new parameter fully, then Vincent has put in 41 years just for

You don't know a thing about programming, KK, so shut up.

Otherwise you would know that if you use an array per parameter,
you may already multiply these parameters by 64.

Just look at piece-square tables: 6 * 64 * 2 = 768
adjustable parameters. So every program already STARTS with 768
adjustable parameters!

I also use a lot of parameters dependent on the rank ==> multiply
these by 8. I also have a lot of different game stages, another nice
multiplier for adjustable parameters.

>the parameter part of his program. Vincent isn't that old. If he sleeps 4
>hours a night then for his 20 hours per day maybe he could test and debug
>let us be optimistic here, 5 parameters per day, so then Vincent would only
>need 8 years of parameter building, but this gives him no time to eat. Can
>we have comments from some of the more knowledge laden programs'
>programmers? HOW MANY PARAMETERS DO THEIR PROGRAMS HAVE?

You will never understand the inside of chess programs.

Who are you, that you keep on writing these things? A fired reporter?
Above you ask a totally different question than your criticism of me.

You should ask: how many ADJUSTABLE parameters do their programs have?

There is a big difference between a parameter and an adjustable
parameter. For a neural network the number of adjustable parameters
is important.

For me as a human the number of parameters is more important.
Array lookups avoid a difficult formula, however, and can
discriminate a little better.

I am definitely not trying to adjust each new parameter until it gives
the best result; that indeed would take too long.

This is, for example, the reason that things went a little wrong at
the Dutch computer championship '96. Against Arthur, Diep played the
awful Nh3??, which was caused by a + sign instead of a - sign for
putting a knight in front of a free pawn!

I'm currently following an artificial neural network course, and it
seems that NNs aren't going to do the adjustment job for me because of
the huge number of test games you need.

Vincent.

>--
>- -
>Komputer Korner
>

>The inkompetent komputer
>
>If you see a 1 in my email address, take it out before replying.

I prefer taking out the entire email address.

>Please do not email both me and the r.g.c.c. at the same time. I read all
>the postings on r.g.c.c.
>
>

Dave

Sep 11, 1997

On 11 Sep 1997 11:02:20 GMT, vdie...@cs.ruu.nl (Vincent Diepeveen)
wrote:

>In <01bcbe1f$21f80fe0$865ab5cf@ALAN> "Komputer Korner" <kor...@netcom.ca> writes:
>
>>Again Vincent has made a BIG statement. Is this possible that he could have
>>15,000 adjustable parameters? Assuming that it would take a day to debug
>>and test a new parameter fully, then Vincent has put in 41 years just for
>
>You don't know a thing about programming KK, so shut up.
>

It's hard to accept your statements' validity, Vincent. You don't give
enough details to back them up. You must admit, 15,000 adjustable
parameters, based on the small amount of information in your original
post, is tough to believe. Not to say it isn't true, though; it could
be.

No need, though, to jump on KK. He's just expressing what he does know
about: the time needed to properly test (in a conventional way) chess
software parameters, which must then be adjusted and re-tested. You
may test in a different way, Vincent, but you need to say that in your
post.

Congratulations on the strength of your program, Diep, and good luck
in your games with Crafty. As these strong chess programs continue to
be refined and migrate to faster CPUs, with more and faster memory,
etc., I know anyone playing against them has their work cut out!

All micro systems still appear sub-GM, IMHO, but that will continue to
change. (300 MHz PCs were just tested in PC Mag!)

Regards,
Dave


Guido Stepken

Sep 11, 1997

Komputer Korner wrote:
>
> Again Vincent has made a BIG statement. Is this possible that he could have
> 15,000 adjustable parameters? Assuming that it would take a day to debug
> and test a new parameter fully, then Vincent has put in 41 years just for
> the parameter part of his program. Vincent isn't that old. If he sleeps 4
> hours a night then for his 20 hours per day maybe he could test and debug
> let us be optimistic here, 5 parameters per day, so then Vincent would only
> need 8 years of parameter building, but this gives him no time to eat. Can
> we have comments from some of the more knowledge laden programs'
> programmers? HOW MANY PARAMETERS DO THEIR PROGRAMS HAVE?

Why not? This is easily possible. Just don't give a single piece a
value, but always a pair or a triple (with the King). That's what I
did in my chess program. It made wide use of RAM value tables. For me
it's a big step forward, since many, many static evaluations depend on
a minimum of 2 pieces attacking effectively together.
I store

[piece][p1][p2][threat list][value (depends on threat list and attacked
enemy pieces)]

It normally brings 1 ply more, but also makes it very slow :(((

cu, Guido Stepken

brucemo

Sep 12, 1997

Vincent Diepeveen wrote:
>
> In <Pine.ULT.3.96.970909...@vlbi.astro.brandeis.edu> Dan Homan <d...@vlbi.astro.brandeis.edu> writes:

> >I have two models for my idea now. The first is to write the
> >program such that it makes a change in parameters (one or more)
> >between every N games. If play (score) is improved from one set of
> >N games to the next set of N games, these changes made between
> >are kept.
>
> So i give you: n=100 is sufficient.

There is no magic number that is sufficient. If the result is
lop-sided enough, you can make a determination with far fewer than 100
games, but if you are trying to measure subtle changes, even 1000 (or
more) may not be enough to determine for sure that you have made an
improvement.
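This can be quantified: a match score has standard error sqrt(p(1-p)/N), so small edges need many games. A rough sketch using a normal approximation and ignoring draws:

```python
import math

def games_needed(edge, z=1.96):
    """Approximate number of games to detect a winning-probability edge
    over 50% at ~95% confidence: the score's standard error is at most
    sqrt(0.25/N), so we require z * sqrt(0.25/N) < edge."""
    return math.ceil((z / (2.0 * edge)) ** 2)
```

Detecting a 55% player (edge 0.05) takes roughly 400 games; a 51% player takes roughly 10,000. Draws, which this ignores, reduce the variance somewhat.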

It depends upon how sure you want to be. You can avoid doing any testing
at all. You can make a new version, then flip a coin, and if it comes up
heads your new version is better. Or you can do a ten-game match, then
flip a coin, if that makes you feel better.

bruce

Jay Scott

Sep 12, 1997

In article <3415e597...@snews.zippo.com>, Dan Thies (rt...@wt.net) wrote:
>On 9 Sep 1997 19:43:32 GMT, j...@forum.swarthmore.edu (Jay Scott)
>wrote:
>
>>I have heard of some unsuccessful attempts, but they seemed to be
>>of the "let's see if this works" type rather than the serious
>>"I'm gonna make this work!".
>>
>>See <http://forum.swarthmore.edu/~jay/learn-game/methods/chess-ga.html>.
>
>Do you still have that stuff on your page from when I was playing
>with it? I don't know if it was completely unsuccessful, I mean it
>did do something, but I didn't start with reasonable parameters and
>evolve from there, I started with a bad eval to see if it would move
>on its own.

Yep, that web page consists of a couple messages from rgcc back in
February 1996, by John Stanback and you, which say a bit about what
was done and how it worked out. You might say the experiment was a
success because it was interesting enough to write about, however
briefly!

Jay Scott

Sep 12, 1997

In article <34196121...@anu.edu.au>, Andrew Tridgell (Andrew....@anu.edu.au) wrote:
>If you want a simple demo that this can work in chess then take a
>look at WimpKnight on FICS. WimpKnight started out with all its
>evaluation coefficients (about 600 of them) set to 0. It then started
>automated learning by playing on FICS. It reached a 2100 rating in
>3 days. The first few games it played were hilarious!
>
>The learning algorithm we use is a well known one. It just
>needed to be applied to chess in the right way with the right
>error function.
>
>Jon Baxter (who wrote the learning code) is writing this up as
>a paper over the next week or so. Jon wants to keep the details
>under wraps until he has submitted the paper for publication (quite
>a common practice in academic circles)
>
>Once its submitted we will release a new version of the KnightCap
>source code with the learning algorithm included.

Way, way cool! Be sure to announce the new version of KnightCap on
this group--I don't want to miss it. Will Jon Baxter be putting a
version of the paper online? This definitely gets mention on Machine
Learning in Games.

James Long

Sep 12, 1997

Graham, Douglass wrote:
>
> My Thoughts On Tuning Large Neural Networks
> ======================================
>
> Firstly, I hold up my hands and admit I'm not an expert in this field. But
> imagine how a typical radio phone-in would sound after a soccer match if experts
> weren't allowed to have their say:
>
<SNIP>
>
> Anyway, I think it must be possible to tune NNs with large numbers of
> "parameters" very quickly. The example is human beings. Look how much knowledge
> we acquire during the first few years of life! And this with very suboptimal NN
> training. Many NN "parameters" are clearly being tuned simultaneously.
>
> Now, leaving NN's aside, if I had a large number of parameters to tune (be that
> 500 or 15,000) I would go about it by obtaining a database of high quality
> games.
>
> Then, I would program my system to discover, for each parameter against each
> move, if there are limits to what that parameter could possibly be.
>

Excellent idea! Maybe an extension of that would be: what are the
limits of parameter X versus a particular player? Some programs
"learn" by saving what moves work/don't work, but "learning" to
self-tune eval parameters (especially player-specific) would be
*really* interesting IMHO.

> This is clearly not perfect - I would get a lot of instances of parameter range
> incompatibility.
>

> What might be useful is, for a given move, to provide a graphical illustration
> (in colour, of course!) to show me visually what each parameter's relative
> effect was on my program's failure to select the move the GM selected. Again, I
> would expect this to give me good tuning information very quickly.
>
> Don't tell me that this "visual graphic" system is impractical. Look at the
> range of graph types that spreadsheets can produce!
>
> In short, I believe that tuning a large number of parameters is far from
> impossible - though it may be necessary to accept that the final outcome may not
> be the best possible.

Seems the problem(s) would be (1) how many games are enough....tough
to say, and (2) effect on speed of evaluation.

BTW, good *chess programming* thread.

James

Andrew Tridgell

Sep 13, 1997

Vincent Diepeveen wrote:
> ...

> This computing time problem makes ALL artificial tuners worthless.
> ...
[ stuff deleted that tries to estimate time to auto-tune]


Rubbish. Your calculations totally misrepresent how gradient descent
algorithms work when the number of free parameters is large. It is
quite common to perform such optimisations with tens of thousands
of parameters and get very good results.

The thing you are missing is the fact that each test does not update
a single parameter, it updates all of them. They are updated in
proportion to the partial derivative of an error function with
respect to each parameter.

The trick (and the real reason automated learning hasn't done
well for chess in the past) is to find a sensible error function,
and work out its derivative. Once you have this there are any
number of well known algorithms for optimising the parameters.

If you want a simple demo that this can work in chess then take a
look at WimpKnight on FICS. WimpKnight started out with all its
evaluation coefficients (about 600 of them) set to 0. It then started
automated learning by playing on FICS. It reached a 2100 rating in
3 days. The first few games it played were hilarious!

The learning algorithm we use is a well known one. It just
needed to be applied to chess in the right way with the right
error function.

Jon Baxter (who wrote the learning code) is writing this up as
a paper over the next week or so. Jon wants to keep the details
under wraps until he has submitted the paper for publication (quite
a common practice in academic circles).

Once it's submitted we will release a new version of the KnightCap
source code with the learning algorithm included.

Meanwhile you will just have to believe me that automated
evaluation learning can work quite well.

Andrew

Andrew Tridgell

Sep 14, 1997

Jay Scott wrote:
> Way, way cool! Be sure to announce the new version of KnightCap on
> this group--I don't want to miss it. Will Jon Baxter be putting a
> version of the paper online? This definitely gets mention on Machine
> Learning in Games.

yep, I'll announce it. A new release of the source code is in fact way
overdue, but we've been holding off till Jon gets the paper written.

We'll also make a copy of the paper available online. It will be a short
paper (more like a letter) just to "get it out" then we hope to do
a longer paper after that. People who want all the gory details
can read the source code :-)

Andrew

Dan Thies

Sep 17, 1997

On 10 Sep 1997 15:28:07 EDT, "Komputer Korner" <kor...@netcom.ca>
wrote:

>Again Vincent has made a BIG statement. Is this possible that he could have
>15,000 adjustable parameters? Assuming that it would take a day to debug
>and test a new parameter fully, then Vincent has put in 41 years just for
>the parameter part of his program. Vincent isn't that old. If he sleeps 4
>hours a night then for his 20 hours per day maybe he could test and debug
>let us be optimistic here, 5 parameters per day, so then Vincent would only
>need 8 years of parameter building, but this gives him no time to eat. Can
>we have comments from some of the more knowledge laden programs'
>programmers? HOW MANY PARAMETERS DO THEIR PROGRAMS HAVE?

12,923 rules and counting..... I wouldn't call them parameters,
though. A parameter is a number you don't have to think about very
much. A production rule can take an hour, or it can be auto-written.
I personally wrote about 3500 of the rules in Knowchess, and the
others were auto-written, and are basically shortcuts to places those
3500 would have gone anyway.

Dan (Knowchess)

Dan Thies

Sep 19, 1997

On 10 Sep 1997 11:14:48 GMT, vdie...@cs.ruu.nl (Vincent Diepeveen)
wrote:

>I'm currently busy with a course Artificial Neural Networks.
>Very interesting.

Maybe when you finish the course you'll understand a little more?

>For Diep i have about 15000 adjustable parameters which can be tuned.
>To express it in numbers of neurons: 40 layers, my first
>estimate would be 50,000 neurons (because of the huge number of
>conditions a single adjustable parameter might have).
>
>I almost forgot to say: you need also few neurons which get back their
>output (so loops).
>
>To test a program with a set of parameters you need about a hundred games.

The whole point of neural networks is that you don't completely test
every value. That's brute force searching.

I used evolving networks and did, as I recall, around 100 games per
generation. The games took generally less than a minute
to complete. A full generation took about two hours. 10 generations
a day, more or less, and it ran for a couple months, in the
background.

>You better can do it by hand. You are very good in
>discriminating what parameter could be important if your program is
>getting mated because it forgot to protect its king.

Results of games are also very good in discriminating what's
important.

KC has a "king safety" neural net that's very well tuned, and does
far better than anything I could hand-code.


The rest of your comments, Vincent, simply reflect how much you have
to learn on the subject. Not surprising that you have an opinion,
though.

Dan (Knowchess)

Dave Fotland

Sep 19, 1997

The Many Faces of Go has about 10,000 adjustable parameters. It's a go
program, so it has to be much more knowledge intensive than any chess
program. Since the parameters represent knowledge directly, I spend
almost no time tuning them. 2/3 probably still have the same value that
I assigned to them when I created the parameter. When I create a new
parameter, I usually spend about 30 seconds assigning the value, no need
for a day of debugging. The knowledge is organized in a way that I can
easily adjust many parameters when there is some problem.

The program is still weaker than I am, so I can input my own knowledge
directly. When the program is stronger than the author, I expect there
will be a lot more time tuning parameter values.

David Fotland

Komputer Korner (kor...@netcom.ca) wrote:
: Again Vincent has made a BIG statement. Is this possible that he could have
: 15,000 adjustable parameters? Assuming that it would take a day to debug
: and test a new parameter fully, then Vincent has put in 41 years just for
: the parameter part of his program. Vincent isn't that old. If he sleeps 4
: hours a night then for his 20 hours per day maybe he could test and debug
: let us be optimistic here, 5 parameters per day, so then Vincent would only
: need 8 years of parameter building, but this gives him no time to eat. Can
: we have comments from some of the more knowledge laden programs'
: programmers? HOW MANY PARAMETERS DO THEIR PROGRAMS HAVE?

: --
: - -
: Komputer Korner

: The inkompetent komputer

: If you see a 1 in my email address, take it out before replying.

: Please do not email both me and the r.g.c.c. at the same time. I read all
: the postings on r.g.c.c.

--
David Fotland fot...@cup.hp.com

Simon Read

Sep 22, 1997

GD: Graham Douglass
GD> My Thoughts On Tuning Large Neural Networks
GD> ======================================
GD> Anyway, I think it must be possible to tune NNs with large numbers of
GD> "parameters" very quickly.
-->
Yes, for some value of "quick". I'm currently working on it. No
questions for now, this newsgroup will be the first to know, etc.
Real Soon Now. (!)

GD> The example is human beings. Look how much knowledge
GD> we acquire during the first few years of life! And this with very
GD> suboptimal NN training. Many NN "parameters" are clearly being
GD> tuned simultaneously.
-->
I wouldn't be so quick to say it's very suboptimal. No-one yet knows
all that happens in a human brain. I've studied it a little and it's
pretty impressive. There are not only many sub-systems, there are
many different types of neural networks there. The phrase "incredibly
complicated" springs to mind. Personally, I think it's all very well
tuned.

GD> Now, leaving NN's aside, if I had a large number of parameters to tune
GD> (be that 500 or 15,000) I would go about it by obtaining a database of
GD> high quality games.
-->
I would combine that with the chess network playing by itself so
that it learns what it is capable of. There's something called
temporal difference learning (A.L.Samuel's checkers program used
a form of this) where the program learns not just the outcome of
the game, but learns what the evaluation is one move ahead. Thus
it learns some parameter information from every move.


GD> Don't tell me that this "visual graphic" system is impractical. Look
GD> at the range of graph types that spreadsheets can produce!
-->
From my nightmarish memories of trying to use Microsoft Excel once, I
do remember that it could very easily produce the graph _IT_ wanted
to produce, but had great difficulty in producing the graph _I_ wanted
it to produce. It was very flashy and all that, yes there was a great
range of graphs, but I encountered great difficulty even with a
two-dimensional graph. Now imagine the difficulties a PC spreadsheet
would give you if you wanted 500 dimensions, or 15,000 dimensions,
which is more realistic.


Simon

s.read AAAAAT cranfield dot ac dot uk

Simon Read

Sep 22, 1997

AT: Andrew Tridgell <Andrew....@anu.edu.au> wrote:
AT> If you want a simple demo that this can work in chess then take a
AT> look at WimpKnight on FICS. WimpKnight started out with all its
AT> evaluation coefficients (about 600 of them) set to 0. It then started
AT> automated learning by playing on FICS. It reached a 2100 rating in
AT> 3 days. The first few games it played were hilarious!
-->
Fascinating. I'll go to FICS instantly and finger it.

Have you examined some of the coefficients to see what they say?
Presumably that includes simple things like values of the pieces
and a whole lot of more subtle things?

Once you can release details, can you tell us if there were any
surprises in the coefficients it chose?


Simon s read cranfieldacuk


Andrew Tridgell

Sep 23, 1997

Simon Read wrote:
> Have you examined some of the coefficients to see what they say?

yes :-)

> Presumably that includes simple things like values of the pieces
> and a whole lot of more subtle things?

Yes, there are approximately 600 variables in KnightCap, although
the number of "independent" variables is really much smaller as there
are several arrays of variables.

All of these variables basically code for the value of a feature.
The features themselves were created by hand in much the same way
as all traditional chess programs. For example, there is a variable
that says how much having connected rooks is worth.

WimpKnight only learnt the values of these variables, it didn't
learn to create any new features (that will be another, much harder,
project).



> Once you can release details, can you tell us if there were any
> surprises in the coefficients it chose?

There were some surprises. The surprises were useful because they
helped to point out bugs in the feature generation. For example,
we got a surprising value for some of the weak pawns coefficients
and this turned out to be because I had made a mistake in the code
for weak pawns. The code worked OK for white but got it horribly
wrong for black.

We also got a surprising value for the "pawns on opposite coloured
squares to bishops" variable. It turned out negative! We still don't
understand that one.

The problem may be that Jon Baxter and I are really rotten chess
players (maybe around 1500 strength) so we probably just don't
understand the finer points of chess :-)

Cheers, Andrew

SMO CFI

Sep 23, 1997

andrew,

Was KnightCap an outgrowth of a working program? In other words, did you
take your existing program and modify it so it would change its eval as it
went along?

By the way, I wonder if you could provide me with a simple explanation of
how to get my program to talk with ICC. I know you just send ascii command
data, and I know what the commands are, but do you use a telnet program?

Any help would be appreciated.

Will


SMO CFI

Sep 23, 1997

Andrew,

I said KnightCap, I meant WimpKnight, or whatever it is.

Will


Komputer Korner

Sep 23, 1997

YES

--
- -
Komputer Korner

The inkompetent komputer

If you see a 1 in my email address, take it out before replying.
Please do not email both me and the r.g.c.c. at the same time. I read all
the postings on r.g.c.c.

SMO CFI <smo...@aol.com> wrote in article
<19970923192...@ladder02.news.aol.com>...

Andrew Tridgell

Sep 26, 1997

> andrew,
>
> Was KnightCap an outgrowth of a working program? In other words, did you
> take your existing program and modify it so it would change it's eval as it
> went along?

KnightCap is not based on any other chess program. It was written
from scratch and does quite a few things differently from other
programs.

I started it in February of this year. It actually started as an
exercise in 3D graphics: I wrote a 3D-rendered chess interface
using OpenGL and Glut. I then thought that it might be nice to write
a chess engine that works with the interface :-)

The first few versions of KnightCap did not have learning at all.
I then added book-learning (more extensive than those I have heard
of in other programs) and Jon Baxter added eval learning. I made
the necessary changes (quite a few) to the chess engine to support
Jon's work.

> By the way, I wonder if you could provide me with a simple explanation of
> how to get my program to talk with ICC. I know you just send ascii command
> data, and I know what the commands are, but do you use a telnet program?

KnightCap is unix-based so I use the simple method of creating a pipe
pair with pipe(), then using close() and dup() to set up stdin etc.,
and fork()/exec() to run a user-specified program (usually telnet or
timeseal). See prog.c in the sources for details.

If your program is unix-based then you could also look at the excellent
RoboFICS package. I didn't use RoboFICS because I like to "roll my
own" as I enjoy the programming more than the final result.

Andrew
