A few weeks ago Marty Hirsch (author of Mchess5) wrote:
"Opening preparation against commercial opponents matters somewhat, but
not as much as one might expect, because an SSDF rating is based on
hundreds of games against at least twenty opponents."
I replied that it matters AT LEAST 100 ELO points on SSDF.
I have also sent my reply to Marty's email address so he couldn't miss
my comments. So far I have not received any reply from Marty, neither here
nor in RGCC.
In the meantime I have started several matches between Mchess5 and a
few chess programs to find out the REAL impact of killer books.
I find the results very shocking!
Here are the results against:
Mchess5 - Genius3 (currently no. 1 on SSDF ELO 2420) 7.5 - 2.5
Mchess5 - Rebel6 (currently no. 3 on SSDF ELO 2415) 13.0 - 1.0
Mchess5 - Hiarcs3 (currently no. 9 on SSDF ELO 2380) 19.0 - 0.0
According to the HIGH ratings of Genius3, Rebel6 and Hiarcs3 these
results are IMPOSSIBLE in normal play (without book traps).
I think you all now can see the impact of killer lines and maybe you
understand my feelings better and my aversion against cooked books.
To make it clearer I have compiled statistics where you can see the
move at which Mchess5 left the book, but pay special attention to the first
score: Mchess5 most often (if not every time) comes out of book with a
TOTALLY WON position!
At Aegon 1994 Sandro Necchi (the Mchess5 book editor) very openly stated:
"With Mchess5 we will book out all programs and we will be the new
no.1 on SSDF".
Sandro also explained to Jeroen Noomen (the Rebel book editor) how he
prepares the book cooking against competing computer opponents like Genius,
Hiarcs and Rebel, the main competitors for first place on SSDF:
"A specific opening line is chosen; Sandro watches the play of
Genius3, Rebel etc. When Genius or Rebel makes a mistake, Sandro takes
advantage of that and puts this WON line in the Mchess5 book."
One of the MANY examples of Sandro's work is the following game
between Mchess5 and Genius3:
1.e4 e5 2.Bc4 Nf6 3.d4 exd4 4.Nf3 Nxe4 5.Qxd4 Nf6 6.Bg5
Be7 7.Nc3 c6 8.O-O-O d5 9.Qh4 Be6 10.Rhe1 h6 11. Bd3 O-O
12.Bxh6 Ne4 13.Qh5 g6 14. Qe5 Bf6 15.Qf4 Nxc3 16.Rxe6 fxe6
17.Qg4 g5 18.Nxg5 Kh8 19.Qh5 Nxa2+ 20.Kb1 Nc3+ 21.bxc3 Qb6+
22.Kc1 Qb2+ 23. Kxb2 Bxc3+ 24.Kb3 Nd7 25.Bxf8+ Kg8 26.Qf7+
Kh8 27.Qh7# 1-0
The error is of course 11...O-O
After that move Black is lost.
It's obvious that 11...O-O is not theory at all.
Still the Mchess5 book continues till move 19 and Mchess even announces
a mate!! Sandro found a weak point in Genius and added the 11...O-O??
trap to the Mchess book.
The trap also works on other chess programs.
Here are the statistics:
Please note that the games are played in the same way the SSDF plays them, so you
will see many duplicates.
--------------------------------------------------------------------------
--
Match Mchess5 - Genius3
Level 40 in 2:00
Machine 2 x P90 (identical)
Move = The move number where Mchess5 left the book.
Score = The score of the first Mchess5 move after leaving the book.
Game Move Score Result
---- ---- ----- ------
1 25 + 3.21 1-0
2 25 + 3.21 1-0
3 27 + 0.55 draw
4 46! + 3.26 draw
5 24 +11.53! 1-0
6 18 +11.04! 1-0
7 18 + 0.14 draw
8 18 +11.04 1-0
9 23 + 0.66 0-1
10 19 + Mat9!!! 1-0
Mchess5 - Genius3 7.5 - 2.5
-------------------------------------------------------------
Match Mchess5 - Hiarcs3
Level 40 in 2:00
Machine 2 x P90 (identical)
Move = The move number where Mchess5 left the book.
Score = The score of the first Mchess5 move after leaving the book.
Game Move Score Result
---- ---- ----- ------
1 28 +14.32 1-0
2 25 + 8.53 1-0
3 24 + 8.80 1-0
4 18 + 1.01 1-0
5 24 + 8.53 1-0
6 25 + 2.20 1-0
7 29 + 2.20 1-0
8 29 + 7.33 1-0
9 35 + 7.33 1-0
10 35 + 7.33 1-0
11 25 + 8.53 1-0
12 29 + 2.20 1-0
13 35 + 7.33 1-0
14 35 + 7.33 1-0
15 35 + 7.33 1-0
16 35 + 7.33 1-0
17 25 + 8.53 1-0
18 18 + 1.01 1-0
19 25 + 8.53 1-0
Mchess5 - Hiarcs3 19 - 0 I find this unacceptable.
-------------------------------------------------------------------
Match Mchess5 - Rebel6
Level 40 in 2:00
Machine 2 x P90 (identical)
Move = The move number where Mchess5 left the book.
Score = The score of the first Mchess5 move after leaving the book.
Game Move Score Result
---- ---- ----- ------
1 16 + 0.33 1-0
2 16 + 1.29 1-0
3 19 + Mat8!!! 1-0
4 16 + 7.45!! 1-0
5 28 + 1.49 1-0
6 19 + Mat8!!! 1-0
7 17 + 0.90 1-0
8 19 + Mat8!!! 1-0
9 13 + 0.00 draw
10 19 + Mat8!!! 1-0
11 16 + 1.29 1-0
12 19 + Mat8!!! 1-0
13 17 + 0.88 1-0
14 28 + 1.39 draw
Mchess5 - Rebel6 13 - 1 Also unacceptable
--------------------------------------------------------------------------
I have posted all the games in PGN format on a new subject here in RGCC
in case anybody wants to check them. Just replay the games with Mchess5
and watch the cooked book lines.
Coming to the GOAL of this posting:
- Is this the future of computer chess?
- Spending months of our time on cooked books to get a good rating on
SSDF?
- Should the programmers of Genius, Hiarcs and Rebel do the same?
I obviously prefer to spend my time on improving the chess engine of Rebel
rather than spending months of my time looking for weak points in other
chess programs and adding totally won lines to the Rebel opening book!
Personally I find this behavior disgusting since it hides the truth of the
real playing strength of a chess program.
But I really wonder if I have any choice left!
What to do?
Comments are *VERY* welcome because I want to know what you all think
about this subject.
I mean if nobody really cares why should I care any longer?
Just confused and worried.
- Ed Schroder -
Ed Schröder <rebc...@xs4all.nl> wrote in article
<53ting$n...@news.xs4all.nl>...
Personally, I believe this is a problem ...not only for consumers, but
programmers and the SSDF. I like to play computers vs computers ..but
with killer books...that's not my idea of fun..basically what you are doing
is just documenting the fact that the killer book author has found a
weakness in another program's opening book...and the SSDF will count the
duplicate games over and over....I think the SSDF needs to get away from
this autoplaying....start picking positions at random from GM games ..and
have the programs play each other from both sides with
books turned off from this random start ...otherwise the SSDF is just
wasting its time contriving ratings that have absolutely no
credibility...unless they are interested in telling the world which program
has the best *cooked* book...*cooking* books will not take computer chess
to the next level...in fact, the end result will be wasted resources
...once an author knows one of his lines is cooked..he will simply modify
the line and we'll be back to ground zero in improving chess
programs....it's really up to the SSDF to take the initiative to discourage
the book *cooking* ..and they can by changing the method of how they
play the programs...but I doubt you will see any changes....because my
perception is that they are not truly independent from all the programmers
....if they were independent they would recognize that this is a problem
and do something about it...we shall see...I hope my perception is wrong
The answer is simple.
Apply a learning function to your program so that it avoids
opening lines that it loses.
Weight the function so that avoidance is stronger the more
recently the game was played.
So the 'cooked' program will win some games where your book is
busted, but your learning function will 'unbust' your book; and
you'll be back to playing random games again.
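A minimal sketch of what such a learning function might look like (purely illustrative; the data structure, constants and helper names are my own assumptions, not CSTal's or any commercial program's actual code): penalize every book move played in a lost game, let the penalty decay with age so the most recent losses weigh most, and skip book moves whose accumulated penalty has grown too large.

import time

DECAY_HALF_LIFE = 30 * 24 * 3600   # penalties halve after roughly 30 days
LOSS_PENALTY    = 1.0
AVOID_THRESHOLD = 1.5

penalty = {}   # (position_key, move) -> (weight, time of last update)

def record_loss(book_line):
    """Penalize every (position, move) pair played from book in a lost game."""
    now = time.time()
    for pos_key, move in book_line:
        w, t = penalty.get((pos_key, move), (0.0, now))
        w *= 0.5 ** ((now - t) / DECAY_HALF_LIFE)   # decay the old penalty first
        penalty[(pos_key, move)] = (w + LOSS_PENALTY, now)

def current_penalty(pos_key, move):
    if (pos_key, move) not in penalty:
        return 0.0
    w, t = penalty[(pos_key, move)]
    return w * 0.5 ** ((time.time() - t) / DECAY_HALF_LIFE)

def pick_book_move(pos_key, candidates):
    """Prefer book moves that have not recently led to losses."""
    ok = [m for m in candidates if current_penalty(pos_key, m) < AVOID_THRESHOLD]
    return ok[0] if ok else None   # None means: leave the book and calculate
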
IMHO this feature is a chess strength negative for Mchess.
For it to work, Mchess has to have a very narrow and specific book.
Narrow and specific books are very easy to counter-attack - the
counter does not require a large amount of human intervention.
Secondly, the 'cooked' lines are often bad. It's just that they
place a *computer* in difficulties. Think about it: if they
were good, they'd be common human-human book lines, except
they aren't.
I've had great fun autoplaying Mchess with CST (both programs'
learning functions on). Mchess plays these obscure lines (the Urusov
Gambit is a favourite, or the Volga Gambit). CST loses, self-modifies
its book, finds a way out of the line that Mchess continually
throws at it, and then - here's the joke - CST wins 5 or
6 games in succession, until Mchess finds a way round or, more
usually, gives up on that particular opening.
A simple learning function is the answer. It's an arms race, get
racing.
Chris Whittington
> <snip>
>
> Personally I find this behavior disgusting since it hides the truth of the
> real playing strength of a chess program.
>
> But I really wonder if I have any choice left!
>
> What to do?
>
> Comments are *VERY* welcome because I want to know what you all think
> about this subject.
>
> I mean if nobody really cares why should I care any longer?
>
> Just confused and worried.
>
> - Ed Schroder -
>
I, for one, am in full agreement with your view, Ed.
The problem is in getting ALL programmers to agree amongst themselves that
"killer moves" should not be placed into the book - if this cannot be done
then I'm afraid you are left with having to join in.
--
---------------------------------------------------------------------------
Ian Harris EMail i...@iharris.demon.co.uk or CompuServe 70374,3166
PGP 2.6.3i public key available on request
---------------------------------------------------------------------------
> What to do?
>
> Comments are *VERY* welcome because I want to know what you all think
> about this subject.
>
> I mean if nobody really cares why should I care any longer?
>
> Just confused and worried.
>
> - Ed Schroder -
One of the things I like about Rebel Decade is that it never plays the
same opening twice, and that it plays different variations when it
does repeat an opening. Could this be the way to defeat the
killer books? When GMs play a match they also try to find flaws
in their opponent's openings, and when they suspect that they are
approaching a prepared line, they play something that they haven't
played previously. Computers don't need to always play the move
which they evaluate highest. Randomly playing the second or third best,
when the scores are close, would do a lot to dodge the killer books
and to make the program more fun to play against. Altering the
opening book a little for each tournament also seems prudent.
Putting some bad moves in your commercial book for Mchess to find
and then removing them for tournaments might be amusing.
Finding flaws in your opponents' openings, and dodging prepared traps
is an interesting part of chess, don't you think?
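A rough sketch of this kind of randomized choice (my own illustration with a made-up margin and made-up scores; not how Rebel Decade actually selects its moves): pick at random among all moves whose scores lie within a small margin of the best one.

import random

MARGIN = 15   # accept any move within 15 centipawns (0.15 pawn) of the best score

def choose_with_variety(scored_moves):
    """scored_moves: list of (move, score_in_centipawns), best score first."""
    best_score = scored_moves[0][1]
    near_best = [move for move, score in scored_moves if best_score - score <= MARGIN]
    return random.choice(near_best)

# Example: picks randomly among Nf3, c4 and d4, never the clearly worse g3.
print(choose_with_variety([("Nf3", 32), ("c4", 25), ("d4", 20), ("g3", -40)]))
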
... Peter McKone
Not necessarily. Marty could just hook up Rebel and Mchess (or
any other program for that matter) and let his learning function
simply "learn" how to beat the other program by culling book lines
that result in losses and keeping lines that win. It's not elegant,
but it is simple because it does not take a lot of human time to
find these oddball lines...
:
: Secondly the 'cooked' lines are often bad. Its just that they
: place a *computer* in difficulties. Think about it, if they
: were good, they'ld be common human-human book lines, except
: they aren't.
A possible "learning disability" too. This is an issue I'm looking
at in Crafty now, but learning to beat weak players might convince
you that 1. e4 2. Bc4 and 3. Qf3 are good moves if if results in
lots of wins. The strength of the opponent has to be factored in,
where wins over opponents that are too weak simply don't affect the
learning at all.
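One way such opponent-strength weighting could look, sketched purely as an illustration (the rating floor and scaling are my own assumptions, not Crafty's actual learning code):

def learning_weight(result, my_rating, opp_rating, floor=200):
    """Scale a game's learning weight by opponent strength.
    result: +1 win, 0 draw, -1 loss. Wins over opponents more than
    `floor` Elo below us teach nothing; losses always count in full,
    because even a weak opponent has exposed a real hole."""
    if result <= 0:
        return 1.0
    deficit = my_rating - opp_rating        # how much weaker the opponent is
    if deficit >= floor:
        return 0.0                          # crushing a much weaker player proves little
    return 1.0 - max(0.0, deficit) / floor  # fades from 1.0 (equal) down to 0.0 at the floor

# Example: a win over an opponent rated 150 points lower only counts 25%.
print(learning_weight(+1, 2400, 2250))      # -> 0.25
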
More ideas for discussion as I get further into trying to implement
something here. As always I'll explain what I'm trying and see what
kind of discussion ensues...
:
: I've had great fun autoplaying Mchess with CST (both programs
: learning function on). Mchess plays these obscure lines (Urusov
: gambit is a favourite, or Volga Gambit). CST loses, self-mods
: its book, finds a way out of the line that Mchess continually
: throws at it, and, then, and here's the joke, CST wins 5 or
: 6 games in succession, until Mchess finds a way round or, more
: usually, gives up on that particular opening.
I see the same thing as Crafty plays WchessX, except that Crafty
doesn't learn as of yet. However, WchessX keeps going back to
the same well over and over, and the only thing that keeps Crafty
in the match is that I give it some random freedom with a really
huge book so that it doesn't repeat too often...
:
: Simple learning function is the answer. Its an arms race, get
: racing.
:
: Chris Whittington
:
Never is a long time. If you put it on a server, and play 20,000 games a
year, you'll likely see just how often it will repeat, because Crafty's
book is way bigger, and it still repeats openings too often for my liking...
>The MCHESS5 computer killer book...
>A few weeks ago Marty Hirsch (author of Mchess5) wrote:
> "Opening preparation against commercial opponents matters somewhat,
>but not as much as one might expect, because an SSDF rating is based on
>hundreds of games against at least twenty opponents."
>I replied that it matters AT LEAST 100 ELO points on SSDF.
>I have also sent my reply to Marty's email address so he couldn't
>miss my comments. So far I have not received any reply from Marty,
>neither here nor in RGCC.
>In the meantime I have started several matches between Mchess5 and
>a few chess programs to find out the REAL impact of killer books.
>I find the results very shocking!
(SNIP - games and statistics that amply demonstrate Ed's point)
>Coming to the GOAL of this posting:
>
>- Is this the future of computer chess?
>- Spending months of our time on cooked books to get a good rating on
>SSDF? Should the programmers of Genius, Hiarcs and Rebel do the same?
>I obviously prefer to spend my time on improving the chess engine of
>Rebel rather than spending months of my time looking for weak points in
>other chess programs and adding totally won lines to the Rebel opening book!
>Personally I find this behavior disgusting since it hides the truth of
>the real playing strength of a chess program.
>But I really wonder if I have any choice left!
>What to do?
>Comments are *VERY* welcome because I want to know what you all think
>about this subject.
>I mean if nobody really cares why should I care any longer?
>Just confused and worried.
>- Ed Schroder -
Dear Ed --
Here's one person's opinion, for what it's worth.
As an ecstatic new owner of Rebel 8.0 -- OF COURSE I'd like to see you
spend 100% of your development time making Rebel 9.0 even stronger as a
chess-playing engine.
That's your strength -- your strongest game, if you will. You
shouldn't waste your time trying to do things which are tangential to
your main arena of artistry and specialization. Others can do that;
you shouldn't have to.
One wouldn't require or expect a concert pianist to tune the piano.
Even if bad tuning had negative effects on her or his "results."
The real-world problem for a commercial developer, if I understand it,
is that probably many people look at the SSDF list (or other tournament
results -- but NOT the WMCCC in Jakarta this year!!) and use those
results to make their purchase decision. I.e., they buy the program
which beats the others.
Now that seems logical and unassailable -- if we were putting together
a chess team for the Olympiad, we'd take the highest-rated players we
could find.
But what your dilemma suggests is that perhaps computer chess is
DIFFERENT from OTB play, with other considerations -- and very
important ones.
In short, if I understand you correctly, you're saying that
***with 'cooked' books, tournament play is not reflective of engine
strength.***
This is a really important point. In fact, for computer chess, it
tends to negate the value of tournaments at all! What a concept.
In fact, using 'cooked books' takes computer chess competition success
COMPLETELY OUTSIDE of the computer arena entirely: it makes tournament
results dependent on a purely human factor -- who has the sharpest IM
or GM to prepare the lines the computer will play. Or worse, which
company is able to pay the most for such services!
So, the resulting competition has nothing to do with computer chess at
all.
It's human chess, with computers pushing the pawns.
(Here, thanks to this newsgroup: this is something I would never have
been aware of without reading the posts of yourself and others in this
newsgroup).
OTHER PURCHASE CRITERIA (a personal aside):
In my own purchase decision (and I own two other commercial programs
besides Rebel 8.0), tournament results were NOT primary. I wanted the
strongest engine. Tournament results weren't a factor. Testing the
STRENGTH of the program WAS a factor.
Why did I look for engine strength? Well, I saw that my Chessica
program (a Fritz 3 version, I've heard -- which was hyped as being
"World Champion" when I bought it) often did stupid things. Beating it
ceased to interest me. It doesn't fear a passed pawn, for example, and
it rarely tries to queen a pawn on its own. In this and other
respects, I began to doubt its underlying chessplaying ability.
In contrast, Rebel 8.0 is strong enough to hammer me off the board with
alarming ease. It's actually kind of scary playing it -- a very
dangerous feeling, watching it move. And, it's exciting and inspiring
to have this kind of horsepower available on my desktop P-133.
Most important, engine strength addresses my primary interest in
computer chess -- the testing and learning of opening theory, and
subjecting one's own ideas to analysis -- which can only be aided by
having the strongest engine possible. That's what I use Rebel 8.0 for,
so that was my chief determinant in choosing which program to buy. For
this use, tournament results are irrelevant.
* * *
But I can see the commercial quandary in the underlying issue you
raise:
From a programmer's perspective: a 'cooked' book CONCEALS THE PLAYING
WEAKNESS of a program -- just that quality which you (and many others)
have spent so long trying to develop... and doing so brilliantly, I
might add.
So what is the answer here?
Does every professional program developer or company need its own IM or
GM -- or, foreseeably, even a TEAM of GMs like the old Soviet system --
to 'cook' its books for tournaments?
It seems like we're headed exactly in that direction. Especially if
and as more money pours into the field.
This, of course, may be what 'real' chessplayers do, as part of their
training and preparation. In OTB play, it isn't extrinsic to the fair
and open competition of chess -- just the opposite. It's part of the
game. If you don't want to prepare your openings, you won't get very
far. But this analogy to OTB play can be VERY MISLEADING.
Computer chess is fundamentally different: you don't make your living
by winning tournaments, but by selling programs. Winning isn't really
your goal, except as a means to maintain the commercial viability of
your product. Instead, you're a brilliant programmer and developer.
For you, winning a tournament, while gratifying, is secondary to
creating a monster chess engine. That's your gift, talent, strength,
and maybe, destiny.
A MODEST PROPOSAL
-- Maybe we're looking at the development of two distinct types of
computer chess competitions: a "database" competition, where opening
books are unlimited, and an "engine power" competition, where books
would be carefully limited -- or even STANDARDIZED, WITH ALL
COMPETITORS USING THE SAME BOOK -- to provide a level playing field.
Of course, all other *hardware* sports routinely create such mechanical
restrictions, and even "classes", to keep competition fair and
interesting. Formula One racing, the America's Cup and all sailboat
competition -- in fact all the *hardware* competitions I can think of,
have developed very sophisticated formulas for keeping competition in
carefully-defined boundaries.
Computer chess might well consider doing the same.
Why? All these sports have recognized long ago that only close
competition is interesting. Without such rules, every event becomes a
vigil for a coup d'etat of some kind: you wait to learn which
competitor found the best "cheat" that blows everyone else away.
I think computer chess may be in that very situation now.
If that happens, as your posting of MChess results shows, the
competition is no fun at all, and meaningless. Even worse, it could
have a dangerously weakening effect on progress towards the central
issue and problem of computer chess: developing the most powerful
engine possible.
That's the unique arena of computer chess, and the one it should follow
to the exclusion of other tangential concerns.
It's an important goal, and should be protected -- especially by the
participants themselves -- by building safeguards into competitions
that discourage achievement OUTSIDE the computer chess arena that could
come to dominate computer chess as a whole.
The worst result I can see from the present situation is that it could
tend to keep you and others from following your own path and destiny --
that of creating the program which will play the strongest, deepest,
smartest chess that our CPU's are capable of. It would be something of
a tragedy if you and others were deflected from pursuing that
fascinating and exciting goal.
Sorry for the rambling answer -- but I think this may be a very
important issue indeed...
-- garb leon
PS congratulations on Rebel 8 -- I'll be first in line as soon as Rebel
9 is available... gl
Sorry, I didn't explain myself well enough.
That is sort of what he is alleged to do.
Or, it is alleged that, by hand, he looks for errors in opponent
programs and then programs in (narrow) book lines to try and seek
out these errors.
Conversely, it can be assumed that lines bad for Mchess are avoided.
This, it is alleged, is done by hand prior to release.
I am arguing that this avoidance/reward process can be automated
by the opponent program *after* release by learning and unlearning
book lines as they are played at SSDF testing.
If Rebel (say) loses a SSDF game, it can avoid losing that line
again by diverging on the next SSDF game or later occasion.
So, Rebel can do what Mchess does on-line. It's not so easy for
Mchess to deal with this behaviour *after* release.
Believe me, CST has played long sequences of games with Mchess
with variance (on both sides, since both learn). The same game
never gets played twice. It's quite a battle. Either side can get the
better of it; I've seen strings of different wins in the same
basic opening as the opponent program wriggles more each game,
eventually finding the refutation.
It works, this learning. Nobody can book up for a series
of games played this way.
Chris Whittington
Hi,
In article <53ting$n...@news.xs4all.nl>, Ed Schröder
<rebc...@xs4all.nl> writes
>The MCHESS5 computer killer book...
>
>A few weeks ago Marty Hirsch (author of Mchess5) wrote:
>
> "Opening preparation against commercial opponents matters somewhat, but
> not as much as one might expect, because an SSDF rating is based on
> hundreds of games against at least twenty opponents."
>
>I replied that it matters AT LEAST 100 ELO points on SSDF.
Having seen the results I have to agree with you Ed.
>
>I have posted my reply also to Marty's email address so he couldn't miss
>my comments. Till now I have not received any reply from Marty. Not here
>and not in RGCC.
>
>In the between time I have started several matches between Mchess5 and a
>few chess programs to find out the REAL impact of killer books.
>
>I find the results very shocking!
>
>Here are the results against:
>Mchess5 - Genius3 (currently no. 1 on SSDF ELO 2420) 7.5 - 2.5
>Mchess5 - Rebel6 (currently no. 3 on SSDF ELO 2415) 13.0 - 1.0
>Mchess5 - Hiarcs3 (currently no. 9 on SSDF ELO 2380) 19.0 - 0.0
>
>According to the HIGH ratings of Genius3, Rebel6 and Hiarcs3 these
>results are IMPOSSIBLE in normal play (without book traps)
These results seem to fit in with the SSDF results. For example,
MChess5 P90 - Hiarcs3 P90 16.5 - 3.5
Remember, that contains 10 white and 10 black games.
It is interesting that Hiarcs4 with a larger and significantly varied
book (with absolutely no "cooks") scores:
MChess5 P90 - Hiarcs4 P90 6.5 - 13.5
While Hiarcs4 is stronger than Hiarcs3, it certainly isn't 396 Elo
(267+129) stronger as indicated by the respective match scores!!
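For readers who want to check the arithmetic: the two figures follow, to within a couple of points, from the standard Elo expectancy formula applied to the two match percentages (a logistic formula is used in this sketch; a table-based expectancy gives slightly different numbers).

import math

def elo_diff(points, games):
    """Elo difference implied by a match score, via the standard logistic formula."""
    p = points / games
    return -400 * math.log10(1 / p - 1)

print(round(elo_diff(16.5, 20)))   # MChess5 - Hiarcs3: about +269, close to the 267 above
print(round(elo_diff(13.5, 20)))   # Hiarcs4 - MChess5: about +127, close to the 129 above
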
It is obvious from my testing too that MChess5 has a heavily "cooked"
book for Genius2/3, Rebel6 and Hiarcs3, which incidentally were MChess'
main opposition when it was released.
This means there are at least 7 SSDF matches of 20 games each which are
influenced by the killer lines and NOT the relative engine strengths.
There is no doubt in my opinion that killer lines in a cooked book on
this scale will severely affect the SSDF rating of MChess5.
The manner in which these results were achieved is quite shocking.
>
>I have posted all the games in PGN format on a new subject here in RGCC
>in case anybody wants to check them. Just replay the games with Mchess5
>and watch the cooked book lines.
>
>Coming to the GOAL of this posting:
>
>- Is this the future of computer chess?
>- Spending months of our time on cooked books to get a good rating on
>SSDF?
>- Should the programmers of Genius, Hiarcs and Rebel do the same?
>I obviously prefer to spend my time on improving the chess engine of Rebel
>rather than spending months of my time looking for weak points in other
>chess programs and adding totally won lines to the Rebel opening book!
I have never put killer lines in Hiarcs' opening book for computer
opponents. What limited time I have I prefer to devote to work on the
chess engine.
I believe chess programs should be developed for the users/customers who
are willing to purchase them. It seems some chess programs are being
developed to beat other chess programs as a main priority. Surely this
cannot be right?
>
>Personally I find this behavior disgusting since it hides the truth of the
>real playing strength of a chess program.
>
>But I really wonder if I have any choice left!
>
>What to do?
I think Chris mentioned learning, and this may be the only way
forward for us all. However, it leaves a serious problem with rating
lists like the SSDF, whose accuracy is surely being severely affected,
particularly when new programs released now and in the future get to
play "old" programs like Genius2/3, Hiarcs3 and Rebel6.
I believe such a large number of possible "cooked matches" gives
programs like MChess5 an inflated rating.
>
>Comments are *VERY* welcome because I want to know what you all think
>about this subject.
>
>I mean if nobody really cares why should I care any longer?
>
Ed, you are not alone.
>Just confused and worried.
>
>- Ed Schroder -
>
>
Regards,
Mark
Author of Hiarcs3, Hiarcs4 and soon Hiarcs5!
I've seen this happen. In several test suites, positions are found
in the large GM database Crafty has. I have modified the "test"
function to disable the book before it runs a test suite to keep
from fudging the results... One position I remember came from an
old Cray Blitz vs Belle game, (ACM 1981 in fact) where the solution
is Bxh6 which leads to a draw. It's in the database. :)
Bob
On Mon, 14 Oct 1996, Pete Nielsen wrote:
--> I think you may have to factor in time pressure as well.
-->
--> I don't know if you're familiar with the Internet Chess Academy, but you
--> get a puzzle followed shortly by a lecture on the solution and how that
--> solution should have been found.
-->
--> Often after finding my solution, I give the position to Crafty to see what
--> it thinks. Recently, it gave one answer (Book). It turned out in the
--> lecture that the position was from a GM vs GM game, and that the reason
--> that the move was good was because of time pressure.
-->
> - Is this the future of computer chess?
> - Spending months of our time on cooked books to get a good rating on
> SSDF?
> - Should the programmers of Genius, Hiarcs and Rebel do the same?
>
> I obviously prefer to spend my time on improving the chess engine of Rebel
> rather than spending months of my time looking for weak points in other
> chess programs and adding totally won lines to the Rebel opening book!
>
> Personally I find this behavior disgusting since it hides the truth of the
> real playing strength of a chess program.
>
> But I really wonder if I have any choice left!
>
> What to do?
>
> Comments are *VERY* welcome because I want to know what you all think
> about this subject.
>
> I mean if nobody really cares why should I care any longer?
>
> Just confused and worried.
>
> - Ed Schroder -
I have already stated that if the computer chess programmers would sell
updated, cooked-medium-rare "killer books" on a regular basis, these
would actually sell. Don't underestimate the consumer's capacity to want
to have the latest killer book for his program. But maybe my ICU is
fried by now and my circuits are overheating. SSSSSSSSSSSSSSSSSSSSSS
OH OH, my circuits ARE overheating, caused by the thought of having to
absorb all those new killer books.
--
Komputer Korner
Don't agree. I've played autoplayer sequences against Rebel (in -A
mode), where the same game gets repeated over and over. I've often seen
9 identical sequential draws.
In normal autoplayer mode, Rebel will just throw out a repeat
if it is identical at book exit. For obvious reasons the SSDF can't use
this mode since they can't allow Rebel to just abort games.
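A tiny sketch of the kind of repeat detection being described (purely illustrative, keyed on the position at book exit; the helper is hypothetical, not Rebel's actual mechanism):

# Remember every book-exit position seen so far in the match; a game whose
# exit position has already occurred is flagged as a repeat.
seen_exits = set()

def is_repeat(book_exit_position):
    """Return True if this book-exit position already occurred in the match."""
    if book_exit_position in seen_exits:
        return True
    seen_exits.add(book_exit_position)
    return False

Against a learning opponent the book-exit position shifts from game to game even when the underlying cook does not, which is exactly why learning rather than repeat detection is being argued for here.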
The problem is a technical one.
Ed wants to deal with repeats, so he aborts if the game repeats
out of *his* opening book. Fine against a non-learner, but not
fine against a learner. Ed knows about this problem, and it's up to
us programmers to deal with it. I think it's best dealt with by us
all using learning.
Then it's an arms race which will lead back to programs being
unable to rely on killer lines.
Really, if we all use learning, then it's a non-problem.
Chris Whittington
> Could this be the way to defeat the
> killer books? When GMs play a match they also try to find flaws
> in their opponent's openings, and when they suspect that they are
> approaching a prepared line, they play something that they haven't
> played previously. Computers don't need to always play the move
> which they evaluate highest. Randomly playing second or third best,
> when the scores are close, would do a lot to dodge the killer books
> and to make the program more fun to play against. Altering the
> opening book a little for each tournament also seems prudent.
> Putting some bad moves in your commercial book for Mchess to find
> and then removing them for tournaments might be amusing.
> Finding flaws in your opponents' openings, and dodging prepared traps
> is an interesting part of chess, don't you think?
>
> .... Peter McKone
Ed Schröder <rebc...@xs4all.nl> wrote in article
<53ting$n...@news.xs4all.nl>...
> The MCHESS5 computer killer book...
>
> A few weeks ago Marty Hirsch (author of Mchess5) wrote:
>
> "Opening preparation against commercial opponents matters somewhat, but
> not as much as one might expect, because an SSDF rating is based on
> hundreds of games against at least twenty opponents."
>
> I replied that it matters AT LEAST 100 ELO points on SSDF.
>
> I have posted my reply also to Marty's email address so he couldn't miss
> my comments. Till now I have not received any reply from Marty. Not here
> and not in RGCC.
>
> In the between time I have started several matches between Mchess5 and a
> few chess programs to find out the REAL impact of killer books.
>
> I find the results very shocking!
>
> Here are the results against:
> Mchess5 - Genius3 (currently no. 1 on SSDF ELO 2420) 7.5 - 2.5
> Mchess5 - Rebel6 (currently no. 3 on SSDF ELO 2415) 13.0 - 1.0
> Mchess5 - Hiarcs3 (currently no. 9 on SSDF ELO 2380) 19.0 - 0.0
>
> According to the HIGH ratings of Genius3, Rebel6 and Hiarcs3 these
> results are IMPOSSIBLE in normal play (without book traps)
>
> I think you all now can see the impact of killer lines and maybe you
> understand my feelings better and my aversion against cooked books.
[large cut]
Thanks a lot for this nice illustration of the impact of killer books.
I agree 100% with you.
I love testing chess programs, but these killer books really make me lose
interest in computer chess. It has nothing to do with chess.
It is necessary that a) the SSDF changes their testing methods. They should
have a large set of opening positions (maybe positions after 7-10 moves)
with open and closed positions in the right proportion. They should play
each position twice and switch the colours after the first game. Opening
books should be turned off. These positions should be kept secret so that
booking cannot creep back in (a small sketch of this pairing scheme follows below).
b) The ICCA should modify the rules for its championships. Ed Schroeder has
clearly expressed his reasons why he did not participate in Jakarta. Maybe
Richard Lang had similar reasons. - Genius is often considered a
reference and there might be a lot of cooks against the Genius book.
The fact that a top program like Rebel does not compete should really make
the ICCA reflect on its rules.
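A tiny sketch of the pairing scheme proposed in point a) above (my own illustration; the position names are placeholders, not a real test set): every secret test position is played twice, colours reversed, with both engines' own books switched off.

positions = ["sicilian_after_move_8", "qgd_after_move_9", "kings_indian_after_move_10"]

def schedule(prog_a, prog_b):
    games = []
    for pos in positions:
        games.append((pos, prog_a, prog_b))   # prog_a has White
        games.append((pos, prog_b, prog_a))   # colours reversed
    return games

for pos, white, black in schedule("Program A", "Program B"):
    print(pos + ":", white, "(White) vs", black, "(Black)")
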
Again: Cooking books does not make computer chess advance although it may
be very tempting because it means easy SSDF points and good sales.
Alexander Fuchs
I happen to think that killer-lines should be encouraged. They only
demonstrate how sensitive computers are to opening traps. Humans
have long had to contend with preparation, and with databases so
widespread, it's only getting worse. The side-effect of all this is
that to be good at chess, one must be extremely well prepared.
Some of you programmers are starting to sound like you'd like to
switch to Fischer's Random Chess to level the playing field. Kind
of funny that you're all upset that you've been duped by your
own invention. Face facts: if you "turn off" your opening books,
your machines are very likely to stumble into some very, very
bad positions. So, live by the sword, die by the sword.
As for the issue of consumers being fooled, I suggest that they be
educated about some standardized benchmarks (I'll let y'all agree
on which benchmark should be used).
In competition, anything legal is fair. There's no good reason why
someone should refuse to take advantage of their opponent's weakness
in the opening. Rather than crying about how unfair it is to be
defeated by an inferior opponent, I suggest you start recognising
that a weakness of yours has been revealed (be thankful that this
was an easily identifiable weakness that your opponent preyed upon!).
Of course, we all realize that such a weakness is difficult to repair,
and most will confess that it leads only to an arms-race approach to
opening theory. When the majority of tournament players are willing to
concede that this kind of preparation is no longer enjoyable, I (for one)
will gladly follow them over to Fischer's variation of chess, but I doubt
very much that such a migration will occur in the near future -- it is
against every strong player's interest to do so (save maybe RJF).
Kevin.
This is completely unreasonable.
Nobody *has* to respond to anything.
There are many programmers whose products get mentioned on rgcc
or who get emailed and who don't reply. That's just their way.
It's the height of arrogance to imagine rgcc is some kind of
god-forum where anyone and everyone has to explain themselves.
Chris Whittington
You are missing the point of the discussion!
The issue is that the SSDF allows "book cooking" chess programs to be tested
against older chess programs which CANNOT defend themselves.
These "book cooking" programs therefore gain a LOT of ELO points on the
SSDF rating list!! and therefore the ratings of "book cooking" programs
ARE NOT RELIABLE.
At the moment the only "book cooking" program is Mchess5.
I have explained this in detail in my previous posting including a lot
of examples and complete games.
I am just worried about this new development, we *ALL* want a SSDF list
with the STRONGEST chess program on TOP. No?
As a producer myself I surely hope that this will be my own program, no
doubt about that, but I prefer a reliable no.1 on SSDF and I don't care
if that is one of my competitors as long as it is reliable!
For years this was the ChessMachine The King;
For years this was Genius3;
Both programs were the strongest at that moment!
I have NO problems with Mchess, Genius, Hiarcs, Crafty, Shredder on TOP
of SSDF as long as the rating and the no.1 position is earned by the
STRONGEST chess engine.
Unfortunately this has not been the case in the past year and
this is the main reason for our discussion.
Please feel free to correct me if I am wrong.
- Ed Schroder -
From: kjbe...@chimi.engr.ucdavis.edu (Kevin James Begley)
Mark Uniacke (ma...@acc-ltd.demon.co.uk) wrote:
: I have never put killer lines in Hiarcs' opening book for computer
: opponents. What limited time I have I prefer to devote to work on the
: chess engine.
: I believe chess programs should be developed for the users/customers who
: are willing to purchase them. It seems some chess programs are being
: developed to beat other chess programs as a main priority. Surely this
: cannot be right?
>I happen to think that killer-lines should be encouraged. They only
[snip]
>A MODEST PROPOSAL
>
>-- Maybe we're looking at the development of two distinct types of
>computer chess competitions: a "database" competition, where opening
>books are unlimited, and an "engine power" competition, where books
>would be carefully limited -- or even STANDARDIZED, WITH ALL
>COMPETITORS USING THE SAME BOOK -- to provide a level playing field.
[snip]
There are some excellent points here.
The fundamental problem is that, no matter what tests are used, somebody
will find some way of biasing them. I remember (from an old Selective
Search) a test position where a program found the solution instantly yet,
when a completely irrelevant pawn was moved from h2 to h3 and the program
retested, it failed. Ditto standardised book lines - what would stop someone
adding special behaviour to their program if they discovered a weakness in
the way a prospective opponent handled one of those lines?
I suggest a draconian (anti-sleaze?) solution: that computer versus
computer, or computer-only tests, are disallowed for rating purposes (by
the SSDF, Eric Hallsworth and so on), and that only human versus computer games
are considered. Remember that _people_ operate computers, write programs and
buy them.
Alastair
That would solve all the problems - except that people buy chess programs to
play chess, not to play Fischerandom.
It's a bit like deciding the English Premiership by giving Newcastle United
and Manchester United a bag of golf clubs each and asking them to play
eighteen holes! (Apologies for the football/soccer analogy).
Alastair
I think this is a problem that (a) has been here a long time; and (b) will
be around for a long time. I remember a "meeting" years ago with Ken Thompson,
Mike Valvo and myself, in a tournament hall somewhere, where Ken showed us an
opening that would trap most any computer. He sprang it on NuChess and crushed
it in an ACM event. Do I like getting trapped? No. Do I like trapping
someone else? Yes. :)
In any case, a standard starting position would be a start, although it would
certainly weaken some engines like Genius that play certain openings very
well and others poorly. That's about the only down-side to randomly choosing
the starting position. If you pick one that one program doesn't understand
or like, it could also be unfair...
Needs thought...
Bob
:A few weeks ago Marty Hirsch (author of Mchess5) wrote:
:
: "Opening preparation against commercial opponents matters somewhat, but
: not as much as one might expect, because an SSDF rating is based on
: hundreds of games against at least twenty opponents."
:
:I replied that it matters AT LEAST 100 ELO points on SSDF.
>Having seen the results I have to agree with you Ed.
Thank you for backing me up Mark, at least I know that I am not
alone now.
[ snip ]
>It is obvious from my testing too that MChess5 has a heavily "cooked"
>book for Genius2/3, Rebel6 and Hiarcs3. Which incidentally were MChess'
>main opposition when it was released.
>This means there are at least 7 SSDF matches of 20 games each which are
>influenced by the killer lines and NOT the relative engine strengths.
>There is no doubt in my opinion that killer lines in a cooked book on
>this scale will severely affect the SSDF rating of MChess5.
Unfortunately I have to agree.
I have the same results and the same conclusions as Mark.
[ snip ]
:Coming to the GOAL of this posting:
:
:- Is this the future of computer chess?
:- Spending months of our time on cooked books to get a good rating on
:SSDF?
:- Should the programmers of Genius, Hiarcs and Rebel do the same?
:I obviously prefer to spend my time on improving the chess engine of Rebel
:rather than spending months of my time looking for weak points in other
:chess programs and adding totally won lines to the Rebel opening book!
>I have never put killer lines in Hiarcs' opening book for computer
>opponents. What limited time I have I prefer to devote to work on the
>chess engine.
>I belive chess programs should be developed for the users/customers who
>are willing to purchase them. It seems some chess programs are being
>developed to beat other chess programs as a main priority. Surely this
>cannot be right?
Correct, I have never seen any cooked book lines in Hiarcs3/4, neither
did I find any cooked lines in Genius2/3/4. Just like Rebel, the Hiarcs
and Genius opening books are written for humans and I am in favor of
keeping it that way!!
:What to do?
>I think Chris mentioned about learning and this may be the only way
>forward for us all. However, it leaves a serious problem with the rating
>lists like the SSDF whose accuracy is surely being severely affected,
>particularly when new programs released now and in the future get to
>play "old" programs like Genius2/3, Hiarcs3 and Rebel6.
>I believe such a large number of possible "cooked matches" gives
>programs like MChess5 an inflated rating.
I agree, with or without learning, book cooking gains a lot of ELO points,
especially when the SSDF tests older versions which do not have a learning
system. These older versions:
- Genius 2.0
- Genius 3.0
- Hiarcs 3.0
- Rebel 6.0
and soon
- Genius 4.0
- Hiarcs 4.0
- Rebel 7.0
are easy victims for "cooked lines".
No defense possible!
:Comments are *VERY* welcome because I want to know what you all think
:about this subject.
:I mean if nobody really cares why should I care any longer?
>Ed, you are not alone.
Great!
>Regards,
> Mark
>Author of Hiarcs3, Hiarcs4 and soon Hiarcs5!
- Ed Schroder -
Author of Rebel8
How do you expect consumers around the world to judge programs except by looking
at tables?
Is everybody expected to make deep studies every time they buy a USD100 product?
My opinion is that, if you can't compete, and you stand and complain noisily
about it, you make yourself look weak. I personally think that it is not good
to present yourself to the world with the word "victim" stamped upon your
forehead.
In article <53ting$n...@news.xs4all.nl>, "Ed says...
maybe he's just too busy. or maybe he's on vacation.
or maybe he just doesn't think your questions are worth
responding to. believe it or not, the world does not revolve
around RGCC.
|Apparently, he, like Steve at
|ICD, figures that the less said, the sooner the thread will drop.
|
| Wrong.
i can't speak for Hirsch, but if it were me, i would not
dignify that kind of insulting drivel with -any- kind of response,
just on principle.
--
--- don fong ``i still want the peace dividend''
--
>Ed Schröder (rebc...@xs4all.nl) wrote:
>: I have posted my reply also to Marty's email address so he couldn't miss
>: my comments. Till now I have not received any reply from Marty. Not here
>: and not in RGCC.
> Marty Hirsch does not respond to posts that he doesn't like. He
>doesn't even respond to posts that ask for information, such as the one I
>posted about the 10MB Hash table limit. Apparently, he, like Steve at
>ICD, figures that the less said, the sooner the thread will drop.
> Wrong.
I'm just a lowly "D" player out here in Idaho, yet Marty has answered
virtually all of my e-mail inquiries. It looks _so_ unprofessional to
see him attacked in this newsgroup. Can't we just face it that all
"book" knowledge and opening theory is an admission of our weakness,
whether we're a human or a computer program? If we extol a program
because it has the ability to find an elusive draw, we must also
give credit to the program that has a better book. Rather than
"steaming" here about how bad it is, I would politely ask that the
programmers go and remove those "cookable" weaknesses from your books!
Your book is only as strong as its weakest link, and M-Chess has
proven to be a very tough chain to crack. Do you want to have a
tournament with "all books off," or play the game as we currently
know it?
__
john quill taylor / /\
writer at large / / \
Hewlett-Packard, Storage Systems Division __ /_/ /\ \
Boise, Idaho U.S.A. /_/\ __\ \ \_\ \
e-mail: jqta...@hpdmd48.boi.hp.com \ \ \/ /\\ \ \/ /
Telephone: (208) 396-2328 (MDT = GMT - 6) \ \ \/ \\ \ /
Snail Mail: Hewlett-Packard \ \ /\ \\ \ \
11413 Chinden Blvd \ \ \ \ \\ \ \
Boise, Idaho 83714 \ \ \_\/ \ \ \
Mailstop 852 \ \ \ \_\/
\_\/
"When in doubt, do as doubters do." - jqt -
haiti, rwanda, cuba, bosnia, ... we have a list,
where is our schindler?
I agree.
Between the hyperbole and personal attacks, there's not much room left
for useful information here. :)
Bob
1- Mchess 5 has a killer-book that makes it crush Genius 3, Rebel 6 and
Hiarcs 3.
2- Because of this killer book Mchess's rating in the SSDF list has been or
is quite a bit higher than its chess engine would have achieved otherwise.
3- To the extent that they reflect the results of games, sometimes even
double games, won by cooked opening lines, the ratings given by SSDF do
not reflect the real strength of a chess engine, and therefore in some
specific cases they are misleading consumers that pay attention to this
rating list before making a buying decision and they are giving top honors
where top honors are not due.
4- A different rating procedure should be followed if we want to know and
let people know the relative strength of chess engines.
Whether or not the inclusion of cooked lines in the opening book is a fair
approach may morally be a matter of opinion. In practical terms, what I
find not arguable is that "killer books" give a false idea of the real
strength of a chess engine.
Enrique
john quill taylor <jqta...@hpdmd48.boi.hp.com> wrote in article
<540fqj$l...@hpbs2500.boi.hp.com>...
> da...@laraby.tiac.net (James Garner) wrote:
>
> >Ed Schröder (rebc...@xs4all.nl) wrote:
>
>
> Your book is only as strong as its weakest link, and M-Chess has
> proven to be a very tough chain to crack.
The book of M-Chess Pro can be cooked like any other book.
> Do you want to have a
> tournament with "all books off," or play the game as we currently
> know it?
I'd prefer that rather than see computer chess reduced to cooking books
Alexander Fuchs
Maybe someone in this newsgroup presented themselves this way but I
certainly don't remember it. I do remember the question being asked
was "What should I do next to improve my program; killer book or
stronger chess engine?"
A stronger chess engine is what most customers really want to buy,
but the trouble is, killer books can make an engine look stronger than
it really is.
Maybe a good learning algorithm that can develop such books during
auto-play mode is one answer. That way, authors can still devote
most of their time to improving the chess engine.
I don't know if this is the best answer or not, or even if there
is a good answer; but in any case, let's get the question straight!
Joe Stella
> Graham Laight wrote:
>[...]
Ed,
I agree w/ most everything you say,
except I fail to see how you can get
rid of killer-books.
As already pointed out, Fischerandom is
not an adequate solution, nor is having
the opening books turned off (there is
virtually NO WAY to ensure that the evaluation
function does not walk straight into a "cooked" opening
anyway).
I suggest the consumer rely upon a benchmark
rather than the results of competition if they
are interested in purchasing the "strongest"
chess program (rather than the "best in competition").
Meanwhile, I maintain that MChess5 has done nothing
unethical -- this aspect of competition has been
too long neglected (though certainly it has always
been present, if only in a more subtle form). I
think it's about time that the anti-computer computer
emerges. I'd have tried to equip it with a way to
take advantage of the opponent's horizon-effect.
Victory need not always go to the strongest, and I
happen to admire the MChess5 team for taking Lasker's
advice (PLAY THE MAN!). There's a certain amount of
irony and justice to it all -- like when lawyers first
started suing other lawyers.
Kevin.
-Live by the sword, die by the sword.
I don't want to comment on your taste in chess programs. If you like having a
program which gets many, many points in the Swedish list - where they
aren't even willing to do such a primitive thing as killing doubles - you
will of course buy the program which is so good at "cooking", just according
to your kind of taste in chess, and which simply has a "cooked" rating as well,
according to the facts.
For me it's something completely different: my only interest is in the
playing style and playing strength of a program (and as long as the strength is
OK - not necessarily the best - I'm looking at style in the first place).
Mchess would be an attractive program without killer booking as well,
although I doubt it ever was strong enough for the first place in the
Swedish list. It was simply perfect for the Swedish way of testing :-)))
But the only reason why I bought it is the remaining playing strength and
style after disregarding the whole cooking. It's okay for me if a
programmer's team adapts the book to the playing style: nothing
more!!!
So I'm definitely not interested in this kind of programming and if it goes
on like this it will sooner or later have an influence on my decision: simply
not to buy this kind of program any more.
By the way: the learning function regularly produces more stupid stuff than
interesting games.
If you have ever seen the variations on one opening produced by two computers,
the results are in my eyes in 80% of the games boring, in 15% ridiculous
and in 5% interesting. I'm simply not interested in this kind of opening
études either.
All in all: I'm happy to see the real playing strength and style of
programs after
a) switching off learning functions immediately
b) using variety books instead of tournament books
c) playing tournaments with certain openings for Black and White and then
switching off both books.
Yours Dirk!
--
Yours Dirk
Kevin James Begley <kjbe...@chimi.engr.ucdavis.edu> wrote in article
<53un7s$r...@mark.ucdavis.edu>...
> Mark Uniacke (ma...@acc-ltd.demon.co.uk) wrote:
> : I have never put killer lines in Hiarcs' opening book for computer
> : opponents. What limited time I have I prefer to devote to work on the
> : chess engine.
> : I belive chess programs should be developed for the users/customers who
> : are willing to purchase them. It seems some chess programs are being
> : developed to beat other chess programs as a main priority. Surely this
> : cannot be right?
>
That's a great idea! Ed should be motivated to improve Rebel Book to
refute all of the Mchess6 killer moves. :)
Eran
Hi,
I suggest that maybe all of you chess programmers who don't like
killer books establish an honest international organization where
computer chess tests can be carried out with the most accurate and reliable
results. For example, a name for the new international organization could be
"International Computer Chess Standard." All computer chess tests would have
to meet strict standard requirements under which killer books are not
allowed, for instance. All chess programmers could be members of that
organization and could vote equally on what is and is not allowed to be
included, such as killer books, in the strict standard. I hope this would
make all chess programmers satisfied and happy. Furthermore, newly created
chess programs should meet the "International Computer Chess Standard"
requirements, because this would also help customers all over the world to
buy the right chess software happily and without confusion. I understand that
the SSDF may no longer be reliable, because its computer chess tests are in
a poor and unfair condition; maybe the conditions are too liberal and not
strict enough. Therefore, I think establishing a new strict and honest
organization is a very good idea; it would solve many problems in both
computer chess tests and chess software and clear up any confusion among
chess programmers and customers alike.
Eran
> From: Mark Uniacke <ma...@acc-ltd.demon.co.uk>
>
> [ snip - Ed's reply to Mark, quoted in full earlier in this thread ]
>
I think we have to be careful. The opening is a very important part of
the game, and therefore a good opening book is a very important part of a
chess program. Everybody is trying to make programs more "human-like"
and I often read attacks against the bad bad brute forcers. It IS
human-like to study openings, learn them by heart and try to fool your
opponent with opening tricks. Sure, it is disgusting to study an
opponent's opening book and use cooked books, but what's wrong with a
GOOD opening book without any cooking? Is it also disgusting to use
endgame databases? At what point does an opening become a "killer variant"?
I can understand Ed's feelings, but I also understand e.g. Alex Kure's
feelings. He (and some others) worked for 2 years to make a good opening
book for Nimzo (without any cooked variations). His work would have been
in vain if only "pure" chess engines were allowed in further
competitions.
Just a few thoughts of my own....
Andreas Mader
>In article: <53ting$n...@news.xs4all.nl> "Ed Schröder" <rebc...@xs4all.nl>
writes:
>> <snip>
>>
>> Personally I find this behavior disgusting since it hides the truth of the
>> real playing strength of a chess program.
>>
>> But I really wonder if I have any choice left!
>>
>> What to do?
>>
>> Comments are *VERY* welcome because I want to know what you all think
>> about this subject.
>>
>> I mean if nobody really cares why should I care any longer?
>>
>> Just confused and worried.
>>
>> - Ed Schroder -
>>
>I, for one, am in full agreement with your view, Ed.
While we are on this subject, I want to "shift gears" a second. If you
are trying to figure out which program is best, you are using the wrong
metric if by "best" you mean "the program most likely to perform best
against a strong human opponent."
If you want to know which program can produce the highest ICC/FICS/whatever
rating by *only* playing other computers, the SSDF rating is ideal, and this
is exactly what it shows... how the programs will stack up against
each other.
If you want a program that produces the highest quality chess against a
human opponent, that's another matter, and has to be measured in a different
way, namely by real USCF or FIDE ratings. Since there are very few USCF
events for computers to participate in, and since a FIDE computer membership
is astronomically expensive, you don't have that option. Your only real
choice is to take advantage of the computers on the servers, play them, ask
the operators what program they are using, and then make up your mind as to
which program you like. You'll likely find that each has different
characteristics that you may like/dislike.
If you look at your needs honestly, you'll likely decide that the "engine"
is not the whole package, otherwise you could not sell a Saturn automobile.
The GUI is also important and will make the program either more enjoyable
or a miserable opponent, depending on how well you like the GUI, how easy
it is to use, and whether it supports the things you want to use it for.
In short, the SSDF is but one data point you should use in selecting the
right program for you. Treat it like buying a car. Certainly you'd test-
drive before making up your mind, I'd hope? Ditto for a chess program...
:
: For me it's something else completely: my only interest is in the
: playing style and playing strength of a program (and as long as strength is
: ok - not necessarily best - I'm looking at style in the first place).
Good point. One of my old favorites was Dave Kittinger's SuperConstellation,
which would play speculative attacking sacrifices. You had to watch h7 with
a careful eye. :) Most programs nowadays don't behave like this. I did
have a pretty marvelous version of Crafty a few months back that was attacking
left and right. It lost too many games, however, but the "style" was quite
flamboyant... almost Tal-like.
:
: Mchess would be an attractive program without killer booking as well,
: although I doubt it ever was strong enough for the first place in the
: Swedish list. It was simply perfect for the Swedish way of testing :-)))
: But the only reason I bought it is the remaining playing strength and
: style after disregarding the whole cooking. It's okay for me if a
: programmer's team adapts the book to the playing style: nothing
: more!!!
:
It's still hellishly strong. There's little difference between Rebel,
Mchess Pro, Genius, and others. Not enough that you could really tell
the difference by looking at the games, until you get to know each program
well enough to understand each one's unique differences. I'm beginning to
develop a "feel" for Rebel 8, ChessMaster 5000, Genius 4 and Fritz 4, based
on hundreds of games played by Lonnie against Crafty. Each one has its
own set of strengths and weaknesses, and these even change from version to
version...
: So I'm definitely not interested in this kind of programming and if it goes
: on like this it will sooner or later have influence on my decision: simply
: not to buy this kind of programs any more.
:
: By the way: the learning function regularly produces more stupid stuff than
: interesting games.
: If you have ever seen the variations two computers produce on one opening
: line, the results are in my eyes boring in 80% of the games, ridiculous in
: 15% and interesting in 5%. I'm simply not interested in this kind of
: opening études either.
:
: All in all: I'm happy to see the real playing strength and style of
: programs after
: a) switching off learning functions immediately
: b) using variety books instead of tournament books
: c) playing tournaments with certain openings for black and white and then
: switching off both books.
:
: Yours Dirk!
:
You'd love Crafty then. Unfortunately, its book is *so* wide, it gets
into more trouble than Dennis the Menace. Makes for interesting games,
and for interesting losses too.. :)
(snip Hirsch stuff)
>Can't we just face it that all
>"book" knowledge and opening theory is an admission of our weakness,
>whether we're a human or a computer program? If we extol a program
>because it has the ability to find an elusive draw, we must also
>give credit to the program that has a better book.
>Rather than
>"steaming" here about how bad it is, I would politely ask that the
>programmers go and remove those "cookable" weaknesses from your books!
>Your book is only as strong as its weakest link, and M-Chess has
>proven to be a very tough chain to crack. Do you want to have a
>tournament with "all books off," or play the game as we currently
>know it?
yeah, okay, but it's not quite that simple.
ed's question is: should he spend his time (or maybe, waste his time) countering
'killer' lines in other programs that are specifically targeted to make his program
look bad? Or would he do better to devote that time to developing the strongest chess
engine he possibly can?
despite the seductive parallel, human play and computer chess aren't exactly the same.
Humans, of course, play to gain a tactical advantage and win. Computer programs have
a slightly different mission: the number of victories they can score in the short run
is perhaps less important than success in the overall project -- that of creating the
strongest chess-playing engine possible, over the course of time.
in that sense computer chess is somewhat out of the realm of pure sport -- pure
winning and losing -- and instead moves more closely to the realm of art, the quest to
reach an ideal.
***to me, in order to encourage and ultimately achieve that lofty goal, computer chess
competition should maximize the importance of the programmer's skill and the engine's
strength, while minimizing all other factors.***
which it already does, to some degree -- running programs on identical machines to
make the contest 'fair,' for example.
now you might say, well, playing a game is playing a game, whether by human or
machine.
but killer lines aren't invented by machines -- they're invented by humans, special
humans whose superior positional skill and experience enable them to see through a
machine's weaknesses, especially in opening play. and, they're static rather than
dynamic -- they're not part of the 'thinking' of the program, only a guide to that
thinking with an outside assist.
so killer lines can be seen as a kind of human-created crutch (or brass knuckles),
unbalancing the contest and enabling a given program to perform beyond its inherent
strength and capacity. they turn computer chess back into human chess.
that's because 'cooked' books, made by people, and then externally appended to the
engine itself, are not, strictly speaking, making a contribution to the larger mission
of computer chess itself. developing them is a different challenge than that of
developing the strongest chess playing engine. while it might be fun once in a while
to spring such a 'surprise,' when you see two strong programs come out of competition
at 19-2, you know something is not quite right.
ed's worry -- that cooked books minimize the importance of the chess engine, the
programmer's crown jewel and masterpiece -- seems reasonable and even a bit alarming.
if a killer line can actually conceal the playing weakness of a program, it makes any
tournament nearly meaningless.
ed's question is actually very practical: what should he do? To me, cooked books
distract from and undermine the unique and ancient dream of creating a chess-playing
automaton. I really don't want ed, or any other programmer, wasting his time trying
to find and build in lines which will beat Genius, or MChess, or Hiarcs, or any other
program, especially if it's at the expense of program development.
if one is interested in seeing the development of the ultimate chess-playing engine,
it tends to follow that engaging in these short-run tactical skirmishes using cooked
books -- say, to win the top rating on the SSDF list -- distracts from the central,
unique mission of programmers.
of course all this has a commercial ramification as well: but knocking off the
competition in the SSDF list with a program which may be inferior in playing strength
also seems to defeat the purpose of that list itself. the list then becomes very
misleading, and could lead astray people who are thinking of buying a program, and who
look to the SSDF to find the one playing the strongest overall game. it almost
becomes a problem of truth in advertising, or something.
this discussion actually reaches into a broader area as well, the general problem of
benchmarking computer chess programs as accurately as possible. of course humans do
that for themselves in OTB play, but for computers -- still at a very early stage of
technology -- the problem is slightly different.
Perhaps this discussion will lead to a better, more fair way to test program playing
strength.
My own feeling is that a true test would limit both programs to the same, standardized
book -- just as testers use two identical CPU's and hardware systems when running a
fair contest between programs. Otherwise the result has very little meaning --
exactly for the same reason that a contest between programs on unequal computers
doesn't reveal very much.
Such mechanical 'rules' -- like limiting competitors to a standardized opening book --
are part and parcel of every mature sport, whether limiting sail area, engine size,
ball size, take your pick. Baseball is played on a standard diamond; it certainly
hasn't hurt the game. Examples are too numerous to mention. to make a regatta really
exciting, you limit the amount of sail area any boat can use and make dozens of other
strict rules to enforce a 'one design' craft: that tends to highlight the competitive
skill and strategic savvy of each skipper and crew, makes for a more exciting race,
and tells us to a much greater degree which crew performed best -- which wouldn't
happen in a race between big boats, little boats, yawls, catamarans,
windsurfers and so on. (although all major yacht racing uses complex formulas to
handicap racing results between differing craft as well...).
something like that seems like a reasonable strategy for computer chess to consider.
to me the central problem with killer lines is that they tend to
undermine any attempt to measure strength accurately in
computer-vs-computer chess contests. they also move computer chess
contests away from the crucial arena of the programmer's skill and the
engine's strength, and replace it with a much more mundane and
short-sighted activity.
-- garb leon
Dear Ed,
MChess 5 is not the only "book cooking" program on the Swedish Rating List.
Some programmers have put in special killer lines against their most important
competitors for more than ten years! It has caused me some irritation from
time to time, but it is very difficult to solve this problem in a way that
everybody could agree upon.
For example, the Mach II (or was it Mach III?) from Fidelity had several
killer lines against some version of the Novag Expert. I was worried then about how
that would affect the rating figure. But when we had played more than 500 games,
it only mattered about 5-10 points.
The conclusion was that it is important to play as many games as possible against
as many opponents as possible. Then a biased result will drown in the flood of games.
One can say that the team behind MChess (Marty Hirsch/Sandro Necchi) has made this
problem bigger, because MChess has killer lines against several opponents. And many
of them!
The killer library (right or wrong) does of course have an effect on the rating, but
not as much as you (Ed) believe. I wanted to give you the correct proportions of this
"problem", so I ran some tests with our rating program. It is easy to just remove some
results and then make a new list.
So I removed all results between MChess 5.0 and Rebel6, Hiarcs3, Genius2 and Genius3
- programs that MCPro5 is said to be cooked against. Altogether 170 games.
First: Here is the top of the official rating list from the 11th of September.
THE SSDF RATING LIST 1996-09-11 50990 games played by 156 computers
Rating + - Games Won Oppo
------ --- --- ----- --- ----
1 Genius 3.0 Pentium 90 MHz 2420 29 -28 626 64% 2320
2 MChess Pro 5.0 Pentium 90 MHz * 2418 28 -27 699 65% 2313
3 Rebel 6.0 Pentium 90 MHz 2415 31 -31 520 60% 2339
4 Rebel 7.0 Pentium 90 MHz 2412 28 -27 671 61% 2330
5 Genius 4.0 Pentium 90 MHz 2409 27 -26 705 65% 2298
6 Hiarcs 4.0 Pentium 90 MHz 2392 30 -30 545 57% 2341
7 Genius 4.0 486/50-66 MHz 2391 31 -31 516 60% 2319
8 Nimzo 3.0 Pentium 90 MHz 2388 30 -29 577 60% 2314
9 Hiarcs 3.0 Pentium 90 MHz 2380 31 -30 525 57% 2333
10 MChess Pro 4.0 Pentium 90 MHz 2367 30 -30 538 54% 2341
11 Genius 3.0 486/50-66 MHz 2362 24 -24 870 63% 2265
12 Fritz 3.0 Pentium 90 MHz 2361 29 -29 593 55% 2324
13 R30 v. 2.5 2356 52 -48 215 68% 2226
2 MChess Pro 5.0 Pentium 90 MHz, 2418
Genius 3 P90 13-7 Rebel 6.0 P90 16-4 Rebel 7.0 P90 8.5-11.5
Genius 4 P90 10-10 Hiarcs 4 P90 6.5-13.5 Geniu4 486/66 9.5-10.5
Nimzo 3.0 P90 6-14 Hiarcs 3 P90 16.5-3.5 MCPro 4.0 P90 9-11
Geniu3 486/66 11.5-8.5 Fritz 3.0 P90 12.5-7.5 R30 v. 2.5 8-12
MCPro5 486/66 10-10 Rebel7 486/66 11.5-8.5 Geniu2 486/66 15-5
Kallis198 P90 11-9 WChess P90 6-14 MCPr40 486/66 12-8
Fritz 4.0 P90 2-4 WChess 486/66 14.5-5.5 Hiarc3 486/66 16-4
Rebel6 486/66 16-4 Genius 68 030 7.5-2.5 CM30 King 2.0 21-8
ChGen1 486/66 22-8 MCPr35 486/66 15.5-4.5 Decade P90 13-7
Fritz3 486/66 12-8 Lyon 68030 15-5 Comet32 P90 14.5-5.5
Kallis 486/66 33-7 SPARC 20 MHz 14.5-5.5 Meph. RISC 18.5-1.5
Chess M. King 4-0 Sapphire 19.5-0.5
And here is the same list without the 170 games for MCPro5:
SAME LIST - GAMES REMOVED! 50820 games played by 156 computers
Rating + - Games Won Oppo
------ --- --- ----- --- ----
1 Genius 3.0 Pentium 90 MHz 2425 30 -28 606 65% 2318
2 Rebel 6.0 Pentium 90 MHz 2424 32 -31 500 62% 2337
3 Rebel 7.0 Pentium 90 MHz 2412 28 -27 671 61% 2330
4 Genius 4.0 Pentium 90 MHz 2408 27 -26 705 65% 2298
5 Hiarcs 4.0 Pentium 90 MHz 2392 30 -30 545 57% 2341
6 Genius 4.0 486/50-66 MHz 2391 31 -31 516 60% 2319
7 Hiarcs 3.0 Pentium 90 MHz 2389 32 -31 505 58% 2330
8 Nimzo 3.0 Pentium 90 MHz 2388 30 -29 577 60% 2314
9 MChess Pro 5.0 Pentium 90 MHz * 2386 31 -30 529 62% 2302
10 MChess Pro 4.0 Pentium 90 MHz 2367 30 -30 538 54% 2341
11 Genius 3.0 486/50-66 MHz 2363 25 -24 850 64% 2263
12 Fritz 3.0 Pentium 90 MHz 2361 29 -29 593 55% 2324
13 R30 v. 2.5 2353 52 -48 215 68% 2223
9 MChess Pro 5.0 Pentium 90 MHz, 2386
Rebel 7.0 P90 8.5-11.5 Genius 4 P90 10-10 Hiarcs 4 P90 6.5-13.5
Geniu4 486/66 9.5-10.5 Nimzo 3.0 P90 6-14 MCPro 4.0 P90 9-11
Fritz 3.0 P90 12.5-7.5 R30 v. 2.5 8-12 Rebel7 486/66 11.5-8.5
MCPro5 486/66 10-10 Kallis198 P90 11-9 WChess P90 6-14
MCPr40 486/66 12-8 Fritz 4.0 P90 2-4 WChess 486/66 14.5-5.5
CM30 King 2.0 21-8 ChGen1 486/66 22-8 MCPr35 486/66 15.5-4.5
Fritz3 486/66 12-8 Lyon 68030 15-5 Comet32 P90 14.5-5.5
Kallis 486/66 33-7 SPARC 20 MHz 14.5-5.5 Meph. RISC 18.5-1.5
Chess M. King 4-0 Sapphire 19.5-0.5
The "cooking" has this far given MCPro5 32 ratingpoints, (and not more than 100!)
32 points is not much, but of course it looks much better to be No 2 than No 9!
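For anyone who wants to try this kind of "remove the games and recompute" exercise
themselves, here is a rough sketch in Python of how a block of one-sided results
pulls a performance-style rating up or down. It is only an illustration of the idea:
the SSDF rates all programs jointly over the whole database, their actual method is
not shown here, and every number below is invented.

import math

def expected_score(rating, opp_rating):
    # Standard Elo expectation for a single game.
    return 1.0 / (1.0 + 10.0 ** ((opp_rating - rating) / 400.0))

def performance_rating(results, guess=2400.0):
    # results: list of (opponent_rating, score) pairs, score in [0, 1].
    # Newton iteration for the rating whose expected total score
    # equals the actual total score.
    actual = sum(score for _, score in results)
    rating = guess
    for _ in range(50):
        expectations = [expected_score(rating, opp) for opp, _ in results]
        expected = sum(expectations)
        slope = sum(e * (1.0 - e) for e in expectations) * math.log(10) / 400.0
        rating += (actual - expected) / slope
    return rating

# Invented example: 500 "normal" games scoring 62% against ~2300 opposition,
# plus a block of 170 one-sided games scoring 80% against ~2400 opposition.
normal = [(2300.0, 0.62)] * 500
cooked = [(2400.0, 0.80)] * 170

print(round(performance_rating(normal + cooked)))   # with the one-sided block
print(round(performance_rating(normal)))            # with that block removed

With invented inputs like these, the estimate drops noticeably once the one-sided
block is removed -- the same kind of effect, though not the same size, as the 32
points reported above, since a real list spreads the damage over many more
opponents and games.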
Well, I will follow this thread hoping that somebody will have an acceptable
solution to take out the plus effect of killer books! Of course I am also
interested in "the real" playing strength of the programs. But please remember
that the opening library must be one part of this strength.
I think that only the programmers can do anything about this; I don't really
think that the SSDF can prevent it. I agree with Chris Whittington that learning
functions can help out. And I will also point out that a program with many, many
variations in its library and with good, random play will be more difficult to
trap with killer lines.
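That last point -- a wide book with several weighted replies per position, chosen
at random, combined with a little learning -- can be made concrete with a small
sketch. This is only a toy illustration under my own assumptions: the position key,
moves, weights and function names are made up and do not come from any real
program's book code.

import random

# Toy book: position key -> list of [move, weight] candidates.
book = {
    "startpos": [["e2e4", 4], ["d2d4", 4], ["c2c4", 2], ["g1f3", 2]],
}

def pick_book_move(position):
    # Choose a reply at random, favouring higher weights, so an opponent
    # cannot force the same "killer line" game after game.
    entries = book.get(position)
    if not entries:
        return None
    moves = [move for move, _ in entries]
    weights = [weight for _, weight in entries]
    return random.choices(moves, weights=weights, k=1)[0]

def learn_from_game(book_moves_played, result):
    # Very small "book learning": after a lost game, halve the weight of
    # the book moves that were played, so a trap that works once is much
    # less likely to be walked into again.
    if result != "loss":
        return
    for position, move in book_moves_played:
        for entry in book.get(position, []):
            if entry[0] == move:
                entry[1] = max(1, entry[1] // 2)

# Example: pick an opening move, then pretend the game was lost.
move = pick_book_move("startpos")
learn_from_game([("startpos", move)], "loss")

The weighted random choice gives the variety in play mentioned above, and the
weight reduction after a loss is the simplest possible form of book learning.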
At the same time it would be nice for us humans to play against programs with
a wide variety of play.
Goran Grottling (who once started the SSDF rating list...)
PS. BTW, I can confirm your news about Rebel8 on the next rating list. After 511
games it has a rating of 2479 and is indeed the new Number One! Here are our results
so far:
1 Rebel 8.0 Pentium 90 MHz, 2479
Genius 3 P90 11.5-8.5 MCPro 5.0 P90 13.5-6.5 Rebel 6.0 P90 10.5-9.5
Rebel 7.0 P90 12.5-7.5 Genius 4 P90 9.5-10.5 CM5000 P90 1.5-0.5
Hiarcs 4 P90 11.5-8.5 Geniu4 486/66 11.5-8.5 Nimzo 3.0 P90 13.5-6.5
Hiarcs 3 P90 10.5-9.5 MCPro 4.0 P90 15-5 Geniu3 486/66 15-5
Fritz 3.0 P90 15.5-4.5 Rebel7 486/66 15-5 MCPro5 486/66 19-1
Geniu2 486/66 11.5-8.5 Kallis198 P90 15-5 WChess P90 11-9
Hiarc3 486/66 15-5 Rebel6 486/66 14-6 ChGen1 486/66 9.5-1.5
Decade P90 12-8 MCPr35 486/66 15.5-4.5 Fritz3 486/66 17-3
Lyon 68030 14-6 Kallis 486/66 20.5-3.5 SPARC 20 MHz 10-3
Meph. RISC 1-0
The next official list will appear on October 23.