Stobor's BT results... compare?

Tom Kerrigan

May 10, 1996

I just ran the BT suite. Here are the seconds per position on my 486/80.

 1.    0    11.    4    21.   26
 2.  387    12.  126    22. 1018
 3.  844    13.  250    23.  532
 4. 1964    14.   11    24.  898
 5.  183    15.   64    25.  239
 6.   12    16.   10    26.   23
 7. 2014    17.  227    27.   24
 8.   28    18.  376    28.   11
 9.   32    19. 1872    29.   12
10.  216    20.   62    30.  483

This translates to...

486/80: 2160
p5/133: 2330
p6/200: 2402

I wonder if everybody would post figures for their programs so I can make sure I'm
at least in the right ballpark... Particularly the German programmers, as this
seems to be the German national test suite. :)

Thanks in advance,
Tom

_______________________________________________________________________________
Tom Kerrigan kerr...@frii.com O-

If God had wanted you to go around nude, He would have given you bigger
hands.

Chris Whittington

May 12, 1996

The test often used now is BT2630, which has superseded the previous BT suite. Some new and more difficult
positions have been added (13, 15, 16, 21, 23, 28, 29). Most of these new positions are not solved by any program;
the idea of adding them was that they were considered the type of position that the 'next generation' of programs
might be able to solve. The old positions that were deleted were considered too easy for programs now.

Bednorz Tonissen position results for the updated BT2630 test were listed in May 1994, so there will presumably
be some more up-to-date data somewhere.

Test rules:
Put on infinite thinking.
Set up position. (NB programs like Genius start thinking *within* the setup process - so be quick !)
Leave for 15 minutes (900 secs) and then stop.
If position solution is showing at 15 minutes, then record initial time that this position was solved. If position
not solved, use solution time of 900 seconds.
NB. Finding solution and then changing mind within 15 minutes doesn't count as a solution.
NB. Finding solution, changing mind, then finding solution again - use the solution time for the *latest*
solving.
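
As an illustration of the timing rules above, here is a minimal sketch in Python. It assumes a hypothetical log of (seconds, move) pairs, one entry each time the program changes its preferred move; neither the log format nor the function name comes from the BT test itself.

```python
from typing import List, Optional, Tuple

TIME_LIMIT = 900  # seconds (15 minutes), as stated in the test rules


def recorded_time(best_move_log: List[Tuple[int, str]], solution: str) -> int:
    """Return the solution time to record for one position.

    best_move_log is a hypothetical, chronologically ordered list of
    (seconds, move) entries, one per change of the program's preferred move.
    """
    solved_at: Optional[int] = None
    for seconds, move in best_move_log:
        if seconds > TIME_LIMIT:
            break  # anything after the 15 minute mark is ignored
        if move == solution:
            if solved_at is None:
                solved_at = seconds  # start of the current "solved" stretch
        else:
            solved_at = None  # changed its mind: earlier solving is void
    # Only a solution still showing at the end counts; otherwise use 900.
    return TIME_LIMIT if solved_at is None else solved_at


if __name__ == "__main__":
    log = [(5, "f5g7"), (60, "h4h5"), (300, "f5g7")]
    print(recorded_time(log, "f5g7"))
```

With the sample log, the program finds the move, drops it, and finds it again, so the time of the latest solving (300 seconds) is what gets recorded.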

Formula for German/Swedish ELO:
Add all solution times together (in seconds, and use 900 as time for positions not solved)
Divide result by 30.
Subtract this result from 2630 to get the ELO.

Then add 70-100 to convert to English ELO - contentious !
Or add 120-200 to convert to USCF - also contentious !
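
The arithmetic above can be sketched as follows. The function name and the use of None for an unsolved position are assumptions made for the example; the sample times are the Genius2 column from the results table in this post, converted to seconds.

```python
from typing import Optional, Sequence


def bt2630_elo(times: Sequence[Optional[float]],
               base: float = 2630.0,
               limit: float = 900.0) -> float:
    """German/Swedish ELO estimate from 30 solution times in seconds.

    None marks an unsolved position and is counted as `limit` seconds.
    Capping solved times at `limit` is an assumption of this sketch; under
    the test rules a solved time never exceeds 900 anyway.
    """
    if len(times) != 30:
        raise ValueError("the BT2630 test has exactly 30 positions")
    total = sum(limit if t is None else min(t, limit) for t in times)
    return base - total / 30


# Genius2 (P 90) times from the table in this post, in seconds.
GENIUS2 = [0, 0, 0, 236, 39, 0, 0, 0, 761, 42,
           0, 16, None, 10, None, None, 0, 0, 12, 174,
           None, 21, 95, 18, 20, None, 34, None, None, 41]

if __name__ == "__main__":
    print(round(bt2630_elo(GENIUS2)))
```

For the Genius2 column the times sum to 7819 seconds including seven unsolved positions, which gives 2630 - 7819/30, or about 2369, in line with the figure quoted in the table. The English and USCF conversions are left out because the post gives only contested ranges for them.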

Problems with test.
Apart from all the usual criticisms like small subset, doesn't measure games but positions etc. etc., some very
sharp Austrians detected that a certain program (which shall be nameless) appeared to 'recognise' the BT
positions and solve them accordingly; (in Germany, the BT position results were being taken as close to
the Swedish rating list as a performance indicator and hence test performance was very important for
commercial programs).
The detective work was to invert the positions (e.g. black pieces became white and vice versa) and then
get the program to solve as normal. The program in question appeared to give a 0.15 pawn bonus to the
solution move of any BT position that it detected. Of course the program didn't recognise the inverted
positions and failed to solve many of them, but those it did solve from the inverted position were given
an evaluation 0.15 pawn lower !
Last year, I asked someone close to the team of the program in question about this. He didn't deny it but
said 'everybody does it' .......
Personally, I don't think everybody does do it (the Austrian detectives didn't report any other occurrences,
and I think they tested several programs for such creative solving capability), but tests that have been in
existence for a while do have to be treated with caution.
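
The inversion described above can be sketched in a few lines of Python. This is only a guess at how such a check might be set up, not the Austrians' actual procedure; the function name is made up, and the board string is assumed to list the eight ranks from rank 8 down to rank 1, as in the position data further down. Castling and en passant rights are ignored.

```python
from typing import Tuple


def flip_position(board: str, side: str, move: str) -> Tuple[str, str, str]:
    """Swap colours: mirror the board top to bottom and swap piece case.

    board: eight ranks separated by '/', rank 8 first, lower case = black.
    side:  'w' or 'b' (side to move).
    move:  key move in coordinate form, e.g. 'f5g7'.
    """
    ranks = board.split("/")
    if len(ranks) != 8 or side not in ("w", "b"):
        raise ValueError("expected 8 ranks and a side of 'w' or 'b'")
    # Reversing the rank order mirrors the board; swapcase() turns white
    # pieces into black ones and vice versa (digits are left unchanged).
    flipped_board = "/".join(rank.swapcase() for rank in reversed(ranks))
    flipped_side = "b" if side == "w" else "w"
    # Files stay the same; rank n becomes rank 9 - n.
    flipped_move = "".join(
        str(9 - int(ch)) if ch.isdigit() else ch for ch in move
    )
    return flipped_board, flipped_side, flipped_move


if __name__ == "__main__":
    print(flip_position(
        "rq2r1k1/5pp1/p7/4bNP1/1p2P2P/5Q2/PP4K1/5R1R", "w", "f5g7"))
```

A program with no position-specific knowledge should, in principle, pick the mirrored key move with the same evaluation, so a constant difference between the original and the flipped position is what would stand out.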

Finally, remember that the test positions were chosen so that the results would correlate with the Swedish
rating list of the time, which placed Genius at the top. One could also argue that the test just contains positions
that the 'top' programs are good at solving.

There are many more programs tested than the ones below, but it would take me too long to type them all in,
so this is a subset of results only. The Chess System Tal results are from the current experimental version, tested
as of today. And no, CST doesn't recognise, nor is it tuned in to, the BT positions ! CST is also currently
operating without its endgame code; with it, it has quickly solved positions 2, 5, 23 and 25. The final version will
probably test out at around 2375 or so.

 #  Solution  Genius2  Rebel 6  CM4000  Fritz 3  MChessPro  Hiarcs  C System Tal
                 P 90   486/66  486/66   486/66     486/50  486/50          P 90
 1  Nxg7            0        1      19        2          3       1             0
 2  Bxb6            0     1:13      49     1:14         25    6:41          1:12
 3  Re6             0       44      21        5         12    1:07            11
 4  Qf7          3:56        -       -        -          -   11:42             -
 5  Ka6            39        1    1:13        5         11    1:26          9:39
 6  ...e3           0       33      17     1:16          4      14             3
 7  ...Rd6          0       27       7        2          4      14             1
 8  Rxc6+           0       29    2:50        2          1       1            10
 9  ...g5       12:41        3      10        -         33   12:06         10:44
10  Rxg7+          42       21      27     2:39         14    3:51            38
11  ...Qxh2         0        5      23       24         35       2             1
12  ...Qe4         16     2:05       2       29       3:24       -             2
13  ...Be6          -        -       -        -          -       -            22
14  Rxh7           10        5       3        7         29       4             3
15  e5              -        -       -     3:47          -       -             -
16  ...Nxg2         -        -       -       57          -       -             3
17  Qxf4            0     1:48       6       12         21      24             2
18  d6              0       24    1:12        6       1:35    2:08            14
19  ...f3          12       53    1:31     1:12       1:38       -          3:41
20  Ra2          2:54       11      27        8       6:46    4:03             9
21  Re1             -        -       -        -          -       -             -
22  a3             21        9      58        -         19       4          2:26
23  g4           1:35        -       -        -      13:05       -             -
24  g6             18        -       -     1:23          -    9:52          8:32
25  ...Nd3         20        -    3:37        -          -       -             -
26  f5              -        1      13        -          1       -          1:33
27  e6             34     1:59      40     4:15         11    1:11             -
28  e5              -        -       -        -          -       -             2
29  O-O-O           -     2:42       -        -          -       -             -
30  f4             41       10   14:23     9:26          -       1             -

ELO              2369     2331    2300     2274       2270    2190          2308

Position data is as follows:
Sorry, but I use a slightly strange version of EPD.
lower case=black
board starts at a8.
So in the first position below: black rook a8, black queen b8, black rook e8, black king g8
black pawn f7 etc. ......
white to play, key move f5g7

30 positions : 30 Bednorz. BT=2630
rq2r1k1/5pp1/p7/4bNP1/1p2P2P/5Q2/PP4K1/5R1R/w f5g7 C
6k1/2b2p1p/ppP3p1/4p3/PP1B4/5PP1/7P/7K/w d4b6 C
5r1k/p1q2pp1/1pb4p/n3R1NQ/7P/3B1P2/2P3P1/7K/w e5e6 C
5r1k/1P4pp/3P1p2/4p3/1P5P/3q2P1/Q2b2K1/B3R3/w a2f7 C
3B4/8/2B5/1K6/8/8/3p4/3k4/w b5a6 C
1k1r4/1pp4p/2n5/P6R/2R1p1r1/2P2p2/1PP2B1P/4K3/b e4e3 C
6k1/p3q2p/1nr3pB/8/3Q1P2/6P1/PP5P/3R2K1/b c6d6 C
2krr3/1p4pp/p1bRpp1n/2p5/P1B1PP2/8/1PP3PP/R1K3B1/w d6c6 C
r5k1/pp2p1bp/6p1/n1p1P3/2qP1NP1/2PQB3/P5PP/R4K2/b g6g5 C
2r3k1/1qr1b1p1/p2pPn2/nppPp3/8/1PP1B2P/P1BQ1P2/5KRR/w g1g7 C
1br3k1/p4p2/2p1r3/3p1b2/3Bn1p1/1P2P1Pq/P3Q1BP/2R1NRK1/b h3h2 C
8/pp3k2/2p1qp2/2P5/5P2/1R2p1rp/PP2R3/4K2Q/b e6e4 C
2bq3k/2p4p/p2p4/7P/1nBPPQP1/r1p5/8/1K1R2R1/b c8e6 C
3r1rk1/1p3pnp/p3pBp1/1qPpP3/1P1P2R1/P2Q3R/6PP/6K1/w h3h7 C
2b1q3/p7/1p1p2kb/nPpN3p/P1P1P2P/6P1/5R1K/5Q2/w e4e5 C
2krr3/pppb1ppp/3b4/3q4/3P3n/2P2N1P/PP2B1P1/R1BQ1RK1/b h4g2 C
4r1k1/p1qr1p2/2pb1Bp1/1p5p/3P1n1R/3B1P2/PP3PK1/2Q4R/w c1f4 C
8/4p3/8/3P3p/P2pK3/6P1/7b/3k4/w d5d6 C
3r2k1/pp4B1/6pp/PP1Np2n/2Pp1p2/3P2Pq/3QPPbP/R4RK1/b f4f3 C
r4rk1/5p2/1n4pQ/2p5/p5P1/P4N2/1qb1BP1P/R3R1K1/w a1a2 C
k7/8/PP1b2P1/K2Pn2P/4R3/8/6np/8/w e4e1 C
rnb1k2r/pp2qppp/3p1n2/2pp2B1/1bP5/2N1P3/PP2NPPP/R2QKB1R/w a2a3 C
8/7p/8/p4p2/5K2/Bpk3P1/4P2P/8/w g3g4 C
R7/3p3p/8/3P2P1/3k4/1p5p/1P1NKP1P/7q/w g5g6 C
8/8/3k1p2/p2BnP2/4PN2/1P2K1p1/8/5b2/b e5d3 C
2r3k1/pbr1q2p/1p2pnp1/3p4/3P1P2/1P1BR3/PB1Q2PP/5RK1/w f4f5 C
3r2k1/p2r2p1/1p1B2Pp/4PQ1P/2b1p3/P3P3/7K/8/w e5e6 C
rnb1k1nr/p2p1ppp/3B4/1p1N1N1P/4P1P1/3P1Q2/PqP5/R4Kb1/w e4e5 C
r1b1kb1r/pp1n1ppp/2q5/2p3B1/Q1B5/2p2N2/PP3PPP/R3K2R/w e1c1 C
2k5/2p3Rp/p1pb4/1p2p3/4P3/PN1P1P2/1P2KP1r/8/w f3f4 C
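
For anyone wanting to feed the list above to a program that expects ordinary EPD, here is a rough conversion sketch in Python. Everything in it beyond the layout described above is an assumption: the function names are invented, the trailing "C" is ignored because its meaning is not explained, the en passant field is always set to "-", and castling rights are only guessed from kings and rooks standing on their home squares, since the notation does not record them. The key move is returned separately in coordinate form because the standard EPD "bm" opcode expects SAN.

```python
from typing import List, Tuple


def expand_rank(rank: str) -> str:
    """Expand digits into '.' so that a rank is always 8 characters."""
    row = "".join("." * int(ch) if ch.isdigit() else ch for ch in rank)
    if len(row) != 8:
        raise ValueError(f"bad rank: {rank!r}")
    return row


def guess_castling(ranks: List[str]) -> str:
    """Guess castling rights from home squares (an assumption, see above)."""
    top, bottom = expand_rank(ranks[0]), expand_rank(ranks[7])
    rights = ""
    if bottom[4] == "K":
        rights += "K" if bottom[7] == "R" else ""
        rights += "Q" if bottom[0] == "R" else ""
    if top[4] == "k":
        rights += "k" if top[7] == "r" else ""
        rights += "q" if top[0] == "r" else ""
    return rights or "-"


def to_epd(line: str) -> Tuple[str, str]:
    """Convert '<8 ranks>/<side> <move> C' into (EPD position, key move)."""
    fields = line.split()
    if len(fields) < 2:
        raise ValueError("expected a position and a key move")
    placement, side = fields[0].rsplit("/", 1)
    ranks = placement.split("/")
    if len(ranks) != 8 or side not in ("w", "b"):
        raise ValueError("expected 8 ranks followed by '/w' or '/b'")
    return f"{placement} {side} {guess_castling(ranks)} -", fields[1]


if __name__ == "__main__":
    print(to_epd("rq2r1k1/5pp1/p7/4bNP1/1p2P2P/5Q2/PP4K1/5R1R/w f5g7 C"))
```

The castling guess matters for position 29, whose key move e1c1 is queenside castling; with kings and rooks all on their home squares the sketch assigns full rights to both sides there.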

best of luck

Chris Whittington

Tom Kerrigan

May 13, 1996

Hum, looks like either nobody has the BT test or nobody has a chess program! Just
my luck...

Cheers,
Tom

_______________________________________________________________________________
Tom Kerrigan kerr...@frii.com O-

Harrisberger's Fourth Law of the Lab:
Experience is directly proportional to the amount of equipment
ruined.

Ed Schröder

May 13, 1996

Chris Whittington wrote..

>Problems with test.
>Apart from all the usual criticisms like small subset, doesn't measure
>games but positions etc. etc., some very sharp Austrians detected that a
>certain program (which shall be nameless) appeared to 'recognise' the
>BT positions and solve them accordingly; (in Germany, the BT position
>results were being taken as close to the Swedish rating list as a
>performance indicator and hence test performance was very important
>for commercial programs). The detective work was to invert the
>positions (eg black pieces became white and vice-versa) and then get
>the program to solve as normal. The program in question appeared to have
>given the solution move to any BT position that it detected a 0.15 pawn
>bonus. Of course the program didn't recognise the inverted positions and
>failed to solve many of them, but those it did solve from the inverted
>position were given 0.15 pawn less evaluation !

It's obvious Chris is referring to Rebel 7.0

I don't know why Chris wants this to be a secret because the subject
has been discussed in a few magazines openly.

For details you can check:
CSS, 95/3 page 51
PC SCHACH, 95/1 page 21
For those who can read German.

The system involved was simply the beginning of a learning feature
like several other chessprograms have and the suggestion that I was
trying to fool customers by manipulating 3-4 positions of the BT2630
test is not only incredible but also unjustified.

>Last year, I asked someone close to the team of the program in question
>about this. He didn't deny it but said 'everybody does it' .......

Chris, I don't know who this could have been, but you could have
asked *ME* straightaway, I was in the same tournament hall as you know.

>Finally remember that the test positions were chosen so that the results
>would correlate with the Swedish rating list of the time, which placed
>Genius at the top.

Well, it looks like you make it a habit spreading bad rumours about
programmers. I remember about 2 months ago you were also accusing
Richard Lang for deliberately weakening his Genius program here in
the newsgroup.

I know Richard Lang personally as a very fair person and I refuse to
believe this kind of unfounded bullshit.

- Ed Schroder -

Robert Hyatt

May 13, 1996

In article <4n70be$g...@news.xs4all.nl>,
Ed Schröder <10065...@compuserve.com> wrote:
-->Chris Whittington wrote..
-->
-->>Problems with test.
-->>Apart from all the usual criticisms like small subset, doesn't measure
-->>games but positions etc. etc., some very sharp Austrians detected that a
-->>certain program (which shall be nameless) appeared to 'recognise' the
-->>BT positions and solve them accordingly; (in Germany, the BT position
-->>results were being taken as close to the Swedish rating list as a
-->>performance indicator and hence test performance was very important
-->>for commercial programs). The detective work was to invert the
-->>positions (eg black pieces became white and vice-versa) and then get
-->>the program to solve as normal. The program in question appeared to have
-->>given the solution move to any BT position that it detected a 0.15 pawn
-->>bonus. Of course the program didn't recognise the inverted positions and
-->>failed to solve many of them, but those it did solve from the inverted
-->>position were given 0.15 pawn less evaluation !
-->
-->It's obvious Chris is referring to Rebel 7.0
-->
-->I don't know why Chris wants this to be a secret because the subject
-->has been discussed in a few magazines openly.
-->
-->For details you can check:
-->CSS, 95/3 page 51
-->PC SCHACH, 95/1 page 21
-->For those who can read German.
-->
-->The system involved was simply the beginning of a learning feature
-->like several other chessprograms have and the suggestion that I was
-->trying to fool customers by manipulating 3-4 positions of the BT2630
-->test is not only incredible but also unjustified.
-->
-->>Last year, I asked someone close to the team of the program in question
-->>about this. He didn't deny it but said 'everybody does it' .......
-->
-->Chris, I don't know who this could have been, but you could have
-->asked *ME* straightaway, I was in the same tournament hall as you know.
-->
-->>Finally remember that the test positions were chosen so that the results
-->>would correlate with the Swedish rating list of the time, which placed
-->>Genius at the top.
-->
-->Well, it looks like you make it a habit spreading bad rumours about
-->programmers. I remember about 2 months ago you were also accusing
-->Richard Lang for deliberately weakening his Genius program here in
-->the newsgroup.
-->
-->I know Richard Lang personally as a very fair person and I refuse to
-->believe this kind of unfounded bullshit.
-->
-->- Ed Schroder -
-->

This is exactly why, when asked what program to buy, I give my own opinion,
but also suggest that anyone give the programs a "test drive". I'd be willing
to bet that given a year or two at most, I know how to write a program that would
climb to the top of the Swedish list. However, this same program would very
likely fare very poorly against strong human opponents. Would that be fair?
Who knows? As I've said, playing well against computers is one thing, playing
well against humans is another, playing well against *both* is yet another.

These problem suites likely represent a really bad source of information for
several reasons, the most obvious being that once a suite is used, it's not too
difficult to optimize a program for them. If you want to investigate this, take
the CCR test and run it on Crafty. The last time I did, I got numbers in the
40+ range. Take the rating this produces, and then compute crafty's rating by
playing a match against Genius. You'll find the two ratings significantly
different. Which rating is more accurate? Who knows, maybe I have a killer
anti-Genius book. :) Or maybe I have an anti-anti-test-suite approach. :)

Personally, if I were selling Crafty, I'd probably do anything I could to make it
appear as strong as possible. Anti-computer play, "cooking suites", etc. It seems
no more "far out" to cook suites than it does to "cook books"... :)

Bottom line simply must be "play it yourself" before you buy anything. When there are
enough programs out there, statistical analysis of ratings on suites vs. ratings in
OTB tournaments will likely be interesting. I'd have to say that Ed deserves a
pat on the back for ingenuity. :)
--
Robert Hyatt Computer and Information Sciences
hy...@cis.uab.edu University of Alabama at Birmingham
(205) 934-2213 115A Campbell Hall, UAB Station
(205) 934-5473 FAX Birmingham, AL 35294-1170

Vincent Diepeveen

May 13, 1996

>Ed Schroeder wrote...

>>Chris Whittington wrote..
>>Problems with test.

[About the difference in solving known testsets problems and only inverting them]

>It's obvious Chris is referring to Rebel 7.0

To speak more generally, it is very STRANGE that
certain problems that are in test sets are solved quickly by most commercial
programs, whereas similar problems are not solved that quickly.

A testset should be secret!

>I don't know why Chris wants this to be a secret because the subject
>has been discussed in a few magazines openly.
>
>For details you can check:
>CSS, 95/3 page 51
>PC SCHACH, 95/1 page 21
>For those who can read German.
>
>The system involved was simply the beginning of a learning feature
>like several other chessprograms have and the suggestion that I was
>trying to fool customers by manipulating 3-4 positions of the BT2630
>test is not only incredible but also unjustified.

For example, take Genius and some of the BK test set problems.
Now make a bad positional move A (tactically not losing) after entering the position,
take it back, and let the program think about it. It will very probably
prefer the bad positional move A.
Usually, however, a user will only input the right move,
and then Genius will very quickly find the answer.

Besides the fact that the 'learning' algorithm is very bad (although I also wouldn't know
how to make it better; I guess it is not possible to program intelligence),
most users, knowing nothing about chess programs, get a
wrong impression, which can be seen as a form of cheating.

>I know Richard Lang personally as a very fair person and I refuse to
>believe this kind of unfounded bullshit.

Very fair persons can program dirty tricks.

Maybe I'll program it in next week... :)

Diep scores about 2300 currently at the BT2450 test, it simply needs a certain
plydepth before it finds a certain combination, which takes at least a certain time... :)

'Smart learning algorithms' will decrease this depth and easily get
its elo to 2400... :)

Vincent Diepeveen
vdie...@cs.ruu.nl
--
+--------------------------------------+
|| email : vdie...@cs.ruu.nl ||
|| Vincent Diepeveen ||
+======================================+

Chris Whittington

May 13, 1996

"Ed Schröder" <10065...@compuserve.com> wrote:
>
> Chris Whittington wrote..

>
> >Problems with test.
> >Apart from all the usual criticisms like small subset, doesn't measure
> >games but positions etc. etc., some very sharp Austrians detected that a
> >certain program (which shall be nameless) appeared to 'recognise' the
> >BT positions and solve them accordingly; (in Germany, the BT position
> >results were being taken as close to the Swedish rating list as a
> >performance indicator and hence test performance was very important
> >for commercial programs). The detective work was to invert the
> >positions (eg black pieces became white and vice-versa) and then get
> >the program to solve as normal. The program in question appeared to have
> >given the solution move to any BT position that it detected a 0.15 pawn
> >bonus. Of course the program didn't recognise the inverted positions and
> >failed to solve many of them, but those it did solve from the inverted
> >position were given 0.15 pawn less evaluation !
>

> It's obvious Chris is referring to Rebel 7.0
>
> I don't know why Chris wants this to be a secret because the subject
> has been discussed in a few magazines openly.

For many people it is probably not at all obvious that the program
in question was Rebel.

'Secret', because I posted the above not to generate a fight with you (I save
that for the ICCA :) ), but as a piece of info (already in the public
domain) illustrating a problem with test suites.
Isn't this the forum for such discussions / reports ?

>
> For details you can check:
> CSS, 95/3 page 51
> PC SCHACH, 95/1 page 21
> For those who can read German.
>
> The system involved was simply the beginning of a learning feature
> like several other chessprograms have and the suggestion that I was
> trying to fool customers by manipulating 3-4 positions of the BT2630
> test is not only incredible but also unjustified.

Your language - not mine.

>
> >Last year, I asked someone close to the team of the program in question
> >about this. He didn't deny it but said 'everybody does it' .......
>

> Chris, I don't know who this could have been, but you could have
> asked *ME* straightaway, I was in the same tournament hall as you know.

It was not this AEGON (where we spoke) but last year, I spoke with
your operator (I'm not sure of his name). True, could have spoken to
you, but you weren't right there at the time in question, and I was only
there for a short time.

>
> >Finally remember that the test positions were chosen so that the results
> >would correlate with the Swedish rating list of the time, which placed
> >Genius at the top.

Ed - I really am *not* trying to make a fight with you.

There was no reference to your name, and although this issue is in the
public domain since the publishing of the story last year in the
magazines you list, I deliberately didn't mention your name.

But now you answer it in this aggressive way, I ask:

1. The Austrians were first surprised when Rebel didn't solve one
of the BT positions when an unimportant pawn was placed on a different
square. In theory this pawn shouldn't have made any difference.

2. The Austrians then inverted the positions and ran the tests again.
They then allege a constant positive 0.15 pawn differential for the
solution move from the original position to the inverted position, in
several of the BT positions.

Question 1: Why ?

I can see that if you have an algorithm that detects 'already solved'
positions via a hash lookup (as Mchess 5 is alleged to have) then you
might well make an addition to the evaluation, to 'help' find the move
again if you have less time or something. Nothing wrong with that. But
maybe it shouldn't apply to well-known test positions.
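
To make the idea in the paragraph above concrete, here is a toy sketch in Python of a position-keyed learn table that nudges the evaluation of a remembered move. It is purely hypothetical and is not a description of Rebel, Mchess or any other program: the class, the function, and the use of the position string as the key are all invented for illustration (a real program would presumably key on a hash of the position), and the 0.15 figure is simply the differential alleged in this thread.

```python
from typing import Dict

LEARN_BONUS = 0.15  # pawns; the differential alleged in the thread


class LearnTable:
    """Hypothetical store of positions seen before and the move found."""

    def __init__(self) -> None:
        self._solved: Dict[str, str] = {}

    def remember(self, position: str, move: str) -> None:
        self._solved[position] = move

    def bonus(self, position: str, move: str) -> float:
        # Only an exact position match earns the bonus.
        return LEARN_BONUS if self._solved.get(position) == move else 0.0


def pick_move(position: str, raw_scores: Dict[str, float],
              table: LearnTable) -> str:
    """Choose the move with the best score after adding any learn bonus."""
    return max(raw_scores,
               key=lambda m: raw_scores[m] + table.bonus(position, m))


if __name__ == "__main__":
    table = LearnTable()
    table.remember("original position", "f5g7")
    scores = {"f5g7": 0.10, "h4h5": 0.20}
    print(pick_move("original position", scores, table))
    print(pick_move("colour-flipped position", scores, table))
```

Because the lookup is by exact position, the remembered move wins in the original position but not in the colour-flipped one, which is the kind of asymmetry the inversion test would expose.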

Question 2: If this is the case, does Rebel have a BT ELO score higher
than if it were to meet the test positions for the first time ?

I've read all the Schach and Computer Spiele's since the original
article - and have not come across any comment or statement or
further information on the subject.
The implication of the original German article (or maybe a reading in
to it) was that something slightly naughty was going on. A rebuttal
from you at the time would have been in order.
My German is not perfect by any means, so I may have missed it; but,
if there was nothing printed:

Question 3. Did you send in a rebuttal of the original article ?
Was it printed ? If not, why not ?

>
> Well, it looks like you make it a habit spreading bad rumours about
> programmers. I remember about 2 months ago you were also accusing
> Richard Lang for deliberately weakening his Genius program here in
> the newsgroup.

>
> I know Richard Lang personally as a very fair person and I refuse to
> believe this kind of unfounded bullshit.
>

Now this is unreal.
1. It was not an accusation.
2. I suggested that his program has less knowledge now and more
speed to deal with fast program opposition (Fritz). I also predicted that
he would start restoring knowledge again now that the 'main' opposition,
namely Mchess, was a knowledge and not speed program.
This seems a reasonable and open subject for debate. Since nobody came
in to agree with my view, and you and one other disagreed (in a normal
debating way) I didn't mention it again.
3. The concept of 'deliberately weakening' is nonsense, obviously he's
trying to make it 'stronger' by taking out knowledge from the evaluation
and thereby gaining speed from the search. Bob Hyatt has the analogy of
the diameter and height of a cylinder to illustrate this phenomenon.
4. To suggest this as spreading a bad rumour is outrageously out of order.

Please tell me the function of rec.games.chess.computer if it is not
to have debate about issues of program strength/style and so on ...... ?

In this case, I have to say that it's *you* making the unfounded
accusation, an unsubtle twisting of my original post about Genius,
presumably with the intent of diverting attention from the initial
point above.

> - Ed Schroder -
>
>
>

OK, so my style is controversial; various people (as well as you) have
got upset. Personally I believe in openness. There is a great deal
of 'behind the scenes' gossip in the computer chess world (and I'm not
referring to you, Ed), hardly surprising when it is occupied by those great
co-operators and giants of social intercourse, programmers and chess
players. Sometimes, in the past few years, I've been on
the receiving end of the gossip (all unjustified, of course :)).

Personally, I think some controversy and asking of difficult questions
makes life (a) more interesting and (b) more understandable. But if you'd
like me to just shut up, then please say so.

Best wishes,

Chris Whittington

May 13, 1996

kerr...@frii.com (Tom Kerrigan) wrote:
>
> Hum, looks like either nobody has the BT test or nobody has a chess program! Just
> my luck...
>
> Cheers,
> Tom
>

Come on, Tom :)

My reply above gave what you wanted - namely figures for various programs.
The thread got diverted into a did Rebel or didn't Rebel, but the
figures are there.

Test suites are dead useful in seeing what other programs can do:
'if they can do it, then why can't I?' sort of thing.

That's why the BT test suites are so neat: because of the large
number of results quoted, it is possible not only to look for
reasons why one's own program doesn't perform, but also to look
for patterns in the solving capability and style of other programs.

best regards,

Chris Whittington

Tom Kerrigan

May 14, 1996

Hum, my last post was pretty sarcastic because I hadn't received any replies to
this thread for a few days at least (maybe a week? can't remember...). I just now
realize that there has been a bit of discussion, and I apologize if anybody took
offense.

I hope to have Stobor's results on the new BT26?? soon!

Cheers,
Tom

_______________________________________________________________________________
Tom Kerrigan kerr...@frii.com O-

God is Dead
-- Nietzsche
Nietzsche is Dead
-- God
Nietzsche is God
-- The Dead

Vincent Diepeveen

May 14, 1996

Last night i tested Diep at the 2630 test.

Result on a P100: 2290 currently, but I know what this means:
it is better to say between 2200 and 2400.

>God is Dead
> -- Nietzsche
>Nietzsche is Dead
> -- God

It is much better so than the extended version!

Chris Whittington

May 14, 1996

How did you do on the 'new' positions? Hardly any of the current
top programs can solve any of them (except for CST of course :))

Chris Whittington

Eric Hallsworth

May 14, 1996

In article <4n8j3q$7...@europa.frii.com>, Tom Kerrigan
<kerr...@frii.com> writes

>
>_______________________________________________________________________________
>Tom Kerrigan kerr...@frii.com O-
>

>God is Dead
> -- Nietzsche
>Nietzsche is Dead
> -- God

>Nietzsche is God
> -- The Dead

Jesus is Alive... the Bible
--
Best wishes,
Eric Hallsworth, Computer Chess Magazine, The Red House,
46 High Street, Wilburton, Cambs CB6 3RA

Ed Schröder

May 14, 1996

Rebel and the BT test...
As explained in my previous posting, the system involved was a first
attempt to write a totally new learning algorithm, not doing it like
other commercial chess programs but with a totally new system. It will
be operational next year (we hope).

I would be a fool to explain my ideas here in the newsgroup *at
this moment*, so how the system works will remain a secret till then.
It's my idea and for the moment I'd like to keep it that way.
Logical enough?

>In this case, I have to say that it's *you* making the unfounded
>accusation, an unsubtle twisting of my original post about Genius,
>presumably with the intent of diverting attention from the initial
>point above.

Chris, read the above, that's all there is to say.

AEGON tournament...
Maybe you can't remember anymore, but we did speak to each other at the
AEGON tournament both *this* year and *last* year.
Both years I was present there every evening.
So something is wrong here.

Genius...
I remember very well that you did say:
Richard Lang deliberately weakened his program.
I answered that this was highly unlikely.

- Ed Schroder -

Tom Kerrigan

May 15, 1996

Ed, it seems to me that the score on this test suite depends on how fast the
program can solve the problems. From what I've seen so far, this learning feature
that you're talking about is specific to positions, so it follows that if the
feature is helping you on the tests then the program has seen the position before.
It also seems to me that if the position has been seen before, the time it spent
searching *then* should be added to the time it took for the run "now". Correct me
if I'm wrong, of course, but lowering the time it took to solve the position is
cheating, and by ignoring the time spent searching the positions previously you're
doing just this.

Cheers,
Tom

P.S. I haven't converted the new test to an EPD file, so I haven't run it yet. I
hope to do this sometime tonight.

_______________________________________________________________________________
Tom Kerrigan kerr...@frii.com O-

"Two sure ways to tell a sexy male; the first is, he has a bad memory.
I forget the second."

Anders Thulin

May 15, 1996

In article <4n7c94$r...@krant.cs.ruu.nl>,
Vincent Diepeveen <vdie...@cs.ruu.nl> wrote:

>A testset should be secret!

That would be a very effective way of withdrawing it from criticism.

Test sets are tricky animals: there is usually such a lot of hidden
assumptions in them. Some people assume that these assumptions are
eternal verities, and thus get very upset if it turns out that they
aren't.

A good test set should document all such assumptions, and, if
necessary, the method used for verifying them. If there is no such
documentation, there is no way to decide if the test tests anything in
particular, or if it is just another exercise in wishful thinking.

Does the BT test document these assumptions? Or is there any other
way of deciding if it still is a useful test to apply?

--
Anders Thulin Anders...@lejonet.se 013 - 23 55 32
Telia Research AB, Teknikringen 2B, S-583 30 Linkoping, Sweden

Ed Schröder

May 15, 1996

to kerr...@frii.com

kerr...@frii.com (Tom Kerrigan) wrote:

>Ed, it seems to me that the score on this test suite depends on how fast
>the program can solve the problems. From what I've seen so far, this
>learning feature that you're talking about is specific to positions, so
>it follows that if the feature is helping you on the tests then the
>program has seen the position before.
>It also seems to me that if the position has been seen before, the
>time it spent searching *then* should be added to the time it took for
>the run "now".
>Correct me if I'm wrong, of course, but lowering the time it took
>to solve the position is cheating, and by ignoring the time spent
>searching the positions previously you're doing just this.
>Cheers,
>Tom

Hello Tom,

The first program with a learn option was a Fidelity dedicated chess
computer. I forget the name, but it was certainly 3-4 years ago and it was
a program by Dan and Kathe Spracklen. It was hot news at that time.

After that, most commercial PC programs adopted this idea (and of course
other chess programmers did too).

Remember, a learning feature (although not a big one) is a selling argument.
Many customers want to see it in a chess program.

All of today's learning systems are position-based (as far as I know).

We have also planned a new (and unique) learn feature for the future.

- Ed Schroder -

Tom Kerrigan

May 15, 1996

Ed, thanks for the info but I think I understand the basic idea behind modern
chess program learning. That wasn't the point of my post. My point was that
REBEL's BT score was inflated, explanation or no. From the tone of your posts I
got the feeling that you were using the explanation as some sort of excuse, and I
see that as inappropriate. If you weren't, then I'm sorry to have wasted your time.

Anyway! Here are Stobor's times on the BT2630:

 1.    0    11.    5    21.    -
 2.  666    12.  128    22.    -
 3.    -    13.    -    23.    -
 4.    -    14.   12    24.  692
 5.  108    15.  797    25.  346
 6.   12    16.    -    26.   25
 7.    -    17.  232    27.   27
 8.   31    18.  461    28.    -
 9.   34    19.    -    29.    -
10.  236    20.   68    30.  824

This translates to 2143. Clearly not in a professional class whatsoever. I'll look
at a few of the positions tonight to find any glaring errors. I notice that every
program save mine solves position 3 in a few seconds, so I'll probably start with
that. In any case, this should give you an idea of where Stobor stands at this point.

I wonder how other, non-professional programs fare on this test? Ferret, Crafty,
Shredder, Gromit, Francesca, etc...?

Cheers,
Tom

_______________________________________________________________________________
Tom Kerrigan kerr...@frii.com O-

"You've got to think about tomorrow!"

"TOMORROW! I haven't even prepared for *yesterday* yet!"

Chris Whittington

May 15, 1996

"Ed Schröder" <10065...@compuserve.com> wrote:
>
snip-snip

>
> Genius...
> I remembered it very well you did say:
> Richard Lang deliberately weakened his program.
> I answered this was highly unlikely.
>
> - Ed Schroder -
>
>

Ed, you'll be pleased to know that I found the original post about
which we disagree.

Original post starts ********

>On the other issue of Richard Lang's insertion into the Deep Blue team, it
>seems that contrary to general impressions Richard has been
>*removing* knowledge from his program for the past 2 - 3 years.
>
>The reason being that over this time period the main commercial
>threat has been from the strong and well marketed program Fritz in
>Richard's most important market place, Germany. To keep ahead of
>Fritz (he targeted Mchess before this) he took out knowledge (particular,
>king attack terms) in order to fight Fritz. Now that Fritz is beaten
>(in terms of the Swedish list), Richard will turn his attention back
>to Mchess, and will therefore have to restore knowledge.
>
>I'd be interested in what the two other commercial PC programmers on
>this group (Mark Uniacke and Ed Schroder) and others think of
>this analysis.
>
>Best regards,
>
>Chris Whittington

original post ends ********

OK, I concede that I could have made it absolutely clear, and not
relied on common assumptions. It does surprise me, Ed, that you would
need it spelt out so pedantically.

Instead of saying:

'To keep ahead of Fritz (he targeted Mchess before this) he took
out knowledge (particular, king attack terms) in order to fight Fritz'

I could have said:

'To keep ahead of Fritz (he targeted Mchess before this) he took
out knowledge (particular, king attack terms) in order to *GAIN SPEED
AND* fight Fritz'

My assumption was that this was implied - since we all know that
Fritz is a nodes per second and nodes per second and nodes per second
program.

Clear enough now ?

Best regards,

Chris Whittington

May 15, 1996

"Ed Schröder" <10065...@compuserve.com> wrote:
>
> Rebel and the BT test...
> As explained in my previous posting, the system involved was a first
> attempt to write a totally new learning algorithm, not doing it like
> other commercial chess programs but by a totally new system. It will
> be operational next year (we hope).
>
> Now I would be a fool to explain my ideas here in the newsgroup *at
> this moment*, so how the system works will remain a secret until then.
> It's my idea and for the moment I would like to keep it that way.
> Logical enough?

That's logical, but the questions remain unanswered ........

1. Does Rebel do better at the BT tests than it would if it met
them cold ?

2. Since the two German articles went into print last year, you didn't
seem to rebut them. Why not ?

>
> >In this case, I have to say that it's *you* making the unfounded
> >accusation, an unsubtle twisting of my original post about Genius,
> >presumably with the intent of diverting attention from the initial
> >point above.
>

> Chris, read the above, that's all there is to say.

No it isn't, but it's probably all you want to say :)

>
> AEGON tournament...
> Maybe you can't remember anymore, but we did speak to each other at the
> AEGON tournament both *this* year and also *last* year.
> Both years I was present there every evening.
> So something is wrong here.

We spoke this year, but not about the issue in question, it's an old
issue.
Last AEGON (last year), I introduced myself to you, and said 'let's speak
later' - but later never came.

>
> Genius...
> I remembered it very well you did say:
> Richard Lang deliberately weakened his program.

I remember it better. I said I thought he had removed knowledge from
his program, in order to make it faster. That he had done this because
his real threat in the important German market came from Fritz (a fast
program), and he was targeting Fritz as a program to stay ahead of.
Now that the threat of Fritz has receded, he will likely be targeting
Mchess as his dangerous opponent. Mchess is an intelligence program, so
Lang will restore intelligence (and lose speed probably).

This is certainly not saying 'he deliberately weakened his program', and
it is disingenuous of you to suggest so.

You should well know that speed and knowledge are trade-offs.
And you should also well know that it would be totally stupid and
ridiculous to suggest that any programmer would 'deliberately weaken his
program'.

> I answered this was highly unlikely.

This quick answer, I remember this too, indicates that you simply
didn't understand what was posted at the time.
You may not agree with my analysis of Genius, but it's not a light
analysis - it was reached after some thought and experience of many test
games and autoplayer. Maybe it's true/false/partially true or
rubbish - but certainly debatable.

So, rather than uncharitably accuse you of lying by distorting
my post, I shall assume that you simply misunderstood.
Easily done, I suppose.

Anyway, past the Genius side-issue, are you going to answer the
questions posed? You certainly don't need to reveal any trade
secrets in giving a straightforward yes or no to the first, and
a reason for allowing German readers of Schach and Computer Spiele
to 'make their own minds up' on the original published allegations
rather than giving them the benefit of your own view by a rebuttal
at the time?

Chris Whittington

>
> - Ed Schroder -
>
>

Chris Whittington

May 16, 1996

Hi Tom,

Results indicate:

a) no special case endgame code to deal with passed pawn and single
pieces - position 2

b) check and check threat extensions need something more - positions 3,10,12

c) no perpetual check draw detection code, or else short on check
extensions - position 7

BTW - what do you think of the next WMCCC taking place in Indonesia ?

Long way to travel ......

Best regards

Chris Whittington

Bruce Moreland

May 16, 1996

Here is an updated BT2630 table, including my results for Ferret (on two
different machines, so people can compare P6/200 with P5/133 if they wish).

Of the hard problems, Ferret solved #21 fully, it got a score of +2 pawns
at 11 plies and +4 pawns at 14 plies. The program got the solution move
for #28, but is still at -2 pawns at 11 plies. It gets #23 at +6 after 31
minutes, which obviously doesn't count as far as the test is concerned.

Ferret does very badly on some of these, it will be fun to figure out why.
It is the worst program on #1, for instance, even on nice hardware.

Please feel free to add to this table, or to update hardware for the
commercial programs. Also, please feel free to send results directly to
me and I'll put them in my spreadsheet and post them.

I still don't think that suites can provide an accurate estimate of ELO:
it is too easy to cheat and/or manicure, they don't reflect the skill of
the program at getting to a position where it has a tactical win, the
sample is very small, and they often represent pathological cases that
rarely occur in real games.

bruce

PS. When I computed the score for Chess System Tal I got 2310 instead
of 2308.

No. Move  Genius2 Rebel6 CM4000 Fritz3 MChess Hiarcs  CSTal Ferret Ferret Crafty Stobor
            P5/90 486/66 486/66 486/66 486/50 486/50  P5/90 P5/133 P6/200 P6/200 486/80

 1  Nxg7        0      1     19      2      3      1      0    349    142     59      0
 2  Bxb6        0     73     49     74     25    401     72      6      3     69    666
 3  Re6         0     44     21      5     12     67     11    205     88      5    900
 4  Qf7       236    900    900    900    900    702    900    191     80    900    900
 5  Ka6        39      1     73      5     11     86    579      0      0      0    108
 6  e3          0     33     17     76      4     14      3      0      0      4     12
 7  Rd6         0     27      7      2      4     14      1      0      0      1    900
 8  Rxc6+       0     29    170      2      1      1     10      1      0      0     31
 9  g5        761      3     10    900     33    726    644     11      5      1     34
10  Rxg7+      42     21     27    159     14    231     38      7      3     18    236
11  Qxh2        0      5     23     24     35      2      1      1      0      5      5
12  Qe4        16    125      2     29    204    900      2     11      5     17    128
13* Be6       900    900    900    900    900    900     22    900    900    900    900
14  Rxh7       10      5      3      7     29      4      3      6      2      2     12
15* e5        900    900    900    217    900    900    900    900    900    118    797
16* Nxg2      900    900    900     57    900    900      3    900    900    900    900
17  Qxf4        0    108      6     12     21     24      2     91     39     12    232
18  d6          0     24     72      6     95    128     14     17      9     72    461
19  f3         12     53     91     72     95    900    221    246    105     34    900
20  Ra2       174     11     27      8    406    243      9     53     23      4     68
21* Re1       900    900    900    900    900    900    900    246    103    900    900
22  a3         21      9     58    900     19      4    146    900    900     17    900
23* g4         95    900    900    900    785    900    900    900    900     39    900
24  g6         18    900    900     83    900    592    512     56     23    250    692
25  Nd3        20    900    217    900    900    900    900     30     13     12    346
26  f5        900      1     13    900      1    900     93      0      0     39     25
27  e6         34    119     40    255     11     71    900      1      0    118     27
28* e5        900    900    900    900    900    900      2    179     74    900    900
29* O-O-O     900    162    900    900    900    900    900    900    900    900    900
30  f4         41     10    863    566    900      1    900     19      9    423    824
ELO          2369   2331   2300   2275   2270   2190   2310   2392   2426   2406   2143

* = A position considered to be "hard"

Chris Whittington

May 17, 1996

Hi Bruce,

If you can face the typing, I've got a printed listing of 30-40
programs' results for this test (ranging over almost all PC programs
and many dedicated machines).

If you have a fax, I can send them to you.

Best regards,

Chris Whittington

Ed Schroder

May 17, 1996

>From: Chris Whittington <chr...@cpsoft.demon.co.uk>
>>"Ed Schröder" <10065...@compuserve.com> wrote:

>> Rebel and the BT test...
>> As explained in my previous posting, the system involved was a first
>> attempt to write a totally new learning algorithm, not doing it like
>> other commercial chess programs but by a totally new system. It will
>> be operational next year (we hope).
>>
>> Now I would be a fool to explain my ideas here in the newsgroup *at
>> this moment*, so how the system works will remain a secret until then.
>> It's my idea and for the moment I would like to keep it that way.
>> Logical enough?

>That's logical, but the questions remain unanswered ........

>1. Does Rebel do better at the BT tests than it would if it met
>them cold ?

>2. Since the two German articles went into print last year, you didn't
>seem to rebut them. Why not ?

>No it isn't, but it's probably all you want to say :)

Chris, I will *NOT* answer your questions for 2 reasons:

1) As a chess programmer yourself (you are a well-respected chess programmer
and I have a high opinion of your program) you *KNOW* very well that a
learning algorithm works with hashing and that a reversed position will
give a *TOTALLY DIFFERENT* hash key. You *KNOW* that, so you should also
know that the accusation of CSS that I have cheated on 3-4 positions of the
BT-test, based on their "reversed position" theory, is unfounded due to
their lack of knowledge, since only chess programmers understand such
principles. So why answer your questions since you *KNOW* the answer
already? As a chess programmer yourself you *MUST* know that I have *NOT*
cheated on the BT-test and still you want people to believe this.
Now I have a question .... Why, Chris?

2) I did not like the *WAY* you introduced the subject.
To me you only wanted to throw some mud on REBEL.

I will explain myself concerning this...

I am not angry that you have mentioned the BT subject, but I am
angry about the *WAY* you brought it to people's attention.

I see you as a very intelligent guy with smart ideas, you have
proven yourself enough in the past and I respect you very much.
But if you write: "a certain program (which shall be nameless)" a
few times, you stimulate people's curiosity, since they would love
to KNOW who that *mysterious* program could be.

After your initial posting people:
a) Would have asked for the name of the program;
b) Some Europeans would have written ...
"Hey that's Rebel, it was in CSS and PC-Schach ..."
What a bad impression that would make!

Here is my criticism of you, Chris: a smart guy like you could
easily have foreseen this, and I really wonder if you didn't.
To me you planned it from the very start to throw some mud on
Rebel.

I would have had *NO* problem with the whole subject if you had
played it openly, describing the subject and asking me ...
"Hey Ed, how about it?"....

But doing it in the way you did has made me angry.
You call it "controversial", I call it sneaky.

Chris, we are both *commercial* chess programmers, we both try to make
a living by making the best possible chess software. Now this kind of
discussion is *NOT* good for either of us.
We should fight each other in tournaments on the chessboard (not with
words) and have a beer after the game no matter the result.

The other 2 points...

>> AEGON tournament...
>> Maybe you can't remember anymore, but we did speak to each other at the
>> AEGON tournament both *this* year and also *last* year.
>> Both years I was present there every evening.
>> So something is wrong here.

>We spoke this year, but not about the issue in question, it's an old
>issue. Last AEGON (last year), I introduced myself to you, and said
>'let's speak later' - but later never came.

Correct, so you could have asked me straightaway!

The Genius issue...
I remembered you were using the "deliberately weaken his program" term
very well. Since you have reposted the article and the sentence is not
there I must give you the benefit of the doubt.
I don't have the original message, so I resign here.
I guess I owe you an apology concerning the Genius issue.

- Ed Schroder -

Ed Schroder

May 17, 1996

Chris Whittington <chr...@cpsoft.demon.co.uk> wrote:
>"Ed Schröder" <10065...@compuserve.com> wrote:
>>
> snip-snip
>>

>> Genius...
>> I remembered it very well you did say:
>> Richard Lang deliberately weakened his program.

>> I answered this was highly unlikely.
>>

Tom Kerrigan

May 17, 1996

First, I found a quasi-bug in my program by examining a few of the positions where
I was screwing up. A few people have told me that they return a draw score if the
position happens twice, and then they told me that every commercial program does
this. It sounded like "common knowledge" so I coded it, and it didn't hurt a quick
WAC run, so I decided to keep it. What I wasn't told is that if a position occurs
once in the *search* then a draw score is returned. Anyway, this silliness is
fixed and now Stobor gets around 2200 on the BT2630. I don't want to post the
times just yet, because this was a relatively simple fix and maybe there are a few
more to come. No need to flood the newsgroups with jillions of BT times. :)
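
The description above can be read more than one way, and Stobor's actual code is not shown. As a hedged illustration only, here is one common convention for repetition draws in a search: a single earlier occurrence on the current search line is scored as a draw at once, while positions from the real game record need two earlier occurrences. The function name, the integer keys and the two lists are hypothetical, and details such as resetting the history after captures and pawn moves are left out.

```python
DRAW_SCORE = 0  # score returned by the search when a node is a repetition draw

def is_repetition_draw(key, game_history, search_path):
    """key: hash of the current node (assumed to include side to move).
    game_history: keys of positions actually played before the root.
    search_path: keys of positions on the current line, root to parent."""
    # One earlier occurrence inside the search is enough: whoever repeated
    # once can repeat again, so there is no point searching the loop deeper.
    if key in search_path:
        return True
    # Against the real game record, ask for two earlier occurrences, which
    # would make the current node the third occurrence of the position.
    return game_history.count(key) >= 2

history = [11, 22, 11]   # hypothetical keys from the game so far
path = [33, 44]          # hypothetical keys on the current search line

print(is_repetition_draw(44, history, path))  # True: repeats inside the search
print(is_repetition_draw(22, history, path))  # False: seen only once in the game
print(is_repetition_draw(11, history, path))  # True: would be a third occurrence
```

A search would call this at the top of each node and return DRAW_SCORE when it answers True.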

Second, I'd like to address this to Ed, but I'm posting because Chris would
probably like to hear it. Ed: You don't seem to be responding to my posts, and I
thought that it was because our newsgroups are out of synch, but you are avoiding
the question when Chris asks it too. I think all Chris is doing is pointing out
that the REBEL scores are inflated, and you seem to deny this by saying that your
learning function is helping solve the problems. This is still cheating! For
example, I don't clear my hash tables, so I can search a BT position for a week,
and then say "I'm going to time this BT position" and get to 20 ply instantly with
the solution. Wouldn't you say this is cheating? How is this learning function of
yours different? It is by no means an excuse to inflate your score, only an
explanation.

Cheers,
Tom

_______________________________________________________________________________
Tom Kerrigan kerr...@frii.com O-

A year spent in artificial intelligence is enough to make one believe
in God.

Chris Whittington

May 18, 1996

Ed, this makes no sense.
We know that the hash key is different for black and white to move,
same position.
We know that the hash key is different for the 'same' position, but with
black and white colours inverted. This is precisely what the Austrian/CSS
test revealed: namely that the original BT test positions appeared
to have been hashed, with the 'correct' move stored and the 'correct'
move given a 0.15 pawn bonus within the search tree.
With colours inverted there was no hash match, and the 'correct' move
therefore didn't get the bonus (and anyway the 'correct' move would
need inverting also, Nf5-g7 would need to be inverted to Nf4-g2 etc.).
The Austrian detectives thought of this and tested it, that's why
they published the original article - the big question to explain
is why would Rebel exhibit this feature ?
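
To make the hash-key point concrete, here is a minimal sketch that assumes Zobrist-style hashing (one random 64-bit number per piece and square, XORed together). Nobody in this thread says which scheme Rebel or any other program uses, and the board layout, helper names and seed below are purely illustrative.

```python
import random

rng = random.Random(1996)        # fixed seed so the key table is reproducible
PIECES = "PNBRQKpnbrqk"          # upper case = White, lower case = Black
# One random 64-bit number per (piece, square), plus one for Black to move.
ZOBRIST = {(p, sq): rng.getrandbits(64) for p in PIECES for sq in range(64)}
BLACK_TO_MOVE = rng.getrandbits(64)

def square(name):
    """'a1' -> 0 ... 'h8' -> 63."""
    return (int(name[1]) - 1) * 8 + "abcdefgh".index(name[0])

def zobrist_key(board, white_to_move):
    """board: dict mapping a square index to a piece letter."""
    key = 0
    for sq, piece in board.items():
        key ^= ZOBRIST[(piece, sq)]
    if not white_to_move:
        key ^= BLACK_TO_MOVE
    return key

def invert_colours(board, white_to_move):
    """Flip the board top to bottom, swap piece colours and the side to move."""
    flipped = {sq ^ 56: piece.swapcase() for sq, piece in board.items()}
    return flipped, not white_to_move

position = {square("e1"): "K", square("e8"): "k", square("f5"): "N"}
mirrored, mirrored_white_to_move = invert_colours(position, True)

# Same pieces, other side to move: a different key.
print(zobrist_key(position, True) != zobrist_key(position, False))
# Colour-inverted position: again a different key, so no table hit.
print(zobrist_key(position, True) != zobrist_key(mirrored, mirrored_white_to_move))
# The knight from f5 now stands on f4, as in the Nf5-g7 / Nf4-g2 example.
print(mirrored[square("f4")])
```

A table keyed this way only recognises the exact position it was given, which is why a colour-inverted copy of a test position would miss any stored entry unless the program also hashed the mirrored form.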

There are two possibilities for hash 'learning'.
a) that employed, we assume, by Mchess5 - which is to hash game positions
together with the time used, the move found, some sort of kludge flag
that states whether the move was 'good' or led to a loss some moves
later. Then whenever the game is repeated (as games between chess programs
often do) Mchess5 has a mechanism for playing 'good' moves immediately,
and for either thinking longer on the 'bad' moves or for giving lines
arising from 'bad' moves some sort of fractional pawn penalty whenever
they appear in the search tree.

b) hashing test positions, so that whenever the program has to 'solve'
the test position again, the 'correct' move is given a little bonus
for all lines arising from it within the search tree.

I find (a) totally acceptable (although it came under some heavy attack
on this user group earlier in the year, personally I supported the approach).
i) it reduces the number of repeat games in computer-computer game series.
ii) it gives the program the intelligence to not make the same mistake again.
iii) when all programs have this feature (as they surely will) then
the advantage that Mchess5 currently has will disappear, and the Swedish
rating list will not be based on so many duplicate games.

The question of whether the program should ship with a pre-stored
hash table of game moves so that an a priori advantage is gained
over competing programs is perhaps contentious.

I find (b) totally unacceptable. It seems to be a deception without any
advantage in terms of chess playing strength.
Leaving aside the issue of what Rebel does (we don't know, because you
won't say; maybe Rebel works like (a), whether by accident or design,
and maybe it doesn't),
I suggest that we should all agree a code of conduct on hash learning:
namely, that we all agree not to hash 'setup' positions (positions
for which the program doesn't have a move history back to the standard
start position).

This code of conduct is a proposal to all programmers on r.g.c.c.
I'm not sure if a similar code ought or ought not to apply to shipping
product with type (a) game move hash tables pre-stored .......
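
As a rough sketch of what such a code of conduct could look like in code, the class below stores type (a) game learning but refuses anything flagged as a set-up position. Everything here is hypothetical: the class and method names, the integer keys, and the 15 centipawn adjustment (borrowed from the 0.15 pawn figure mentioned earlier purely as a placeholder). It is not a description of how Mchess5, Rebel or any other program works.

```python
class LearnTable:
    """Position learning of type (a): remember moves played in real games."""

    def __init__(self, bonus=15):
        self.bonus = bonus        # centipawns; placeholder value
        self.entries = {}         # position key -> (move, was_good)

    def record(self, key, move, was_good, has_game_history):
        # Proposed code of conduct: never learn from set-up positions, i.e.
        # positions with no move history back to the standard start position.
        if not has_game_history:
            return False
        self.entries[key] = (move, was_good)
        return True

    def adjustment(self, key, move):
        """Score adjustment for playing `move` in the position `key`."""
        entry = self.entries.get(key)
        if entry is None or entry[0] != move:
            return 0
        return self.bonus if entry[1] else -self.bonus

table = LearnTable()
table.record(101, "e2e4", True, has_game_history=True)    # learned from a game
table.record(202, "f5g7", True, has_game_history=False)   # set-up position: refused

print(table.adjustment(101, "e2e4"))  # 15
print(table.adjustment(202, "f5g7"))  # 0
```

The only policy decision is the has_game_history check; whether a shipped, pre-stored table of game moves should pass that check is exactly the open question raised above.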

As a side issue, with regard to hash learning and the Swedish rating
list:

My experience of autoplayer test games, and also a view given by
Goran Grottling, is that computer-computer games lead to many
games being repeated. It's difficult to say what the repeat rate
is, but it could be something like 2 out of a series of 10 games; and
with a series of 30 games or so, you will often have seen *all* the
discrete games that a pair of programs will play against each other.

This means that a program with hash learning, able to take avoidance
measures against bad game repeats and to play 'good' moves instantly,
gaining time, could maybe win 1 extra game in 10.
Such an improvement would be worth 50+ statistical ELO points, BUT
IT WOULDN'T REPRESENT ANY EXTRA CHESS PLAYING STRENGTH.
( I assume that hash learning doesn't make much difference with computer
human play - there are not so many game repeats; and humans will
already have strategies for not playing the same bad game twice.)
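
The "50+" estimate can be sanity-checked with the usual logistic Elo model. The sketch below assumes two otherwise evenly matched programs (a 50% score before learning); the function name is illustrative, and the Swedish list's exact rating method is not described in this thread.

```python
import math

def elo_difference(score):
    """Rating difference implied by an expected score (0 < score < 1)
    under the standard logistic Elo model."""
    return 400 * math.log10(score / (1 - score))

# One extra win in 10 games, starting from an even 5/10:
print(round(elo_difference(0.60)))  # 70  (a loss turned into a win)
print(round(elo_difference(0.55)))  # 35  (a draw turned into a win)
```

So one extra win per ten games is worth roughly 35 to 70 points depending on whether it replaces a draw or a loss, which brackets the 50+ figure.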

Extrapolating, if the 'top' programs all exhibit hash learning next year,
and the Swedes play them against each other BUT ALSO AGAINST the currently
rated programs - as they will have to to get a proper result, then
next year's hash-learners can show 50-100 statistical ELO points more
WITHOUT ANY MORE CHESS KNOWLEDGE.

This effect comes from the Swedish list being computer-computer and
not computer-human - and, as Grottling has pointed out, there is
little he can do about this - logistically it is too problematical.

My contention is that this disparity between the Swedish list
and reality as reflected by human ELO lists has probably been growing
for similar reasons for some time, and that it will continue to grow
in the future.

Several years ago the Swedish list was re-based (I think they
deducted 70 ELO's from all programs) on the grounds that humans
had learnt better to play against computers.

I'd argue that we have come to a time for another rebasing. The
big question is by how much ?

> Now I have a question .... Why Chris?
>
> 2) I did not like the *WAY* you introduced the subject.
> To me you only wanted to throw some mud on REBEL.
>
> I will explain myself concerning this...
>
> I am not angry that you have mentioned the BT subject, but I am
> angry about the *WAY* you brought it to people's attention.
>
> I see you as a very intelligent guy with smart ideas, you have
> proven yourself enough in the past and I respect you very much.
> But if you write: "a certain program (which shall be nameless)" a
> few times, you stimulate people's curiosity, since they would love
> to KNOW who that *mysterious* program could be.

I only wrote it once.

>
> After your initial posting people:
> a) Would have asked for the name of the program;
> b) Some Europeans would have written ...
> "Hey that's Rebel, it was in CSS and PC-Schach ..."
> What a bad impression that would make!

The only European who wrote 'it's Rebel' was yourself.

>
> Here is my criticism of you, Chris: a smart guy like you could
> easily have foreseen this, and I really wonder if you didn't.
> To me you planned it from the very start to throw some mud on
> Rebel.

I can absolutely assure you, Ed that I am not trying, nor did I set
out with a plan, to throw mud at Rebel.

There have been two recent cases of the use of the 'shall be nameless'
method. Both from Bob Hyatt. One, some time ago, referred to somebody
who allegedly cheated as an operator in some series of tournaments - I
think Bob said something like 'we all know who it is'.
Second from Bob in the ICD/PBM business about someone who was thinking
of pirating his program. In neither case did anyone follow up with 'who'
(although to be fair, in case 2, it didn't take much guessing).

In this issue, it would have taken a lot of guessing. You can tell
that there is almost no communication between what we do, and think
important, in Europe and the American programmers (and vice-versa).
We think the BT test is a big deal, in Germany it had almost
bible-like status.
These American guys had never seen the test (four of them put up
their results as soon as they knew it, one of them typed in a load
of figures into his spreadsheet, another put the EPD's into the FTP
site) - they didn't even know the test, let alone who may have been
referred to. The facts clearly show they were interested in the test
*itself*, not all this stuff we're doing.

Nobody, in the past two 'nameless' cases made any attempt to point
fingers.
i) why would I assume that it would happen in this case ?
ii) I have nothing at all to gain from an argument with you.
iii) personally I don't care who was being referred to by Bob, it isn't
important in the context.
iv) I'd prefer us not to be arguing at all - and for me, most
of these postings have been about defence from what you now accept
was an untrue counter-attack.

No, Ed, the reason you got dragged in was because you over-reacted.
Somehow, my posting was the red thermonuclear button on your desk, and
you pressed it.

It was absolutely not my intention, nor my prediction, that this would
happen - but since you responded by (untrue) counter-attack, I wasn't
left with much choice but to pursue it.

>
> I would have had *NO* problem with the whole subject if you had
> played it openly, describing the subject and asking me ...
> "Hey Ed, how about it?"....
>
> But doing it in the way you did has made me angry.
> You call it "controversial", I call it sneaky.

I'm sorry you got angry, I guess that's my fault.
I was not being sneaky - that's not my way.

>
> Chris, we are both *commercial* chess programmers, we both try to make
> a living by making the best possible chess software. Now this kind of
> discussion is *NOT* good for either of us.
> We should fight each other in tournaments on the chessboard (not with
> words) and have a beer after the game no matter the result.

Actually, I thought I'd try and persuade the ICCA to give me amateur
status :) After all, they do seem to give it out in strange cases.

>
> The other 2 points...
>
> >> AEGON tournament...
> >> Maybe you can't remember anymore, but we did speak to each other at the
> >> AEGON tournament both *this* year and also *last* year.
> >> Both years I was present there every evening.
> >> So something is wrong here.
>
> >We spoke this year, but not about the issue in question, it's an old
> >issue. Last AEGON (last year), I introduced myself to you, and said
> >'let's speak later' - but later never came.
>
> Correct, so you could have asked me straightaway!
>

No, you were in the middle of a game, and operators can't take part in
conversations because they would miss moves and lose clock time.

>

genius stuff closed so snip-snip

>
> - Ed Schroder -
>
>

First rule of holes: when you're in one, stop digging.

Best regards,

Chris Whittington

Steven Schwartz

May 18, 1996

Chris and Ed.
I respect both of you fellows far too much to have to read this thread
which incorporates some intelligent observations mixed in with some
(escalating) personal bickering.

If I might be permitted an analogy here... think of a child (me) who
is adversely affected by two parents who are bickering. You love them
both and are torn by the battle.

The above insight has come from a lowly retailer.
For the words of a higher authority, I turn to Bob Hyatt,
who posted on May 11 the following final response to the ICD/Your Move vs.
PBM "debate": "... any idiot should be capable of making a choice he/she
can live with after reading everything that's been posted. and re-posted.
and included. and re-included. ad nauseam."

I enjoy a good fight, and you gentlemen have had about 12 rounds. To
further quote Bob in his post, "It's much more fun to read about
computer chess issues and answers, not about computer *marketing* issues.."
Well, I have now sampled both within the last two weeks, and I must say
that I enjoy the earlier marketing "battles" much better. We all have
our own agenda.

I would MUCH prefer to see you gentlemen beat each other up on the
chessboard; then we all benefit from your wisdom!

There is an awfully long way to go (maybe never) before the perfect
chess program is realized, and if it can be, you two are among very few
in the world who might make that dream a reality. There is an even more
attainable target... Garry Kasparov. It would be a shame
if mundane distractions like this thread (at least the personal aspects
of it) distracted you from those goals.

Let's call it a stalemate.

Regards, Steve (ICD/Your Move Chess & Games)

Robert Hyatt

May 18, 1996

In article <83242700...@cpsoft.demon.co.uk>,
Chris Whittington <chr...@cpsoft.demon.co.uk> wrote:
-->
-->Ed, this makes no sense.
-->We know that the hash key is different for black and white to move,
-->same position.
-->We know that the hash key is different for the 'same' position, but with
-->black and white colours inverted. This is precisely what the Austrian/CSS
-->test revealed: namely that the original BT test positions appeared
-->to have been hashed, with the 'correct' move stored and the 'correct'
-->move given a 0.15 pawn bonus within the search tree.
-->With colours inverted there was no hash match, and the 'correct' move
-->therefore didn't get the bonus (and anyway the 'correct' move would
-->need inverting also, Nf5-g7 would need to be inverted to Nf4-g2 etc.).
-->The Austrian detectives thought of this and tested it, that's why
-->they published the original article - the big question to explain
-->is why would Rebel exhibit this feature ?
-->
-->
-->There are two possibilities for hash 'learning'.
-->a) that employed, we assume, by Mchess5 - which is to hash game positions
-->together with the time used, the move found, some sort of kludge flag
-->that states whether the move was 'good' or led to a loss some moves
-->later. Then whenever the game is repeated (as games between chess programs
-->often do) Mchess5 has a mechanism for playing 'good' moves immediately,
-->and for either thinking longer on the 'bad' moves or for giving lines
-->arising from 'bad' moves some sort of fractional pawn penalty whenever
-->they appear in the search tree.
-->
-->b) hashing test positions, so that whenever the program has to 'solve'
-->the test position again, the 'correct' move is given a little bonus
-->for all lines arising from it within the search tree.
-->
-->I find (a) totally acceptable (although it came under some heavy attack
-->on this user group earlier in the year, personally I supported the approach).
--> i) it reduces the number of repeat games in computer-computer game series.
--> ii) it gives the program the intelligence to not make the same mistake again.
--> iii) when all programs have this feature (as they surely will) then
-->the advantage that Mchess5 currently has will disappear, and the Swedish
-->rating list will not be based on so many duplicate games.
-->
-->The question of whether the program should ship with a pre-stored
-->hash table of game moves so that an a priori advantage is gained
-->over competing programs is perhaps contentious.
-->
-->I find (b) totally unacceptable. It seems to be a deception without any
-->advantage in terms of chess playing strength.
-->Leaving aside the issue of what Rebel does, we don't know, because you
-->won't say, maybe Rebel works like (a), whether by accident or design,
-->and maybe it doesn't;
-->I suggest that we should all agree a code of conduct on hash learning,
-->namely, that we all agree not to hash 'setup' positions (positions
-->which the program doesn't have a history move back to the standard
-->start position).
-->
-->This code of conduct is a proposal to all programmers on r.g.c.c.
-->I'm not sure if a similar code ought or ought not to apply to shipping
-->product with type (a) game move hash tables pre-stored .......
-->
--> <snip>
-->
-->This effect comes from the Swedish list being computer-computer and
-->not computer-human - and, as Grottling has pointed out, there is
-->little he can do about this - logistically it is too problematical.
-->
-->My contention is that this disparity between the Swedish list
-->and reality as reflected by human ELO lists has probably been growing
-->for similar reasons for some time, and that it will continue to grow
-->in the future.
-->
--> <snip>
-->
-->
-->There have been two recent cases of the use of the 'shall be nameless'
-->method. Both from Bob Hyatt. One, some time ago, referred to somebody
-->who allegedly cheated as an operator in some series of tournaments - I
-->think Bob said something like 'we all know who it is'.
-->Second from Bob in the ICD/PBM business about someone who was thinking
-->of pirating his program. In neither case did anyone follow up with 'who'
-->(although to be fair, in case 2, it didn't take much guessing).
-->

Good point. I've tried to be careful. However, many know about (a) from
participating in the WMCCC or ACM events. It's been oft-discussed.

(b) deserves no further comment. Should it happen, I'll be responsible
for taking whatever action is appropriate.

-->In this issue, it would have taken a lot of guessing. You can tell
-->that there is almost no communication between what we do, and think
-->important, in Europe and the American programmers (and vice-versa).
-->We think the BT test is a big deal; in Germany it had almost
-->bible-like status.
-->These American guys had never seen the test (four of them put up
-->their results as soon as they knew it, one of them typed a load
-->of figures into his spreadsheet, another put the EPD's into the FTP
-->site) - they didn't even know the test, let alone who may have been
-->referred to. The facts clearly show they were interested in the test
-->*itself*, not all this stuff we're doing.
-->
-->Nobody, in the past two 'nameless' cases made any attempt to point
-->fingers.
--> i) why would I assume that it would happen in this case ?
--> ii) I have nothing at all to gain from an argument with you.
--> iii) personally I don't care who was being referred to by Bob, it isn't
-->important in the context.
--> iv) I'd prefer us not to be arguing at all - and for me, most
-->of these postings have been about defence from what you now accept
-->was an untrue counter-attack.
-->
-->No, Ed, the reason you got dragged in was because you over-reacted.
-->Somehow, my posting was the red thermonuclear button on your desk, and
-->you pressed it.
-->
-->It was absolutely not my intention, nor my prediction, that this would
-->happen - but since you responded by (untrue) counter-attack, I wasn't
-->left with much choice but to pursue it.
-->
-->
-->I'm sorry you got angry, I guess that's my fault.
-->I was not being sneaky - that's not my way.
-->
-->
-->Actually, I thought I'd try and persuade the ICCA to give me amateur
-->status :) After all, they do seem to give it out in strange cases.

Indeed. :)

My only comment, not a "thrashing of Ed" either, is that there's a very
fine line about testing. For example, if I let crafty use its big opening
book, several of the CCR positions are solved correctly and instantly, which
would do wonders to inflate its rating, but would also be highly misleading
without an explanation.

I don't tune crafty to tests. As a result, I've gotten wildly varying
ratings from one test to another. Crafty does horribly on the CCR test,
reasonably well on the BT2630 test, and much better than I expected on the
old Bratko-Kopec test. Not by design on any of them, however.

It would make sense to be able to disable any sort of "learning" database
when running positions that test positional/tactical understanding, rather
than trying to test positional/tactical *experience*. There is a difference.

Bob

Chris Whittington

May 18, 1996, 3:00:00 AM

Ok, sorry. Peace Ed ?

Best regards,

Chris Whittington.

Any responses to my idea about the Swedish rating list problem
with hash-learners and a possible downwards re-rating of
all programs ......... ?

John Stanback

May 18, 1996, 3:00:00 AM

Here are the results for Zarkov4 on the BT test.
Hardware is a P5/100.

1 Nxg7 1
2 Bxb6 28
3 Re6 11
4 Qf7 900
5 Ka6 19
6 e3 5
7 Rd6 3
8 Rxc6+ 1
9 g5 0
10 Rxg7+ 23
11 Qxh2 18
12 Qe4 10
13*Be6 900
14 Rxh7 2
15*e5 120
16*Nxg2 900
17 Qxf4 2
18 d6 152
19 f3 102
20 Ra2 9
21*Re1 900
22 a3 1
23*g4 900
24 g6 235
25 Nd3 142
26 f5 900
27 e6 36
28*e5 648
29*O-O-O 135
30 f4 900

ELO 2363
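
The ELO figure can be checked against the times above. A minimal sketch, assuming the usual BT2630 rule of 2630 minus the mean solution time in seconds, with unsolved positions counted as 900. That rule is not spelled out in this post, but it does reproduce the posted number. The function name is made up for illustration.

```python
# Solution times in seconds for positions 1-30, copied from the table above.
ZARKOV4_TIMES = [
    1, 28, 11, 900, 19, 5, 3, 1, 0, 23,
    18, 10, 900, 2, 120, 900, 2, 152, 102, 9,
    900, 1, 900, 235, 142, 900, 36, 648, 135, 900,
]


def bt2630_rating(times, cap=900, base=2630):
    """Assumed rule: base rating minus the mean solution time in seconds."""
    if len(times) != 30:
        raise ValueError("the BT2630 suite has 30 positions")
    # An unsolved position is scored as the full 900 seconds.
    capped = [min(t, cap) for t in times]
    return base - sum(capped) / len(capped)


print(round(bt2630_rating(ZARKOV4_TIMES)))
```

The times sum to 8003 seconds, so the mean is about 266.8 and the result rounds to 2363.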

John Stanback

Tom Kerrigan

May 18, 1996, 3:00:00 AM

I have absolutely nothing against learning programs. It seems to me that a
computer will never hold the world champion title without being able to learn.
However, I do have two arguments against using such a feature to help with test
suites.

The sort of learning features we've been talking about are hash files. Stobor has
a hash file, but I prefer to call it the opening book. I can put all of the BT
positions in Stobor's book and get a program that scores 2630 on this silly test
without actually playing any better. This would be great and everybody should do
it, only then the test is rendered worthless because it doesn't measure anything.
A question begs asking here: why only use REBEL's hash file to help with a few BT
positions? Why not all of them? REBEL would be the only program ever to score 2630
on the test. That would be excellent marketing. Or would that look just a bit too
suspicious?

The second argument is that REBEL learned 4/30 problems on the test suite (I think
it was 4, but the exact number doesn't really matter). What are the chances that
in an actual game REBEL will have learned every 7th position, and these positions
happen to have brilliant, game-winning tactical solutions? If this were the case,
Kasparov and Deep Blue would be toast, with REBEL as the world champion. Clearly
this isn't the case, which means there's an experimental error here, and it's
being caused by the hash file.

I think it's clear to anybody reading this thread that some sort of cheating has
taken place and the only reason the discussion continues is because Chris needs to
hear Ed say, "I cheated." From my point of view this seems rather childish on
Chris's side, but it's equally childish for Ed to avoid hard questions for totally
unrelated reasons. Perhaps it's time for us to grow up, at least in the interest
of network bandwidth. :)

Cheers,
Tom

_______________________________________________________________________________
Tom Kerrigan kerr...@frii.com O-

Families, when a child is born
Want it to be intelligent.
I, through intelligence,
Having wrecked my whole life,
Only hope the baby will prove
Ignorant and stupid.
Then he will crown a tranquil life
By becoming a Cabinet Minister
-- Su Tung-p'o

Ed Schroder

May 18, 1996, 3:00:00 AM

to kerr...@frii.com

>Tom Kerrigan wrote

>Second, I'd like to address this to Ed, but I'm posting because Chris
>would probably like to hear it. Ed: You don't seem to be responding to
>my posts, and I thought that it was because our newsgroups are out of
>synch, but you are avoiding the question when Chris asks it too.
>I think all Chris is doing is pointing out that the REBEL scores are
>inflated, and you seem to deny this by saying that your learning function
>is helping solve the problems. This is still cheating!
>For example, I don't clear my hash tables, so I can search a BT position
>for a week, and then say "I'm going to time this BT position" and get
>to 20 ply instantly with the solution. Wouldn't you say this is cheating?
>How is this learning function of yours different?
>It is by no means an excuse to inflate your score, only an explanation.

Tom,

I did not receive any posting from you; however, I have answered one
of your postings here in the newsgroup. Maybe you should also send your
questions to my email address so I can't miss them.

Yes, it's true I am avoiding Chris's questions.
If you have followed the argument I have with Chris then you will notice
that I have given 2 reasons for not responding to his questions.

If you state that a "learning feature" is cheating then most of today's
commercial chessprograms are cheating since most of them have a learning
function. Remember that people are asking for it, they expect it in a
chessprogram.

But you can of course argue about it.
I guess you will not be the only one who finds a learning function
cheating.

Personally I have no objections against learning functions.
I think it's smart.

General (funny) remark...
For years chessprogrammers got complaints that the programs still made the
same mistakes. Now most programmers have done something about it, but
still not everybody seems to be satisfied.

- Ed Schroder -

Bruce Moreland

May 19, 1996, 3:00:00 AM

In article <4nlh57$h...@europa.frii.com>, kerr...@frii.com says...
>[snip]

>
>I think it's clear to anybody reading this thread that some sort of cheating has
>taken place and the only reason the discussion continues is because Chris needs to
>hear Ed say, "I cheated." From my point of view this seems rather childish on
>Chris's side, but it's equally childish for Ed to avoid hard questions for totally
>unrelated reasons. Perhaps it's time for us to grow up, at least in the interest
>of network bandwidth. :)

>[snip]

If anybody has the "time to solve" figures for the positions as posted,
and for the reflected positions, I would be interested in seeing them.
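
One way to produce a "reflected" position, assuming that means swapping the colours and flipping the board top to bottom, is to transform the FEN string mechanically. A stored hash key for the original position would then no longer match, while the chess content stays the same. A minimal sketch follows; the name flip_fen is made up for illustration, and the solution move would need the same transformation before times are compared.

```python
def flip_fen(fen: str) -> str:
    """Return the colour-flipped mirror image of a six-field FEN string."""
    placement, side, castling, ep, halfmove, fullmove = fen.split()
    # Reverse the rank order and swap piece colours (upper <-> lower case).
    placement = "/".join(
        rank.swapcase() for rank in reversed(placement.split("/"))
    )
    side = "b" if side == "w" else "w"
    if castling != "-":
        swapped = castling.swapcase()
        # Keep the conventional KQkq ordering of the castling flags.
        castling = "".join(c for c in "KQkq" if c in swapped)
    if ep != "-":
        # Rank 3 becomes rank 6 and vice versa; the file is unchanged.
        ep = ep[0] + str(9 - int(ep[1]))
    return " ".join([placement, side, castling, ep, halfmove, fullmove])


# Lapuce position 002 from the listing at the end of this thread.
original = "r2qkb1r/3n1p1p/p4p2/3pp3/Q1BNb3/7P/PP3PP1/R1B1R1K1 w kq - 0 1"
print(flip_fen(original))
```

Flipping twice gives back the original string, which is a cheap sanity check on the transformation.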

bruce

Ed Schroder

May 19, 1996, 3:00:00 AM

to chr...@cpsoft.demon.co.uk

Hi Chris,

Of course I accept your peace offer!
Let's forget it ever happened.

I will start working now on answering your BT questions.

Best regards,

- Ed Schroder -

Ed Schroder

May 19, 1996, 3:00:00 AM

Since Chris and I have made peace, I would like to explain in detail what
is (was) going on with the BT-test and Rebel. I have nothing to hide, only
to admit an error that was made in the past.

As explained before, the system involved was a first attempt to write
a learning algorithm for Rebel. The system will be operational next
year, at least we hope.

The system is *position* based using hashkeys.
Complete games are simply divided into positions too.
And if people don't like the system they simply turn it off.

For testing purposes a small table with a mixture of hashkeys was created
in the program using the #define statement to make the programming and
testing part easier. It's a common approach for programmers, at least
for me.

Now here is the error: at release time, probably due to all of the time
pressure, it was simply forgotten to turn the #define OFF, leaving the
test table and coding *active* in the final version of Rebel 6.0.
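
To make the mechanism concrete, here is a minimal sketch of a position-keyed table guarded by a switch. It is not Rebel's code: the flag USE_TEST_TABLE stands in for the #define, the blake2b digest stands in for whatever hashkey the program really computes, and the single entry is Lapuce position 001 with its listed solution, chosen only as an example.

```python
import hashlib

# Hypothetical switch standing in for the #define described above.
USE_TEST_TABLE = True


def position_key(fen: str) -> int:
    """64-bit key built from the first four FEN fields (an assumed scheme)."""
    fields = " ".join(fen.split()[:4])
    digest = hashlib.blake2b(fields.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


# Illustrative entry only: Lapuce position 001 and its listed solution.
LAPUCE_001 = "2r2rk1/1p1bq3/p3p2p/3pPpp1/1P1Q4/P7/2P2PPP/2R1RBK1 b - - 0 1"
TEST_TABLE = {position_key(LAPUCE_001): "Bb5"}


def choose_move(fen, search, use_table=USE_TEST_TABLE):
    """Return a stored move if the position is known, else run the search."""
    if use_table:
        stored = TEST_TABLE.get(position_key(fen))
        if stored is not None:
            return stored
    return search(fen)
```

With the switch left on, a known position is answered instantly without any search, which is exactly the effect that inflates a timed test score.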

And yes, this test table contains several positions of the BT-test.
I have looked at the test table and here are the results:
BT-2400 : 6 positions (4,5,8,9,27,28)
BT-2450 : 4 positions (5,8,9,28)
BT-2630 : 3 positions (5,8,9)

Since Rebel knows these positions it has an advantage and gained some
ELO points on the BT-test. How much? I really don't know, I never
tested it.

I never reacted to the articles, for the following reasons:

1) The PC-SCHACH magazine was a 3-monthly magazine. It makes little sense
to react 3 months later, since by then you have already lost the
discussion on time (but you can argue about that of course).

2) Before the Austrians of PC-SCHACH started their article, I received
a letter from them saying: "Ed, we have enough evidence that you have
tricked the BT-TEST. Now tell us or we will publish it."
I do not respond to such letters.

3) I did not want to give away my "learning system" idea by making it public.

4) To make it clear I would have had to explain all about hashkeys, reversed
positions etc. etc. and probably the whole system involved to be
credible (convincing).

Looking at these 4 points *together* I still think I made the
right decision at that time, meaning not responding (points 2 and 3
are (were) my *main* reasons).

Now the worst thing that can happen to me is that some people think I
have cheated on 3-4 positions of the BT-TEST. I guess I have to live
with that.

The Internet...
The r.g.c.c. is another story, at least here you have a chance to
defend yourself. When the initial posting of Chris appeared I was
at least able to react immediately although I was unhappy with the way
Chris introduced the subject, but that is solved now.

My view on these tests...
Now that I have told all there is to know about the BT/REBEL subject
I would like to return to the initial intention of this newsgroup and give
my view on the ever increasing number of testsets. It is good that such
tests exist: it means computerchess is alive, and sometimes I receive
mail from people saying that REBEL doesn't perform well on a certain
position. Because of these mails I have certainly made some improvements
to Rebel. Thanks!

So far so good, at least through my eyes...
But I totally disagree with tests of only 15-30 positions which
also claim to give an ELO rating. My *main* objections are:
a) too FEW positions
b) too many tactical positions
c) too few positional positions

Today's best test that I know of is the LCT-II test.
At least in this test the author has tried to make a good selection.
There are 3 chapters:
1) the tactical part (combinations)
2) positional play
3) endgame
I am also satisfied with the choices the author of LCT-II made although
I can't explain why; they simply seem more suitable for testing a
chessprogram than the other tests I have seen, but that is just a
feeling of mine.

My only criticism of this test is that the number of positions (35) does
not justify giving any chessprogram a rating, since 35 is too few.
I would suggest that the author:
a) include many more positions (at least a factor of 3) (excluding luck!)
b) give *different* (elo) points for each position (difficulty level)
c) introduce more specific chapters (positional pawn sacrifice, bad
bishop, king safety, mobility, closed positions etc. etc.)

This is not an easy job.
It also means that the *time* rules for such a test must be changed.
I guess if you have a test of 150 positions there must be a much
shorter time control.

To improve REBEL I use a lot of the positions of "Beat the Masters".
An advantage of "Beat the Masters" (BTM) is that they give for each
position a decreasing number of points for *several* moves.

A BTM example (let's say from the starting position) looks as follows:
e4=10 d4=10 c4=8 Nf3=8 .... c3=1
The points are given by good chess players.

If I test a new version I run about 700 positions and sometimes my whole
test database of 1250 positions and simply check that the sum of points is
better. If it looks promising the version involved is tested with the
Autoplayer software.
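
The points bookkeeping described above could look roughly like this. It is only a sketch: the line format is taken from the example above, the function names are made up, and entries hidden behind the "...." in the example are simply not known to the parser.

```python
def parse_btm(line):
    """Turn 'e4=10 d4=10 ...' into a dict mapping move -> points."""
    points = {}
    for token in line.split():
        if "=" not in token:
            continue  # skip elisions such as "...."
        move, value = token.split("=")
        points[move] = int(value)
    return points


def btm_total(results):
    """Sum the points over (points_line, move_played) pairs.

    A move that is not listed for its position scores 0.
    """
    return sum(parse_btm(line).get(move, 0) for line, move in results)


example = "e4=10 d4=10 c4=8 Nf3=8 .... c3=1"
print(btm_total([(example, "c4"), (example, "h4")]))
```

Comparing two program versions then comes down to comparing two totals over the same set of positions.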

- Ed Schroder -

Martin Borriss

May 20, 1996, 3:00:00 AM

In article <4nigc4$m...@europa.frii.com>, kerr...@frii.com (Tom Kerrigan) writes:
> First, I found a quasi-bug in my program by examining a few of the positions where
> I was screwing up. A few people have told me that they return a draw score if the
> position happens twice, and then they told me that every commercial program does
> this. It sounded like "common knowledge" so I coded it, and it didn't hurt a quick
> WAC run, so I decided to keep it. What I wasn't told is that if a position occurs
> once in the *search* then a draw score is returned.

Could you explain what you mean by the very last sentence?

--
Martin....@inf.tu-dresden.de

Tom Kerrigan

May 20, 1996, 3:00:00 AM

>> First, I found a quasi-bug in my program by examining a few of the positions where
>> I was screwing up. A few people have told me that they return a draw score if the
>> position happens twice, and then they told me that every commercial program does
>> this. It sounded like "common knowledge" so I coded it, and it didn't hurt a quick
>> WAC run, so I decided to keep it. What I wasn't told is that if a position occurs
>> once in the *search* then a draw score is returned.
>
>Could you explain what you mean by the very last sentence?

The most obvious way to score a draw by repetition is to keep a list of hash keys
in the actual game *and* the search path, and when search() is called check to see
if the current hash key is in this list. If so, return the draw score.

Right now a lot of programs are keeping a list like this, but they don't keep the
keys from the actual game, just the search path. With this method, it is possible
to return a draw score once the current position has been seen twice (once in the
actual game and once in the search path).
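
As a rough illustration of the list-of-keys idea, here is a toy negamax that returns the draw score whenever the current key is already in the list. Everything in it is hypothetical: positions are plain letters rather than hash keys, and the two-position "game" exists only to show the effect.

```python
DRAW_SCORE = 0


def search(key, depth, seen, successors, evaluate):
    """Negamax over position keys.

    'seen' holds the keys of earlier game positions plus the current
    search path; it is restored to its original state on return.
    """
    if key in seen:
        return DRAW_SCORE          # position already occurred: call it a draw
    children = successors(key)
    if depth == 0 or not children:
        return evaluate(key)       # static score from the side to move's view
    seen.append(key)               # entering this node: add it to the path
    best = -10**9
    for child in children:
        best = max(best, -search(child, depth - 1, seen, successors, evaluate))
    seen.pop()                     # leaving this node: remove it again
    return best


# Toy game: A and B lead only to each other; the side to move at A is worse.
graph = {"A": ["B"], "B": ["A"]}
scores = {"A": -50, "B": 50}
history = []
result = search("A", 4, history, graph.__getitem__, scores.__getitem__)
print(result)
```

Without the repetition check the 4-ply search would walk A, B, A, B, A and report the static score of A; with it, the first return to A is scored as a draw.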

This was explained to me by Bob Hyatt as returning a draw score once a position
has been seen twice, which is exactly how I coded it. In retrospect, it doesn't
take a genius to see that this is downright stupid, and I've fixed the problem.

Another explanation of Stobor's poor showing on this test is that I've been
reworking the evaluation function and I took out my endgame knowledge to make
things more legible. Once I get everything up and running I will send Bruce new
and better Stobor test scores. :)

Cheers,
Tom

_______________________________________________________________________________
Tom Kerrigan kerr...@frii.com O-

You've been leading a dog's life. Stay off the furniture.

Robert Hyatt

May 20, 1996, 3:00:00 AM

In article <4npd7c$h...@irz301.inf.tu-dresden.de>,
Martin Borriss <mb...@irz.inf.tu-dresden.de> wrote:
-->
-->In article <4nigc4$m...@europa.frii.com>, kerr...@frii.com (Tom Kerrigan) writes:
-->> First, I found a quasi-bug in my program by examining a few of the positions where
-->> I was screwing up. A few people have told me that they return a draw score if the
-->> position happens twice, and then they told me that every commercial program does
-->> this. It sounded like "common knowledge" so I coded it, and it didn't hurt a quick
-->> WAC run, so I decided to keep it. What I wasn't told is that if a position occurs
-->> once in the *search* then a draw score is returned.
-->
-->Could you explain what you mean by the very last sentence?
-->
-->

The way many programs detect draws is as follows: the second time a position occurs,
*period*, the position is considered a repetition.

The more modern approach is to say that if a position is repeated once in the search,
then the same position has occurred twice in the search and can likely be forced a
third time so it's a draw. If the position occurs once in the game history, and
once in the search, it's not a repetition, because the opponent (or you) might
escape. If the position occurs twice in the real game history, then the first
occurrence in the search is a true repetition (3rd) and is also a draw...
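
The two rules can be put side by side as predicates. This is a sketch under stated assumptions, not code from any of the programs discussed: game_keys is taken to hold the positions played before the root, path_keys the positions from the root down to the parent of the current node, and both function names are made up.

```python
def is_draw_simple(key, game_keys, path_keys):
    """Older rule: the second occurrence anywhere counts as a repetition."""
    return key in game_keys or key in path_keys


def is_draw_modern(key, game_keys, path_keys):
    """Newer rule as described above.

    Draw if the position repeats inside the search, or if it already
    occurred twice in the real game (so this is the third occurrence).
    One earlier occurrence in the game alone is not enough.
    """
    return key in path_keys or game_keys.count(key) >= 2


# Once in the game, first time in the search: the rules disagree.
print(is_draw_simple(7, [7], [1, 2]), is_draw_modern(7, [7], [1, 2]))
```

The only case where the two predicates differ is the one in the last line: a single earlier occurrence in the game history and none on the current search path.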

Mark Mittelstaedt

May 20, 1996, 3:00:00 AM

-> If you state that a "learning feature" is cheating then most of
-> today's commercial chessprograms are cheating since most of them have
-> a learning function. Remember that people are asking for it, they
-> expect it in a chessprogram.
->
-> But you can of course argue about it.
-> I guess you will not be the only one who finds a learning function
-> cheating.
->
-> Personally I have no objections against learning functions.
-> I think it's smart.

Regardless of what the situation is one way or the other, I think you
people should get off of Ed's back! I think it is really great to have
one of the Professional Programmers here on rec.games.computer.chess not
to mention the programmer with a program in the top two. Some of my
personal advice to all the programmers out there is that making a living
in the chess software area is extremely difficult so if you don't
feel like answering a question just say sorry I don't want to reply to
that. I think if nobody else respects that I at least will.

Mark Mittelstaedt

_ _ -------------------------------------------------------------
|_|_| PC-OHIO PCBoard Online * pcohio.com * V34+ 33.6: 216-381-3320
|_|_| The Best BBS in America * Cleveland, OH * Go Tribe
-------------------------------------------------------------

Mark Mittelstaedt

May 20, 1996, 3:00:00 AM

-> both and are torn by the battle.
->
-> The above insight has come from a lowly retailer.
-> For the words of a higher authority, I turn to Bob Hyatt
-> who posted on May 11 the following final response to the ICD/Your
-> Move vs.
-> PBM "debate": "... any idiot should be capable of making a choice
-> he/she
-> can live with after reading everything that's been posted. and
-> re-posted.
-> and included. and re-included. ad nauseam."
->
-> I enjoy a good fight, and you gentlemen have had about 12 rounds. To
-> further quote Bob in his post, "It's much more fun to read about
-> computer chess issues and answers, not about computer *marketing*
-> issues.. "
-> Well, I have now sampled both within the last two weeks, and I must
-> say that I enjoy the earlier marketing "battles" much better. We all
-> have our own agenda.
->
-> I would MUCH prefer to see you gentlemen beat each other up on the
-> chessboard; then we all benefit from your wisdom!
->
-> There is an awfully long way to go (maybe never) before the perfect
-> chess program is realized, and if it can be, you two are among very
-> few in the world who might make that dream a reality. There is an
-> even more attainable target... Garry Kasparov. It would be a shame
-> if mundane distractions like this thread (at least the personal
-> aspects of it) distracted you from those goals.

Well said! And I agree 100%

Ed Schroder

May 21, 1996, 3:00:00 AM

to mark.mit...@pcohio.com

mark.mit...@pcohio.com (Mark Mittelstaedt) wrote:
>-> If you state that a "learning feature" is cheating then most of
>-> today's commercial chessprograms are cheating since most of them have
>-> a learning function. Remember that people are asking for it, they
>-> expect it in a chessprogram.
>->
>-> But you can of course argue about it.
>-> I guess you will not be the only one who finds a learning function
>-> cheating.
>->
>-> Personally I have no objections against learning functions.
>-> I think it's smart.
>
>Regardless of what the situation is one way or the other, I think you
>people should get off of Ed's back! I think it is really great to have
>one of the Professional Programmers here on rec.games.computer.chess not
>to mention the programmer with a program in the top two. Some of my
>personal advice to all the programmers out there is that making a living
>in the chess software area is extremely difficult so if you don't
>feel like answering a question just say sorry I don't want to reply to
>that. I think if nobody else respects that I at least will.
>
>
>Mark Mittelstaedt

Thank you Mark.

It was indeed no fun at all and I am glad things are solved now.

- Ed Schroder -

Martin Borriss

May 21, 1996, 3:00:00 AM

Thanks to Bob and Tom for their explanations. If I understand correctly, I use
the 'old' approach, i.e., every time I see a position for the second time I score
this as a draw. I do not distinguish (hash entry wise) between positions from the
search and positions which are in the game history. The only difference is that
game tree positions are forced into the hash table, overwriting old entries in
any case.

The problem Tom sees and Bob mentions is if a position occurs once in the game
history and once in the search. Then, so you say, scoring a draw is not right
since further repetition might be avoided.
I still don't exactly see why this is a major problem for Tom's test score. In what
cases does a program deviate from a position it reached once in the history and
once in the current search?

E.g., 1.e4 e5 2.Nf3 Nc6 3.Bb5 [this is the game history entry] Nd4
now the search tries 4.Bf1 and scores 4...Nc6 5.Bb5 as a draw. No
problem here, since there are better moves than 4.Bf1.

It's not a big deal to use the modern approach, but I'd prefer to know why ;)
Another interesting topic, since one of the many embarrassing features of
Gullydeckel is to score the first and the 27th repetition (in the game
history) as draw.
I've seen the program losing on time because of the following effect:
1. Human endlessly repeats moves, does not claim draw. Neither does program.
2. Program accumulates information in hash tables.

Now two cases:
3a) Program finds a good way to play on and deviates from repetition 27 ;)
Program does not lose on time, human is confused and surprised and
will probably lose.
3b) Program is searching deeper and deeper (due to hashing), making it
occasionally lose on time (dubious time management).

Martin

--
Martin....@inf.tu-dresden.de

Tom Kerrigan

May 21, 1996, 3:00:00 AM

Sorry, you misread my post. Hyatt explained this "once in actual game, once in
search path" as returning a draw when the position is seen twice. He didn't say
anything about where the position was seen twice, so all I did was put a counter
in my repetition-finder so it would return TRUE if it found the position in the
actual game and search path twice. The reason I say this is stupid is because once
in a while you find a position where the solution is a draw by repetition (BT2630
#5, I think) but search() doesn't return a draw score until 2 ply after it should,
so finding the solution takes 8 times longer than it should, which was my case in
a few of the BT2630 problems.

Hope this clears things up.

BTW, sorry about the long sig. I think you might find it funny anyway. :)

Cheers,
Tom

_______________________________________________________________________________
Tom Kerrigan kerr...@frii.com O-

One promising concept that I came up with right away was that you could
manufacture personal air bags, then get a law passed requiring that
they be installed on congressmen to keep them from taking trips. Let's
say your congressman was trying to travel to Paris to do a fact-finding
study on how the French government handles diseases transmitted by
sherbet. Just when he got to the plane, his mandatory air bag,
strapped around his waist, would inflate -- FWWAAAAAAPPPP -- thus
rendering him too large to fit through the plane door. It could also
be rigged to inflate whenever the congressman proposed a law. ("Mr.
Speaker, people ask me, why should October be designated as Cuticle
Inspection Month? And I answer that FWWAAAAAAPPPP.") This would save
millions of dollars, so I have no doubt that the public would violently
support a law requiring airbags on congressmen. The problem is that
your potential market is very small: there are only around 500 members
of Congress, and some of them, such as House Speaker "Tip" O'Neil, are
already too large to fit on normal aircraft.
-- Dave Barry, "'Mister Mediocre' Restaurants"

Robert Hyatt

May 21, 1996, 3:00:00 AM

In article <4nsshh$7...@europa.frii.com>,
Tom Kerrigan <kerr...@frii.com> wrote:
-->Sorry, you misread my post. Hyatt explained this "once in actual game, once in
-->search path" as returning a draw when the position is seen twice. He didn't say
-->anything about where the position was seen twice, so all I did was put a counter
-->in my repetition-finder so it would return TRUE if it found the position in the
-->actual game and search path twice. The reason I say this is stupid is because once
-->in a while you find a position where the solution is a draw by repetition (BT2630
-->#5, I think) but search() doesn't return a draw score until 2 ply after it should,
-->so finding the solution takes 8 times longer than it should, which was my case in
-->a few of the BT2630 problems.
-->
-->Hope this clears things up.
-->

Jeez, now *I'm* lost. :) in the BT suite there's *no* game
history. In crafty or any other program it would seem to make
no difference if you blew the test because the 2nd repeat is a
repetition without a game history to work with?

If I misunderstood you, try again. If you misunderstood me,
restate your premise. Perhaps this is just one giant mis-
understanding? :)

BTW, Crafty counts twice as a draw, whether it's twice in search or once in
history and once in search. I tried it the other way but didn't like it as
well as this as it would often take longer to see a repetition when you
require two reps in search or three overall. I now treat two as a rep
regardless, just like I did in Cray Blitz. I have a switch in CB to flip
it to either approach, but always seemed to like the simple one better...

Jouni Uski

May 22, 1996, 3:00:00 AM

Sorry Ed!
My mail server is very slow and I got Your explanation too late.
Thanks for giving THE TRUTH!
Jouni

Martin Borriss

May 23, 1996, 3:00:00 AM

Can I interpret it as follows:
You detect a repetition if it was found in the game history *and* *twice* in the
search? But you want to detect it if it was twice altogether?

Linking this to the original topic:
Two positions in BT2630, #7 and #12, contain repetitions as the solution (according to
the Gullydeckel prototype ;) ):

Funny PV's follow (for #12 at 6 plies the eval seems questionable to me :))

#7:
(6) c6-d6 d4xd6 e7-e3 g1-h1 e3-f3 h1-g1 [0]

#12:
(6) e6-e4 b3xb7 f7-f8 b7-b8 f8-f7 e2-h2 g3-g1 h1xg1 e4-b1 e1-e2 b1xg1 h2xh3 [-53]
(7) e6-e4 b3xb7 f7-f8 b7-b8 f8-f7 e2-h2 e4-b1 e1-e2 b1-c2 e2-e1 [0]
(8) e6-e4 b3xb7 f7-f8 b7-b8 f8-f7 b8-b7 [0]

Does this mean that Stobor found those repetitions later (before you removed
your bug) ?

I'm still interested to know what difference it makes where the first
occurrence of the position comes from.

In article <4nsshh$7...@europa.frii.com>, kerr...@frii.com (Tom Kerrigan) writes:
> Sorry, you misread my post. Hyatt explained this "once in actual game, once in
> search path" as returning a draw when the position is seen twice. He didn't say
> anything about where the position was seen twice, so all I did was put a counter
> in my repetition-finder so it would return TRUE if it found the position in the
> actual game and search path twice. The reason I say this is stupid is because once
> in a while you find a position where the solution is a draw by repetition (BT2630
> #5, I think) but search() doesn't return a draw score until 2 ply after it should,
> so finding the solution takes 8 times longer than it should, which was my case in
> a few of the BT2630 problems.
>
> Hope this clears things up.
>

I seem to be a little slow recently :(
8 times longer for 2 plies is a nice branching factor.

Martin

--
Martin....@inf.tu-dresden.de

Tom Kerrigan

May 23, 1996, 3:00:00 AM

What you *should* do (and what I'm doing now) is return a draw score when you see
any repetition, either in the game or the search. This is because the program
should decide on the same move it did in the past, so it will continue to repeat,
and it will eventually be a draw. It can be argued that if you don't do this, then
the hash table gets filled and the program may deviate from the repetition because
it found a better move. I think this is a valid point, but it seems a bit unlikely
to me. I prefer my method now because a move at the root can automatically get a
draw score, and because that is usually the best move everything else will usually
fail low. What happens to my program in these cases is that it finishes, say, 14
ply in complicated middlegames, so I have no complaints there.

>I seem to be a little slow recently :(
>8 times longer for 2 plies is a nice branching factor.

Oh, I just guessed at that. I have no idea what the actual number is. :)

Cheers,
Tom

_______________________________________________________________________________
Tom Kerrigan kerr...@frii.com O-

From the Pro 350 Pocket Service Guide, p. 49, Step 5 of the
instructions on removing an I/O board from the card cage, comes a new
experience in sound:

5. Turn the handle to the right 90 degrees. The pin-spreading
sound is normal for this type of connector.

Bruce Moreland

May 23, 1996, 3:00:00 AM

I enjoyed doing this suite and collecting everyone's results. I would like to do more
suites and collect more results.

1) If anyone has any BT2630 numbers, please send them to me at bru...@microsoft.com.
I will post updates as appropriate. I got your fax, Chris, and typed in results for some
of the other programs. I would like to get more, I have very few on >= a P5/133, and am
missing ANY pentium numbers for several of the commercial programs.

2) Another interesting suite is La Puce Echiqueenne, as it includes a positional,
tactical, and endgame section. I would like to get results for that suite as well. If
you do this suite, please send me the time at which your program got a solution and held
it until the end of the 210 second period. I would rather see "69" than "1:09", but
either format is fine. This suite takes 98 minutes to run.

3) Finally I would like to collect results for the CCR ("one hour") test. My own
program does horribly at this test and I would like to figure out why. If you do this
suite, please send me your scores for each position at each checkpoint. This suite takes
50 minutes to run.
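
For the Lapuce times in (2), the figure wanted is the moment the program last switched to a solution move and then kept it to the end of the period. A small sketch of how that could be pulled out of a log of best-move changes; the log format and both function names are assumptions, and moving between two accepted solution moves is treated here as still holding a solution.

```python
def parse_time(text):
    """Accept '69' or '1:09' and return whole seconds."""
    if ":" in text:
        minutes, seconds = text.split(":")
        return int(minutes) * 60 + int(seconds)
    return int(text)


def solve_time(changes, solutions, limit=210):
    """Return the time a solution was found and then held, or None.

    changes   -- chronological list of (seconds, best_move) pairs
    solutions -- set of accepted moves for the position
    limit     -- length of the test period in seconds
    """
    solved_at = None
    for seconds, move in changes:
        if seconds > limit:
            break                      # changes after the period do not count
        if move in solutions:
            if solved_at is None:
                solved_at = seconds    # start of the current solved stretch
        else:
            solved_at = None           # changed its mind: earlier time is void
    return solved_at


log = [(2, "a3"), (10, "Bb5"), (40, "a5"), (69, "Bb5")]
print(solve_time(log, {"Bb5"}))
```

As a cross-check on the running time quoted above, 28 positions at 210 seconds each come to 5880 seconds, which is 98 minutes.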

bruce

--------------

Appendix

I include both the La Puce and CCR tests in the outdated CI format, which should be easy
to translate into EPD. And yes, I know I shouldn't be using this anymore, Mr. Edwards. If
someone would like to post EPDs for these, it'd be cool.

I added a "miss" keyword to the CI format. It means: "Score the problem as correct if you do
NOT want to make a move that is in the 'solution' set." I use this keyword in the "CCR"
set.

Lapuce test

noop
noop LA PUCE ECHIQUEENNE computer chess test
noop
noop 28 positions:
noop position 1-9 positional test (POS)
noop position 10-19 tactical test (TAC)
noop position 20-28 endgame test (END)
noop
noop - The program must be set to infinite level.
noop - The selected time for each position is the time when the
noop program finds the best move and does not change after that.
noop - For each position: if the program finds the best move
noop between 0 - 5 sec. -> 16 points
noop 6 - 30 sec. -> 12 points
noop 31 - 90 sec. -> 8 points
noop 91 - 210 sec. -> 5 points
noop >210 sec. -> 0 points
noop
noop - ELO = POS total points * 2.4 +
noop TAC total points * 2.2 +
noop END total points * 1.4 + 1950
noop
noop
echo position 001
svfe 2r2rk1/1p1bq3/p3p2p/3pPpp1/1P1Q4/P7/2P2PPP/2R1RBK1 b - - 0 1
srch Bb5
echo position 002
svfe r2qkb1r/3n1p1p/p4p2/3pp3/Q1BNb3/7P/PP3PP1/R1B1R1K1 w kq - 0 1
srch Rxe4
echo position 003
svfe 1r1r2k1/2b1qp1p/b1p3p1/p1p1p3/2P1P3/1PN1BP2/P1Q3PP/R2R2K1 b - - 0 1
srch Rd4
echo position 004
svfe 1nr5/2rbkppp/p3p3/Np6/2PRPP2/8/PKP1B1PP/3R4 b - - 0 1
srch e5
echo position 005
svfe r4rk1/1ppnq1p1/p1n1b2p/2P1pp2/8/P2P1NP1/1B2PPBP/2RQ1RK1 w - - 0 1
srch Nh4
echo position 006
svfe 1r1r2k1/p4pp1/2bpp2p/2p5/2P5/1P1BP3/PR3PPP/3R2K1 b - - 0 1
srch a5 Kf8
echo position 007
svfe r1r3k1/1b2qnpp/p2b1p2/Pp1p4/3P1P1P/1Q2P1P1/1B1NBK2/R2R4 b - - 0 1
srch Rc4
echo position 008
svfe 2rq1rk1/pb2bpp1/1p1p1n1p/4pP2/4P3/1BNQ3R/PPP3PP/R1B4K b - - 0 1
srch Rxc3
echo position 009
svfe r2qrnk1/pp3ppb/3b1n1p/1Pp1p3/2P1P2N/P5P1/1B1NQPBP/R4RK1 w - - 0 1
srch Bh3
echo position 010
svfe 1rb1r1k1/p1q3pp/1p2ppn1/6NB/Pb1B4/8/1PP1Q1PP/3R1RK1 w - - 0 1
srch Nxh7
echo position 011
svfe 6k1/5p2/3P2p1/7n/3QPP2/7q/r2N3P/6RK b - - 0 1
srch Rxd2
echo position 012
svfe 2b1r1k1/r4ppp/p7/2pNP3/4Q3/q6P/2P2PP1/3RR1K1 w - - 0 1
srch Nf6+
echo position 013
svfe 2r2r2/3qbpkp/p3n1p1/2ppP3/6Q1/1P1B3R/PBP3PP/5R1K w - - 0 1
srch Rxh7+
echo position 014
svfe rn2kb1r/p3qppp/2p2n2/1p2p1B1/2B1P3/1QN5/PPP2PPP/R3K2R w - - 0 1
srch Nxb5
echo position 015
svfe rrq1k3/2p2p2/3pb1p1/3N2Pp/p1PpPR2/P2P3P/1P6/K2Q1R2 w - - 0 1
srch Rxf7
echo position 016
svfe r2qr1k1/pb1nb1pp/1p2pn2/2p1Np2/2PP1B2/3B1N2/PP2QPPP/R4RK1 w - - 0 1
srch Nf7
echo position 017
svfe r2r2k1/1p1n1pb1/2qB2pp/p1P5/1p6/1Q6/P3BPPP/3RR1K1 w - - 0 1
srch Qxf7+
echo position 018
svfe rr6/2pqn1k1/3b1pb1/2pPp1p1/ppP1P1P1/1P2NQN1/P2B1PK1/R6R w - - 0 1
srch Rh7+
echo position 019
svfe 1r4k1/5ppp/rqp1p3/p5B1/PnRPN3/bP1R1Q2/5PPP/6K1 w - - 0 1
srch Nf6+
echo position 020
svfe 4k3/5p2/3K2p1/4P2p/7P/8/6P1/8 w - - 0 1
srch g3
echo position 021
svfe 8/4p3/8/3P3p/P2pK3/6P1/7b/3k4 w - - 0 1
srch d6
echo position 022
svfe 1R6/1R5p/3p2pk/3P1p2/5P2/6r1/r7/7K w - - 0 1
srch Rxh7+
echo position 023
svfe 8/5Bp1/4P3/6pP/1b1k1P2/5K2/8/8 w - - 0 1
srch Kg4
echo position 024
svfe 8/2b5/8/2Np4/P2K2k1/8/8/8 w - - 0 1
srch Nb7
echo position 025
svfe 8/7k/7P/7P/p1n1p3/2P5/2K5/2B5 b - - 0 1
srch Kg8 Kh8
echo position 026
svfe 6k1/4pp1p/3p2p1/P1pPb3/R7/1r2P1PP/3B1P2/6K1 w - - 0 1
srch Bb4
echo position 027
svfe 8/5p2/1p3k1n/6p1/PpP3P1/5B1P/6K1/8 w - - 0 1
srch c5
echo position 028
svfe 8/3R2pp/5k2/7P/4P3/8/3p2r1/1K6 b - - 0 1
srch Rg1+
rtrn
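
As a rough illustration of the scoring rules in the header comments of the La Puce script above, here is a minimal Python sketch. It is not part of the original suite: the function names (to_seconds, points, lapuce_elo) and the convention of passing None for a position that was never solved and held are assumptions made for this example.

```python
# Sketch of the La Puce scoring rules quoted in the test header above.
# Function names and the input layout are assumptions for illustration.

def to_seconds(t):
    """Accept either "69" or "1:09" and return whole seconds."""
    if ":" in t:
        minutes, seconds = t.split(":")
        return int(minutes) * 60 + int(seconds)
    return int(t)

def points(seconds):
    """Points for one position; None means the move was never found and held."""
    if seconds is None or seconds > 210:
        return 0
    if seconds <= 5:
        return 16
    if seconds <= 30:
        return 12
    if seconds <= 90:
        return 8
    return 5

def lapuce_elo(times):
    """times: 28 solve times in seconds (or None), in suite order."""
    if len(times) != 28:
        raise ValueError("La Puce has 28 positions")
    pos = sum(points(t) for t in times[0:9])    # positions 1-9
    tac = sum(points(t) for t in times[9:19])   # positions 10-19
    end = sum(points(t) for t in times[19:28])  # positions 20-28
    return pos * 2.4 + tac * 2.2 + end * 1.4 + 1950
```

With no position solved the formula gives 1950, and solving every position within 5 seconds gives the ceiling of about 2849.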

CCR test

noop Checkpoints at 15, 30, 60, 120 seconds
noop One point for being right at each checkpoint. If you are right at 15 seconds,
noop wrong at 30 seconds, right at 60 seconds, and wrong at 120 seconds, you get two
noop points; you don't have to hold your result until the end of the search.
noop
noop maximum score is 100 points
noop
echo position 1-01
svfe rn1qkb1r/pp2pppp/5n2/3p1b2/3P4/2N1P3/PP3PPP/R1BQKBNR w KQkq - 0 6
srch Qb3
echo position 1-02
svfe rn1qkb1r/pp2pppp/5n2/3p1b2/3P4/1QN1P3/PP3PPP/R1B1KBNR b KQkq - 1 6
srch Bc8
echo position 1-03
svfe r1bqk2r/ppp2ppp/2n5/4P3/2Bp2n1/5N1P/PP1N1PP1/R2Q1RK1 b kq - 0 10
srch Nh6
echo position 1-04
svfe r1bqrnk1/pp2bp1p/2p2np1/3p2B1/3P4/2NBPN2/PPQ2PPP/1R3RK1 w - - 0 12
srch b4
echo position 1-05
svfe rnbqkb1r/ppp1pppp/5n2/8/3PP3/2N5/PP3PPP/R1BQKBNR b KQkq - 2 5
srch e5
echo position 1-06
svfe rnbq1rk1/pppp1ppp/4pn2/8/1bPP4/P1N5/1PQ1PPPP/R1B1KBNR b KQ - 0 5
srch Bxc3+
echo position 1-07
svfe r4rk1/3nppbp/bq1p1np1/2pP4/8/2N2NPP/PP2PPB1/R1BQR1K1 b - - 0 12
srch Rfb8
echo position 1-08
svfe rn1qkb1r/pb1p1ppp/1p2pn2/2p5/2PP4/5NP1/PP2PPBP/RNBQK2R w KQkq - 0 6
srch d5
echo position 1-09
svfe r1bq1rk1/1pp2pbp/p1np1np1/3Pp3/2P1P3/2N1BP2/PP4PP/R1NQKB1R b KQ - 0 9
srch Nd4
echo position 1-10
svfe rnbqr1k1/1p3pbp/p2p1np1/2pP4/4P3/2N5/PP1NBPPP/R1BQ1RK1 w - - 0 11
srch a4
echo position 1-11
svfe rnbqkb1r/pppp1ppp/5n2/4p3/4PP2/2N5/PPPP2PP/R1BQKBNR b KQkq - 0 3
srch d5
echo position 1-12
svfe r1bqk1nr/pppnbppp/3p4/8/2BNP3/8/PPP2PPP/RNBQK2R w KQkq - 1 6
srch Bxf7+
echo position 1-13
svfe rnbq1b1r/ppp2kpp/3p1n2/8/3PP3/8/PPP2PPP/RNBQKB1R b KQ - 0 5
miss Nxe4
echo position 1-14
svfe rnbqkb1r/pppp1ppp/3n4/8/2BQ4/5N2/PPP2PPP/RNB2RK1 b kq - 2 6
miss Nxc4
echo position 1-15
svfe r2q1rk1/2p1bppp/p2p1n2/1p2P3/4P1b1/1nP1BN2/PP3PPP/RN1QR1K1 w - - 0 12
srch exf6
echo position 1-16
svfe r1bqkb1r/2pp1ppp/p1n5/1p2p3/3Pn3/1B3N2/PPP2PPP/RNBQ1RK1 b kq - 1 7
srch d5
echo position 1-17
svfe r2qkbnr/2p2pp1/p1pp4/4p2p/4P1b1/5N1P/PPPP1PP1/RNBQ1RK1 w kq - 0 8
miss hxg4
echo position 1-18
svfe r1bqkb1r/pp3ppp/2np1n2/4p1B1/3NP3/2N5/PPP2PPP/R2QKB1R w KQkq - 0 7
srch Bxf6
echo position 1-19
svfe rn1qk2r/1b2bppp/p2ppn2/1p6/3NP3/1BN5/PPP2PPP/R1BQR1K1 w kq - 4 10
srch Bxe6
echo position 1-20
svfe r1b1kb1r/1pqpnppp/p1n1p3/8/3NP3/2N1B3/PPP1BPPP/R2QK2R w KQkq - 2 8
srch Ndxb5
echo position 1-21
svfe r1bqnr2/pp1ppkbp/4N1p1/n3P3/8/2N1B3/PPP2PPP/R2QK2R b KQ - 1 11
miss Kxe6
echo position 1-22
svfe r3kb1r/pp1b1ppp/1q2pn2/n2p4/3P1B2/2PB1N2/PPQ2PPP/RN2K2R w KQkq - 2 11
srch a4
echo position 1-23
svfe r1bq1rk1/pppnnppp/4p3/3pP3/1b1P4/2NB3N/PPP2PPP/R1BQK2R w KQ - 5 7
srch Bxh7+
echo position 1-24
svfe r2qkbnr/ppp1pp1p/3p2p1/3Pn3/4P1b1/2N2N2/PPP2PPP/R1BQKB1R w KQkq - 1 6
srch Nxe5
echo position 1-25
svfe rn2kb1r/pp2pppp/1qP2n2/8/6b1/1Q6/PP1PPPBP/RNB1K1NR b KQkq - 0 6
miss Qxb3
rtrn
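
The checkpoint scoring for the CCR script above, including the "miss" keyword, can be sketched as follows. This is only an illustration: the function name ccr_points and the idea of passing the program's preferred move at each checkpoint as a list are assumptions, not something the CI format defines.

```python
# Sketch of the CCR checkpoint scoring described above. The data layout
# (one preferred move per checkpoint) is an assumption for illustration.

CHECKPOINTS = (15, 30, 60, 120)  # seconds

def ccr_points(moves_at_checkpoints, solution, kind="srch"):
    """Score one position (0 to 4 points).

    moves_at_checkpoints: the move the program prefers at 15, 30, 60, 120 s.
    solution: set of moves listed after the keyword.
    kind: "srch" scores a point when the move IS in the set,
          "miss" scores a point when the move is NOT in the set.
    """
    if kind not in ("srch", "miss"):
        raise ValueError("kind must be 'srch' or 'miss'")
    if len(moves_at_checkpoints) != len(CHECKPOINTS):
        raise ValueError("expected one move per checkpoint")
    score = 0
    for move in moves_at_checkpoints:
        hit = move in solution
        if (kind == "srch" and hit) or (kind == "miss" and not hit):
            score += 1
    return score
```

With 25 positions and 4 checkpoints each, the maximum is the 100 points stated in the script header.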

Robert Hyatt

May 23, 1996

In article <4o1q86$p...@irz301.inf.tu-dresden.de>,

Martin Borriss <mb...@irz.inf.tu-dresden.de> wrote:
-->

-->Can I interpret it as follows:
-->You detect a repetition if it was found in the game history *and* *twice* in the
-->search? But you want to detect it if it was twice altogether?
-->
-->Linking this to the original topic:
-->Two positions in BT2630, #7 and #12 contain repetitions as solution (according to
-->the Gullydeckel prototype ;) ):
-->
-->Funny PV's follow ( #12, 6 plies the eval seems questionable to me :))
-->
-->#7:
-->(6) c6-d6 d4xd6 e7-e3 g1-h1 e3-f3 h1-g1 [0]
-->
-->#12:
-->(6) e6-e4 b3xb7 f7-f8 b7-b8 f8-f7 e2-h2 g3-g1 h1xg1 e4-b1 e1-e2 b1xg1 h2xh3 [-53]
-->(7) e6-e4 b3xb7 f7-f8 b7-b8 f8-f7 e2-h2 e4-b1 e1-e2 b1-c2 e2-e1 [0]
-->(8) e6-e4 b3xb7 f7-f8 b7-b8 f8-f7 b8-b7 [0]
-->
-->Does this mean that Stobor found those repetitions later (before you removed
-->your bug) ?
-->
-->I'm still interested to know what difference it makes where the first
-->occurrence of the position comes from.
-->
-->
-->In article <4nsshh$7...@europa.frii.com>, kerr...@frii.com (Tom Kerrigan) writes:
-->> Sorry, you misread my post. Hyatt explained this "once in actual game, once in
-->> search path" as returning a draw when the position is seen twice. He didn't say
-->> anything about where the position was seen twice, so all I did was put a counter
-->> in my repetition-finder so it would return TRUE if it found the position in the
-->> actual game and search path twice. The reason I say this is stupid is because once
-->> in a while you find a position where the solution is a draw by repetition (BT2630
-->> #5, I think) but search() doesn't return a draw score until 2 ply after it should,
-->> so finding the solution takes 8 times longer than it should, which was my case in
-->> a few of the BT2630 problems.
-->>
-->> Hope this clears things up.
-->>
-->
-->I seem to be a little slow recently :(
-->8 times longer for 2 plies is a nice branching factor.
-->

Pretty typical with null-move search, R=2. Of course the branching
factor is not really that low, we're just cheating on depth quite a
bit to make it look low.
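
The arithmetic behind "8 times longer for 2 plies" can be checked with a tiny sketch. The helper name is made up for illustration, and it assumes search time grows by a constant factor per ply:

```python
# If searching `plies` extra plies multiplies the search time by `ratio`,
# the effective per-ply branching factor is ratio ** (1 / plies).
def effective_branching_factor(ratio, plies):
    return ratio ** (1.0 / plies)

print(round(effective_branching_factor(8, 2), 2))  # 2.83
```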

The reason most people use "twice in the search, or three times overall" for
repetition testing is that if you say twice, period (such as once in the
history and once in the search), that once in the search might be the
move at the root, which cuts off the search of that move instantly. No
PV, no predicted move, no move to "ponder." In Crafty I solved this in
a different way: if the PV has only one move, I resort to a search for my
opponent to get a move for him. Otherwise you could just sit and
wait, which most people don't like, viewing it as time wasted not
searching.

B. Bauer

May 24, 1996

Here is an updated table with BT results.
1. Stobor should have 20 ELO points more.
2. Results for Crafty-9.27 on a PPC601 and on a hypothetical PPC601x2 with 133 MHz
3. Gnuchess 4.0 pl77.

            Genius2   Rebel6   CM4000   Fritz3   MChess   Hiarcs    CSTal   Ferret   Ferret   Cr9.28   Stobor   GNU-77   Cr9.27   Cr9.27  Spector  Zarkov4
              P5/90   486/66   486/66   486/66   486/50   486/50    P5/90   P5/133   P6/200   P6/200   486/80   PPC601   PPC601 PPC601x2   P5/133   P5/100

 1 Nxg7           0        1       19        2        3        1        0      349      142       59        0        1      439      220      437        1
 2 Bxb6           0       73       49       74       25      401       72        6        3       69       66       71      338      169      522       28
 3 Re6            0       44       21        5       12       67       11      205       88        5      900      227       55       28      552       11
 4 Qf7          236      900      900      900      900      702      900      191       80      900      900      900      900      900      900      900
 5 Ka6           39        1       73        5       11       86      579        0        0        0      108       69        5        3       84       19
 6 e3             0       33       17       76        4       14        3        0        0        4       12      421       29       15        4        5
 7 Rd6            0       27        7        2        4       14        1        0        0        1      900        8        9        5       12        3
 8 Rxc6+          0       29      170        2        1        1       10        1        0        0       31       71      523      262       20        1
 9 g5           761        3       10      900       33      726      644       11        5        1       34      900      139       70      126        0
10 Rxg7+         42       21       27      159       14      231       38        7        3       18      236       72      148       74      900       23
11 Qxh2           0        5       23       24       35        2        1        1        0        5        5      105       28       14       10       18
12 Qe4           16      125        2       29      204      900        2       11        5       17      128      900      118       59       23       10
13*Be6          900      900      900      900      900      900       22      900      900      900      900      900      900      900      900      900
14 Rxh7          10        5        3        7       29        4        3        6        2        2       12       22       11        6      100        2
15*e5           900      900      900      217      900      900      900      900      900      118      797      900      507      254      900      120
16*Nxg2         900      900      900       57      900      900        3      900      900      900      900      316      900      900      900      900
17 Qxf4           0      108        6       12       21       24        2       91       39       12      232       74       73       37      900        2
18 d6             0       24       72        6       95      128       14       17        9       72      461       44      176       88      900      152
19 f3            12       53       91       72       95      900      221      246      105       34      900      900      196       98      900      102
20 Ra2          174       11       27        8      406      243        9       53       23        4       68      132       29       15      115        9
21*Re1          900      900      900      900      900      900      900      246      103      900      900      900      900      900      900      900
22 a3            21        9       58      900       19        4      146      900      900       17      900       12      900      900      900        1
23*g4            95      900      900      900      785      900      900      900      900       39      900      900      900      900      900      900
24 g6            18      900      900       83      900      592      512       56       23      250      692      900      473      237      900      235
25 Nd3           20      900      217      900      900      900      900       30       13       12      346      183       72       36      330      142
26 f5           900        1       13      900        1      900       93        0        0       39       25        0        4        2        1      900
27 e6            34      119       40      255       11       71      900        1        0      118       27               900      621       76       36
28*e5           900      900      900      900      900      900        2      179       74      900      900      900      900      900      900      648
29*O-O-O        900      162      900      900      900      900      900      900      900      900      900      900      900      900      900      135
30 f4            41       10      863      566      900        1      900       19        9      423      824      105      900      470      900      900

Unsolved          7        9        9       10       10       11        8        6        6        6       11       11       10        8       15        7
Time           7819     8964     9908    10661    10808    13212     9588     7126     6126     6719    14004    13528    12372     9983    15912     8003

ELO            2369     2331     2300     2275     2270     2190     2310     2392     2426     2406     2163     2179     2218     2297     2100     2363
* = A position considered to be "hard"
PPC601x2 means a hypothetical PPC601 with 133 MHz.
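
The Unsolved, Time and ELO rows can be reproduced from a single column of times. The sketch below assumes ELO = 2630 - total/30, with an unsolved position counted as 900 seconds. That relationship is inferred here from the posted numbers rather than stated in this post, and bt2630_summary is a made-up name.

```python
# Sketch: reproduce the "Unsolved", "Time" and "ELO" rows for one column.
# The formula 2630 - total/30 is inferred from the posted numbers.

TIME_LIMIT = 900  # seconds; an unsolved position is recorded as 900

def bt2630_summary(times):
    """times: 30 per-position solution times in seconds."""
    if len(times) != 30:
        raise ValueError("BT2630 has 30 positions")
    total = sum(times)
    unsolved = sum(1 for t in times if t >= TIME_LIMIT)
    elo = round(2630 - total / 30)
    return unsolved, total, elo

# Genius2 on a P5/90, the first column of the table.
genius2 = [0, 0, 0, 236, 39, 0, 0, 0, 761, 42, 0, 16, 900, 10, 900,
           900, 0, 0, 12, 174, 900, 21, 95, 18, 20, 900, 34, 900, 900, 41]
print(bt2630_summary(genius2))  # (7, 7819, 2369)
```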

Bernhard
--
# Bernhard Bauer Phone:+49 711 685 3428 #
# Universitaet Stuttgart FAX: +49 711 685 3438 #
# Institut fuer Aero- und Gasdynamik #
# Pfaffenwaldring 21 #
# D 70569 Stuttgart e-mail:b.b...@iag.uni-stuttgart.de #

Steven J. Edwards

May 25, 1996

For positions more than 1 ply deep in the main search tree, Spector
compares the current position with prior positions (both in the tree
and in the move history) using 64 bit hash keys. If a match is
detected, the position is scored as a draw. Spector looks at the side
to move and the halfmove count and efficiently makes the minimum
number of comparisons.
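
A minimal sketch of the kind of hash-key scan described in the paragraph above, in Python. This is not Spector's code: the list layout, the function name, and the use of plain integers as stand-ins for 64 bit keys are all assumptions.

```python
# Sketch of a hash-key repetition scan. `keys` holds one key per position
# reached so far (game history followed by the current search path), with
# keys[-1] being the current position. Names are illustrative.

def count_repetitions(keys, halfmove_clock):
    """Count earlier positions whose key equals that of the current position.

    Only positions with the same side to move can match, so the scan steps
    back two plies at a time, starting four plies back (a position cannot
    recur after only two plies). It stops at the last capture or pawn move,
    halfmove_clock plies ago, because nothing before an irreversible move
    can repeat. The two-ply stride assumes the list contains no null moves.
    """
    current = keys[-1]
    last = len(keys) - 1
    oldest = max(0, last - halfmove_clock)
    count = 0
    for i in range(last - 4, oldest - 1, -2):
        if keys[i] == current:
            count += 1
    return count
```

The halfmove count bounds how far back the loop goes and the side to move halves the number of keys inspected, which is one way to read "the minimum number of comparisons".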

For positions at ply 1 (the candidate moves), Spector uses a full
position match instead of hash keys to reliably detect threefold
repetition. A twofold repetition at ply 1 is scored as a draw and the
subtree is not searched or even generated.

The only hazard with the above approach is that a time control
boundary may be crossed between a twofold repetition at ply 1 where only
a little analysis time was available for the first occurrence but a
large amount of time is available for the second. It is possible that
a second repetition may indeed lead to a win, but the program can't see
this because it assumes that any future analysis for that position
would be a waste of time. One way to treat this situation is to
record the analysis time for each played move and compare the time
used for the first occurrence with the time budget for the second
occurrence. But this doesn't seem to be worthwhile, at least as long
as there are plenty of other things that need work.

-- Steven (s...@mv.mv.com)

Eric Hallsworth

May 25, 1996

In article <4o4a7j$3n...@info4.rus.uni-stuttgart.de>, "B. Bauer"
<vabb...@iagrs06.IAG.Uni-Stuttgart.de> writes

>
>Here is an updated table with BT results.
>

> Genius2 CM4000 MChess CSTal Ferret Stobor Cr9.27
>Spector
> P5/90 486/66 486/50 P5/90 P6/200 486/80 PPC601
>P5/133
> | Rebel6 | Fritz3 | Hiarcs | Ferret | Cr9.28 | GNU-77 | Cr9.27
>| Zarkov4 |
> | 486/66 | 486/66 | 486/50 | P5/133 | P6/200 | PPC601 |
>PPC601x2| P5/100 |

.... results table deleted

>Unsoved 7 9 9 10 10 11 8 6 6 6 11 11 10 8
>15 7
>Time 7819 9908 10808 9588 6126 14004 12372
>15912
>Time 8964 10661 13212 7126 6719 13528 9983
>8003
>
>ELO 2369 2331 2300 2275 2270 2190 2310 2392 2426 2406 2163 2179 2218 2297
>2100 2363

Which Hiarcs and MChess are we talking about here? Certainly NOT Hiarcs4
or MChess Pro5, which both score MUCH better than the above.
--
Best wishes,
Eric Hallsworth, Computer Chess Magazine, The Red House,
46 High Street, Wilburton, Cambs CB6 3RA

Chris Whittington

May 25, 1996

Eric Hallsworth <er...@elhchess.demon.co.uk> wrote:
>
> In article <4o4a7j$3n...@info4.rus.uni-stuttgart.de>, "B. Bauer"
> <vabb...@iagrs06.IAG.Uni-Stuttgart.de> writes
> >
> >Here is an updated table with BT results.
> >
> > Genius2 CM4000 MChess CSTal Ferret Stobor Cr9.27
> >Spector
> > P5/90 486/66 486/50 P5/90 P6/200 486/80 PPC601
> >P5/133
> > | Rebel6 | Fritz3 | Hiarcs | Ferret | Cr9.28 | GNU-77 | Cr9.27
> >| Zarkov4 |
> > | 486/66 | 486/66 | 486/50 | P5/133 | P6/200 | PPC601 |
> >PPC601x2| P5/100 |

> ..... results table deleted

>
> >Unsoved 7 9 9 10 10 11 8 6 6 6 11 11 10 8
> >15 7
> >Time 7819 9908 10808 9588 6126 14004 12372
> >15912
> >Time 8964 10661 13212 7126 6719 13528 9983
> >8003
> >
> >ELO 2369 2331 2300 2275 2270 2190 2310 2392 2426 2406 2163 2179 2218 2297
> >2100 2363
> Which Hiarcs and MChess are we talking about here? Certainly NOT Hiarcs4
> or MChess Pro5! which both score MUCH better than the above.
> --
> Best wishes,
> Eric Hallsworth, Computer Chess Magazine, The Red House,
> 46 High Street, Wilburton, Cambs CB6 3RA

Well, post the results then ....

Best regards,

Chris Whittington

Tom Kerrigan

May 25, 1996

Hum, my function to detect reps only considers the halfmove count. I have been
meaning to figure out how to use the side to move, but null moves can screw this
all up, so I've just been ignoring it.

Cheers,
Tom

_______________________________________________________________________________
Tom Kerrigan kerr...@frii.com O-

Corrupt, stupid grasping functionaries will make at least as big a
muddle of socialism as stupid, selfish and acquisitive employers can
make of capitalism.
-- Walter Lippmann

Robert Hyatt

May 25, 1996

Or use "third repetition is an absolute draw, counting repetitions in either the
history or the search path; second repetition is a draw only if both occur in the
search path." That avoids part of the problem, because the third repetition is
a draw no matter what, while in the search, if repeating the second time is
good, then you can probably force the third repetition.

The best-known type of problem position is where you are facing two connected
rooks on the 7th, one of the rooks checks you, and you have two legal moves.
There's a 50% chance that you take a path that leads you to one edge and
back to this position, where you say "draw", while if you go the other way,
you reach a square where you can avoid the repetition. The problem is
that if you start the wrong way, and count two-fold repetition as "draw",
you have to somehow "blow through" the 2nd repetition to see that the third
can be avoided. It's just as likely that if you can choose between repeating
going one way (unknowingly going for the original position, where you get to
go "through" it to safety) or going back the other way to hit a third rep,
you'll take the latter and draw rather than finding your way out of the
repetition and on to the win.

In Cray Blitz we always favored the longest drawing line to try and avoid
this. Our draw score was N+ply to encourage this, although the transposition
table screws this calculation up by grafting paths inside the tree without
knowing anything about how to adjust the N+len calculation, since the trans
ref table doesn't have any info about the moves along the path...

More if you are interested...

Bob

Steven J. Edwards

May 25, 1996

kerr...@frii.com (Tom Kerrigan) writes:

>Hum, my function to detect reps only considers the halfmove count. I have been
>meaning to figure out how to use the side to move, but null moves can screw this
>all up, so I've just been ignoring it.

Easy. Just store the active color with the stored hash key for each
played move, and compare the stored values with the current values.
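
One possible shape for this, as a hedged Python sketch. The Entry record and the seen_before helper are invented for the example and do not come from any of the programs discussed here:

```python
# Sketch: keep the side to move next to each stored key, so the comparison
# does not depend on list parity (which a null move would break).
from collections import namedtuple

Entry = namedtuple("Entry", ["key", "white_to_move"])

def seen_before(history, key, white_to_move):
    """True if the same key with the same side to move is already stored."""
    return any(e.key == key and e.white_to_move == white_to_move
               for e in history)

history = [Entry(0xABC, True), Entry(0xDEF, False)]
print(seen_before(history, 0xABC, True))   # True
print(seen_before(history, 0xABC, False))  # False
```

The cost is one extra field per entry and one extra comparison per probe.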

-- Steven (s...@mv.mv.com)

Chris Whittington

May 25, 1996

kerr...@frii.com (Tom Kerrigan) wrote:
>
> Hum, my function to detect reps only considers the halfmove count. I have been
> meaning to figure out how to use the side to move, but null moves can screw this
> all up, so I've just been ignoring it.
>

> Cheers,
> Tom
>
> _______________________________________________________________________________
> Tom Kerrigan kerr...@frii.com O-
>
> Corrupt, stupid grasping functionaries will make at least as big a
> muddle of socialism as stupid, selfish and acquisitive employers can
> make of capitalism.
> -- Walter Lippmann

Just xor a MOVE_HASH_CONSTANT every time you move.
This deals with black/white and null move.
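
A small Python sketch of the XOR idea, under stated assumptions: SIDE_KEY is an arbitrary 64-bit constant standing in for MOVE_HASH_CONSTANT, and piece_delta stands for whatever XOR of piece-square values a real move would apply. Neither name comes from an actual program.

```python
# Sketch: fold the side to move into the hash key by XORing a constant
# on every move, including null moves.

SIDE_KEY = 0x9E3779B97F4A7C15  # arbitrary nonzero 64-bit constant

def after_move(key, piece_delta):
    """Key after a real move: apply the piece changes, then flip the side."""
    return key ^ piece_delta ^ SIDE_KEY

def after_null_move(key):
    """Key after a null move: nothing moves, only the side to move flips."""
    return key ^ SIDE_KEY

white_to_move = 0x123456789ABCDEF0        # some position, white to move
black_to_move = after_null_move(white_to_move)
print(white_to_move != black_to_move)     # True: same pieces, different key
```

Because XOR is its own inverse, flipping the side twice restores the original key, so a move and its undo need no special casing.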

Chris Whittington

May 25, 1996

I can remember winning a game in a Chess Olympiad tourney (1991?)
because the opponent had a 3-move draw detector, instead of a 2-mover.

Chris Whittington

Tom Kerrigan

May 25, 1996

Ah, evidently I misread Steven's post. I thought he said that he used both the
halfmove counter and the side to move to speed up repetition detection. I was
simply trying to communicate that I don't use the latter because if a nullmove is
played, the side to move gets screwed up, so if you're searching every other hash
key for a repetition because it's black's turn to move, then you could end up
searching white-to-move positions because nullmove gets these things out of sync.

I've hashed in the side to move since day one. Just thought I'd clear this up.

Cheers,
Tom

_______________________________________________________________________________
Tom Kerrigan kerr...@frii.com O-

The meek shall inherit the earth -- they are too weak to refuse.

Robert Hyatt

May 25, 1996

In article <4o8678$c...@europa.frii.com>,
Tom Kerrigan <kerr...@frii.com> wrote:
-->Ah, evidently I misread Steven's post. I thought he said that he used both the
-->halfmove counter and the side to move to speed up repetition detection. I was
-->simply trying to communicate that I don't use the latter because if a nullmove is
-->played, the side to move gets screwed up, so if you're searching every other hash
-->key for a repetition because it's black's turn to move, then you could end up
-->searching white-to-move positions because nullmove gets these things out of sync.
-->
-->I've hashed in the side to move since day one. Just thought I'd clear this up.
-->
-->Cheers,
-->Tom
-->

I use two lists, and don't have the problem at all... However, the minute
you let a draw score hit the hash table you have a problem. I allow it,
and put up with it. and curse it. and complain about it. and still I
tolerate it. :)

Chris Whittington

May 26, 1996

When the initial results (6 programs?) were posted, I made it
clear that these figures were from 1994-June. The idea is for the
table of results to grow - with current commercial programs - and
with experimental and amateur versions. (That makes it Hiarcs 2.1
and Mchess 4 ?????, or Mchess3 ???)

Bruce Moreland has undertaken to keep the results table up to date.
I assume he will be posting it back here every so often.

Chris Whittington

Steven J. Edwards

May 26, 1996

hy...@cis.uab.edu (Robert Hyatt) writes:
>In article <Dry0G...@mv.mv.com>, Steven J. Edwards <s...@mv.mv.com> wrote:

>>For positions more than 1 ply deep in the main search tree, Spector
>>compares the current position with prior positions (both in the tree
>>and in the move history) using 64 bit hash keys. If a match is
>>detected, the position is scored as a draw. Spector looks at the side
>>to move and the halfmove count and efficiently makes the minimum
>>number of comparisons.
>>
>>For positions at ply 1 (the candidate moves), Spector uses a full
>>position match instead of hash keys to reliably detect threefold
>>repetition. A twofold repetition at ply 1 is scored as a draw and the
>>subtree is not searched or even generated.
>>
>>The only hazard with the above approach is that a time control
>>boundary may be crossed between a twofold repetition at ply 1 where only
>>a little analysis time was available for the first occurrence but a
>>large amount of time is available for the second. It is possible that
>>a second repetition may indeed lead to a win, but the program can't see
>>this because it assumes that any future analysis for that position
>>would be a waste of time. One way to treat this situation is to
>>record the analysis time for each played move and compare the time
>>used for the first occurrence with the time budget for the second
>>occurrence. But this doesn't seem to be worthwhile, at least as long
>>as there are plenty of other things that need work.

>Or use "third repetition is an absolute draw, counting repetitions in either the
>history or the search path; second repetition is a draw only if both occur in the
>search path." That avoids part of the problem, because the third repetition is
>a draw no matter what, while in the search, if repeating the second time is
>good, then you can probably force the third repetition.

>The best-known type of problem position is where you are facing two connected
>rooks on the 7th, one of the rooks checks you, and you have two legal moves.
>There's a 50% chance that you take a path that leads you to one edge and
>back to this position, where you say "draw", while if you go the other way,
>you reach a square where you can avoid the repetition. The problem is
>that if you start the wrong way, and count two-fold repetition as "draw",
>you have to somehow "blow through" the 2nd repetition to see that the third
>can be avoided. It's just as likely that if you can choose between repeating
>going one way (unknowingly going for the original position, where you get to
>go "through" it to safety) or going back the other way to hit a third rep,
>you'll take the latter and draw rather than finding your way out of the
>repetition and on to the win.

>In Cray Blitz we always favored the longest drawing line to try and avoid
>this. Our draw score was N+ply to encourage this, although the transposition
>table screws this calculation up by grafting paths inside the tree without
>knowing anything about how to adjust the N+len calculation, since the trans
>ref table doesn't have any info about the moves along the path...

>More if you are interested...

I remember your paper in the _ICCAJ_ about the draw heuristic in Cray
Blitz with depth based scores for draws. As I recall, Cray Blitz had
a forbidden zone of scores near zero reserved for draws by a tweak
that pushed no-draw but balanced positions just outside the maximum
draw value.

I did not implement anything like this because I felt that it would
screw up the transposition score retrieval and because I am uneasy
with having a single variable (the score) carry two kinds of
information. There are already some difficulties with embedding
mate/loss information in score returns (again, with the transposition
subsystem) and I wanted to avoid further hazards.

-- Steven (s...@mv.mv.com)

Steven J. Edwards

May 26, 1996

Chris Whittington <chr...@cpsoft.demon.co.uk> writes:
>kerr...@frii.com (Tom Kerrigan) wrote:

>> Hum, my function to detect reps only considers the halfmove count. I have been
>> meaning to figure out how to use the side to move, but null moves can screw this
>> all up, so I've just been ignoring it.

>Just xor a MOVE_HASH_CONSTANT every time you move.

>This deals with black/white and null move.

I have an uneasy feeling about storing the side to move as part of the
hash key. It seems like another source of the dreaded "takes two
weeks to debug mysterious error" type of problem if a false positive
match occurs with different colors. Store the side to move each time
and you'll sleep better at night.

-- Steven (s...@mv.mv.com)

Robert Hyatt

May 26, 1996

In article <Ds0uD...@mv.mv.com>, Steven J. Edwards <s...@mv.mv.com> wrote:
>
>I remember your paper in the _ICCAJ_ about the draw heuristic in Cray
>Blitz with depth based scores for draws. As I recall, Cray Blitz had
>a forbidden zone of scores near zero reserved for draws by a tweak
>that pushed no-draw but balanced positions just outside the maximum
>draw value.
>
>I did not implement anything like this because I felt that it would
>screw up the transposition score retrieval and because I am uneasy
>with having a single variable (the score) carry two kinds of
>information. There are already some difficulties with embedding
>mate/loss information in score returns (again, with the transposition
>subsystem) and I wanted to avoid further hazards.
>
>-- Steven (s...@mv.mv.com)
>
>

The main difficulty with what we did is that the transposition table can
break this score. For example, if you reach position X and store
"draw in 31 plies", and then later retrieve it (same position, but a different
sequence of moves to get here), the draw in 31 might be completely wrong.

The problem is that the path from Position X to the drawing point is based
on moves made not only after position X to reach the draw, but on moves
made before X as well. This info is not stored, so that when we retrieve
X later, the draw score is at best a wild guess. Worked well at times,
but we often saw draw in 200 plies, when we could not possibly search
that deeply.

I'm "back to basics" in Crafty because keeping a
"window of scores blocked off for only draw scores" is confusing
in the negamax representation.

Bruce Moreland

May 27, 1996

In article <83311625...@cpsoft.demon.co.uk>,
chr...@cpsoft.demon.co.uk says...
>[snip]

>
>When the initial results (6 programs?) were posted, I made it
>clear that these figures were from 1994-June. The idea is for the
>table of results to grow - with current commercial programs - and
>with experimental and amateur versions. (That makes it Hiarcs 2.1
>and Mchess 4 ?????, or Mchess3 ???)
>
>Bruce Moreland has undertaken to keep the results table up to date.
>I assume he will be posting it back here every so often.
>
>Chris Whittington

Yes, I will. I'll also post La Puce and CCR (1 hour) results as well.
I've only had one report on these suites though.

bruce

--
The opinions expressed in this message are my own personal views
and do not reflect the official views of Microsoft Corporation.

Bruce Moreland

May 27, 1996

In article <Dry0G...@mv.mv.com>, s...@mv.mv.com says..

>
>The only hazard with the above approach is that a time control
>boundary may be crossed between a twofold repetition at ply 1 where only
>a little analysis time was available for the first occurrence but a
>large amount of time is available for the second. It is possible that
>a second repetition may indeed lead to a win, but the program can't see
>this because it assumes that any future analysis for that position
>would be a waste of time. One way to treat this situation is to
>record the analysis time for each played move and compare the time
>used for the first occurrence with the time budget for the second
>occurrence. But this doesn't seem to be worthwhile, at least as long
>as there are plenty of other things that need work.
>

>-- Steven (s...@mv.mv.com)

Assuming you don't clear your hash table, you could get ANYTHING back the
second time you examine a position, for any length of time.

Here are problems with scoring a position drawn ANY time you see it twice:

1) I was getting into these positions where I was under attack, and
losing, but my program is checking like crazy with its queen. It's
still thinking it's -2; it's just pushing the even grosser stuff
over the horizon. So it's roaring around checking away at -2, but then
the opponent makes a mistake and it really IS a perpetual. So now the
opponent moves his king back into a position where my program can choose
between the move that guarantees a perpetual, and the move that gets
it back into the -2 position (which it also thinks is a draw, since it has
already seen it once). It goes for the -2 position, the opponent has
learned to avoid the perpetual check, and now finds the win.

2) I had an ending vs gnuchess a long time ago, which was a K+P ending,
and gnu wasn't seeing as much as mine was, it didn't know that a
particular position was won for it, and it allowed my program to escape
to a position that really was drawn. The game seemed on the verge of
being a repetition draw, but my program allowed gnu to get back to the
"won" position, at which point it found the win.

3) My program was two pawns down versus a human. The human made a move
with a knight, which was a blunder, as it lost a pawn. My program made a
move with its knight, which threatened a pawn that couldn't be moved or
defended. The opponent thought about this, and decided to concede the
pawn. The move he made to concede the pawn involved undoing the move he'd
just made, at which point my program undid its move (since it would
rather go for the "draw" position than the position where it is a pawn
down), and the human was two pawns ahead again.

This is why I score a position a draw if I see it twice in the search, or
twice in the move history + once in the search. I have found that to do
otherwise costs points in practical play.
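
The rule in the last paragraph can be sketched in a few lines of Python. This is an interpretation, not Bruce's code: it assumes game_history holds the keys of positions actually reached in the game (including the root position being searched), search_path holds the keys of nodes below the root down to the parent of the current node, and the current node itself counts as one occurrence "in the search".

```python
# Sketch of "twice in the search, or twice in the move history plus once
# in the search". Names and list conventions are assumptions.

def is_repetition_draw(key, game_history, search_path):
    """Return True if the current node (with hash `key`) scores as a draw."""
    if key in search_path:
        # The current node plus an earlier node in the same search line.
        return True
    # The current node plus two occurrences in the actual game.
    return game_history.count(key) >= 2

print(is_repetition_draw(7, [7, 3], [5, 6]))     # False: seen once in the game
print(is_repetition_draw(7, [7, 3, 7], [5, 6]))  # True: twice in the game
print(is_repetition_draw(7, [1], [7, 6]))        # True: twice in the search
```

Under this rule a position that occurred only once in the game is searched normally when it comes up again, which is what examples 1 to 3 above call for.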

Bruce Moreland

May 27, 1996

In article <Drz5A...@mv.mv.com>, s...@mv.mv.com says...

>
>kerr...@frii.com (Tom Kerrigan) writes:
>
>>Hum, my function to detect reps only considers the halfmove count. I have been
>>meaning to figure out how to use the side to move, but null moves can screw this
>>all up, so I've just been ignoring it.
>

>Easy. Just store the active color with the stored hash key for each
>played move, and compare the stored values with the current values.

Either that or hash some random crud (a constant value) into your key
every time you change side to move.
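
A minimal sketch of that idea in C, assuming a 64-bit key that is updated incrementally. The constant value and the name `toggle_side` are made up for illustration; any fixed, random-looking value would do.

```c
#include <stdint.h>

/* Hypothetical constant: any fixed 64-bit value with well-mixed bits. */
#define SIDE_TO_MOVE_KEY UINT64_C(0x9E3779B97F4A7C15)

/* XOR the constant into the key every time the side to move changes,
 * for real moves and null moves alike. XOR is its own inverse, so the
 * same call also undoes the change when the move is taken back. */
static inline uint64_t toggle_side(uint64_t key)
{
    return key ^ SIDE_TO_MOVE_KEY;
}
```

Because a null move also toggles the key, a position reached after a null move no longer compares equal to the same piece placement with the other side to move.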

Bruce Moreland

May 27, 1996, 3:00:00 AM

In article <4o8dhb$g...@pelham.cis.uab.edu>, hy...@cis.uab.edu says...

>
>I use two lists, and don't have the problem at all... However, the minute
>you let a draw score hit the hash table you have a problem. I allow it,
>and put up with it. and curse it. and complain about it. and still I
>tolerate it. :)
>
>Bob

The draw score doesn't have to hit the hash table, it can cause problems even if you
don't hash a draw score.

If you have a position where you have two moves, one of which is +2.00 and the other
+1.50, your program will choose the +2.00 move, and you will get +2.00 in
your hash table.

But if you have seen the +2.00 move before, your program will take the +1.50 move,
and you'll get +1.50 in your hash table.

If you then get to this node via another move path, in which the +1.50 is the one
that you would have seen before, you may cut off based upon +1.50 even though a) the
+2.00 move is available, and b) the +1.50 move leads to a rep.

This is one of those cases where you just cross your fingers and hope for the best,
after having verified that your program won't crash or exhibit overly bizarre
behavior if your search goes non-deterministic.

bruce

Bruce Moreland

May 27, 1996, 3:00:00 AM

In article <Ds0uM...@mv.mv.com>, s...@mv.mv.com says...

>
>I have an uneasy feeling about storing the side to move as part of the
>hash key. It seems like another source of the dreaded "takes two
>weeks to debug mysterious error" type of problem if a false positive
>match occurs with different colors. Store the side to move each time
>and you'll sleep better at night.

Do it the other way and I bet you'll go a little faster.

You can avert catastrophe in case of a key collision by taking your hash
table move with a grain of salt.

Assuming you don't get confused if you find a move for the other side in
your hash element, all you are doing by inserting "color" into the hash
element is adding a bit to your hash key.

bruce

Bruce Moreland

May 27, 1996, 3:00:00 AM

In article <Ds0uD...@mv.mv.com>, s...@mv.mv.com says...

>
>I remember your paper in the _ICCAJ_ about the draw heuristic in Cray
>Blitz with depth based scores for draws. As I recall, Cray Blitz had
>a forbidden zone of scores near zero reserved for draws by a tweak
>that pushed no-draw but balanced positions just outside the maximum
>draw value.

It is amazing how many ideas you have, experiment with, and discard, while
writing a chess program.

It is also amazing how many of the ideas in Bob's papers are actually
worth doing.

This isn't one of them.

I don't have a good solution to this problem. If you are in a position
where you can either choose RxN KxR, which is a draw because the opponent
has a bishop and you have nothing, or play RxB, at which point it is an
intricate but drawn R vs N ending, where you have the R, you probably want
to do the latter, since there are some practical winning chances, but it
is hard to represent this as a "score".

bruce

John Stanback

May 27, 1996, 3:00:00 AM

In Zarkov I avoid most of these problems by not counting the second occurrence
of a position as a draw if it is within 2 plies of the root of the search tree.
This is really simple to implement and ensures that the computer won't just make
a move which repeats the position, making the faulty assumption that the best
the opponent will be able to do is draw.
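
One way this guard might look in C. The names and the exact comparison (`ply > 2` versus `ply >= 2`) are assumptions; the post only says "within 2 plies of the root".

```c
/* Threshold taken from the post; how the boundary ply is treated is a guess. */
#define MIN_REP_PLY 2

/* seen_before: nonzero if the position has already occurred once.
 * ply: distance from the root of the search tree (root = 0).
 * Returns nonzero if the second occurrence should be scored as a draw. */
int score_as_repetition_draw(int seen_before, int ply)
{
    if (!seen_before)
        return 0;
    /* Near the root, keep searching instead of assuming a draw. */
    return ply > MIN_REP_PLY;
}
```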

John Stanback

Chris Whittington

May 27, 1996, 3:00:00 AM

Nice idea.

I'll try it for Chess System Tal which often picks a draw at depth
2-3 and cuts it for the main line.

Often this occurs when CST had been thinking it was better for
the previous moves and then starts to lose the game.

The 'draw' line is often the prelude to a loss.

Been wondering for some time how to deal with it.

Chris Whittington

Robert Hyatt

May 27, 1996, 3:00:00 AM

In article <83322224...@cpsoft.demon.co.uk>,
Chris Whittington <chr...@cpsoft.demon.co.uk> wrote:
-->John Stanback <j...@verinet.com> wrote:
-->>
-->> Bruce Moreland wrote:
-->> >
-->> > Here are problems with scoring a position drawn ANY time you see it twice:
-->> >
-->> > 1) I was getting into these positions where I was under attack, and
-->> > losing, but my program is checking like crazy with its queen, but it's
-->> > still thinking it's -2, it's just pushing the even grosser stuff
-->> > over the horizon. So it's roaring around checking away at -2, but then
-->> > the opponent makes a mistake and it really IS a perpetual. So now the
-->> > opponent moves his king back into a position where my program can choose
-->> > between the move that guarantees a perpetual, and the move that gets
-->> > it back into the -2 position (which it also thinks is a draw, since it has
-->> > already seen it once). It goes for the -2 position, the opponent has
-->> > learned to avoid the perpetual check, and now finds the win.
-->> >
-->> > 2) I had an ending vs gnuchess a long time ago, which was a K+P ending,
-->> > and gnu wasn't seeing as much as mine was, it didn't know that a
-->> > particular position was won for it, and it allowed my program to escape
-->> > to a position that really was drawn. The game seemed on the verge of
-->> > being a repetition draw, but my program allowed gnu to get back to the
-->> > "won" position, at which point it found the win.
-->> >
-->> > 3) My program was two pawns down versus a human. The human made a move
-->> > with a knight, which was a blunder, as it lost a pawn. My program made a
-->> > move with its knight, which threatened a pawn that couldn't be moved or
-->> > defended. The opponent thought about this, and decided to concede the
-->> > pawn. The move he made to concede the pawn involved undoing the move he'd
-->> > just made, at which point my program undid its move (since it would
-->> > rather go for the "draw" position than the position where it is a pawn
-->> > down), and the human was two pawns ahead again.
-->> >
-->> > This is why I score a position a draw if I see it twice in the search, or
-->> > twice in the move history + once in the search. I have found that to do
-->> > otherwise costs points in practical play.
-->>
-->> In Zarkov I avoid most of these problems by not counting the second occurrence
-->> of a position as a draw if it is within 2 plies of the root of the search tree.
-->> This is really simple to implement and ensures that the computer won't just make
-->> a move which repeats the position making the faulty assumption that the best
-->> the opponent will be able to do is draw.
-->>
-->>
-->> John Stanback
-->
-->
-->Nice idea.
-->
-->I'll try it for Chess System Tal which often picks a draw at depth
-->2-3 and cuts it for the main line.
-->
-->Often this occurs when CST had been thinking it was better for
-->the previous moves and then starts to lose the game.
-->
-->The 'draw' line is often the prelude to a loss.
-->
-->Been wondering for some time how to deal with it.
-->
-->Chris Whittington
-->

I seem to remember Hsu mentioning somewhere along the way that when DT
found a repetition "best" it became very suspicious, particularly if it
was ahead. That's usually one step down a long staircase. I have this
on my list to play with one day as well.

Steven J. Edwards

May 28, 1996, 3:00:00 AM

brucemo (Bruce Moreland) writes:
>In article <Ds0uM...@mv.mv.com>, s...@mv.mv.com says...

>>I have an uneasy feeling about storing the side to move as part of the
>>hash key. It seems like another source of the dreaded "takes two
>>weeks to debug mysterious error" type of problem if a false positive
>>match occurs with different colors. Store the side to move each time
>>and you'll sleep better at night.

>Do it the other way and I bet you'll go a little faster.

Not really. I check the hash key first, so it's rare that the active
color has to be checked as well.

>You can avert catastrophe in case of a key collision by taking your hash
>table move with a grain of salt.

Always. That's why Spector has a legality tester for retrieved moves.

>Assuming you don't get confused if you find a move for the other side in
>your hash element, all you are doing by inserting "color" into the hash
>element is adding a bit to your hash key.

Not quite. The added bit is not in the same equivalence class as the
other bits; it can never be confused due to collision as a result of
key construction.
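
For comparison with hashing the color into the key, a small C sketch of the "store it separately" approach. The struct layout and names are hypothetical and are not taken from Spector.

```c
#include <stdint.h>

/* Hypothetical record kept for each played move. */
typedef struct {
    uint64_t key;   /* hash key of the position, color not mixed in */
    uint8_t  side;  /* active color: 0 = white, 1 = black */
} HistoryEntry;

/* The key is compared first; the color is only looked at when the keys
 * already match, so the extra test is rarely executed. */
int same_position(const HistoryEntry *a, const HistoryEntry *b)
{
    return a->key == b->key && a->side == b->side;
}
```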

-- Steven (s...@mv.mv.com)

Martin Borriss

May 28, 1996, 3:00:00 AM

In article <4ntghe$a...@pelham.cis.uab.edu>, hy...@cis.uab.edu (Robert Hyatt) writes:

[...]

> BTW, Crafty counts twice as a draw, whether it's twice in search or once in
> history and once in search. I tried it the other way but didn't like it as
> well as this as it would often take longer to see a repetition when you
> require two reps in search or three overall. I now treat two as a rep
> regardless, just like I did in Cray Blitz. I have a switch in CB to flip
> it to either approach, but always seemed to like the simple one better...

OK, that's what I wanted to hear. If it works for Crafty, it should be fine for
me (and for Tom?). The thing I learned is the potential problem with having
the first occurrence at the root (which I happily ignore now).

--
Martin....@inf.tu-dresden.de

Chris Whittington

May 28, 1996, 3:00:00 AM

Well, I tried it - not expecting anything dramatic, but, first game,
one move, it put a repetition draw in at depth 5, instead of the
usual 2 or 3.

And stepped down the long staircase.

Aaaarrrrggghhh.

Chris Whittington

Robert Hyatt

May 28, 1996, 3:00:00 AM

In article <83331736...@cpsoft.demon.co.uk>,
Chris Whittington <chr...@cpsoft.demon.co.uk> wrote:
-->>
-->> I seem to remember Hsu mentioning somewhere along the way that when DT
-->> found a repetition "best" it became very suspicious, particularly if it
-->> was ahead. That's usually one step down a long staircase. I have this
-->> on my list to play with one day as well.
-->> --
-->> Robert Hyatt Computer and Information Sciences
-->> hy...@cis.uab.edu University of Alabama at Birmingham
-->> (205) 934-2213 115A Campbell Hall, UAB Station
-->> (205) 934-5473 FAX Birmingham, AL 35294-1170
-->
-->Well, I tried it - not expecting anything dramatic, but, first game,
-->one move, it put a repetition draw in at depth 5, instead of the
-->usual 2 or 3.
-->
-->And stepped down the long staircase.
-->
-->Aaaarrrrggghhh.
-->
-->Chris Whittington
-->

I think that we need to try noticing that if you appear to be ahead (by
how much is subject to discussion) and discover that a repetition is the
best move suddenly, it might be worthwhile to try for another ply, although
it might well be that the game is already lost.
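
A hedged sketch of that trigger in C. The margin, the names, and the idea of comparing against the previous iteration's score are all assumptions, since the post leaves "by how much" open.

```c
/* Hypothetical margin in centipawns: how far ahead counts as "ahead". */
#define AHEAD_MARGIN 50

/* prev_score: score from the previous iteration, from our point of view.
 * best_is_repetition: nonzero if the new best line ends in a repetition.
 * Returns nonzero if it seems worth searching one more ply before moving. */
int should_search_another_ply(int prev_score, int best_is_repetition)
{
    return best_is_repetition && prev_score >= AHEAD_MARGIN;
}
```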

Vincent Diepeveen

May 30, 1996, 3:00:00 AM

In <4obu0b$i...@news.microsoft.com> brucemo (Bruce Moreland) writes:

>In article <Ds0uD...@mv.mv.com>, s...@mv.mv.com says...
>>
>>I remember your paper in the _ICCAJ_ about the draw heuristic in Cray
>>Blitz with depth based scores for draws. As I recall, Cray Blitz had
>>a forbidden zone of scores near zero reserved for draws by a tweak
>>that pushed no-draw but balanced positions just outside the maximum
>>draw value.
>
>It is amazing how many ideas you have, experiment with, and discard, while
>writing a chess program.
>
>It is also amazing how many of the ideas in Bob's papers are actually
>worth doing.
>
>This isn't one of them.

Correct. I tried it. It never takes the draw.
When it sees a draw, it has a good reason to not take it, because
of its score. It waits and sits until it cannot take a draw anymore.

>I don't have a good solution to this problem. If you are in a position
>where you can either choose RxN KxR, which is a draw because the opponent
>has a bishop and you have nothing, or play RxB, at which point it is an
>intricate but drawn R vs N ending, where you have the R, you probably want
>to do the latter, since there are some practical winning chances, but it
>is hard to represent this as a "score".
>
>bruce
>

--
+--------------------------------------+
|| email : vdie...@cs.ruu.nl ||
|| Vincent Diepeveen ||
+======================================+

Robert Hyatt

May 30, 1996, 3:00:00 AM

In article <4ok5ed$f...@krant.cs.ruu.nl>,
Vincent Diepeveen <vdie...@cs.ruu.nl> wrote:
-->In <4obu0b$i...@news.microsoft.com> brucemo (Bruce Moreland) writes:
-->
-->>In article <Ds0uD...@mv.mv.com>, s...@mv.mv.com says...
-->>>
-->>>I remember your paper in the _ICCAJ_ about the draw heuristic in Cray
-->>>Blitz with depth based scores for draws. As I recall, Cray Blitz had
-->>>a forbidden zone of scores near zero reserved for draws by a tweak
-->>>that pushed no-draw but balanced positions just outside the maximum
-->>>draw value.
-->>
-->>It is amazing how many ideas you have, experiment with, and discard, while
-->>writing a chess program.
-->>
-->>It is also amazing how many of the ideas in Bob's papers are actually
-->>worth doing.
-->>
-->>This isn't one of them.
-->
-->Correct. I tried it. It never takes the draw.
-->When it sees a draw, it has a good reason to not take it, because
-->of its score. It waits and sits until it cannot take a draw anymore.
-->
-->>I don't have a good solution to this problem. If you are in a position
-->>where you can either choose RxN KxR, which is a draw because the opponent
-->>has a bishop and you have nothing, or play RxB, at which point it is an
-->>intricate but drawn R vs N ending, where you have the R, you probably want
-->>to do the latter, since there are some practical winning chances, but it
-->>is hard to represent this as a "score".
-->>
-->>bruce

Sorry to disagree, but the technique worked well, and I could likely get
Harry to produce some positions where this is the *only* way to reliably
solve certain problems.

The only drawback was the trans/ref table could perturb the draw-in-n
scores by grafting sub-trees from one point to another without fully
understanding the moves between the two sub-trees.

However, if "you tried it and it never took the draw" you screwed it up.
I recall one game against Zarkov in an ACM event where this saved the day
and let us win a game that likely would have been a draw, because Cray
Blitz kept taking the longest path to repetition which eventually took
us to the *only* path that could win. Without this, it would take pure
luck to choose between 5 moves that lead to a draw, and only one of them
actually leads to a win, but the win is so far away....

I have not yet done it in Crafty, only because the negamax algorithm makes
it too easy to screw up this "window". Once I have time to think about it
I plan on re-implementing it here. If you look up Harry's paper "The Cray
Blitz Draw Heuristic" in the ICCA journal, you'll find a couple of positions
that this solves quite well.

Against humans it's even better, because you always keep the draw option in
sight, but take the longest path to give the human a chance to make a
misstep and lose the draw option.
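
To make the draw-in-N idea concrete, here is a loose C sketch based only on the description in this thread: draws get small scores that grow with distance from the root, and ordinary evaluations are shifted out of the band reserved for draws. The band width, the names, and the shifting scheme are assumptions and not Cray Blitz's actual values or code. Scores are from the root side's point of view; the sign handling that negamax requires is left out.

```c
/* Hypothetical width of the score band reserved for draws. */
#define DRAW_BAND 64

/* Score for a draw reached `ply` plies from the root. A later draw scores
 * slightly higher, so the root side prefers the longest path to the draw. */
int draw_score(int ply)
{
    return ply < DRAW_BAND ? ply : DRAW_BAND;
}

/* Shift a non-draw evaluation out of the reserved band so that no
 * ordinary score can be confused with a draw score. */
int adjust_eval(int eval)
{
    if (eval >= 0)
        return eval + DRAW_BAND + 1;
    return eval - DRAW_BAND - 1;
}
```

With these numbers a balanced but undecided position always scores above any draw for the side evaluating it, which matches the complaint elsewhere in the thread that such a program can be reluctant to take a draw.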

Dave Slate and I had many discussions about this. His recommendation was
to never let a draw score get stored in the trans/ref table. He apparently
did this in his programs for reasons unknown, but it would have solved the
draw-in-N getting changed to a draw-in-N+R(x) where R(x) is nearly a random
number at times. I tried it in Cray Blitz, but liked the performance gain
obtained by finding draw scores as well as mates, which I also store as
absolute mate scores in Crafty (as I did in Cray Blitz) with no problems.
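
A minimal C sketch of the "never store a draw score" recommendation. The table layout, the size, and the use of an exact zero to mark a draw are assumptions for the example; they do not describe any of the programs discussed.

```c
#include <stddef.h>
#include <stdint.h>

#define TABLE_SIZE 1024  /* hypothetical size */
#define DRAW_SCORE 0     /* assumption: an exact zero marks a draw */

typedef struct {
    uint64_t key;
    int score;
    int depth;
    int valid;
} TTEntry;

static TTEntry table[TABLE_SIZE];

/* Store a result unless it is a draw score. Returns 1 if stored. */
int tt_store(uint64_t key, int score, int depth)
{
    if (score == DRAW_SCORE)
        return 0;  /* keep path-dependent draw scores out of the table */
    TTEntry *e = &table[key % TABLE_SIZE];
    e->key = key;
    e->score = score;
    e->depth = depth;
    e->valid = 1;
    return 1;
}

/* Return the entry for `key`, or NULL if nothing usable is stored. */
const TTEntry *tt_probe(uint64_t key)
{
    const TTEntry *e = &table[key % TABLE_SIZE];
    return (e->valid && e->key == key) ? e : NULL;
}
```

The cost, as noted above, is that repetition draws have to be rediscovered by search each time instead of being cut off from the table.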

Bob