
Fail High reductions


Jon Dart
Oct 7, 1996

There is a mention in the new ICCA journal of an article presented at
the 8th Advances in Computer Chess conference on "Fail High reductions."
The author claims a 130 point or so ELO increase in the strength of
the program "Zugzwang" after implementing this technique.

I looked around the net for the article and found it at
http://www.uni-paderborn.de/fachbereich/AG/monien/PERSONAL/obelix.html
(Rainer Feldmann's home page).

The 130 point increase in strength is less impressive once you realize
that the program didn't use the null move heuristic before, and now
uses "failhigh reductions", which are a variation of the null move
heuristic. But still, I'd be interested in any opinions on this paper.
It's not clear that the Fail High reduction technique is better than
a straight implementation of null moves, but it may be worth a try.

By the way, the latest ICCA journal also has a nice article on
hashtable replacement algorithms that looks interesting. They found a
2-level table, with different replacement schemes in each table, generally
better than a single table.

--Jon

Robert Hyatt
Oct 7, 1996

Jon Dart (jd...@best.com) wrote:
:
: There is a mention in the new ICCA journal of an article presented at

: the 8th Advances in Computer Chess conference on "Fail High reductions."
: The author claims a 130 point or so ELO increase in the strength of
: the program "Zugzwang" after implementing this technique.
:
: I looked around the net for the article and found it at
: http://www.uni-paderborn.de/fachbereich/AG/monien/PERSONAL/obelix.html
: (Rainer Feldmann's home page).
:
: The 130 point increase in strength is less impressive once you realize
: that the program didn't use the null move heuristic before, and now
: uses "failhigh reductions", which are a variation of the null move
: heuristic. But still, I'd be interested in any opinions on this paper.
: It's not clear that the Fail High reduction technique is better than
: a straight implementation of null moves, but it may be worth a try.

Is this an English translation?

:
: By the way, the latest ICCA journal also has a nice article on


: hashtable replacement algorithms that looks interesting. They found a
: 2-level table, with different replacement schemes in each table, generally
: better than a single table.

Ken Thompson has used such a scheme for years. He had one table that was
an "always replace" table, and a second (half the size of the first) that
used a depth-priority replacement scheme. Ferret has used this, and I have
a version of it for Crafty as well, although the old Cray Blitz scheme
offers better performance if memory is fast (not the case on the PCs, of
course).

Jon Dart
Oct 7, 1996

In article <53bdqr$f...@juniper.cis.uab.edu>,

Robert Hyatt <hy...@crafty.cis.uab.edu> wrote:
>Jon Dart (jd...@best.com) wrote:
>:
>: There is a mention in the new ICCA journal of an article ...
>Is this an English translation?
>

It is in English, in PostScript format.

--Jon

Ed Schröder
Oct 8, 1996

jd...@best.com (Jon Dart) wrote:
>
>There is a mention in the new ICCA journal of an article presented at
>the 8th Advances in Computer Chess conference on "Fail High reductions."
>The author claims a 130 point or so ELO increase in the strength of
>the program "Zugzwang" after implementing this technique.
>
>I looked around the net for the article and found it at


Does anybody have this article in plain TEXT format?

I have tried this idea myself in the past with disappointing results.
Maybe Rainer Feldmann did a better job.

- Ed Schroder -


>The 130 point increase in strength is less impressive once you realize
>that the program didn't use the null move heuristic before, and now
>uses "failhigh reductions", which are a variation of the null move
>heuristic. But still, I'd be interested in any opinions on this paper.
>It's not clear that the Fail High reduction technique is better than
>a straight implementation of null moves, but it may be worth a try.
>

>By the way, the latest ICCA journal also has a nice article on
>hashtable replacement algorithms that looks interesting. They found a
>2-level table, with different replacement schemes in each table, generally
>better than a single table.
>

>--Jon

Vincent Diepeveen
Oct 9, 1996

In <53bado$d...@shellx.best.com> jd...@best.com (Jon Dart) writes:


>There is a mention in the new ICCA journal of an article presented at
>the 8th Advances in Computer Chess conference on "Fail High reductions."
>The author claims a 130 point or so ELO increase in the strength of
>the program "Zugzwang" after implementing this technique.
>
>I looked around the net for the article and found it at
>http://www.uni-paderborn.de/fachbereich/AG/monien/PERSONAL/obelix.html
>(Rainer Feldmann's home page).
>

>The 130 point increase in strength is less impressive once you realize
>that the program didn't use the null move heuristic before, and now
>uses "failhigh reductions", which are a variation of the null move
>heuristic. But still, I'd be interested in any opinions on this paper.

>It's not clear that the Fail High reduction technique is better than
>a straight implementation of null moves, but it may be worth a try.

Why do you conclude that FH works better than nullmove?
How many games did you play with your program?

I think that FH reductions are another form of very
dubious forward pruning, with an even worse assumption than nullmove.

>By the way, the latest ICCA journal also has a nice article on
>hashtable replacement algorithms that looks interesting. They found a
>2-level table, with different replacement schemes in each table, generally
>better than a single table.

The 2 level table definitely works better.

>--Jon

Vincent Diepeveen
vdie...@cs.ruu.nl
--
+--------------------------------------+
|| email : vdie...@cs.ruu.nl ||
|| Vincent Diepeveen ||
+======================================+

Vincent Diepeveen
Oct 9, 1996

In <53cmbs$s...@news.xs4all.nl> "Ed Schröder" <rebc...@xs4all.nl> writes:

>jd...@best.com (Jon Dart) wrote:
>>
>>There is a mention in the new ICCA journal of an article presented at
>>the 8th Advances in Computer Chess conference on "Fail High reductions."
>>The author claims a 130 point or so ELO increase in the strength of
>>the program "Zugzwang" after implementing this technique.
>>
>>I looked around the net for the article and found it at
>>http://www.uni-paderborn.de/fachbereich/AG/monien/PERSONAL/obelix.html
>>(Rainer Feldmann's home page).
>
>

>Does anybody have this article in plain TEXT format?
>
>I have tried this idea myself in the past with disappointing results.
>Maybe Rainer Feldmann did a better job.

Feldmann forgot to compare against a nullmove (R=2) implementation, although
I think that would have been an easy comparison to make.

I did test FHR. It is another piece of research that was not objectively
compared against techniques already in use (nullmove in this case).
Compared to plain full-width search I guess I could even think up methods
that are far better (+150 Elo points). The problem is that things like
sacrifices, pawn levers, checks etc. are not searched as deeply, so a
program becomes tactically handicapped when using FHR.

The nullmove assumption is backed by the evaluation of a search to depth-2,
whereas the FHR assumption is based on the evaluation of a search of depth 0.
So the assumption behind FHR is much weaker than the one behind nullmove. In
the best case FHR searches the same tree that nullmove (R=1) does (although
there the opponent still gets to make a move).

Another weak point of FHR is that there is no way to conclude afterwards
that the reduction has failed, whereas nullmove can detect this and still
search the entire tree at depth d instead of d-2.

In fact FHR is an advanced form of forward pruning based on the evaluation:

if eval - margin > beta then prune

has been transformed in FHR into:

if eval - margin > beta then prune a ply (so search one ply less).

If you do use FHR then you might as well extend this to Fail Reductions
(first described by Vincent Diepeveen in this email, but probably
already invented and tried a dozen or more times by others):

if eval - margin > beta
or eval + margin < alpha
then prune a ply

which definitely works better than FHR. If you prune, then prune them all!

I tested this algorithm 1-1.5 years ago in Diep (spring-summer 1995).
It works far better than Fail High Reductions, but performs miserably compared
to nullmove, not to mention compared to a program using a self-improved
version of nullmove!

Robert Hyatt
Oct 9, 1996

Vincent Diepeveen (vdie...@cs.ruu.nl) wrote:
: In <53bado$d...@shellx.best.com> jd...@best.com (Jon Dart) writes:
:
:
: >There is a mention in the new ICCA journal of an article presented at

: >the 8th Advances in Computer Chess conference on "Fail High reductions."
: >The author claims a 130 point or so ELO increase in the strength of
: >the program "Zugzwang" after implementing this technique.
: >
: >I looked around the net for the article and found it at
: >http://www.uni-paderborn.de/fachbereich/AG/monien/PERSONAL/obelix.html
: >(Rainer Feldmann's home page).
: >
: >The 130 point increase in strength is less impressive once you realize

: >that the program didn't use the null move heuristic before, and now
: >uses "failhigh reductions", which are a variation of the null move
: >heuristic. But still, I'd be interested in any opinions on this paper.
:
: >It's not clear that the Fail High reduction technique is better than
: >a straight implementation of null moves, but it may be worth a try.
:
: Why do you conclude that FH works better than nullmove?
: How many games did you play with your program?
:
: I think that FHreductions is another form of very
: dubious forward pruning, with an even worse assumption than nullmove.
:
: >By the way, the latest ICCA journal also has a nice article on
: >hashtable replacement algorithms that looks interesting. They found a
: >2-level table, with different replacement schemes in each table, generally
: >better than a single table.
:
: The 2 level table definitely works better.
:
: >--Jon
:
: Vincent Diepeveen

: vdie...@cs.ruu.nl
: --
: +--------------------------------------+
: || email : vdie...@cs.ruu.nl ||
: || Vincent Diepeveen ||
: +======================================+

Unless you are on a vector machine with high memory bandwidth. There the
Cray Blitz strategy is marginally better, at least until you have to run with
a hash table that is significantly smaller than optimal; then the Cray Blitz
hashing algorithm dominates the two-table scheme badly...

Vincent Diepeveen
Oct 10, 1996

I agree. In fact you use a single hashtable with a more-than-2-entry probe
strategy? The more entries you examine to see which one you can overwrite,
the better the replacement decision, and the better the hashtable should
perform. Seems logical.

I'm thinking about a replacement scheme with a single hashtable: hash to h
and then simply decide whether it is smarter to replace entry h, h+1, or h+2,
so a 3-probe strategy. The risk is of course chaining.

I guess this must already work better than the 2-table scheme, which is the
same as trying 2 entries in a single hashtable (assuming the 2-table scheme
has only half the number of entries the 1-table scheme has).
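A minimal C sketch of the 3-probe replacement just described; the entry
layout, the draft-based choice of victim and the names are illustrative only:

#include <stdint.h>

typedef struct {
    uint64_t key;
    int16_t  score;
    uint8_t  draft;   /* depth of the search below this position */
    uint8_t  flags;
} TTEntry;

#define TT_BITS 18
#define TT_SIZE (1u << TT_BITS)

static TTEntry table[TT_SIZE + 2];           /* +2 so h+1 and h+2 never run off */

void tt_store3(uint64_t key, int score, int draft, int flags)
{
    uint32_t h = (uint32_t)(key & (TT_SIZE - 1));
    TTEntry *victim = &table[h];

    for (int i = 1; i <= 2; i++)                 /* of h, h+1, h+2, overwrite */
        if (table[h + i].draft < victim->draft)  /* the entry with the        */
            victim = &table[h + i];              /* smallest draft            */

    victim->key   = key;
    victim->score = (int16_t)score;
    victim->draft = (uint8_t)draft;
    victim->flags = (uint8_t)flags;
}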

Counting the number of nodes definitely works better than storing depth,
although storing depth takes only around 6 bits and counting the number of
nodes something like 32 bits. This is probably the case because I use
nullmove, but on the other hand my feeling says that a move near the root is
more important, because another move could not give a cutoff that quickly.

Extended experiments I haven't tried, however. I simply lack the time at the
moment, and just before the DCCC I don't want to risk bugs in my
hashtable... :)

Robert Hyatt
Oct 10, 1996

Vincent Diepeveen (vdie...@cs.ruu.nl) wrote:
:
: I agree, in fact you use a 1 hashtable and more than a 2 tuple probe strategy?
: The more tuples you search to see whether you can update,
: the better the update, the better the hashtable should perform. Seems logical.

Right. And on the Cray the "other" entries are free to "fetch" and compare
with, since the "set" of entries forms a vector...

Here's what I did:

Take the low order "N" bits of the hash signature and use this as the
first address to probe in the table. Then take the next 8 bits and use
them as a constant offset to fetch 8 entries: one from the
original address, the next from the original address + offset, the next
from the original address + 2*offset, etc.

This gives a set of 8 entries, and, even better, it is a different set of
8 entries than the set of 8 produced by a different hash signature that just
happens to have the same low order N bits as this hash signature. The only
"Cray wrinkle" here is that the hash table has to be N+2048 entries long to
accommodate the max offset of 255 * 8 (actually less than 2048, but who's
counting on a machine with 2 gigawords (16 gigabytes) of main memory? :) )

Due to the vector hardware, loading 8 entries takes no more time than loading
one hash entry, and then extracting the "draft" to see which should be
replaced also becomes a vector operation. All very fast. On the PeeCee it's a
bad idea of course, since memory bandwidth is very low.
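In scalar C the address computation looks roughly like this. This is only a
sketch of the probing arithmetic; on the Cray the 8 loads and the draft
comparison are vector operations, which scalar C does not capture, and the
zero-stride guard and all names here are mine:

#include <stdint.h>

typedef struct {
    uint64_t key;
    int16_t  score;
    uint8_t  draft;   /* depth of the search below this position */
    uint8_t  flags;
} TTEntry;

#define N_BITS    18
#define BASE_SIZE (1u << N_BITS)

/* extra room so base + 7*stride never runs past the end (stride <= 255) */
static TTEntry table[BASE_SIZE + 7 * 255];

void tt_store_set8(uint64_t sig, int score, int draft, int flags)
{
    uint32_t base   = (uint32_t)(sig & (BASE_SIZE - 1));   /* low N bits  */
    uint32_t stride = (uint32_t)((sig >> N_BITS) & 0xff);  /* next 8 bits */
    if (stride == 0) stride = 1;          /* guard added for this sketch  */

    TTEntry *victim = &table[base];
    for (int k = 1; k < 8; k++) {             /* overwrite the entry with */
        TTEntry *e = &table[base + k * stride];   /* the smallest draft   */
        if (e->draft < victim->draft) victim = e; /* of the set of 8      */
    }
    victim->key   = sig;
    victim->score = (int16_t)score;
    victim->draft = (uint8_t)draft;
    victim->flags = (uint8_t)flags;
}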

:
: I'm thinking about a hashtable implementation replacement with 1 hashtable,

: hash h and just trying whether it is smart to replace tuple h, h+1, or h+2.
: So a 3 probe strategy. The risk is of course chaining.

Exactly. That's why I used the above two-level hashing scheme: while I
still get chains, they are different chains for hash signatures that map to
the same low order N bits but have a different 8-bit rehash value.

:
: I guess that this must already work better than the 2 table scheme,


: which is the same as trying 2 tuples in 1 hashtable (assuming that
: a 2 table scheme has only half the number of tuples the 1 table scheme has).

The major advantage of the 2-table approach is simplicity. The next advantage
is that you can use 50% more memory, since a single table can't be larger than
half of memory if it is constrained to power-of-two addressing (modulus is
slow on many machines).
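The power-of-two point in code, for reference (the sizes are arbitrary): with
a power-of-two table the index is a single AND, otherwise it is a modulus,
which is a divide on most mid-1990s CPUs.

#include <stdint.h>

#define POW2_ENTRIES (1u << 22)     /* power-of-two size: index by masking    */
#define ODD_ENTRIES  6000000u       /* arbitrary size: index needs a modulus  */

uint32_t index_mask(uint64_t key) { return (uint32_t)(key & (POW2_ENTRIES - 1)); }
uint32_t index_mod (uint64_t key) { return (uint32_t)(key % ODD_ENTRIES); }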

:
: Counting the number of nodes definitely works better than storing depth,


: although storing depth takes only around 6 bits, and counting number of nodes
: something like 32 bits. This will probably be the case because i use nullmove,
: but on the other hand my feeling says that a move near the root is
: more important, because another move could not give that quickly a cutoff.

I'm not counting nodes; I store "draft", which is the depth below this node.

:
: Extended experiments i haven't tried however. I simply lack time at this


: moment to do that, and just before DCCC i don't want to risk bugs in my
: hashtable... :)

:

Not a good place to have bugs. They are generally catastrophic there...


Urban Koistinen
Oct 11, 1996

Vincent Diepeveen (vdie...@cs.ruu.nl) wrote:
<cut>

: Counting the number of nodes definitely works better than storing depth,
: although storing depth takes only around 6 bits,
: and counting number of nodes something like 32 bits.

Use yyyyyxxx -> 1.xxx * 2^yyyyy

Compares are cheap, as they can be done on yyyyyxxx as an unsigned char.

To get probabilistic accuracy when adding:
1. add the numbers as long enough ints
2. put the top part of the new number c into d = yyyyyxxx
3. remove that top part from c
4. generate a random number in 0 .. 2^(yyyyy-3) - 1
5. if the random number is less than c, increment d

When overflowing, stay at all bits 1, or accept some losses when searching
more than 2^32 nodes.
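Concretely, a small C sketch of that encoding and the stochastic rounding.
The function names are mine, counts below 8 are simply clamped, and rand()
stands in for a real PRNG:

#include <stdint.h>
#include <stdlib.h>

/* decode yyyyyxxx: 1.xxx * 2^yyyyy = (8 + xxx) * 2^(yyyyy - 3) */
static uint64_t mf_decode(uint8_t d)
{
    int y = d >> 3;
    if (y < 3) y = 3;                        /* skip the sub-8 (denormal) range */
    return (uint64_t)(8 | (d & 7)) << (y - 3);
}

/* encode a node count, rounding up stochastically so E[decode(d)] == c */
static uint8_t mf_encode(uint64_t c)
{
    if (c < 8) c = 8;                        /* clamp to the minimum we handle  */
    int e = 0;
    while ((c >> e) >= 16) e++;              /* now (c >> e) is 1xxx in binary  */
    if (e + 3 > 31) return 0xff;             /* overflow: stick at all bits 1   */

    uint8_t  d   = (uint8_t)(((e + 3) << 3) | ((c >> e) & 7));
    uint64_t rem = c & (((uint64_t)1 << e) - 1);   /* the truncated low bits    */

    /* steps 4-5 above: increment with probability rem / 2^(yyyyy-3) */
    if (e > 0 && (uint64_t)rand() % ((uint64_t)1 << e) < rem && d < 0xff)
        d++;
    return d;
}

/* compares come for free: a bigger byte means a bigger count */
int mf_greater(uint8_t a, uint8_t b) { return a > b; }

/* add two encoded counters, e.g. when accumulating a subtree's node count */
uint8_t mf_add(uint8_t a, uint8_t b)
{
    return mf_encode(mf_decode(a) + mf_decode(b));
}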

: This will probably be the case because i use nullmove,


: but on the other hand my feeling says that a move near the root is
: more important, because another move could not give that quickly a cutoff.

: Extended experiments i haven't tried however. I simply lack time at this

Robert Hyatt
Oct 14, 1996

Urban Koistinen (m10...@abc.se) wrote:

: Vincent Diepeveen (vdie...@cs.ruu.nl) wrote:
: <cut>
: : Counting the number of nodes definitely works better than storing depth,
: : although storing depth takes only around 6 bits,
: : and counting number of nodes something like 32 bits.

I'm not sure there's a lot of difference between the two, from past tests.
I'll try again of course, but the last time I tested this I found the
depth-replacement idea slightly better, particularly if I use the liberal
check extensions I have now. These extensions tend to drive the node counts
for some branches way up, causing them to outweigh other branches that have
fewer checks. For tactical positions this might be better, but for overall
game play, depth replacement worked better as I recall. More later...

:
: Use yyyyyxxx -> 1.xxx * 2^yyyyy


:
: Compares are cheap as they can be done with yyyyyxxx as an unsigned char.
:
: To get probabilistic accuracy when adding:
: 1. add the numbers as long enough ints
: 2. put the top part of the new number c in d=yyyyyxxx
: 3. remove the top part from c
: 4. generate a random number 0..(-1+2^(yyyyy-3))
: 5. if the random number is less than c, increment d
: when overflowing, stay with all bits 1 or accept some losses when
: searching more than 2^32 nodes

not many are going to have to worry about searching even 2,000,000,000
nodes in a real game, much less an unsigned int's 4,000,000,000 nodes...

:
: : This will probably be the case because i use nullmove,

Vincent Diepeveen
Oct 14, 1996

In <53kq9d$1...@oden.abc.se> m10...@abc.se (Urban Koistinen) writes:

>Vincent Diepeveen (vdie...@cs.ruu.nl) wrote:
><cut>
>: Counting the number of nodes definitely works better than storing depth,
>: although storing depth takes only around 6 bits,
>: and counting number of nodes something like 32 bits.
>

>Use yyyyyxxx -> 1.xxx * 2^yyyyy
>
>Compares are cheap as they can be done with yyyyyxxx as an unsigned char.
>
>To get probabilistic accuracy when adding:
>1. add the numbers as long enough ints
>2. put the top part of the new number c in d=yyyyyxxx
>3. remove the top part from c
>4. generate a random number 0..(-1+2^(yyyyy-3))
>5. if the random number is less than c, increment d
>when overflowing, stay with all bits 1 or accept some losses when
>searching more than 2^32 nodes

Nice of you to describe this; I already use a similar method (not 8 bits,
but around 24 bits, so that I have about 8 bits for the hashtable:
32 + 8 + hashbits). 8 bits for this seemed not to be enough, as I tried
16 bits without success. The way I store it seems simpler, however, but
I don't want to get into implementation details.

The problem is that when you search around 12 ply (so after a night) in the
middlegame, you have searched roughly (12 hours) 4000 x 3600 x 12 = about
170 million nodes. That is quite a lot of nodes, and 8 bits gives you only
256 possibilities, which is more than storing the depth gives you, so it will
give better results; but when you store the count in, say, 24 bits, then you
have 16 million possibilities, which is a lot more than 256, and it works
accordingly.

This scheme, however, still loses a lot of bits compared to storing the depth:
24 - 6 = 18 bits, so you lose roughly 2 bytes per entry, and on PCs memory is
limited.

>: This will probably be the case because i use nullmove,
>: but on the other hand my feeling says that a move near the root is
>: more important, because another move could not give that quickly a cutoff.

>: Extended experiments i haven't tried however. I simply lack time at this
>: moment to do that, and just before DCCC i don't want to risk bugs in my
>: hashtable... :)

Urban Koistinen
Oct 15, 1996

Robert Hyatt (hy...@crafty.cis.uab.edu) wrote:
: Urban Koistinen (m10...@abc.se) wrote:

: : Vincent Diepeveen (vdie...@cs.ruu.nl) wrote:
: : <cut>
: : : Counting the number of nodes definitely works better than storing depth,
: : : although storing depth takes only around 6 bits,
: : : and counting number of nodes something like 32 bits.

: I'm not sure there's a lot of difference in the two from past tests. I'll


: try again of course, but the last time I tested this I found the depth
: replacement idea slightly better, particularly if I use the liberal check
: extensions I have now. These extensions tend to drive the node counts for
: some branches way high, causing them to outweigh other branches that have
: fewer checks. For tactical positions this might be better, but for overall
: game play, depth replacement worked better as I recall. More later...

: :
: : Use yyyyyxxx -> 1.xxx * 2^yyyyy


: :
: : Compares are cheap as they can be done with yyyyyxxx as an unsigned char.
: :
: : To get probabilistic accuracy when adding:
: : 1. add the numbers as long enough ints
: : 2. put the top part of the new number c in d=yyyyyxxx
: : 3. remove the top part from c
: : 4. generate a random number 0..(-1+2^(yyyyy-3))
: : 5. if the random number is less than c, increment d
: : when overflowing, stay with all bits 1 or accept some losses when
: : searching more than 2^32 nodes

: not many are going to have to worry about searching even 2,000,000,000


: nodes in a real game, much less an unsigned int's 4,000,000,000 nodes...

I had Crafty analyze a position for long enough to notice that you
don't use 64-bit ints for the node count (without reading the source).

Urban.K...@abc.se - e...@algonet.se

Urban Koistinen
Oct 16, 1996

Vincent Diepeveen (vdie...@cs.ruu.nl) wrote:
V: In <53kq9d$1...@oden.abc.se> m10...@abc.se (Urban Koistinen) writes:

V: >Vincent Diepeveen (vdie...@cs.ruu.nl) wrote:
V: ><cut>
V: >: Counting the number of nodes definitely works better than storing depth,
V: >: although storing depth takes only around 6 bits,
V: >: and counting number of nodes something like 32 bits.
V: >
V: >Use yyyyyxxx -> 1.xxx * 2^yyyyy
V: >
V: >Compares are cheap as they can be done with yyyyyxxx as an unsigned char.
V: >
V: >To get probabilistic accuracy when adding:
V: >1. add the numbers as long enough ints
V: >2. put the top part of the new number c in d=yyyyyxxx
V: >3. remove the top part from c
V: >4. generate a random number 0..(-1+2^(yyyyy-3))
V: >5. if the random number is less than c, increment d
V: >when overflowing, stay with all bits 1 or accept some losses when
V: >searching more than 2^32 nodes

V: Nice to describe this, i already use a similar
V: method (not 8 bits, but around 24 bits, so that i have an about 8 bits
V: for hashtable: 32 + 8 + hashbits). 8 bits to do this seemed not enough,
V: as i tried 16 bits without success. The way i store seems however simpeler,
V: but i don't want to get into implementation details.

Hard to compare without getting into details.

V: The problem is that when you search arount 12 ply (so after a night) at
V: middlegame,
V: you have searched around (12 hours) 4000x3600x12 = 4.36.12.10000
V: Which is quite a lot of nodes, and 8 bits givs you only 256 possibilities,
V: which is more than depth storing will give you, so it will give better
V: results, but when you store the entire number in say 24 bits, then you
V: have 16 million possibilities, which is a lot more than 256, and works
V: accordingly.

Are you referring to your own, simpler encoding?
My idea is to store only a small number of possibilities, but to choose
them so that the most important part is kept.

V: This scheme however still looses a lot of bits compared to depth storing.
V: 24 - 6 = 18 bits you loose around 2 bytes, at PC's memory is limited.

That is with your scheme.
Ah, so either you don't need the depth to decide whether a transposition
table entry can be used (deduced from the subtraction of 6), or you use the
depth as the exponent, but that is dangerous and likely inefficient.

With my scheme I carefully trade accuracy for bits.

Urban.K...@abc.se - e...@algonet.se

Robert Hyatt
Oct 19, 1996

Urban Koistinen (m10...@abc.se) wrote:

: Robert Hyatt (hy...@crafty.cis.uab.edu) wrote:
: : Urban Koistinen (m10...@abc.se) wrote:
: : : Vincent Diepeveen (vdie...@cs.ruu.nl) wrote:
: : : <cut>
: : : : Counting the number of nodes definitely works better than storing depth,
: : : : although storing depth takes only around 6 bits,
: : : : and counting number of nodes something like 32 bits.
:
: : I'm not sure there's a lot of difference in the two from past tests. I'll

: : try again of course, but the last time I tested this I found the depth
: : replacement idea slightly better, particularly if I use the liberal check
: : extensions I have now. These extensions tend to drive the node counts for
: : some branches way high, causing them to outweigh other branches that have
: : fewer checks. For tactical positions this might be better, but for overall
: : game play, depth replacement worked better as I recall. More later...
:
: : :
: : : Use yyyyyxxx -> 1.xxx * 2^yyyyy
: : :
: : : Compares are cheap as they can be done with yyyyyxxx as an unsigned char.
: : :
: : : To get probabilistic accuracy when adding:
: : : 1. add the numbers as long enough ints
: : : 2. put the top part of the new number c in d=yyyyyxxx
: : : 3. remove the top part from c
: : : 4. generate a random number 0..(-1+2^(yyyyy-3))
: : : 5. if the random number is less than c, increment d
: : : when overflowing, stay with all bits 1 or accept some losses when
: : : searching more than 2^32 nodes
:
: : not many are going to have to worry about searching even 2,000,000,000

: : nodes in a real game, much less an unsigned int's 4,000,000,000 nodes...
:
: I had crafty analyze a position for long enough to notice that you
: don't use 64 bit ints for node count. (without reading the source)
:
: Urban.K...@abc.se - e...@algonet.se

Jeez. :) You are right... unsigned int. More than 4 billion nodes,
eh? Jeez...
