
Hash functions for use with a transposition table.


Tim Foden

Mar 5, 1997

I am currently writing a chess program (aren't we all!) and right now I'm
fiddling with transposition tables.

My hash function code is:

class Board
{
    ...
    UINT8 m_sboard[64];     // simple board
    ...
public:
    ...
    UINT32 Hash() const;
    ...
};

UINT32 Board::Hash() const
{
    UINT32 hash = 0;
    const UINT8* pPiece = &m_sboard[0];

    for( int i = 0; i < 64; i++ )
    {
        hash += (hash << 3) + (hash << 5) + *pPiece++;  // hash = hash*41 + piece
    }

    return hash;
}


where:
0x0 = empty,
0x1..0x6 = black pawn..king,
0x9..0xe = white pawn..king,

My current problem is that this sometimes produces the same hash value
from positions that are different. Can anyone suggest a more reliable
hash, so I don't have to store a version of the board in every
transposition table record?

Maybe making the hash 64 bits would be enough?

thanks, Tim Foden.


Robert Hyatt

Mar 5, 1997

Tim Foden (t...@diamond.prestel.co.uk) wrote:

: my current problem is that this sometimes produces hashes with the
: same value from positions that are different. can anyone suggest a
: more reliable hash so I don't have to store a version of the board
: in every transposition table record.

: maybe making the hash 64 bits would be enough?

: thanks, Tim Foden.

Most of us are using the "Zobrist" algorithm...

Assuming you have a board[64], with values of 0 = empty, 1..6 = white
pieces, and -1..-6 = black pieces, do this:

create an array of 64-bit integers random[64][13]...

initialize it to random numbers.

then compute the hash as follows:

hash = 0;
for (sq = 0; sq < 64; sq++)
    if (board[sq]) hash ^= random[sq][board[sq]+6];

and you have a nice hash key. Notice that for any non-empty square,
you exclusive-or in the random value.

Another big advantage here is that if you want to move a piece from
square 0 to square 7 (let's assume this is a white rook, value=4 in
the range 1-6), then:

hash ^= random[0][10];    // 10 = 4+6, biasing piece values from -6..+6
hash ^= random[7][10];    // into 0..12, our valid subscript range

notice that this takes the original hash computed with the loop, and
dynamically updates it as a piece is moved, without having to loop again,
by only changing the hash signature using the two squares that change.

this works because if you XOR a value with the same number two times,
you get the original value... undoing it...

Hope this helps. You can find the precise code (if you'd like to see it
in real life) buried in Crafty. MakeMove and UnMakeMove update the hash
signature, while init.c initially computes the hash signature after a
position is set up...
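
A minimal C sketch of the scheme just described, for readers who want to see it
end to end. The table layout follows the random[64][13] idea; the stand-in PRNG
and the function names are illustrative, not Crafty's actual code:

#include <stdint.h>
#include <stdlib.h>

/* one 64-bit random value per (square, piece) pair; index 0..12 covers
   piece values -6..+6, with index 6 = empty (never used)               */
static uint64_t zobrist[64][13];

static uint64_t rand64(void)              /* stand-in PRNG, for illustration */
{
    uint64_t r = 0;
    for (int i = 0; i < 4; i++)
        r = (r << 16) ^ (uint64_t)(rand() & 0xFFFF);
    return r;
}

void init_zobrist(void)
{
    for (int sq = 0; sq < 64; sq++)
        for (int p = 0; p < 13; p++)
            zobrist[sq][p] = rand64();
}

/* full hash: board[sq] is 0 = empty, 1..6 = white, -1..-6 = black */
uint64_t full_hash(const int board[64])
{
    uint64_t hash = 0;
    for (int sq = 0; sq < 64; sq++)
        if (board[sq])
            hash ^= zobrist[sq][board[sq] + 6];
    return hash;
}

/* incremental update for a non-capturing move: XOR the piece off its old
   square and onto the new one; a capture would also XOR out the victim  */
uint64_t update_hash(uint64_t hash, int piece, int from, int to)
{
    hash ^= zobrist[from][piece + 6];
    hash ^= zobrist[to][piece + 6];
    return hash;
}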

Dap Hartmann

Mar 5, 1997

hy...@crafty.cis.uab.edu (Robert Hyatt) wrote:

>create an array of 64 bit integers random[64][13]...
>initialize it to random numbers.

Does anyone have any experience using different sets of
random numbers? In Dappet, we use the same set all the time,
thus ensuring repeatable results. Years ago, I established a
set of random numbers that seemed to work best. I was very
surprised how much influence the choice of random numbers
had on the program's performance. After mindlessly trying
out various sets, I simply settled on the one that seemed to
perform best overall. I guess what you really need, ideally,
is a set of 'random' numbers which truly minimize the chances
of collisions.

Also:
Why don't we start a nice discussion on the topic of the aging
of entries in a transposition table. Here's what Dappet does:
After every move (carried out on the board), the length on
which the move+score in the table is based is decremented
for all entries in the table, thus ensuring that older entries
will eventually disappear (get overwritten).
The idea is that entries based on very deep lines may be
useful for a longer period than shorter ones, but that they
too have to be disposed of at some point.
We also use the double hashtable approach, in which an
entry is overwritten in Table1, only if the length of the
present analysis is longer than what is already stored in
Table1. Otherwise, it is simply stored in Table2, which is always
overwritten (and acts as the short-term memory, so to speak).
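
A rough sketch of the aging idea described above, assuming a simple entry
layout (the struct and names are illustrative, not Dappet's):

#include <stdint.h>
#include <stddef.h>

typedef struct {
    uint64_t key;     /* full hash signature */
    int16_t  score;
    uint8_t  depth;   /* draft (remaining search depth) the entry is based on */
    uint8_t  move;    /* best move, encoded */
} Entry;

#define TABLE_SIZE (1u << 20)
static Entry table[TABLE_SIZE];

/* After a move is actually played on the board, decay every entry's stored
   depth by one, so old results gradually lose replacement priority and
   eventually get overwritten. */
void age_table(void)
{
    for (size_t i = 0; i < TABLE_SIZE; i++)
        if (table[i].depth > 0)
            table[i].depth--;
}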

dap


Tim Foden

Mar 5, 1997


Dap Hartmann <d...@abitibi.harvard.edu> wrote in article
<331d7...@cfanews.harvard.edu>...


> hy...@crafty.cis.uab.edu (Robert Hyatt) wrote:
>
> >create an array of 64 bit integers random[64][13]...
> >initialize it to random numbers.

I have done this, and I still get around 39 collisions of
the same hash value from different boards when I search around
500,000 nodes (6 ply + quiesce, black's first move after Nf3,
28.5 secs). This is about the same as (or possibly worse than)
the hash function I used before. I suspect that it may be
because Microsoft's random number generator is cr^h^huseless.
Maybe I should try a better generator, but I haven't got one handy.

> Does anyone have any experience using different sets of
> random numbers? In Dappet, we use the same set all the time,
> thus ensuring repeatable results. Years ago, I established a
> set of random numbers that seemed to work best. I was very
> surprised how much influence the choice of random numbers
> had on the program's performance. After mindlessly trying
> out various sets, I simply settled on the one that seemed to
> perform best overall. I guess what you really need, ideally,
> is a set of 'random' numbers which truely minimize the chances
> of collisions.

I would be interested in your set of numbers!

> Also:
> Why don't we start a nice discussion on the topic of the aging
> of entries in a transposition tabel. Here's what Dappet does:
> After every move (carried out on the board), the length on
> which the move+score in the table is based is decremented
> for all entries in the table, thus ensuring that older entries
> will eventually disappear (get overwritten).
> The idea is that entries based on very deep lines may be
> useful for a longer period than shorter ones, but that they
> too have to be disposed of at some point.
> We also use the double hashtable approach, in which an
> entry is overwritten in Table1, only if the length of the
> present analysis is longer than what is already stored in
> Table1. Otherwise, it is simply stored in Table2, which is always
> overwritten (and acts as the short-term memory, so to speak).
>
> dap
>
>

I do something I read in an article I found on the web (I can't find
the printout right now to give a reference). Basically all items
in the table have a STALE flag. Whenever you start a new iterative
deepening search you set the stale flag for all items in the table.
If you do a lookup in the table, and find a stale, but otherwise
valid entry, you set the STALE flag to FALSE, and use the entry
as normal. If, when adding to the table, you find a stale entry,
you simply overwrite it.

Also, when adding entries to the table, if the previous entry is
not stale, you compare the depth values to keep the deepest
records.
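
A minimal sketch of this STALE-flag scheme; the entry layout and names here are
made up for illustration:

#include <stdint.h>
#include <stddef.h>
#include <stdbool.h>

typedef struct {
    uint64_t key;
    int16_t  score;
    uint8_t  depth;
    bool     stale;
} Entry;

#define TABLE_SIZE (1u << 20)
static Entry table[TABLE_SIZE];

/* at the start of each new iterative-deepening search, mark everything stale */
void mark_all_stale(void)
{
    for (size_t i = 0; i < TABLE_SIZE; i++)
        table[i].stale = true;
}

/* probe: a stale but otherwise valid entry is revived and used as normal */
Entry *probe(uint64_t key)
{
    Entry *e = &table[key & (TABLE_SIZE - 1)];
    if (e->key == key) {
        e->stale = false;
        return e;
    }
    return NULL;
}

/* store: stale entries are overwritten unconditionally; otherwise keep
   whichever record was searched to the greater depth */
void store(uint64_t key, int16_t score, uint8_t depth)
{
    Entry *e = &table[key & (TABLE_SIZE - 1)];
    if (e->stale || depth >= e->depth) {
        e->key = key;
        e->score = score;
        e->depth = depth;
        e->stale = false;
    }
}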

Valavan Manohararajah

Mar 5, 1997

Dap Hartmann wrote:

> hy...@crafty.cis.uab.edu (Robert Hyatt) wrote:
>
> >create an array of 64 bit integers random[64][13]...
> >initialize it to random numbers.
>
> Does anyone have any experience using different sets of
> random numbers? In Dappet, we use the same set all the time,
> thus ensuring repeatable results. Years ago, I established a
> set of random numbers that seemed to work best. I was very
> surprised how much influence the choice of random numbers
> had on the program's performance. After mindlessly trying
> out various sets, I simply settled on the one that seemed to
> perform best overall. I guess what you really need, ideally,
> is a set of 'random' numbers which truely minimize the chances
> of collisions.

In Rajah, I also have a fixed set of random numbers that I use.
I wrote a program, driven by a random number generator, that
returns numbers that are a certain Hamming distance away from
all other numbers generated so far... Currently I think my
minimum Hamming distance is 23.
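
A rough sketch of that kind of generator: draw candidate 64-bit values and keep
only those at least some Hamming distance from everything accepted so far. The
helper names and stand-in PRNG are illustrative, not Rajah's code, and for large
minimum distances the rejection loop can take a while:

#include <stdint.h>
#include <stdlib.h>

#define NKEYS    (64 * 13)   /* e.g. one key per (square, piece) pair */
#define MIN_DIST 23

static int popcount64(uint64_t x)
{
    int n = 0;
    while (x) { x &= x - 1; n++; }
    return n;
}

static uint64_t rand64(void)              /* stand-in PRNG, for illustration */
{
    uint64_t r = 0;
    for (int i = 0; i < 4; i++)
        r = (r << 16) ^ (uint64_t)(rand() & 0xFFFF);
    return r;
}

/* fill keys[] so that every pair differs in at least MIN_DIST bit positions */
void generate_keys(uint64_t keys[NKEYS])
{
    int count = 0;
    while (count < NKEYS) {
        uint64_t cand = rand64();
        int ok = 1;
        for (int i = 0; i < count; i++)
            if (popcount64(cand ^ keys[i]) < MIN_DIST) { ok = 0; break; }
        if (ok)
            keys[count++] = cand;
    }
}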

> Also:
> Why don't we start a nice discussion on the topic of the aging
> of entries in a transposition tabel. Here's what Dappet does:
> After every move (carried out on the board), the length on
> which the move+score in the table is based is decremented
> for all entries in the table, thus ensuring that older entries
> will eventually disappear (get overwritten).
> The idea is that entries based on very deep lines may be
> useful for a longer period than shorter ones, but that they
> too have to be disposed of at some point.
> We also use the double hashtable approach, in which an
> entry is overwritten in Table1, only if the length of the
> present analysis is longer than what is already stored in
> Table1. Otherwise, it is simply stored in Table2, which is always
> overwritten (and acts as the short-term memory, so to speak).
>
> dap

Nice technique for aging the table entries.... I currently use
a simple "old" flag. I set it to 1 everytime a new search starts.
....this gets pretty slow for very large tables.


brucemo

Mar 5, 1997

Dap Hartmann wrote:

> Does anyone have any experience using different sets of
> random numbers? In Dappet, we use the same set all the time,
> thus ensuring repeatable results. Years ago, I established a
> set of random numbers that seemed to work best. I was very
> surprised how much influence the choice of random numbers
> had on the program's performance. After mindlessly trying
> out various sets, I simply settled on the one that seemed to
> perform best overall. I guess what you really need, ideally,
> is a set of 'random' numbers which truely minimize the chances
> of collisions.

How did you measure performance? Perhaps all you proved is that your methods of
measuring performance don't work.

> Why don't we start a nice discussion on the topic of the aging
> of entries in a transposition tabel. Here's what Dappet does:
> After every move (carried out on the board), the length on
> which the move+score in the table is based is decremented
> for all entries in the table, thus ensuring that older entries
> will eventually disappear (get overwritten).
> The idea is that entries based on very deep lines may be
> useful for a longer period than shorter ones, but that they
> too have to be disposed of at some point.
> We also use the double hashtable approach, in which an
> entry is overwritten in Table1, only if the length of the
> present analysis is longer than what is already stored in
> Table1. Otherwise, it is simply stored in Table2, which is always
> overwritten (and acts as the short-term memory, so to speak).

I asked a question about hashing, in the chess newsgroup in 1994.

Ken Thompson answered it, and I implemented his answer.

His answer involved two tables.

1) This table is "always replace".

2) This table is "replace if deeper, or if this entry is left over from a
previous search".

This works fine and I don't have to clear my hash table.
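
A minimal sketch of that two-table scheme (one always-replace table, one that
keeps the deeper or newer entry); the entry layout and the search-id bookkeeping
are illustrative, not anyone's actual implementation:

#include <stdint.h>

typedef struct {
    uint64_t key;
    int16_t  score;
    uint8_t  depth;
    uint8_t  search_id;   /* which root search stored this entry */
} Entry;

#define TABLE_SIZE (1u << 20)
static Entry always_table[TABLE_SIZE];  /* table 1: always replace            */
static Entry depth_table[TABLE_SIZE];   /* table 2: replace if deeper or stale */
static uint8_t current_search_id;

void new_root_search(void) { current_search_id++; }

void store(uint64_t key, int16_t score, uint8_t depth)
{
    uint32_t slot = (uint32_t)(key & (TABLE_SIZE - 1));
    Entry e = { key, score, depth, current_search_id };

    always_table[slot] = e;              /* table 1 is overwritten every time */

    Entry *d = &depth_table[slot];       /* table 2: deeper result, or one    */
    if (d->search_id != current_search_id || depth >= d->depth)
        *d = e;                          /* left over from a previous search  */
}

/* probing checks both tables; nothing ever needs to be cleared */
const Entry *probe(uint64_t key)
{
    uint32_t slot = (uint32_t)(key & (TABLE_SIZE - 1));
    if (depth_table[slot].key == key)  return &depth_table[slot];
    if (always_table[slot].key == key) return &always_table[slot];
    return 0;
}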

bruce

Anders Thulin

Mar 5, 1997

In article <01bc2901$511f8ea0$0100007f@fork>,
Tim Foden <t...@diamond.prestel.co.uk> wrote:

>my current problem is that this sometimes produces hashes with the
>same value from positions that are different.

That's how hashing works: as you're not using the unique key, but
the value hash(key), and the range of the hashing function is much
smaller than the key range, you can't assume you can avoid collisions.
(Unless you use perfect hashing, and that's practical only when you
have a relatively small set of keys, or the keys have a structure
that's easily analyzed in terms of perfect hashing.)

The value hash(key) always gives you a set of possible hits. A good
hashing function will keep the average set size small, but can't
guarantee singleton sets.

Thus, what you ask for is impossible. You always need to verify
that the key of the found item is the same as the key of the
position you're looking for.

>can anyone suggest a more reliable hash so I don't have to store a
>version of the board in every transposition table record.

As long as you use a hash function that reduces the number of
significant bits in the key, you can't.

>maybe making the hash 64 bits would be enough?

You'd need at least the minimum number of bits to represent a chess
position to guarantee that no collisions will occur. The theoretical
minimum is very expensive to reach, but a reasonably practical method
requires 179 bits, worst case. And if that isn't in the FAQ yet, it
ought to be -- it used to be a recurring thread in this news group.

--
Anders Thulin Anders...@lejonet.se 013 - 23 55 32
Telia Engineering AB, Teknikringen 2B, S-583 30 Linkoping, Sweden

Marcel van Kervinck

Mar 6, 1997

Robert Hyatt (hy...@crafty.cis.uab.edu) wrote:

> create an array of 64 bit integers random[64][13]...

> hash=0;
> for sq=0 to 63
> if (board[sq]) hash^=random[sq][board[sq]+6];

One question: Why waste time hashing empty squares?
IMO, zobrist[12][64] would be just fine.

Marcel
-- _ _
_| |_|_|
|_ |_ Marcel van Kervinck
|_| mar...@stack.nl

Robert Hyatt

Mar 6, 1997

Dap Hartmann (d...@abitibi.harvard.edu) wrote:
: hy...@crafty.cis.uab.edu (Robert Hyatt) wrote:

: >create an array of 64 bit integers random[64][13]...

: >initialize it to random numbers.

: Does anyone have any experience using different sets of


: random numbers? In Dappet, we use the same set all the time,
: thus ensuring repeatable results. Years ago, I established a
: set of random numbers that seemed to work best. I was very
: surprised how much influence the choice of random numbers
: had on the program's performance. After mindlessly trying
: out various sets, I simply settled on the one that seemed to
: perform best overall. I guess what you really need, ideally,
: is a set of 'random' numbers which truely minimize the chances
: of collisions.

: Also:
: Why don't we start a nice discussion on the topic of the aging


: of entries in a transposition tabel. Here's what Dappet does:
: After every move (carried out on the board), the length on
: which the move+score in the table is based is decremented
: for all entries in the table, thus ensuring that older entries
: will eventually disappear (get overwritten).
: The idea is that entries based on very deep lines may be
: useful for a longer period than shorter ones, but that they
: too have to be disposed of at some point.
: We also use the double hashtable approach, in which an
: entry is overwritten in Table1, only if the length of the
: present analysis is longer than what is already stored in
: Table1. Otherwise, it is simply stored in Table2, which is always
: overwritten (and acts as the short-term memory, so to speak).

: dap

I simply store a 3-bit ID. At the start of each new search (after making a
move) I compute:

trans_id = (trans_id + 1) & 7;
if (trans_id == 0) trans_id++;

this gives me a number between 1 and 7. When I store, if the IDs are different,
I always overwrite, since the entry is from an old search. I bypass "0" so I
can use that for a "permanent entry" for the position learning...
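
A small sketch of how such a rolling ID feeds into replacement; the layout and
names are illustrative, not Crafty's actual code:

#include <stdint.h>

typedef struct {
    uint64_t key;
    int16_t  score;
    uint8_t  depth;
    uint8_t  id;        /* 3-bit search id (1..7); Crafty reserves 0 for
                           permanent position-learning entries             */
} Entry;

#define TABLE_SIZE (1u << 20)
static Entry table[TABLE_SIZE];
static uint8_t trans_id;    /* call new_search() once before the first search */

/* called once per new root search, after a move is played on the board;
   no table sweep is needed, old entries simply become recognizably old   */
void new_search(void)
{
    trans_id = (trans_id + 1) & 7;
    if (trans_id == 0)
        trans_id = 1;
}

void store(uint64_t key, int16_t score, uint8_t depth)
{
    Entry *e = &table[key & (TABLE_SIZE - 1)];

    /* a different id means the entry is from an old search: always overwrite;
       within the current search, prefer the deeper draft                     */
    if (e->id != trans_id || depth >= e->depth) {
        e->key   = key;
        e->score = score;
        e->depth = depth;
        e->id    = trans_id;
    }
}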


Robert Hyatt

Mar 6, 1997

Valavan Manohararajah (man...@bnr.ca) wrote:
: Dap Hartmann wrote:

: hy...@crafty.cis.uab.edu (Robert Hyatt) wrote:

: >create an array of 64 bit integers random[64][13]...
: >initialize it to random numbers.

: Does anyone have any experience using different sets of
: random numbers? In Dappet, we use the same set all the time,
: thus ensuring repeatable results. Years ago, I established a
: set of random numbers that seemed to work best. I was very
: surprised how much influence the choice of random numbers
: had on the program's performance. After mindlessly trying
: out various sets, I simply settled on the one that seemed to
: perform best overall. I guess what you really need, ideally,
: is a set of 'random' numbers which truely minimize the chances
: of collisions.

: In Rajah, I also have a fixed set of random numbers that I use.


: I wrote a program driven by a random number generator that
: returns numbers that are a certain hamming distance away from
: all other numbers generated so far..... Currently I think my
: minimum hamming distance is 23.

: Also:


: Why don't we start a nice discussion on the topic of the aging
: of entries in a transposition tabel. Here's what Dappet does:
: After every move (carried out on the board), the length on
: which the move+score in the table is based is decremented
: for all entries in the table, thus ensuring that older entries
: will eventually disappear (get overwritten).
: The idea is that entries based on very deep lines may be
: useful for a longer period than shorter ones, but that they
: too have to be disposed of at some point.
: We also use the double hashtable approach, in which an
: entry is overwritten in Table1, only if the length of the
: present analysis is longer than what is already stored in
: Table1. Otherwise, it is simply stored in Table2, which is always
: overwritten (and acts as the short-term memory, so to speak).

: dap

: Nice technique for aging the table entries.... I currently use


: a simple "old" flag. I set it to 1 everytime a new search starts.
: ....this gets pretty slow for very large tables.

Try the "rolling ID" trick most of us use. a counter that goes from 1 to 7
and then rolls back to 1, where you roll it at the top of Iterate() before
you start a search after making a move. Then you only inc the counter and
go, and don't have to cycle thru the whole table. If the current ID is not
the same as the ID stored in the table entry, it's old...


Robert Hyatt

Mar 6, 1997

Marcel van Kervinck (mar...@stack.nl) wrote:
: Robert Hyatt (hy...@crafty.cis.uab.edu) wrote:

: > create an array of 64 bit integers random[64][13]...

: > hash=0;


: > for sq=0 to 63
: > if (board[sq]) hash^=random[sq][board[sq]+6];

: One question: Why waste time hashing empty squares?
: IMO, zobrist[12][64] would be just fine.

I don't... but if you do the above, you have to take the trouble to
convert the value of a square to the range 0-11, when, in fact, they
are in the range 0-12... At present, I just use square_value+6 and
probe, without having to compress the zero out before I index...


Robert Hyatt

Mar 6, 1997

Tim Foden (t...@diamond.prestel.co.uk) wrote:


: Dap Hartmann <d...@abitibi.harvard.edu> wrote in article
: <331d7...@cfanews.harvard.edu>...


: > hy...@crafty.cis.uab.edu (Robert Hyatt) wrote:
: >
: > >create an array of 64 bit integers random[64][13]...

: > >initialize it to random numbers.

: I have done this, and I still get around 39 collisions of


: the same hash value from different boards when I search around
: 500,000 nodes (6 ply + quiesce, blacks first move after Nf3,
: 28.5 secs). This is about the same (or possibly worse) than
: the hash function I used before. I suspect that it may be
: because Microsoft's random number generator is cr^h^huseless.
: Maybe I should try a better generator, but I haven't got one handy.

Two things. First, mathematically this should not happen if you are using
real 64-bit values. So are you sure the things are 64 bits?
Second, watch how you generate them. If you call an RNG that produces numbers
in the range 0<=N<1 (i.e. a floating-point number) and then just stuff this into
an int (using a union so it doesn't get converted to zero, of course), the upper
end of that number is not random, it's the exponent, which is the same or
nearly the same for all of the numbers generated.

If you really have 64bit random integers, and you have verified that there are
no (or only a few) duplicates, you should go for much longer than 500K nodes
before you get the number of hits you mention. Notice that "0" hits is not
possible, because you are hashing....
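
A hedged sketch of building 64-bit values by piecing together integer draws
(never reinterpreted floats) and sanity-checking them for duplicates; the
generator below is only a stand-in for a known-good RNG:

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define NKEYS (64 * 13)

static uint32_t rand32(void)              /* stand-in: use a known-good RNG */
{
    return ((uint32_t)(rand() & 0x7FFF) << 17) ^
           ((uint32_t)(rand() & 0x7FFF) << 2)  ^
           ((uint32_t)rand() & 0x3);
}

/* piece two 32-bit integer draws together to make one 64-bit key */
static uint64_t rand64(void)
{
    return ((uint64_t)rand32() << 32) | rand32();
}

int main(void)
{
    static uint64_t keys[NKEYS];
    for (int i = 0; i < NKEYS; i++)
        keys[i] = rand64();

    /* cheap sanity check: any duplicate key guarantees false matches */
    int dups = 0;
    for (int i = 0; i < NKEYS; i++)
        for (int j = i + 1; j < NKEYS; j++)
            if (keys[i] == keys[j])
                dups++;
    printf("%d duplicate keys out of %d\n", dups, NKEYS);
    return 0;
}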


: > Does anyone have any experience using different sets of
: > random numbers? In Dappet, we use the same set all the time,
: > thus ensuring repeatable results. Years ago, I established a
: > set of random numbers that seemed to work best. I was very
: > surprised how much influence the choice of random numbers
: > had on the program's performance. After mindlessly trying
: > out various sets, I simply settled on the one that seemed to
: > perform best overall. I guess what you really need, ideally,
: > is a set of 'random' numbers which truely minimize the chances
: > of collisions.

: I would be interested in your set of numbers!

: > Also:
: > Why don't we start a nice discussion on the topic of the aging
: > of entries in a transposition tabel. Here's what Dappet does:
: > After every move (carried out on the board), the length on
: > which the move+score in the table is based is decremented
: > for all entries in the table, thus ensuring that older entries
: > will eventually disappear (get overwritten).
: > The idea is that entries based on very deep lines may be
: > useful for a longer period than shorter ones, but that they
: > too have to be disposed of at some point.
: > We also use the double hashtable approach, in which an
: > entry is overwritten in Table1, only if the length of the
: > present analysis is longer than what is already stored in
: > Table1. Otherwise, it is simply stored in Table2, which is always
: > overwritten (and acts as the short-term memory, so to speak).
: >
: > dap
: >

: >

: I do something I read in an article I found on the web (I can't find

Vincent Diepeveen

Mar 6, 1997

In <01bc29ac$a28eea00$724698c2@fork> "Tim Foden" <t...@diamond.prestel.co.uk> writes:

><331d7...@cfanews.harvard.edu>...
>> hy...@crafty.cis.uab.edu (Robert Hyatt) wrote:
>>
>> >create an array of 64 bit integers random[64][13]...
>> >initialize it to random numbers.
>
>I have done this, and I still get around 39 collisions of
>the same hash value from different boards when I search around
>500,000 nodes (6 ply + quiesce, blacks first move after Nf3,
>28.5 secs). This is about the same (or possibly worse) than
>the hash function I used before. I suspect that it may be
>because Microsoft's random number generator is cr^h^huseless.
>Maybe I should try a better generator, but I haven't got one handy.

One should not use the random generator of a computer to fill these
arrays. If you plot these numbers in a graph you get several lines
("multi-lineair verband" in Dutch: a multi-linear relationship). Also, 64-bit
numbers that are generated out of 32-bit randoms seem not to work.

I use some random numbers from a program called Mathematica.
These numbers I put after each other. Next I turned on overwrite mode.
Then the big job started. I randomly here and there overtyped numbers,
randomly scrolling up and down. This seems the best way to get random
numbers. Just the numbers from Mathematica gave about one collision every
500,000 - 1,000,000 nodes, whereas the current implementation
hardly gives any. It is of course quite complicated to measure the
number of collisions. You need to increase the tuple size for this.
My way of detecting collisions was to check whether the move
stored in the hashtable was a legal one. This excludes collisions that give
a cutoff (I first try to get a cutoff; counting was done afterwards), or that
accidentally are positions where the move is legal (which is no accident,
as only doing a few moves usually gives a position with a high chance that
a certain move there is also (semi)legal).

The collision rate has now dropped, especially since I began storing
more bits in the transposition tuples, but I still sometimes think it suffers
from the limited number of bits I store in the tuples. I currently use
around 18+32+8 = 58 bits (depending on the size of the transposition table).
This is only 6 bits less than the optimal 64
bits. The 58 bits will practically be reduced as I use more than one
probe, I guess.

The number of bits used in the transposition table seems to give more problems
than the randomness of Mathematica versus the current randomness.

>
>I do something I read in an article I found on the web (I can't find
>the printout right now to give a reference). Basically all items
>in the table have a STALE flag. Whenever you start a new iterative
>deeping search you set the stale flag for all items in the table.
>If you do a lookup in the table, and find a stale, but otherwise
>valid entry, you set the STALE flag to FALSE, and use the entry
>as normal. If, when adding to the table, you find a stale entry,
>you simply overwrite it.

After doing a move I always clear all tables entirely. This is a waste
of results, but it prevents the problem that you get different results
when different parameters give different evaluations for the same position x.

>Also, when adding entries to the table, if the previous entry is
>not stale, you compare the depth values to keep the deepest
>records.

Multi-probe, where you compare the number of nodes needed for a branch, seems
to work the best. This increases the tuple size, but works great.

Vincent

--
+----------------------------------------------------+
| Vincent Diepeveen email: vdie...@cs.ruu.nl |
| http://www.students.cs.ruu.nl/~vdiepeve/ |
+----------------------------------------------------+

Robert Hyatt

Mar 6, 1997

Vincent Diepeveen (vdie...@cs.ruu.nl) wrote:

: In <01bc29ac$a28eea00$724698c2@fork> "Tim Foden" <t...@diamond.prestel.co.uk> writes:

: ><331d7...@cfanews.harvard.edu>...
: >> hy...@crafty.cis.uab.edu (Robert Hyatt) wrote:
: >>
: >> >create an array of 64 bit integers random[64][13]...
: >> >initialize it to random numbers.
: >
: >I have done this, and I still get around 39 collisions of
: >the same hash value from different boards when I search around
: >500,000 nodes (6 ply + quiesce, blacks first move after Nf3,
: >28.5 secs). This is about the same (or possibly worse) than
: >the hash function I used before. I suspect that it may be
: >because Microsoft's random number generator is cr^h^huseless.
: >Maybe I should try a better generator, but I haven't got one handy.

: One should not use the random generator of a computer to fill these
: arrays. If you put these numbers in a graph you get several lines
: (multi-lineair verband in Dutch). Also 64 bits numbers that are generated
: out of 32 bits random seems not to work.

If you use a good RNG, the results are pretty good. My RNG seeds came from
Mathematica, the RNG code is well-known. I generate 32 bit values, and then
piece them together to get 64bits. I've also run classic RNG tests on these
numbers to confirm that they are random, don't cycle too quickly, don't have
any correlation to speak of, nor have excessive (or too few) runs, and so forth.


: I use some randomnumbers from a program called mathematica.


: These numbers i put after each other. Next i turned on overwrite mode.
: Then the big job started. I randomly here and there overtyped numbers,
: randomly scrolling up and down. This seems the best way to get random
: numbers.

Actually, such numbers turn out to be not very random. You should run 'em
through a few tests, like poker, runs, uniformity, etc...


: Just the numbers of Mathematica gave about once every

Or do as I do in Crafty. Remember the parameters you might adjust, then
call the pre-analyzer to adjust them, and if none are changed, leave the
trans table stuff alone. If anything changes, zero *only* the score part
of each entry, because the move ordering is still probably very accurate.


: >Also, when adding entries to the table, if the previous entry is


: >not stale, you compare the depth values to keep the deepest
: >records.

: Multi-probe where you compare the number of nodes needed for a branch seems
: to work the best. This increases tuplesize, but works great.

I don't like the node count. If a position lies along the PV, it will have a
high node count. Then if you change from that to another move, all of those
entries will also have high node counts. And if you search a position that
stimulates lots of search extensions it too will have a high node count. And
if you search a position with alpha,alpha+1, it will have a lower node count
than the same position searched with alpha,beta where beta > alpha+1. When
I played with this, I didn't like the results. You may be doing it better than
I did of course...


: Vincent

Dap Hartmann

Mar 6, 1997

d...@abitibi.harvard.edu (Dap Hartmann) wrote:

Remy Coulom sent me an email:

I have read in his PhD thesis that Jean-Christophe Weill does not use
random values, but codes that are computed using the theory of
error-correcting codes (BCH codes, or something like that). I personally
use random codes, because I have never found any explanation of what
these BCH codes are. The theory says that BCH codes give a lower
collision probability than random codes.

I would love to know more about the theory of error-correcting codes
and will try to find a book about them when I have time. I unfortunately
can not say more about this. Probably the best solution would be to
ask Jean-Christophe Weill directly.

Remi

(I did not post this in the newsgroup because I am not allowed to.
You can post my letter if you think it is worth it)

I believe Aske Plaat told me that JCW's thesis is available
on line, as a postscript file.

I asked him how come he was not allowed to post anything;
This was his response:

When I start the news reader, it tells me :

NOTE: This machine does not have permission to post articles.
Please don't waste your time trying.

I asked an admin once, but he gave me no good answers. Probably some
students abused posting in the net once, and my school decided not to
allow people to post in the newsgroup any more. Anyway, I find it is
a pain, because I love the rec.games.chess.computer newsgroup and
would
often be interested in participating in the discussion that take place
there.
I'll work in a lab for a training period in a few weeks, and then I'll
be able to post.

Dap Hartmann

Mar 6, 1997

vdie...@cs.ruu.nl (Vincent Diepeveen) wrote:

>I use some randomnumbers from a program called mathematica.

Alternatively, I believe that there is a CD-ROM release of
Numerical Recipes. As the source code takes up only a few
Mb (I think it used to fit on one single 3.5" floppy), the
authors decided to fill the rest of the CD with random
numbers -- for whoever needs them.
Almost like filling a 35 minute CD with an additional 45
minutes of static. For those who need that to block out
background noises. Works quite well, actually.
Right now, I'm trying desperately to get some work done,
but there's this jackass who's been playing with his chainsaw
all day long...
The famous Dutch author Simon Vestdijk used to turn on the
vacuum cleaner when he wrote. That appliance is now in the
museum of literature, in The Hague. Simon, of course,
is pushing up the daisies.

dap

Dennis Breuker

Mar 7, 1997

Dap Hartmann wrote:
>
> d...@abitibi.harvard.edu (Dap Hartmann) wrote:
>
> Remy Coulom sent me an email:
>
> I have read in his PhD thesis that Jean-Christophe Weill does not use
> random values, but codes that are computed using the theory of
> error-correcting codes (BCH codes, or something like that). I
> personaly
> use random codes, because I have never found any explanation of what
> these BCH codes are. The theory says that BCH codes give a lower
> probability than random codes.
>
> I would love to know more about the theory of error-correcting codes
> and will try to find a book about them when I have time. I
> unfortunately
> can not say more about this. Probably the best solution would be to
> ask Jean-Christophe Weill directly.

See

Warnock T. and Wendroff B. (1988).
Search Tables in Computer Chess.
ICCA Journal, Vol. 11, No. 1, pp. 10-13.

and

Feldmann R. (1993).
Game Tree Search on Massively Parallel Systems.
PhD thesis, University of Paderborn, Germany.
Appendix A

for more info on Bose-Chaudhuri-Hocquenghem (BCH) codes.

Dennis.

Vincent Diepeveen

Mar 7, 1997

I know that if you are typing in random numbers yourself, and choosing
between

1 2 3 4 5 6 7 8 9 0

the chance that you type a 5 or 6 is bigger than the
chance that you type a 9 or 0 or 1 or 2.

I prevented this by saying to myself: let's type few zeros.
Then I randomly scrolled through the text, not forgetting that I shouldn't
type too much in the middle of the text.

In this way (I didn't have enough random numbers from Mathematica; I don't
have the program myself but got a lot of numbers from Gijsbert Wiesenekker
a few years ago) I created a lot of new numbers which performed great after
testing.

> You should run 'em
>through a few tests, like poker, runs, uniformity, etc...

Note: it is true that my testing wasn't optimal.

If I say it gave a hash fault once every 500,000 nodes, then this is
in reality a hash fault once every 500,000/n nodes, as a node is not the
same as a hash update.

The pre-analyzer in Diep ALWAYS gives different parameters for at least
a few of them, sometimes more. This is of course a small part of the total
number of parameters. Still I don't want to hurt the purity of the
evaluation.

It is true I could still keep the move ordering, which would speed up
things in tournament play. This has been on my still-to-do list for almost
2 years now.

I would however also need an option that takes care that it is cleared,
to compare test results. When testing positions I'm simply too lazy to first
click the menu, then click 'clear hash', before testing new things.

That extra clicking would take too much of my system time, or I would forget it,
concluding that FHR gives more extra plies than the loss of tactical insight.. :)

>: >Also, when adding entries to the table, if the previous entry is
>: >not stale, you compare the depth values to keep the deepest
>: >records.
>
>: Multi-probe where you compare the number of nodes needed for a branch seems
>: to work the best. This increases tuplesize, but works great.
>
>I don't like the node count. If a position lies along the PV, it will have a
>high node count. Then if you change from that to another move, all of those
>entries will also have high node counts. And if you search a position that
>stimulates lots of search extensions it too will have a high node count. And
>if you search a position with alpha,alpha+1, it will have a lower node count
>than the same position searched with alpha,beta where beta > alpha+1. When
>I played with this, I didn't like the results. You may be doing it better than
>I did of course...

You use more probes in Crafty: P=8.
In Diep I use P=3, although there are plans to make this P=4.

In Diep I don't have PVS or null-window search anymore.
I use a window [alpha,beta] where alpha at the root is based on the
previous ply, and I add a few tenths of a pawn to the search window. When I use
PVS, Diep's number of nodes needed per ply definitely goes down.

The alpha-beta dependent selective search/extensions
don't work well when I use PVS.

In Diep I use a system that tries to detect moves
that go back into a line where one probably gets hash cutoffs,
and where the killer moves work better.

It seems that all the things one does in one's program are correlated/connected.

Perhaps this explains why number of nodes works in Diep, and not in
Crafty?

Ronald de Man

Mar 7, 1997

d...@abitibi.harvard.edu (Dap Hartmann) writes:

>d...@abitibi.harvard.edu (Dap Hartmann) wrote:

>Remy Coulom sent me an email:

>I have read in his PhD thesis that Jean-Christophe Weill does not use
>random values, but codes that are computed using the theory of
>error-correcting codes (BCH codes, or something like that). I
>personaly
>use random codes, because I have never found any explanation of what
>these BCH codes are. The theory says that BCH codes give a lower
>probability than random codes.
>
>I would love to know more about the theory of error-correcting codes
>and will try to find a book about them when I have time. I
>unfortunately
>can not say more about this. Probably the best solution would be to
>ask Jean-Christophe Weill directly.
>

>Remi
>
>(I did not post this in the newsgroup because I am not allowed to.
>You can post my letter if you think it is worth it)

>I believe Aske Plaat told me that JCW's thesis is available
>on line, as a postscript file.

It is reasonable to assume that a set of 'random' values is good if
the 'distance' between positions having the same hashkey is as large
as possible. Here 'distance' is the number of moves that have to be
made or retracted to go from one board position to the other. Making or
retracting one move is equivalent to XORing the hashkey with 2 (for
a normal move) or 3 (for a capture) 'random' values (and then you have
castling moves of course).

So instead of number of moves you could look at the minimal number of
'random' values you have to XOR with to reach the same hashkey again.
XORing such a set of random values obviously gives an outcome of 0.
So an ideal set of 'random' values will have the property that the
minimal cardinality of a subset that XORs to 0 is as large as possible.
Let's call this number d.

Now the translation to error-correcting codes. A random value of say 64 bits
can be represented as a vector in a 64-dimensional vector space over
GF(2) (the field with 2 elements). XORing random values now is the same
as adding these vectors. Now let these vectors be the column vectors of
a matrix H. Assuming you need 768 random values (probably a little less
because there are no pawns on the 1st and 8th rank, but you also need
some values for castling rights and en passant), this will be a
64x768-matrix with entries in {0,1}. A set of these vectors adding to zero
corresponds to a vector of length 768 in the kernel of H. Let C denote
this kernel. Then C is a linear binary code of length 768. Coding theorists
are interested in the minimal distance of this code. This is the minimal
Hamming distance between any two elements of C. Since this is a linear code,
this is equal to the minimal number of 1s in the non-zero vectors of C, so
equal to the minimal cardinality of a subset of our random values that
XOR to 0, so equal to d.

Now the matrix H is called the parity check matrix of the code C.
The code C is a [768,768-64,d]-code. You want to find a code C that
has large d. Then you can compute H and take the columns as 'random' values
for hashing.

BCH-codes are a particular class of codes. Apparently one of these codes
has the properties we require. But I have to admit that I did not read
what exactly JCW does with BCH-codes, and everything I wrote above is
just something I thought about when I was trying to come up with a way to
produce good random values. In the end, I just generated them with a
random generator, because that was much easier to do. Maybe one day
I will try to do better by using the idea above. But it would surprise
me if that would really be significantly better in practice.

I can also tell something about the way I produced my random values.
I generated the values byte by byte. Each byte was generated by taking
a 32-bit random integer (using some not-too-bad random generator), and
modding it by some prime number between 300 and 350. If the result
was less than 256, I kept the value, otherwise I threw it away and
generated another. This prevented any 'binary structure' in my values.
But I think I have probably been a little too paranoid, because later
I did some tests with values directly generated by Knuth's algorithm, and
the effectiveness of hashing did not seem to worsen.
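
A small sketch of the byte-by-byte rejection method just described; the
particular prime and the underlying 32-bit generator are placeholders:

#include <stdint.h>
#include <stdlib.h>

#define REJECT_PRIME 331   /* any prime between 300 and 350 */

static uint32_t rand32(void)              /* placeholder 32-bit generator */
{
    return ((uint32_t)(rand() & 0x7FFF) << 17) ^
           ((uint32_t)(rand() & 0x7FFF) << 2)  ^
           ((uint32_t)rand() & 0x3);
}

/* generate one byte: reduce a 32-bit draw modulo a prime near 300-350,
   rejecting results >= 256 so no 'binary structure' leaks through      */
static uint8_t random_byte(void)
{
    for (;;) {
        uint32_t r = rand32() % REJECT_PRIME;
        if (r < 256)
            return (uint8_t)r;
    }
}

/* assemble a 64-bit hash value byte by byte */
uint64_t random_key(void)
{
    uint64_t key = 0;
    for (int i = 0; i < 8; i++)
        key = (key << 8) | random_byte();
    return key;
}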

Ronald de Man


Marcel van Kervinck

Mar 7, 1997

Robert Hyatt (hy...@crafty.cis.uab.edu) wrote:

> I don't like the node count. If a position lies along the PV, it will have a
> high node count. Then if you change from that to another move, all of those
> entries will also have high node counts. And if you search a position that
> stimulates lots of search extensions it too will have a high node count. And
> if you search a position with alpha,alpha+1, it will have a lower node count
> than the same position searched with alpha,beta where beta > alpha+1.

Well, that's the whole idea of storing node-counts, isn't it? To
more precisely reflect the amount of work done.

> When I played with this, I didn't like the results. You may be
> doing it better than I did of course...

What do you mean:
-1- "I didn't like it"
-2- "It didn't work"

In the latter case, you're the first to report so. After the
publication of Breuker's ICCAJ article virtually anyone who has
tried it confirmed those results.

Robert Hyatt

Mar 7, 1997

Vincent Diepeveen (vdie...@cs.ruu.nl) wrote:

: You use a more probes in Crafty: P=8.


: In diep i use P=3, although there are plans to make this P=4.

: In Diep i don't have PVS or nullwindowsearch anymore.
: I use a window [alfa,beta] where alfa from the root is based on the
: previous ply and i add few tens of pawns to search. When i use
: PVS Diep's number of nodes needed pro ply definitely goes down.

: The alfabeta dependant selective search/extensions
: doesn't work well when i use PVS.

: In diep i use a system that tries to detect moves
: that go back in a line where one probably gets hashcutoffs,
: and where the killermoves work better.

: It seems that all things one does in ones program are correlated/connected.

: Perhaps this explains why number of nodes works in Diep, and not in
: Crafty?

Anything is possible. I generally find it very difficult to compare
results between programs. Bruce and I are reasonably similar in what
we are doing, yet in some positions I'm faster, in most he's faster; in
about 1/2 I search larger trees, in the other 1/2 I search smaller trees.
So, "similar" is not anything like "identical"... which is a pain...

:)


Robert Hyatt

Mar 7, 1997

Marcel van Kervinck (mar...@stack.nl) wrote:
: Robert Hyatt (hy...@crafty.cis.uab.edu) wrote:

: > I don't like the node count. If a position lies along the PV, it will have a


: > high node count. Then if you change from that to another move, all of those
: > entries will also have high node counts. And if you search a position that
: > stimulates lots of search extensions it too will have a high node count. And
: > if you search a position with alpha,alpha+1, it will have a lower node count
: > than the same position searched with alpha,beta where beta > alpha+1.

: Well, that's the whole idea of storing node-counts, isn't it? To


: more precisely reflect the amount of work done.

: > When I played with this, I didn't like the results. You may be


: > doing it better than I did of course...

: What do you mean:


: -1- "I didn't like it"
: -2- "It didn't work"

1. The trees searched by Crafty were larger using this and, as a result,
the search times were lengthened. Not by huge amounts, but by measurable
amounts.

2. It did not produce faster or better results.

3. (side issue) storing a node count will hurt unless you store some
sort of node_count/N which scales the value down to avoid adding too
much to a hash-table entry...

It's been over a year since I tried this. I'll run the tests again right now
and report the results in a couple of hours... I think I'll run it through
Win At Chess and report the "sum of squares of solution times" which is a
good indicator. bigger means slower, smaller means faster, but it also
takes into account the occasional random noise where *any* change will
affect some few solution times...

Stay tuned for further reports... :)


: In the latter case, you're the first to report so. After the


: publication of Breuker's ICCAJ article virtually anyone who has
: tried it confirmed those results.

Almost guaranteed that each search will respond differently to most
any sort of ordering changes. Results forthcoming in a couple of
hours... BTW, this isn't a new idea. Harry Nelson tried this
maybe 12 years ago in Cray Blitz. He became fascinated in problem-
solving, and did seem to think it was better when he was working on
the Win at Chess suite. But when we tried it in a real game or two
it was slightly worse. If WAC looks good, I'll also try some normal
(non-tactical) positions on it as well to see what happens...


brucemo

Mar 7, 1997

Robert Hyatt wrote:

> and report the results in a couple of hours... I think I'll run it through
> Win At Chess and report the "sum of squares of solution times" which is a
> good indicator. bigger means slower, smaller means faster, but it also

What the heck is this? You mean "square root"?

bruce

Robert Hyatt

Mar 7, 1997

Robert Hyatt (hy...@crafty.cis.uab.edu) wrote:

: Marcel van Kervinck (mar...@stack.nl) wrote:
: : Robert Hyatt (hy...@crafty.cis.uab.edu) wrote:

: : > I don't like the node count. If a position lies along the PV, it will have a


: : > high node count. Then if you change from that to another move, all of those
: : > entries will also have high node counts. And if you search a position that
: : > stimulates lots of search extensions it too will have a high node count. And
: : > if you search a position with alpha,alpha+1, it will have a lower node count
: : > than the same position searched with alpha,beta where beta > alpha+1.

: : Well, that's the whole idea of storing node-counts, isn't it? To


: : more precisely reflect the amount of work done.

: : > When I played with this, I didn't like the results. You may be


: : > doing it better than I did of course...

: : What do you mean:


: : -1- "I didn't like it"
: : -2- "It didn't work"

: 1. The trees searched by Crafty were larger using this and, as a result,
: the search times were lengthened. Not by huge amounts, but by measurable
: amounts.

: 2. It did not produce faster or better results.

: 3. (side issue) storing a node count will hurt unless you store some
: sort of node_count/N which scales the value down to avoid adding too
: much to a hash-table entry...

: It's been over a year since I tried this. I'll run the tests again right now

: and report the results in a couple of hours... I think I'll run it through


: Win At Chess and report the "sum of squares of solution times" which is a
: good indicator. bigger means slower, smaller means faster, but it also

: takes into account the occasional random noise where *any* change will


: affect some few solution times...

: Stay tuned for further reports... :)


: : In the latter case, you're the first to report so. After the
: : publication of Breuker's ICCAJ article virtually anyone who has
: : tried it confirmed those results.


Here's the win at chess results. Sum of squares of solution times,
which obviously biases the results toward the longer solution times.

Normal "draft" replacement = 68096
node count replacement = 67081

Here's a list of solution times that are different, in a format of
old/new, time in seconds:

12/11 50/46 11/12 9/8
4/3 3/4 54/55 18/19
0/1 15/14 3/2 16/13
27/28 47/39 12/11

If you add those up by replacement scheme, and then
subtract, you find that the new approach saved 17
seconds overall. Actually, the savings were a little
less than this, as I only measure time in seconds to
produce my summary chart, while Crafty actually keeps
time to the nearest .01 seconds. Several problems took
.2 to .5 seconds longer, but this doesn't show up in the
totals above, although it would make them quite a bit closer.

So, to interpret this, it seems that the node count replacement
scheme, for the win at test problem suite, does save some time.

In a few other tests, I discovered that fine #70 takes one additional
ply to solve with the new scheme, requiring 20 plies with the NC
replacement strategy vs 19 plies for the draft replacement scheme.

Some other results: kopec position #22, to depth=11, 60 seconds
vs 60 seconds...

Here's some results for the first few kopec positions that are
not tactical in nature... (again, draft rep/nodes rep in seconds)
note also, these are not times to find the right move, but are simply
times needed to search to some arbitrary depth (10-12 depending on
the position.)

2: 10/10 3: 67/67 4: 58/49
5: 66/71 6: 7/8 7: 24/24
8: 85/95 9: 84/83 10: 19/19

here, it is 6 seconds slower using the node replacement scheme
rather than the draft replacement.

Here's one last test, from the advances in computer chess 4 article
on Cray Blitz, the "mate in 10 position" from David Bronstein vs some
computer years before.

draft=4, node count=15. This is a pathological case, in that it is
wildly tactical, with some branches that produce huge trees and some
that produce almost nothing. The hash table used here (and in all the
above tests) was 24mb. The draft replacement took 7 plies to find
this mate. The node count replacement took 8 plies to find it.
Obviously something important got overwritten due to a large number of
nodes, but a relatively shallow search.

You can draw your own conclusions. For most positions, they break even.
For some, either can win. Since I have to store the "draft" because I
need it to be sure I can use an entry when I do a look-up, that replacement
is free. To do the node count idea, you need to expand the hash table
entry length, which reduces the number of entries in the table by some
finite amount. If you want to store a 16 bit value, and your table is
already 16 bytes per entry, taking it to 18 costs 1/8th of your total
table space, which is a non-trivial amount. The tests I ran here did not
factor that in, I just used exactly the same number of entries for each
test. Reducing by 1/8 isn't a huge reduction, but you can certainly see
the difference in search times, which would probably bring the draft
replacement scheme back to the front in WAC, where it is in front at the
non-tactical Kopec positions.

I can run any other tests you might be interested in, as I'll save this
code (which was pretty trivial to add). However, now, as in the past,
I'm sticking with draft replacement until something better comes along.


Robert Hyatt

Mar 7, 1997

brucemo (bru...@nwlink.com) wrote:
: Robert Hyatt wrote:

: > and report the results in a couple of hours... I think I'll run it through
: > Win At Chess and report the "sum of squares of solution times" which is a
: > good indicator. bigger means slower, smaller means faster, but it also

: What the heck is this? You mean "square root"?

: bruce

No... sum(time1^2 + time2^2 + ... +time300^2)

the idea is that a single long time really hurts, while several short
times don't. It's a sort of measure of efficiency. IE try the case
where you solve all 300 in 1 second each, while I solve the first 295
in zero seconds, but take 60 seconds each to solve the last 5. Our
total time is equal, but sum of squares for you=300, for me it is
5*3600... It just highlights the large solution times over the small
solution times...


Tim Foden

Mar 7, 1997


Robert Hyatt <hy...@crafty.cis.uab.edu> wrote in article
<5fmifq$t...@juniper.cis.uab.edu>...


| Tim Foden (t...@diamond.prestel.co.uk) wrote:
|
|
| : Dap Hartmann <d...@abitibi.harvard.edu> wrote in article

| : <331d7...@cfanews.harvard.edu>...
| : > hy...@crafty.cis.uab.edu (Robert Hyatt) wrote:
| : >
| : > >create an array of 64 bit integers random[64][13]...
| : > >initialize it to random numbers.
|
| : I have done this, and I still get around 39 collisions of
| : the same hash value from different boards when I search around
| : 500,000 nodes (6 ply + quiesce, blacks first move after Nf3,
| : 28.5 secs). This is about the same (or possibly worse) than
| : the hash function I used before. I suspect that it may be
| : because Microsoft's random number generator is cr^h^huseless.
| : Maybe I should try a better generator, but I haven't got one handy.
|

| Two things. First. mathematically this should not happen if you are using
| real 64bit values. So are you sure the things are 64bits?

OK. I am using MS's unsigned __int64 typedef'ed to UINT64 throughout.

| Second, watch how you generate them. If you call a RNG that produces numbers
| in the range 0<=N<1 (ie a floating point number) and then just stuff this into
| an int (using a union so it doesn't get converted to zero of course), the upper
| end of that number is not random, it's the exponent, which is the same or
| nearly the same for all of the numbers generated.

MS's RNG generates 16bit values. The code I use to fill in the table goes
like this:

UINT64 Board::m_hashLookup[64][16];

void Board::GenHashLookupData()
{
    srand( time(NULL) );

    for( int i = 0; i < 64; i++ )
    {
        for( int j = 0; j < 16; j++ )
        {
            for( int k = 0; k < 6; k++ )
            {
                m_hashLookup[i][j] <<= 11;
                m_hashLookup[i][j] |= (rand() >> 3) & 0x7FF;
            }
        }
    }
}

I have since tried 2 other generators. The first was called TT800 and
I have modified it to return its 32-bit values directly instead of converting
to doubles. Same results. The second is a simple lagged Fibonacci sequence
that uses 64 bits throughout, which I wrote myself using lag values I found
on the web. I fill in the initial Fibonacci sequence using the TT800 RNG.
Same results. I am coming to believe that my bug is somewhere else!

Here is the hash function I now use:

UINT64 Board::Hash() const
{
    UINT64 hash = 0;

    for( int i = 0; i < 32; i++ )
    {
        Piece* pPiece = m_piece[i];

        if( pPiece == 0 )
            continue;

        // GetType() returns 1..7, GetSide() returns 0..1
        int type = pPiece->GetType() | (pPiece->GetSide() << 3);

        // GetPosition() returns 0..63
        hash ^= m_hashLookup[pPiece->GetPosition()][type];
    }

    return hash;
}

I find that I have a clash when I do a lookup in the hash table and find
the hash values to be the same. I then check the current board position (64
bytes) against the one I stored in the transposition record. If they are
different I increment the duplicates counter. Obviously I am only storing
these positions in the table for debugging purposes, but at the moment I am
getting too many duplicate hashes to take it out. I must be doing something
wrong somewhere.

|
| If you really have 64bit random integers, and you have verified that there are
| no (or only a few) duplicates, you should go for much longer than 500K nodes
| before you get the number of hits you mention. Notice that "0" hits is not
| possible, because you are hashing....

I realise. The idea is to make the probability of a hash clash so small
that it can be ignored, because other problems in the chess program will
become more significant.


I like the idea of maximal Hamming distance that has been suggested in
another branch of this thread. I was just beginning to think along the lines
of generating my own numbers in a more empirical manner, and this seems the
best measure to use.

Tim Foden.

Robert Hyatt

Mar 8, 1997

Tim Foden (t...@diamond.prestel.co.uk) wrote:


: Robert Hyatt <hy...@crafty.cis.uab.edu> wrote in article

: return hash;
: }

: Tim Foden.

Also compute this value to get an approximation of how many collisions you
can expect:

hash signature = 64bits.

real board position = (roughly) 160 bits...

that means that there are literally zillions of unique positions that map
to the same hash position.
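
For a rough sense of scale (a back-of-the-envelope figure, not Bob's numbers):
by the birthday approximation, the expected number of pairs of distinct
positions sharing a full 64-bit signature among n positions is about
n*(n-1)/2^65, which for the 500K nodes mentioned earlier is on the order of
10^-8. Dozens of genuine 64-bit key collisions per 500K nodes therefore point
to a bug rather than bad luck. A throwaway check:

#include <stdio.h>

int main(void)
{
    /* expected colliding pairs among n random 64-bit keys (birthday bound) */
    double n = 500000.0;                 /* roughly the nodes in Tim's test  */
    double two_to_64 = 18446744073709551616.0;
    printf("expected collisions: %g\n", n * (n - 1.0) / 2.0 / two_to_64);
    return 0;                            /* prints about 6.8e-09 */
}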

As to how many hits you should get, you might try deja-news. John Stanback,
myself, and a couple of others beat this to death a couple of years ago. We
ran the same tests you are doing. Unfortunately, I don't remember what our
results were as to collisions per N probes... Maybe someone else will
recall.

However there are two issues:

1. probes per collision is one.

2. importance of the collision is another.

for 2, it gets interesting, since most collisions occur at places that won't
affect the root of the tree, and are therefore "benign". However, you are
going to get errors, there's no way around that...

Peter W. Gillgasch

Mar 9, 1997

Dennis Breuker <bre...@cs.unimaas.nl> wrote:

> Warnock T. and Wendroff B. (1988).
> Search Tables in Computer Chess.
> ICCA Journal, Vol. 11, No. 1, pp. 10-13.

> Feldmann R. (1993).


> Game Tree Search on Massively Parallel Systems.
> PhD thesis, University of Paderborn, Germany.
> Appendix A

Have those and looked at 'em some time ago. Although 3 guys tried to
change the light bulb, we couldn't figure out how to generate a set of
numbers with these properties...

What did we miss ?

-- Peter

May God grant me the serenity to accept the things I cannot change,
courage to choke the living shit out of those who piss me off,
and wisdom to know where I should hide the bodies...

Robert Hyatt

Mar 9, 1997

Anders Thulin (a...@nala.devnet.lejonet.se) wrote:
: In article <01bc2901$511f8ea0$0100007f@fork>,
: Tim Foden <t...@diamond.prestel.co.uk> wrote:

: >my current problem is that this sometimes produces hashes with the
: >same value from positions that are different.

: That's how hashing works: as you're not using the unique key, but
: the value hash(key), and the range of the hashing function is much
: smaller than the key range, you can't assume you can avoid collisions.
: (Unless you use perfect hashing, and that's practical only when you
: have a relatively small set of keys, or the keys have a structure
: that's easily analyzed in terms of perfect hashing.)

: The value hash(key) always gives you a set of possible hits. A good
: hashing function will keep the average set size small, but can't
: guarantee singleton sets.

: Thus, what you ask for is impossible. You always need to verify
: that the key of the found item is the same as the key of the
: position you're looking for.

: > can anyone suggest a more reliable hash so I don't have to store a
: > version of the board in every transposition table record.

One more thing. Since you store a move in the entry, you can also confirm
that the requirements for the move to be legal are met. IE, if the move
is O-O, you can check to see if that is legal. If not, toss it out. I
don't see such a "collision" *ever* as I log this particularly failure in
the log.nnn files for crafty. I check for 'em on occasion, and just did.
with 247 log files, there were 0 of these move verifications reported... but
I'm sure there were collisions.

: As long as you use a hash function that reduces the number of
: significant bits in the key, you can't.
