random play

95 views
Skip to first unread message

Robert Hyatt

unread,
Nov 25, 1996, 3:00:00 AM11/25/96
to

Ronald de Man (de...@wsdw08.win.tue.nl) wrote:
: hy...@crafty.cis.uab.edu (Robert Hyatt) writes:
:
: >Ronald de Man (de...@wsdw08.win.tue.nl) wrote:
: >: hy...@crafty.cis.uab.edu (Robert Hyatt) writes:
: >: >Ronald de Man (de...@wsdw08.win.tue.nl) wrote:
: >: >:
: >: >: Every root move gets its own random value, to be used throughout the
: >: >: search. When you reorder root moves, you also reorder the random values.
: >: >: It has nothing to do with the search changing its mind. That are just
: >: >: fail-highs that can happen regardless. With a very stable search, the
: >: >: best move (counting in the randomvalues) will turn out to be the
: >: >: best already at the first iteration. There's no need for the search
: >: >: to change its mind.
: >: >:
: >: >: Ronald de Man
: >: >:
: >:
: >: >I agree with that part, so that there is not a steady succession of fail
: >: >high conditions. However, what about the hashing? two positions that have
: >: >pieces on the same squares won't have the same positional score since the
: >: >random value was passed from the root move... That has to hurt, particularly
: >: >in endgames where 90% hash hits is very common...
: >:
: >: >Bob
: >:
: >: The nice thing is that the random value is not really passed into the search.
: >: Suppose the current best move has value 2+1=3, with 2 the value without
: >: any randomization, and 1 the random value added. Now you want to decide
: >: whether a move, say with random value +0.5, is better than this current
: >: best move. That will be the case if its value without randomization
: >: is higher than 3-0.5=2.5, since 2.5+0.5=3=2+1. So you search this move
: > with window (2.5,2.5001), add +0.5 to the value returned, and compare
: >: this with 3. The point is, Search() doesn't know about the random
: >: values, and the values stored in the hash table are the 'correct' values.
: >: (And 'correct' here means: if a hash entry says 'this position has value
: >: at most 2.0', then searching this position with the right depth, and
: >: without any randomization, will return a value of at most 2.0.)
: >:
: >: Also, note that you'll probably won't randomize in the end game, because
: >: it's only useful the first few moves out of book.
: >:
: >: Ronald de Man
: >:
:
: >But now we are back to square one... anytime you add something that makes the
: >second move better, you are doing two things: (1) search time will double or
: >worse; (2) you have to factor the random value from the root into the tip
: >scores, otherwise you will never get the second move as better, because alpha
: >beta will prune it away. You can't just add something to the root, because there
: >is no "root score" to work with. It all comes from the tips, and is backed up...
:
: Hmm..., I thought we had cleared things up and had already decided that
: my method does not cause a steady succession of fail high conditions.
: Well ok, I'll try to be a little more clear. I assume we are talking
: about a chess program that has a general Search() routine, and separate
: code for searching the root, say SearchRoot(). I only change something
: in SearchRoot(). I'll give an explicit description, starting from this
: VERY primitive SearchRoot(), which only searches to a fixed depth.
:
: Move SearchRoot(depth) /* first */
: int depth;
: {
: GenerateMoves();
: alpha=-infinity;
: for(n=0;n<numMoves;n++) {
: MakeMove(move[n]);
: value=-Search(-infinity,-alpha,depth);
: UnMakeMove();
: if(value>alpha) {
: alpha=value;
: bestMove=move[n];
: }
: }
: return(bestMove);
: }
:
: Ok, now I add in randomization in a STUPID (meaning very expensive) way:
:
: Move SearchRoot(depth) /* second */
: int depth;
: {
: GenerateMoves();
: for(n=0;n<numMoves;n++)
: random[n]=Random();
: for(n=0;n<numMoves;n++) {
: MakeMove(move[n]);
: value[n]=-Search(-infinity,infinity,depth);
: UnMakeMove();
: }
: for(n=0;n<numMoves;n++)
: value[n]+=random[n];
: alpha=-infinity;
: for(n=0;n<numMoves;n++)
: if(value[n]>alpha) {
: alpha=value[n];
: bestMove=move[n];
: }
: return(bestMove);
: }
:
: The routine above does a full search for all root moves, thus wasting
: a lot of time. However, it is clear how the moves get randomized.
: In particular, nothing gets added at the tip nodes. Randomization
: takes place here at the root.
: My claim is that there is a way to achieve the exact same randomization
: of this second routine with the efficiency of the first.
: Here it goes:
:
: Move SearchRoot(depth) /* third */
: int depth;
: {
: GenerateMoves();
: for(n=0;n<numMoves;n++)
: random[n]=Random();
: alpha=-infinity;
: for(n=0;n<numMoves;n++) {
: MakeMove(move[n]);
: value = random[n]-Search(-infinity,random[n]-alpha,depth);
: UnMakeMove();
: if(value>alpha) {
: value=alpha;
: bestMove=move[n];
: }
: }
: return(bestMove);
: }
:
: Now convince yourself of:
:
: 1. The third routine is correct and gives exactly the same randomization
: as the second.
: 2. The third routine is as efficient as the first.
: 3. If two root moves lead to the same position encountered by Search(),
: the hash table entry generated at the first encounter will be valid
: for the second encounter.
:
: Note that the transformation from the first routine to the third is very
: simple and can be applied to any SearchRoot() routine. Just generate
: random numbers before starting to search, reorder these random numbers
: every time you reorder the root moves, so that moves keep the same random
: number attached to them, and substitute all appearances of
:
: -Search(-beta,-alpha,depth)
:
: by
:
: random[n]-Search(random[n]-beta,random[n]-alpha,depth)
:
: where n is the index of the root move you are searching.
: (You have to be a little careful with mate scores of course.)
: If you only want to randomize the first few moves out of book, just
: set random[n]=0 for all n after those first few moves (so you don't need
: separate RandomizedSearchRoot() and UnRandomizedSearchRoot() routines).
:
: Ronald de Man
:

I'm still convinced this hurts performance. The idea of storing best and
worst bounds is critical in PVS/NegaScout... the point being that everything
will fail high or low. If you offset the window, even by 1 millipawn, then
the things that normall fail low will fail low, but the things that normally
would fail high won't. (assuming you raised alpha and beta by .001). If you
lower them the same thing happens.

You also have a problem when you have path P1 that has moves m1, m2, m3, m4, ...
and path P2 that has moves m3, m2, m1, m4, ... You break that case completely
because with normal hashing, the position before m4 would be found in the table
and not searched. In your case it would be found but would return a wrong
value if a true value were stored, or else would return a wrong bound if the
entry only has a bound stored. In essence, you can't do *anything* to the "path"
because the path is not stored, only positions are in the hash, and there's no way
to determine that one position was stored with m1 at the root, and the other had
m3 at the root...


Robert Hyatt

unread,
Nov 26, 1996, 3:00:00 AM11/26/96
to

Ronald de Man (de...@wsdw08.win.tue.nl) wrote:
: hy...@crafty.cis.uab.edu (Robert Hyatt) writes:
:
:
: >You also have a problem when you have path P1 that has moves m1, m2, m3, m4, ...

: >and path P2 that has moves m3, m2, m1, m4, ... You break that case completely
: >because with normal hashing, the position before m4 would be found in the table
: >and not searched. In your case it would be found but would return a wrong
: >value if a true value were stored, or else would return a wrong bound if the
: >entry only has a bound stored. In essence, you can't do *anything* to the "path"
: >because the path is not stored, only positions are in the hash, and there's no way
: >to determine that one position was stored with m1 at the root, and the other had
: >m3 at the root...
:
: NO!
: (This was the third point I asked you to convince yourself of in my
: previous post.)
: If m1 is root move 2, m3 is root move 3, we search m1 (simplistically)
: as follows:
:
: MakeMove(move[2]);
: value = random[2] - Search(-infinity,random[2]-alpha,depth);
: UnMakeMove();

OK... I understand your point now... no one knows about the offset at
the root, except for the code that calls Search to process each root
move. That I buy now...

However, I still have trouble with perturbing the alpha/beta window in
any way, and your fudging with random[n] above is doing just that. If
move 2 is exactly 1 point better than move 1, and your random value is
2 (say) you fail high in one case, low when you search the other way.
BTW, your line above is not right, in that you would not search the
second move with -infinity, rather you'd use alpha which was the value
returned from searching the first root move, and beta=alpha+1. If you
aren't using PVS, I can see where this has no effect on what you are
doing other than to randomize a little and harm transposition table
entries a little due to the bounds being "fuzzy". However, if you use
PVS or scout, with a null-window, shifting it is very dangerous...


Ronald de Man

unread,
Nov 26, 1996, 3:00:00 AM11/26/96
to

hy...@crafty.cis.uab.edu (Robert Hyatt) writes:

>I'm still convinced this hurts performance. The idea of storing best and
>worst bounds is critical in PVS/NegaScout... the point being that everything
>will fail high or low. If you offset the window, even by 1 millipawn, then
>the things that normall fail low will fail low, but the things that normally
>would fail high won't. (assuming you raised alpha and beta by .001). If you
>lower them the same thing happens.

Yes, I realize that this might hurt performance a little. However I think
that the main benefit of the hash tables in the early middle game (which
is when we apply randomization) comes from stored best moves, since most
of the positions searched before will have been searched with lower depth
(in a previous iteration).
Also, it's difficult to analyze these things.
Suppose you have two moves, and the best move gets +0.001 added, and the
other gets -0.001. If the search returns value 1.000 for the first, the
root search will say it's 1.001, so the second move is searched with a window
of (1.002,1.003) (since 1.002-0.001=1.001). Now if the real value of
this second move is say 0.500, could it happen that searching it with a
window of (1.002,1.003) takes longer than with a window of (1.000,1.001)?
According to your reasoning it can, but that would be a little
counterintuitive, since it should be easier to decide that 0.500<=1.002
than 0.500<=1.000. (Well, hash tables are strange things so I wouldn't
be too surprised if this really could occur and besides, I haven't
spend that many time thinking about this.)

Anyway, in my tests I certainly don't experience any substantial losses
of performance. (And on FICS, I don't suffer from people repeating their
wins over and over again...)

>You also have a problem when you have path P1 that has moves m1, m2, m3, m4, ...
>and path P2 that has moves m3, m2, m1, m4, ... You break that case completely
>because with normal hashing, the position before m4 would be found in the table
>and not searched. In your case it would be found but would return a wrong
>value if a true value were stored, or else would return a wrong bound if the
>entry only has a bound stored. In essence, you can't do *anything* to the "path"
>because the path is not stored, only positions are in the hash, and there's no way
>to determine that one position was stored with m1 at the root, and the other had
>m3 at the root...

NO!


(This was the third point I asked you to convince yourself of in my
previous post.)
If m1 is root move 2, m3 is root move 3, we search m1 (simplistically)
as follows:

MakeMove(move[2]);
value = random[2] - Search(-infinity,random[2]-alpha,depth);
UnMakeMove();

(Where we got alpha from root move 1.)
Suppose this search finds an exact value for the position before m4.
Then this value is equal to the value returned by

Search(-infinity,infinity,depth-2)

where you start searching at the position before m4. So random[2]
does not influence this value.
Now you search m3:

MakeMove(move[3]);
value = random[3] - Search(-infinity,random[3]-alpha,depth);
UnMakeMove();

Now after m2,m1, the search is again in the position before m4.
The hash table has the exact value for this position, and this value
can also be used at this stage. Only after the search will we add
random[3] to it and compare it with alpha.

The same holds for bounds. If searching m1 gives that the position
before m4 has 1.000 as an upper bound, this means that

Search(-infinity,infinity,depth-2)<=1.000,

and this information can and will be used when searching m3.
(Since the random values appear in the search window they can influence
the 1.000, but the 'absolute truth' of the equality/inequality we get
is not affected.)

Note that I don't change anything in Search(), only something
in SearchRoot().

Now you might ask, if Search() doesn't get the random values, how can
we ever get any randomization? The answer is that we randomize at the
root, see my second SearchRoot() procedure of my previous post, where
the same randomization is effected in a very straightforward (but
expensive) way.

Ronald de Man


Ronald de Man

unread,
Nov 28, 1996, 3:00:00 AM11/28/96
to

hy...@crafty.cis.uab.edu (Robert Hyatt) writes:

>OK... I understand your point now... no one knows about the offset at
>the root, except for the code that calls Search to process each root
>move. That I buy now...

I'm glad :-)

>However, I still have trouble with perturbing the alpha/beta window in
>any way, and your fudging with random[n] above is doing just that. If
>move 2 is exactly 1 point better than move 1, and your random value is
>2 (say) you fail high in one case, low when you search the other way.
>BTW, your line above is not right, in that you would not search the
>second move with -infinity, rather you'd use alpha which was the value
>returned from searching the first root move, and beta=alpha+1. If you
>aren't using PVS, I can see where this has no effect on what you are
>doing other than to randomize a little and harm transposition table
>entries a little due to the bounds being "fuzzy". However, if you use
>PVS or scout, with a null-window, shifting it is very dangerous...

I do use PVS and null-windows. The line I gave was meant as a simple
illustration of what i'm doing. So using a null-window it would read

value = random[2] - Search(random[2]-alpha-1,random[2]-alpha,depth);

(since originally it was -Search(-alpha-1,-alpha,depth)).

I agree that shifting alpha and beta might make some hash tables entries
less usefull, but I think this effect (at least in the middle game) is
much smaller than for example the number of nodes you have to search
more/less in a certain position because of the more/less fail-highs you
get with a certain set of random numbers (at least, that is what I
experience).
And of course, adding randomness will never produce better moves from
a minimax point of view, but, if the loss in performance is small
enough, will most likely improve play against humans (on the long term)
and is an easy solution against killerlines.

Ronald de Man


Reply all
Reply to author
Forward
0 new messages