New improvement to alpha/beta + TT?

96 views
Skip to first unread message

Heiner Marxen

unread,
Jan 13, 1997, 3:00:00 AM1/13/97
to

Hello r.g.c.c fans and gurus!

I believe to have found an improvement of the usual alpha/beta (AB)
tree search, if combined with a transposition table (TT), in the way
e.g. crafty 11.13 does this (thank you, Bob).

Quick disclaimer: I may be wrong. I just started to read the crafty
sources, and have not yet implemented a full AB+TT myself. I write
this in order to be shown wrong, or approved right.

I understand that the basic recursive tree search function [Search()]
gets handed in a lower and an upper search bound, named alpha and beta.
By these values the calling function says: "Dear Search(), if you would
give me a result <= alpha or >= beta, I, your caller, will not gain
any advantage of how much below alpha or above beta that value is.
You may as well give me alpha or beta, resp."

Ok, I buy that (and I think I understand it). But in the presence
of a TT the value computed by Search() and returned to its caller,
is also used in another way: it is stored into the TT and may be
retrieved and found by another incarnation of Search(), which
--this is my main point-- may well be handed quite different alpha/beta
bounds. Hence, this latter Search() may well be interested in a value
larger than that former alpha, but smaller that the latter alpha, right?

Now, the first Search() of course will not continue to compute even
better values just to store them into the TT, hoping they will be re-used.
But sometimes such a better value could be readily available. For
this we first allow Search() to return a value v, which is
- above beta, still being a lower limit to the true (minimax) value,
which could even be larger, or is
- below alpha, still being an upper limit to the true value, which
could even be smaller.
In short: Search() may tell us, how far outside the bounds the value
is, at least.

Now, we find several sources for such values to propagate:
(a) A leaf node evalution may yield and immediately return such a value.
(b) A TT-search at the start of Search() may find such a value,
and propagate it.
(c) One of the recursive calls to Search() may yield a cutoff, i.e.
a value b >= beta. Instead of beta we now may return even b.
(d) After scanning all moves the collected maximum m is still below
alpha, because all recursive searches yielded values below alpha,
despite the bound we handed them.

I have no idea, how frequently this would occur, and unfortunately
I will not find the time to change "crafty" and test my idea :-(
But it occurs to me that sometimes there is information readily available
which could be used with nearly no overhead. Provided the above is
correct, it think it is worth to be considered/implemented.

Now, is all this correct? (If so, the basic idea can even be extended to
return a larger effective depth than the one demanded...)

Or is it known, already?

Eagerly waiting for comments... :-)
--
Heiner Marxen hei...@drb.insel.de

Robert Hyatt

unread,
Jan 14, 1997, 3:00:00 AM1/14/97
to

Heiner Marxen (hei...@news.drb.insel.de) wrote:
: Hello r.g.c.c fans and gurus!

Already known... called "fail-soft alpha/beta"... I tried it way back, but
with other things to debug at the same time, I canned it. I have not yet
gone back to try it...

John Tromp

unread,
Jan 14, 1997, 3:00:00 AM1/14/97
to

hei...@news.drb.insel.de (Heiner Marxen) writes:

>Hello r.g.c.c fans and gurus!

>I believe to have found an improvement of the usual alpha/beta (AB)
>tree search, if combined with a transposition table (TT), in the way
>e.g. crafty 11.13 does this (thank you, Bob).

>But sometimes such a better value could be readily available. For


>this we first allow Search() to return a value v, which is
>- above beta, still being a lower limit to the true (minimax) value,
> which could even be larger, or is
>- below alpha, still being an upper limit to the true value, which
> could even be smaller.
>In short: Search() may tell us, how far outside the bounds the value
>is, at least.

>(c) One of the recursive calls to Search() may yield a cutoff, i.e.


> a value b >= beta. Instead of beta we now may return even b.
>(d) After scanning all moves the collected maximum m is still below
> alpha, because all recursive searches yielded values below alpha,
> despite the bound we handed them.

Practically all existing programs already do this.

>Or is it known, already?

So it is:)

A typical (nega variety) alpha-beta routine has a fragment like this:

{ compute available moves in av[0..nav-1] }

score = -infinity;
for (i = 0; i < nav; i++) {
makemove(j=av[i]);
val = -ab(-beta, -alpha);
backmove();
if (val > score) {
if ((score = val) > alpha && (alpha = val) >= beta) {
return score; // beta-cutoff
}
}
}
return score; // can be less than alpha

Of course, with a TT you would try to store an exact or bounding score before
returning.
And before running the above loop,
you could try to tighten the given alpha&beta bounds with what's
stored in the TT, assuming that was searched with sufficient depth.

The most elaborate approach is to store in the TT an interval in which
the true score is known to be (at a particular searchdepth).
Then whenever you gain information, you intersect intervals to narrow things
down. But most programs store only one value in the TT, with a bit or two
indicating whether it's exact or a bound.

regards,

%!PS % -John Tromp (http://www.cwi.nl/~tromp/)
42 42 scale 7 9 translate .07 setlinewidth .5 setgray/c{arc clip fill
setgray}def 1 0 0 42 1 0 c 0 1 1{0 3 3 90 270 arc 0 0 6 0 -3 3 90 270
arcn 270 90 c -2 2 4{-6 moveto 0 12 rlineto}for -5 2 5{-3 exch moveto
9 0 rlineto}for stroke 0 0 3 1 1 0 c 180 rotate initclip}for showpage

Tom C. Kerrigan

unread,
Jan 14, 1997, 3:00:00 AM1/14/97
to

This sounds a lot like something called a "fail soft" search, but I can't
remember for the life of me how it's supposed to work. Anyone? :)

Anyway, the reason I'm posting is because I just read your thing twice and
I still don't know what you intend to do with the more accurate scores.
How are they going to improve anything? (sorry if my brain simply
censored this part... it does that for a lot of stuff...)

Cheers,
Tom

Simon Read

unread,
Jan 15, 1997, 3:00:00 AM1/15/97
to

hy...@cis.uab.edu (Robert Hyatt) wrote:

>Heiner Marxen (hei...@news.drb.insel.de) wrote:
>: Ok, I buy that (and I think I understand it). But in the presence
>: of a TT the value computed by Search() and returned to its caller,
>: is also used in another way: it is stored into the TT and may be
>: retrieved and found by another incarnation of Search(), which
>: --this is my main point-- may well be handed quite different alpha/beta
>: bounds. Hence, this latter Search() may well be interested in a value
>: larger than that former alpha, but smaller that the latter alpha, right?

Let me see if I understand: here are some scores:

S1>S2>S3>S4

S1 the upper bound the TT gives us
S2 the upper limit on the current search; a score above this
will cause a cutoff
S3 the lower bound the TT gives us
S4 the lower limit on the current search; a score below this
will cause a cutoff


We went to decide on a cutoff; we're currently searching with limits of
S2 and S4 ... we hit an entry in the TT and it says "score between S1 and S3"

We definitely need to re-search this entry, but there's no point in
giving the search a limit of less than S3 and there's no point in
giving the search a limit of more than S2.
So we tighten the search bounds to S2 and S3, then search that position.
This should save some time.

I don't think this is anything to do with fail-soft.

Of course, if we want information at a different depth to that in
the TT, that becomes irrelevant.

Simon / cranfield.ac.uk /
Freelance / @ /
Philosopher / s.read /


Valavan Manohararajah

unread,
Jan 15, 1997, 3:00:00 AM1/15/97
to

Fail-Soft is quite cool.... Actually to me, it looks more logical than
the
plain A-B algorithm. Fail-Soft can return scores outside the A-B
window.
This can be used to make the search more efficient by using the scores
returned as the new alpha/beta limits during research. Also, you can
use
the scores returned outside the window to make a guess as to which is
the
best move at this node, and then put it into the hash table. So this
method may give a few more cut-offs and may help move ordering slightly.

Only problem is that you may get very "interesting" PV lines with this
method. Most publications I have seen give PVS as:

int pvs(board pos,int alpha,int beta,int depth)
{
int value,score;

if(depth==0) {
evaluate();
}

score=pvs(pos(1),-beta,-alpha,depth-1);

if(score>alpha) {
alpha=score;

if(alpha>=beta) {
return alpha;
}

update_PV();
}

for(m=1;m<num_mvs;m++) {
value=pvs(pos(m),-alpha-1,-alpha,depth-1);

if(value>alpha && value<beta) {
value=pvs(pos(m),-beta,-value,depth-1);
}

if(value>score) {
score=value;

if(score>alpha) {
alpha=score;

if(alpha>=beta) {
return alpha;
}

update_PV();
}
}
}

return score;
}

What happens when the research returns a score lower than value but
bigger than alpha?

This can happen due to hash-table or null-move effects. Do you
update the PV in this case? You will only have a "partial line" since
the search cut-off some where below the current node.

The only way I see how to deal with the problem is to research with
-beta,-alpha - but this is throwing away the usefullness of fail-soft.

Anyone know how to deal with this problem?

Amir Ban

unread,
Jan 16, 1997, 3:00:00 AM1/16/97
to

Heiner Marxen (hei...@news.drb.insel.de) wrote:

[snipped due to stupid news server rules]

The idea is correct. I have been doing that in Junior for two years or so. The
improvement is nothing to be excited about, but it really costs nothing and is trivial
to implement, which doesn't happen so often for me.

Amir

Heiner Marxen

unread,
Jan 17, 1997, 3:00:00 AM1/17/97
to

In article <5bgult$v...@merlin.pn.org>,

Tom C. Kerrigan <kerr...@merlin.pn.org> wrote:
>This sounds a lot like something called a "fail soft" search, but I can't
>remember for the life of me how it's supposed to work. Anyone? :)

Yes, others identified this as the well known "fail soft", too. Via email
Mark Brockington told me it is known since the early 80s, at least.
I also located "http://www.xs4all.nl/~verhelst/chess/search.html", with
good, readable explanations for this and many other search techniques.

>Anyway, the reason I'm posting is because I just read your thing twice and
>I still don't know what you intend to do with the more accurate scores.
>How are they going to improve anything? (sorry if my brain simply
>censored this part... it does that for a lot of stuff...)

Unfortunately I were not clear enough :-(. I'll try again...

The more accurate scores are stored into the TT, and can be reused later.
The reusing instance may have different bounds, and may be able to gain
from the more accurate score, but not from the reduced score. E.g.:
Search(pos=P, depth=D, alpha=0.5, beta=0.6)
in one of the sub-searches finds a value=0.7 > 0.6 (=beta), which
produces a cut-off. With fail-soft we store 0.7 into the TT, otherwise
we would store 0.6 into the TT.

Now imagine that stored result to be found by a later call
Search(pos=P, depth=D, alpha=0.6, beta=0.7) [higher bounds!]
With the fail-soft TT-value 0.7 we now achive an immediate cut-off.
With the other method we would get 0.6 and not gain anything at all.

The more accurate scores may even help the caller to set a better
window for a re-search (after a fail-high).

Cheers,
Heiner

Reply all
Reply to author
Forward
0 new messages