Tree search issues!

Magnus Heldestad

unread,

May 26, 1997, 3:00:00 AM5/26/97

to

--------------BCBAF66D96DDD161DC751887
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Could anyone point me to some info, or explain to me, how Null Move
heuristic and killer moves works? Isn't killer moves just a weaker form
of history heuristic?

Btw, how many out there uses ETC (Enhanced Transposition Cutoffs). The
reason I ask is that there was a couple of posts recently about move
ordering, and ETC is (in my opinion) ideally suited to place in the move
ordering (preferably as priority 1). My own experiment with ETC shows an
time advantage of about 15% when using it on all nodes except leaf nodes
(but NPS dropped some).

Here is a tricky question: according to what I have heard, MTD_f should
be faster than NegaScout+PVS. Is it possible to implement a version of
PVS in MTD_f, and if so, what should be stored as PV nodes? All nodes
that fail high? For a description of MTD_f, see:
http://theory.lcs.mit.edu/~plaat/mtdf.html

Have to say this: I think Kasparov made a real fool of himself against
DB. Partly because he didn't play very well, but mostly because he acted
like a really bad looser after the match!

Magnus H

--------------BCBAF66D96DDD161DC751887
Content-Type: text/html; charset=us-ascii
Content-Transfer-Encoding: 7bit

<HTML><BODY>
Could anyone point me to some info, or explain to me, how Null Move heuristic
and killer moves works? Isn't killer moves just a weaker form of history
heuristic?
 
 Btw, how many out there uses ETC (Enhanced Transposition Cutoffs). The
reason I ask is that there was a couple of posts recently about move ordering,
and ETC is (in my opinion) ideally suited to place in the move ordering
(preferably as priority 1). My own experiment with ETC shows an time advantage
of about 15% when using it on all nodes except leaf nodes (but NPS dropped
some).
 
 Here is a tricky question: according to what I have heard, MTD_f should
be faster than NegaScout+PVS. Is it possible to implement a version of
PVS in MTD_f, and if so, what should be stored as PV nodes? All nodes that
fail high?  For a description of MTD_f, see: <A HREF="http://theory.lcs.mit.edu/~plaat/mtdf.html">http://theory.lcs.mit.edu/~plaat/mtdf.html</A> 
 
 Have to say this: I think Kasparov made a real fool of himself against
DB. Partly because he didn't play very well, but mostly because he acted
like a really bad looser after the match!
 
    Magnus H

</BODY>
</HTML>

--------------BCBAF66D96DDD161DC751887--

Robert Hyatt

unread,

May 26, 1997, 3:00:00 AM5/26/97

to

Magnus Heldestad (m...@ausys.se) wrote:

: --------------BCBAF66D96DDD161DC751887

: Content-Type: text/plain; charset=us-ascii
: Content-Transfer-Encoding: 7bit

: Could anyone point me to some info, or explain to me, how Null Move
: heuristic and killer moves works? Isn't killer moves just a weaker form
: of history heuristic?

Yes, but it has one *big* advantage. The history heuristic uses the from/to
square to maintain history. The killer heuristic stores the actual move. This
is important, because you can try the killers first, right after the captures,
*before* you generate the non-captures. If you get a cutoff, you get the
benefit of a quick cutoff, without the overhead of a full move generation.

: Btw, how many out there uses ETC (Enhanced Transposition Cutoffs). The

: reason I ask is that there was a couple of posts recently about move
: ordering, and ETC is (in my opinion) ideally suited to place in the move
: ordering (preferably as priority 1). My own experiment with ETC shows an
: time advantage of about 15% when using it on all nodes except leaf nodes
: (but NPS dropped some).

Not sure what you are asking above. Can you provide more information, as I've
not run across "ETC" that I recall, at least not by that name...

: Here is a tricky question: according to what I have heard, MTD_f should

: be faster than NegaScout+PVS. Is it possible to implement a version of
: PVS in MTD_f, and if so, what should be stored as PV nodes? All nodes
: that fail high? For a description of MTD_f, see:
: http://theory.lcs.mit.edu/~plaat/mtdf.html

First, no. Nodes that fail high are not necessarily PV nodes. This is
one shortcoming of MTD as it is now called.

I've run mtd(f) tests quite recently... Here's a summary:

1. I tried a version of this in Cray Blitz, in 1980 or so, when we first started using
the null-window search. I used what Plaat now calls the binary version of mtd, which
tries a bisection approach. I used the traditional value passed from the last search
iteration, with the upper and lower bounds set to this score + a small amount, as I
do in Crafty. On a fail high, I'd try beta+upper_bound/2 as the new lower bound and
keep honing in, taking log2 passes to hone in...

2. I recently tried the MTD(f) algorithm in Crafty. And have since thrown it out. Once
again, the overhead is simply too much. It would probably work well, if your iteration-to-
iteration scores don't change much. The drawback is that you start searching with the
window last_score,last_score+1, and if it fails low, you decrease these by 1 and try again.
In the case of Crafty, where the eval is in millipawns, this simply didn't work, because to
win a pawn requires 1,000 researches... It might be better with centipawns, but I tried some
tests incrementing/decrementing by 10 and it didn't help.

I would postulate that the following make it non-viable for my case:

a. millipawn evaluation.

b. large positional scores that cause the eval to change significantly iteration by
iteration. This is *not* an even-odd situation either. If your eval produces scores that
have signficant fluctuations, MTD simply won't work. It might be quite good for a program
with a relatively simple evaluation, or one where the evaluation is somehow quite stable,
even though the pieces are moved around a lot. For me, it was typically 2x-3x slower than
normal.

Also, I found that the suggestion of using the iteration-2 score as a better starting
point for programs with significant score swings between iterations was not good enough.
The initial estimate has to be *very* close to work. The further off, the more work you
do. It's not hard at all to make it much slower.

I can provide the mtd driver for anyone interested, although there are a few other
issues that need changing. For example, you get no PV, so you know move "A" is better,
but not why, because you don't know what is expected at every other ply. I didn't like
this from a development/debugging point of view. Also crafty would produce many "!!"
type output lines because each MTD iteration fails high or low. I had to modify the
output code to forget the !! anouncement and also forget trying to produce a PV, since
a zero-width search can't return one.

: Have to say this: I think Kasparov made a real fool of himself against

Peter Kappler

unread,

May 26, 1997, 3:00:00 AM5/26/97

to

On Mon, 26 May 1997 17:16:27 +0200, Magnus Heldestad <m...@ausys.se>
wrote:

>Could anyone point me to some info, or explain to me, how Null Move
>heuristic and killer moves works? Isn't killer moves just a weaker form
>of history heuristic?

Yes, killer moves are a subset of the history heuristic. For a full
explanation of the history heuristic, see Bob Hyatt's post from May 23
("Move Ordering" thread).

>
>Btw, how many out there uses ETC (Enhanced Transposition Cutoffs). The
>reason I ask is that there was a couple of posts recently about move
>ordering, and ETC is (in my opinion) ideally suited to place in the move
>ordering (preferably as priority 1). My own experiment with ETC shows an
>time advantage of about 15% when using it on all nodes except leaf nodes
>(but NPS dropped some).

I use ETC, and 15% speedup sounds about right.

--Peter

Robert Hyatt

unread,

May 27, 1997, 3:00:00 AM5/27/97

to

Peter Kappler (pe...@bitsource.com) wrote:
: On Mon, 26 May 1997 17:16:27 +0200, Magnus Heldestad <m...@ausys.se>
: wrote:

: --Peter

What is this? The idea of "sneaking one ply ahead", seeing which current move
will produce a TT hit, and then trying that move first? (Suggested by Jonathan
Schaeffer as I recall?)

Magnus Heldestad

unread,

May 27, 1997, 3:00:00 AM5/27/97

to

> : >Btw, how many out there uses ETC (Enhanced Transposition
> Cutoffs). The
> : >reason I ask is that there was a couple of posts recently about
> move
> : >ordering, and ETC is (in my opinion) ideally suited to place in
> the move
> : >ordering (preferably as priority 1). My own experiment with ETC
> shows an
> : >time advantage of about 15% when using it on all nodes except
> leaf nodes
> : >(but NPS dropped some).
>
> : I use ETC, and 15% speedup sounds about right.
>
> : --Peter
>
> What is this? The idea of "sneaking one ply ahead", seeing which
> current move
> will produce a TT hit, and then trying that move first? (Suggested
> by Jonathan
> Schaeffer as I recall?)

In the beginning of AB, do the following

for all childs c of n
if c is a MAX node and lower >= beta
return lower
if c is a MIN node and upper <= alpha
return upper

This is for MiniMax version of AB. The NegaMax is simpler, replace the
first if with
if lower >= beta
return lower

This will move subtrees that produces cutoffs from the right to the left
part of the tree.

Magnus H

Peter Kappler

unread,

May 27, 1997, 3:00:00 AM5/27/97

to

On 27 May 1997 01:12:22 GMT, hy...@crafty.cis.uab.edu (Robert Hyatt)
wrote:

>Peter Kappler (pe...@bitsource.com) wrote:
>
>: I use ETC, and 15% speedup sounds about right.
>
>: --Peter
>
>
>What is this? The idea of "sneaking one ply ahead", seeing which current move
>will produce a TT hit, and then trying that move first? (Suggested by Jonathan
>Schaeffer as I recall?)
>

Here's how I do it (and yes, I believe I read about it in an old
article by Schaeffer):

If I reach a node and can't return a score from the TT, I "sneak one
ply ahead" for each move in the movelist, and check to see if any of
those child positions are in the TT with a score that will cause a
cutoff back at the parent node.

--Peter

Plaat A

unread,

May 27, 1997, 3:00:00 AM5/27/97

to

Concerning ETC:

Robert Hyatt (hy...@crafty.cis.uab.edu) wrote:
: Magnus Heldestad (m...@ausys.se) wrote:

: : Btw, how many out there uses ETC (Enhanced Transposition Cutoffs). The

: : reason I ask is that there was a couple of posts recently about move
: : ordering, and ETC is (in my opinion) ideally suited to place in the move
: : ordering (preferably as priority 1). My own experiment with ETC shows an
: : time advantage of about 15% when using it on all nodes except leaf nodes
: : (but NPS dropped some).

: Not sure what you are asking above. Can you provide more information, as I've
: not run across "ETC" that I recall, at least not by that name...

ETC stands for "Enhanced Transposition Cutoffs", and is described in the
following paper:
Aske Plaat, Jonathan Schaeffer, Wim Pijls, and Arie de Bruin:
Exploiting Graph Properties of Game Trees
Proceedings of the 13th National Conference on Artificial Intelligence (AAAI'96),
August 4-8, 1996, Portland, Oregon. AAAI/MIT Press, Volume 1, page 234-239.

It's only 6 pages, and gives pseudo code of the idea. You can download it
as a postscript file from http://theory.lcs.mit.edu/~plaat/minimax.html

Basically, the idea is to first do a lookup in the TT, for each move that you
are about to make in a position, and see if it hits and causes a cutoff.
This is done *before* the normal recursive search call.

This way, you make better use of existing TT info, at the cost of (possibly)
more move generaions, and more TT-lookups.
It typically works best if you find a trade-off by doing ETC in the top part
of the tree, say, up until two ply above the leaves.

Now on to MTD(f):

: : Here is a tricky question: according to what I have heard, MTD_f should

: : be faster than NegaScout+PVS. Is it possible to implement a version of
: : PVS in MTD_f, and if so, what should be stored as PV nodes? All nodes
: : that fail high? For a description of MTD_f, see:
: : http://theory.lcs.mit.edu/~plaat/mtdf.html

: First, no. Nodes that fail high are not necessarily PV nodes. This is
: one shortcoming of MTD as it is now called.

: I've run mtd(f) tests quite recently... Here's a summary:

[...]

: I would postulate that the following make it non-viable for my case:

: a. millipawn evaluation.

: b. large positional scores that cause the eval to change significantly iteration by
: iteration.

[...]

I've snipped most of Bob's text, in the interest of brevity, although most
of Bob's message describes results that are mostly consistent with our
own experiments, and interesting, so do go back to his original posting
to read it.

In our work we found Bob's points a. and b. to be entirely correct. If you have
a very fine grained eval, than you will do many re-searches, each one a
null-window call with a slightly better bound. Point b. concerns big swings
in the socre from iteration to iteration. Since MTD(f) uses the score
of the previous iteration as its seed, a bad seed does many re-searches,
causes a long 'convergence-process'.

Now in principle doing all those re-searches is not a problem, PROVIDED that
the search fits in your transposition table, and generating moves and doing
TT-lookups is reasonably fast. Also, it helps to store the leaf-eval values
in the TT, to prevent you from re-doing the eval at all those re-searches.
It turns out that for tournament play, 3 minutes per move, the search tree
of a typical program fits easily in the TT.
(Of course, you need a TT replacement scheme that works well.)

Even Cilkchess (the successor to *Socrates) which is a very fast program
that generates lots of nodes, uses MTD(f) without problems, the tree does fit
in the TT.

But still, Bob's point is valid: although a re-search is (or should be)
cheap, too many of them can kill you.
To find out more about our experience have a look at
http://theory.lcs.mit.edu/~plaat/mtdf.html, and the papers that you can find
there, since there's much more interesting stuff going on than just these
two points. To make a new algorithm work in a program that
has been tuned extensively, is a non-trivial task. Do not expect it to
work efficiently immediately. Chess trees are highly irregular,
and small changes can have big effects. Prepare to spend some time
re-tuning your program (assuming you decide it's worth the effort).
The easiest thing to decide between NegaScout and MTD(f) is probably
in a new, not yet so highly tuned, program.

Also, there are many other approaches you can take to using null-window
searches from the root. As Bob mentions, MTD(f) is just one of them; there
are many variations possible, some of which may work better for you. And
if you find a new one, or some new insights, I hope that you'll share
your insights so we all can benefit.

Incidentally, Bob's remarks that you don't know during
the search whether a node is a PV node is also correct, which may be a problem
for some programs, although I haven't found that for the programs I've worked
at.

The real advantage of MTD(f) for Cilkchess is its simplicity. Cilkchess
is a parallel program, and programming a parallel search algorithm is quite
complicated. MTD(f) does null-window searches *only*. There's no re-search
going on *during* the search. All re-searches are done at the root. This has
made for a tremendous simplification of our code, compared to NegaScout.
So, who cares if it's a few tens of percent more efficient (as it was in
our case), not having to spend hours debugging a parallel search algorithm is
worth much more. (I should also point out that we used Cilk, a C-based parallel
language which is very well suited for programming chess. It's freely
available at http://theory.lcs.mit.edu/~cilk/)

Aske Plaat

Robert Hyatt

unread,

May 27, 1997, 3:00:00 AM5/27/97

to

Peter Kappler (pe...@bitsource.com) wrote:
: On 27 May 1997 01:12:22 GMT, hy...@crafty.cis.uab.edu (Robert Hyatt)
: wrote:

: --Peter

OK. thought this was something I hadn't seen. At least I can now refer
to it by its offical name, ETC. Although that sounds a tad ugly, just like
that Cadillac "ETC" that got a lot of Caddy owners pissed at GM. :)

Hey buddy, you got onena 'em etcetera's? :)

Robert Hyatt

unread,

May 27, 1997, 3:00:00 AM5/27/97

to

Plaat A (as...@cs.vu.nl) wrote:
: Concerning ETC:

: Robert Hyatt (hy...@crafty.cis.uab.edu) wrote:
: : Magnus Heldestad (m...@ausys.se) wrote:

: : : Btw, how many out there uses ETC (Enhanced Transposition Cutoffs). The
: : : reason I ask is that there was a couple of posts recently about move
: : : ordering, and ETC is (in my opinion) ideally suited to place in the move
: : : ordering (preferably as priority 1). My own experiment with ETC shows an
: : : time advantage of about 15% when using it on all nodes except leaf nodes
: : : (but NPS dropped some).

: : Not sure what you are asking above. Can you provide more information, as I've
: : not run across "ETC" that I recall, at least not by that name...

: ETC stands for "Enhanced Transposition Cutoffs", and is described in the
: following paper:
: Aske Plaat, Jonathan Schaeffer, Wim Pijls, and Arie de Bruin:
: Exploiting Graph Properties of Game Trees
: Proceedings of the 13th National Conference on Artificial Intelligence (AAAI'96),
: August 4-8, 1996, Portland, Oregon. AAAI/MIT Press, Volume 1, page 234-239.

I'd already played with this Jonathan suggested I try this sometime last
year... I just had not heard it called "ETC"...

Bob

Ronald de Man

unread,

May 27, 1997, 3:00:00 AM5/27/97

to

hy...@crafty.cis.uab.edu (Robert Hyatt) writes:

>Magnus Heldestad (m...@ausys.se) wrote:

>: Btw, how many out there uses ETC (Enhanced Transposition Cutoffs). The
>: reason I ask is that there was a couple of posts recently about move
>: ordering, and ETC is (in my opinion) ideally suited to place in the move
>: ordering (preferably as priority 1). My own experiment with ETC shows an
>: time advantage of about 15% when using it on all nodes except leaf nodes
>: (but NPS dropped some).

>Not sure what you are asking above. Can you provide more information, as I've
>not run across "ETC" that I recall, at least not by that name...

Yes, you have. I remember reading a post of you reporting you tried it,
didn't like it but saved the code for possible later use.
ETC means you first generate all moves, and see if one of thems leads
to a hash hit causing an immediate cut off, before you search any of them.

I have tried this too and didn't like it either, but that also had to do with
the way my makemove works at the moment. So maybe I'll try it again some time.

>: Here is a tricky question: according to what I have heard, MTD_f should
>: be faster than NegaScout+PVS. Is it possible to implement a version of
>: PVS in MTD_f, and if so, what should be stored as PV nodes? All nodes
>: that fail high? For a description of MTD_f, see:
>: http://theory.lcs.mit.edu/~plaat/mtdf.html

>First, no. Nodes that fail high are not necessarily PV nodes. This is
>one shortcoming of MTD as it is now called.

>I've run mtd(f) tests quite recently... Here's a summary:

>2. I recently tried the MTD(f) algorithm in Crafty. And have since thrown it out. Once

>again, the overhead is simply too much. It would probably work well, if your iteration-to-
>iteration scores don't change much. The drawback is that you start searching with the
>window last_score,last_score+1, and if it fails low, you decrease these by 1 and try again.
>In the case of Crafty, where the eval is in millipawns, this simply didn't work, because to
>win a pawn requires 1,000 researches... It might be better with centipawns, but I tried some
>tests incrementing/decrementing by 10 and it didn't help.

I have never tried MTD(f) myself, so I don't really know what I'm talking
about. But what I think is that even using millipawns the idea of MTD(f)
is not that you have to do 1000 researches when winning or losing a pawn.
I guess MTD(f) assumes a fail-soft implementation of alpha/beta, so that
there will be jumps greater than 1.
I'm a bit sceptical about MTD(f), because the paper that I read on it
stresses the fact that with this method the number of leaf nodes is
minimized. But what I'm interested in is the total number of nodes.
Again, I might have things wrong here.

>I can provide the mtd driver for anyone interested, although there are a few other
>issues that need changing. For example, you get no PV, so you know move "A" is better,
>but not why, because you don't know what is expected at every other ply. I didn't like
>this from a development/debugging point of view.

I think with a little hacking you can still produce a PV. I have a trick
in my PVS search that probably apply here too. Anyway, needs some thinking.

Ronald de Man

Robert Hyatt

unread,

May 27, 1997, 3:00:00 AM5/27/97

to

Ronald de Man (de...@wsdw10.win.tue.nl) wrote:
: hy...@crafty.cis.uab.edu (Robert Hyatt) writes:

: >Magnus Heldestad (m...@ausys.se) wrote:

: >: Btw, how many out there uses ETC (Enhanced Transposition Cutoffs). The

: >: reason I ask is that there was a couple of posts recently about move
: >: ordering, and ETC is (in my opinion) ideally suited to place in the move
: >: ordering (preferably as priority 1). My own experiment with ETC shows an
: >: time advantage of about 15% when using it on all nodes except leaf nodes
: >: (but NPS dropped some).

: >Not sure what you are asking above. Can you provide more information, as I've
: >not run across "ETC" that I recall, at least not by that name...

: Yes, you have. I remember reading a post of you reporting you tried it,

: didn't like it but saved the code for possible later use.
: ETC means you first generate all moves, and see if one of thems leads
: to a hash hit causing an immediate cut off, before you search any of them.

: I have tried this too and didn't like it either, but that also had to do with
: the way my makemove works at the moment. So maybe I'll try it again some time.

I figured this out yesterday. :)

I didn't try an "efficient" implementation however, just an implementation.
I will try it again, but I'd to a very simple "makemove" that *only*
computes the hash signature. In my test, I used the real MakeMove which
is much more expensive than what is needed.

However, I didn't get much in the way of a speedup, if any. But Crafty
has changed dramatically over the past year+, so it might work.

A word to the wise: If you try something that seems reasonable, but it
doesn't work, make a note. And try it again later. If you look at main.c,
in Crafty, you'll notice that I tried null-move, R=2 at least two times and
tossed it out both times, yet the third try was the charm. You really can't
do R=2 at 4-5 plies deep, because it will obscure *everything*.. Now that
Crafty goes 8-9 plies deep in blitz, it works very well...

The moral: what fails today might work tomorrow, after other changes are
given a chance to do their thing...

: >: Here is a tricky question: according to what I have heard, MTD_f should

: >: be faster than NegaScout+PVS. Is it possible to implement a version of
: >: PVS in MTD_f, and if so, what should be stored as PV nodes? All nodes
: >: that fail high? For a description of MTD_f, see:
: >: http://theory.lcs.mit.edu/~plaat/mtdf.html

: >First, no. Nodes that fail high are not necessarily PV nodes. This is
: >one shortcoming of MTD as it is now called.

: >I've run mtd(f) tests quite recently... Here's a summary:

: >2. I recently tried the MTD(f) algorithm in Crafty. And have since thrown it out. Once

: >again, the overhead is simply too much. It would probably work well, if your iteration-to-
: >iteration scores don't change much. The drawback is that you start searching with the
: >window last_score,last_score+1, and if it fails low, you decrease these by 1 and try again.
: >In the case of Crafty, where the eval is in millipawns, this simply didn't work, because to
: >win a pawn requires 1,000 researches... It might be better with centipawns, but I tried some
: >tests incrementing/decrementing by 10 and it didn't help.

: I have never tried MTD(f) myself, so I don't really know what I'm talking

: about. But what I think is that even using millipawns the idea of MTD(f)
: is not that you have to do 1000 researches when winning or losing a pawn.
: I guess MTD(f) assumes a fail-soft implementation of alpha/beta, so that
: there will be jumps greater than 1.
: I'm a bit sceptical about MTD(f), because the paper that I read on it
: stresses the fact that with this method the number of leaf nodes is
: minimized. But what I'm interested in is the total number of nodes.
: Again, I might have things wrong here.

: >I can provide the mtd driver for anyone interested, although there are a few other

: >issues that need changing. For example, you get no PV, so you know move "A" is better,
: >but not why, because you don't know what is expected at every other ply. I didn't like
: >this from a development/debugging point of view.

: I think with a little hacking you can still produce a PV. I have a trick

Magnus Heldestad

unread,

May 27, 1997, 3:00:00 AM5/27/97

to

Plaat A wrote:

> Now on to MTD(f):
>
> : : Here is a tricky question: according to what I have heard, MTD_f
> should
> : : be faster than NegaScout+PVS. Is it possible to implement a
> version of
> : : PVS in MTD_f, and if so, what should be stored as PV nodes? All
> nodes
> : : that fail high? For a description of MTD_f, see:
> : : http://theory.lcs.mit.edu/~plaat/mtdf.html
>

> Incidentally, Bob's remarks that you don't know during
> the search whether a node is a PV node is also correct, which may be
> a problem
> for some programs, although I haven't found that for the programs
> I've worked
> at.
>

> Aske Plaat

So, you can't store a PV node when using MTD_f. Is there any gain in
storing all moves that fail high, and try them first when you get a hit
from TT?

Or to make it general, how do you use the TT in MTD_f in the best way
(except for the normal store/retrieve)?

Magnus H

Robert Hyatt

unread,

May 27, 1997, 3:00:00 AM5/27/97

to

Magnus Heldestad (m...@ausys.se) wrote:
: Plaat A wrote:

: > Now on to MTD(f):

: >
: > : : Here is a tricky question: according to what I have heard, MTD_f
: > should
: > : : be faster than NegaScout+PVS. Is it possible to implement a
: > version of
: > : : PVS in MTD_f, and if so, what should be stored as PV nodes? All
: > nodes
: > : : that fail high? For a description of MTD_f, see:
: > : : http://theory.lcs.mit.edu/~plaat/mtdf.html
: >

: > Incidentally, Bob's remarks that you don't know during

: > the search whether a node is a PV node is also correct, which may be
: > a problem
: > for some programs, although I haven't found that for the programs
: > I've worked
: > at.

: >
: > Aske Plaat

: So, you can't store a PV node when using MTD_f. Is there any gain in
: storing all moves that fail high, and try them first when you get a hit
: from TT?

Actually, the concept of "PV" doesn't even exist... A PV node is
simply a node whose value V can be defined by alpha < V < beta. There
is no such V for MTD(f), since alpha=beta-1, there's no "room" in there
for one.

You should certainly try this, as the zero-window searches rip like the
blazes. Just don't get carried away with large positional scores or it
will be a total loser for you like it was in Crafty. If your positional
score swings only 100 units, that is 100 zero-width searches you will have
to do to refine the value sufficiently. That is more expensive than a
search over the interval alpha,alpha+200, which would only require one
iteration...

: Or to make it general, how do you use the TT in MTD_f in the best way

: (except for the normal store/retrieve)?

: Magnus H

You have to have one, or MTD(f) won't work. It depends on some of the
move ordering and so forth to make it efficient. However, I didn't find
this nearly so effective in the true MTD(f) implementation, because if
you search a position with N,N+1 bounds, and you fail low, you then
search with N-1,N bounds, and none of the old fail-hi/fail-low entries
will be useful on one edge of the search. Obviously anything that previously
failed high on N+1 would also fail high on N, but the other edge of the
graph gets no help...

Magnus Heldestad

unread,

May 28, 1997, 3:00:00 AM5/28/97

to

Robert Hyatt wrote:

> [ ... stuff deleted]

> I didn't try an "efficient" implementation however, just an
> implementation.
> I will try it again, but I'd to a very simple "makemove" that *only*
>
> computes the hash signature. In my test, I used the real MakeMove
> which
> is much more expensive than what is needed.
>
> However, I didn't get much in the way of a speedup, if any. But
> Crafty
> has changed dramatically over the past year+, so it might work.

In my implementation, I have placed ETC in the move ordering. Currently
I am sorting every node except leaf nodes, and an hit in TT with lower
>= beta always gets placed first, and then I exit the move ordering,
since there is no use sorting anymore (remember, the node with lower >=
beta will immediately produce a cutoff, so no other moves will be
tried).

Also note, that an hit with lower < beta is still better than a node
with no hit at all, when it comes to sorting nodes. I am giving a hit on
a node in the move ordering about the same value as a pawn exchange.

With this approach, I gain about 15% time on ply 6 (a bit more on ply
7).

It would be interesting to hear from Bob H if this gives any speedup in
Crafty!

Magnus H

Magnus Heldestad

unread,

May 28, 1997, 3:00:00 AM5/28/97

to

> : So, you can't store a PV node when using MTD_f. Is there any gain
> in
> : storing all moves that fail high, and try them first when you get
> a hit
> : from TT?
>
> Actually, the concept of "PV" doesn't even exist... A PV node is
> simply a node whose value V can be defined by alpha < V < beta.
> There
> is no such V for MTD(f), since alpha=beta-1, there's no "room" in
> there
> for one.
>
> You should certainly try this, as the zero-window searches rip like
> the
> blazes. Just don't get carried away with large positional scores or
> it
> will be a total loser for you like it was in Crafty. If your
> positional
> score swings only 100 units, that is 100 zero-width searches you
> will have
> to do to refine the value sufficiently. That is more expensive than
> a
> search over the interval alpha,alpha+200, which would only require
> one
> iteration...

In my test positions, on depth 6, AB(beta-1, beta) gets called around
5-15 times most of the times, with some positions up to 50 times, but
NEVER over 100. The trick is that, for MTD_f to work, you have to use a
fail soft version of AB. In several positions, AB only gets called 2
times, which is as good as it can get (ie, you prove the lower bound,
and then you prove the upper bound).

However, if you use a version of AB(alpha, beta) which never returns a
value lower than alpha, or higher than beta, MTD_f will be very
inefficient!

The problem with MTD_f is that many enhancements gets more complicated
or impossible (Null moves, PVS search, ...)

Magnus H

Dennis Breuker

unread,

May 28, 1997, 3:00:00 AM5/28/97

to

Robert Hyatt wrote:
>
> Peter Kappler (pe...@bitsource.com) wrote:

> : On Mon, 26 May 1997 17:16:27 +0200, Magnus Heldestad <m...@ausys.se>
> : wrote:
>
> : >Could anyone point me to some info, or explain to me, how Null Move
> : >heuristic and killer moves works? Isn't killer moves just a weaker form
> : >of history heuristic?
>
> : Yes, killer moves are a subset of the history heuristic. For a full
> : explanation of the history heuristic, see Bob Hyatt's post from May 23
> : ("Move Ordering" thread).
>
> : >

> : >Btw, how many out there uses ETC (Enhanced Transposition Cutoffs). The
> : >reason I ask is that there was a couple of posts recently about move
> : >ordering, and ETC is (in my opinion) ideally suited to place in the move
> : >ordering (preferably as priority 1). My own experiment with ETC shows an
> : >time advantage of about 15% when using it on all nodes except leaf nodes
> : >(but NPS dropped some).
>

> : I use ETC, and 15% speedup sounds about right.
>
> : --Peter
>
> What is this? The idea of "sneaking one ply ahead", seeing which current move
> will produce a TT hit, and then trying that move first? (Suggested by Jonathan
> Schaeffer as I recall?)

Yes, it is. From Plaat's thesis (p.93):

``Before doing any search at an interior node, a quick check of all the
positions arising from this node in the transposition table may result
in
finding a cutoff. The technique to achieve Enhances Transposition
Cutoffs,
ETC, performs transposition table lookups on successors of a node,
looking
for transpositions into previously searched lines. [...] The reduction
in
search tree size offered by ETC is, in part, offset by the increased
computation per node. For chess and checkers, it appears that performing
ETC at all interior nodes is not optimal. A compromise, performing ETC
at all interior nodes that are more than 2 ply away from the leaves,
results
in most of the ETC benefits with only a small computational overhead,
making
it well suited for for use in both on-line and off-line algorithms.
Thus,
ETC is a practical enhancement to most Alpha-Beta search programs.''

Dennis.

Robert Hyatt

unread,

May 28, 1997, 3:00:00 AM5/28/97

to

Magnus Heldestad (m...@ausys.se) wrote:
: Robert Hyatt wrote:

: Magnus H

I'm going to run this again, perhaps today if I have time. I'm also
curious. When I tried it a year ago or more, it was a "break even" deal.
(I only counted tree size because you really only need to update the
hash signature to probe the hash table, but I used the full MakeMove
because it was easier for a simple test.) My test was done like this,
however:

I generated the hash key, probed the *oppsite* hash table, and if I
found a fail low that would *really* fail low, I simply returned (beta)
at this node because this would be a fail high. I didn't even use the
search to do this as it saved a call. I did not check to see if there
was a "fail low" that could not fail low due to lack of hash depth, and
still try this move first, which I should have and which I will this
time...

Valavan Manohararajah

unread,

May 28, 1997, 3:00:00 AM5/28/97

to

Robert Hyatt wrote:
> I'm going to run this again, perhaps today if I have time. I'm also
> curious. When I tried it a year ago or more, it was a "break even" deal.
> (I only counted tree size because you really only need to update the
> hash signature to probe the hash table, but I used the full MakeMove
> because it was easier for a simple test.) My test was done like this,
> however:
>
> I generated the hash key, probed the *oppsite* hash table, and if I
> found a fail low that would *really* fail low, I simply returned (beta)
> at this node because this would be a fail high. I didn't even use the
> search to do this as it saved a call. I did not check to see if there
> was a "fail low" that could not fail low due to lack of hash depth, and
> still try this move first, which I should have and which I will this
> time...

I think you should also perform any extensions you would usually do
before into peeking at next level's the hash table... when you factor
in the make/unmake and search extensions, this will start looking almost
like a real call to search().

Jonathan Schaeffer

unread,

May 28, 1997, 3:00:00 AM5/28/97

to

hy...@crafty.cis.uab.edu (Robert Hyatt) writes:
>I'm going to run this again, perhaps today if I have time. I'm also
>curious. When I tried it a year ago or more, it was a "break even" deal.
>(I only counted tree size because you really only need to update the
>hash signature to probe the hash table, but I used the full MakeMove
>because it was easier for a simple test.) My test was done like this,
>however:

There are two things to consider:
1) Tree size. It is the obvious metric to measure the benefits.
2) Quality of the search result.

The latter is very important. With search extensions, it is possible
that the benefits of ETC are not seen. If you want to use tree size
as a metric, you should use fixed depth searches. One thing that
we observed is that when you use ETC, you are "encouraging" more
transpositions, resulting in some nodes being searched deeper than
they normally would. This resulted in (occasionally) getting a better
result backed up to the root of the search tree.

The Deep Blue people tried ETC a year or so ago and reported that ETC
did not reduce the number of nodes in their (variable depth) searches,
but it did improve the number of problems they solved from their test
suites.

Robert Hyatt

unread,

May 29, 1997, 3:00:00 AM5/29/97

to

Jonathan Schaeffer (jona...@cs.ualberta.ca) wrote:

Good note. I'm still working on implementing a reasonable version for
testing. My generate in spurts is a minor sticking point I'm working
on...