Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

MTD(f) and the PV

34 views
Skip to first unread message

Adrian Jackson

unread,
Mar 16, 2001, 12:33:27 AM3/16/01
to
Is it possible to maintain a principal variation while using
MTD(f)? I haven't been able to yet.

Steve Maughan

unread,
Mar 16, 2001, 6:20:11 AM3/16/01
to
Only by probing the hash table

Steve

"Adrian Jackson" <n...@spam.com> wrote in message
news:o4a3btsen2df1dmf4...@4ax.com...

Robert Hyatt

unread,
Mar 16, 2001, 9:39:05 AM3/16/01
to
Thomas Wolf <t_w...@my-deja.com> wrote:

> That's not quite correct. Of course you can. Build it in your
> negamax searcher (the one MTD(f) calls to do its zero-window
> searches) as usual. Back in the MTD(f) driver, copy it! In the
> next round of MTD(f), the negamax searcher will build a new PV,
> and you'll copy it again (overwriting the previous one).

How are you going to do this. At every other ply you will fail low and
that doesn't give you a best move. Every MTD(f) search either fails high
at the root, or it fails low. If it fails high at the root, you get the
best move, but you don't get a move for ply=2 since you searched every
move there and each time the score came back <= alpha. If you fail low
at the root, you have no best move there for the same reason.

There is a tricky way to do this. As you 'zero' in on the right score,
eventually you will reach a leaf position where the evaluation code
produces a score that matches one side of the search window you are
using. You can notice this and store the path of moves that led to
this position...


> When you get out of MTD(f), you have the PV.

> Note that all but the first negamax search in MTD(f) may actually
> build truncated PVs due to hash table cutoffs.

I would think it would build a PV of length 0 or 1, depending on whether
the search at the root fails low or high.

> If you want to avoid
> that, do not simply copy the returned PV, but merge it with the one
> from the previous negamax search within MTD(f): if the new PV is
> a prefix of the one you already have, then don't overwrite it. Only
> overwrite the previous copy if the new one deviates. Example: assume
> the first negamax search within MTD(f) returns a PV "A-B-C-D-E-F".

How is this possible when you do every search with a window of X and
X+1? If you get a score <= X, you have no idea of which move was best
at the root, much less anywhere else in the tree. If you get a value
>= X+1, you know the root move that failed high, but you have no idea
which move is best at ply=2...

> The
> second negamax search is cut short by hash table hits and just returns
> "A-B". Since "A-B" is a prefix of "A-B-C-D-E-F", keep the old one. If
> you always copied, you'd lose the useful information about the
> continuation "C-D-E-F".

> --
> Dr. Thomas Wolf <t_w...@my-deja.com>

--
Robert Hyatt Computer and Information Sciences
hy...@cis.uab.edu University of Alabama at Birmingham
(205) 934-2213 115A Campbell Hall, UAB Station
(205) 934-5473 FAX Birmingham, AL 35294-1170

Thomas Wolf

unread,
Mar 16, 2001, 7:45:49 AM3/16/01
to
On Fri, 16 Mar 2001 11:20:11 -0000, "Steve Maughan"
<maughanNOS...@bigfoot.com> wrote:

That's not quite correct. Of course you can. Build it in your


negamax searcher (the one MTD(f) calls to do its zero-window
searches) as usual. Back in the MTD(f) driver, copy it! In the
next round of MTD(f), the negamax searcher will build a new PV,
and you'll copy it again (overwriting the previous one).

When you get out of MTD(f), you have the PV.

Note that all but the first negamax search in MTD(f) may actually

build truncated PVs due to hash table cutoffs. If you want to avoid


that, do not simply copy the returned PV, but merge it with the one
from the previous negamax search within MTD(f): if the new PV is
a prefix of the one you already have, then don't overwrite it. Only
overwrite the previous copy if the new one deviates. Example: assume

the first negamax search within MTD(f) returns a PV "A-B-C-D-E-F". The

Thomas Wolf

unread,
Mar 26, 2001, 4:06:52 AM3/26/01
to
On 16 Mar 2001 14:39:05 GMT, Robert Hyatt <hy...@crafty.cis.uab.edu>
wrote:

Now this had me puzzled -- Robert says, it can't work that way, but my
code does do it... But then I found out that indeed we're starting
from different ways of maintaining the PV in the first place. Sorry
about that, I overlooked it until now, even though Robert's comment on
my remark on using linked lists instead of a "triangular" array
(paraphrasing: "modifying the PV happens so rarely that this wouldn't
pay off") should already have told me that somehow I'm doing it
differently.

Anyway, re-reading the threads on PV maintenance, I notice now that
with "normal" negascout variants, you only modify the PV when the
current move is > alpha. In my MTD(f) PV maintenance, I do this each
time the current move's eval is > the previously known best move's
eval (MTD(f) using a failsafe zero-window search, i.e. "previously
known best move's eval" is always initialized to -infinity). This
has three consequences:

1. I need to modify the PV more often -- not only when eval > alpha,
but each time eval > best_eval_so_far. This is why the linked-list
implementation *is* significantly faster for my implementation.

2. Doing PV maintenance this way, I *do* get a useable PV at the end
of MTD(f). Basically, *during* MTD(f), my "PV" maintains a
"currently best" path through the game tree. And since MTD(f)
homes in to the optimum using repeated zero-window searches, I'll
have a true PV only in the end, when dropping out of the MTD(f)
driver.

3. Merging "PV"s as I described above is necessary because usually
the very last MTD(f) iteration is cut short by an early hash
table hit.

All this is similar to what Robert mentioned above: noting when a leaf
eval is within the (zero) window bounds, and then storing the path
that lead there. I do it the other way 'round: I store currently best
pathes until MTD(f) terminates, at which point the "currently best"
path is the PV. BTW, the pointer swap I have to do each time eval is
greater than best_eval_so_far doesn't seem to have any detrimental
effects on the performance. (But an array copy does!)

0 new messages