Rollouts for double/no-double when JF misevaluates equity

Philippe Michel

Jun 11, 1998

Suppose one wants to check whether X has a double in the following
position:

+24-23-22-21-20-19-+---+18-17-16-15-14-13-+
| X O | | O X |
| X O | | O X |
| O | | O X |
| O | | O X |
| O | | X |
| | | |64
| | | |
| | | |
| | | O |
| | | O |
| O X X X | | X O |
| O X X X | | X O |
+-1--2--3--4--5--6-+---+-7--8--9-10-11-12-+

X on roll

12960 JellyFish level 5 rollouts give:

centered cube: equity = 0.68
O holds a 2 cube: equity = 0.37*2 = 0.74

This looks like a double, but according to the rollouts, X's cubeless
equity is 0.48, whereas the level 5 evaluation gives only 0.37.

So, is it possible that the rollouts underestimate the value of cube
access for X, since he is really only 0.07 away from a cash (0.55 - 0.48)
while JF evaluates him as 0.18 away (0.55 - 0.37)?

To try to correct for this, I ran a rollout with a cash point of 0.44
instead of 0.55 (and the same seed as before). It gives a centered cube
equity of 0.79, so it would not be a double (0.79 vs. 2*0.37 = 0.74; the
equity with O holding the cube should be the one with the normal cash
point, since he won't be able to redouble before the position has changed
substantially).
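
The comparison being made here can be written out as a small sketch. It only restates the arithmetic of the post; the function name and its parameters are hypothetical, and the equities are the rollout figures quoted above, not new results.

```python
def should_double(centered_equity, owned_equity_per_unit, cube_value=2):
    """Doubling is right (per the rollouts) if the equity after double/take
    exceeds the equity with the cube left in the middle.

    owned_equity_per_unit is the equity per unit stake with the opponent
    holding the cube; it is multiplied by the new cube value.
    """
    doubled_equity = cube_value * owned_equity_per_unit
    return doubled_equity > centered_equity


# Figures quoted in the post (JellyFish level 5 rollouts).
with_normal_cash_point = should_double(0.68, 0.37)   # 0.74 vs. 0.68
with_lowered_cash_point = should_double(0.79, 0.37)  # 0.74 vs. 0.79

print(with_normal_cash_point, with_lowered_cash_point)  # True False
```

The verdict flips only because the centered-cube equity changes with the cash point; the cube-owned figure is the same in both calls.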

Should one really take these errors in JF's static evaluation into
account in double/no-double rollouts?
What would be the appropriate cash point? 0.55? 0.55 + L5 evaluation
- rollouts' cubeless equity? Something in between?
Should one check whether the evaluation is more or less wrong for the
market losers vs. the more sterile rolls?

Chuck Bower

Jun 12, 1998

In article <6lp1lh$em9$1...@syseca.syseca.fr>,
Philippe Michel <mic...@syseca.fr> wrote:

I'm confused by this post. I don't see a problem with the rollouts.
BTW, my JFv3.0 gives evaluations at levels 5, 6, and 7 around 0.30, not 0.37.
The evaluation appears to be wrong, since it says 0.30 cubeless equity
while the rollout says 0.48. Always believe a rollout over an evaluation,
unless there is strong, convincing evidence to the contrary.
The rollout says 0.68 with cube in the middle and (2*0.37=) 0.74 with
the cube in the non-roller's possession. If you believe the rollout,
then since 0.74 is greater than 0.68, X has an initial double. YOU
CANNOT compare cubeless equity with cubeful centered or cubeful one
side owned without some kind of serious adjustment. The level-6 cubeless
(which I ran and received an answer in the low 0.5's) is consistent with
the level-5 cubeless as well as the level-5 limited cube rollouts. Double,
take. It is the level-5 (and higher, in this case) evaluations which
are suspect.

Chuck
bo...@bigbang.astro.indiana.edu
c_ray on FIBS

Philippe Michel

Jun 12, 1998

In article <6lq382$lp3$1...@flotsam.uits.indiana.edu>,
Chuck Bower <bo...@bigbang.astro.indiana.edu> wrote:
>[...]

>then since 0.74 is greater than 0.68, X has an initial double. YOU
>CANNOT compare cubeless equity with cubeful centered or cubeful one
>side owned without some kind of serious adjustment. The level-6 cubeless
>(which I ran and received an answer in the low 0.5's) is consistent with
>the level-5 cubeless as well as the level-5 limited cube rollouts. Double,
>take. It is the level-5 (and higher, in this case) evaluations which
>are suspect.

That is right, but my point was that a wrong level 5 evaluation could
cause errors in the estimation of the value of having access to the cube.

Take this overly simplified case:
The evaluation of the current position is 0.4, and the equity variations
over the next exchange of rolls are:
0.0 50% of the time
+0.1 25% of the time
-0.1 25% of the time

So cubeless equity = live cube equity = 0.4 and cube access is worth nothing
(no sequence reaches the 0.55 cash point).

Now suppose that this evaluation is off by 0.1. We may suppose that the
next-roll evaluations are off by about the same amount (because the
position is not too volatile, for instance).

We then have:
cubeless equity = 0.5
live cube equity = 0.5*50% + 0.4*25% + 1.0*25% = 0.6; cube access is worth 0.1
(the +0.1 sequence now reaches 0.6, past the cash point, and cashes for 1.0).

So, because static evaluation underestimates cubeless equity, it seems
that the rollouts would underestimate the favorite's equity when he has
access to the cube vs. when he doesn't.
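
The simplified case above can be checked with a few lines of code. This is only a sketch of the toy model as stated: one exchange of rolls, a cash worth 1.0 whenever the resulting equity reaches the 0.55 cash point, and no other cube action. The function and variable names are made up for the illustration.

```python
CASH_POINT = 0.55  # cash point used in the rollouts discussed in this thread

# (equity change over the next exchange, probability), from the toy example
SWINGS = [(0.0, 0.50), (+0.1, 0.25), (-0.1, 0.25)]


def cubeless_equity(base_equity, swings=SWINGS):
    """Plain expectation of the equity after one exchange."""
    return sum(prob * (base_equity + delta) for delta, prob in swings)


def live_cube_equity(base_equity, swings=SWINGS, cash_point=CASH_POINT):
    """Same expectation, but the leader cashes (1.0) at or above the cash point."""
    total = 0.0
    for delta, prob in swings:
        equity = base_equity + delta
        total += prob * (1.0 if equity >= cash_point else equity)
    return total


for label, base in (("as evaluated", 0.4), ("off by 0.1", 0.5)):
    cubeless = cubeless_equity(base)
    live = live_cube_equity(base)
    print(f"{label}: cubeless={cubeless:.2f} live={live:.2f} "
          f"cube access={live - cubeless:.2f}")
```

With a base equity of 0.4 no sequence reaches the cash point, so cube access is worth 0.00; with 0.5 the favourable sequence cashes and cube access is worth 0.10, matching the figures in the post.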

I would be interested to see the results of rollouts by JellyFish and
Snowie in the following circumstances:
- position where both programs play the checkers accurately
- marginal double/no double
- one of the programs' static evaluation is wrong compared to the rollouts
(which ought to give identical cubeless results for both programs),
whereas the other's isn't.
My guess is that if a program's evaluation underestimates cubeless
equity, its rollouts will tend towards doubling too loosely.

Michael J. Zehr

Jun 12, 1998

The double/no-double question is one of the hardest ones to answer in
backgammon today. This problem is a subset of the class of problems in
which you have a decision to make this turn, and if you choose not to
take a particular action, you have the same decision to make next turn.
(Running off an anchor is another such category of problems.)

If you do your analysis by hand then you can try different strategies
(don't double until contact is broken, or don't run off the anchor until
either forced or a double is rolled) and compare them. However the best
strategy is likely to be between the two extremes.

When using a computer to do the analysis it gets tricky for exactly the
reasons that Philippe mentions -- you can set up a position in which
you've forced the computer to make a particular decision on this turn,
but you can't easily force the computer to make the decision you're
trying to analyze on the second turn.

This is how I would analyze such a position using Jellyfish:

I always like to start with a level 6 rollout. There's just too big of
a difference between level 5 and level 6 for me to want to start with
just a level 5 rollout. The number of games to pick depends on how
patient you are and how fast your computer is. <grin> It also depends
on how close the decision is. A 1296-game rollout usually gives you a
standard deviation of about .01 in equity, equivalent to about .5% in
winning chances.

With the rollout equity you can sometimes answer the double/no-double
question without doing much more analysis. A 216-game level 6 rollout
(seed 1) gives an equity of .508 with a standard deviation of .023. This
is quite a bit better than the level 7 evaluation of .400. Level 7
evaluation also gives a volatility of .164. If the equity plus the
volatility is over a certain amount (in the .60 range) then it's almost
certainly a double.
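
As a rough check on the numbers in the last few paragraphs, here is a small sketch. It assumes the usual rule that a rollout's standard deviation shrinks with the square root of the number of games, and it takes exactly .60 as the threshold for the "equity plus volatility" rule of thumb, which the text only gives as "in the .60 range". The helper names are invented for the example.

```python
import math


def scaled_std_dev(known_sd, known_games, target_games):
    """Estimate the rollout standard deviation at another game count,
    assuming it scales as 1/sqrt(number of games)."""
    return known_sd * math.sqrt(known_games / target_games)


# 216 games gave .023; what would 1296 games give under that assumption?
sd_1296 = scaled_std_dev(0.023, 216, 1296)
print(round(sd_1296, 4))  # 0.0094, in line with the "about .01" quoted above

DOUBLE_THRESHOLD = 0.60  # assumption: the text only says "in the .60 range"


def looks_like_double(equity, volatility, threshold=DOUBLE_THRESHOLD):
    """Rule of thumb from the post: equity plus volatility above the threshold."""
    return equity + volatility > threshold


# Level 6 rollout equity with the level 7 volatility figure.
print(looks_like_double(0.508, 0.164))  # True (.672)
```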

If a close answer were acceptable, I'd call it a double/take at this
point. If a more accurate answer were needed, there are a number of
next steps one could take.

One option is a L5 rollout with the cube. Before checking the cube
numbers, compare the cubeless equity against the L6 rollout. Philippe
indicates that L5 gets a cubeless result of .48 which is consistent with
the L6 numbers (given the standard deviations) so L5 is probably playing
the checkers properly.
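
The "consistent given the standard deviations" check can be made explicit. The sketch below is an illustration only: the standard deviation of the L5 rollout is not given in the thread, so it is set to zero here, which if anything overstates the gap between the two results. The function name is hypothetical.

```python
import math


def std_devs_apart(mean_a, sd_a, mean_b, sd_b=0.0):
    """How many combined standard deviations separate two rollout results."""
    return (mean_a - mean_b) / math.sqrt(sd_a ** 2 + sd_b ** 2)


# L6 rollout: .508 with sd .023; L5 rollout: .48, sd not reported (assumed 0).
gap = std_devs_apart(0.508, 0.023, 0.48)
print(round(gap, 2))  # 1.22
```

A gap of a bit over one standard deviation is unremarkable, which is the sense in which the two cubeless results agree.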

Next we might try to see if L5 is handling the cube properly in the
short term. Give X and O various rolls, check the evaluation, then do a
rollout. In this position JF might correctly evaluate the position
after X closes the bar point, but might incorrectly evaluate the
position before there are builders threatening the bar point. If the L5
evaluation and the rollout are close next turn, then we can have some
confidence in the L5 cube equities from this position. If they aren't
close, we might have to adjust the settlement limit.

I wouldn't suggest adjusting the settlement limit by the full amount of
the difference between the evaluation and the rollout equity, however.
Sometimes X rolls poorly to start and will be cashing some turns in the
future, when we don't know whether JF is under- or over-evaluating X's
position. Perhaps adjusting the settlement limit by about half the
difference will give an accurate answer. (How certain are you that the
.55 settlement limit is right to begin with?)
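
To put numbers on the candidate settlement limits discussed in this thread, here is a minimal sketch. The `fraction` parameter is just a way of writing "no adjustment", "half the difference" and "the full difference" in one formula; it is not a JellyFish setting, and the half-way value is a suggestion, not a tested result.

```python
def adjusted_settlement_limit(base_limit, evaluation, rollout_cubeless,
                              fraction=0.5):
    """Shift the settlement limit by a fraction of the evaluation error.

    fraction=0.0 keeps the base limit, fraction=1.0 applies the full
    (evaluation - rollout) difference, fraction=0.5 splits the difference.
    """
    return base_limit + fraction * (evaluation - rollout_cubeless)


# Base limit .55, L5 evaluation .37, L5 cubeless rollout .48 (from the thread).
for fraction in (0.0, 0.5, 1.0):
    limit = adjusted_settlement_limit(0.55, 0.37, 0.48, fraction)
    print(f"fraction {fraction:.1f}: settlement limit {limit:.3f}")
```

The full adjustment reproduces the 0.44 cash point Philippe tried; half the difference gives 0.495.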

Finally you need to take into account the human aspects. Has your
opponent dropped a similar position in the past? How accurately do you
think he or she will play the position? (I expect there's more skill
required to play O's side of this game than X's. O has to carefully
balance leaving shots with bringing down builders to make points.)

The double/no-double decision is close here, but I'll agree with the
first L5 rollout -- double/take.

-Michael J. Zehr

Chuck Bower

Jun 15, 1998

In article <6lqsi3$sb9$1...@syseca.syseca.fr>,
Philippe Michel <mic...@syseca.fr> wrote:

(snip)
>...my point was that a wrong level 5 evaluation could
>cause errors in the estimation of the value of having access to the cube.

(snip)

Sorry, Philippe. Now I understand. And I think the answer is "yes,
sometimes JF's (occasional) erroneous evaluation at level-5 will cause
errors in its limited cube rollout results." JF is not yet perfect.
Snowie is going to throw down the challenge with greater rollout capabilities
(including cubeful rollouts AND match score rollouts). Snowie will probably
be a quantum leap forward, just as JF has been. And will Snowie be the
last word? Almost certainly not. I'm hopeful that a new version of JF
will take up the challenge, and (am I dreaming?) even other, new robots.
This all assumes, of course, that the developers don't in the meantime
go broke because of their pricing policies. ;))