opening toughie

anaille charles

Jan 31, 1997, 3:00:00 AM

O rolls 64 24-18,13-9.
X rolls 31 ?

Brian Sheppard

Jan 31, 1997, 3:00:00 AM

anaille charles <cana...@envirolink.org> wrote in article
<85471499...@manatee.envirolink.org>...

>
> O rolls 64 24-18,13-9.
> X rolls 31 ?

Make the 5-point. The opponent will have to choose between
saving his 18-point blot and improving his offensive side. He can't
do everything.

Hitting gives up too much. The opponent hits back with 16 shots,
and because you are set back 18 pips by the return hit whereas the
opponent is only sent back 7, you actually expect to lose ground
by hitting.
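
The pip arithmetic behind that comparison can be sketched as follows. This is only an illustration, assuming the usual convention that a hit checker restarts from an imaginary 25-point counted from its owner's side; the function name pips_lost_when_hit is made up for this sketch.

```python
def pips_lost_when_hit(point: int) -> int:
    """Pips a checker loses when hit on `point`, numbered from its owner's side.

    Assumption: the bar counts as the 25-point for the checker's owner.
    """
    if not 1 <= point <= 24:
        raise ValueError("point must be between 1 and 24")
    return 25 - point


# X's blot on X's own bar point (the 7-point) after hitting there.
x_loss = pips_lost_when_hit(7)
# O's blot on O's 18-point, which is the same physical point as X's 7-point.
o_loss = pips_lost_when_hit(18)
print(x_loss, o_loss)
```

The two results correspond to the 18 and 7 pips quoted in the post.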

Brian

Mark Damish

Jan 31, 1997, 3:00:00 AM

anaille charles (cana...@envirolink.org) wrote:
:
: O rolls 64 24-18,13-9.
: X rolls 31 ?
:

31: 8/5 6/5

Building your board takes precedence here. Your opponent 'might'
build a point or escape, but probably not both. Your near-term shot
'quality' will be much higher with this point, and your near/mid/long-term
positional gain is guaranteed, with room to improve.

The positional gains just outweigh the possible tactical gains
from fighting on your side of the board, even though your opponent
is likely to escape or make a point of his own. If he escapes,
you want a strong board so that hitting your future shots
has a real effect. If he makes a point, you will still have
an opportunity to hit on the next turn with a 2-pt board, probably
slotting your bar. If he anchors, you adjust to that. Rolls permitting,
you'll probably be splitting your back men if he escapes or anchors
and if you cannot make another point.

The two strategies for early play against one escaped piece are:
- Build a board quickly, so that your shot quality is high.
- Split your back men so that you pressure the escaped man
immediately and threaten to make an advanced anchor.
The dice tend to dictate which to follow.

--
...Mark Damish mda...@bbn.com

Chuck Bower

Jan 31, 1997, 3:00:00 AM

In article <85471499...@manatee.envirolink.org>,

anaille charles <cana...@envirolink.org> wrote:
>
> O rolls 64 24-18,13-9.
> X rolls 31 ?
>

So far I've seen two posted answers. (I suspect there will be
more.) For those of you who want to jump into the ring, why not include
the 42 reply as well?

Chuck
bo...@bigbang.astro.indiana.edu
c_ray on FIBS

Stephen Turner

Jan 31, 1997, 3:00:00 AM

to cana...@envirolink.org

anaille charles wrote:
>
> O rolls 64 24-18,13-9.
> X rolls 31 ?

I'd only class myself as an intermediate, but FWIW, I make 5. That will stay
with you for the rest of the game. Besides, after hitting, 17 of O's 36 rolls
hit back on the bar point.

X rolls 42 is a more interesting question. Here I tend to hit.

--
Stephen Turner sr...@cam.ac.uk http://www.statslab.cam.ac.uk/~sret1/
Stochastic Networks Group, Statistical Laboratory,
16 Mill Lane, Cambridge, CB2 1SB, England Tel.: +44 1223 337955
"Collection of rent is subject to Compulsive Competitive Tendering" Cam. City

FAEspinosa

Jan 31, 1997, 3:00:00 AM

On 31 Jan 1997, anaille charles wrote:

>
> O rolls 64 24-18,13-9.
> X rolls 31 ?

Make the 5...no reason to start hitting here. You lose too many pips if
he can return the favor and hit you on his entry, and O didn't lose too
many pips to do so...

Just a thought...

Fernando...

'''
(O O)
+--------------oOO--(_)-------------------+
| Fernando Espinosa |
| Assistant Instructor |
| The University of Texas at Austin |
| Fern...@ccwf.cc.utexas.edu |
+-----------------------oOO---------------+
|__|__|
|| ||
ooO Ooo

Donald Kahn

Feb 1, 1997, 3:00:00 AM

On 31 Jan 1997 12:50:52 GMT, anaille charles <cana...@envirolink.org>
wrote:

>
> O rolls 64 24-18,13-9.
> X rolls 31 ?
>

The analyses of Messrs. Sheppard and Damish are convincing, and
JellyFish also agrees, giving the 5 point a .025 equity advantage
over hit and 24-21.

This amounts to a prediction that you will win 12 extra games per
1000 played. Not much - but why not the best?
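
A rough check of that conversion, as a sketch only: it assumes a cubeless game with no gammons, so that equity = 2 * P(win) - 1 and an equity gain translates into half as much winning probability. The helper name is invented for illustration.

```python
def extra_wins_per_1000(equity_gain: float) -> float:
    """Extra games won per 1000 for a given equity gain.

    Assumption: equity = 2 * P(win) - 1 (no cube, no gammons),
    so a gain of dE in equity is a gain of dE / 2 in P(win).
    """
    return equity_gain / 2 * 1000


# The .025 equity edge JellyFish gives to making the 5-point.
gain = extra_wins_per_1000(0.025)
print(gain)
```

Under these assumptions the figure comes out at about 12.5 games per 1000, in line with the "12 extra games" quoted above.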

DK

William C Bitting

Feb 1, 1997, 3:00:00 AM

Stephen Turner wrote:

>
> anaille charles wrote:
> >
> > O rolls 64 24-18,13-9.
> > X rolls 31 ?
>

> I'd only class myself as an intermediate, but FWIW, I make 5. That will stay
> with you for the rest of the game. Besides, after hitting, 17 of O's 36 rolls
> hit back on the bar point.
>
> X rolls 42 is a more interesting question. Here I tend to hit.

hmm.. is it 16 return shots for O to hit a blot on the bar pt?
(65 64 63 62 61 52 43 33 22 = 16)
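
The count can be checked mechanically. The sketch below only tallies the rolls listed above (it does not generate them from a board position): a non-double occurs two ways among the 36 dice outcomes, a double only one way. The names are illustrative.

```python
# Hitting rolls exactly as listed in the post.
hitting_rolls = ["65", "64", "63", "62", "61", "52", "43", "33", "22"]


def count_shots(rolls):
    """Count how many of the 36 dice outcomes the listed rolls cover."""
    total = 0
    for roll in rolls:
        first, second = roll[0], roll[1]
        # A double (e.g. 33) can be thrown one way, a non-double two ways.
        total += 1 if first == second else 2
    return total


shots = count_shots(hitting_rolls)
print(shots, "of 36")
```

Seven non-doubles and two doubles give 16 of 36.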

For what it's worth, from the mloner v jellyfish series both with
31 & 42 after an opening split to the 18pt it looks like the bots
go for the points. However, following a run with a 63 & 64 hits
were made in the only cases presented. There was no case where
31 or 42 followed an opening run with a 62, but I think I've seen
the bots in other matches make the points there. I was surprised
to see mloner's play with the 31 following the run with 63.

wcb on FIBS

From the 300 match series - mloner v jellyfish - oct/dec 1995
Moves are noted from the players perspective; the bar point for
each player is noted as 13-7 8-7. * Crawford game; ! post
Crawford; an * in move 2 column denotes a "hit" |mat, reference to
match#: 00-99(series 1) 100-199(2) 200-299(3)
N1 notations: dn down; pt point; rn run; sl slot; sp split.
N2: h0=hit loose; h1=hit; h2=hit 2; h3=hit declined; a=note play.
Sort: R2, R1, score (W1)
  #  W1 score R1 N1 move 1       W2 R2 N2 move 2        |mat
927  JF 0-0   62 sp 24-18 13-11  ML 31 h3 8-5 6-5       |204
928  ML 0-0   62 sp 24-18 13-11  JF 31 h3 8-5 6-5       |201
929  ML 0-0   62 sp 24-18 13-11  JF 31 h3 8-5 6-5       |293
930  ML 0-2   62 sp 24-18 13-11  JF 31 h3 8-5 6-5       |273
931  JF 0-4*  62 sp 24-18 13-11  ML 31 h3 8-5 6-5       |221
932  ML 1-3   62 sp 24-18 13-11  JF 31 h3 8-5 6-5       |166
933  ML 2-2   62 sp 24-18 13-11  JF 31 h3 8-5 6-5       |293
934  ML 4-0*  62 sp 24-18 13-11  JF 31 h3 8-5 6-5       |102
935  ML 4-1!  62 sp 24-18 13-11  JF 31 h3 8-5 6-5       |253
936  ML 1-3   63 sp 24-18 13-10  JF 31 h3 8-5 6-5       |240
937  JF 4-1*  63 rn 24-15        ML 31 h1 24-23 13-10*  |234

987  JF 0-0   62 sp 24-18 13-11  ML 42 h3 8-4 6-4       |8
988  ML 0-1   62 sp 24-18 13-11  JF 42 h3 8-4 6-4       |149
989  ML 0-2   62 sp 24-18 13-11  JF 42 h3 8-4 6-4       |195
990  ML 2-2   62 sp 24-18 13-11  JF 42 h3 8-4 6-4       |190
991  ML 3-2   62 sp 24-18 13-11  JF 42 h3 8-4 6-4       |52
993  JF 0-0   63 sp 24-18 13-10  ML 42 h3 8-4 6-4       |220
995  ML 1-3   63 sp 24-18 13-10  JF 42 h3 8-4 6-4       |128
998  JF 4-0*  64 rn 24-14        ML 42 h1 13-11* 13-9   |37

Chuck Bower

Feb 7, 1997, 3:00:00 AM

In article <85471499...@manatee.envirolink.org>,

anaille charles <cana...@envirolink.org> wrote:
>
> O rolls 64 24-18,13-9.
> X rolls 31 ?
>

As of a week after this original, six answers have been
posted, and the score is either 6-0 or 7-0 (trying not to count
JF level-7 evaluation twice) in favor of making the 5-point. OK.
Time to play contrarian (sort of).

Below are Jellyfish v2.01 level-6 cubeless rollouts (432 trials
each) for the 31 and 42 replies to the opening 6x (24/18, 13/y):

Plays:                  responder's   std              responder's
                        equity        dev    signif.   win frac.    signif.

62 (24/18,13/11) 31:

    8/5,6/5             -0.061       0.014     36%      0.466          2%
    8/7*,24/21          -0.055       0.014     63%      0.477         80%
    8/7*,13/10          -0.084       0.016      1%      0.471         18%

63 (24/18,13/10) 31:

    8/5,6/5             -0.054       0.014     99%      0.477         96%
    8/7*,24/21          -0.100       0.013      1%      0.465          3%
    8/7*,13/10          -0.108       0.015   < 0.5%     0.461       < 0.5%

64 (24/18,13/9) 31:

    8/5,6/5             -0.061       0.013     74%      0.474         33%
    8/7*,24/21          -0.073       0.014     26%      0.476         60%
    8/7*,13/10          -0.106       0.015   < 0.5%     0.470          7%

62 (24/18,13/11) 42:

    13/7*               -0.035       0.015     86%      0.485         98%
    8/4,6/4             -0.057       0.014     14%      0.471          2%

63 (24/18,13/10) 42:

    13/7*               -0.063       0.014     92%      0.477         98%
    8/4,6/4             -0.089       0.012      8%      0.463          2%

64 (24/18,13/9) 42:

    13/7*               -0.016       0.014     60%      0.491         80%
    8/4,6/4             -0.021       0.014     40%      0.485         20%

("Responder" means person rolling 31 or 42. "Significance" answers
the following question: "If you were able to do an infinite number
of Jellyfish level-6 cubeless rollouts, what is the chance that this
play would show up as BEST?" This is just a way of assigning
confidence to the rollout results from a STATISTICAL standpoint.)
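
The post does not say how the significance column was computed. One reading that appears to reproduce the two-play rows (the 42 replies) is to treat each rollout equity as an independent, normally distributed estimate and ask how likely it is that the first play's true equity exceeds the second's. The sketch below follows that assumption; the function name is hypothetical, and the three-play rows would need a different calculation (for instance a simulation).

```python
import math


def prob_first_beats_second(eq_a, sd_a, eq_b, sd_b):
    """P(true equity of A > true equity of B).

    Assumption: the two rollout equities are independent normal
    estimates with the quoted standard deviations.
    """
    z = (eq_a - eq_b) / math.sqrt(sd_a ** 2 + sd_b ** 2)
    # Standard normal CDF written with the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# 64 opening, 42 reply: hit (-0.016 +/- 0.014) vs. 4-point (-0.021 +/- 0.014).
p_hit_best = prob_first_beats_second(-0.016, 0.014, -0.021, 0.014)
print(round(p_hit_best, 2))
```

For this row the result is close to the 60% shown in the table, and the 62 and 63 rows for the 42 reply come out near 86% and 92% the same way.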

First some general comments:

1) No matter which of the above ways responder chooses to play this
good roll, s/he is still the underdog, both in equity and in winning
chances.

2) Regardless of the 6x opening, and regardless of the reply (31 or 42),
there is not a huge difference between any result! -0.016 appears
to be best equity for responder and -0.108 is the worst. In winning
chances, the results seem even tighter: 49.1% is best; 46.1% worst.

3) Suppose you ask a panel of experts (or otherwise) "which reply
would you prefer (to the opening 6x above), 31 or 42?" How many
do you think would say "42"? And yet the above rollout results
seem to indicate that 42 reply (hitting) is better than 31 (however
you choose to play it)!!

And some more specific questions:

1) Why is hitting with 31 best after the 62, but not after 63 or 64?
Could it be that it's just a statistical aberration (36% significance
for making 5-pt) or is it because opener's checker on his/her 11 point
is further out of the range for the ensuing blot hitting contest
(in the case of 8/7*)?

2) In terms of winning chances, does making the 5-pt with 31 come in
first only once (after 63 opening) and come in LAST after 62, or
is this more statistical shenanigans?

Could Jellyfish be misplaying opener's subsequent positions WORSE
after the point making plays than it does after hitting? Or is it
misplaying responder's games WORSE after the hit? The answer to both
questions is "maybe" but I don't think it's very
likely. You must also ask if the six r.g.b. responders could be
wrong. (I know that SOME of those answers were based on none
other than the slimy salt water sucker, but I believe they were all
evaluations and not the MORE RELIABLE rollout results.)

There are many backgammon books (especially written in
the great Renaissance of the 70's) where "experts" stated confidently
and sometimes even dogmatically that play A or cube decision B was
clearly correct. Maybe against THEIR chouette opponents... Thanks
to the breakthrough work of Weaver, Tesauro, Dahl, and Wittmann,
a lot of the icons and idols have been exposed as charlatans.
A few have stood the test of time.

(Excuse me while I descend from my soapbox.) Where were we...?
Oh, the 31 and 42 replies to the opening 6x! The above rollouts
indicate that the point building plays and the hits are VERY CLOSE.
Based on the statistics AND the fact that JF level-6 doesn't play
PERFECT backgammon, I feel the prudent answer to the original
question is "too close to call."

Ron Karr

Feb 8, 1997, 3:00:00 AM

Chuck Bower wrote:
>
> In article <85471499...@manatee.envirolink.org>,
> anaille charles <cana...@envirolink.org> wrote:
> >
> > O rolls 64 24-18,13-9.
> > X rolls 31 ?
> >
>

Interesting rollout results by Chuck. A few thoughts off the top of my
head:

>
> 3) Suppose you ask a panel of experts (or otherwise) "which reply
> would you prefer (to the opening 6x above), 31 or 42?" How many
> do you think would say "42"? And yet the above rollout results
> seem to indicate that 42 reply (hitting) is better than 31 (however
> you choose to play it)!!

One of the weaknesses of hitting with 31 is that it strips the 8 point,
thus reducing the chances to effectively cover the bar point or battle
for it. 42 doesn't have that problem. Also, 42 is 2 pips better in the
race.

>
> And some more specific questions:
>
> 1) Why is hitting with 31 best after the 62, but not after 63 or 64?
> Could it be that it's just a statistical aberration (36% significance
> for making 5-pt) or is it because opener's checker on his/her 11 point
> is further out of the range for the ensuing blot hitting contest
> (in the case of 8/7*)?

After 63 or 64, the opponent has another builder in place to attack the
checker you split to the 21 point. That's probably the difference.
>

Ron

Chuck Bower

Feb 9, 1997, 3:00:00 AM

In article <32FCD0...@best.com>, Ron Karr <ka...@best.com> wrote:
(snip)

Chuck Bower wrote:
>>
>> 3) Suppose you ask a panel of experts (or otherwise) "which reply

>> would you prefer (to the opening 62, 63, 64 played 24/18, 13/x):

>> 31 or 42?" How many

>> do you think would say "42"? And yet the JF level-6 rollout results

>> seem to indicate that 42 reply (hitting) is better than 31 (however
>> you choose to play it)!!

>One of the weaknesses of hitting with 31 is that it strips the 8 point,
>thus reducing the chances to effectively cover the bar point or battle
>for it. 42 doesn't have that problem. Also, 42 is 2 pips better in the
>race.

(snip)

Good point, Ron. And I think we can investigate your theory further
with the following question:

Opener rolls 62, 63, or 64 and plays 24/18, 13/x (as before). NOW his/her
opponent rolls 51. What's the "correct" play? And when you've answered
that, suppose the reply roll was 65 instead of 51. How should THIS be
played. (NOTE: I'm asking ALL r.g.bg readers for their opinions, not
just Ron.)

I'll post in a couple of days what JF level-6 rollouts say.

Ole Jensen

Feb 10, 1997, 3:00:00 AM

bo...@bigbang.astro.indiana.edu (Chuck Bower) writes:

> [JF Rollout results for responses to 6-x openings]

Thanks for another great post, Chuck, and I apologise that I am really
only responding to pick nits...

You say:

> "Significance" answers the following question: "If you were able to
> do an infinite number of Jellyfish level-6 cubeless rollouts, what
> is the chance that this play would show up as BEST?" This is just a
> way of assigning confidence to the rollout results from a
> STATISTICAL standpoint.)

Could you say a bit more about how this is calculated? It is not
clear to me from the wording of your "question" what is going on. In
fact, since the question is absolute (it doesn't mention the actual,
finite rollout), surely the answer is that one particular play has
probability 1 of coming out best, and the rest have probability 0,
regardless of any sampling anyone does? (It might be necessary to
assume that JF plays consistently and that no two plays are exactly
tied, for everything to converge appropriately, but I think you get my
point.)

From my non-expert knowledge of statistics, I would expect the
relevant question to be something along these lines: "Under the
hypothesis that play A would come out best in the infinite rollout,
what is the probability that this particular finite rollout should
give a result that is at least this 'far away' from that hypothesis?"
(where the meaning of "far away" needs to be explained more
precisely). If this is indeed the "real" question, is it obvious that
the probabilities for the different choices of "play A" add up to 1
(as is the case in your table), or do you have to normalise?

-- Ole (hoegh on FIBS)

John Clements

Feb 11, 1997, 3:00:00 AM

In article <5dgccq$q...@dismay.ucs.indiana.edu>,
Chuck Bower <bo...@bigbang.astro.indiana.edu> wrote:

>other than the slimy salt water sucker, but I believe they were all

^^^^^^

'stinger', perhaps?

sorry, sorry.

john clements

Brian Sheppard

Feb 14, 1997, 3:00:00 AM

Chuck Bower <bo...@bigbang.astro.indiana.edu> wrote in article
<5dgccq$q...@dismay.ucs.indiana.edu>...

> In article <85471499...@manatee.envirolink.org>,
> anaille charles <cana...@envirolink.org> wrote:
> >
> > O rolls 64 24-18,13-9.
> > X rolls 31 ?

> Plays: equity dev signif. win frac. signif.

> 62 (24/18,13/11) 31:
> 8/5,6/5 -0.061 0.014 36% 0.466 2%
> 8/7*,24/21 -0.055 0.014 63% 0.477 80%
> 8/7*,13/10 -0.084 0.016 1% 0.471 18%
>
> 63 (24/18,13/10) 31:
> 8/5,6/5 -0.054 0.014 99% 0.477 96%
> 8/7*,24/21 -0.100 0.013 1% 0.465 3%
> 8/7*,13/10 -0.108 0.015 < 0.5% 0.461 < 0.5%
>
> 64 (24/18,13/9) 31:
> 8/5,6/5 -0.061 0.013 74% 0.474 33%
> 8/7*,24/21 -0.073 0.014 26% 0.476 60%
> 8/7*,13/10 -0.106 0.015 < 0.5% 0.470 7%
>
> 62 (24/18,13/11) 42:
> 13/7* -0.035 0.015 86% 0.485 98%
> 8/4,6/4 -0.057 0.014 14% 0.471 2%
>
> 63 (24/18,13/10) 42:
> 13/7* -0.063 0.014 92% 0.477 98%
> 8/4,6/4 -0.089 0.012 8% 0.463 2%
>
> 64 (24/18,13/9) 42:
> 13/7* -0.016 0.014 60% 0.491 80%
> 8/4,6/4 -0.021 0.014 40% 0.485 20%

After 4-2 the hitting play always simulates best. This agrees with
my intuition about such positions. By hitting you prevent O from
doing a lot of things (making your 7-point, escaping a man, and
building his board). Making the 5-point is strong because the 5-point
is worth about the same as these things O might do. But the 4-point
is not worth as much, so making the 4-point gives way to the
higher priority of stopping O.

Note also a key difference between 3-1 and 4-2: after 4-2 X has
just one blot. So O does not have the possibility of hitting
a fly-shot in the outfield (as he has with 8/7* 13/10) and O does not
hit X with 2-2 and 4-4 (as he might after 8/7* 24/21).

One of the things I have learned from computers is that you really
have to keep the number of blots and fly shots down. I used to
diversify with abandon, but I now recognize that I was just tempting
fate.

> 1) Why is hitting with 31 best after the 62, but not after 63 or 64?
> Could it be that it's just a statistical aberration (36% significance
> for making 5-pt) or is it because opener's checker on his/her 11 point

> is further out of the range for the ensuing blot hitting contest
> (in the case of 8/7*)?

Statistical aberration is a possibility.

The tactical issue you mention (that the 24/21 split does not "come
under the gun" of the builder on the 11-point) is a real advantage.

Since there is a positional basis for the difference between these
plays, I am inclined to accept that it is genuine, rather than a
statistical aberration.

> 2) In terms of winning chances, does making the 5-pt with 31 come in
> first only once (after 63 opening) and come in LAST after 62, or
> is this more statistical shenanigans?

I think this is probably a realistic reflection of the position.
Making the 5-point is bound to lead to more gammon wins
than other plays. And opening up lots of blots (as after the hitting
plays) is bound to lead to more gammon losses. So, if the plays
nevertheless end up being almost equal, it must be because
other plays have better winning percentages.
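
This reasoning can be checked against the posted numbers. The sketch below assumes the cubeless equity splits into a plain win/loss part, 2 * W - 1, plus a net gammon (and backgammon) part, so the second part can be backed out as equity - (2 * W - 1). The figures are the 62-opening, 31-reply rows from the rollout table; the function name is invented for illustration.

```python
def net_gammon_term(equity: float, win_frac: float) -> float:
    """Part of the cubeless equity not explained by plain wins and losses.

    Assumption: equity = (W - L) + net gammon/backgammon value,
    with L = 1 - W, so the net term is equity - (2 * W - 1).
    """
    return equity - (2.0 * win_frac - 1.0)


# 62 opening, 31 reply, from the rollout table.
make_five = net_gammon_term(-0.061, 0.466)   # 8/5, 6/5
hit_split = net_gammon_term(-0.055, 0.477)   # 8/7*, 24/21
print(round(make_five, 3), round(hit_split, 3))
```

On these rows the point-making play shows a small positive net gammon term and the hit a small negative one, which fits the explanation above, although the differences are within the quoted standard deviations.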

> Could Jellyfish be misplaying opener's subsequent positions WORSE
> after the point making plays than it does after hitting? Or is it
> misplaying responder's games WORSE after the hit? The answer to both
> questions is "maybe" but I don't think it's very likely.

My gut feel is that the equity of owning the 5-point will increase
relative to the other plays when the cube is taken into account.
Recall that JF Level 6 rollouts are cubeless!

Making the 5-point (and keeping the anchor) gives a very stable
position that will not deteriorate to game-losing status for a while.
But O's position can come unglued in a hurry! I bet that when you
look at the equities using the cube (with a Level 5 rollout) you
will find that making the 5-point comes out relatively better.

> You must also ask if the six r.g.b. responders could be
> wrong. (I know that SOME of those answers were based on none

> other than the slimy salt water sucker, but I believe they were all

> evaluations and not the MORE RELIABLE rollout results.)

Rollouts are "more reliable" only in the sense that they are derived
from a well-understood, reproducible, statistical process. That doesn't
mean that the process is really a good model of Backgammon, however!

I have already cited the cubeless/cube distinction, which I believe
to be relevant. There is also the issue of JF's positional skills and
biases. The simulation is flawed to the extent that simulation results
depend on JF's skills and biases.

Please don't assume that I am implying that JF's simulation is bad
because JF plays badly. Not at all. For instance, my observation is
that JF handles priming positions exceptionally well. So shouldn't I
figure that my opponent is more likely to mishandle a prime than JF
is? Ditto for blitzes: JF plays these positions with unbelievable accuracy.
I also observe that JF's evaluation of back game positions is unreliable,
and it plays the checkers badly, too. Shouldn't I take JF's rollouts
of back games with a grain of salt?

I have a file of rollout results for all positions from Robertie's
Advanced Backgammon. My conclusion after examining the differences
between Robertie's opinion and JF rollouts is that Robertie is
probably correct in the majority of cases of disagreement.

Don't be entirely sure that these rollout results are more reliable than
human experts, even if we don't play as well as Robertie. JF Level 6
rollouts with variance reduction are basically JF Level 5 games, and
JF level 5 does not play better than I do.

> There are many backgammon books (especially written in
> the great Renaissance of the 70's) where "experts" stated confidently
> and sometimes even dogmatically that play A or cube decision B was
> clearly correct. Maybe against THEIR chouette opponents... Thanks
> to the breakthrough work of Weaver, Tesauro, Dahl, and Wittmann,
> a lot of the icons and idols have been exposed as charlatans.
> A few have stood the test of time.
>
> (Excuse me while I descend from my soapbox.)

And I descend from my soapbox! :-)

> Where were we...?
> Oh, the 31 and 42 replies to the opening 6x! The above rollouts
> indicate that the point building plays and the hits are VERY CLOSE.
> Based on the statistics AND the fact that JF level-6 doesn't play
> PERFECT backgammon, I feel the prudent answer to the original
> question is "too close to call."

To which I add that the practical answer to the original question
is "whatever you feel comfortable with." Since I learned the game
in the 70's (from those very charlatans who have since been exposed)
I am comfortable going all out for primes and blitzes, and I am
willing to play the occasional back game if I have to. So I make
the 5-point.

Brian

BTW: Thanks for doing the rollouts! It is always enlightening to
have real data.

Fredrik Andreas Dahl

Feb 14, 1997, 3:00:00 AM

Brian Sheppard wrote:
<lots of stuff deleted>

> I have a file of rollout results for all positions from Robertie's
> Advanced Backgammon. My conclusion after examining the differences
> between Robertie's opinion and JF rollouts is that Robertie is
> probably correct in the majority of cases of disagreement.

I think you're badly wrong there.

>
> Don't be entirely sure that these rollout results are more reliable than
> human experts, even if we don't play as well as Robertie. JF Level 6
> rollouts with variance reduction are basically JF Level 5 games, and
> JF level 5 does not play better than I do.

JF level 6 rollouts with variance reduction ARE level 6 rollouts.
And: If you play as well as JF level 5, it should mean that your opinion
is as accurate as JF's 'opinion' (evaluation without lookahead).
Rollout results are a ton more reliable than that.

--
- Fredrik Andreas Dahl

Brian Sheppard

Feb 17, 1997, 3:00:00 AM

Fredrik Andreas Dahl <fre...@sn.no> wrote in article
<330524...@sn.no>...

> Brian Sheppard wrote:
> <lots of stuff deleted>

> > I have a file of rollout results for all positions from Robertie's
> > Advanced Backgammon. My conclusion after examining the differences
> > between Robertie's opinion and JF rollouts is that Robertie is
> > probably correct in the majority of cases of disagreement.
>

> I think you're badly wrong there.

I would like to see evidence to support your position.

> > Don't be entirely sure that these rollout results are more reliable than
> > human experts, even if we don't play as well as Robertie. JF Level 6
> > rollouts with variance reduction are basically JF Level 5 games, and
> > JF level 5 does not play better than I do.
>

> JF level 6 rollouts with variance reduction ARE level 6 rollouts.

I stand corrected, and apologize for my misunderstanding.

Brian

Brian Sheppard

Feb 17, 1997, 3:00:00 AM

Fredrik Andreas Dahl <fre...@sn.no> wrote in article
<330524...@sn.no>...

I accidentally hit the "Post Message" button before composing my
reply, so kindly disregard my previous reply to Fredrik Dahl.

> Brian Sheppard wrote:
> > I have a file of rollout results for all positions from Robertie's
> > Advanced Backgammon. My conclusion after examining the differences
> > between Robertie's opinion and JF rollouts is that Robertie is
> > probably correct in the majority of cases of disagreement.
>

> I think you're badly wrong there.

Thank you for taking the time to respond to my post.

I would like to see evidence to support your opinion.

> > Don't be entirely sure that these rollout results are more reliable than
> > human experts, even if we don't play as well as Robertie. JF Level 6
> > rollouts with variance reduction are basically JF Level 5 games, and
> > JF level 5 does not play better than I do.
>

> JF level 6 rollouts with variance reduction ARE level 6 rollouts.

I stand corrected, and I apologize for any harm caused by my
misunderstanding.

> And: If you play as well as JF level 5, it should mean that your opinion
> is as accurate as JF's 'opinion' (evaluation without lookahead).
> Rollout results are a ton more reliable than that.

The implication (that rollout results are a ton more reliable than an
expert opinion in this particular situation) is not necessarily so, and
is the crux of my disagreement with Chuck Bower. In what follows I want
to confine my comments to the situation at hand, which is a very close
early-game checker play decision.

JF rollouts are a useful tool, but they have to be kept in perspective.
In my opinion, the posters in this newsgroup take the omnipotence of JF
too much for granted. Applying rollout results to a practical game
decision is hard, particularly in close checkerplay decisions.

Let's start with statistical variance. Let's take a checkerplay decision
where the difference is less than 0.01 points per game. A 1296-game rollout
typically has standard deviation around 0.030, so such a rollout is not
long enough to distinguish such close plays. How long a rollout do you
need? About 9 times 1296 games, and even then you would see the wrong move
preferred in 16% of such rollouts. Let's say you want 2 standard deviations
of accuracy. Then you need 36 times 1296 games. Ouch! That will take forever.
It is no wonder that I seldom see rollouts in this newsgroup longer than
1296 games.
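
The figures in this paragraph can be reproduced with a short calculation. It is a sketch under two assumptions: the 0.030 is taken as the standard deviation of the measured equity difference after 1296 games, and that standard deviation shrinks as 1/sqrt(number of games). The constant and function names are illustrative.

```python
import math

BASE_TRIALS = 1296   # rollout length quoted in the post
BASE_SD = 0.030      # standard deviation quoted for that length


def trials_needed(equity_diff: float, n_sigma: float) -> float:
    """Games needed so that `equity_diff` equals `n_sigma` standard deviations.

    Assumption: the standard deviation scales as 1 / sqrt(games).
    """
    target_sd = equity_diff / n_sigma
    return (BASE_SD / target_sd) ** 2 * BASE_TRIALS


def wrong_move_chance(n_sigma: float) -> float:
    """One-sided normal tail: chance the worse play comes out ahead."""
    return 0.5 * (1.0 - math.erf(n_sigma / math.sqrt(2.0)))


for sigmas in (1, 2):
    games = trials_needed(0.01, sigmas)
    print(sigmas, round(games / BASE_TRIALS), round(wrong_move_chance(sigmas), 3))
```

At one standard deviation this gives 9 times 1296 games with roughly a 16% chance of preferring the wrong move; at two standard deviations it gives 36 times 1296 games.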

A second issue: which of JF's many rollout techniques should one use to
detect this difference? There are too many types of JF rollouts for my
tastes. I have read about rollouts with different horizons, different
levels, using the cube and cubeless, with various settlement limits. Do
all types of JF rollouts reach the same conclusion in every situation?

I don't agree that Chuck did the right rollout! His rollouts were
cubeless, whereas in practical play the position of the cube can be
relevant.

Another implication (or maybe I'm reading more into your post than
you really said) is that the skill of JF Level 6 is required to play
this type of position accurately. Do I understand correctly that Level
6 rollouts are always cubeless? This seems like a severe limitation.

Another issue concerns differences among quite respectable computer
players. I have seen JF and TD disagree on the merits of moves. And by
substantial amounts, too (sometimes > 0.1 equity!). The Forrest-Eggert
match analysis (available on-line--see the FAQ) contains several instances
of this. What am I to conclude from this?

JF and TD can also issue somewhat ridiculous differences for plays that
are almost equal. I recall one instance from the Forrest-Eggert match
in which a player's move was rated 0.05 less than the program's preferred
move, prompting Kit Woolsey to comment that evaluation simply couldn't
be right. I believe Kit's assessment, and therefore I have doubts about
JF (and TD) analysis.

Final point: if the difference in two plays is significantly less than
0.01, aren't there other practical issues that matter more than the equity
difference? For example: the skill of the opponent, the difficulty of the
situation, the playing conditions, etc. To overemphasize theoretical
differences at the expense of practical factors is quite harmful, IMO.

To summarize, I would liken JF rollouts to a flashlight. It helps us find
our way in the dark. Sometimes I read postings here that liken JF to
the Sun.

Brian

Chuck Bower

Feb 18, 1997, 3:00:00 AM

In article <01bc1a98$cc7370c0$3ac0...@polaris.mstone.com>,

(snip)

>
>> 2) In terms of winning chances, does making the 5-pt with 31 come in
>> first only once (after 63 opening) and come in LAST after 62, or

>> is this more statistical shenanigans? CRB
>
Brian answers:

>I think this is probably a realistic reflection of the position.
>Making the 5-point is bound to lead to more gammon wins
>than other plays.

I don't see that this is clear. By NOT hitting, you give
the opening roller an easier chance to escape the checker on your
7-pt AND several chances to anchor up there. I would think that
these scenarios don't lead to that many gammons.

> And opening up lots of blots (as after the hitting
>plays) is bound to lead to more gammon losses. So, if the plays
>nevertheless end up being almost equal, it must be because
>other plays have better winning percentages.
>

This argument also is not crystal clear to me.
Sorry, but my rollout results are not in front of me, so I
can't look up the gammon results. I'm sure they're not THAT much
different in any case.

(snip)

Brian continues:

>
>My gut feel is that the equity of owning the 5-point will increase
>relative to the other plays when the cube is taken into account.
>Recall that JF Level 6 rollouts are cubeless!
>Making the 5-point (and keeping the anchor) gives a very stable
>position that will not deteriorate to game-losing status for a while.
>But O's position can come unglued in a hurry! I bet that when you
>look at the equities using the cube (with a Level 5 rollout) you
>will find that making the 5-point comes out relatively better.

Here, also, I don't feel your case is substantiated by JF level-7
cubeless evaluations. The "volatilities" (standard deviation of two
ply lookahead equities for each case) look like this:

Reply          Opening:    62       63       64

8/5, 6/5                  0.171    0.189    0.158
24/21, 8/7*               0.156    0.165    0.203

Without going through the 1296 possibilities myself, JF's statistical
analysis of the 1296 outcomes SEEMS to indicate that the cube is not that
likely to be used more or less quickly regardless of the reply. A cubeful
rollout would be interesting (but must be done at the less reliable JF
level-5). I will try to remember to do this over the next few days.

Bottom line is that we are really getting into a VERY academic
"hair splitting" discussion. We BOTH agreed (didn't we?) that "too close
to call" was the practical answer as to how to play these responses.
I'm ready to move on and bore the average reader with other topics...

Brian Sheppard

Feb 18, 1997, 3:00:00 AM

Brian Sheppard <bri...@mstone.com> wrote in article
<01bc1ce2$524e96c0$3ac0...@polaris.mstone.com>...

> Fredrik Andreas Dahl <fre...@sn.no> wrote in article
> <330524...@sn.no>...

> > Brian Sheppard wrote:
> > <lots of stuff deleted>

> > > I have a file of rollout results for all positions from Robertie's
> > > Advanced Backgammon. My conclusion after examining the differences
> > > between Robertie's opinion and JF rollouts is that Robertie is
> > > probably correct in the majority of cases of disagreement.
> >

> > I think you're badly wrong there.
>

>Hi Brian,
>
>What we could do is arrange to play a series of props based on the
>positions in Robertie's books. One simple method would be to take
>turns selecting positions to prop, with you taking Robertie's side, and
>me taking JF's. If you're right, you'll come out way ahead.
>
>David Montgomery
>monty on FIBS
>mo...@cs.umd.edu

Hi Monty,

You are foolish to make this offer. If I accepted, your brashness would
cost you almost 1 point per game. I will explain later.

But I will decline because my interest is in finding evidence that is of
greater reliability than the original sources. A hand-rolled proposition is
not sufficient.

Now, to support my position that JF rollouts should not be trusted
blindly...

I have done some hand rollouts against JF for some of the positions
in which it disagrees with Robertie. My observation is that JF plays
certain types of positions badly. The differences are small, but
remember this: Robertie's book is meant to enlighten expert players,
and there are a lot of close decisions in there. In many cases small
deviations from ideal play are sufficient to change the evaluations.

You can check this for yourself. Try position 183 from Robertie's second
edition (I am quoting from memory, since I left the book at home, but
Robertie's move is 13/3* 7/2* 6/1*, and JF rollouts prefer 8/3(2)*
7/2(2)*).

Jellyfish doesn't quite get this position, in which Black's main task is
to safely dismantle a prime restrained by White's anchors on the 4 and 5
points.

When playing the White pieces, JF often refuses to hit Black,
though without a hit White will lose a gammon for sure. Hitting Black is
a no-risk proposition for White.

For the Black pieces, JF often prefers to hold blockading points even
when it has the chance to dismantle them safely. This is true even when
Black has the chance to clear a point while White is on the bar. JF is also
too willing to bury checkers behind White's anchors.

You can verify these things for yourself by doing an interactive rollout.
When JF disagrees with your plays, it will show you its preferred move
and its guess at your equity loss. When you click the Verify button
to ask it to look deeper, you will often see huge swings in the equity
assessment. Such *instability* in JF's evaluation function is convincing
evidence that JF has not had sufficient training in such positions.

JF misevaluates Problem 127 ("Kauder's Paradox"). In this position
JF thinks White is a big underdog, and actually considers Black to have
over 50% chance of winning a gammon! This is quite wrong. Actually White
is a clear favorite. If you take JF's side in that proposition, you will
lose about 1.33 points per game, since JF drops a double it should beaver.

You might think that a position like Kauder's Paradox is too extreme to
be relevant to practical play. Actually, that is not true. JF will happily
hit all 15 of your men any time it has the chance. If you happen to get a
backgame against JF, you can make hara-kiri plays until all your men
are hit, then build an outer-table prime after hitting JF. Your money game
equity is reasonable only because JF incorrectly beavers your double
after you complete the prime.

On the other side of the ledger, I have found a position
where Robertie seems to be going badly wrong (I recall
that number 80 is such a position).

My conclusion is that you place way too much faith in
JF. A case in point: Monty actually offered to accept the
proposition of my choice SIGHT UNSEEN! Is Monty's faith
in JellyFish not bounded by common sense? What about
yours?

Warm Regards,
Brian

Alexander Nitschke

Feb 19, 1997

Brian Sheppard wrote:
> <lots of stuff deleted>

> I have a file of rollout results for all positions from Robertie's
> Advanced Backgammon. My conclusion after examining the differences
> between Robertie's opinion and JF rollouts is that Robertie is
> probably correct in the majority of cases of disagreement.
>

> I think you're badly wrong there.
>

One problem with your file of rollout results is that they are only
JellyFish 1 Level5 rollouts! Level6 rollouts from JellyFish 2 are much
better and can be trusted much more. I made Level6 rollouts with
JellyFish 2 (and additionally Level5 rollouts with cube) for all
positions from Robertie's Advanced Backgammon (about 2000 rollouts :-)
and summarized the results.
In 62 problems the result of the Level6 rollouts changed the result of
the Level5 rollouts, but these were in most cases small errors. I would
say, your file is out of date :-)

Below I write more.

David Montgomery wrote:
>
>Hi Brian,
>
>What we could do is arrange to play a series of props based on the
>positions in Robertie's books. One simple method would be to take
>turns selecting positions to prop, with you taking Robertie's side, and
>me taking JF's. If you're right, you'll come out way ahead.
>

Brian Sheppard wrote:
> Hi Monty,
>
> You are foolish to make this offer. If I accepted, your brashness would
> cost you almost 1 point per game. I will explain later.
>
> But I will decline because my interest is in finding evidence that is of
> greater reliability than the original sources. A hand-rolled proposition is
> not sufficient.
>

I think my Level6 rollouts are of greater reliability but probably not
enough for you :-)

>
> Now, to support my position that JF rollouts should not be trusted
> blindly...
>
> I have done some hand rollouts against JF for some of the positions
> in which it disagrees with Robertie. My observation is that JF plays
> certain types of positions badly. The differences are small, but
> remember this: Robertie's book is meant to enlighten expert players,
> and there are a lot of close decisions in there. In many cases small
> deviations from ideal play are sufficient to change the evaluations.
>

You're right, I believe that there are some problems where the Level6
rollouts too give wrong results, not to mention the Level5 rollouts to
which you refer! But the errors from the wrong rollout results are
relatively small, except for one extreme problem (127).

>
> You can check this for yourself. Try position 183 from Robertie's second
> edition (I am quoting from memory, since I left the book at home, but
> Robertie's move is 13/3* 7/2* 6/1*, and JF rollouts prefer 8/3(2)*
> 7/2(2)*).
>

This is not correct for the Level6 rollout which correctly plays 13/3*
7/2* 6/1*. The equity difference between the two moves is 0.112, one of
the largest errors of the Level5 rollouts.

>
> JF misevaluates Problem 127 ("Kauder's Paradox"). In this position
> JF thinks White is a big underdog, and actually considers Black to have
> over 50% chance of winning a gammon! This is quite wrong. Actually White
> is a clear favorite. If you take JF's side in that proposition, you will
> lose about 1.33 points per game, since JF drops a double it should beaver.
>

Ok, this is a position which JellyFish simply doesn't understand.
By far JellyFish's biggest weakness is its misunderstanding of the value
of outfield primes and its lack of a technique to build such a prime and
roll it home.
But as far as I can assess the problems there was no other position with
a large error from the Level6 rollouts (and I think I'm not too bad for
this job).
You're right with these 1.33 points per game, but I'm sure there is no
other case with an error greater than 0.2 points per game even in the
backgame section.

>
> You might think that a position like Kauder's Paradox is too extreme to
> be relevant to practical play. Actually, that is not true. JF will happily
> hit all 15 of your men any time it has the chance. If you happen to get a
> backgame against JF, you can make hara-kiri plays until all your men
> are hit, then build an outer-table prime after hitting JF. Your money game
> equity is reasonable only because JF incorrectly beavers your double
> after you complete the prime.
>
> On the other side of the ledger, I have found a position
> where Robertie seems to be going badly wrong (I recall
> that number 80 is such a position).
>

Problem 80 probably is wrong by Robertie (0.096 difference in the
rollout).

I can tell you some problems where Robertie is really badly wrong:
Problem 48 (which is a clear take and not a pass)
Problem 52 (22/18 is clear best, not even mentioned by Robertie)
Problem 63 (another clear take)
Problem 84 (equity difference 0.55! according to Level6 rollout and I
believe this is not way wrong)
Problem 99 (clear pass, not a take)
Problem 149, 177a (both no double, Robertie would double)
Problem 199b (no double, Robertie would pass!)
Problem 252 (clear take, not a pass)
Problem 332 (no double, Robertie would double)

These errors all have equity losses of at least 0.2 (except Problem 52
with 0.17). There are 16 more problems with errors greater than 0.1 and
a total of 99 problems (out of 451) with an error greater than 0.01
(according to Level6 rollouts). I know that in some cases the rollouts
could be wrong and Robertie could be right, or the error of Robertie is
less severe than indicated, but there are many problems with wrong
solutions.

>
> My conclusion is that you place way too much faith in
> JF. A case in point: Monty actually offered to accept the
> proposition of my choice SIGHT UNSEEN! Is Monty's faith
> in JellyFish not bounded by common sense? What about
> yours?
>
> Warm Regards,
> Brian

After all, I sound like a worshipper of JellyFish Level6 rollouts :-)
But I can assure you that this is not the case. I know of some
weaknesses which can lead to wrong rollout results. Containing a hit
checker in extreme cases is a weakness of JellyFish which leads to wrong
rollout results in some backgames and some endgames.
The errors sum up in most cases to small overall equity errors, not
greater than 0.10 (which would already be a high number, 0.05 is more
realistic I think).

Finally, if we omit problem 127 (the Kauder Paradox), I'm quite sure
that you would end up well on the losing side if you bet on Robertie's
opinions.

Best greetings

Alexander (acey_deucey in FIBS)

Fredrik Dahl

Feb 19, 1997

<stuff deleted)

>Problem 80 probably is wrong by Robertie (0.096 difference in the
>rollout).
>
>I can tell you some problems where Robertie is really badly wrong:
>Problem 48 (which is a clear take and not a pass)
>Problem 52 (22/18 is clear best, not even mentioned by Robertie)
>Problem 63 (another clear take)
>Problem 84 (equity difference 0.55! according to Level6 rollout and I
>believe this is not way wrong)
>Problem 99 (clear pass, not a take)
>Problem 149, 177a (both no double, Robertie would double)
>Problem 199b (no double, Robertie would pass!)
>Problem 252 (clear take, not a pass)
>Problem 332 (no double, Robertie would double)
>

Now take a look at 165, a half busted 21-backgame.
Robertie says the equity with white holding a 2cube is 1.5 points.
This is close to JF's evaluation without lookahead.
Level 6 rollouts give that it's very close to a take.
So if expert opinion is better than JF, it's rather unstable,
even for positions chosen by the expert...

This said, I recommend Robertie's books very much.
Although I believe the numbers from JF more than his in all but extreme
positions (and a normal well timed backgame is NOT one of those),
his way of explaining key concepts is great.
The books contain 400 very hard problems.

Fredrik Dahl.

Brian Sheppard

Feb 19, 1997

Fredrik Dahl <fred...@ifi.uio.no> wrote in article
<5eem49$j...@menja.ifi.uio.no>...
> <stuff deleted)

>
> >Problem 80 probably is wrong by Robertie (0.096 difference in the
> >rollout).
> >
> >I can tell you some problems where Robertie is really badly wrong:
> >Problem 48 (which is a clear take and not a pass)
> >Problem 52 (22/18 is clear best, not even mentioned by Robertie)
> >Problem 63 (another clear take)
> >Problem 84 (equity difference 0.55! according to Level6 rollout and I
> >believe this is not way wrong)
> >Problem 99 (clear pass, not a take)
> >Problem 149, 177a (both no double, Robertie would double)
> >Problem 199b (no double, Robertie would pass!)
> >Problem 252 (clear take, not a pass)
> >Problem 332 (no double, Robertie would double)
> >
>

> Now take a look at 165, a half busted 21-backgame.
> Robertie says the equity with white holding a 2cube is 1.5 points.

I want to offer many thanks to Fredrik Dahl and Alexander Nitschke
for offering concrete evidence with respect to rollouts of positions
from Robertie's book.

I have investigated our disagreements and come upon certain technical
issues that may contribute to them.

First, the source I used for rollouts was an on-line PostScript file
which is referenced in the FAQ. The JellyFish version used for those
rollouts was 1.0, rather than 2.0.

Second, Alexander Nitschke correctly points out that Level 6 does reverse
the opinion of Level 5 in quite a number of cases. The extra degree of
technical skill imparted by Level 6's lookahead seems to make a difference.

> You're right with these 1.33 points per game, but I'm sure there is no
> other case with an error greater than 0.2 points per game even in the
> backgame section.

I would very much like to have the 2000 rollouts that Alexander Nitschke
has performed, and I offer my assistance in publishing them, if that is
acceptable.

> Problem 84 (equity difference 0.55! according to Level6 rollout and I
> believe this is not way wrong)

I agree, and this is the largest rollout error that I found as well.

> I know of some weaknesses which can lead to wrong rollout results.
> Containing a hit checker in extreme cases is a weakness of JellyFish
> which leads to wrong rollout results in some backgames and some endgames.

I agree here. Even Level 6 does not play these situations particularly
well, since the difference between good and bad outfield coverage isn't
visible for several turns.

JF's tendency in such cases is to make outfield points, whereas the proper
technique is to spread men out in the 11 to 14 pip distance, and use
men that are closer to attack or slot key points.

> The errors sum up in most cases to small overall equity errors, not
> greater than 0.10 (which would already be a high number, 0.05 is more
> realistic I think).

I agree here, too, for the most part, with containment situations (and
backgames, which usually lead to containment situations) being a big
exception.

> Although I believe the numbers from JF more than his in all but extreme
> positions (and a normal well timed backgame is NOT one of those),

Here I disagree. JF's assessment of the backgame's potential is always
too low. There are several problems with JF's play in backgames that
render its judgments unsound.

First, JF plays the backgame side as though it were trying to win a race.
It will bury a man rather than expose a blot. It will hit the opponent, as
though it were possible to contain that man and win a priming game or race.
It will break its rearmost anchor so that a man is available at the edge of
the prime.
If the backgame's timing is unsound, JF will not play to rectify it.

Second, JF does not hang onto its back points long enough. JF seems to play
to save the gammon, rather than win. This is not surprising, because...

Third, when JF does hit a shot, JF mishandles the containment situation.
JF does not build outer table primes, and it does not spread its checkers
at the proper distance. I think its technique is to bring men around to
launch an attack on the exposed man. This technique is not the worst
possible, but it loses equity in many cases because the chance of picking
up a second man is lost.

When I play backgames against JF, I have noticed that it handles the
blockading side badly, too.

First, JF will double a backgame too early. When you take into account
how badly it will play for the rest of the game, you can even beaver some
of its doubles.

Second, JF will always give a backgame adequate timing. You can always
leave another blot open, and JF will hit it if it is at all possible. If it
is impossible, JF will hang around, waiting to hit it. The net result of
this is that every back game against JF becomes an extreme backgame--one
that might lead to a position like #127.

Third, JF doesn't deliberately play to crunch your inner board. It focuses
on bringing its men in cleanly, which is a worthwhile goal, to be sure, but
much, much better is to force a crunch, and only then bring the men in.

Fourth, and this is the biggest mistake of all, when you double after
hitting a man you get an unexpected bonus: JF beavers, even if you have a
solid prime!

So I distrust JF rollouts in which backgames are prominent. Chances are
that the trailing side is being misjudged.

> After all, I sound like a worshipper of JellyFish Level6 rollouts :-)

Not at all. You have done an admirable and thorough job of evaluating
the strengths and weaknesses of this tool. Thank you very much for
sharing your conclusions!

Warm Regards,
Brian

Chuck Bower

Feb 19, 1997

I wonder if Charles Anaille knew what he was starting when he
asked the innocent question: "How do you play the 31 response to
the opening 64--24/18, 13/9?" We've been seeing some animated
debate about motherhood and apple pie. Actually, though, this
kind of debate (which includes "evidence") is quite healthy. I
have appointed myself (for the moment) a moderator. Here are some
things to keep in mind (IMHO).

1) "Advanced Backgammon" (2nd edition) was released about 5 years
ago. If Robertie were to put out a 3rd edition, I suspect he would
make even more changes. Backgammon is in a very active state of
evolution, thanks partly to the availability of strong software
(and also to the internet explosion, both playing sites and
"publication" sites). You can't hold him hostage to something he
said five years ago.

2) Jellyfish (and its many crystal friends) are also evolving. I've
heard a rumor that a new edition of JF is in the works which will
have many of the current version's loopholes fixed (like not knowing how
to roll home a prime). Presumably some of the other bots (besides TDG)
will be coming out with commercial versions to compete with JF for
the market. I think that is great!

3) I agree with Brian that we should not "blindly" accept JF results.
Equally important is that we not "blindly" accept the opinions of
human experts... JF's plays can (and should) be gone over with a
fine-toothed comb (as Brian has done in some cases). Unfortunately
(OK, maybe fortunately) human experts cannot be scrutinized so easily,
except possibly when they "put it in writing." Hopefully these threads
won't deter them from continuing their inscriptions. (I doubt that
will happen. Could you imagine Jake Jacobs voluntarily censoring
himself!! I almost accidentally used the homonym "censure", but any
good writer will do that, by MY definition of "good".)

4) I don't think a person should get bent out of shape when someone
else offers a proposition. (Just like you shouldn't be offended if
one of your doubles is beavered.) Propositions are a healthy
learning tool in backgammon (though not necessarily in marksmanship).

We are currently in another "Renaissance" of backgammon. Ideas
(old and new) are being studied and debated constantly. If Barclay
Cooke and Ozzie Jacoby arose from the grave, I suspect they would get
trounced for a while, even by sub-experts. I emphasize "for a while."
I hope these spirited and enlightening debates continue. Now, if
we could just get those bots to stop cheating...

David Montgomery

Feb 19, 1997

In article <01bc1dd5$dc4ffb40$3ac0...@polaris.mstone.com> "Brian Sheppard" <bri...@mstone.com> writes:
>JF misevaluates Problem 127 ("Kauder's Paradox"). In this position
>JF thinks White is a big underdog, and actually considers Black to have
>over 50% chance of winning a gammon! This is quite wrong. Actually White
>is a clear favorite. If you take JF's side in that proposition, you will
>lose about 1.33 points per game, since JF drops a double it should beaver.

I was confused in my earlier reply to Brian regarding this position.
Based on Robertie's estimate, I would indeed lose 1.33ppg playing JF's
side on this position, not 2/3ppg as I wrote.

David Montgomery

Feb 19, 1997

This post starts out with my reply to Brian Sheppard's reply to my offer
to talk about playing props based on JF rollouts of Robertie's Advanced
Backgammon. This may be of interest only to me and Brian, I'm not sure.

Afterwards, however, are Kit Woolsey's and my comments on JF rollouts
from 4/7/96. This might be of interest to people who are new to Jellyfish.
(Then again, maybe not. :-)

In article <01bc1dd5$dc4ffb40$3ac0...@polaris.mstone.com> "Brian Sheppard" <bri...@mstone.com> writes:

>> > > I have a file of rollout results for all positions from Robertie's
>> > > Advanced Backgammon. My conclusion after examining the differences
>> > > between Robertie's opinion and JF rollouts is that Robertie is
>> > > probably correct in the majority of cases of disagreement.

I wrote, in part, via email:

>>What we could do is arrange to play a series of props based on the
>>positions in Robertie's books. One simple method would be to take
>>turns selecting positions to prop, with you taking Robertie's side, and
>>me taking JF's. If you're right, you'll come out way ahead.

Brian replies:

>Hi Monty,
>
>You are foolish to make this offer. If I accepted, your brashness would
>cost you almost 1 point per game. I will explain later.

I still don't feel foolish, even after reading your post, though perhaps
I should. In part I don't feel foolish because what I sent you was an
offer to start negotiations, not a contract on a particular position.
If you *were* interested, we would have to talk about stakes, how the
positions are selected, what it means to go with Robertie (he often
calls things marginal take/drops and the like) or JF (based on the context
in your post, I meant I would choose my side based on JF rollouts, not
evaluation, but we would still have to say whether we were relying on
the published numbers, or my own JF6 numbers, or what, and wrt the
published JF numbers there's the issue of the incorrect cash point setting
of 0.500), the number of trials for each prop (including the issue of
whether checker play props would be played for twice as many games),
and so forth. Despite perhaps appearing foolish, I'm still willing to
talk about it.

If you were interested, and depending on the rest of the arrangements,
I might be willing to make a very large side bet that you won't obtain
a result of anywhere near +1.00ppg, which again might be foolish, but we
can't really say until we work out some of the above points.

Brian wrote:
>Now, to support my position that JF rollouts should not be trusted
>blindly...

[Brian's comments on not trusting JF, and wrt Advanced Backgammon
problem #183 in particular deleted.]

>JF misevaluates Problem 127 ("Kauder's Paradox"). In this position
>JF thinks White is a big underdog, and actually considers Black to have
>over 50% chance of winning a gammon! This is quite wrong. Actually White
>is a clear favorite. If you take JF's side in that proposition, you will
>lose about 1.33 points per game, since JF drops a double it should beaver.

Perhaps I am miscalculating something, but based on Robertie's analysis
I get that I would lose 2/3ppg. Based on the JF rollouts, I would play
that it's a pass, and so I would pay you 1.0ppg to take. You would
beaver, and according to Robertie you would lose 1/3ppg, for a net of
2/3ppg to you. That's still a lot, and if this position were included
in our contract, I'd have to think very hard about where I could make
up the equity.
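
The arithmetic in the paragraph above can be written out as a short sketch. The function name and the sign convention are assumptions made for illustration. The first call reproduces the 2/3ppg figure. The second call shows what happens if the taker instead gains 1/3ppg by playing on, which is one reading that would match the "about 1.33 points per game" in Brian's quoted text.

```python
from fractions import Fraction

def loss_per_game(payment, taker_result):
    """Net points per game lost by the side that pays the opponent to take.

    payment: points paid up front in place of the pass (1 in the post).
    taker_result: the taker's expected points per game from playing on
    (negative means the taker loses on average).
    """
    return payment + taker_result

print(loss_per_game(Fraction(1), Fraction(-1, 3)))  # 2/3
print(loss_per_game(Fraction(1), Fraction(1, 3)))   # 4/3
```

The whole difference between the two figures is the sign of the taker's expectation after the beaver, so that is the number a prop on this position would really be testing.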

>My conclusion is that you place way too much faith in
>JF. A case in point: Monty actually offered to accept the
>proposition of my choice SIGHT UNSEEN! Is Monty's faith
>in JellyFish not bounded by common sense? What about
>yours?

Actually, I offered to negotiate a series of propositions to test
your suggestion that Robertie is more right than JF rollouts, and
I suggested an easy to understand method for doing that, so that
you would understand what I had in mind. (That's why I wrote "One
simple method would be...")

Actually, I don't accept JF rollouts blindly, and am aware of almost
all of the criticisms you have made of JF rollouts in recent posts.
Here is a post I made to the newsgroup almost a year ago regarding this:

Article 12849 of rec.games.backgammon:
Path: mimsy!cs.umd.edu!not-for-mail
From: mo...@cs.umd.edu (David Montgomery)
Newsgroups: rec.games.backgammon
Subject: Re: How best to do Jellyfish rollouts? (long)
Date: 7 Apr 1996 11:39:32 -0400
Organization: U of Maryland, Dept. of Computer Science, Coll. Pk., MD 20742
Lines: 282
Message-ID: <4k8njk$6...@twix.cs.umd.edu>
References: <4k6siv$2...@nyx10.cs.du.edu>
NNTP-Posting-Host: twix.cs.umd.edu

In article <4k6siv$2...@nyx10.cs.du.edu> fma...@nyx10.cs.du.edu (Farhan Malik) writes:
> I just bought the Jellyfish analyzer 2.0 and am trying to
>figure out the best way to perform rollouts. Depending on how I set
>the variables I get very different and even conflicting results.

[ rolling out 2 plays:
Play 1) 24/20/16* 13/9(2)
Play 2) 24/20/16* 8/4(2)
after an opening 4-1 played 13/9 6/5 ]

[ results so far:

Play 1) JF7 evaluation .486
Level 6 (36x) .516
Level 6 (36x) .451
Level 6 (106x) .439
Level 5 truncated (7776x) .484

Play 2) JF7 evaluation .461
Level 6 (36x) .413
Level 6 (36x) .559
Level 6 (106x) .530
Level 5 truncated (7776x) .491
]
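
A rough way to see how far apart the Level 6 runs listed above are is to pool them. The sketch below assumes that the "36x" and "106x" labels can be used as relative trial weights, which the post does not state, and the helper name is made up for illustration.

```python
def pooled(results):
    """Trial-weighted mean and min-to-max spread of (weight, equity) pairs."""
    total = sum(w for w, _ in results)
    mean = sum(w * e for w, e in results) / total
    equities = [e for _, e in results]
    return mean, max(equities) - min(equities)

# Level 6 figures copied from the results above.
play1 = [(36, 0.516), (36, 0.451), (106, 0.439)]
play2 = [(36, 0.413), (36, 0.559), (106, 0.530)]

for name, runs in (("Play 1", play1), ("Play 2", play2)):
    mean, spread = pooled(runs)
    print(f"{name}: pooled {mean:.3f}, spread {spread:.3f}")
```

Under that weighting assumption the pooled means differ by roughly 0.055, while the individual runs for Play 2 alone differ by almost 0.15, so the short runs on their own cannot separate the two plays.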

> I don't like the idea of truncated rollouts because they rely
>heavily on JF's evaluation of the position. If it is incorrectly
>evaluating the position then the results are not worth much. It doesn't
>seem to evaluate backgames well and the above position easily turns into
>one.

Well, it's true that truncated rollouts rely on JF's evaluations,
but most of the time, and for most positions, this isn't much of
a problem. This is because the errors in JF's evaluation will
in large part cancel out -- sometimes the evaluation will be too
high and other times too low. And JF evaluations are really pretty
good. Better than human evaluations, anyway. Some error may remain
if the game tends to develop into positions in which there is some
bias in JF's evaluation. By itself, this usually isn't too much
of a problem, because most positions tend to branch out into a wide
variety of types of positions, and the positions which don't, and
for which JF's evaluations are off, are often positions that you
can't trust JF with anyway. If you review the rollouts of Robertie's
_Advanced_Backgammon_, you can get a good feeling for the amount
of error that typically arises from using truncated rollouts.
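
The distinction drawn above, between evaluation errors that cancel and a bias that does not, can be illustrated with a small simulation. Everything in it is hypothetical: the true equity, the noise level, and the bias are made-up numbers, and the Gaussian noise model is an assumption rather than a description of JF.

```python
import random

def average_evaluation(true_equity, n_positions, noise_sd, bias, seed=0):
    """Average of noisy evaluations; `bias` is a systematic offset."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_positions):
        # Each evaluation is the true value plus a fixed bias plus random error.
        total += true_equity + bias + rng.gauss(0.0, noise_sd)
    return total / n_positions

print(average_evaluation(0.20, 20000, 0.10, bias=0.00))
print(average_evaluation(0.20, 20000, 0.10, bias=0.05))
```

With no bias the average lands very close to the true value despite large errors on individual positions. With a bias, more trials do not help: the average settles near the true value plus the bias.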

For the position in question, there should be very little trouble
with using truncated JF rollouts. JF understands opening checker
play very well, and the game is likely to evolve into a wide
variety of different kinds of positions, so there should be
relatively little bias due to truncation. I disagree that this
position will "easily" become a backgame. The first player should
generally be very much trying to avoid this scenario, and will usually
succeed. Certainly, with JF at the helm, this will very rarely
become a backgame.

The main advantages of truncated rollouts are two:
1) they are faster, and
2) they have lower variance. That is, they converge toward the
"infinite rollout" equity with fewer trials, on average.

Item two just means that you need fewer trials to get your
answer, so the advantage of truncated rollouts comes down
to just one thing, which is that they are faster.
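
The link between lower variance and fewer trials can be made concrete. The sketch below uses the standard relation that the standard error of a mean is the per-game standard deviation divided by the square root of the number of games. The two per-game standard deviations are hypothetical values chosen for illustration, not measured JF figures.

```python
import math

def trials_needed(per_game_sd, target_se):
    """Smallest number of trials whose mean has standard error <= target_se."""
    return math.ceil((per_game_sd / target_se) ** 2)

# Hypothetical per-game standard deviations; the truncated one is lower.
full_sd = 1.3
truncated_sd = 0.5
target = 0.02  # desired standard error of the rollout equity

print(trials_needed(full_sd, target))
print(trials_needed(truncated_sd, target))
```

Because the trial count grows with the square of the standard deviation, even a moderate reduction in variance cuts the required number of games sharply.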

The disadvantage of truncated rollouts is that sometimes they
are biased. This is less of a problem in a checker play rollout
(which is also when speed is more of a concern), but very
important for cube rollouts. But the more significant disadvantage
to truncated rollouts is that JF does not give you "live cube"
figures with truncated rollouts, which it does with non-truncated
rollouts. This is obviously a problem when you are rolling out
a cube action problem, but also a factor in many checker play
problems (see, for example, Jeremy Bagai's excellent article in
the Jan-Feb Inside Backgammon, or the solution to Inside Backgammon
quiz problem #110). For these reasons, I almost always do
complete rollouts, but truncated rollouts are not as suspect
as you think.

> I'm new to this rollout business and am not making much out of
>the above results. I'm also starting to think that JF rollouts are
>way overrated. I studied the JF rollouts of Robertie's Advanced
>Backgammon and I find Robertie's logic far more convincing than the
>rollouts in the vast majority of the problems.

Well, my guess is that you're overrating Robertie's logic. The
fact is, most interesting backgammon problems cannot be tackled
by logic. Over the board, we reason as best we can, but ultimately
we are just guessing based on our experience. Robertie recognizes
this himself. A few years back he sharply criticized a problem
solution by Kleinman (which was based on reasoning from general
principles), and backed up his criticism with (hand) rollouts.
Robertie wrote that backgammon was not "an exercise
in deductive logic" but rather, at least for correctly analyzing
positions, an exercise in empirical science. Rollout data is
exactly what is needed to determine the correct play, most
of the time.

The fact is that many of Robertie's solutions are after-the-fact.
Long propositions were played, and Robertie learned the result
[ or Robertie actually rolled the position out himself ]
and saved the position. In his book, he justifies the solution
based on logic or reasoning or breaking down the rolls or
emphasizing one very important feature of the position. In doing
this, he is showing the reader how one might approach the problem
over the board, which is exactly what you want to know to play
better backgammon. But the important thing to realize is that
the empirical data came first, and the reasoning to point you to
the correct play is derivative. Kit Woolsey has also often
emphasized this point, by saying how he has learned a lot from
trying to figure out rollout results which at first seemed unintuitive.

Now, as to whether JF rollouts are overrated -- I guess it depends
on the person and the position. JF rollouts are a tremendous source
of empirical data for a wide variety of positions. But they do
have their limitations. First of all, any rollout is subject to
statistical variation. So when results come out very close, there
is very good reason to be skeptical about the results' significance.
JF gives the standard deviations of the rollouts it performs, so
this can be a guide for that.
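
One way to use those standard deviations as a guide is to express the gap between two rollouts in units of its own uncertainty. The sketch below assumes the two rollouts are independent and that the reported figure is the standard error of each rollout's mean. The function name and sample numbers are made up.

```python
import math

def z_score(eq_a, se_a, eq_b, se_b):
    """Difference between two rollout equities in units of its standard error."""
    return (eq_a - eq_b) / math.sqrt(se_a ** 2 + se_b ** 2)

# Hypothetical rollout results: (equity, standard error).
close_call = z_score(0.491, 0.020, 0.484, 0.020)
clear_gap = z_score(0.550, 0.020, 0.430, 0.020)
print(close_call, clear_gap)
```

A value well below about 2 in absolute terms means the observed difference is comfortably within what chance alone could produce, which is the skepticism about close results described above.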

Secondly, any position can be misplayed. Putting aside for the
moment major thematic errors, small mistakes can be made favoring
one side or the other, and these small mistakes should add a little
more doubt to the significance of close results, even in positions
that we believe JF handles well.

Now, turning to the question of thematic errors, it's well documented
that JF has a few of these. Here are the ones that come to mind
right now:

- JF gets low results with outside primes -- the further outside,
the more irrelevant the results. JF doesn't completely understand
how to walk a prime home against a single trapped checker.
- JF doesn't understand well how and when to try for a second checker
after a bearoff hit.
- JF gets high results in many backgames. However, I think this
bias has been overemphasized. In backgames nearing resolution,
where the timing issue has been resolved, as is the case in many
forward (e.g., 34 or 45) backgames, JF's results are not that
far off. In these cases, JF may give up a little due to having
to walk its prime home after a hit, but probably not much. In
deeper backgames, JF gives up more because capturing a second checker
may be a significant consideration. Also, JF doesn't always
understand when to split its rear checkers to generate more shots.
In positions where the timing issue is not yet resolved, or where
there is still significant forward equity, as in a two-way game,
JF *may* give up significant equity because it often will avoid
the backgame strategy that a human would choose. I emphasize may
because I think JF is often right in avoiding the backgame, and that
human players are often wrong about this. JF probably gives up
the most in well-timed deep backgames where the leader is still
a long ways from the bearin.
- JF can get weird results in noncontact positions. This problem
has probably been reduced by the bearoff database in JF2.0, but
JF still isn't the best tool for these kinds of positions.
- JF gets low results for many priming positions against one back,
even when the prime is deep in the board. This is especially
true when slotting the back of the prime is important. JF very
often doesn't do this when it is correct.
- Wilcox Snellings thinks JF gets high results vs deep anchor
games, especially vs ace point games. I don't know whether this
is true or not, but it's plausible. Part of the equity of acepoint
games comes from capturing a second checker after a late hit.
- JF can get results that are off in what I call "runaround" positions.
These are positions where one side is trying to navigate the last
few checkers around the opposition. An example is: side A has
4 checkers each on the 1, 2, and 3 points, and 1 checker each on
the 4, 17 and 18 points; side B has a closed board, and 1 checker each
on the 18, 19, and 20 points. JF doesn't count shots, so sometimes
it makes significant checker play errors when rolling these positions
out.
- JF gets low results in bearoff hit positions in which there is
a lot of play. For example,

X O O . X . | | . . . . X . [2]
O O | |
O O | |
O O | |
| |
X X | | X X X
. . O X O X | | X X X . X X

X's home board. O has 5 off

With O owning a 2-cube, X's equity is about 0.70. JF gets
.261 cubeless, .345 after doubling to 2 (3888 trials).
Interestingly, humans tend to overrate the value of these
positions. [Kit Woolsey propped it as a pass.]
- many purely technical decisions are less amenable to rollouts,
whether by JF or humans. This is especially true if the technical
decision tends to repeat itself.
- because of the way JF uses the cube in live cube rollouts, sometimes
its cube numbers are way off. A common example is a position where
the trailer has a busted board, one checker back at the edge of a
five prime, and the leader has checkers back in the trailer's home
board. In this situation, the trailer may leap the prime and obtain
a double-in (which JF doesn't recognize), only to obtain a huge
cash one roll later. In general, if the trailer has only one common
recube variation, and this variation yields mostly weak doubles-in,
JF's live cube algorithm will not give accurate results.
- Another common live cube error ends up with the cube owner doing
*worse* owning the cube. Apparently when this happens JF has
erroneously played on for the gammon some of the time.

So yes, JF rollouts cannot be trusted implicitly. However, for
most positions JF rollouts are the best source for equities, and
considered carefully, the best tool for improving your game.

An interesting corollary to the fact that JF misplays the above
situations is that JF plays other types of positions *better*
than a human of overall equivalent strength would. This shows
up most prominently in the play of attacking positions, where
JF frequently gets results that are higher than humans get.

> I would appreciate some advice from those more experienced
>with rollouts as to how better utilize the program. What parameters
>work best for the above rollout?

Here's my advice:
-Always do rollouts in multiples of 36 (unlike the 106 game rollout)
and in multiples of 1296 if doing level 5 rollouts.
-If you have time and a fast enough computer, do complete rollouts.
This way you avoid any bias and get the cube numbers as well.
-When doing checker play rollouts, set the seed the same for all
the plays under consideration.
-Don't regard checker play results that are within 2 standard
deviations as anything significant. If you don't want to bother
to look at the standard deviations, as a rule of thumb, consider
differences of .10 significant for rollouts of 1296, .07 for
2592, .06 for 3888.
-There are decreasing returns as you roll positions out more times.
You will reduce the standard deviation, but if the equities are
still close, the errors in checker play are probably more significant
than the random error. I usually don't roll plays out more than
3888 times.
-When rolling out checker plays, go ahead and roll out all those
plays that fit the themes of the position, even if you don't think
they are candidate plays. Occasionally one of these plays you
didn't like will actually turn out to be best, and you'll learn
something. If you're short on time, do small short truncated
rollouts first to identify the real candidates.
-Look at both the cube numbers and the cubeless numbers.
-If you really want to understand what's going on in a cube action
situation, roll out several variations of the position, so that
you can see how they affect the equity. Use the same seed for
all of these rollouts.
-DON'T just believe the rollout results as though they came
from on high. But try to understand the sorts of positions
where the results are off, and why, so that you can know
when you can trust JF and when you should be skeptical, and
the probable direction of the error.
-If you suspect that the rollout is biased, you can look at how
JF plays the first numbers, or set up a few important variations
to see how it plays those. You may find that with level 6 it
does a better job, in which case use that. If it still seems
to be playing the position wrong on level 6, use the interactive
rollout feature. One approach would be to play it 36x with
you playing one side, JF level 6 the other, and then another
36x with you playing the other side and JF level 6 the first.
If you're right that JF is screwing the position up (and you're
not), you'll see it in the results.
-Be careful about interpreting the rollout results for a
particular match score. JF does all its rollouts based on
choosing the best cubeless plays, with gammons and backgammons
counting (and counting equally for both sides), so the results
may not be valid in a match situation. For many match scores,
there is no satisfactory way to set the JF cashing parameter
to give a reasonable match live cube rollout, so you are better
off interpreting the cubeless numbers.
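
The rule-of-thumb figures above (.10 for 1296, .07 for 2592, .06 for
3888) are what you would expect if the random error shrinks with the
square root of the number of trials. Here is a minimal sketch in Python,
assuming that square-root scaling and taking the .10-at-1296 figure as
the baseline. The function name is made up for illustration; it is not
anything JF provides.

```python
import math

def significance_threshold(trials, base_trials=1296, base_threshold=0.10):
    """Scale the rule-of-thumb threshold by 1/sqrt(trials).

    Assumption: random error falls off as the square root of the
    sample size, anchored on the .10-at-1296 rule of thumb.
    """
    return base_threshold * math.sqrt(base_trials / trials)

for n in (1296, 2592, 3888):
    # Rounded to two places, this gives the .10 / .07 / .06 rule of thumb.
    print(n, round(significance_threshold(n), 2))
```

The same scaling shows the decreasing returns: to halve the threshold
you need four times as many trials.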

My experience is mostly with using JF level 5 rollouts [because in
April '96 I had just gotten JF 2.0], but it
may well be better to use JF level 6 by default. It certainly
plays better on level 6, and with JF's variance reduction algorithm,
it's not a lot slower, effectively.
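
On the earlier advice about multiples of 36 and 1296: presumably the
point is that each of the 36 ordered first rolls (or, at 1296, each of
the 36 x 36 pairs of first two rolls) then comes up equally often, so
the opening luck cancels out of the comparison. The sketch below only
illustrates that idea in Python. It is an assumption about the reason,
not JF's actual code, and the function name is hypothetical.

```python
import itertools
from collections import Counter

def stratified_first_rolls(trials):
    """Return one first roll per trial, with every ordered roll
    (die1, die2) used exactly trials // 36 times.

    Illustrative only; assumes the rollout balances first rolls.
    """
    if trials % 36 != 0:
        raise ValueError("trials must be a multiple of 36")
    all_rolls = list(itertools.product(range(1, 7), repeat=2))
    return all_rolls * (trials // 36)

counts = Counter(stratified_first_rolls(1296))
print(len(counts), set(counts.values()))  # 36 distinct rolls, 36 times each
```

With a count like 106 the first rolls cannot all appear equally often,
so some of the opening luck stays in the result.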

>scriabin on FIBS

Hope this is of some use to you,

David Montgomery
monty on FIBS

For completeness, here is Kit Woolsey's post from 4/7/96 regarding
the same topic:

Article 12892 of rec.games.backgammon:
Newsgroups: rec.games.backgammon
Path: mimsy!cs.umd.edu!haven.umd.edu!purdue!lerc.nasa.gov!magnus.acs.ohio-state.edu!math.ohio-state.edu!howland.reston.ans.net!ix.netcom.com!netcom.com!kwoolsey
From: kwoo...@netcom.com (Kit Woolsey)
Subject: Re: How best to do Jellyfish rollouts?
Message-ID: <kwoolseyD...@netcom.com>
Organization: NETCOM On-line Communication Services (408 261-4700 guest)
X-Newsreader: TIN [version 1.2 PL1]
References: <4k6siv$2...@nyx10.cs.du.edu>
Date: Sun, 7 Apr 1996 06:41:51 GMT
Lines: 141
Sender: kwoo...@netcom8.netcom.com

[ Farhan Malik's questions, to which Kit was replying, deleted. ]

As you are finding out, you can need a pretty big sample size to get
accurate results with a full rollout. The luck factor can be pretty
large. For example, when I first had access to a rollout program I tried
rolling out an opening 4-2 1296 times and got the startling result that
13/11, 13/9 was a bit better than 8/4, 6/4! As you might guess this was
way off the end of the bell curve and a longer rollout quickly set things
straight, but it does give an idea of how large a sample size one might
need to be comfortable with full rollout results.

Truncated rollouts have two advantages. First of all, they obviously
take much less time. Secondly, the luck factor is cut down
considerably. This is because you aren't dependent on the lucky rolls at
the end of the game which determine the winner -- that is factored into
the jellyfish estimates. My experience has been that 1296 trials,
truncation 7, is quite sufficient for most play vs. play problems and
leads to good results. As an experiment, try taking some play vs. play
problem (avoid backgames -- JF does have problems there), and roll out
the two plays 1296 times truncation 7 (same seed to get the duplicate
dice, of course). Then try rolling them each out 10,000 times on a full
rollout (again, same seed). I predict that the relative results will be
very similar -- i.e. if the truncated rollout says that play A is .03
better than play B then the full rollout will say about the same. Note
that the truncated rollouts may give bad estimates for absolute equities
for various reasons, but for play vs. play problems they are very good.
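
Why the same seed matters for play vs. play comparisons can be shown
with a toy model. The Python sketch below is not a backgammon rollout:
each "game" is just a luck draw plus a small edge for the play, and all
names and numbers are invented for illustration. It compares how much
the measured difference between two plays wobbles when both rollouts
share their dice and when they don't.

```python
import random
import statistics

def toy_rollout(edge, trials, seed):
    """Toy stand-in for a rollout (not JF's algorithm): each trial is
    won (+1) or lost (-1) on a luck draw plus the play's edge."""
    rng = random.Random(seed)
    results = [1 if rng.uniform(-1, 1) + edge > 0 else -1
               for _ in range(trials)]
    return sum(results) / len(results)

def difference_spread(same_seed, repeats=50, trials=1296):
    """Standard deviation of (play A - play B) over repeated rollouts."""
    diffs = []
    for r in range(repeats):
        seed_a = r
        # Duplicate dice: B sees exactly the same luck draws as A.
        seed_b = r if same_seed else r + 10_000
        a = toy_rollout(0.03, trials, seed_a)  # play A, slightly better
        b = toy_rollout(0.00, trials, seed_b)  # play B
        diffs.append(a - b)
    return statistics.stdev(diffs)

print("same seed:      ", difference_spread(True))
print("different seeds:", difference_spread(False))
```

In this toy model the spread of the difference is several times smaller
with duplicate dice, because most of the luck hits both plays equally
and cancels out of the comparison.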

There are many things that can screw up a JF rollout. The most common is
that it is making some big thematic mistake on the first roll or two,
which it will be repeating over and over. If you are really curious
about a position I suggest you see how it plays the first couple of
moves. Also the program may have trouble handling the overall position
decently -- this is often a problem in some end-game positions
particularly backgames. In general, however, the program plays quite
well even on level 5 (which you have to use to get the fast rollouts),
and for most normal positions the results are very accurate.

As for not believing rollouts, you have to be careful. Sure it is easy
to be convinced by an expert's arguments. He is probably convinced
himself. The problem is that his arguments may be based on false
premises, which can lead to false conclusions. Except for certain
end-games or technical plays it is very difficult to *prove* that a play
is correct -- one can argue the plusses and minuses of a play, but if the
weighting of the parameters is wrong you can get the wrong result.

Let's look at your actual example: the 4-4 response to the 4-1 opening.
We might find Robertie saying:
Making the four point is clear. When the opponent has two men on the
bar, it is a must to go for the throat. The swing if he rolls a four is
enormous. The tactical gains outweigh the slight positional disadvantage
of giving up the eight point.
On the other hand, we might find Woolsey saying:
Making the nine point is clear. You have a very strong advantage, and it
is time to solidify things. Bringing the builders down gives you
ammunition to pounce wherever your opponent enters, and the solidity of
the nine and eight points will hold your advantage whatever happens. The
positional gains from this play outweigh the slight temporary advantage of
making the four point.
Both arguments are reasonable, but which one is right? The answer is we
don't know! Only our judgment and experience can guide us here. Your
initial impression was that Robertie's arguments are correct, and the
tactical considerations are overriding. The rollouts showed the plays to
be very close (which, in fact, I think they are). You have learned
something -- in this sort of position you are overweighing the tactical
considerations. Now that you have this benchmark result down pat, you
can use it to help you with other similar plays. For example, suppose
your opponent opens with a 5-4, plays 13/9, 13/8, and you roll 4-4.
Should you play 24/16*, 8/4(2) or 24/16*, 13/9(2)? My guess from your
comment about the original position is that, before seeing the rollout
results, you would have considered this a close call. Not any more! If
it was close with two men on the bar, then with only one man on the bar
the positional play of making the nine point must be considerably better
(which I believe it is). This is the way you improve your backgammon play.

When a rollout result is considerably different than what you would
expect, don't be quick to disbelieve it. Think about the position, and
see if maybe your weights of the relevant parameters are not correct.
Look hard -- there is almost always a reason for an unexpected rollout
result and if you can find that reason you will have learned a lot. I
have seen many experts (myself included) make plays which were .150 or
more worse than another play without having any idea that they were
making an error. There is a lot we have to learn about backgammon, and
the tool of the jellyfish rollout is by far the most valuable tool we
have today. The results are definitely not always gospel, but there is
often a lot of truth in them.

Kit

Chuck Bower

Feb 21, 1997, 3:00:00 AM

In article <01bc1a98$cc7370c0$3ac0...@polaris.mstone.com>,
Brian Sheppard <bri...@mstone.com> wrote:

>

>My gut feel is that the equity of owning the 5-point will increase
>relative to the other plays when the cube is taken into account.
>Recall that JF Level 6 rollouts are cubeless!

> (Brian S.)

I did 10,368 level-5 rollouts (JF v2.01) each for the hit + split
play and the building play for the 31 response to opening 64 (24/18, 13/9).
I set the settlement limit at 0.57 (chosen because the above rollout
results used with Rick Janowski's cube handling theories say that the
drop point is around an equity of 0.59 for either side). Choosing the
settlement limit a little (0.02) below the drop point hopefully offsets
the market losing sequences. Here are the rollout results for a centered
cube:

                   total   gammon + bg    bg

24/21, 8/7*
  opener wins       55.4       1.7       0.1
  responder wins    44.6       1.5       0.1    eq. = -0.111  sd. = 0.010

8/5, 6/5
  opener wins       56.4       1.6       0.0
  responder wins    43.6       1.8       0.1    eq. = -0.127  sd. = 0.010

JF also gives the cubeless level-5 results as -0.071(0.013) and -0.097(0.013)
respectively. (Note that, with access to the cube, the leader has better
equity than without access--which you should expect. Also, JF plays for
gammon if equity jumps from below the settlement limit to above 0.9 AND IF
gammon chances are greater than 10% of all games.)
Without going into the statistical analysis, these results
look consistent with the level-6 cubeless results above. Still no clear
winner. Go with comfort!
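
For anyone who does want the statistical analysis, a rough version is
short. The Python sketch below takes the equities and standard
deviations quoted above and expresses the gap between the two plays in
units of the standard deviation of the difference. It assumes the two
rollouts are independent; if they shared a seed, the true standard
deviation of the difference would likely be smaller than this.

```python
import math

def z_score(eq_a, sd_a, eq_b, sd_b):
    """Gap between two rollout equities, measured in standard
    deviations of the difference. Assumes independent rollouts."""
    return (eq_a - eq_b) / math.sqrt(sd_a ** 2 + sd_b ** 2)

# 24/21, 8/7* versus 8/5, 6/5, using the figures quoted above.
cube = z_score(-0.111, 0.010, -0.127, 0.010)
cubeless = z_score(-0.071, 0.013, -0.097, 0.013)

print(f"centered cube: {cube:.2f} sd")
print(f"cubeless:      {cubeless:.2f} sd")
```

Both gaps come out between 1 and 1.5 standard deviations, short of the
2 standard deviations usually asked for before calling one play better.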