Difficult Crawford checker play.

Sutter.

Jul 24, 1999
Hi,
I'm well behind and faced with this checker play conundrum.
Any thoughts?
Match to 9. Score X-O: 1-8 (Crawford) X to play (4 2)

+24-23-22-21-20-19-------18-17-16-15-14-13-+ 98
| X O O | | O O X |
| O O | | O O |
| O O | | O O |
| O | | O |
| O | | |
| |BAR| |
| | | |
| | | |
| | | |
| X X X X | | |
| X X X X X X | | X X X |
+-1--2--3--4--5--6--------7--8--9-10-11-12-+ 99

I still have some racing equity so should I move up? If I move up,
10/6 looks obvious except the bots don't like it.
Or is it better to hang back and try to be a nuisance?

Cheers Roland Sutter


Chuck Bower

Jul 24, 1999
In article <3799b820...@news.which.net>, <Rol@nd Sutter.> wrote:
>Hi
>I'm well behind and faced with this checker play conundrum
>Any thoughts?
>Match to 9. Score X-O: 1-8 (Crawford) X to play (4 2)
>
> +24-23-22-21-20-19-------18-17-16-15-14-13-+ 98
> | X O O | | O O X |
> | O O | | O O |
> | O O | | O O |
> | O | | O |
> | O | | |
> | |BAR| |
> | | | |
> | | | |
> | | | |
> | X X X X | | |
> | X X X X X X | | X X X |
> +-1--2--3--4--5--6--------7--8--9-10-11-12-+ 99
>
Although a gammon win for X would be nice, I think here
s/he is in bad enough shape that looking for the best play
to win the game is called for. So...

Put yourself in O's shoes for a moment. If you (X) move up
and cover the 6-point, then O will point on your head with
any combination of 1, 2, 3, 4 (16 rolls). (Note that if you
stay on the 23-point, these are still good rolls for O. X
is likely to come back in on a low point and be in a similar
type of situation as above.) Four rolls (55, 56, 66) are forced
behind your lone thorn. But what about the other 16 rolls,
which contain a small die and a large one? If O plays
behind you, then you can escape with a 5 or 6. If O
hits loose, you get a direct shot (or shots) with a 5 1/2 point board!
In those cases you are likely to be left alone for at least
one roll.
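
The roll counts in the paragraph above (16 + 4 + 16 = 36) can be checked by brute force. This is only a counting sketch; the split into "small" (1-4) and "large" (5-6) dice follows the post, and the variable names are made up for illustration.

```python
from itertools import product

# All 36 ordered outcomes of two dice.
rolls = list(product(range(1, 7), repeat=2))

small = {1, 2, 3, 4}
large = {5, 6}

# Both dice small: the rolls said to point on the lone checker.
both_small = [r for r in rolls if r[0] in small and r[1] in small]
# Both dice large: 55, 56, 65, 66.
both_large = [r for r in rolls if r[0] in large and r[1] in large]
# Exactly one small die and one large die.
mixed = [r for r in rolls if (r[0] in small) != (r[1] in small)]

print(len(both_small), len(both_large), len(mixed))
```

Whether each of those rolls can really be played as described depends on the position, which the count does not model.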

Suppose you hang back. How will the game develop? O will get
several rolls to try and point on your head. Sometimes s/he'll be
able to make the 1-point and then pick and pass. An anchor on
the 23-point is a different story, but it looks to me like moving
up and trying to race is X's best chance.


Chuck
bo...@bigbang.astro.indiana.edu
c_ray on FIBS

Bob Stringer

Jul 24, 1999
Rol@nd, Sutter. wrote:
>
> Hi
> I'm well behind and faced with this checker play conundrum
> Any thoughts?
> Match to 9. Score X-O: 1-8 (Crawford) X to play (4 2)
>
> +24-23-22-21-20-19-------18-17-16-15-14-13-+ 98
> | X O O | | O O X |
> | O O | | O O |
> | O O | | O O |
> | O | | O |
> | O | | |
> | |BAR| |
> | | | |
> | | | |
> | | | |
> | X X X X | | |
> | X X X X X X | | X X X |
> +-1--2--3--4--5--6--------7--8--9-10-11-12-+ 99
>
> I still have some racing equity so should I move up? If I move up,
> 10/6 looks obvious except the bots don't like it.
> Or is it better to hang back and try to be a nuisance?
>
> Cheers Roland Sutter

Hi Roland,

With an even race, I vote to run. One man back isn't going to be able
to fend off all his builders.

What really caught my interest, though, was your comment that the bots
don't like 10/6 in combination with moving up. That made no sense to
me, so I checked it on Snowie, which, lo and behold, gave these as the
top 5 plays:

1. 3 23/21 5/1 -0.162
2. 3 23/21 7/3 -0.258 (-0.096)
3. 3 23/21 6/2 -0.269 (-0.107)
4. 3 23/21 13/9 -0.276 (-0.114)
5. 3 23/21 10/6 -0.284 (-0.122)

I have no idea what's going on here. It *can't* be good to both run and
waste pips by bringing a man deep into your board. And what can
possibly be the point of the "best" move, 5/1?

Does anyone have any idea what the bots see that I don't?

Regards,

Bob Stringer
To reply please replace "REMOVE" with "bob" in my address

David Montgomery

Jul 24, 1999
In article <3799E84A...@pacbell.net>,
Bob Stringer <REMO...@pacbell.net> wrote:

>Rol@nd, Sutter. wrote:
>> Match to 9. Score X-O: 1-8 (Crawford) X to play (4 2)
>> +24-23-22-21-20-19-------18-17-16-15-14-13-+ 98
>> | X O O | | O O X |
>> | O O | | O O |
>> | O O | | O O |
>> | O | | O |
>> | O | | |
>> | |BAR| |
>> | | | |
>> | | | |
>> | | | |
>> | X X X X | | |
>> | X X X X X X | | X X X |
>> +-1--2--3--4--5--6--------7--8--9-10-11-12-+ 99
>>
>
>What really caught my interest, though, was your comment that the bots
>don't like 10/6 in combination with moving up. That made no sense to
>me, so I checked it on Snowie, which, lo and behold, gave these as the
>top 5 plays:
>
> 1. 3 23/21 5/1 -0.162
> 2. 3 23/21 7/3 -0.258 (-0.096)
> 3. 3 23/21 6/2 -0.269 (-0.107)
> 4. 3 23/21 13/9 -0.276 (-0.114)
> 5. 3 23/21 10/6 -0.284 (-0.122)
>
>I have no idea what's going on here. It *can't* be good to both run and
>waste pips by bringing a man deep into your board. And what can
>possibly be the point of the "best" move, 5/1?

I don't believe this at all. I think the bots are confused.
Perhaps after 5/1 Snowie then thinks some loose hits are called
for, but these loose hits turn out to be wrong when it gets
to the third ply, making the 5/1 play look smart since it
encouraged errors on the second ply.


--
David Montgomery Beltway Backgammon Club
davidmo...@netzero.net Washington DC area BG Tournaments
monty on FIBS and GG www.cs.umd.edu/~monty/bbc.htm


Rodrigo Andrade

Jul 24, 1999
> Match to 9. Score X-O: 1-8 (Crawford) X to play (4 2)
>
> +24-23-22-21-20-19-------18-17-16-15-14-13-+ 98
> | X O O | | O O X |
> | O O | | O O |
> | O O | | O O |
> | O | | O |
> | O | | |
> | |BAR| |
> | | | |
> | | | |
> | | | |
> | X X X X | | |
> | X X X X X X | | X X X |
> +-1--2--3--4--5--6--------7--8--9-10-11-12-+ 99
>

> 1. 3 23/21 5/1 -0.162


> 2. 3 23/21 7/3 -0.258 (-0.096)
> 3. 3 23/21 6/2 -0.269 (-0.107)
> 4. 3 23/21 13/9 -0.276 (-0.114)
> 5. 3 23/21 10/6 -0.284 (-0.122)

>It *can't* be good to both run and
>waste pips by bringing a man deep into your board.

Snowie isn't simply wasting pips: it is switching blots. Look at the X
builders on the 7-, 9-, and 10-points. Those builders have a far better
chance of covering the blots on points 5 and 6 than on points 1 and 6.

As for the deuce, the best play for X is 23/21 because X has no ammunition
to fight O on a backgame or try to anchor and hit a blot. Besides, O has
already passed all of X's men, so there's no way that X can intentionally
send other men to the bar. And above all, X is miles behind in the race. All
he can do is move 23/21 and cross his fingers to roll 55 or 66 if he wants
to have a chance.

Snowie's #1 play is a double bet. After the play, if O leaves a blot and X
hits it, X has good chances to trap it. If O does not leave a shot (or if O
does leave a shot and X misses), X can always get lucky and close the gap in
the race.

--
RODRIGO

===========================================================

"All religions of a spiritual nature are inventions of man. He has
created an entire system of gods with nothing more than his carnal brain.
Just because he has an ego and cannot accept it, he has had to externalize
it into some great spiritual device he calls 'God.'"

- The Satanic Bible
Anton Szandor LaVey

Bob Stringer

Jul 24, 1999
Rodrigo Andrade wrote:
>
> > Match to 9. Score X-O: 1-8 (Crawford) X to play (4 2)
> >
> > +24-23-22-21-20-19-------18-17-16-15-14-13-+ 98
> > | X O O | | O O X |
> > | O O | | O O |
> > | O O | | O O |
> > | O | | O |
> > | O | | |
> > | |BAR| |
> > | | | |
> > | | | |
> > | | | |
> > | X X X X | | |
> > | X X X X X X | | X X X |
> > +-1--2--3--4--5--6--------7--8--9-10-11-12-+ 99
> >
>
> > 1. 3 23/21 5/1 -0.162
> > 2. 3 23/21 7/3 -0.258 (-0.096)
> > 3. 3 23/21 6/2 -0.269 (-0.107)
> > 4. 3 23/21 13/9 -0.276 (-0.114)
> > 5. 3 23/21 10/6 -0.284 (-0.122)
>
> >It *can't* be good to both run and
> >waste pips by bringing a man deep into your board.
>
> Snowie isn't simply wasting pips: it is switching blots. Look
> at the X builders on the 7-, 9-, and 10-points. Those builders
> have a far better chance of covering the blots on points 5 and
> 6 than on points 1 and 6.

I fail to see the point of uncovering the 5 point, just so X can have
the *possibility* of covering it a second time with a builder from the
7, 9 or 10 points. 10/6 covers the 6 point *now* and leaves the 5 point
covered, with the checker on the 7 point all set to cover the ace point
if X rolls a 6. X should create a strong board now.

This position starts with 2 blots in the inner board, and Snowie's
"best" move ends up leaving 2 blots there. If X can make a 5th point in
the inner board now, then X should do it, because he may not be able to
do it later. Also, it's best to make the 6 point first, because if X
gets a shot right away, he'd much prefer O to have to come in on the ace
point instead of the 6.

As further evidence of Snowie's fuzzy "thinking" in this position, how
about the "second best" move, 7/3? That does absolutely nothing to
improve the position as far as I can see. It doesn't add a builder for
the ace point, and it takes one away from the 6 point. How can 7/3
possibly be better than 10/6?

>
> As for the deuce, the best play for X is 23/21 because X has no
> ammunition to fight O on a backgame or try to anchor and hit a
> blot. Besides, O has already passed all of X's men, so there's no
> way that X can intentionally send other men to the bar. And above
> all, X is miles behind in the race. All he can do is move 23/21 and
> cross his fingers to roll 55 or 66 if he wants to have a chance.

I agree that X should come up to the 21 point, but he's not miles behind
in the race. X has to make only 8 crossovers to get everyone home; O
has 7. Also, if you consider minus four pips with the roll as an even
race, then before he rolled X was 5 pips *ahead.*

>
> Snowie's #1 play is a double bet. After the play, if O leaves a blot
> and X hits it, X has good chances to trap it. If O does not leave a
> shot (or if O does leave a shot and X misses), X can always get
> lucky and close the gap in the race.

Again, I think the best bet to trap O is to close the 6 point now.
With only the ace point open, O isn't going to want to hit loose, and
after X plays 23/21, if O doesn't point on him, X is a favorite to
escape in an even race.

That's my opinion, anyway.

Gregg Cattanach

Jul 25, 1999
RE: this position:

| > I'm well behind and faced with this checker play conundrum
| > Any thoughts?
| > Match to 9. Score X-O: 1-8 (Crawford) X to play (4 2)
| >
| > +24-23-22-21-20-19-------18-17-16-15-14-13-+ 98
| > | X O O | | O O X |
| > | O O | | O O |
| > | O O | | O O |
| > | O | | O |
| > | O | | |
| > | |BAR| |
| > | | | |
| > | | | |
| > | | | |
| > | X X X X | | |
| > | X X X X X X | | X X X |
| > +-1--2--3--4--5--6--------7--8--9-10-11-12-+ 99

As far as Snowie is concerned, I compared 23/21 10/6 vs. 23/21 5/1 as
mini-rollouts and got these results:
1. M 23/21 5/1 Eq.: -0.288
0.1% 4.8% 35.9% 64.1% 5.3% 0.2%
2. M 23/21 10/6 Eq.: -0.327 (-0.039)
0.0% 1.1% 34.5% 65.5% 2.8% 0.1%

Somehow 5/1 generates more gammons (4.8% vs. 1.1%), which accounts for the
majority of the equity difference. The switching blot concept is what must
be going on. After 5/1, O may be 'forced' to hit loose on the 4 point. In
that case, if X return-hits with a 4, X has 6s, 5s, 2s, and 1s to cover
the 5 or 6 point, and on the next roll, if O fans, many other direct and
indirect numbers to finish the closeout. Covering the blot on the ace is
much more difficult. If it's just a straight race, then 10/6 would be my
choice too, I guess, but this sequence that could lead to gammons might make
all the difference. And of course, X needs all the gammons he can get.
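
The equities quoted with the mini-rollouts can be reproduced from the six percentages, which shows that they are cubeless money equities. The sketch below rests on an assumption not stated in the post: that the columns read backgammon wins, gammon wins, wins, losses, gammon losses, backgammon losses, with each figure including the more valuable results to its outside (wins include gammons, gammons include backgammons). The function name is made up for illustration.

```python
def cubeless_money_equity(bg_win, g_win, win, loss, g_loss, bg_loss):
    """Equity in points per game from percentages.

    Assumes cumulative columns: `win` includes gammons and backgammons,
    `g_win` includes backgammons, and likewise for the losses, so each
    column contributes exactly one extra point.
    """
    return ((win - loss) + (g_win - g_loss) + (bg_win - bg_loss)) / 100.0

play_5_1 = cubeless_money_equity(0.1, 4.8, 35.9, 64.1, 5.3, 0.2)
play_10_6 = cubeless_money_equity(0.0, 1.1, 34.5, 65.5, 2.8, 0.1)

print(round(play_5_1, 3), round(play_10_6, 3))
```

Under that assumption the first line comes out at -0.288, matching the rollout, and the second at about -0.33, within rounding of the -0.327 shown. Nothing in this formula knows about the match score.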

Question for the group: If you are O, (after 23/21 5/1) would you hit loose
on the 4 point, or just dump checkers behind the 21 point? Or was my
'theory' about how to get the extra gammons above all wet?

--
Gregg Cattanach
gcattana...@prodigy.net
Zox at GamesGrid, VOG
http://gateway.to/backgammon

Bob Stringer <REMO...@pacbell.net> wrote in message
news:3799E84A...@pacbell.net...
| Rol@nd, Sutter. wrote:
| >
| > Hi
| >


| > I still have some racing equity so should I move up? If I move up,
| > 10/6 looks obvious except the bots don't like it.
| > Or is it better to hang back and try to be a nuisance?
| >
| > Cheers Roland Sutter
|
| Hi Roland,
|
| With an even race, I vote to run. One man back isn't going to be able
| to fend off all his builders.
|

| What really caught my interest, though, was your comment that the bots
| don't like 10/6 in combination with moving up. That made no sense to
| me, so I checked it on Snowie, which, lo and behold, gave these as the
| top 5 plays:
|

| 1. 3 23/21 5/1 -0.162
| 2. 3 23/21 7/3 -0.258 (-0.096)
| 3. 3 23/21 6/2 -0.269 (-0.107)
| 4. 3 23/21 13/9 -0.276 (-0.114)
| 5. 3 23/21 10/6 -0.284 (-0.122)
|

| I have no idea what's going on here. It *can't* be good to both run and
| waste pips by bringing a man deep into your board. And what can
| possibly be the point of the "best" move, 5/1?
|

| Does anyone have any idea what the bots see that I don't?
|

Bob Stringer

Jul 25, 1999
This makes some sense to me. After 23/21 5/1, *both* X's and O's
gammons go up. That suggests that O must be hitting loose. But
consider the following.

First the preliminaries. Assume X plays 23/21 10/6, and O then hits
loose. In the abstract, it seems that X is not a favorite to hit the
man on 23 [total of 15 shots -- any 4, 1-1, 2-2 and 3-1]. If X doesn't
hit, O will then be a favorite to cover and make a 5 point prime. Of
the 15 shots that hit, 11 both hit the man on 23 and cover either the 5
or 6 point. If X gets one of those 11 rolls, then O will not be a
favorite to enter, and X will be a favorite to cover the remaining
point.
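
The shot count in the paragraph above can be tallied directly. This sketch simply counts the rolls as listed (any 4, 1-1, 2-2 and 3-1) out of 36; it does not check whether intermediate points are open, which depends on the position. The helper name is made up for illustration.

```python
from itertools import product

def hits(d1, d2):
    # Hitting rolls exactly as listed in the post; blocked moves are not modelled.
    return 4 in (d1, d2) or (d1, d2) in {(1, 1), (2, 2), (1, 3), (3, 1)}

shots = [r for r in product(range(1, 7), repeat=2) if hits(*r)]

print(len(shots), round(len(shots) / 36, 3))
```

Fifteen rolls out of 36 is a little under 42%, which is why X is not a favorite to hit.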

However, all of this assumes that O is only going to leave one blot
after hitting loose. How's that going to happen? Any combination of
1's, 2's, 3's and 4's is going to hit *and* make the 23 point. Hitting
loose is going to be an issue only if one die is a 1, 2, 3 or 4, and the
other is a 5 or 6. But in that case O has to leave 2 blots in order to
hit, and that can't be right [can it?].

Because of this, what I imagine is happening in these rollouts is that O
first makes another point [probably the 22 or the 23], X doesn't escape
(even though he's now a favorite to do so), and O *then* hits loose --
one checker hits X on 23, and the other lands safely on the new inner
board point.

Does that sound right?

What still seems strange to me is this: if O is going to hit loose,
*whatever* the circumstance, why isn't X better off with his 6 point
covered for sure? Maybe X isn't going to completely close his board as
often as he will with the 5/1 play, but after 10/6, whenever O enters,
he is *always* going to have to enter on the ace point, and behind a 5
point prime at that. Leaving the 5 and 6 points uncovered for even a
moment just seems too risky to me. While X may cover them more often, O
also is going to escape more often -- hitting X and putting him behind a
4 prime in the process.

And one last question. What about Snowie's *second* best move, 23/21
7/3? That one still makes absolutely no sense to me. I can't see how
that improves X's gammon chances at all. But since there was more to
the 5/1 move than I originally thought, maybe the same is true of 7/3.

David Montgomery

Jul 26, 1999
In article <379B35FA...@pacbell.net>,
Bob Stringer <REMO...@pacbell.net> wrote:

>Gregg Cattanach wrote:
>> | > Match to 9. Score X-O: 1-8 (Crawford) X to play (4 2)
>> | > +24-23-22-21-20-19-------18-17-16-15-14-13-+ 98
>> | > | X O O | | O O X |
>> | > | O O | | O O |
>> | > | O O | | O O |
>> | > | O | | O |
>> | > | O | | |
>> | > | | | |
>> | > | X X X X | | |
>> | > | X X X X X X | | X X X |
>> | > +-1--2--3--4--5--6--------7--8--9-10-11-12-+ 99
>>
>> As far as Snowie is concerned, I compared 23/21 10/6 vs. 23/21 5/1 as
>> mini-rollouts and got these results:
>> 1. M 23/21 5/1 Eq.: -0.288
>> 0.1% 4.8% 35.9% 64.1% 5.3% 0.2%
>> 2. M 23/21 10/6 Eq.: -0.327 (-0.039)
>> 0.0% 1.1% 34.5% 65.5% 2.8% 0.1%
>>
>> Somehow 5/1 generates more gammons, (4.8% vs. 1.1%) which accounts for the
>> majority of the equity difference. The switching blot concept is what must
>> be going on. After 5/1, O may be 'forced' to hit loose on the 4 point. In

This still all seems like silliness to me - I think Snowie is just messing
it up. O can simply ignore X's side of the board and play the same way.
O isn't forced to hit loose.

>What still seems strange to me is this: if O is going to hit loose,
>*whatever* the circumstance, why isn't X better off with his 6 point
>covered for sure?

I think he is. After 10/6 Snowie knows enough not to hit loose. After
5/1 sometimes Snowie mistakenly hits loose because of the weakness in
X's position. That's my theory unless someone shows it's wrong. 5/1
just can't be right!

>But since there was more to
>the 5/1 move than I originally thought, maybe the same is true of 7/3.

I don't think there is any more to either one of them. They're both
stupid. :-)

Bob Stringer

Jul 27, 1999
David Montgomery wrote:

> [snipped]


>
> I don't think there is any more to either one of them. They're both
> stupid. :-)

This raises the question that I posed on this news group a few months
ago. At the time I got an informative response or two, but no one ever
really addressed the ultimate issue. So here it is again. What's the
basis for having any confidence in a rollout?

If you don't know which is the better strategy/tactic/move in a
particular position, and you question Snowie's evaluation at 3-ply, you
do a rollout. Snowie proceeds to play out the position numerous times.
However, why should only the specific position that you're investigating
pose a difficult question? On the very next roll, the very same, or a
similar, strategical or tactical issue may be presented again. Or maybe
an entirely different, but still difficult decision will have to be
made. Snowie is, I assume, going to be predisposed to handle certain
positions in certain ways, and the entire rollout is going to be
conducted according to those predispositions. And even if Snowie
doesn't have any predispositions, Snowie will still have to make
difficult decisions, and in each case it will have to proceed on the
basis of a decision that may or may not be correct. If you're not sure
what the "best" move is to start out with, and you don't know whether
Snowie is making the best decisions in subsequent positions, what's the
basis for your confidence in the rollout?

An example: assume the issue is whether to hold an anchor or to run.
You're not sure of Snowie's 3-ply evaluation, and so you do a rollout.
But when you're rolling out the hold decision, on the very next roll the
question whether to run or hold may still be there. In *that* case,
Snowie may play the position according to whatever principles led to
its 3-ply evaluation of the original position -- the very evaluation
that you've questioned. If Snowie got it wrong the first time (i.e., at
3-ply), what's to prevent it from getting it wrong again? And again,
and again?

Our "difficult Crawford checker play" is another example. At 3-ply
Snowie says that the move which appears to "obviously" be the best one
is only *fifth* best. So we do a rollout to investigate further. And,
evidently, whatever led Snowie to misjudge the position at 3-ply also
leads to a suspicious rollout. In this case, I assume we simply reject
the rollout. But what about other, more difficult positions? In fact,
in positions that involve anything more than racing, how do we *ever*
have confidence that a rollout yields the "correct" play?

I don't think it's any answer that we can have confidence in the rollout
because Snowie has proven over time that it's a good BG player. The
same argument can be used to justify Snowie's decisions at 3-ply. Yet
when we question a 3-ply decision by Snowie, we do a rollout on Snowie.
It seems rather circular.

Frankly, I have the very same question about rollouts that are done by
humans. If an expert is not sure of the correct strategy in a
particular position, how can he do an effective rollout if subsequent
positions keep presenting similar strategic decisions?

I am not suggesting that rollouts in general are worthless or even
suspect. I'm not an expert, and the fact that experts rely on rollouts
is good enough for me. All I'm saying is that, in light of the
foregoing considerations, I don't understand the basis for confidence in
rollouts.

Can anyone shed light on this?

Sutter.

Jul 27, 1999

I can only speak for this example. See my reply in the "difficult
crawford checker play thread" which I posted before I read this post.


>
>I don't think it's any answer that we can have confidence in the rollout
>because Snowie has proven over time that it's a good BG player. The
>same argument can be used to justify Snowie's decisions at 3-ply. Yet
>when we question a 3-ply decision by Snowie, we do a rollout on Snowie.
>It seems rather circular.
>
>Frankly, I have the very same question about rollouts that are done by
>humans. If an expert is not sure of the correct strategy in a
>particular position, how can he do an effective rollout if subsequent
>positions keep presenting similar strategic decisions?
>
>I am not suggesting that rollouts in general are worthless or even
>suspect. I'm not an expert, and the fact that experts rely on rollouts
>is good enough for me. All I'm saying is that, in light of the
>foregoing considerations, I don't understand the basis for confidence in
>rollouts.
>
>Can anyone shed light on this?

Well, put it this way: I doubt there are many positions Snowie can't
roll out correctly, provided the settings are correct, and I doubt any
player in the world would put his/her money up against it. It even
plays back games well at 3-ply if the situation arises.

Sutter.

Jul 27, 1999
On 26 Jul 1999 03:49:17 -0400, mo...@cs.umd.edu (David Montgomery)
wrote:

>In article <379B35FA...@pacbell.net>,
>Bob Stringer <REMO...@pacbell.net> wrote:
>>Gregg Cattanach wrote:
>>> | > Match to 9. Score X-O: 1-8 (Crawford) X to play (4 2)
>>> | > +24-23-22-21-20-19-------18-17-16-15-14-13-+ 98
>>> | > | X O O | | O O X |
>>> | > | O O | | O O |
>>> | > | O O | | O O |
>>> | > | O | | O |
>>> | > | O | | |
>>> | > | | | |
>>> | > | X X X X | | |
>>> | > | X X X X X X | | X X X |
>>> | > +-1--2--3--4--5--6--------7--8--9-10-11-12-+ 99
>>>
>>> As far as Snowie is concerned, I compared 23/21 10/6 vs. 23/21 5/1 as
>>> mini-rollouts and got these results:
>>> 1. M 23/21 5/1 Eq.: -0.288
>>> 0.1% 4.8% 35.9% 64.1% 5.3% 0.2%
>>> 2. M 23/21 10/6 Eq.: -0.327 (-0.039)
>>> 0.0% 1.1% 34.5% 65.5% 2.8% 0.1%
>>>

>>> Somehow 5/1 generates more gammons, (4.8% vs. 1.1%) which accounts for the
>>> majority of the equity difference.
Snowie is misevaluating X's gammon chances because its evaluation
is money based. At this score O will be playing a more cautious game
than for money, i.e. hitting loose will be avoided (contrary, perhaps, to
money play strategy). The rollouts above are money-based rollouts. The
fact is that if a shot presents itself at all, O will have cautiously
advanced much further and gammon chances will be slim. A rollout with
checker play according to score shows both plays about equal (equity
about -0.32) with gammon chances of only 0.7 pct, which makes sense with
what I've speculated above.

Thanks for all the replies


Roland

Ilia Guzei

Jul 28, 1999
>
>
> This raises the question that I posed on this news group a few months
> ago. At the time I got an informative response or two, but no one ever
> really addressed the ultimate issue. So here it is again. What's the
> basis for having any confidence in a rollout?

I am no expert on rollouts so the following is a speculation. To make a
thorough statistical evaluation of a position (which I suppose rollouts have
been invented for) one needs to play each roll out of the 21 possible rolls
after every roll. Therefore, for example, if the position requires each
player to make 2 moves during the bearoff, then the number of positions to be
evaluated is 21*21*21*21, which is 194481 positions. How many rolls does an
average game take - 50 rolls by each player? Due to the limitations of the
output of my calculator let's say 38 rolls by each player. The total number
of rolls is 76. The number of positions to be evaluated is
21^76 = 37 * 10^99.
For simple bearoff positions my own program does 3 million rolls in 2.70
seconds on a 400MHz PC. At this rate
37*10^99 positions would take
(37*10^99) / (3*10^6) * (2.70) = (33*10^93) sec. Taking into
consideration that 1 day has 86400 sec a complete roll out will take forever
and ever. Geez, this way the entire game of backgammon can be tabulated and
made finite.

Alternatively, let's see how many complete roll considerations (evaluating
21 rolls after each roll) can be made in an hour.
21^Y = 4*10^11.
Y = 9. This means that only a sequence of 9 rolls (each and every from the
21) can be considered in 1 h. Say 5 by the player who needs to make the
decision and 4 by his opponent. After that the computer is to evaluate the
winning chances based on its artificial wisdom. This approach is
reasonable, perhaps. On the other hand the computer can be forced to roll
random numbers and play the ensuing positions but then there is a chance
that its rolls will have nothing in common with your actual rolls.
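
For anyone who wants to redo the arithmetic above, here is a minimal Python sketch. It assumes only the figures given in the post (21 distinct rolls, 76 rolls per game, 3 million rolls in 2.70 seconds); the variable names are made up for illustration. Run as written, it puts 21^76 at about 3.1 * 10^100 and the full-width depth reachable in one hour at 7 rolls, so the figures quoted above are best read as rough orders of magnitude.

```python
# Rate quoted in the post: 3 million rolls in 2.70 seconds.
rolls_per_second = 3_000_000 / 2.70

# Size of the full tree: 21 distinct rolls at each of 76 turns (exact integer).
full_tree = 21 ** 76
seconds = full_tree / rolls_per_second
years = seconds / (86400 * 365)

# Deepest full-width search (all 21 rolls at every turn) that fits in one hour.
per_hour = rolls_per_second * 3600
depth = 0
while 21 ** (depth + 1) <= per_hour:
    depth += 1

print(f"{full_tree:.2e} positions, {seconds:.2e} s, {years:.2e} years")
print(f"{per_hour:.2e} rolls per hour, full-width depth {depth}")
```

Either way the conclusion of the post stands: an exhaustive rollout is out of reach, and only a handful of rolls can be searched full-width.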

The correct answer about the rollout algorithm used by bots should come from
their creators, but it's likely to be a trade secret.

balashiha

Gary Wong

Jul 28, 1999
Ilia Guzei <igu...@iastate.edu> writes:
> > This raises the question that I posed on this news group a few months
> > ago. At the time I got an informative response or two, but no one ever
> > really addressed the ultimate issue. So here it is again. What's the
> > basis for having any confidence in a rollout?
>
> I am no expert on rollouts so the following is a speculation. To make a
> thorough statistical evaluation of a position (which I suppose rollouts have
^^^^^^^^

> been invented for) one needs to play each roll out of the 21 possible rolls
> after every roll. Therefore, for example, if the position requires each
> player to make 2 moves during the bearoff, then the number of positions to be
> evaluated is 21*21*21*21, which is 194481 positions. How many rolls does an
> average game take - 50 rolls by each player? Due to the limitations of the
> output of my calculator let's say 38 rolls by each player. The total number
> of rolls is 76. The number of positions to be evaluated is
> 21^76 = 37 * 10^99.

Whoa... being thorough is one thing, but performing over a googol evaluations
to roll out a single position is quite another! Fortunately we don't have to
evaluate every position that could potentially result from each play to come
up with an answer we can have some reasonable confidence in; by taking a
shortcut here and there (and adding a dose of sampling theory) we can get away
with much less work than that.

The million dollar question is simple enough: out of all the games that could
result from playing this position, how many do we win (and how many of our
wins and losses are gammons, and how many are backgammons)? The model is
exactly the same as if we had an urn with a googol balls in it (it's a big
urn), and many of the balls have "win" written on them, and some say
"gammon loss", and if we look hard enough there are a few that read "backgammon
win", and so on. (Balls and urns are to probability theorists what teapots
and chequerboards are to computer graphics researchers, or "squeamish
ossifrage" is to cryptographers -- they seem to come with the territory.)
Instead of having the patience to count the googol balls, we just give the
urn a really good shake and then pull 100 balls out without looking, and say
for instance "Well, I got 53 wins, 31 losses, 9 gammon wins, 6 gammon losses,
and a backgammon win -- looks like my equity's roughly +0.26." and go home.
If we were a bit more thorough (but there's still a long way between my
"thorough" and yours!), we could go a bit further and figure out that by
cheating and measuring the sample proportions instead of the population
proportions, we introduced a standard error of 0.06 into our result.
(Of course, the trick is to select a sample size that's big enough that
you reduce the standard error to a tolerable level, but small enough that
the answer arrives before you get bored.)
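
The urn example can be worked through in a few lines. This is a sketch under a stated assumption: each ball is scored as 1, 2 or 3 points for a single game, gammon or backgammon, positive for wins and negative for losses, and the five counts are separate categories adding up to 100 draws. Under that scoring the sample gives an equity of about +0.31 with a standard error near 0.12, so the figures in the paragraph above are best read as rough illustrations.

```python
from math import sqrt

# Points per ball -> number of balls drawn (counts taken from the example above).
sample = {+1: 53, -1: 31, +2: 9, -2: 6, +3: 1}

n = sum(sample.values())
mean = sum(points * count for points, count in sample.items()) / n

# Unbiased sample variance of a single draw, then standard error of the mean.
var = sum(count * (points - mean) ** 2 for points, count in sample.items()) / (n - 1)
std_err = sqrt(var / n)

print(n, round(mean, 2), round(std_err, 2))
```

Since the standard error shrinks with the square root of the sample size, quadrupling the number of balls drawn halves it.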

It will come as no surprise that a rollout with a limited number of trials
follows exactly the same procedure. It's sufficient to say that the
proportion of wins/gammons etc. that come up when Jellyfish plays against
itself (say) 1296 times, aren't likely to vary all that much from the
proportion we would get if we measured the proportion of results in every
game we could possibly get of Jellyfish playing against itself. (Of course,
there may still be some doubt whether the results of JF vs. JF are
representative of the results of a perfect player vs. a perfect player, or
of you vs. Joe Average, but that's another story.)

> The correct answer about the rollout algorithm used by bots should come from
> their creators, but it's likely to be a trade secret.

Well, not all bot creators trade them, and they don't all keep secrets! :-)
In GNU Backgammon (ftp://ftp.cs.arizona.edu/people/gary/gnubg-0.0.tar.gz),
the function Rollout() in eval.c implements the procedure described above,
with the following improvements:

* Truncation: instead of rolling out all the way to the end of the game,
it can stop and pretend its evaluation after a few plies is perfect.
This may obviously introduce some amount of systematic error, but
in practice this may not matter because:
- it makes rollouts much faster, which means you can do more of
them (and thus trade sampling error for systematic error);
- different positions will be reached in different trials, so
the correlation between errors in each trial weakens and the
errors cancel out to some extent;
- if you are rolling out the positions after making different
plays, then any remaining systematic error between the two
rollouts is likely to be somewhat correlated and so the
error in the comparison between the plays is hopefully
small. This implies that truncated rollouts are better for
estimating _relative_ equity ("which is the better move here,
13/10*/9 or 13/10* 6/5*?") than _absolute_ equity ("at this
match score I need 29% wins to accept a dead cube; can I
take in this position?").

* Race database truncation: when the game enters its 2-sided bearoff
database, gnubg can estimate the probability of winning from that
position with no error at all (it can play and evaluate endgame
positions perfectly), which saves time and avoids introducing the
errors that can result from large equity variances at the end of
the game.

* Variance reduction: when using lookahead evaluations, it can reduce
errors by making use of the equity difference from one ply to
the next. (This can be interpreted as either cancelling out the
estimated "luck" (ie. the difference in equity evaluations before and
after rolling) or using subsequent evaluations to estimate the
error in prior ones; the two views are equivalent). gnubg automatically
performs variance reduction when looking ahead at least one ply.

* Stratified sampling: uses quasi-random number generation instead of
pseudo-random number generation (this is a standard technique in
Monte Carlo simulations where having a near-perfect uniform distribution
in your sample is more important than unpredictability). gnubg only
stratifies the first 2 plies of a rollout, though it would be easy
enough to extend it to the remainder.
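
To make the variance reduction idea concrete, here is a minimal Python sketch. It is not gnubg's code: the "game" is a toy two-player race (first side to bring its pip count to zero wins, doubles count four times the die), and `evaluate`, `step`, `one_trial` and `rollout` are hypothetical names invented for the illustration. Each trial subtracts the accumulated "luck" (the evaluation after the actual roll minus the average evaluation over all 36 rolls) from the final result. The luck term has zero mean, so the average is unchanged, but the spread between trials should shrink whenever the evaluator is reasonable.

```python
import math
import random
import statistics

# All 36 ordered rolls of two dice; doubles move four times the die.
ROLLS = [(a, b) for a in range(1, 7) for b in range(1, 7)]

def pips_moved(roll):
    a, b = roll
    return 4 * a if a == b else a + b

def evaluate(p0, p1, turn):
    """Crude static estimate of P(player 0 wins); `turn` is the side on roll."""
    if p0 <= 0:
        return 1.0
    if p1 <= 0:
        return 0.0
    # Being on roll is treated as worth about half an average roll (assumption).
    lead = (p1 - p0) + (4 if turn == 0 else -4)
    scale = max(1.0, 0.8 * math.sqrt(p0 + p1))
    return 1.0 / (1.0 + math.exp(-lead / scale))

def step(p0, p1, turn, roll):
    """Apply a roll for the side on roll and pass the turn."""
    n = pips_moved(roll)
    return (p0 - n, p1, 1) if turn == 0 else (p0, p1 - n, 0)

def one_trial(p0, p1, rng):
    """Play one race to the end; return (raw result, luck-adjusted result)."""
    turn, luck = 0, 0.0
    while p0 > 0 and p1 > 0:
        # Average evaluation over every possible roll from this position.
        expected = sum(evaluate(*step(p0, p1, turn, r)) for r in ROLLS) / 36
        p0, p1, turn = step(p0, p1, turn, rng.choice(ROLLS))
        # Luck of this roll: what we got minus what we expected (zero mean).
        luck += evaluate(p0, p1, turn) - expected
    result = 1.0 if p0 <= 0 else 0.0
    return result, result - luck

def rollout(p0, p1, trials, seed=1):
    rng = random.Random(seed)
    raw, adjusted = zip(*(one_trial(p0, p1, rng) for _ in range(trials)))
    return raw, adjusted

raw, adjusted = rollout(60, 65, trials=2000)
print("plain   :", statistics.mean(raw), statistics.pstdev(raw))
print("adjusted:", statistics.mean(adjusted), statistics.pstdev(adjusted))
```

The two means estimate the same quantity; the smaller standard deviation of the adjusted results is what lets a short rollout carry the information of a much longer one.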

There are undoubtedly other possible heuristics, but I have only implemented
the above. This description has glossed over many of the finer points that
are involved -- if you are interested in the details I recommend that you
examine the source code referred to above, or read some earlier articles
posted here that describe various topics with more precision. Look for
articles by Michael Zehr, David Montgomery, Chuck Bower and Brian Sheppard
for starters.

Cheers,
Gary.
--
Gary Wong, Department of Computer Science, University of Arizona
ga...@cs.arizona.edu http://www.cs.arizona.edu/~gary/

Bob Stringer

Jul 28, 1999, 3:00:00 AM
It may be that I'm simply too unsophisticated in mathematics (I'm not
being sarcastic) to understand that my basic question has just been
answered, but I don't think that it has.

Here's the question again in distilled form: I've just gotten an
evaluation of a position by a bot. I'm not sure that I trust it.
Therefore, how do I test the *bot's* judgment? By using the *bot* to do
a rollout.

But how can I have *any* confidence that whatever predispositions or
tendencies, or whatever you want to call them, that led the bot to
initially evaluate or misevaluate (I don't know which, hence the
rollout) the position a certain way isn't going to lead the bot to
handle the rollout incorrectly? A rollout isn't just a matter of making
a whole bunch of rolls that play themselves -- someone has to exercise
*judgment* as to how all the rolls should be played.

I actually was a math major for two years (I got out because it bored
me), so I'm not completely dense about numbers. I understand the value
of a rollout when only a race is involved, because then the value of the
rollout is [almost] purely a matter of statistics, and I do consider
statistics valuable when properly used.

The issue I've raised, though, goes to the fact that when the bot does
the rollout, the very entity whose "judgment" is under investigation
(the bot) also makes the judgments about how the various positions in
the rollout should be evaluated and played. In short, the judge is
judging the judge.

I'm perfectly willing to accept that rollouts are valid and should
inspire confidence, because BG experts feel that way. I'm just saying
that I myself presently don't see the basis for that confidence, and so
far no one has explained it to me.

Doesn't *anyone* else see the issue that I do?

Regards,

Andrew Mill

Jul 29, 1999, 3:00:00 AM
For what it's worth, I asked myself the same question when I first learned
what a backgammon rollout is. But, similar to yourself, I've accepted that
it must be more useful than a simple 3-ply evaluation at least.

I don't see how a rollout can be used to determine *the* best move if
rollouts aren't done for every move *after* the move you're rolling out.
You may be rolling out all the possible moves that can be made with that one
roll of the dice, but if you're just doing 3-ply evaluations on all the
rolls after that, the bot may not be choosing the best countermove and all
succeeding moves anyway.

Andrew Mill

Bob Stringer <REMO...@pacbell.net> wrote in message

news:379F9FB3...@pacbell.net...

Ian Shaw

Jul 30, 1999, 3:00:00 AM

Bob Stringer wrote in message <379F9FB3...@pacbell.net>...

>It may be that I'm simply too unsophisticated in mathematics (I'm not
>being sarcastic) to understand that my basic question has just been
>answered, but I don't think that it has.
>
>Here's the question again in distilled form: I've just gotten an
>evaluation of a position by a bot. I'm not sure that I trust it.
>Therefore, how to I test the *bot's* judgment? By using the *bot* to do
>a rollout.
>
>But how can I have *any* confidence that whatever predispositions or
>tendencies, or whatever you want to call them, that led the bot to
>initially evaluate or misevaluate (I don't know which, hence the
>rollout) the position a certain way isn't going to lead the bot to
>handle the rollout incorrectly? A rollout isn't just a matter of making
>a whole bunch of rolls that play themselves -- someone has to exercise
>*judgment* as to how all the rolls should be played.
>
[snip]

In essence I think you are correct. If a bot is playing a position badly, it
will play similar positions badly. In these cases a rollout may not be
trustworthy.

I have seen several cases where an expert has written "I don't think the bot
plays this position well, so I did an interactive rollout (whatever that
is)" or "... so I rolled it out by hand". JellyFish 2.0 playing backgames is
a position that springs to mind. They seem to "step through" the rollout one
play at a time to see how the bot plays the position. This allows them to
judge how much to trust the rollout.

For the most part though, the bots seem to give a better answer than we are
going to get any other way.
--
Regards
Ian Shaw (ian on FIBS)

Phill Skelton

Jul 30, 1999, 3:00:00 AM
Bob Stringer wrote:
>
> Here's the question again in distilled form: I've just gotten an
> evaluation of a position by a bot. I'm not sure that I trust it.
> Therefore, how to I test the *bot's* judgment? By using the *bot*
> to do a rollout.

<snip>

I don't think you can trust the rollout data absolutely - as
plenty of other people have said it is biased by the tendency of
bots to misplay some positions. But the basic idea that a rollout
provides you with more reliable information than the static
evaluation is a sound one. I can't prove this, but I can try to
illustrate it.

First example - TD Gammon, which learned by temporal difference
learning. It started off with zero knowledge about backgammon other
than the rules; no knowledge of strategy at all. It played games
against itself and learned from them to reach expert level without
anyone needing to input backgammon-playing knowledge. Perhaps not the
best example, but it shows that even a position rolled out 1 time
contains at least some useful information - more than the bot
started with.

A silly example would be to take a position from a game and roll
it out with a bot that plays purely random (okay, pseudo-random, for
the pedantic ones reading this) moves. I imagine that given a choice
between 2 moves (e.g. leave a blot or play safe), rollouts would give
*some* indication (very slight perhaps) of which was the best on
the basis that leaving way too many blots around will be punished
even on purely random play, simply because more of the random moves
will hit blots. Of course, even on a very long rollout it could get
it all wrong, but I'm pretty sure that rolling out 2 moves for a
large number of positions, the rollouts would indicate which was best
correctly more than 50% of the time, which is what pure chance
would indicate.

The point is here that rollouts are effectively a random sampling
of all possible games and as such they are a reasonable indicator
of the probability of priming your opponent or closing him out, or
escaping to win a gammon, or whatever. The rollout isn't perfect
because the bot isn't perfect, but it *is* an improvement over what
the bot can do on its own.

Consider truncated rollouts - roll out the next ten moves and then
use the bot's evaluator as the definitive value for the position
reached. I'll assume you can see the value in doing 2 or 3 ply
lookahead. This is effectively a partial 10-ply lookahead. It
doesn't cover all possibilities, but the idea is to randomly or
quasi-randomly select a few thousand 10-ply positions (trying to
spread them as evenly over the range of possibilities as is practical)
and evaluate those. It's the same basis as opinion polling - rather
than interview everyone, just interview a large enough group of people,
and try your best to pick an accurate cross section of all the people
in the country.
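
For anyone who prefers code to hand-waving, the truncated, evenly spread sampling can be sketched in a few lines of Python. This is only a toy under loud assumptions: the game is a bare two-player race (no checker play to judge, doubles count four times the die), `evaluate` is a made-up static evaluator, and `truncated_trial` / `truncated_rollout` are names invented here. Each trial plays at most `depth` plies and then trusts the evaluator; the first roll is cycled through all 36 possibilities so that part of the sample is spread perfectly evenly.

```python
import math
import random

# All 36 ordered rolls of two dice; doubles move four times the die.
ROLLS = [(a, b) for a in range(1, 7) for b in range(1, 7)]

def pips_moved(roll):
    a, b = roll
    return 4 * a if a == b else a + b

def evaluate(p0, p1, turn):
    """Crude static estimate of P(player 0 wins); `turn` is the side on roll."""
    if p0 <= 0:
        return 1.0
    if p1 <= 0:
        return 0.0
    lead = (p1 - p0) + (4 if turn == 0 else -4)  # on-roll bonus is an assumption
    scale = max(1.0, 0.8 * math.sqrt(p0 + p1))
    return 1.0 / (1.0 + math.exp(-lead / scale))

def truncated_trial(p0, p1, first_roll, depth, rng):
    """Play up to `depth` plies, then take the evaluator's word for the rest."""
    turn = 0
    for ply in range(depth):
        roll = first_roll if ply == 0 else rng.choice(ROLLS)
        if turn == 0:
            p0 -= pips_moved(roll)
        else:
            p1 -= pips_moved(roll)
        turn = 1 - turn
        if p0 <= 0 or p1 <= 0:
            break  # the race finished before the truncation point
    return evaluate(p0, p1, turn)

def truncated_rollout(p0, p1, depth=10, cycles=30, seed=1):
    """Average over `cycles` passes through all 36 possible first rolls."""
    rng = random.Random(seed)
    results = [truncated_trial(p0, p1, first, depth, rng)
               for _ in range(cycles) for first in ROLLS]
    return sum(results) / len(results)

estimate = truncated_rollout(60, 65)
print(round(estimate, 3))
```

The trial count is a multiple of 36 by construction, which is the opinion-poll trick of making sure every "region" is interviewed equally often rather than leaving it to chance.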

Sorry, no mathematical proofs, just a lot of half-baked arguments.
Hope some of it is useful.

Phill

Rodrigo Andrade

Jul 30, 1999, 3:00:00 AM
Think of it this way. You paid $350 to have the best backgammon software in
existence. Therefore, its rollouts must be pretty darn reliable.

Besides, if world class players rely on Snowie's rollouts most of the time,
why shouldn't we mortals do the same?

Bob Stringer

Jul 30, 1999, 3:00:00 AM
Phill Skelton wrote:
>
> [snipped]

>
> I don't think you can trust the rollout data absolutely - as
> plenty of other people have said it is biased by the tendency of
> bots to misplay some positions. But the basic idea that a rollout
> provides you with more reliable information that the static
> evaluation is a sound one. I can't prove this, but I can try to
> illustrate it.
>
> [snipped some comments I agree with]

>
> The point is here that rollouts are effectively a random sampling
> of all possible games and as such they are a reasonable indicator
> of the probability of priming your opponent or closing him out, or
> escaping to win a gammon, or whatever. The rollout isn't perfect
> because the bot isn't perfect, but it *is* an improvement over what
> the bot can do on it's own.

I agree with this for many instances. However, what prompted my
original question was the possibility -- which surely is present in some
positions, only you don't know which ones without applying your own
judgment -- that Snowie has "misunderstood" the position and that
misunderstanding is based, for example, on the assumption that the
opponent is going to play the "wrong" move in response to the move under
consideration in the rollout. If Snowie makes that kind of mistake, the
entire rollout will be "infected" with that mistake.

The "difficult Crawford checker play" which started off this thread is a
good example. Given the fact that Snowie apparently misanalyzes the
position the same way both at 3-ply and after a rollout suggests to me
that Snowie incorrectly assumes that the opponent is going to hit loose
in a risky, unjustified manner. In other words, Snowie doesn't
"understand the position" and therefore makes wrong moves during the
rollout.

More on this below.

>
> Consider truncated rollouts - rollout the next ten moves and then
> use the bot's evaluator as the definitive value for the position
> reached. I'll assume you can see the value in doing 2 or 3 ply
> lookahead. This is effectively a partial 10-ply lookahead. It
> doesn't cover all possibilities, but the idea is to randomly or
> quasi-randomly select a few thousand 10-ply positions (trying to
> spread them as evenly over the range of possibilities as is practical)
> and evaluate those. It's the same basis as opinion polling - rather
> than interview everyone, just interview a large enough group of
> people, and try your best to pick an accurate cross section of all the
> people in the country.

I think the difference is that, unlike a statistical sampling, a rollout
involves *two* things -- [1] a sampling of rolls of the dice, and [2]
the use of "judgment" to play those rolls. When you're trying to
determine public opinion, a "statistically significant" sample should
give you a trustworthy result. But only #1, the rolls, is comparable to
such a sampling. #2, the question of what to do with those rolls, is
the problem posed by my question, and that definitely is *not* a matter
of pure statistics *unless* each and every play in the rollout is
automatic *or* errors in the plays manage to all cancel each other out.

I think this is borne out by what data we have on the Crawford position
in question. As I mentioned in an earlier post, at 3-ply Snowie
evaluates the "5 best moves" as follows:

1. 23/21 5/1 -0.162
2. 23/21 7/3 -0.258 (-0.096)
3. 23/21 6/2 -0.269 (-0.107)
4. 23/21 13/9 -0.276 (-0.114)
5. 23/21 10/6 -0.284 (-0.122)

Gregg Cattanach did a mini rollout on Snowie which produced this
evaluation:

1. M 23/21 5/1 Eq.: -0.288
0.1% 4.8% 35.9% 64.1% 5.3% 0.2%
2. M 23/21 10/6 Eq.: -0.327 (-0.039)
0.0% 1.1% 34.5% 65.5% 2.8% 0.1%

In other words, after Snowie conducted a rollout, it still considered
the doofus 5/1 move the best. Whatever mistake Snowie made at 3-ply, it
incorporated it into the rollout.
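
As a sanity check on how the equity figures relate to the percentages, here is a small Python sketch. It assumes (my reading of the layout, not something stated in the rollout output) that the six columns are X backgammons, X gammons, X wins, O wins, O gammons, O backgammons, with the gammon figures including backgammons, and that the equity is the plain cubeless sum. The function name `cubeless_equity` is made up for the illustration.

```python
def cubeless_equity(x_bg, x_gammon, x_win, o_win, o_gammon, o_bg):
    """Cubeless equity for X from percentages (gammons include backgammons)."""
    return ((x_win - o_win) + (x_gammon - o_gammon) + (x_bg - o_bg)) / 100.0

# Percentages copied from the mini rollout quoted above.
move_5_1 = cubeless_equity(0.1, 4.8, 35.9, 64.1, 5.3, 0.2)
move_10_6 = cubeless_equity(0.0, 1.1, 34.5, 65.5, 2.8, 0.1)
print(round(move_5_1, 3), round(move_10_6, 3))
```

Under those assumptions the first line reproduces the listed -0.288, and the second comes out at -0.328 against the listed -0.327, which is within the rounding of the percentages. Note how little of the gap between the two plays comes from the win column and how much from the gammon columns.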

I've now checked the position on JellyFish at Level 7, and here are its
5 best ways to play it:

1. 10/6 23/21 -0.378
2. 10/6 13/11 -0.380
3. 10/6 11/9 -0.387
4. 7/1 -0.395
5. 10/6 7/5 -0.398

All very close, but the interesting thing is that 5/1 is nowhere to be
seen. I only have JF Tutor, so I can't do a rollout, but I'd bet
dollars to donuts that on a rollout, JellyFish wouldn't place 5/1 higher
than 10/6. And if I'm right, wouldn't that suggest that JF's "style" of
playing this particular position is reflected in its rollout, the same
as in Snowie's case? In other words, a bot either gets this position
right or gets it wrong depending upon how the bot is able to deal with
the position *before* it does the rollout. So what does that say about
the value of the rollout?

When I repeated my question in my last post, I referred to bots, since
everyone has bots on the brain, but my comments apply equally to human
players. If an expert, in a particular contact position [I don't
include non contact positions, since in such cases far greater weight
can be placed on the rolls themselves], isn't sure of the proper
strategy to employ -- and especially if he's *also* not sure of the
strategy his opponent should be following -- how can he conduct a
rollout? Doesn't he have to make a number of significant judgments down
the line? He can't keep rolling out every position that troubles him in
the rollout itself -- it would never end.

Maybe the answer is as simple as this: in most games, there really
aren't that many positions that the expert considers difficult.
Example: if the decision is to hold or run, and the expert is rolling
out the hold option, he's likely to face a perhaps *similar*, but not
the very same, hold-or-run decision on his next roll. Maybe that latter
decision usually isn't going to be as difficult as the first one.
Hence, rollouts do, in fact, generally end up "playing themselves", at
least when conducted by experts (or bots, which are the same thing).

But so far, I haven't seen anyone -- expert or otherwise -- say this.
I'm not an expert, and am not likely to become one in this lifetime, so
I'm not in a position to do anything but guess. All I'm able to say at
this point is yeah, the bots are great, I don't have any complaints, and
I trust their rollouts because the experts do. But for the reasons
stated, I *personally* don't understand why, once (if) an expert has
decided that he's not sure of a bot's judgment *before* a rollout has
been done, he should feel any more confident in the rollout itself.

Bob Stringer

Jul 30, 1999, 3:00:00 AM
Rodrigo Andrade wrote:
>
> Think of it this way. You paid $350 to have the best backgammon
> software in existence. Therefore, its rollouts must be pretty darn
> reliable.

I don't get the reason for your tone. Do you think I'm somehow griping
about Snowie? I'm not, and I didn't say I was. I was, and am, curious
about the question I raised. Does intellectual curiosity have a place
on your radar screen?

If, on the other hand, you truly are trying to engage in an honest to
goodness discussion, I'll point out the fallacy of your reasoning this
way. Does the fact that The Beverly Hillbillies and Laverne and Shirley
were number 1 shows and made lots of money mean that they were "pretty
darn good"?

>
> Besides, if world class players rely on Snowie's rollouts most of the
> time, why shouldn't we mortals do the same?

I've already said that because experts rely on rollouts I'm happy to do
the same, so you're not responding to anything that I said.

The reaction I have to your comment is the same as it would be had I
asked if someone would kindly explain the theory of relativity to me,
and you responded "we put a man on the moon, didn't we?" Doesn't answer
the question.

JP White

Jul 30, 1999, 3:00:00 AM
Bob Stringer wrote:

<snip>

> When I repeated my question in my last post, I referred to bots, since
> everyone has bots on the brain, but my comments apply equally to human
> players. If an expert, in a particular contact position [I don't
> include non contact positions, since in such cases far greater weight
> can be placed on the rolls themselves], isn't sure of the proper
> strategy to employ -- and especially if he's *also* not sure of the
> strategy his opponent should be following -- how can he conduct a
> rollout? Doesn't he have to make a number of significant judgments down
> the line? He can't keep rolling out every position that troubles him in
> the rollout itself -- it would never end.
>

The major difference between a human doing a rollout and a bot is that we as humans are capable
(I hope) of learning from the rollouts. Whereas the bots come pre-packaged with a certain amount of
'knowledge' and will not learn from a rollout. There is value therefore in an expert (or even you or
I) performing a rollout since we may well discover what is 'going on' in a perplexing position.

I can understand where you are coming from with the rollout performed by an already suspect bot being
suspect itself.
I suspect that bots are at their weakest here where they lack strategic understanding of a certain
position. However a 'straightforward' blunder (for want of better words) by the position evaluator
would show up when the bot rolled out each of the candidate moves, and lo and behold discovered a
lower ranking candidate to be superior.

So I suppose you are right in that rollouts can be unreliable, but I can foresee some use for them
where tactical errors are present. How you figure out the reliable vs unreliable rollout is yet
another question, lol.

--
JP White
Mailto:jp.w...@nashville.com

max_d

Aug 1, 1999, 3:00:00 AM

Bob Stringer wrote in message <37A261D2...@pacbell.net>...
>Phill Skelton wrote:
>>
>> [snipped]

>I've now checked the position on JellyFish at Level 7, and here are its
>5 best ways to play it:
>
> 1. 10/6 23/21 -0.378
> 2. 10/6 13/11 -0.380
> 3. 10/6 11/9 -0.387
> 4. 7/1 -0.395
> 5. 10/6 7/5 -0.398
>
>All very close, but the interesting thing is that 5/1 is no where to be
>seen.


Hi

Did you put in the correct match settings?

I get this with jfa 3.1:

5/1 23/21 -0.268
10/6 23/21 -0.317
7/1 -0.328
10/6 3/1 -0.350
10/6 7/5 -0.394
13/9 6/4 -0.397
10/6 11/9 -0.409
10/6 13/11 -0.411

Despite what I have read above about rollouts (human rollouts too),
I have tried 32 interactive rollouts for both positions.

10/6 23/21
wins g/bg bg
X 34.5 0.9 0
O 65.5 1.5 0
eq O 0.316
sd 0.037
32 games equivalent to 710


5/1 23/21
wins g/bg bg
X 35.3 1.1 0
O 64.7 2.6 0
eq O 0.308
sd 0.036
32 games equivalent to 783
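
A quick way to see whether these two results can be told apart is to compare their difference with the combined uncertainty. The Python sketch below assumes that the "sd" figure is the standard error of each equity estimate and that the two rollouts are independent; both are assumptions, and `compare` is just a name made up for the example.

```python
import math

def compare(eq_a, sd_a, eq_b, sd_b):
    """Difference of two estimates, its standard error, and the ratio (z)."""
    diff = eq_a - eq_b
    se = math.sqrt(sd_a ** 2 + sd_b ** 2)
    return diff, se, diff / se

# O's equity after 10/6 23/21 versus after 5/1 23/21, from the figures above.
diff, se, z = compare(0.316, 0.037, 0.308, 0.036)
print(f"difference {diff:.3f}, standard error {se:.3f}, z = {z:.2f}")
```

With a difference of 0.008 against a standard error of about 0.05, the two plays are far too close to separate with 32 games each, even with variance reduction.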

My impressions doing this were:

1/ it still cheats :) (which obviously is nonsense.)
2/ in either of the 2 initial positions I never dared to hit loose
(I just played as "at the table")
3/ as has been said

"I agree ..... but he's not miles behind in the race. X has to make
only 8 crossovers to get everyone home; O has 7. Also, if you consider
minus four pips with the roll as an even race, then before he rolled X
was 5 pips *ahead.*"

Right:

Just try X to play 5-2:
1 23/16 eq +0.046

(here JF is probably right !)

I just have to replay one rollout session with the 2nd position,
hitting loose in some cases.

What a nice post from Roland!!

MD.


Bob Stringer

Aug 1, 1999, 3:00:00 AM
max_d wrote:
> [snipped]

Oops, you're right about the settings. I had it set for a 9 point
match, but at a score of 0-0.

However, I just did it at 1-8, Crawford, on JF 3.5 and got:

1. 7/1 -0.322
2. 10/6 23/21 -0.324
3. 10/6 13/11 -0.330
4. 10/6 11/9 -0.332
5. 10/6 7/5 -0.342
6. 10/6 3/1 -0.344
7. 5/1 23/21 -0.362

David Montgomery

Aug 2, 1999, 3:00:00 AM
In article <379DFA62...@pacbell.net>,

Bob Stringer <REMO...@pacbell.net> wrote:
> This raises the question that I posed on this news group a few months
> ago. At the time I got an informative response or two, but no one ever
> really addressed the ultimate issue. So here it is again. What's the
> basis for having any confidence in a rollout?

The basis is: if we do a rollout long enough, and we play in the rollout
just the way we would play over the board, then eventually the rollout
results will converge to be arbitrarily close to the expected values.
(I'm going to ignore positions that diverge.) It's just statistics.
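
The "just statistics" part can be illustrated with a few lines of Python. This is a deliberately stripped-down sketch, assuming every game is simply a win (+1) or a loss (-1) with a fixed 35% win chance, ignoring gammons and the cube; `rollout_estimate` and `standard_error` are names invented for the example. It addresses only the sampling error, not any systematic error from the way the positions are played.

```python
import math
import random

def rollout_estimate(win_prob, trials, rng):
    """Average result of `trials` games scored +1 for a win and -1 for a loss."""
    total = sum(1 if rng.random() < win_prob else -1 for _ in range(trials))
    return total / trials

def standard_error(win_prob, trials):
    """Theoretical standard error of that average."""
    per_game_sd = 2 * math.sqrt(win_prob * (1 - win_prob))
    return per_game_sd / math.sqrt(trials)

rng = random.Random(1999)
true_equity = 2 * 0.35 - 1
for trials in (36, 1296, 46656):
    est = rollout_estimate(0.35, trials, rng)
    print(trials, round(est, 3), round(standard_error(0.35, trials), 3))
```

The standard error falls with the square root of the number of games, so 36 times as many trials buys a sixfold tighter estimate.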

A computer rollout is actually a completely reliable estimate of a position's
equity -- assuming that both sides are played by that same computer program.

You can also look at the rollout as trying to simulate "perfect" play,
rather than "actual" play. This doesn't change much. The rollouts
are only a rough simulation of either one.

Doing a rollout long enough isn't hard anymore, because of the bots.

However, we can never have complete assurance that a rollout is played
the way that we would play over the board. Most players' play selections
are somewhat random, and in any event each player is unique.

There is no general theoretical assurance that rollout results will
closely approximate results for any two human players. In fact, there
are positions where the rollout results are known to be wildly different.

There is actually very little empirical evidence to support the idea
that computer rollouts are generally close to the expected results for
human players. This is because it takes too long to gather any decent
data with human play. The best evidence is probably from some Jellyfish
interactive rollouts, which use variance reduction to squeeze more
information out of manual rollouts. But I don't know of any work where
someone has tried to show that the bot rollouts actually reflect the
results humans would get.

Despite this lack of solid evidence, there are very good reasons for
trusting computer rollouts most of the time. Most importantly, we know
that the computer programs play very well. Because they play well, we
expect their results to closely reflect the results between two strong
human players, most of the time.

A second major reason is due to the nature of backgammon itself. Most
backgammon positions quickly engender a large number of variations.
After the first few rolls, there are a wide variety of different kinds
of problems. A computer program's lack of complete understanding of a
certain kind of problem that arises occasionally in a rollout won't
necessarily destroy the validity of the rollout, because that problem is
probably only a small fraction of the decisions that must be made. And
although the program makes some mistakes on these problems, these are
likely to be offset to some degree by mistakes made when playing for the
opposing side.

> If you don't know which is the better strategy/tactic/move in a
> particular position, and you question Snowie's evaluation at 3-ply, you
> do a rollout. Snowie proceeds to play out the position numerous times.
> However, why should only the specific position that you're investigating
> pose a difficult question? On the very next roll, the very same, or a
> similar, strategical or tactical issue may be presented again.

This is true, and with these kinds of positions you should think more
carefully about whether you trust the rollout results. If the same
thematic idea gets tested over and over again, then if the bot doesn't
understand it, the rollout is likely to be worthless.

> Or maybe
> an entirely different, but still difficult decision will have to be
> made.

This is less problematic, because once we have a variety of decisions,
difficult or otherwise, it is less likely that the bot will botch them
all.

The fact that there are difficult decisions, even (especially?) for the
bot, means that errors will be made in the rollout. If the errors are
small, it isn't of great concern unless many of them accumulate. Small
errors in the play will make only a small difference in the rollout
result. If the errors are large, that can be a problem.

Many positions are of a nature that big errors are rare, simply because
most plays are very close in equity. For example, bearing in and off
against contact from the bar.

Other positions are of a nature that although big errors are not so rare,
they occur for both sides. In this case, they offset each other somewhat,
and the overall effect on the rollout is not so severe.

The real problem is when big errors are not rare, and they occur
predominantly for one side. And in this case the rollouts won't be
reliable. The most diagnosable situation like this is when one side
often makes a big error on its first turn.

> If you're not sure
> what the "best" move is to start out with, and you don't know whether
> Snowie is making the best decisions in subsequent positions, what's the
> basis for your confidence in the rollout?

The hope is that the bot is making almost all "good" moves, where "good"
may not necessarily be "best."

> In fact,
> in positions that involve anything more than racing, how do we *ever*
> have confidence that a rollout yields the "correct" play?

One important idea is that the bots are less likely to completely obscure
the big errors. Let's take your example. Say it is a terrible mistake
to run off the anchor, and yet the bot likes it. Now, if you roll out
running and not-running, in the not-running variation the bot is likely
to run on its next turn, obscuring the difference between the two thematic
approaches. However, if running is a big enough error, then there will
still tend to be some difference due to the first play.

And in general, the bigger the error the bigger the difference that will
show up, other things being equal. For most positions you can have a lot
of faith in a rollout that produces a large difference. Rollouts that
generate a small difference are much less reliable, but also less important.

> I don't think it's any answer that we can have confidence in the rollout
> because Snowie has proven over time that it's a good BG player. The
> same argument can be used to justify Snowie's decisions at 3-ply. Yet
> when we question a 3-ply decision by Snowie, we do a rollout on Snowie.
> It seems rather circular.

The difference is, if you do a long rollout, then you see what the equity
*is* (to within some statistical uncertainty) *assuming that the bot plays
the position*. With an evaluation, you just have the opinion of a very
strong player. With a rollout you have the results of thousands of actual
games.

The rollout is the answer to the question: "What is the equity in this
position if the bot plays both sides?" The question you then have to ask
yourself is whether the answer to this question is close enough to the answer
to your real question, which is probably something like: "What is my equity
in this position against the people I tend to play against?"

For strong players playing other strong players, these questions will
usually have similar answers, so the expert can often rely on the rollout.
However, experts generally look at rollout results with a somewhat critical
eye, and if the results don't seem right, then they will consider reasons
why the rollout might be wrong. (They will also consider reasons why their
own understanding of the position might be wrong.)

> Frankly, I have the very same question about rollouts that are done by
> humans. If an expert is not sure of the correct strategy in a
> particular position, how can he do an effective rollout if subsequent
> positions keep presenting similar strategic decisions?

All the same problems occur with human rollouts. Humans have the advantage
that they can learn. They have several disadvantages, too. The biggest
is that they are too slow.

Bob Stringer

Aug 2, 1999, 3:00:00 AM
Thanks. I appreciate your very thoughtful, instructive response.