
Volatility


Chuck Bower

Oct 11, 1998
(Note to readers: this long post contains some basic material,
probably known to many, as well as some technical details which are
possibly over the head of the typical reader. Both kinds of info
are intermingled throughout, so readers interested in only one approach
should just skim past the uninteresting part but still try to make it
through to the end, one way or another.)

My college chemistry book (CHEMISTRY: Reactions, Structure, and
Properties by Clyde R. Dillard and David E. Goldberg) defines volatility
as: "tendency toward evaporation". The word is also used in reference
to securities (i.e. in the stock and commodities markets). A person
could be referred to as 'volatile', which I would take to mean "very
easily excitable".

In general, HIGH volatility means "likely to change considerably"
while LOW volatility is the opposite ("very little change expected").
Here is a common type of backgammon position where volatility is high:

+24-23-22-21-20-19-+---+18-17-16-15-14-13-+
| O O O | | O O O |2
| O O O | | O O O |
| | | |
| | | |
| | | |
| | | |
| X | | |
| X X | | |
| X X | | |
X| X X | | |
X| O X X | | |
X| O X X X | | O |
+-1--2--3--4--5--6-+---+-7--8--9-10-11-12-+

Money game. O on roll.


11/36 of the time, O will hit the blot (and cash). 25/36 O misses,
and probably gets gammoned. This position is HIGHLY volatile since the
possible immediate outcomes have widely varying expectations.

Now look at another common type of position:


+24-23-22-21-20-19-+---+18-17-16-15-14-13-+
| O O O O O O | X | O |2
| O O O O O O | | |
| O | | |
| | | |
| | | |
| | | |
X| | | |
X| | | |
X| X X | | |
X| X X | | |
X| X X | | |
X| X X | | O |
+-1--2--3--4--5--6-+---+-7--8--9-10-11-12-+

Money game. O on roll.


Here not much is likely to happen for a couple of rolls. O will
move checkers around while X rides the pine. O's winning chances will
be about the same after his/her roll as they are before the roll. This
is a position with LOW volatility.

So far we have been talking QUALITATIVELY. There is a way
to measure volatility (or, more specifically, to compute it). The
definition of volatility (at least as used by the two common bots,
Jellyfish and Snowie) is not standard, but related. Let's start out
with Snowie's definition for 1-ply volatility. (BTW, thanks to Andri
Nicoulin of Oasya for giving me the details on Snowie's calculation).

Consider the 36 dice rolls and the resulting cubeless equity after the
best play for each of those rolls. Define these quantities as e1, e2,
..., e36. ('e' is individual equity or expectation). Then if we average
these 36 outcomes (or take the 'arithmetic mean') we get the 1-ply lookahead
equity:

E = sum(e1,e2,...,e36) / 36. ('E' is mean equity)

To find volatility, add up the squares of the deviations (i.e. differences)
between the various outcome expectations and the mean equity:

Vs = sum[ (e1-E)^2 , (e2-E)^2 , ... , (e36-E)^2 ]

Vs is Snowie's definition of volatility.

Jellyfish goes a couple steps further. (Note that chronologically
I should first have said what JF does and then Snowie, since Fredrik
Dahl started this four or so years ago. However, it is easier to
explain by beginning with Snowie's definition. And I HOPE no one comes
away from this thinking that JF uses SW's numbers to calculate its own
value of volatility!!!!)

Vj = sqrt(Vs/35)

Vj is Jellyfish's definition of volatility.

For those who have studied statistics (and still remember it!) you
will recognize Jellyfish's definition of volatility as just the standard
deviation of the 36 possible outcomes, in cubeless equity units.
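As a sketch, the three quantities above can be computed directly in a few lines of Python (the 36 equities here are invented for illustration, not taken from any real position):

```python
import math
import random
import statistics

# Hypothetical example: 36 cubeless equities, one per dice roll
# (illustrative numbers only, not from a real position).
random.seed(0)
equities = [random.uniform(-1.0, 1.0) for _ in range(36)]

E = sum(equities) / 36                    # 1-ply lookahead equity (the mean)
Vs = sum((e - E) ** 2 for e in equities)  # Snowie: sum of squared deviations
Vj = math.sqrt(Vs / 35)                   # Jellyfish: sample standard deviation
```

Note that Vj comes out identical to the textbook sample standard deviation (statistics.stdev) of the 36 outcomes, which is exactly the claim in the paragraph above.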

So, how does this help us play better backgammon? Volatility really
comes into importance in backgammon when a player is deciding whether or
not to offer the cube. This is where a common BG term "missing the market"
is often introduced. Hypothetically, consider two different positions which
have the same equity, but one has HIGH volatility and one has LOW volatility.
Let's say the equity is such that both positions are correct takes. Should
the game leader cube? It depends on the volatility. If the volatility
is low, then one roll from now the position will likely still be a take, so
the leader (who is considering cubing) can, and probably should, wait. If
the volatility is high, then next time this person is on roll, the game
will likely be VERY different. Sometimes s/he will be worse off (and be
glad not to have doubled) and sometimes BETTER off (and wish s/he HAD
DOUBLED). The "BETTER off" position is so good that now the opponent has
a very clear (i.e. not even close) pass. Here, the player who thought
about doubling, but didn't, "lost his/her market".

Although on the surface it looks like it doesn't much matter whether
the player turned the cube or not, in fact when you look carefully (i.e.
"quantitatively") at such problems you will find that waiting to double
in volatile positions actually costs equity in the long run. In simple
terms, doubling AFTER losing your market is worth less than doubling
before, because in the cases where you gain ground, you get TWICE as
much equity with the cube turned, but you only get a fixed value (the
number on the cube prior to its being turned) if you wait because your
double gets passed.

Let's look at a SIMPLE (but potentially REAL) example of losing
one's market:

+24-23-22-21-20-19-+---+18-17-16-15-14-13-+
13| O O | | |
O| | | |
O| | | |
O| | | |
O| | | |
O| | | |64
X| | | |
X| | | |
X| | | |
X| X | | |
X| X | | |
12| X | | |
+-1--2--3--4--5--6-+---+-7--8--9-10-11-12-+

Money game. X on roll. Cube decisions??


If the game continues, O can win only if both of the following occur:

a) X doesn't roll a doublet, AND
b) O bears off both checkers in a single roll.

Condition a) happens 30/36 of the time. For condition b), O fails with
any 1 (that's 11 rolls), any 2 (that's 9 more), and 43 (2 more). That
only leaves 14 rolls where O gets both checkers off. The chances of
BOTH a) and b) occurring is the product of these two probabilities:

30/36 * 14/36 = 420/1296 = 32.41%.

The seasoned BG player recognizes this as a CLEAR TAKE for X. But is
it a double?

Certainly if X doesn't double, s/he loses 1 point 32.41% of the time
and wins 1 point the remainder (67.59%) and the net is 0.6759 - 0.3241 =
0.352 times the value of the stakes. (It IS a "money" game.)

If X doubles and O correctly takes, s/he wins twice this much:

0.6759*(+2) + 0.3241*(-2) = 0.704 times the value of the stakes.

X lost his/her market by not doubling. In fact s/he lost more than 1/3
of a point. BIG MISTAKE.
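The arithmetic above is easy to check mechanically. Here is a small Python sketch (nothing assumed beyond the probabilities derived above):

```python
# Checking the bear-off example: O wins only if X rolls no doublet
# AND O then bears off both checkers in one roll.
p_o_wins = (30 / 36) * (14 / 36)  # = 420/1296, about 32.41%
p_x_wins = 1 - p_o_wins

no_double = p_x_wins * (+1) + p_o_wins * (-1)  # X holds the cube: ~0.352
double    = p_x_wins * (+2) + p_o_wins * (-2)  # X doubles, O takes: ~0.704
```

The difference between the two lines is the market X loses by not doubling: more than a third of a point, exactly as the text says.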

In general conditions aren't this simple. Maybe there are gammons,
and/or maybe you are playing the Jacoby Rule, or maybe it's a match. And
if there are a lot of rolls left, you might have lots of chances to turn the
cube, as might your opponent. It sometimes matters whether the cube is
centered or in the possession of the game leader. (Hey, if backgammon
were simple it wouldn't be nearly as much fun!)

The bots also perform 2-ply lookahead and can calculate volatility
after the associated 1296 outcomes, which is really what you want to look
at in a double/no double decision. The motivated reader should be able
to generalize the above equations for the 2-ply case.

At 2-ply lookahead, JF calculates a whopping 1.017 for the volatility
of the just illustrated position. But is there a way to actually use the
current equity and the volatility to decide whether a position is a double?
Sort of. Here is one way, using JF's volatility. Consider the quantity:


E + f*Vs

where E and Vs are defined above, and f is a multiplicative "factor".
You want to compare this quantity with the drop/take point (value of equity
where game leader doubles and game trailer has a borderline decision as
to whether or not to take or pass). One question is: "what value of
f should be used in order for the following condition to hold?"

E + f*Vs > T, then double, otherwise wait.

(Here, T is the drop/take point equity as seen from the leader's point of
view.)

If you assume that the 1296 outcomes are Gaussian distributed
(even though they are not), then a value of 1 for f means that leader
will lose his/her market 16% of the time, and a value of 0.5 results in
31% market losers. I believe Kleinman has studied market losers and
concluded that somewhere in the 20-30% range of market losers is typically
where a double should be offered. Based on this it looks like the value of f
should be somewhere between 0.5 and 1. (Clearly this simple 'rule' ignores
how much market is lost. Kit has emphasized that a lot of small market
losers may still be a hold but even a few LARGE market losers is often
a double.)
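Under the Gaussian assumption, those market-loser percentages fall straight out of the normal survival function. A small Python check (the function name here is my own, not anything from the bots):

```python
import math

def market_loser_fraction(f):
    """Chance that next-roll equity exceeds E + f standard deviations,
    i.e. the leader's market-losing probability under a Gaussian model."""
    # Survival function of the standard normal evaluated at f.
    return 0.5 * math.erfc(f / math.sqrt(2))

# f = 1   -> about 0.159, i.e. roughly 16% market losers
# f = 0.5 -> about 0.309, i.e. roughly 31% market losers
```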

I believe that the bots don't actually use this method, however.
They can look at all 1296 possible outcomes and calculate a semi-cubeFUL
equity based on which of those outcomes will be a cash next time. Then they
can compare that number with the current equity to decide which is larger and
which cube decision (double or hold) is optimal. But most of us humans can't
do that within the time constraints of a game (especially since our opponents
don't allow external aids, like pencils, paper, calculators,...). So, chalk
up one more advantage for the sandbrains.


Chuck
bo...@bigbang.astro.indiana.edu
c_ray on FIBS

Chuck Bower

Oct 11, 1998
In article <6vp3ae$tje$1...@jetsam.uits.indiana.edu>,
Chuck Bower <bo...@bigbang.astro.indiana.edu> wrote:

(snip)


> Here is one way, using JF's volatility. Consider the quantity:
>
> E + f*Vs
>
>where E and Vs are defined above, and f is a multiplicative "factor".
>You want to compare this quantity with the drop/take point (value of equity
>where game leader doubles and game trailer has a borderline decision as
>to whether or not to take or pass). One question is: "what value of
>f should be used in order for the following condition to hold?"
>
> E + f*Vs > T, then double, otherwise wait.
>
>(Here, T is the drop/take point equity as seen from the leader's point of
>view.)

(snip)

Thanks to Stu Katz for pointing out a 'typo' in the above mathematical
expressions. I erroneously used 'Vs' (Snowie's volatility) when I really
meant to use 'Vj' (Jellyfish's definition of volatility). I hope that these
were the only errors in my post, but I'm not holding my breath. ;)

Claes Thornberg

Oct 12, 1998
Nice post, Chuck!
I have one thing to add to the equation

E + f * Vj > T

(where E is the equity, Vj volatility as defined by JF, T is
take-point)

A friend of mine has told me that it is possible to prove, with some
simplifications, that the correct value of f is 0.5. If people are
interested, I'll try to post what simplifications you need to make.

Regards,
Claes Thornberg
--
______________________________________________________________________
Claes Thornberg Internet: cla...@it.kth.se
Dept. of Teleinformatics URL: NO WAY!
KTH/Electrum 204 Voice: +46 8 752 1377
164 40 Kista Fax: +46 8 751 1793
Sweden

Dan Frank

Oct 12, 1998
bo...@bigbang.astro.indiana.edu (Chuck Bower) wrote:
Subject: Volatility


> My college chemistry book (CHEMISTRY: Reactions, Structure, and
> Properties by Clyde R. Dillard and David E. Goldberg) defines volatility
> as: "tendency toward evaporation". The word is also used in reference
> to securities (i.e. in the stock and commodities markets). A person
> could be referred to as 'volatile', which I would take to mean "very
> easily excitable".

I would suppose that any dictionary of the English language (even very popular
ones, accessible even to those "students" who can read) gives a pertinent
description.

The verb "volatiliser" (French, in English probably "to volatilise") means
"to evaporate".

By "volatility" is meant, in effect, "lability" (as opposed to stability)
(and a man can be labile but not volatile - which is the characteristic of
liquids like fuel, alcohol, solvents and so on).


Other misused, misunderstood words:

equity
timing.

--
Dan Frank

editor & publisher of ESSENTIAL BACKGAMMON

Gavin Anderson

Oct 13, 1998
I have a low-level question and a thought about how to clarify volatility
and the necessity of doubling early.

Chuck Bower wrote in message <6vp3ae$tje$1...@jetsam.uits.indiana.edu>...
(snip)


> Jellyfish goes a couple steps further

(snip)


> Vj = sqrt(Vs/35)
>
>Vj is Jellyfish's definition of volatility.
>
> For those who have studied statistics (and still remember it!) you
>will recognize Jellyfish's definition of volatility as just the standard
>deviation of the 36 possible outcomes, in cubeless equity units.

(snip)

My question is - why do you divide by 35 and not 36 in the above equation?


> So, how does this help us play better backgammon? Volatility really
>comes into importance in backgammon when a player is deciding whether or
>not to offer the cube. This is where a common BG term "missing the market"
>is often introduced. Hypothetically, consider two different positions which
>have the same equity, but one has HIGH volatility and one has LOW volatility.
>Let's say the equity is such that both positions are correct takes. Should
>the game leader cube? It depends on the volatility. If the volatility
>is low, then one roll from now the position will likely still be a take, so
>the leader (who is considering cubing) can, and probably should, wait. If
>the volatility is high, then next time this person is on roll, the game
>will likely be VERY different. Sometimes s/he will be worse off (and be
>glad not to have doubled) and sometimes BETTER off (and wish s/he HAD
>DOUBLED). The "BETTER off" position is so good that now the opponent has
>a very clear (i.e. not even close) pass. Here, the player who thought
>about doubling, but didn't, "lost his/her market".

(snip)

I was (long before this post) confused about doubling early when volatility
is high. I used to worry that (As Chuck puts it) 'sometimes s/he will be
worse off (and glad not to have doubled)'. It seemed to me that I should
wait and see in case things did go wrong. But then I explained/clarified it
to myself like this:

Sure, when volatility is high and you double, sometimes you'll be worse off
next roll, and sometimes better off, but the point is that you'll be better
off MORE OFTEN than you'll be worse off. Right? Otherwise you wouldn't be
worrying about doubling in the first place. You have to be winning to be
considering a double, and if you're in a situation where after the next roll
you'll find yourself worse off most of the time, then you're not winning!

So there is volatility, but it is skewed in your favour. That's why you want
to grab the chance to double early. Of course I'm not saying that volatility
is always skewed that way. Obviously there are times when the volatility
would be skewed against you, and times where it's about even, but you
wouldn't be worrying about doubling in those situations.

It seems a really obvious thing to point out, but hey, I got confused, so
I'm sure other people do too.

So if I can presume to edit Chuck's superb note, I think it would be helpful
to say:

>If the volatility is high, then next time this person is on roll, the game
>will likely be VERY different. Sometimes s/he will be worse off (and be
>glad not to have doubled) BUT MORE OFTEN S/HE WILL BE BETTER off (and wish
>s/he HAD DOUBLED).

Am I making any mistakes in stating the above?

Gavin Anderson
brit...@mbf.sphere.ne.jp

Claes Thornberg

Oct 13, 1998
Just what we needed! Mr. Frank tells us how to use English words
properly. Is it the 1st of April or what?

bshe...@hasbro.com

Oct 13, 1998
In article <yvkww66...@cuchulain.it.kth.se>,

Claes Thornberg <cla...@cuchulain.it.kth.se> wrote:
> Nice post, Chuck!
> I have one thing to add to the equation
>
> E + f * Vj > T
>
> (where E is the equity, Vj volatility as defined by JF, T is
> take-point)
>
> A friend of mine has told me that it is possible to prove, with some
> simplifications, that the correct value of f is 0.5. If people are
> interested, I'll try to post what simplifications you need to make.

I would very much like to see the details.

TIA,
Brian Sheppard


bshe...@hasbro.com

Oct 13, 1998
In article <6vp3ae$tje$1...@jetsam.uits.indiana.edu>,
bo...@bigbang.astro.indiana.edu (Chuck Bower) wrote:

> where E and Vs are defined above, and f is a multiplicative "factor".
> You want to compare this quantity with the drop/take point (value of equity
> where game leader doubles and game trailer has a borderline decision as
> to whether or not to take or pass). One question is: "what value of
> f should be used in order for the following condition to hold?"
>
> E + f*Vs > T, then double, otherwise wait.
>
> (Here, T is the drop/take point equity as seen from the leader's point of
> view.)
>
> If you assume that the 1296 outcomes are Gaussian distributed
> (even though they are not), then a value of 1 for f means that leader
> will lose his/her market 16% of the time, and a value of 0.5 results in
> 31% market losers. I believe Kleinman has studied market losers and
> concluded that somewhere in the 20-30% range of market losers is typically
> where a double should be offered. Based on this it looks like the value of f
> should be somewhere between 0.5 and 1. (Clearly this simple 'rule' ignores
> how much market is lost. Kit has emphasized that a lot of small market
> losers may still be a hold but even a few LARGE market losers is often
> a double.)

To clarify: *Kleinman's* simple rule ignores how much equity is lost by the
market losers. The E + f*Vs > T rule takes into account how much equity is lost.

In practice, it is possible to tune T using cubeful rollouts.

There is one very important point to consider when using this rule: there are
situations where Vs is very, very large, and in that case it may be that E +
f*Vs > T even though E is actually very small, or even negative. This very bug
plagued TD-Gammon in its match against Malcolm Davis earlier this year.

The solution is to look for situations where E + f*Vs > T and also E is
sufficiently large, or E - g*Vs > -C, or some other condition that guarantees
that you actually are a favorite.

> I believe that the bots don't actually use this method, however.
> They can look at all 1296 possible outcomes and calculate an semi-cubeFUL
> equity based on which of those outcomes will be a cash next time. Then they
> can compare that number with the current equity to decide which is larger and
> which cube decision (double or hold) is optimal. But most of us humans can't
> do that within the time constraints of a game (especially since our opponents
> don't allow external aids, like pencils, paper, calculators,...). So, chalk
> up one more advantage for the sandbrains.

While you are certainly correct that computers can use the forward-lookahead
rule you describe, I believe that they actually use the E + f*Vs > T rule you
described earlier.

For one thing, I know that TD-Gammon uses that rule, and there is commentary
from Fredrik Dahl suggesting that JF used a similar rule at some point in its
past. I don't know about SW.

The problem with implementing the forward search rule that you have described
is that the neural networks produce cubeless evaluations, and the forward
search rule you described requires cubeful evaluations. It is a non-trivial
problem to convert cubeless into cube-using equities in the most general case.

Warm Regards,

bshe...@hasbro.com

Oct 13, 1998
In article <6vuufs$t1i$1...@news1.sphere.ad.jp>,

"Gavin Anderson" <brit...@mbf.sphere.ne.jp> wrote:
> I have a low-level question and a thought about how to clarify volatility
> and the necessity of doubling early.
>
> Chuck Bower wrote in message <6vp3ae$tje$1...@jetsam.uits.indiana.edu>...
> (snip)
> > Jellyfish goes a couple steps further
> (snip)
> > Vj = sqrt(Vs/35)
> >
>
> My question is - why do you divide by 35 and not 36 in the above equation?

The formula for the standard deviation of a sample divides by N-1 rather than
N because there is one fewer "degree of freedom" in the data than seems
apparent based upon the number of data points. The reason is that the mean of
the *sample* is used in computing the result, rather than the mean of the
*population*.

Brian
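Python's standard library exposes both conventions side by side, which makes Brian's point easy to see. A small illustrative sketch (the data are made up):

```python
import statistics

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # an arbitrary sample

# Population SD divides the squared deviations by N; the sample SD
# divides by N-1, because one degree of freedom is "spent" estimating
# the mean from the same data.
pop = statistics.pstdev(data)   # divides by N
samp = statistics.stdev(data)   # divides by N-1
```

Dividing by the smaller N-1 always yields the slightly larger value, which compensates for the sample mean hugging its own data more closely than the true population mean would.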

gtes...@my-dejanews.com

Oct 13, 1998
Regarding the E + f*Vs > T rule: TD-Gammon's rule is somewhat
different; it's more like E > T(V) , where T(V) is a curve
specifying how the doubling threshold varies as a function of
volatility. The above rule is equivalent to making T(V) a
straight line, but more generally some sort of curved shape is
needed. It's an interesting question as to exactly what type
of curve to use. The only things existing theory tells us are
the endpoint values: in the zero-volatility limit, theory says
T(0) = .6, and in the maximum-volatility limit (i.e. last-roll
positions), theory says T(1) = 0. What I actually did for T(V)
was to borrow some results of Zadeh and Kobliska: they published
some theory for doubling in races, in which they derived formulas
for T(P), the doubling threshold as a function of the pip count.
I converted their T(P) curves into equivalent T(V) curves.
Of course, nowadays we can fit T(V) to empirical rollout data,
and this would probably work much better.

The other wrinkle in TD's doubling algorithm is that I came up
with a higher dimensional generalization of Zadeh and Kobliska's
theory to allow for gammons. Unfortunately this is too complicated
to be described here. I wrote up a long paper on this several
years ago, but never found a suitable journal where it could be
published. :-(

-- Gerry Tesauro

Chuck Bower

Oct 13, 1998
In article <6vuufs$t1i$1...@news1.sphere.ad.jp>,
Gavin Anderson <brit...@mbf.sphere.ne.jp> wrote:

>I have a low-level question and a thought about how to clarify volatility
>and the necessity of doubling early.
>
>Chuck Bower wrote in message <6vp3ae$tje$1...@jetsam.uits.indiana.edu>...
>(snip)

>> Jellyfish goes a couple steps further

>(snip)


>> Vj = sqrt(Vs/35)
>>
>>Vj is Jellyfish's definition of volatility.
>>
>> For those who have studied statistics (and still remember it!) you
>>will recognize Jellyfish's definition of volatility as just the standard
>>deviation of the 36 possible outcomes, in cubeless equity units.
>

>(snip)


>
>My question is - why do you divide by 35 and not 36 in the above equation?
>

Brian answered this, and I can't add anything to his explanation. I
must admit it is something which I don't understand, so I choose to trust
the statisticians that they know what they are talking about.


>
>> So, how does this help us play better backgammon? Volatility really
>>comes into importance in backgammon when a player is deciding whether or
>>not to offer the cube. This is where a common BG term "missing the market"
>>is often introduced. Hypothetically, consider two different positions which
>>have the same equity, but one has HIGH volatility and one has LOW volatility.
>>Let's say the equity is such that both positions are correct takes. Should
>>the game leader cube? It depends on the volatility. If the volatility
>>is low, then one roll from now the position will likely still be a take, so
>>the leader (who is considering cubing) can, and probably should, wait. If
>>the volatility is high, then next time this person is on roll, the game
>>will likely be VERY different. Sometimes s/he will be worse off (and be
>>glad not to have doubled) and sometimes BETTER off (and wish s/he HAD
>>DOUBLED). The "BETTER off" position is so good that now the opponent has
>>a very clear (i.e. not even close) pass. Here, the player who thought
>>about doubling, but didn't, "lost his/her market".

>(snip)

Before addressing Gavin's next point, I want to try and clarify
my previous post (which is not rewritten here for brevity's sake). I
made some simplifications which may have confused some. For one thing,
as Brian pointed out, just showing that you are better off doubling
this turn than you will be next turn, although NECESSARY to make a
technically correct double, is not SUFFICIENT. In fact, the very first
position in my post had extremely high volatility with huge market
losers 11/36 of the time. Unfortunately the majority of the time (the
other 25/36) the roller was going to be much worse off. Oversimplifying
(again), the player on roll should be a favorite as well. However, I'm not
sure that even these two conditions together are sufficient, and, in
particular they are NOT sufficient if you define "favorite" as "cubeless
equity > 0" since cube value enters in most of the time.

Another thing I did not illustrate (but did try to warn the reader
about) is that it's not just the total chances that the market will be
lost that matters, but also the SIZE of the loss in each of the market
losing sequences. The simple inequality ( E + f*Vj > T ) and its
use with a normal distribution to choose the value of 'f' is really only
looking at the chances that market is lost but not the size of the loss.

(Gavin continued:)


>
>I was (long before this post) confused about doubling early when volatility
>is high. I used to worry that (As Chuck puts it) 'sometimes s/he will be
>worse off (and glad not to have doubled)'. It seemed to me that I should
>wait and see in case things did go wrong. But then I explained/clarified it
>to myself like this:
>
>Sure, when volatility is high and you double, sometimes you'll be worse off
>next roll, and sometimes better off, but the point is that you'll be better
>off MORE OFTEN than you'll be worse off. Right?

No, I don't think this is right, depending on what you mean by
"better off." See below.

>Otherwise you wouldn't be
>worrying about doubling in the first place. You have to be winning to be
>considering a double, and if you're in a situation where after the next roll
>you'll find yourself worse off most of the time, then you're not winning!
>
>So there is volatility, but it is skewed in your favour. That's why you want
>to grab the chance to double early. Of course I'm not saying that volatility
>is always skewed that way. Obviously there are times when the volatility
>would be skewed against you, and times where it's about even, but you
>wouldn't be worrying about doubling in those situations.

(snip)

One thing which Gavin seems to be confused about is what I meant by
"better off". It appears that he interprets this to mean "better off than
the opponent", which can also be roughly equivalent to "winning" or "being
the favorite". That was NOT what I meant.

I was comparing roller's equity THIS roll with roller's equity the
NEXT time s/he will be on roll, with the cube in play. Let me try again:

Assume roller has access to the cube. Compare the cubeFUL equities of
the following two situations:

1) Next time this player is on roll, and the cube is sitting EXACTLY where
it is now (and with its current value),

2) Next time this player is on roll, and the cube is IN THE POSSESSION OF
OPPONENT (and with a value twice its current value),

If roller's cubeFUL equity is higher for situation 1 than situation 2, then
roller should NOT double this turn. If roller's cubeFUL equity is higher
for situation 2, it MIGHT be a double this time.

It looks like what Gavin is proposing is that a player should double
anytime s/he is a cubeFUL favorite (i.e. better off than his/her opp).
That is not right. Unfortunately the "losing your market" example I gave
was one which illustrated why a double was correct. I could have (and
maybe should have) also given an example where roller is a cubeFUL equity
favorite whether or not s/he turns the cube, but still should NOT double.
These positions aren't as simple as the one I chose, (but there are
certainly some which aren't all THAT complicated).

In general, (and this is something that Keeler/Spencer and Zadeh/
Kobliska assumed in their continuous model approximations to BG) you
are roughly as likely to gain equity as you are to lose it between now
and the next time you have a chance to consider using the cube (i.e. your
next roll). That by itself seems like an argument for NOT doubling.

I suspect that I confused a lot of readers, not just Gavin, and I'm
not picking on him by using his post to attempt to clarify what I said
earlier. In fact, I thank him for pointing out some of my shortcomings.

As a sidelight, I think it is important to emphasize that in
attempting to simplify the game of backgammon (either for one's own
better understanding or in attempting to enlighten others), that the
models, rules-of-thumb, and illustrations are ALMOST NEVER exact solutions.
Assumptions are often made which don't necessarily hold in general. That
is one reason why it is ALWAYS important to THINK while you are playing
the game, and not just follow a cookbook of rules. And besides, it's
more fun that way!

Gary Wong

Oct 13, 1998
bshe...@hasbro.com writes:
> In article <6vp3ae$tje$1...@jetsam.uits.indiana.edu>,
> bo...@bigbang.astro.indiana.edu (Chuck Bower) wrote:
> > where E and Vs are defined above, and f is a multiplicative "factor".
> > You want to compare this quantity with the drop/take point (value of equity
> > where game leader doubles and game trailer has a borderline decision as
> > to whether or not to take or pass). One question is: "what value of
> > f should be used in order for the following condition to hold?"
> >
> > E + f*Vs > T, then double, otherwise wait.
> >
> > (Here, T is the drop/take point equity as seen from the leader's point of
> > view.)
> >
> > If you assume that the 1296 outcomes are Gaussian distributed
> > (even though they are not), then a value of 1 for f means that leader
> > will lose his/her market 16% of the time, and a value of 0.5 results in
> > 31% market losers. I believe Kleinman has studied market losers and
> > concluded that somewhere in the 20-30% range of market losers is typically
> > where a double should be offered. Based on this it looks like the value of f
> > should be somewhere between 0.5 and 1. (Clearly this simple 'rule' ignores
> > how much market is lost. Kit has emphasized that a lot of small market
> > losers may still be a hold but even a few LARGE market losers is often
> > a double.)
>
> To clarify: *Kleinman's* simple rule ignores how much equity is lost by the
> market losers. The E + f*Vs > T takes into account how much equity is lost.

Well, only sort of. Here's a superficial argument why it can't: measuring
"volatility" as a scalar gives us only one degree of freedom, so it can't
represent both "how many market losers" and "how much market is lost"
simultaneously, without making some simplification about how those two
factors are related (in fact, the simplification is the normal distribution
assumption, as Chuck pointed out).

As a concrete example: suppose we have two positions, and both of them
have the same cubeless equity (let's say 0.4). Assume our opponent's
drop point is 0.55, so he has a fairly easy take if we double now from
either of these positions.

Now, let's say that in one of these positions we have 5 big market
losers, and the other 31 rolls leave us slightly worse off than we are
at the moment. Overall, the volatility is (let's say) 0.3.

Consider the symmetrical position: instead of 5 big market losers, we have
5 disasters; the other 31 rolls leave us slightly BETTER than we are at
the moment. The volatility must still be 0.3 (since the equity changes are
of exactly the same magnitude as the case above).

It should be clear that we should be more eager to double in the first
example than the second (even though the cubeless equities and volatilities
are identical). It seems fairly evident that there must exist plenty of pairs of
positions A and B such that A is a double and B is not, even though their
equities, volatilities, and drop points are identical. Therefore I claim
there is NO linear function of equity, volatility and drop point that
can determine whether a position is a double or not. (Which is nothing
new, I just wanted to say it :-)
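(To put that argument in numbers: the equities below are chosen so that both
positions have mean 0.4 and volatility 0.3, with the 0.55 drop point assumed;
a sketch of my own, standard library only.)

```python
import statistics

DROP = 0.55  # opponent's assumed drop point, seen from our side

# Position A: 5 big market losers, 31 rolls slightly worse than now.
a = [1.1469] * 5 + [0.2795] * 31
# Position B: the mirror image: 5 disasters, 31 rolls slightly better.
b = [-0.3469] * 5 + [0.5205] * 31

mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)  # both 0.4
vol_a, vol_b = statistics.pstdev(a), statistics.pstdev(b)  # both 0.3

losers_a = sum(e > DROP for e in a)  # 5 market losers
losers_b = sum(e > DROP for e in b)  # no market losers at all
```

Identical mean and volatility, yet A has 5 market losers and B has none, which
is exactly the distinction a scalar volatility cannot see.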

> > I believe that the bots don't actually use this method, however.
> > They can look at all 1296 possible outcomes and calculate a semi-cubeFUL
> > equity based on which of those outcomes will be a cash next time. Then they
> > can compare that number with the current equity to decide which is larger and
> > which cube decision (double or hold) is optimal.
>

> While you are certainly correct that computers can use the forward-lookahead
> rule you describe, I believe that they actually use the E + f*Vs > T rule you
> described earlier.
>
> For one thing, I know that TD-Gammon uses that rule, and there is commentary
> from Fredrik Dahl suggesting that JF used a similar rule at some point in its
> past. I don't know about SW.
>
> The problem with implementing the forward search rule that you have described
> is that the neural networks produce cubeless evaluations, and the forward
> search rule you described requires cubeful evaluations. It is a non-trivial
> problem to convert cubeless into cube-using equities in the most general case.

True, but I think we can do it well enough to be useful in practice.

The basic problem we're trying to solve is (to state the obvious) to
maximise our cubeful equity at the end of the next exchange. This
quantity (to state the even more obvious :-) is a random variable,
being one of 1296 possible equities, so all we can do is maximise its
expected value. We can already estimate the 1296 possible CUBELESS
equities; estimating the CUBEFUL equities is difficult as you point
out, but by assuming constant efficiencies for all subsequent cubes,
we arrive at the following relation (I can't be bothered deriving it
symbolically, so I'll just draw it. Sorry about the ASCII art!) --

If we don't double (A)
Cubeful
equity
^ .
| .
| .
| .
1.0 + - - - - - - - ,-------------
| ,' .
| ,' .
| ,' .
| ,' .
| ,' .
< ,' .
> ,' .
+^v--------------+------------> Cubeless equity
Drop at end of exchange
point


If we double (B)
Cubeful
equity
^ . /
| . /
| . /
| ./
1.0 + - - - - - - - -/- - - - - -
| /.
| / .
| / .
| / .
| / .
< / .
> / .
+^v--------------+------------> Cubeless equity
Drop at end of exchange
point

The slope of graph B will vary depending on how efficiently the opponent
can recube from the given position; the slope of graph A (to the left of
the drop point) will vary depending on how efficiently BOTH players can
subsequently cube. The slope will also be different depending on whether
it's an initial double or a redouble, but that's easy to account for.

Anyway, what we are essentially faced with is a choice between not doubling
(and getting graph A) or doubling (and getting graph B). After the next
exchange, we will be at one of 1296 points scattered over whichever
graph we pick (the no double one, or the double one). A simpler way of
looking at it is to compare the difference in cubeful equity between
not doubling and doubling:

The cost of not doubling (B-A)
Cost
^ .
| . /
| . /
| . /
| . /
| . /
| ./
0 +^v-------------,*------------> Cubeless equity
| _,-~ . at end of exchange
| _,-~ .
| _,-~ .
|,-~ .
| .
v Drop
point

Notice that BELOW the opponent's drop point, the "cost" of not doubling is
negative (ie. we are glad we didn't double, if the exchange that actually
happened left us below the drop point). ABOVE the drop point, the cost
of not doubling is positive (ie. we wish we had doubled). The OVERALL
cost of not doubling is the AVERAGE cost over all 1296 exchanges. If
this average cost is negative, don't double; if it's positive, double.

If the B-A graph above were perfect (and I believe it could be made perfect,
assuming you could perfectly calculate the recube efficiencies after each
exchange), then I think the double/no double decision based on the above
procedure would also be perfect. (Generalisations to cope with gammons
are left as an exercise for the reader.) If I understand Chuck correctly,
then this is what he means by "calculating a semi-cubeful equity... to
decide which cube decision is optimal".


All I am really claiming is that I believe that calculating double vs.
no double costs for all 1296 subsequent exchanges must be superior to a
linear equity/volatility relation. The reason is that the equity/volatility
involves two inaccuracies (uncertainty of subsequent cube efficiencies,
and failure to interpret "skew" in the distribution of the 1296 resultant
equities); the "average cost" method described above corrects one of those
inaccuracies and leaves only the question of subsequent cube efficiencies.
(Assuming constant recube efficiency which is estimated empirically is a
reasonable first approximation.)
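(An illustration only, not anything Gary describes having implemented: the
B-A cost curve can be sketched as a piecewise linear function, with slopes
that are made-up placeholders standing in for the recube efficiencies.)

```python
DROP = 0.55        # opponent's drop point (cubeless, our point of view)
SLOPE_BELOW = 0.5  # assumed efficiency slope below the drop point
SLOPE_ABOVE = 2.0  # assumed slope above it; both are invented placeholders

def cost_of_not_doubling(e):
    """The B-A curve: negative below the drop point (glad we waited),
    positive above it (we lost our market)."""
    slope = SLOPE_BELOW if e < DROP else SLOPE_ABOVE
    return slope * (e - DROP)

def should_double(equities):
    """Average the cost over the resultant cubeless equities;
    double iff the average cost of NOT doubling is positive."""
    return sum(map(cost_of_not_doubling, equities)) / len(equities) > 0
```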

I should add that I haven't yet implemented the "average cost" algorithm,
so you can regard all the above as "speculative" (or a less polite
synonym :-).

Cheers,
Gary.
--
Gary Wong, Department of Computer Science, University of Arizona
ga...@cs.arizona.edu http://www.cs.arizona.edu/~gary/

Gregg Cattanach

unread,
Oct 14, 1998, 3:00:00 AM10/14/98
to
From the Random House Webster's Unabridged Dictionary:
vol·a·tile (vol′ə tl, -til or, esp. Brit., -tīl), adj.
1. evaporating rapidly; passing off readily in the form of vapor: Acetone
is a volatile solvent.
2. tending or threatening to break out into open violence; explosive: a
volatile political situation.
3. changeable; mercurial; flighty: a volatile disposition.
4. (of prices, values, etc.) tending to fluctuate sharply and regularly:
volatile market conditions.
5. fleeting; transient: volatile beauty.


I would say definitions #4 and #5 tend to have the meaning we think of in
backgammon.

la·bile (lā′bīl, -bil), adj.
1. apt or likely to change.
2. Chem. (of a compound) capable of changing state or becoming inactive
when subjected to heat or radiation.
[1400–50; late ME labyl < LL lābilis, equiv. to L lāb(ī) to slip + -ilis -ILE]

—la·bil·i·ty (lə bil′i tē, lā-), n.

Lability is a good word for this, too :-))

-

Gregg Cattanach
gcattanach...@prodigy.net

Zox at GamesGrid, VOG, FIBS

Dan Frank <10001...@CompuServe.COM> wrote in article
<eAIBsUi...@ntdwwaaw.compuserve.com>...

bshe...@hasbro.com

unread,
Oct 14, 1998, 3:00:00 AM10/14/98
to
In article <700an2$d1d$1...@nnrp1.dejanews.com>,

gtes...@my-dejanews.com wrote:
>
> The other wrinkle in TD's doubling algorithm is that I came up
> with a higher dimensional generalization of Zadeh and Kobliska's
> theory to allow for gammons. Unfortunately this is too complicated
> to be described here. I wrote up a long paper on this several
> years ago, but never found a suitable journal where it could be
> published. :-(
>

Thanks for the elaboration. It is a rare treat to have an actual description
of the doubling algorithm used in a real program.

If I may be so bold as to ask: can you publish your expanded theory here? You
won't find a more appreciative audience anywhere!

Warm Regards,
Brian

bshe...@hasbro.com

unread,
Oct 14, 1998, 3:00:00 AM10/14/98
to
In article <wtsogs9...@brigantine.CS.Arizona.EDU>,


Thanks for the example, Gary.

To put your example into numbers, in one case we have an equity of 0.4 and a
volatility of 0.3, with 5 rolls leading to a high equity and 31 rolls leading
to a low equity. If my calculations are correct, we have the following
situation:

5 rolls: equity 1.1469
31 rolls: equity 0.2795

We might realize such a case using a position in which we have a slight lead
in a long race and a shot at a gammon if we hit a 9 (i.e. 5 rolls).

In the other case we have a symmetric situation, so we have the following
probabilities:

5 rolls: equity -0.3469 (as far below 0.4 as 1.1469 is above 0.4)
31 rolls: equity 0.5205 (as far above 0.4 as 0.2795 is below 0.4)

We might realize such a case using a position where we are bearing in against
an anchor with a lead in the race, but with a 5/36 chance of exposing a blot
to a double shot.

In both situations we assume a drop point of 0.550, so the second case cannot
possibly be a double (since we have no market losers at all) whereas in the
first case we should consider doubling (since we have 5 market losers).
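(Brian's two equities follow directly from the mean and volatility
constraints; a reconstruction of my own, with my own variable names.)

```python
import math

mean, vol = 0.4, 0.3   # cubeless equity and volatility from the example
n_hi, n_lo = 5, 31     # 5 good rolls, 31 bad ones

# With n_hi rolls at equity x and n_lo rolls at y, the constraints
#   (n_hi*x + n_lo*y)/36 = mean
#   (n_hi*(x-mean)**2 + n_lo*(y-mean)**2)/36 = vol**2
# fix the two deviations up to sign:
dev_lo = -vol * math.sqrt(n_hi / n_lo)
dev_hi = -(n_lo / n_hi) * dev_lo

x = mean + dev_hi   # equity of the 5 good rolls, about 1.1470
y = mean + dev_lo   # equity of the 31 others, about 0.2795
```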


> It seems fairly evident that there must exist plenty of pairs of
> positions A and B such that A is a double and B is not, even though their
> equities, volatilities, and drop points are identical. Therefore I claim
> there is NO linear function of equity, volatility and drop point that
> can determine whether a position is a double or not. (Which is nothing
> new, I just wanted to say it :-)


I agree. It is only in practice that the linear rule works. :-)


> > The problem with implementing the forward search rule is
> > that neural networks produce cubeless evaluations, and the forward
> > search rule requires cubeful evaluations. It is a non-trivial
> > problem to convert cubeless into cubeful equities in general.


> True, but I think we can do it well enough to be useful in practice.
>

> [Huge snip]


This is an area of active research for me. I set out with the same idea you
had, but I am not satisfied with the results.

First, the neural-net output is a vector of gammon and winning chances. You
need a vector for accurate match tactics. Your goal is to convert the cubeless
answer into a cubeful answer, and then weight by the appropriate match-equity
table entries.

The conversion of the evaluation vector from cubeless to cubeful contains some
surprising twists. For example, did you know that the owner of the cube hardly
ever wins a gammon? It's true: he wins by redoubling before then.

If you only need total equities then some of the surprising twists cancel
others and you get a reasonable piecewise linear approximation, such as Chuck
Bower uses in his doubling calculations. For example, the fact that the
opponent wins by redoubling cancels the fact that he loses his gammon chances.
But if you need the breakdown into wins and gammons, then the piecewise
approximation works badly.

You can only go so far with simple rules. Even in money games the conversion
is complicated, and when you get into match situations then you can't begin
to fathom the possibilities.

The simple linear rules you describe have about a 0.1 point standard error. By
contrast, the underlying cubeless equity estimate has about a 0.035 point
standard error. I believe that the conversion errors overwhelm the theoretical
superiority of the model.

As I mentioned before, this is an area of active research, so I am not without
hope. Neither should you be without hope. But I want to caution the readers of
this group against believing that this is just a question of implementation.
The problems go way beyond that.

Warm Regards and Best Wishes,
Brian Sheppard

gtes...@my-dejanews.com

unread,
Oct 14, 1998, 3:00:00 AM10/14/98
to
In article <70232u$pbn$1...@nnrp1.dejanews.com>,
bshe...@hasbro.com wrote:

>
> Thanks for the elaboration. It is a rare treat to have an actual description
> of the doubling algorithm used in a real program.
>
> If I may be so bold as to ask: can you publish your expanded theory here? You
> won't find a more appreciative audience anywhere!
>
> Warm Regards,
> Brian

The latex source file for the paper is about 100K and packed
with hairy equations. It comes to about 25 pages when postscripted.
So I don't think this is an appropriate place. If anyone has any
bright ideas, please let me know.

--Gerry

Michael J Zehr

unread,
Oct 14, 1998, 3:00:00 AM10/14/98
to
One common model used for evaluating backgammon races is the 1-checker
model (1CM). In this model a real BG position with an M to N pipcount
race is modelled as each player having one checker either M or N pips
away from bearing off. It's simple to model because every play is
forced and even if one wants to model races up to 150 by 150 (it's rare
to find a no-contact BG position with pipcounts above this) there are
only 22500 possible positions (four times as many if you want to examine
cubeless plus the three possible cube locations). One can even throw in
match scores and still keep the total number of positions well within
the range of what a computer can calculate exactly and store.

The following discussion is limited to what I will call 1CM(50+), where
both sides have 50 or more pips, as under that it becomes less and less
like real backgammon bearoffs.

If one solves the 1CM(50+) one finds that:

E + V*1.075122 > .57322

is a very close, but not perfect formula for deciding whether or not to
double. (There's no linear function that works for all positions.)

One could hand-wave and argue over how this model ought to compare to
real backgammon positions, but the summary of that argument is that
equity, volatility, and cash point aren't sufficient variables to have a
linear function that always works (as Brian Sheppard pointed out).

(For reference, the cash point in 1CM(50+) is between .561 and .569,
i.e. there are some positions with a cubeless equity as low as .561 that
are double/drop and some positions with a cubeless equity as high as
.569 that are double/take.)
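(This is not Michael's solver, but the cubeless part of the 1CM is easy to
compute exactly by recursion over the two pip counts, with doublets moving
four times the die.)

```python
from functools import lru_cache

# Pip totals of the 36 dice rolls; doublets move four times the die.
ROLLS = [4 * d if d == e else d + e
         for d in range(1, 7) for e in range(1, 7)]

@lru_cache(maxsize=None)
def p_win(m, n):
    """Cubeless P(win) when the player on roll is m pips from bearing
    off and the opponent is n pips away (one checker each)."""
    total = 0.0
    for pips in ROLLS:
        if pips >= m:
            total += 1.0                      # we bear off immediately
        else:
            total += 1.0 - p_win(n, m - pips)
    return total / 36
```

The cubeless equity is then 2*p_win(m, n) - 1; adding the three cube
locations on top of this is what gives the four-fold larger table Michael
mentions.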

Someone with more time than I (or with an api into a database such as
Sconyers' bearoff database) could do the same thing with real
bearoffs and see how closely bearoff doubles can be determined by a
linear function. If one wanted to extend this to all BG positions one
would need to work gammons into the picture (which is most interesting
on an initial double because of the Jacoby rule) and define an
asymmetrical volatility function to work around the exception Brian
Sheppard noted where you have positions A and B with identical equity
and variance, but A has N really good rolls and 36-N moderately bad
rolls and B has N really bad rolls and 36-N moderately good rolls.

-Michael J. Zehr

Gary Wong

unread,
Oct 14, 1998, 3:00:00 AM10/14/98
to
bshe...@hasbro.com writes:
> > > To clarify: *Kleinman's* simple rule ignores how much equity is lost by the
> > > market losers. The E + f*Vs > T takes into account how much equity is lost.
> >
> > Well, only sort of. Here's a superficial argument why it can't: measuring
> > "volatility" as a scalar gives us only one degree of freedom, so it can't
> > represent both "how many market losers" and "how much market is lost"
> > simultaneously, without making some simplification about how those two
> > factors are related (in fact, the simplification is the normal distribution
> > assumption, as Chuck pointed out).
> >
> > As a concrete example: suppose we have two positions... [snip]

>
> To put your example into numbers, in one case we have an equity of 0.4 and a
> volatility of 0.3, with 5 rolls leading to a high equity and 31 rolls leading
> to a low equity. If my calculations are correct, we have the following
> situation:
>
[snip]

>
> In both situations we assume a drop point of 0.550, so the second case cannot
> possibly be a double (since we have no market losers at all) whereas in the
> first case we should consider doubling (since we have 5 market losers).

Thanks! That's exactly what I meant to write if I hadn't been so lazy :-)

> > > The problem with implementing the forward search rule is
> > > that neural networks produce cubeless evaluations, and the forward
> > > search rule requires cubeful evaluations. It is a non-trivial
> > > problem to convert cubeless into cubeful equities in general.
> >
> > True, but I think we can do it well enough to be useful in practice.
>

> This is an area of active research for me. I set out with the same idea you
> had, but I am not satisfied with the results.
>

[snip]

>
> The simple linear rules you describe have about a 0.1 point standard error. By
> contrast, the underlying cubeless equity estimate has about a 0.035 point
> standard error. I believe that the conversion errors overwhelm the theoretical
> superiority of the model.

Ah, OK. How did you measure those quantities? How do those values
vary with the choice of parameters (f for linear E/V; assumed recube
> efficiency for "average cost")? For which kinds of positions do the two
models give conflicting results, and to which kind of position is
each model best suited? Sorry to ask so many questions :-)

I had assumed that the linear E/V rule would be at least as bad as the
"average cost" algorithm, given that they essentially [handwave,
handwave :-)] converge to the same result when the 1296 resultant
equities are normally distributed. If I understand Tesauro's improved
T(V) model correctly, then his could be even closer to the "average
cost" result if T(V) looks like the product of the cost function and
the normal probability density function. As the equities become less
and less normally distributed, I would have guessed that the "average
cost" model would become more accurate than the linear E/V rule.

If there's some flaw in the "average cost" model that I'm overlooking
that the E/V approximation `fixes', then a possible compromise would
be to measure not only E and V, but also the skewness of the equities,
gamma. From the earlier example, we see that for constant E and V,
increasing gamma (ie. lengthening the distribution's "tail" to the
right) should indicate we're more eager to double; decreasing gamma
means we're more reluctant to double. Has anybody experimented with
functions something like E + f*Vj + g*gamma > T?
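(For what it's worth, the third moment is easy to measure. On the two example
positions from earlier in the thread it comes out strongly positive for the
5-big-market-losers case and strongly negative for its mirror; `skewness` is
my own helper, standard library only.)

```python
import statistics

def skewness(xs):
    """Population skewness gamma: E[(X - mu)**3] / sigma**3."""
    mu = statistics.fmean(xs)
    sigma = statistics.pstdev(xs)
    return sum((x - mu) ** 3 for x in xs) / (len(xs) * sigma ** 3)

a = [1.1469] * 5 + [0.2795] * 31   # 5 big market losers: long right tail
b = [-0.3469] * 5 + [0.5205] * 31  # 5 disasters: long left tail

gamma_a = skewness(a)   # strongly positive: more eager to double
gamma_b = skewness(b)   # strongly negative: more reluctant to double
```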

> As I mentioned before, this is an area of active research, so I am not without
> hope. Neither should you be without hope. But I want to caution the readers of
> this group against believing that this is just a question of implementation.
> The problems go way beyond that.

I'm glad -- by the time anything is reduced to an implementation problem,
it's no longer interesting :-)

Cheers,
Gary.

PS: Thanks very much for all the information you posted about 6 weeks
ago regarding supervised training. I have finally (with a database of
50,000 positions rolled out 72 times each) been able to come up with a
supervised net that plays as well as my TD-trained net (0.48 +/- 0.08
cubeless ppg vs. pubeval). Still not as strong as I hoped (I'm
pretty certain I'm still a clear favourite over it, and I'm only an
intermediate player), but I'm working on it...

Michael J. Zehr

unread,
Oct 14, 1998, 3:00:00 AM10/14/98
to
Gary Wong wrote:
> The basic problem we're trying to solve is (to state the obvious) to
> maximise our cubeful equity at the end of the next exchange. This
> quantity (to state the even more obvious :-) is a random variable,
> being one of 1296 possible equities, so all we can do is maximise its
> expected value. We can already estimate the 1296 possible CUBELESS
> equities; estimating the CUBEFUL equities is difficult as you point
> out, but by assuming constant efficiencies for all subsequent cubes,
> we arrive at the following relation (I can't be bothered deriving it
> symbolically, so I'll just draw it. Sorry about the ASCII art!) --

[graphs snipped]

Shouldn't this be to maximize the equity after our 36 possible rolls?
If we look only at the 2-ply positions (i.e. 1296 next positions) then
we're assuming our opponent will never have a redouble if we double and
then roll a disaster. While immediate redoubles are not all that
common, they can come up in especially volatile positions. (To
determine the cubeful equity for the opponent we need to know if he has
a double, which involves the same calculation, recursively. A good
approximation would be to do a 1-ply search comparing doubling and not
doubling, and use a heuristic, like the E + fV > T to determine if the
opponent will double.)

-Michael J. Zehr

Gary Wong

unread,
Oct 14, 1998, 3:00:00 AM10/14/98
to
"Michael J. Zehr" <mich...@michaelz.com> writes:

> Gary Wong wrote:
> > The basic problem we're trying to solve is (to state the obvious) to
> > maximise our cubeful equity at the end of the next exchange. This
> > quantity (to state the even more obvious :-) is a random variable,
> > being one of 1296 possible equities, so all we can do is maximise its
> > expected value.
>
> Shouldn't this be to maximize the equity after our 36 possible rolls?
> If we look only at the 2-ply positions (i.e. 1296 next positions) then
> we're assuming our opponent will never have a redouble if we double and
> then roll a disaster. While immediate redoubles are not all that
> common, they can come up in especially volatile positions. (To
> determine the cubeful equity for the opponent we need to know if he has
> a double, which involves the same calculation, recursively. A good
> approximation would be to do a 1-ply search comparing doubling and not
> doubling, and use a heuristic, like the E + fV > T to determine if the
> opponent will double.)

Perhaps I don't understand what you mean, but I definitely think it's
1296 (ie. 2 plies). The decision is whether to double this turn
vs. waiting and (potentially) doubling next turn. The only reason we
would want to double now is if we have market losers between now and
our next opportunity, and those market losers are technically made up
of our opponent's roll as well as ours. Imagine a position when we
are on the bar against a closed board, and the opponent is trapped
behind a 5-prime. The volatility on our "roll" is zero, but we may
still have market losers when our opponent rolls a board-cruncher.
The point of evaluating both plies is to measure the market losers
based on the opponent's roll as well as ours, not to evaluate our
opponent doubling immediately if we roll a disaster (which should
also be taken into consideration, as you point out).

bshe...@hasbro.com

unread,
Oct 15, 1998, 3:00:00 AM10/15/98
to
In article <wtlnmia...@brigantine.CS.Arizona.EDU>,

Gary Wong <ga...@cs.arizona.edu> wrote:
> "Michael J. Zehr" <mich...@michaelz.com> writes:
> > Gary Wong wrote:
> > > The basic problem we're trying to solve is (to state the obvious) to
> > > maximise our cubeful equity at the end of the next exchange. This
> > > quantity (to state the even more obvious :-) is a random variable,
> > > being one of 1296 possible equities, so all we can do is maximise its
> > > expected value.
> >
> > Shouldn't this be to maximize the equity after our 36 possible rolls?
> > If we look only at the 2-ply positions (i.e. 1296 next positions) then
> > we're assuming our opponent will never have a redouble if we double and
> > then roll a disaster. While immediate redoubles are not all that
> > common, they can come up in especially volatile positions. (To
> > determine the cubeful equity for the opponent we need to know if he has
> > a double, which involves the same calculation, recursively. A good
> > approximation would be to do a 1-ply search comparing doubling and not
> > doubling, and use a heuristic, like the E + fV > T to determine if the
> > opponent will double.)
>
> Perhaps I don't understand what you mean, but I definitely think it's
> 1296 (ie. 2 plies). The decision is whether to double this turn
> vs. waiting and (potentially) doubling next turn. The only reason we
> would want to double now is if we have market losers between now and
> our next opportunity, and those market losers are technically made up
> of our opponent's roll as well as ours. Imagine a position when we
> are on the bar against a closed board, and the opponent is trapped
> behind a 5-prime. The volatility on our "roll" is zero, but we may
> still have market losers when our opponent rolls a board-cruncher.
> The point of evaluating both plies is to measure the market losers
> based on the opponent's roll as well as ours, not to evaluate our
> opponent doubling immediately if we roll a disaster (which should
> also be taken into consideration, as you point out).
>

You are both wrong. :-) (I love saying that.) It doesn't matter what depth of
search you use--if you have a function that does cubeful evaluations then you
have a simple cube-handling algorithm. Details below.

Suppose that we have a cubeful search tree, with cubeful endpoint
evaluations. We can assign cubeful evaluations to interior nodes of the tree
by backing up scores. The process of computing the cubeful evaluation at an
interior nodes of the tree has the following form:

    V = average of all successors assuming no double.
    if (the side-to-move has access to the cube) {
        v2 = average of all successors assuming double
        if (v2 > V) V = v2;
    }
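(A runnable rendering of that backup loop over a toy tree; the node layout,
dicts with "equity", "no_double", "double" and "owns_cube" keys, is my own
invention for illustration, not Brian's actual code.)

```python
def backup(node):
    """Back up cubeful equities through a search tree, choosing at each
    interior node whether turning the cube raises the average."""
    if "equity" in node:                     # leaf: cubeful endpoint value
        return node["equity"]
    v = sum(backup(s) for s in node["no_double"]) / len(node["no_double"])
    if node.get("owns_cube"):
        v2 = sum(backup(s) for s in node["double"]) / len(node["double"])
        node["should_double"] = v2 > v       # the cube decision, read off
        v = max(v, v2)
    return v
```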

The same logic allows us to read the correct cube decisions off the tree.

In this formulation, it doesn't matter how deeply you search. One-ply or ten-
ply still result in the same, correct algorithm for deciding whether to
double.

You can see how this formulation has reduced the problem to one of estimating
the cubeful equities. Or, in Gary's formulation, to the problem of converting
from cubeless equities to cubeful equities.

Warm Regards,
Brian Sheppard

bshe...@hasbro.com

unread,
Oct 15, 1998, 3:00:00 AM10/15/98
to
In article <wtogrea...@brigantine.CS.Arizona.EDU>,
> Ah, OK. How did you measure those quantities? How do those values
> vary with the choice of parameters (f for linear E/V; assumed recube
> efficiency for "average cost")? For which kinds of positions do the two
> models give conflicting results, and to which kind of position is
> each model best suited? Sorry to ask so many questions :-)


I computed the root-mean-square error of the conversion. The measurement was
done in the vector evaluation. The evaluation vector has 3 components (lose
gammon, win, win gammon). The root mean square involves the deviation of all 3
components, so a single measurement is of the form (g-G)^2 + (w-W)^2 + (h-H)
^2, where g,w,h are the conversion outputs and G,W,H are the training values.
Training values were established by JF Level 5 cubeful rollouts.

The measurement used my best effort at building a system of linear rules to
convert cubeless into cubeful evaluations. It is possible to build more
complicated systems, but I stopped at the point where I decided that there had
to be a better way.


> I had assumed that the linear E/V rule would be at least as bad as the
> "average cost" algorithm, given that they essentially [handwave,
> handwave :-)] converge to the same result when the 1296 resultant
> equities are normally distributed. If I understand Tesauro's improved
> T(V) model correctly, then his could be even closer to the "average
> cost" result if T(V) looks like the product of the cost function and
> the normal probability density function. As the equities become less
> and less normally distributed, I would have guessed that the "average
> cost" model would become more accurate than the linear E/V rule.


The problem is not the strengths of the model, but the quality of the inputs
to the model. The linear model is theoretically flawed, but it has truly
super inputs. The search-based model uses inputs of substantially lower
quality.


> If there's some flaw in the "average cost" model that I'm overlooking
> that the E/V approximation `fixes', then a possible compromise would
> be to measure not only E and V, but also the skewness of the equities,
> gamma. From the earlier example, we see that for constant E and V,
> increasing gamma (ie. lengthening the distribution's "tail" to the
> right) should indicate we're more eager to double; decreasing gamma
> means we're more reluctant to double. Has anybody experimented with
> functions something like E + f*Vj + g*gamma > T?


I have not done any such experiments, but I regard it to be a promising
approach. Using gamma avoids the possibility that there are no market losers,
but there is still the possibility that doubles will be offered when E is very
small and Vj is large.


> PS: Thanks very much for all the information you posted about 6 weeks
> ago regarding supervised training. I have finally (with a database of
> 50,000 positions rolled out 72 times each) been able to come up with a
> supervised net that plays as well as my TD-trained net (0.48 +/- 0.08
> cubeless ppg vs. pubeval). Still not as strong as I hoped (I'm
> pretty certain I'm still a clear favourite over it, and I'm only an
> intermediate player), but I'm working on it...


You can do better against pubeval. My net is somewhere around 0.6 cubeless
ppg.

Your training base is too small. For high accuracy you probably need about 30
training examples for every weight in your NN, so if you have a somewhat
smallish net you need about 30 * 200 inputs * 40 hidden nodes = 240,000
examples.

Also, put some effort into adequately covering your input space. For example:
how many of your 50,000 cases have 4 men on the 19 point? If there aren't any
then you can't sensibly assign weights to positions that have 4 men on the 19
point.

Warm Regards,

Gary Wong

unread,
Oct 17, 1998, 3:00:00 AM10/17/98
to
bshe...@hasbro.com writes:
> In article <wtlnmia...@brigantine.CS.Arizona.EDU>,

> Gary Wong <ga...@cs.arizona.edu> wrote:
> > "Michael J. Zehr" <mich...@michaelz.com> writes:
> > > Gary Wong wrote:
> > > Shouldn't this be to maximize the equity after our 36 possible rolls?
> > > If we look only at the 2-ply positions (i.e. 1296 next positions) then
> > > we're assuming our opponent will never have a redouble if we double and
> > > then roll a disaster... [snip]

> >
> > Perhaps I don't understand what you mean, but I definitely think it's
> > 1296 (ie. 2 plies). The decision is whether to double this turn
> > vs. waiting and (potentially) doubling next turn. The only reason we
> > would want to double now is if we have market losers between now and
> > our next opportunity, and those market losers are technically made up
> > of our opponent's roll as well as ours... [snip]

>
> You are both wrong. :-) (I love saying that.)

Well, at least one of the three of us is wrong, but I won't claim to know
who it is :-)

> It doesn't matter what depth of
> search you use--if you have a function that does cubeful evaluations then you
> have a simple cube-handling algorithm. Details below.

This is certainly correct, but I'm _not_ claiming to have a function that
does cubeful evaluations. (After all, if we did have a cubeful evaluation
function, we wouldn't need to search at all!) I'm advocating a 2-ply
(ie. 1296 position) search because smaller searches than that cannot possibly
detect market losers at the 2nd ply (ie. our opponent's roll). 2 plies
is the "magic" number because (in general) following the 2nd ply, we'll
have the opportunity to double again if we don't now.

> Suppose that we have a cubeful search tree, with cubeful endpoint
> evaluations. We can assign cubeful evaluations to interior nodes of the tree
> by backing up scores. The process of computing the cubeful evaluation at an
> interior nodes of the tree has the following form:
>
> V = average of all successors assuming no double.
> if (the side-to-move has access to the cube) {
> v2 = average of all successors assuming double
> if (v2 > V) V = v2;
> }

Wait, wait! That's NOT what I said. By "cubeful endpoint evaluations" I
assume you mean equities, and therefore "V = average..." is also an equity.
But I'm not talking about average equities; I'm talking about the average
risk/reward of doubling now. Since the risk/reward function r() is not a
linear operator, E(r(X)) is NOT equivalent to r(E(X))! You're quite correct
that the equity at one ply is (by definition) the average of equities at the
next ply, but that doesn't apply to the quantities I'm measuring. Since
risk/reward is basically a measure of market loss, I want to search to two
plies so that all relevant market losers are included in the search.
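(A two-line illustration of why the order of averaging matters when r() is
nonlinear; the cap at 0.55 is just a toy stand-in for a cash point, not the
actual risk/reward function.)

```python
# A toy nonlinear "risk/reward" function: equity capped at a cash point.
def r(e):
    return min(e, 0.55)

xs = [0.2, 0.9]                              # two equally likely outcomes

avg_of_r = sum(r(x) for x in xs) / len(xs)   # E(r(X)) = 0.375
r_of_avg = r(sum(xs) / len(xs))              # r(E(X)) = 0.55
```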

bshe...@hasbro.com

Oct 19, 1998, 3:00:00 AM
In article <wtg1cma...@brigantine.CS.Arizona.EDU>,

Gary Wong <ga...@cs.arizona.edu> wrote:
> bshe...@hasbro.com writes:
> > In article <wtlnmia...@brigantine.CS.Arizona.EDU>,
> > Gary Wong <ga...@cs.arizona.edu> wrote:
> > > "Michael J. Zehr" <mich...@michaelz.com> writes:
> > > > Gary Wong wrote:
> > > > Shouldn't this be to maximize the equity after our 36 possible rolls?
> > > > If we look only at the 2-ply positions (i.e. 1296 next positions) then
> > > > we're assuming our opponent will never have a redouble if we double and
> > > > then roll a disaster... [snip]
> > >
> > > Perhaps I don't understand what you mean, but I definitely think it's
> > > 1296 (ie. 2 plies). The decision is whether to double this turn
> > > vs. waiting and (potentially) doubling next turn. The only reason we
> > > would want to double now is if we have market losers between now and
> > > our next opportunity, and those market losers are technically made up
> > > of our opponent's roll as well as ours... [snip]
> >
> > You are both wrong. :-) (I love saying that.)
>
> Well, at least one of the three of us is wrong, but I won't claim to know
> who it is :-)


I don't know who is wrong either, but you have to sound convinced if you want
to sound convincing. :-)


> > It doesn't matter what depth of
> > search you use--if you have a function that does cubeful evaluations then
> > you have a simple cube-handling algorithm. Details below.
>
> This is certainly correct, but I'm _not_ claiming to have a function that
> does cubeful evaluations.


Your risk/reward curves are a piecewise-linear conversion function between
cubeless and cubeful equities. (Right?)


> (After all, if we did have a cubeful evaluation
> function, we wouldn't need to search at all!)


This is a question of accuracy. You recognize that doing cubeful evaluation
accurately is hard, and so you propose projecting the cubeful evaluation to
the endpoints of a 2 ply search. You expect errors to be reduced because
extreme positions will be assigned the value 1.0 (with very low error rate),
and because errors in other situations will tend to cancel.

My point is that your technology can be used just as well with a one-ply
search as with a two-ply search. It probably works better with a two-ply
search, but it still works with a one-ply or zero-ply search.

>I'm advocating a 2-ply
> (ie. 1296 position) search because smaller searches than that cannot possibly
> detect market losers at the 2nd ply (ie. our opponent's roll). 2 plies
> is the "magic" number because (in general) following the 2nd ply, we'll
> have the opportunity to double again if we don't now.


Two ply is the smallest depth that *guarantees* detecting market losers, but a
cubeful evaluation will "intuit" market losers before they can be counted
directly.


> > Suppose that we have a cubeful search tree, with cubeful endpoint
> > evaluations. We can assign cubeful evaluations to interior nodes of the tree
> > by backing up scores. The process of computing the cubeful evaluation at an
> > interior node of the tree has the following form:
> >
> > V = average of all successors assuming no double.
> > if (the side-to-move has access to the cube) {
> >     v2 = average of all successors assuming double
> >     if (v2 > V) V = v2;
> > }
>
> Wait, wait! That's NOT what I said. By "cubeful endpoint evaluations" I
> assume you mean equities, and therefore "V = average..." is also an equity.
> But I'm not talking about average equities; I'm talking about the average
> risk/reward of doubling now. Since the risk/reward function r() is not a
> linear operator, E(r(X)) is NOT equivalent to r(E(X))! You're quite correct
> that the equity at one ply is (by definition) the average of equities at the
> next ply, but that doesn't apply to the quantities I'm measuring. Since
> risk/reward is basically a measure of market loss, I want to search to two
> plies so that all relevant market losers are included in the search.

The function I have given is not an average, since the function uses a MAX
operator whenever a player has a choice (i.e. moving or doubling). It uses
averaging quite appropriately: to compute a composite of all possible dice
rolls.

The difference between our formulations is one of "frame-of-reference." My
search returns absolute equities. Your search returns a "risk/reward" number,
which is positive if doubling is better than not doubling. Therefore, your
result equals mine minus the cubeful value of not doubling.

I propose that my formulation is better because it allows you to make checker
plays using cubeful evaluations and because it allows you to make
cube-handling decisions without doing the full 2-ply search. It is also
easier to extend to matchplay.
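In code, the backup rule might look like the following sketch (my toy model,
not a real implementation: the tree layout is invented, and the double is
modelled crudely as twice the undoubled average, capped at +1 by the
opponent's option to pass).

```python
# Hedged sketch of the backed-up cubeful evaluation described above.
# Leaves carry cubeful equities from the perspective of the side to move.

def backup(node):
    if "value" in node:                   # leaf: static cubeful equity
        return node["value"]
    kids = node["children"]
    # Average over dice rolls; each child is seen from the opponent's
    # perspective, so negate its value.
    v = sum(-backup(c) for c in kids) / len(kids)
    if node.get("cube_access"):
        v2 = min(2 * v, 1.0)              # double: stakes x2, pass caps at +1
        v = max(v, v2)                    # MAX operator: double only if better
    return v

# Toy position: two equally likely rolls, opponent to move at the leaves.
root = {"cube_access": True,
        "children": [{"value": -0.3}, {"value": -0.5}]}
```

With these numbers the undoubled average is 0.4 and the doubled value 0.8, so
the MAX operator chooses to double.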

Warm regards,
Brian

Gary Wong

Oct 19, 1998, 3:00:00 AM
bshe...@hasbro.com writes:
> In article <wtg1cma...@brigantine.CS.Arizona.EDU>,
> Gary Wong <ga...@cs.arizona.edu> wrote:
> > bshe...@hasbro.com writes:
> > > You are both wrong. :-) (I love saying that.)
> >
> > Well, at least one of the three of us is wrong, but I won't claim to know
> > who it is :-)
>
> I don't know who is wrong either, but you have to sound convinced if you want
> to sound convincing. :-)

I'm ready to be convinced except for one thing (which I ended up repeating
over and over below... sorry if it gets repetitive). If you can clear this
point up, then I'll even admit that it was me that was wrong :-)

> > > It doesn't matter what depth of
> > > search you use--if you have a function that does cubeful evaluations then
> > > you have a simple cube-handling algorithm. Details below.
> >
> > This is certainly correct, but I'm _not_ claiming to have a function that
> > does cubeful evaluations.
>
> Your risk/reward curves are a piecewise-linear conversion function between
> cubeless and cubeful equities. (Right?)

I guess so. I was trying to make a clear distinction between short
term considerations (ie. the search, market losers, etc. etc.) and
longer term concepts (basically, the assumptions about the vigorish
you're giving away to subsequent recubes by your opponent). Perhaps
this division is causing more obfuscation than it's worth and I should
assume your viewpoint of everything as a function estimating cubeful
equities. I am ready to do that, except for one thing I can't yet
reconcile between the two viewpoints, namely that the search is an
ESSENTIAL, integral part of the procedure I described (not just some
kind of refinement of the equities to yield more accuracy). Taking
yet another viewpoint :-), the aim is to wait until you're as high in
your doubling window as possible before doubling, but no higher than
that. The only time you EVER double is when you see positions ahead
in which you are off the top of your window (ie. market losing
sequences); how can you possibly detect these without a search? (Even
calculating the volatility as I understand it requires a search.) In
other words, without a search, the cubeful equity holding the cube
will ALWAYS meet or exceed the cubeful equity having given the cube to
the opponent, until they have a drop. (I realise that's not true with
real cubeful equities, but it's true with the evaluation functions I'm
assuming.) Yet ANOTHER viewpoint (the last one, I promise!) is that
the leaves of the tree are derived from a continuous equity model; the
only way to measure discontinuities with any accuracy is to perform a
search, and a 2-ply search is both necessary and sufficient to obtain
a couple of essential properties.

(Of course, the piecewise linear conversion is only a simple example;
in practice it could be any function of the data available from a
cubeless evaluation: that piecewise linear function, Janowski's
modified continuous model, dialling 1-900-PSYCHIC, whatever.)
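As a concrete (and entirely invented) instance of that piecewise linear
conversion, a sketch might look like:

```python
# Toy piecewise-linear cubeless -> cubeful conversion.  The 0.55 drop
# point is a made-up number, not anyone's actual curve.

DROP_POINT = 0.55  # hypothetical cubeless equity at which the opponent passes

def cubeful_from_cubeless(e):
    if e >= DROP_POINT:
        return 1.0             # past the opponent's drop point: cash
    if e <= -DROP_POINT:
        return -1.0            # mirror image: we would pass a recube
    return e / DROP_POINT      # linear segment through the origin
```

Every cubeless equity strictly between the drop points maps onto the same
straight line; everything at or beyond them maps to exactly +/-1.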

> > (After all, if we did have a cubeful evaluation
> > function, we wouldn't need to search at all!)
>
> This is a question of accuracy. You recognize that doing cubeful evaluation
> accurately is hard, and so you propose projecting the cubeful evaluation to
> the endpoints of a 2 ply search. You expect errors to be reduced because
> extreme positions will be assigned the value 1.0 (with very low error rate),
> and because errors in other situations will tend to cancel.
>
> My point is that your technology can be used just as well with a one-ply
> search as with a two-ply search. It probably works better with a two-ply
> search, but it still works with a one-ply or zero-ply search.

I guess it depends what kind of function of the cubeless evaluation
you use. In the piecewise linear example I gave, a zero-ply search
would not work at all (it would evaluate everything as a
no-double/take or a double/drop, since the "reward" from doubling
comes ONLY from the threat of future positions which are above the
opponent's drop point). I guess you could call it "reluctant
doubling" :-)

> Two ply is the smallest depth that *guarantees* detecting market losers, but a
> cubeful evaluation will "intuit" market losers before they can be counted
> directly.

This seems to be the point we disagree on. I don't know HOW to write
a cubeful evaluation that can intuit market losers (besides the
search). I'm assuming that the evaluation at the leaves is something
that recognises that any double made earlier than the opponent's drop
point is less than perfectly efficient, and so evaluates any position with
the opponent holding the cube as LESS favourable than the same position
with the cube centred or on our side (until the drop point is reached,
of course). This evaluation does NOT "intuit" market losers, therefore
the search is necessary. It sounds like you had a different evaluation
in mind -- could you please describe how it differs?

> > > V = average of all successors assuming no double.
> > > if (the side-to-move has access to the cube) {
> > >     v2 = average of all successors assuming double
> > >     if (v2 > V) V = v2;
> > > }
> >

> > Wait, wait! That's NOT what I said...
[irrelevant comments of mine snipped]


>
> The function I have given is not an average, since the
> function uses a MAX operator whenever a player has a choice
> (i.e. moving or doubling). It uses averaging quite appropriately: to
> compute a composite of all possible dice rolls. The difference
> between our formulations is one of "frame-of-reference." My search
> returns absolute equities. Your search returns a "risk/reward"
> number, which is positive if doubling is better than not
> doubling. Therefore, your result equals mine minus the cubeful value
> of not doubling.

Sorry, I didn't fully appreciate what you were describing the last time I
wrote; my response missed the point. You're quite right about the averages.
The objection I should have written was that (for any kind of evaluation
function I'm assuming) in your pseudocode above, v2 will NEVER exceed V
(or at least only exceed V when the opponent has a drop). It's because
of this that r(E(X)) != E(r(X)), etc. etc...

> I propose that my formulation is better because it allows you to make checker
> plays using cubeful evaluations and because it allows you to make
> cube-handling decisions without doing the full 2-ply search. It is also
> easier to extend to matchplay.

I will admit all this is true IF you enlighten me about this cubeful
evaluation function and how it detects market losers without a search :-)

bshe...@hasbro.com

Oct 21, 1998, 3:00:00 AM
In article <wtd87o9...@brigantine.CS.Arizona.EDU>,


You can train a neural network to approximate the cubeful equity of a
position, so you don't need a search in theory. (Key point: neural networks
are "universal function approximators," so we can approximate anything.) The
rest of this post will show how even a piecewise-linear conversion function
between cubeless and cubeful equities results in correct doubling decisions as
the depth of search and accuracy of conversion increase.

> > Two ply is the smallest depth that *guarantees* detecting market losers,
> > but a cubeful evaluation will "intuit" market losers before they can be
> > counted directly.
>
> This seems to be the point we disagree on. I don't know HOW to write
> a cubeful evaluation that can intuit market losers (besides the
> search). I'm assuming that the evaluation at the leaves is something
> that recognises that any double made earlier than the opponent's drop
> point is less than perfectly efficient, and so evaluates any position with
> the opponent holding the cube as LESS favourable than the same position
> with the cube centred or on our side (until the drop point is reached,
> of course). This evaluation does NOT "intuit" market losers, therefore
> the search is necessary. It sounds like you had a different evaluation
> in mind -- could you please describe how it differs?


An evaluation function "intuits" market losers by "flattening" the evaluation
curve as it passes the opponent's cubeless drop point. The true shape of the
cubeless-to-cubeful conversion function is something like the logistic
function (i.e. 1/(1+exp(-x)), as used in neural networks). Some key points of
difference that I want to gloss over for the rest of this discussion are: the
true curve depends on gammons and wins rather than just equity; it depends on
volatility and skewness; and it actually reaches the 1.0 point, whereas the
logistic function only asymptotes to that level.
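A sketch of such a smooth curve (the rescaling and the steepness k = 4 are my
invented choices, not the true function):

```python
import math

# Logistic function rescaled from (0, 1) to (-1, 1), as a stand-in for
# a smooth cubeless -> cubeful conversion.  k controls the steepness.

def smooth_convert(e, k=4.0):
    return 2.0 / (1.0 + math.exp(-k * e)) - 1.0
```

As noted, this only asymptotes to +/-1, whereas the true conversion actually
reaches 1.0 at the drop point.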

In this discussion I want to describe the effects of a difference between the
true curve and the piecewise-linear approximation you proposed. I am
specifically concerned with the effect it has on the ability of the function
to "intuit" market losers.

On a static evaluation you cannot "intuit" market losers on the basis of your
curves because the conversion regards any position as either a no-double/take
or a double/drop, as you have described. This occurs because the current
situation is either below the drop point (zero market losers, so
no-double/take) or above the drop point (100% market losers, so double/drop).

Let's recast the computation in a manner that doesn't count market losers,
but instead relies upon "intuition." An indisputably correct procedure for
deciding whether to double is the following:

1) Evaluate the position giving the opponent the cube
2) Evaluate the position keeping the cube where it is
3) If Evaluation #1 times 2 is greater than Evaluation #2 then double.

There is no counting of market losers here; all of the calculation of
market-losing sequences is done implicitly by the conversion function. It is this
process that I refer to as "intuiting" market-losing sequences.
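As a sketch (with a stand-in evaluator; `eval_cubeful` and its `cube_owner`
argument are hypothetical names, not anyone's actual API):

```python
# The three-step doubling test above, in code.  All equities are in
# units of the current stake.

def should_double(pos, eval_cubeful):
    e_after_double = eval_cubeful(pos, cube_owner="opponent")  # step 1
    e_no_double = eval_cubeful(pos, cube_owner="current")      # step 2
    return 2 * e_after_double > e_no_double                    # step 3

# Invented numbers: holding the cube is worth 0.80, and after doubling
# (cube with the opponent, old stake) the position is worth 0.45.
fake_eval = lambda pos, cube_owner: 0.45 if cube_owner == "opponent" else 0.80
```

Here 2 * 0.45 = 0.90 > 0.80, so the procedure says double.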

The 3-step procedure I described above is correct in the general formulation.
For example, with your conversion function it makes the same decisions (ie,
no-double/take or double/drop) in the same circumstances. That is because
the conversion function is linear, and evaluation 1 is always less than
evaluation 2 when you are below the drop point.

A more accurate cubeful evaluation function would allow the possibility of a
double/take if its "hip point" were not so sharp. If a smooth curve were
substituted for the piecewise-linear conversion function, then you could have a
double/take region.

The benefit of search is to convert the piecewise-linear approximation into
the true conversion function by the processes of averaging and minimaxing.
Let's see how, by taking the shallowest possible search of 1 ply.

A one-ply lookahead will envision some positions over the drop point and some
below. The backup process of the search will apply minimax to the
double/no-double decision and to the take/drop decision, always choosing the
evaluation that is better from the perspective of the side making the decision.
When the 1-ply search completes, the backed-up evaluation at the root will
choose whether to double on the basis of the search results.
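A toy numerical illustration of that backup (every number here is invented:
the 0.55 drop point, the 0.9 penalty for having given the cube away, and the
distribution of rolls):

```python
# Apply a hard-kink piecewise-linear conversion only at the 1-ply
# leaves, then back up by averaging over rolls and minimaxing the
# opponent's take/drop choice.

DROP = 0.55  # hypothetical opponent drop point (cubeless equity)

def convert(e, opp_owns_cube):
    if e >= DROP:
        return 1.0                                # past the drop point
    base = max(-1.0, e / DROP)
    return 0.9 * base if opp_owns_cube else base  # toy cube-ownership penalty

# Hypothetical 1-ply outcomes: (number of rolls out of 36, cubeless equity).
rolls = [(6, 0.70), (30, 0.20)]                   # 6 market losers, 30 quiet rolls
total = sum(n for n, _ in rolls)

v_hold = sum(n * convert(e, False) for n, e in rolls) / total
v_doubled = sum(n * convert(e, True) for n, e in rolls) / total
v_double = min(1.0, 2 * v_doubled)                # opponent takes (x2) or passes (+1)

assert v_hold < v_double < 1.0                    # a double/take at the root
```

No single leaf evaluation lies strictly between "take" and "cash", yet the
backed-up root value does; the averaging has smoothed the kink.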


> > > > V = average of all successors assuming no double.
> > > > if (the side-to-move has access to the cube) {
> > > >     v2 = average of all successors assuming double
> > > >     if (v2 > V) V = v2;
> > > > }

> The objection I should have written was that (for any kind of evaluation
> function I'm assuming) in your pseudocode above, v2 will NEVER exceed V
> (or at least only exceed V when the opponent has a drop). It's because
> of this that r(E(X)) != E(r(X)), etc. etc...


If you evaluate a terminal node using the piecewise-linear conversion
function then you are correct. The evaluation will always be either V
(no-double) or 1.0 (double/drop).

But at interior nodes of the search tree (including the root of a 1-ply
search) this is no longer true, and that is why the proposed procedure does
result in correct doubling as the depth of search increases (or the
cubeless-to-cubeful conversion process becomes more accurate).


> I will admit all this is true IF you enlighten me about this cubeful
> evaluation function and how it detects market losers without a search :-)

How am I doing?

Warm Regards,
Brian Sheppard

Gary Wong

Oct 21, 1998, 3:00:00 AM
bshe...@hasbro.com writes:
> You can train a neural network to approximate the cubeful equity of a
> position, so you don't need a search in theory. (Key point: neural networks
> are "universal function approximators," so we can approximate anything.)

Sure, I'm not disputing that. I wrote a little about that possibility a
few weeks ago (http://x11.dejanews.com/getdoc.xp?AN=396770794) but I
haven't tried it myself.

Everything I've written in this thread is assuming we have a cubeless static
evaluation function and nothing else. I maintain that the output of this
function cannot possibly be sufficient to determine whether to double without
a search (especially a 2-ply search).

> Let's recast the computation in a manner that doesn't count market-losers,
> but instead relies upon "intuition." An indisputably correct procedure for
> deciding whether to double is the following:
>
> 1) Evaluate the position giving the opponent the cube
> 2) Evaluate the position keeping the cube where it is
> 3) If Evaluation #1 times 2 is greater than Evaluation #2 then double.

Agreed.

> The 3-step procedure I described above is correct in the general formulation.
> For example, with your conversion function it makes the same decisions (ie,
> no- double/take or double/drop) in the same circumstances. That is because
> the conversion function is linear, and evaluation 1 is always less than
> evaluation 2 when you are below the drop point.
>
> A more accurate cubeful evaluation function would allow the possibility of a
> double/take if its "hip point" were not so sharp. If a smooth curve were
> substituted for the piecewise-linear conversion function, then you could have a
> double/take region.

I don't see that it would (not for a static evaluation, at least). If
Ep, Eo and Ec represent the equities with the cube owned by the
player, opponent, and centred respectively, then Ep >= Ec >= Eo,
surely? If the equities didn't obey this relation, then they wouldn't
reflect the fact that for any position where the efficiency of an
immediate double is f, there is some possibility (depending on the
subsequent positions) that by waiting another turn to double, the
expected efficiency of doubling then may be f + epsilon. Whatever
shape you make the curves, it seems to me that they must obey this
ordering. (Naturally, in some positions (ie. with market losers) the
expected future efficiency will go down, but you can't intuit this
from the equity curves of static evaluations. I can't, at least.)

> The benefit of search is to convert the piecewise-linear approximation into
> the true conversion function by the processes of averaging and minimaxing.
> Let's see how by taking the shallowest-possible search of 1 ply.
>
> A one-ply lookahead will envision some positions over the drop point and some
> below. The backup process of the search will apply minimax to the double/no-
> double decision and to the take/drop decision, always choosing the evaluation
> that is better from the perspective of the side making the decision. When the
> 1-ply search completes, the backed-up evaluation at the root will choose
> whether to double on the basis of the search results.

Right, I agree with this. NO function of the static evaluation alone
can determine the correct doubling decision (I guess a simple
counterexample is that you can have two positions with identical win
and gammon probabilities in which one is a double and the other is
not). Looking ahead one ply CAN decide to double if there are
positions at the next ply beyond the drop point, as you say. Looking
ahead two plies is even better because some of our opponent's rolls
may alter the subsequent evaluation further. I believe this is an
ideal distance to search; looking ahead 3 plies doesn't really help:
we're envisioning doubling now because the opponent may have a drop in
2 plies' time, in which case the third ply will never be played. If
there really ARE positions at the third ply and beyond that are drops,
and yet they follow from positions at the second ply that are takes,
then we'd do better to hold the cube this turn because we can always
double at the next opportunity with at least as much efficiency as a
double now.

All the above assumes constant recube vig; if we try to measure this
it turns out to depend on the efficiency with which the opponent is
likely to make future doubles, and is in general too hard to determine
accurately. For the moment I'm assuming we set a fixed average recube
efficiency or use some other simple approximation. A short search
generally won't help estimate this parameter at all well because the
efficiency can really only be determined accurately when we reach
positions around the drop point -- if we're considering doubling now
it must be because we're close to the OPPONENT having a drop; it will
often be some time before the opponent's position improves so much
that WE have a drop (if at all), so searching 3 or 4 plies is a waste
of time in most cases. As you point out, nets could be trained to
estimate efficiencies of future cubes given a position, but that
would require the entire position as an input (not just the static
evaluation of it).

> > > > > V = average of all successors assuming no double.
> > > > > if (the side-to-move has access to the cube) {
> > > > >     v2 = average of all successors assuming double
> > > > >     if (v2 > V) V = v2;
> > > > > }
> > The objection I should have written was that (for any kind of evaluation
> > function I'm assuming) in your psuedocode above, v2 will NEVER exceed V
> > (or at least only exceed V when the opponent has a drop). It's because
> > of this that r(E(X)) != E(r(X)), etc. etc...
>
> If you evaluate a terminal node using the piecewise-linear conversion
> function then you are correct. The evaluation will always be either V
> (no-double) or 1.0 (double/drop).
>
> But at interior nodes of the search tree (including the root of a 1-ply
> search) this is no longer true, and that is why the proposed procedure does
> result in correct doubling as the depth of search increases (or the
> cubeless-to-cubeful conversion process becomes more accurate).

OK, I agree with that, but I still believe 0 plies won't work at all;
1 will sort of work but is not really enough; 2 is just right; and 3
or more is pretty much a waste of time.

> > I will admit all this is true IF you enlighten me about this cubeful
> > evaluation function and how it detects market losers without a search :-)
>
> How am I doing?

Getting there -- sorry if I'm being hard to enlighten :-)

bshe...@hasbro.com

Oct 23, 1998, 3:00:00 AM
In article <wt3e8h9...@brigantine.CS.Arizona.EDU>,

OK. I see that there is a limitation you are operating under that I am not.

You require your cubeless-to-cubeful conversion function to depend on the
equities alone, whereas I am assuming a more general relationship. For
instance, I would allow the cubeless-to-cubeful conversion to use any inputs
that had been computed for the purpose of the cubeless evaluation, plus the
outputs of the cubeless evaluation, though for speed I would limit myself to a
few select inputs (e.g. shots) that have a lot to do with upside and downside
potential.

How are we doing now?

Brian

Leo Bueno

Oct 23, 1998, 3:00:00 AM
On Wed, 14 Oct 1998 21:01:33 GMT, gtes...@my-dejanews.com wrote:



>
>The latex source file for the paper is about 100K and packed
>with hairy equations. It comes to about 25 pages when postscripted.
>So I don't think this is an appropriate place. If anyone has any
>bright ideas, please let me know.
>


What about submitting it to another journal?
