Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.

Dismiss

0 views

Skip to first unread message

Aug 23, 1998, 3:00:00â€¯AM8/23/98

to

Hi there

Could one of you statistically minded readers

let me know if there are many different ways

to apply variance reduction to backgammon

rollouts.

Could one of you statistically minded readers

let me know if there are many different ways

to apply variance reduction to backgammon

rollouts.

I.e. is there more than one formula and if so, is

it possible they may result in differing results

significant enough to matter when relating to

backgammon check play analysis.

Also with regards to checker play comparison,

should we lean towards truncated rollouts with a set

horizon with no variance reduction or are the results

likely to better when applying

variance reduction a full rollout?

Thanks

Midas

Editor of Backgammon Bonanza

www.depreli.demon.co.uk

Aug 24, 1998, 3:00:00â€¯AM8/24/98

to

In article <35dfeaaa...@news.demon.co.uk>, <Midas> wrote:

>Hi there

>Could one of you statistically minded readers

>let me know if there are many different ways

>to apply variance reduction to backgammon

>rollouts.

>Hi there

>Could one of you statistically minded readers

>let me know if there are many different ways

>to apply variance reduction to backgammon

>rollouts.

I can trivially answer "yes" to this but in a way that might not be

useful to you. Jellyfish uses one method (the name of the algorithm is

in the manual) and Snowie uses a method that might or might not be the

same. (I'll speculate that it isn't the same since Snowie generally

reports a higher "equivalent to X games" value for the same size

rollout, which suggests that it reduces variance faster.)

In addition there's at least one variance reduction method that isn't

used by any computer programs that I am aware of. This is the (hereby

named, although reported on this newsgroup previously) Zehr

Cube-reduction algorithm: when rolling out with a live cube and a

takeable cube is given, keep the cube at the same value and play two

games from the position where the cube was turned. (Do this on all

subsequent cube turnings too.) The purpose of this is to allow a live

cube on rollouts and yet keep a single game with a high cube value from

affecting the results too much.

>I.e. is there more than one formula and if so, is

>it possible they may result in differing results

>significant enough to matter when relating to

>backgammon check play analysis.

I'm sure one can use different algorithms to get different results from

the same neural net and dice, however one would also get different

results from different neural nets when using the same algorithm.

The basic idea of any variance reduction method is to estimate the

amount of luck each side has and factor that out. For example, if one

side has two on the bar against a 5-point board and rolls the necessary

doubles to come in that's a very lucky roll. A variance reduction

algorithm would measure that luck and change the results of the game in

some way to take account of it.

In the example above, if the neural net's evaluation of rolling the

double to enter is +.243 equity, and the evaluation of the position

before rolling as -.324, that roll was worth +.567. If the player who

rolled the lucky double goes on to win then instead of getting 1 point

for the win they might get 1 point less some fraction of the equity from

their lucky rolls. So you might deduct half of .567 (as well as half of

other lucky rolls, but also adding in half of the unlucky rolls, and

doing the same thing for the opponents).

Some variations on this might include:

o vary the fraction of the adjustment (half in the example above)

o only make the adjustment if the roll was especially lucky or unlucky

and make no adjustments for rolls whose value is within some range of

the position's value before rolling

o change the adjustment based on whether the roll was close to the start

of the game or close to the end of the game

Any such algorithm is going to have results vary depending on the

evaluation used to decide what is a lucky roll. In the example above a

different program rolling out the same position with the same dice might

think that rolling the double was worth only .425 and hence make a

smaller adjustment.

An example of another kind of variance reduction is one that Snowie uses

-- you can enable bearoff database truncation. When using this a

rollout stops at the point when Snowie can look it up in its bearoff

database. This database is extremely accurate and can reduce all of the

variance from luck during the bearoff. For example, Snowie might

determine that one side has a 63% chance of winning. At that point it

can stop that game and use a result of .26 equity.

>Also with regards to checker play comparison,

>should we lean towards truncated rollouts with a set

>horizon with no variance reduction or are the results

>likely to better when applying

>variance reduction a full rollout?

First one should specify the context a bit more. A full rollout will

almost always give a better evaluation if time were not a factor. What

one is usually interested in is the fastest way to get a reliable

answer.

For some positions a truncated rollout will get a reliable answer

fastest and for some it won't. If the position is one that the

evaluator understand well, or the position is one that will soon

simplify (for example a holding game) then a truncated rollout can get a

very reliable answer very quickly. One can think of these as

evaluations with a higher ply. (A 1296 game rollout with the initial

rolls varied, truncated after 2 rolls and played n-ply is equal to an

n+2-ply evaluation.)

If the evaluator doesn't understand the position well or nothing is

going to change much for a while (for example a prime vs. prime

position) then it is probably best to do the full rollout.

The best way to decide which one to do is to start with a full rollout,

compare the results to the evaluator, and if they are close then trust a

truncated rollout. Unfortunately that doesn't save any time. But with

experience one can judge which kind of rollout to do.

One final comment about computer rollouts -- in those positions in

which a truncated rollout is not accurate we might not be able to trust

the full rollout results anyway. If the evaluator doesn't understand

the position very well, it might not even understand it well enough to

do a rollout. It's easy to fall into the trap of doing a bazillion game

rollout, getting a standard deviation of .001 on the results, and

believing you've found the right answer. In reality you've found the

best answer that particular tool can give you, which isn't necessarily

the right answer.

-Michael J. Zehr

Aug 24, 1998, 3:00:00â€¯AM8/24/98

to

An important issue is how to determine what paramaters are required for a

given rollout. Sometimes common sense will indicate that the cube is

required, or that the "checker play according to score" box should be

enabled (based upon match score, access to the cube, volatility, etc).

Often, these are very difficult though. Any suggestions as to how to get

the "most bang for the buck" from a rollout of a given number of hours?

given rollout. Sometimes common sense will indicate that the cube is

required, or that the "checker play according to score" box should be

enabled (based upon match score, access to the cube, volatility, etc).

Often, these are very difficult though. Any suggestions as to how to get

the "most bang for the buck" from a rollout of a given number of hours?

Aug 25, 1998, 3:00:00â€¯AM8/25/98

to

In article <35e1e3ea....@news.cncdsl.com>,

kena...@mindspring.com wrote:

> ta...@mit.edu (Michael J Zehr) wrote:

> >In article <35dfeaaa...@news.demon.co.uk>, <Midas> wrote:

> >>Hi there

> >>Could one of you statistically minded readers

> >>let me know if there are many different ways

> >>to apply variance reduction to backgammon

> >>rollouts.

> >

> >I can trivially answer "yes" to this but in a way that might not be

> >useful to you. Jellyfish uses one method (the name of the algorithm is

> >in the manual) and Snowie uses a method that might or might not be the

> >same. (I'll speculate that it isn't the same since Snowie generally

> >reports a higher "equivalent to X games" value for the same size

> >rollout, which suggests that it reduces variance faster.)

kena...@mindspring.com wrote:

> ta...@mit.edu (Michael J Zehr) wrote:

> >In article <35dfeaaa...@news.demon.co.uk>, <Midas> wrote:

> >>Hi there

> >>Could one of you statistically minded readers

> >>let me know if there are many different ways

> >>to apply variance reduction to backgammon

> >>rollouts.

> >

> >I can trivially answer "yes" to this but in a way that might not be

> >useful to you. Jellyfish uses one method (the name of the algorithm is

> >in the manual) and Snowie uses a method that might or might not be the

> >same. (I'll speculate that it isn't the same since Snowie generally

> >reports a higher "equivalent to X games" value for the same size

> >rollout, which suggests that it reduces variance faster.)

In another post I demonstrate that the effect of variance reduction depends on

the accuracy of the static evaluation function. If SW had a more accurate

neural network than JF, then SW would achieve better results from variance

reduction even with exactly the same variance reduction algorithm.

I speculate that SW and JF both use methods based upon the theory named in

JF's user manual. Of course, there may still be significant differences.

Brian Sheppard

-----== Posted via Deja News, The Leader in Internet Discussion ==-----

http://www.dejanews.com/rg_mkgrp.xp Create Your Own Free Member Forum

Aug 25, 1998, 3:00:00â€¯AM8/25/98

to

> >The basic idea of any variance reduction method is to estimate the

> >amount of luck each side has and factor that out. For example, if one

> >side has two on the bar against a 5-point board and rolls the necessary

> >doubles to come in that's a very lucky roll. A variance reduction

> >algorithm would measure that luck and change the results of the game in

> >some way to take account of it.

> >

> >In the example above, if the neural net's evaluation of rolling the

> >double to enter is +.243 equity, and the evaluation of the position

> >before rolling as -.324, that roll was worth +.567. If the player who

> >rolled the lucky double goes on to win then instead of getting 1 point

> >for the win they might get 1 point less some fraction of the equity from

> >their lucky rolls. So you might deduct half of .567 (as well as half of

> >other lucky rolls, but also adding in half of the unlucky rolls, and

> >doing the same thing for the opponents).

Whoa! This is not "variance reduction." This is "bias introduction."

(Math phobic readers may wish to avert their eyes at this juncture.)

The algorithm you just described computes the move-by-move change in the

neural net evaluation over the course of a single trial. The final outcome

of the trial is offset against the sum of the changes. This reduces variance,

to be sure. But *how* does it reduce variance? Let's take a look.

Let E0 be the evaluation of the root position of the rollout. After 1 turn the

evaluation changes to E1, then to E2, and so on. In the final position the

outcome is V. What is the sum of the changes from turn to turn?

In the first turn the change is E0 - E1. In the second turn the change is E1 -

E2. In the third turn the change is E2 - E3, and so on. The sum is

(E0 - E1) + (E1 - E2) + (E2 - E3) + (E3 - E4) + ... + (En - V)

This type of sum is called a "telescoping series," because it resembles a

collapsible telescope. To see why, let's rearrange the parentheses a little:

E0 (-E1 + E1) (-E2 + E2) (-E3 + E3) ... (-En + En) - V

So the sum is E0 - V! This amount will be offset against V. If the fraction we

offset against V is f, then the final answer is

f * E0 + (1 - f) * V

Well no wonder we reduce the variance! All the algorithm accomplishes is to

compute a weighted average of the true statistical answer with our initial

estimate. This is obviously the wrong answer; our goal is to eliminate our

initial estimate from the equation. The dependency on E0 is a bias.

True variance reduction does not introduce bias into the result. I believe I

have reconstructed JF's algorithm based upon the cryptic description in JF's

user manual. Before I describe it, I want to provide some motivation for it,

so that you can understand how the algorithm reduces variance.

Let E be the true cubeless equity function, and e be our static evaluation

function (e.g. a neural net). Rollouts estimate E by playing out a number of

trial games and averaging the results. A simple rollout does not exploit the

fact that e is a very good approxmation to E. Variance reduction reduces the

number of trials we need to achieve a specific level of accuracy by exploiting

the fact that e is a very good approximation of E. How does it work?

Instead of regarding the rollout as establishing a value for E, let's think of

it as establishing a value for E - e. In words, the rollout will estimate the

error in our static evaluation.

Let's begin by doing a 1-ply search, then estimate E as the average of the 1-

ply values of e. This would be completely correct if e were perfect, but since

e isn't perfect, we need to estimate the error involved. How can we do that?

Simple. Let's just pick a 1-ply position at random, and estimate the error

involved in its evaluation. Note that I have just recreated the problem that

I posed at the root (i.e., estimating the difference between e and E) but one

ply deeper into the game tree. I can continue this process until I reach a

terminal node.

To estimate E, we start at the end of the variation we have selected at random

and apply the formula

E = Delta + Average(e)

where the average is taken over all next-ply nodes and Delta is the

recursively computed estimate of the inaccuracy of e for the

randomly-selected successor.

How does this procedure reduce variance? Let's take a look. The variance of

our estimate of E equals the variance of our estimate of Delta plus the

variance of our estimate of Average(e). But Average(e) is a constant, so its

variance is zero. So the variance of our estimate of E is simply the variance

of our estimate of Delta.

Now, Delta is the estimate of E-e for a single successor position. And since e

is very, very accurate, the variance of Delta is very, very low. The standard

deviation of E-e is 0.035 or so. Compare this low error rate with the typical

standard deviation of 0.5 for a single game rollout. Reducing the standard

deviation by a factor of 0.5 / 0.035 = 14.28 would normally require 14.28 *

14.28 = 203.9 times as many trials. We need vastly fewer trials to get a

specific level of accuracy.

The factor of 203.9 is not pure profit, alas, because we have to do 1-ply

searches at every node when we use variance reduction. By contrast, a normal

rollout does only static evaluation. This typically costs a factor of 20 or

so, which is a shame. But we still should be very happy with a net increase

of a factor of 10 in overall throughput.

As I stated before, I believe that this is the algorithm used in JF. However,

this belief is based upon deciphering the user's manual, which refers only to

certain theorems from the theory of Markov chains. There may be other

formulations of this algorithm, or there may be other ways to apply the theory

to reduce variance.

Warm Regards,

Aug 25, 1998, 3:00:00â€¯AM8/25/98

to

> ta...@mit.edu (Michael J Zehr) wrote:

> >In article <35dfeaaa...@news.demon.co.uk>, <Midas> wrote:

>

> >>Hi there

> >>Could one of you statistically minded readers

> >>let me know if there are many different ways

> >>to apply variance reduction to backgammon

> >>rollouts.

> >In article <35dfeaaa...@news.demon.co.uk>, <Midas> wrote:

>

> >>Hi there

> >>Could one of you statistically minded readers

> >>let me know if there are many different ways

> >>to apply variance reduction to backgammon

> >>rollouts.

One significant technique is called "striated sampling." The goal is to choose

your samples according to a theoretical model of how samples ought to be

distributed.

For example, when you do a JF rollout with a multiple of 36 trials, JF

allocates trials equally over the 36 possible dice rolls. This ensures that

the variations exactly match the results of the first ply of rolls. And if

the number of trials is a multiple of 1296, then JF will do an exhaustive

enumeration out to 2 ply.

Another technique is to concentrate effort on those rolls that have the

highest variance. For example, suppose that the side-to-move is on the bar,

and 25 numbers fan. Then all 25 of those numbers transpose to the same

variation. Moreover, suppose that we observe the results of fanning to be

consistent: the opponent always wins. Accordingly, this category has 0

variance. Since other rolls have higher variance, we should concentrate our

efforts on other rolls.

To compensate for the fact that we distribute trials non-uniformly, we scale

the results for each roll. To continue our example, suppose we do 50 trials

of the 25 fanning rolls (losing a single point each time), and 1000 trials

for each of the other 11 rolls, losing 0.5 points each time. Our equity is

computed as follows:

average value in 50 trials is -1, times 25/36 = -25/36.

average value in 11000 trials is -0.5, times 11/36 = -5.5/36.

Total value: -30.5/36

Aug 25, 1998, 3:00:00â€¯AM8/25/98

to

> ta...@mit.edu (Michael J Zehr) wrote:

> >In article <35dfeaaa...@news.demon.co.uk>, <Midas> wrote:

> >>Hi there

> >>Could one of you statistically minded readers

> >>let me know if there are many different ways

> >>to apply variance reduction to backgammon

> >>rollouts.

> >

> >In article <35dfeaaa...@news.demon.co.uk>, <Midas> wrote:

> >>Hi there

> >>Could one of you statistically minded readers

> >>let me know if there are many different ways

> >>to apply variance reduction to backgammon

> >>rollouts.

> >

> >>I.e. is there more than one formula and if so, is

> >>it possible they may result in differing results

> >>significant enough to matter when relating to

> >>backgammon check play analysis.

> >

> >I'm sure one can use different algorithms to get different results from

> >the same neural net and dice, however one would also get different

> >results from different neural nets when using the same algorithm.

> >>it possible they may result in differing results

> >>significant enough to matter when relating to

> >>backgammon check play analysis.

> >

> >I'm sure one can use different algorithms to get different results from

> >the same neural net and dice, however one would also get different

> >results from different neural nets when using the same algorithm.

Something is seriously wrong if you reduce variance and end up with a

different answer. The goal of variance reduction is to cause the statistical

process to converge more quickly to the same answer as without variance

reduction.

It is true that if you change networks or the dice or the algorithm then you

will not get the same answer over a limited number of trials. However, if you

use the same neural network then any valid variance reduction method will

converge (in the limit of an arbitrarily large number of trials) to the same

answer as without that variance reduction method.

Warm regards,

Aug 25, 1998, 3:00:00â€¯AM8/25/98

to

> ta...@mit.edu (Michael J Zehr) wrote:

> >In article <35dfeaaa...@news.demon.co.uk>, <Midas> wrote:

> >

> >In addition there's at least one variance reduction method that isn't

> >used by any computer programs that I am aware of. This is the (hereby

> >named, although reported on this newsgroup previously) Zehr

> >Cube-reduction algorithm: when rolling out with a live cube and a

> >takeable cube is given, keep the cube at the same value and play two

> >games from the position where the cube was turned. (Do this on all

> >subsequent cube turnings too.) The purpose of this is to allow a live

> >cube on rollouts and yet keep a single game with a high cube value from

> >affecting the results too much.

> >In article <35dfeaaa...@news.demon.co.uk>, <Midas> wrote:

> >

> >In addition there's at least one variance reduction method that isn't

> >used by any computer programs that I am aware of. This is the (hereby

> >named, although reported on this newsgroup previously) Zehr

> >Cube-reduction algorithm: when rolling out with a live cube and a

> >takeable cube is given, keep the cube at the same value and play two

> >games from the position where the cube was turned. (Do this on all

> >subsequent cube turnings too.) The purpose of this is to allow a live

> >cube on rollouts and yet keep a single game with a high cube value from

> >affecting the results too much.

This is clever. Doubling the number of trials in variations in which a cube is

accepted reduces the variance of our estimate in those variations by 50%.

The impact on run time is reasonable. Even though we double the double/take

trial count, we do not double the run-time, since the variation is only split

*after* the double. Moves prior to the double are common to the two trials.

The bottom line is that the method halves the variance of double/take trials

with considerably less than a doubling of the run-time.

A few implementation details for implementation-minded readers. First, there

are pathological situations where the rollout would explode. (These

situations are the same as the situations where the cube-using equity value

is undefined. Check out the Deja News archive for related threads.) The

number of "splits" of the variation tree must be limited to prevent

explosions.

Second, you have weigh the double/take variations on an equal footing with the

double/drop variations. This means that the pair of double/take games must be

counted as a single game, not two games.

Warm Regards,

Aug 25, 1998, 3:00:00â€¯AM8/25/98

to

I'm not very familiar with the concept of rollouts but my understanding is

you take a position and do rollouts starting with each move that could be

made given the dice at the time. Then after hundreds of these rollouts one

can rate the possible moves. And in each test it is the computer playing

against itself.

you take a position and do rollouts starting with each move that could be

made given the dice at the time. Then after hundreds of these rollouts one

can rate the possible moves. And in each test it is the computer playing

against itself.

I'm curious to know if anyone has done this with one program playing against

another. For example, rollout Jellyfish playing Snowie, then Jellyfish

playing Motif -- or whatever. Then switch sides. Then see if the plays are

at least ranked the same in each rollout set.

Here is why I ask this. Many years ago I had a computer gin game. I was

better than the two top computer players in this game, but I had to use a

different knocking strategy for each. And when I reversed my knocking

strategies with them, I lost more often than won. So considering that

computer backgammon players have their own styles, I wonder if the variable

of playing against different computer players has been tested in rollouts.

Aug 25, 1998, 3:00:00â€¯AM8/25/98

to

I think this fits into this discussion so I'll try posting it to this topic.

I'm not very familiar with the concept of rollouts but my understanding is

you take a position and do rollouts with each move that could be made given

the dice at the time. Then after hundreds of these rollouts one can rate

the possible moves. And in each test it is the computer playing against

itself.

I'm curious to know if anyone has done this with one program playing against

another. For example, rollout Jellyfish playing TD Gammon, then Jellyfish

playing Motif. Then switch sides, plus have TD Gammon go against Motif.

Then see if the plays are at least ranked the same in each rollout set.

Here is why I ask this. Many years ago I had a computer gin game. I was

better than the two top computer players in this game, but I had to use a

different knocking strategy for each. And when I reversed my knocking

strategies with them, I lost slightly more often than I won. So considering

that these computer backgammon players have noticeable styles, I wonder if

Aug 28, 1998, 3:00:00â€¯AM8/28/98

to

Great post!

But, I am a little confused exactly how this is done and thought an example might

help clear things up.

bshe...@hasbro.com wrote:

> Let E be the true cubeless equity function, and e be our static evaluation

> function (e.g. a neural net). Rollouts estimate E by playing out a number of

> trial games and averaging the results. A simple rollout does not exploit the

> fact that e is a very good approxmation to E. Variance reduction reduces the

> number of trials we need to achieve a specific level of accuracy by exploiting

> the fact that e is a very good approximation of E. How does it work?

>

Okay... lets say we want to to do a rollout of some position called P0 and have a

function e (a neural net) which computes cubeless equity.Suppose e(P0) = 0.60;

> Instead of regarding the rollout as establishing a value for E, let's think of

> it as establishing a value for E - e. In words, the rollout will estimate the

> error in our static evaluation.

>

> Let's begin by doing a 1-ply search, then estimate E as the average of the 1-

> ply values of e. This would be completely correct if e were perfect, but since

> e isn't perfect, we need to estimate the error involved. How can we do that?

>

So there's a function, call it e1, whose value is defined as a 1-ply search over

e.

Suppose e1(P0) = 0.62

> Simple. Let's just pick a 1-ply position at random, and estimate the error

> involved in its evaluation. Note that I have just recreated the problem that

> I posed at the root (i.e., estimating the difference between e and E) but one

> ply deeper into the game tree. I can continue this process until I reach a

> terminal node.

>

Here is where we rollout the position, producing the positions P1, P2, ... till

the game ends and we get some result R.

Suppose it looks like this:

P, e(P), e1(P)

-- --- ---

P0 0.60 0.62

P1 0.50 0.51

P2 0.60 0.58

P3 0.55 0.57

P4 0.72 0.68

result (R) = +1.0

> To estimate E, we start at the end of the variation we have selected at random

> and apply the formula

>

> E = Delta + Average(e)

>

> where the average is taken over all next-ply nodes and Delta is the

> recursively computed estimate of the inaccuracy of e for the

> randomly-selected successor.

>

This is where I get confused.

Is Delta = sum over all positions except P0 of e1 -e ? = ( [0.51 - 0.50] +

[0.58 - 0.60] + [0.57 - 0.55] + [0.68 - 0.72] )= -0.03

So for each trial, this Delta would be added to a running total (call it Dtot),

and

E = e1(P0) + (Dtot / numtrials)

Is this right? (And that according to the above the static eval of the original

position, e(P0), and the final result, R, are irrelevant)

Sep 1, 1998, 3:00:00â€¯AM9/1/98

to

In article <6rulua$7m8$1...@nnrp1.dejanews.com>, <bshe...@hasbro.com> wrote:

>> ta...@mit.edu (Michael J Zehr) wrote:

>> >In article <35dfeaaa...@news.demon.co.uk>, <Midas> wrote:

>> >I'm sure one can use different algorithms to get different results from

>> >the same neural net and dice, however one would also get different

>> >results from different neural nets when using the same algorithm.

>

>> ta...@mit.edu (Michael J Zehr) wrote:

>> >In article <35dfeaaa...@news.demon.co.uk>, <Midas> wrote:

>> >I'm sure one can use different algorithms to get different results from

>> >the same neural net and dice, however one would also get different

>> >results from different neural nets when using the same algorithm.

>

>Something is seriously wrong if you reduce variance and end up with a

>different answer. The goal of variance reduction is to cause the statistical

>process to converge more quickly to the same answer as without variance

>reduction.

>different answer. The goal of variance reduction is to cause the statistical

>process to converge more quickly to the same answer as without variance

>reduction.

Unless we assume that all variance reduction algorithms give the same

standard deviation after the same number of games, i.e. there is only

one variance reduction algorithm, then the values must be different.

We use variance reduction because after a fixed number of games we'll

get a result that is closer to the theoretically most precise result

from that neural net. If one algorithm is "better" than another, it

will report an even closer value after the same number of games -- in

other words it reports a different value.

As an example, suppose the theoretically most precise equity of a

position for a given neural net (not necessarily the true equity of a

position, but rather the best that a particular net can give us) is

.350, then after 36 games with no variance reduction the standard

deviation might be .150 and the equity reported might be .220. If we use

a variance reduction method that reduces variance by a factor of 10,

i.e. 36 games is equivalent to 360 games, the standard deviation might

be .45, and we'd expect a value closer to .350 than .220 is. If we use

a really good variance reduction so the standard deviation is .025

(equivalent to 1296 games, or a factor of 36), against we'd expect a

value closer to .350.

Since none of the neural nets give you a choice between two different

variance reduction methods this can't be actually demonstrated. However

one can demonstrate the difference between no variance reduction (which

is a valid algorithm, although a poor one!) and some variance

reduction. Try this if you have Snowie:

Pick a position. Set the score to double-match point. Do a 2-game

rollout selecting "rollout with doubling cube in play." Do another

2-game rollout without that option selected but everything else the

same. You'll get rather different results.

-Michael J. Zehr

PS. I think Brian was thinking of a different situation. If you have

two variance reduction algorithms than the equity position they

eventually converge to should be the same, i.e. if after 1000 games

algorithm A reports a standard deviation of .025 and after 2000 games

algorithm B reports a standard deviation of .025, the equity results

should be the same, or at least within .025. I was addressing what A

and B report after the same 1000 games, not what they report when they

eventually converge.

0 new messages

Search

Clear search

Close search

Google apps

Main menu