Would 1 million trials be enough to convince you?

MK

May 18, 2023, 5:09:09 AM

Well, actually, 1 million 296 thousand to be exact.

Here is the short of it (with details at the end of the post):

GNUbg ID: tm0TAQSabdsAAA:cAkFAAAAAAAA

1,296,000 trials, cubeful, 0-ply with maximum noise

9/8 9/7 = cubeful -0.299, cubeless -0.297
7/6 7/5 = cubeful -0.420 (-0.121), cubeless -0.421 (-0.124)
3/1 2/1 = cubeful -0.498 (-0.199), cubeless -0.496 (-0.199)
3/2 3/1 = cubeful -0.526 (-0.227), cubeless -0.525 (-0.227)

One striking detail is how the cubeful and cubeless
equities for moves, and the cubeful and cubeless equity
differences between moves, become nearly identical
(i.e. "cube skill" evaporates) after 1,296,000 trials...!
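
As a quick check on the arithmetic, the parenthesized differences can be recomputed from the posted equities. This is only a sketch in Python: the dictionary simply restates the four result lines, and the function name is made up for illustration. Because the posted figures are already rounded to three decimals, a recomputed difference can be off by 0.001 from the one GNUbg printed.

```python
# Equities as posted in this thread: move -> (cubeful, cubeless).
equities = {
    "9/8 9/7": (-0.299, -0.297),
    "7/6 7/5": (-0.420, -0.421),
    "3/1 2/1": (-0.498, -0.496),
    "3/2 3/1": (-0.526, -0.525),
}

def differences(equities):
    """Return move -> (cubeful diff, cubeless diff) relative to the best move."""
    best_cf = max(cf for cf, _ in equities.values())
    best_cl = max(cl for _, cl in equities.values())
    return {
        move: (round(cf - best_cf, 3), round(cl - best_cl, 3))
        for move, (cf, cl) in equities.items()
    }

diffs = differences(equities)
for move, (d_cf, d_cl) in diffs.items():
    # "gap" is how far apart the cubeful and cubeless differences are.
    gap = abs(d_cf - d_cl)
    print(f"{move}: cubeful {d_cf:+.3f}, cubeless {d_cl:+.3f}, gap {gap:.3f}")
```

The gap column is the quantity the paragraph above is about: how closely the cubeful and cubeless differences track each other.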

If you want to double-check it yourselves, it shouldn't
take more than 4-5 hours on an average desktop PC.

I realize that the decisions aren't completely random
and thus there is still a minimal amount of jackoffski
bias in there but with 1 million 296 thousand trials, I
hope you won't lower yourselves so far as to deny that
the results are accurate and revealing enough, without
the capability of fully eliminating the jackoffski bias in
current bots (among which Noo-Bg is the only one that
even comes close to it as above).

Too bad that developers of Ex-Gee, Noo-BG, BG-bzzt,
etc. won't make the minimal effort to add the feature
to do random rollouts to their bots, which would allow
us to undeniably demonstrate what kinds of pieces of
shits their gamblegammon bots are. :(

Although not as stumped as Zimmer, Paul, Tim and
Stick, I'm still stumped how 9/8 9/7 could be better
than 7/6 7/5, in a very sad and lonely way... :( ... ;)

MK

================================================
1. Rollout 9/8 9/7 Eq.: -0.299
35.3 0.0 0.0 - 64.7 0.3 0.1 CL -0.297 CF -0.299
[ 0.0 1.5 0.0 - 0.0 0.0 0.0 CL 0.001 CF 0.001]
Full cubeful rollout with variance reduction
1296000 games, Mersenne Twister dice generator with seed 2655780272
Play: 0-ply cubeful, noise 1 (d)
Cube: 0-ply cubeful, noise 1 (d)
2. Rollout 7/6 7/5 Eq.: -0.420 (-0.121)
29.3 0.0 0.0 - 70.7 0.5 0.2 CL -0.421 CF -0.420
[ 0.0 0.8 0.0 - 0.0 0.0 0.0 CL 0.001 CF 0.001]
Full cubeful rollout with variance reduction
1296000 games, Mersenne Twister dice generator with seed 2655780272
Play: 0-ply cubeful, noise 1 (d)
Cube: 0-ply cubeful, noise 1 (d)
3. Rollout 3/1 2/1 Eq.: -0.498 (-0.199)
25.6 0.0 0.0 - 74.4 0.5 0.2 CL -0.496 CF -0.498
[ 0.0 0.2 0.0 - 0.0 0.0 0.0 CL 0.001 CF 0.001]
Full cubeful rollout with variance reduction
1296000 games, Mersenne Twister dice generator with seed 2655780272
Play: 0-ply cubeful, noise 1 (d)
Cube: 0-ply cubeful, noise 1 (d)
4. Rollout 3/2 3/1 Eq.: -0.526 (-0.227)
24.1 0.0 0.0 - 75.9 0.6 0.2 CL -0.525 CF -0.526
[ 0.0 3.3 0.0 - 0.0 0.0 0.0 CL 0.001 CF 0.001]
Full cubeful rollout with variance reduction
1296000 games, Mersenne Twister dice generator with seed 2655780272
Play: 0-ply cubeful, noise 1 (d)
Cube: 0-ply cubeful, noise 1 (d)
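
For anyone wanting to sanity-check the CL column, here is a sketch that recomputes cubeless money equity from the six outcome percentages in each rollout. It assumes the columns are win, win gammon, win backgammon, lose, lose gammon, lose backgammon, with each gammon figure including backgammons. That reading of the output is an assumption, and the function name is made up for illustration. Since the percentages are printed to one decimal, agreement to within about 0.001 is the best that can be expected.

```python
def cubeless_equity(win, win_g, win_bg, lose, lose_g, lose_bg):
    """Cubeless money equity from cumulative outcome percentages (0 to 100).

    A plain win counts 1 point, a gammon 1 extra, a backgammon 1 more,
    so summing the cumulative columns gives the expected points.
    """
    points = (win + win_g + win_bg) - (lose + lose_g + lose_bg)
    return points / 100.0

# Percentages and reported CL equities copied from the rollout output.
rollouts = {
    "9/8 9/7": ((35.3, 0.0, 0.0, 64.7, 0.3, 0.1), -0.297),
    "7/6 7/5": ((29.3, 0.0, 0.0, 70.7, 0.5, 0.2), -0.421),
    "3/1 2/1": ((25.6, 0.0, 0.0, 74.4, 0.5, 0.2), -0.496),
    "3/2 3/1": ((24.1, 0.0, 0.0, 75.9, 0.6, 0.2), -0.525),
}

computed = {move: cubeless_equity(*probs) for move, (probs, _) in rollouts.items()}
for move, (_, reported) in rollouts.items():
    print(f"{move}: computed {computed[move]:+.3f}, reported {reported:+.3f}")
```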

Stick Rice

May 18, 2023, 7:01:09 AM

I have no idea what you're trying to prove with a 0-ply rollout and max noise. You might as well let my cat choose which plays to make. From memory, that setting is basically a pure beginner.

Stick

Timothy Chow

May 18, 2023, 8:45:02 AM

On 5/18/2023 7:01 AM, Stick Rice wrote:
> I have no idea what you're trying to prove with a 0-ply rollout and max noise. You might as well let my cat choose which plays to make. From memory, that setting is basically a pure beginner.

His theory is that the "strong" bot settings introduce serious
systematic errors, which he's trying to avoid by using random
errors instead, and using large numbers of trials to smooth out
the random errors.
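
The distinction between the two kinds of error can be sketched with a small simulation. Everything here is hypothetical: the "true" equity, the bias, and the noise level are invented numbers, not anything measured from a bot. The sketch only shows the general statistical point that averaging more trials shrinks the random part of the error, while any systematic offset stays in the average.

```python
import random
import statistics

def estimate(true_value, bias, noise_sd, trials, seed=0):
    """Average `trials` noisy samples of true_value + bias.

    Returns (mean, standard error of the mean). All inputs are
    hypothetical illustration values.
    """
    rng = random.Random(seed)
    samples = [true_value + bias + rng.gauss(0.0, noise_sd) for _ in range(trials)]
    mean = statistics.fmean(samples)
    std_err = statistics.stdev(samples) / trials ** 0.5
    return mean, std_err

TRUE_VALUE = -0.30   # invented "true" equity
BIAS = 0.05          # invented systematic offset

for trials in (100, 10_000, 100_000):
    mean, std_err = estimate(TRUE_VALUE, BIAS, noise_sd=1.0, trials=trials)
    # The standard error falls roughly as 1/sqrt(trials); the offset does not.
    print(f"{trials:>7} trials: mean {mean:+.4f}, std err {std_err:.4f}, "
          f"offset from true {mean - TRUE_VALUE:+.4f}")
```

Whether a given rollout setting carries such an offset is exactly what is in dispute here; the simulation takes no position on that.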

In particular, he'd be happy to use your cat, except that your
cat isn't fast enough.

---
Tim Chow

MK

May 18, 2023, 8:25:23 PM

On May 18, 2023 at 5:01:09 AM UTC-6, Stick Rice wrote:

> On Thursday, May 18, 2023 at 5:09:09 AM UTC-4, MK wrote:

>> 1,296,000 trials, cubeful, 0-ply with maximum noise

>> 9/8 9/7 = cubeful -0.299, cubeless -0.297
>> 7/6 7/5 = cubeful -0.420 (-0.121), cubeless -0.421 (-0.124)
>> 3/1 2/1 = cubeful -0.498 (-0.199), cubeless -0.496 (-0.199)
>> 3/2 3/1 = cubeful -0.526 (-0.227), cubeless -0.525 (-0.227)

> I have no idea what you're trying to prove
> with a 0 ply rollout and max noise.

At the risk of offending the deep knowledge of
some people here, let me try to give a detailed
yet still simple answer.

TD-Gammon v1, the granddaddy of all BG bots
that followed, was trained by random self-play
and reached an intermediate level of play, (and
was lauded as the first BG bot untainted by any
human bias to achieve it), but it could only play
cubeless single games.

To progress at playing better and also at more
complicated cubeless matches, cubeful single
games and cubeful matches, ideally it would
have kept learning through random self-play.

But doing that would require huge amounts of
computing power that didn't exist back at that
time, 31 years ago. So, they re-inserted human
bias into it through cube skill formulas, match
equity tables, etc. which led the bots to play
wrongly and to biased rollouts that circularly
validate the same wrong plays.

Today even average desktop PCs have enough
power to do million-trial rollouts in a few hours.
So, what I'm trying to do is show you guys how
a properly trained, unbiased bot would play, by
substituting almost random rollouts. (Actually,
with 1,296,000 trials, I could well justify simply
saying "random" instead of "almost random").

I have no problem submitting to the results of
such rollouts as better than my judgment, (i.e.
9/8 9/7 vs. 7/6 7/5). I'm not as good at reading
bots' minds as some of you daily debaters of
positions but even if I may be able to do so in
some very simple cases, I'm truly stumped in
this one and I really wish that one of you oracles
could/would explain to me why..?

MK

MK

May 18, 2023, 8:56:48 PM

On May 18, 2023 at 6:45:02 AM UTC-6, Timothy Chow wrote:

> On 5/18/2023 7:01 AM, Stick Rice wrote:

>> I have no idea what you're trying to prove
>> with a 0 ply rollout and max noise.

> His theory is that the "strong" bot settings
> introduce serious systematic errors,

Not just "strong" but any non-random play
settings cause it, only to varying degrees.

> which he's trying to avoid by using random
> errors instead,

I'm using "random play" not "random error".

> and using large numbers of trials to smooth
> out the random errors.

Yes, even at the "weakest" settings, there is
still a trace of bias and a very large number
of trials "virtually" eliminates it.

> In particular, he'd be happy to use your cat,
> except that your cat isn't fast enough.

I'm a big cat lover. No, not big cats. I'm a cat
big lover. Oops, Tim won't like it. How about
I'm a big lover of cats? But I'm not big. Hmm.
I'm a lover of cats in a big way??

Anyway, when I read this, I immediately had
the vision of my cat scooting around pieces
and knocking over the doubling cube on my
backgammon board with its cute little paws.
That gave me a long, reminiscent smile :)

Then, there was Stick's cat at the other end
of the backgammon board doing the same! :)

Thank you to both of you for making my day,
in a big way. :))

MK

Axel Reichert

Jun 5, 2023, 6:26:04 AM

MK <mu...@compuplus.net> writes:

> GNUbg ID: tm0TAQSabdsAAA:cAkFAAAAAAAA
>
> 1,296,000 trials, cubeful, 0-ply with maximum noise
>
> 9/8 9/7 = cubeful -0.299, cubeless -0.297
> 7/6 7/5 = cubeful -0.420 (-0.121), cubeless -0.421 (-0.124)
> 3/1 2/1 = cubeful -0.498 (-0.199), cubeless -0.496 (-0.199)
> 3/2 3/1 = cubeful -0.526 (-0.227), cubeless -0.525 (-0.227)

Yes, I am finally convinced. Convinced that 9/8 9/7 is the best move
against a random player, provided that I also continue randomly after
this "best" move.

Unfortunately that does not even help me against the average coffee
house player.

But playing against GNU Backgammon set to maximum noise is relaxing and
comforting, even funny sometimes, because one can check how solid one's
own backgammon fundamentals are in very weird positions.

Best regards

Axel

MK

Jun 6, 2023, 3:58:13 AM

On June 5, 2023 at 4:26:04 AM UTC-6, Axel Reichert wrote:

> MK <mu...@compuplus.net> writes:

>> GNUbg ID: tm0TAQSabdsAAA:cAkFAAAAAAAA

>> 1,296,000 trials, cubeful, 0-ply with maximum noise

>> 9/8 9/7 = cubeful -0.299, cubeless -0.297
>> 7/6 7/5 = cubeful -0.420 (-0.121), cubeless -0.421 (-0.124)
>> 3/1 2/1 = cubeful -0.498 (-0.199), cubeless -0.496 (-0.199)
>> 3/2 3/1 = cubeful -0.526 (-0.227), cubeless -0.525 (-0.227)

> Yes, I am finally convinced. Convinced that 9/8 9/7
> is the best move against a random player, provided
> that I also continue randomly after this "best" move.

I wonder if you realize how pathetic your comments
sound.

All bots since TD-Gammon, (and perhaps some even
before it), were trained through random self-play. Do
you have any objections to that?

What I am doing is making the bot train itself through
random self-play for a single position. Do you have any
objections to this in principle?

Accepting, (at least for the sake of the argument), you
people's claim that cubeful play is more complex, I let
the bot play not just 1,296 times but 1,296,000 times
against itself, so that it can encounter cubeful positions
as many times as any bot doing a rollout with 1,296
trials encounters cubeless positions.

I can understand if you find 1,296,000 trials not enough
and I would have no problem with running 12 million or
120 million trials. Let me hear what other problems you
may have with it..?

The 9/8 9/7 in this example isn't the best move against
only a random player. It's the best move period.

There is no limitation that the play continues randomly
after this "best" move either. Provided that the bot goes
through enough cubeful random trials, it will find "the
best move" better than any existing jackoffski bot!

> Unfortunately that does not even help me against the
> average coffee house player.

Because you either refuse to understand or you are not
able to understand it.

> But playing against GNU Backgammon set to maximum
> noise is relaxing and comforting, even funny sometimes,

Now you are dipping below pathetic. :( But if it helps ease
your pain, go ahead...

> because one can check how solid one's own backgammon
> fundamentals are in very weird positions.

What I'm proposing is not limited to "weird positions" at all.

But you are right about checking your gamblegammon
fundamentals, (however/wherever you acquired them),
in any kind of position against the results of cubeful (and
matchful) random rollouts. You will find yourself, and
your worshipped bots, wrong far more often than you
can ever imagine. When that time comes, I hope that there
won't be too many jumpings from bridges, tall buildings, etc...

MK

Axel Reichert

Jun 6, 2023, 5:13:40 PM

MK <mu...@compuplus.net> writes:

> All bots since TD-Gammon, (and perhaps some even
> before it), were trained through random self-play. Do
> you have any objections to that?

No. During training the parameters of the neural network are allowed to
adapt.

> What I am doing is making the bot train itself through
> random self-play for a single position. Do you have any
> objections to this in principle?

Yes, I do. You do not train the bot, you run the bot. No parameters of
the neural network will have changed after your random rollout.
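
The difference between running and training can be sketched with a toy evaluator. This is not GNU Backgammon's code: the class, its two weights, the feature vector, and the way the simulated win chance is derived from the evaluation are all invented for illustration. The only point is that a rollout reads the parameters, while a training step writes them.

```python
import random

class TinyEvaluator:
    """Hypothetical stand-in for a bot's network: one weight per feature."""

    def __init__(self, weights):
        self.weights = list(weights)

    def value(self, features):
        return sum(w * f for w, f in zip(self.weights, features))

    def train_step(self, features, target, lr=0.1):
        # Training: nudge the weights toward the observed target.
        error = target - self.value(features)
        self.weights = [w + lr * error * f for w, f in zip(self.weights, features)]

def rollout(evaluator, features, trials, seed=0):
    # Running: the evaluator is only read. Outcomes are simulated wins (+1)
    # and losses (-1); the win chance is a made-up function of the evaluation.
    rng = random.Random(seed)
    p_win = min(max(0.5 + 0.5 * evaluator.value(features), 0.0), 1.0)
    total = sum(1 if rng.random() < p_win else -1 for _ in range(trials))
    return total / trials

bot = TinyEvaluator([0.2, -0.1])
position = [1.0, 0.5]           # invented feature vector

before = list(bot.weights)
result = rollout(bot, position, trials=10_000)
after_rollout = list(bot.weights)

bot.train_step(position, target=1.0)   # e.g. learning from one won game
after_training = list(bot.weights)

print("rollout result:", result)
print("weights changed by rollout: ", after_rollout != before)
print("weights changed by training:", after_training != before)
```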

> The 9/8 9/7 in this example isn't the best move against
> only a random player. It's the best move period.

If you say so. But are you not the guy advocating adapting the play
based on the opponent (which I support)? How does that go along with
your claim that the best move against a random player is also the best
move against any opponent?

> There is no limitation that the play continues randomly after this
> "best" move either. Provided that the bot goes through enough cubeful
> random trials, it will find "the best move"

It will gather results and statistics based on random play, essentially
worthless against non-random opponents.
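
A toy example, deliberately not backgammon, shows how the ranking of two moves can flip depending on whether the opponent replies at random or picks the best reply. The game tree and its payoffs are invented purely for illustration and say nothing about the position in this thread.

```python
# Invented two-move game: each move leads to three possible opponent
# replies, and each number is our payoff after that reply.
tree = {
    "A": [+1, +1, -1],   # good on average, but one reply refutes it
    "B": [0, 0, 0],      # always breaks even
}

def value_vs_random(replies):
    """Expected payoff if the opponent picks a reply uniformly at random."""
    return sum(replies) / len(replies)

def value_vs_best(replies):
    """Payoff if the opponent always picks the reply that is worst for us."""
    return min(replies)

best_vs_random = max(tree, key=lambda move: value_vs_random(tree[move]))
best_vs_best = max(tree, key=lambda move: value_vs_best(tree[move]))

print("best move against random replies:", best_vs_random)
print("best move against best replies:  ", best_vs_best)
```

How far this carries over to backgammon, where dice add their own randomness, is the open question in this exchange.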

>> Unfortunately that does not even help me against the
>> average coffee house player.
>
> Because you either refuse to understand or you are not
> able to understand it.

Correct.

Axel

MK

Jun 7, 2023, 6:32:25 PM

On June 6, 2023 at 3:13:40 PM UTC-6, Axel Reichert wrote:

> MK <mu...@compuplus.net> writes:

>> What I am doing is making the bot train itself
>> through random self-play for a single position.
>> Do you have any objections to this in principle?

> Yes, I do. You do not train the bot, you run the
> bot. No parameters of the neural network
> will have changed after your random rollout.

As I was typing, I felt that you were apt to give
such a response and I tagged on "in principle"
to my last sentence just to avert it but I guess
it didn't work. :(

Just because the bot isn't told to remember it
doesn't take away anything from the result. (In
fact, many times in the past, it was suggested
that future bots should perpetually update their
networks with all rollouts and games they play.)

>> The 9/8 9/7 in this example isn't the best
>> move against only a random player. It's the
>> best move period.

> If you say so.

I don't say so. The rollout says so. (I thought
7/6 7/5 was the best move.)

> But are you not the guy advocating adapting the
> play based on the opponent (which I support)?

Yes, I still advocate it and I'm glad you support it.

> How does that go along with your claim that the
> best move against a random player is also the
> best move against any opponent?

There is no contradiction because there is no
such thing as the "best move against a random
player" (nor have I said anything to that effect).

Whatever gets accepted as the "best move", is
best against any/all opponents.

Whatever is learned from training through random
play becomes knowledge, not "random knowledge".

>> There is no limitation that the play continues
>> randomly after this "best" move either. Provided
>> that the bot goes through enough cubeful
>> random trials, it will find "the best move"

> It will gather results and statistics based on
> random play, essentially worthless against
> non-random opponents.

Random play is the only way to train bots without
human bias. The object of the play is to win. The
random moves that result in most wins bubble up
to the top and become "best moves". If you retain
that knowledge and repeat the same moves next
time you play, they're no longer random moves.

The non-random rollouts do just that; reuse the
best moves that resulted from previous random
rollouts.

The only thing I'm doing differently is to make the
bot do *random cubeful* and *random matchful*
decisions during the "training" process in order to
eliminate human biases injected by cube formulas
and METs that are retroactively applied to results
of training by *random cubeless and matchless*,
(i.e. 1-pointer), play. It's this simple! I'm puzzled at
how you can fail to understand it..? :(

>>> Unfortunately that does not even help me
>>> against the average coffee house player.

>> Because you either refuse to understand or
>> you are not able to understand it.

> Correct.

Whichever way you meant it, it is a virtue to
acknowledge one's limitations...

MK

Axel Reichert

Jun 8, 2023, 9:18:00 AM

MK <mu...@compuplus.net> writes:

> future bots should perpetually update their networks with all rollouts
> and games they play.

IIRC, (much) older versions of GNU Backgammon allowed for this.

Axel

MK

Jun 8, 2023, 9:00:29 PM

I have 14 versions of Noo-BG, starting with v0.0
and some of them including source code, but I
don't remember anything about this. It would be
very interesting/useful if anyone could provide
more info on this.

If it existed, can it be revived and improved on?
Was it done locally or were the results sent to a
remote center?? Etc.???

MK