
Here is how gnubg cheats 3.


mu...@compuplus.net

May 31, 2012, 5:09:32 AM
    GNU Backgammon  Position ID: bXcjAAS7cwMAAA
                    Match ID   : EQE5AwAAEAAE
    +-1--2--3--4--5--6-------7--8--9-10-11-12-+  O: gnubg
  O | O  O  O  X  O  O |   |                  |  0 points
  O | O  O  O     O  O |   |                  |  Rolled 26
    |    O  O     O    |   |                  |
    |                  |   |                  |
    |                  |   |                  |
    |                  |BAR|                  |^ 25 point match
    |                  |   |                  |
    |                  |   |                  |
    |          X  X    |   |                  |
    |    X  X  X  X  X |   |                  |
    | X  X  X  X  X  X |   |       X          |  2 points
    +24-23-22-21-20-19------18-17-16-15-14-13-+  X: Murat 25x110 (Cube: 2)


How would you play?

How do you think gnubg played?

3-ply cubeful hint says: 6/off 6/4 = 46.53%, 6/off 2/off = 46.44%.

And of course gnubg plays 6/off 6/4.

Hmmm...? Let's roll this one out.

3-ply cubeful roll out says: 6/off 2/off = 46.38%, 6/off 6/4 = 46.37%.

Okay, now, can you guess what was my next roll?

2-6.

This just happened during the 2nd game of the 79th match of 25-points I started playing against the gnubg grandmaster, while watching a video on Hulu at the same time, right after I posted a few articles here.

The day you are good enough to sniff such subtle irregularities in gnubg's play, you will be able to beat it by 75% also... ;)

MK

mu...@compuplus.net

May 31, 2012, 6:24:07 AM
Update: about 1 hour 10 minutes later, after a long series of 22 games, "nut case" wins 27 to 20... :))

Still 20 minutes left on the Hulu movie "The vicious kind", so I'll play on...

MK

Tim Chow

May 31, 2012, 10:39:55 AM
On May 31, 5:09 am, mu...@compuplus.net wrote:
>     GNU Backgammon  Position ID: bXcjAAS7cwMAAA
>                     Match ID   : EQE5AwAAEAAE
>     +-1--2--3--4--5--6-------7--8--9-10-11-12-+  O: gnubg
>   O | O  O  O  X  O  O |   |                  |  0 points
>   O | O  O  O     O  O |   |                  |  Rolled 26
>     |    O  O     O    |   |                  |
>     |                  |   |                  |
>     |                  |   |                  |
>     |                  |BAR|                  |^ 25 point match
>     |                  |   |                  |
>     |                  |   |                  |
>     |          X  X    |   |                  |
>     |    X  X  X  X  X |   |                  |
>     | X  X  X  X  X  X |   |       X          |  2 points
>     +24-23-22-21-20-19------18-17-16-15-14-13-+  X: Murat 25x110 (Cube: 2)
>
> How would you play?
>
> How do you think gnubg played?
>
> 3-ply cubeful hint says: 6/off 6/4 = 46.53%, 6/off 2/off = 46.44%.
>
> And of course gnubg plays 6/off 6/4.
>
> Hmmm...? Let's roll this one out.
>
> 3-ply cubeful roll out says: 6/off 2/off = 46.38%, 6/off 6/4 = 46.37%.
>
> Okay, now, can you guess what was my next roll?
>
> 2-6.

What exactly is your logic here?

If you change the dice seed and set up exactly the same position, GNU
3-ply will still play 6/off 6/4, and you'll hit for 11 out of 36
choices of dice seed.

So what is your claim? That GNU plays differently depending on the
dice seed? That's clearly not true as you can verify yourself. That
GNU's rolls don't depend deterministically on the dice seed but take
into account the position on the board? That's also not true as you
can verify yourself. So I'm at a loss as to what you're claiming.

---
Tim Chow
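
The 11-out-of-36 figure above can be checked by brute force. This is only a small sketch, assuming the hit needs one specific number on either die (taken here to be a 4, which is an assumption read off the board rather than something stated in the post). The helper name `count_rolls` is made up for illustration.

```python
from itertools import product

def count_rolls(must_show, must_not_show=None):
    """Count ordered two-dice rolls (36 in total) that show `must_show`
    on at least one die and, optionally, `must_not_show` on neither die."""
    count = 0
    for roll in product(range(1, 7), repeat=2):
        if must_show not in roll:
            continue
        if must_not_show is not None and must_not_show in roll:
            continue
        count += 1
    return count

# Assumption: a single number (4) on either die is what hits.
hits = count_rolls(4)
print(f"{hits} of 36 rolls contain a 4")
```

Whatever seed produces the dice, a fair sequence should show that number in about 11 of every 36 rolls, so one hitting or missing roll says very little by itself.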

mu...@compuplus.net

Jun 1, 2012, 6:24:10 AM
I haven't developed this bot. So, I don't know the infinite details about how it plays but apparently 3-ply cubeful hint is not always the same as 3-ply cubeful roll out.

I don't know if this works both ways either, since I don't try to catch positions where the bot's grandmaster decision is better than 3-ply roll out, which logically shouldn't be possible anyway.

So, my conclusion is that gnubg's 3-ply look ahead grandmaster level play is inferior to its own masturbating roll out...

Why would this matter?

Well, for one thing, in the above example, if we were betting on my predicting gnudung's future rolls, I would have predicted that I would not roll a 4 and that I would roll a 6. And I would have taken your money in proportion to the odds of my prediction coming true vs not...! Do you get this???

Even if we were not betting money on my predicting gnudung's future rolls, if this had happened at an earlier stage in a game, I would have made my moves based on my prediction that I would not roll a 4 and that I would roll a 6, and thus come out ahead... Do you get this???

I am just illustrating to you guys what I have been talking about for a while here. I am not the one obligated to explain why.

That is a task for the developers, promoters, defenders, worshippers, etc. of gnudung...

MK

Michael Petch

Jun 1, 2012, 7:04:38 AM
On 2012-06-01 04:24, mu...@compuplus.net wrote:
> I haven't developed this bot. So, I don't know the infinite details about how it plays but apparently 3-ply cubeful hint is not always the same as 3-ply cubeful roll out.

In trying to understand whether there is a terminology issue here, can
you tell me how you are performing a "3-ply cubeful roll out" in GNUBG? I'm
asking which screen you use and which buttons you press to do this.

Michael Petch

Jun 1, 2012, 8:23:06 AM
On 2012-06-01 04:24, mu...@compuplus.net wrote:
> I haven't developed this bot. So, I don't know the infinite details about how it plays but apparently 3-ply cubeful hint is not always the same as 3-ply cubeful roll out.
>
> I don't know if this works both ways either, since I don't try to catch positions where the bot's grandmaster decision is better than 3-ply roll out, which logically shouldn't be possible anyway.


*Assuming* that you are actually doing 3-ply (grandmaster) rollouts, and
comparing them to a 3ply evaluation then I have this to say.

The two are not the same. Generally speaking, if you do 3ply rollouts
with enough trials you will get better results than from a 3ply
evaluation. Depending on the number of trials you do, a rollout can
take a very long time to produce results.

In GNUBG 0-ply is 1 move lookahead, 1-ply is 2 move lookahead.

A normal 3 ply evaluation uses the neural network evaluator and/or the
bearoff database to get its values. In simplest terms (excluding
filtering and pruning), a 3 ply evaluation will look at all the
possible outcomes for the next 4 rolls only and produce choices based on
that.

A rollout is different. A rollout actually forces the bot to play
against itself using a certain evaluation level. If you are truly doing a
3ply rollout (4 moves ahead) then the bot will play against itself for a
specific number of trials and/or until a certain statistical threshold
is met. Each one of the moves in a trial would (in your case) look 4 moves
ahead. Each trial plays from the starting position to game conclusion
(double/drop, complete bearoff, resign etc). It keeps track of the
number of wins, gammons, and backgammons and produces output based on
the results of each move rolled out.

Since a rollout attempts to play actual games and add up the results,
it can lead to different results than a normal evaluation. When the
difference is significant, it is an indication that the neural network's
understanding of a position is lacking.

If an n-ply rollout and an n-ply evaluation were actually the same thing,
there would be no reason for rollouts!

In summary - Rollouts using n-ply evaluations and simple n-ply
evaluations are two different things. The fact that an n-ply evaluation
can differ from an n-ply rollout is not any indication of cheating; it is
an indication that the neural network may not understand the position.

I have oversimplified some of the concepts here.
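
To make the evaluation-versus-rollout distinction above concrete, here is a toy sketch. It is not GNUBG code: it uses a made-up pure race game with no move choices, `static_eval` is a hypothetical stand-in for a one-shot evaluator, and `rollout` plays every trial to the end and counts wins. All names and the pip-lead formula are assumptions for illustration only.

```python
import math
import random

def roll_pips(rng):
    """Pips moved on one roll of two dice; doubles are played twice."""
    a, b = rng.randint(1, 6), rng.randint(1, 6)
    return 2 * (a + b) if a == b else a + b

def static_eval(my_pips, opp_pips):
    """Hypothetical one-shot evaluator: guesses the win chance from the
    pip lead alone and ignores who is on roll (a deliberate blind spot)."""
    return 1.0 / (1.0 + math.exp(-(opp_pips - my_pips) / 10.0))

def rollout(my_pips, opp_pips, trials=1296, seed=1):
    """Play `trials` toy races to the end, with 'me' on roll first, and
    return (win fraction, naive binomial standard error)."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        me, opp = my_pips, opp_pips
        while True:
            me -= roll_pips(rng)
            if me <= 0:
                wins += 1
                break
            opp -= roll_pips(rng)
            if opp <= 0:
                break
    p = wins / trials
    return p, math.sqrt(p * (1.0 - p) / trials)

print("static evaluation:", static_eval(60, 60))
print("rollout estimate :", rollout(60, 60))
```

In this toy the static guess for an even race is exactly 0.5, while the rollout comes out higher because the side on roll moves first. A gap like that points at something the evaluator does not understand about the position, not at anything wrong with the dice.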

montanao...@gmail.com

Jun 6, 2012, 7:23:31 AM
I'll be glad to help you understand...

When I get suspicious of a move that gnubg makes, I click on the "hint"
button, which gives me a list of the best plays at cubeful 3-ply (whatever
that means).

Then I highlight the first 2 to 4 moves (one at a time) and click rollout.

If I click on the button with "..." next to rollout, this is what it shows:

Apparently some randomly selected seed value and 1296 trials.

Minimum trials 144. STD deviation 0.01 and JSD 1.96

Truncate cubeless. Cubeful variance reduction and quasi-random dice....

Does this help??

MK
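
As a rough sense of scale for the settings listed above, the sketch below computes a naive standard error for 1296 trials and compares it with the gap between the two rolled-out plays quoted at the top of the thread (46.38% vs 46.37%). It assumes each trial is an independent win/loss, which is a simplification: with the variance reduction and quasi-random dice mentioned above, the real error is expected to be smaller, and the reported percentages may not be plain win rates. The function name is made up for illustration.

```python
import math

def binomial_std_error(p, trials):
    """Naive standard error of a win fraction p estimated from
    `trials` independent win/loss games (a simplifying assumption)."""
    return math.sqrt(p * (1.0 - p) / trials)

p_a, p_b = 0.4638, 0.4637   # rollout figures quoted in the first post
trials = 1296               # trial count shown in the rollout settings

se = binomial_std_error(p_a, trials)
gap = abs(p_a - p_b)

print(f"naive standard error: {se:.4f}")
print(f"gap between plays:    {gap:.4f}")
print(f"gap / standard error: {gap / se:.3f}")
```

Under these assumptions the naive error is about 1.4 percentage points, while the two plays differ by 0.01 of a point, so the ordering of the two plays could flip from one rollout to the next.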

montanao...@gmail.com

Jun 6, 2012, 7:39:08 AM
On Friday, June 1, 2012 6:23:06 AM UTC-6, Michael Petch wrote:

> *Assuming* that you are actually doing 3-ply (grandmaster) rollouts, and
> comparing them to a 3ply evaluation then I have this to say.
>
> The two are not the same, and generally speaking if you do 3ply rollouts
> with enough trials you generally will get better results than a 3ply
> evaluation. And depending on the number of trials you do, it can take a
> very long time to produce rollout results.

Thanks for the clarifications and I will give you my side of the story.

I had the impression that gnudung trained itself by playing against itself a huge number of matches, to the end...

Are you saying this is not the case? If yes, then please explain how the neural net is trained...?

I thought the grandmaster level was the highest level, and that all other lower levels were achieved by inserting certain (perhaps arbitrary??) amounts of noise into gnudung's "perfect" (i.e. grandmaster level) play.

Now I see that there is a "4ply" level above "grandmaster".

Does this mean that "grandmaster" level is now a "noisy level of 4ply"...?

If so, then I will start playing against gnudung 4ply and nothing below that level because I just don't want to hear any bullshit about the "randomly inserted noise" to arrive at lower player ratings, which apparently now includes grandmaster...??

Let me ask you another way: is it possible for me to play against gnudung at the same level it trained itself by playing against itself (whatever that level may be)...?

BTW, I have only 10 more matches of 25-points left to satisfy Chow. How are you coming along with your efforts to set up a controlled experiment of me playing 100 matches of 25-points against gnudung...??

MK

Michael Petch

Jun 6, 2012, 9:57:52 PM
On 2012-06-06 05:23, montanao...@gmail.com wrote:

> Does this help??
>

It tells me that your use of the term rollout was correct in previous
posts. I had to ask, since I have encountered people who have misused
the terms evaluation and rollout in my discussions with them. Wanted to
make sure we are talking about the same thing.

That being said, my recent post about rollouts and evaluations, which
assumed we were talking about the same thing, still holds.
