On another note, I think 15 points is too long, as I found myself
losing focus towards the end of some of the matches, not that I'm
trying to make excuses, but I think that 11 points is probably the
limit, at least for me (unless I'm playing an incredibly weak player,
I suppose).
>I won the series, 5 to 2, and I'm wondering if those of you who
>followed it think it was decided by a clear skill difference or was it
>due mostly to luck?
I did not follow it. But Michael thought that the two of you were
fairly equal in skill based upon the error rates. So what do *you*
think, Monty? Was it a clear skill difference or was it mostly due to
luck? Since Michael gave his opinion, why not give yours?
>
>On another note, I think 15 points is too long, as I found myself
>losing focus towards the end of some of the matches, not that I'm
>trying to make excuses, but I think that 11 points is probably the
>limit, at least for me (unless I'm playing an incredibly weak player,
>I suppose).
11 point matches will be decided more by luck than 15 point matches.
The ability to maintain focus for long periods of time is an important
quality for backgammon players who aspire to excellence.
Rich
On 5/10/08 5:00 PM, in article
b36c7242-fb2c-4b18...@p25g2000hsf.googlegroups.com,
"mont...@lycos.com" <mont...@lycos.com> wrote:
> I won the series, 5 to 2, and I'm wondering if those of you who
> followed it think it was decided by a clear skill difference or was it
> due mostly to luck?
> And what is the GNUBG assessment of luck overall
> (as well as skill)?
>
I'm going to let Bob and others post the findings (He is an outside observer
and has been keeping up with all the matches).
But I can tell you some interesting findings in general:
PlayerA) Outplayed opponent on Cubes by wide margins over course of 7
matches.
PlayerB) Only had a better Cube error rate in *one* match.
PlayerA) Had better checker error rates in 4 of the matches
PlayerA) Was only player to have matches (2 of them) beating opponent on
Both Checker AND Cube play
PlayerA) Had an overall(total) lower checker play error rate than PlayerB
across all matches
PlayerA) Had an Overall snowie error rate in the 6's.
PlayerB) Had an Overall snowie error rate in the 7's.
PlayerA and PlayerB's best match were won with a similar total error rate
PlayerB) Had the worst match period (Based on combined error rates of Cubes
and Checker Play)
PlayerA) May be a slightly more consistent player
Regarding luck, in every case the luckier player won. I said we played about
even strength, but to be honest PlayerA in my opinion may have a slight edge
at present.
> On another note, I think 15 points is too long, as I found myself
> losing focus towards the end of some of the matches,
On average matches lasted about 1.25 hours. I'd venture to guess we made a
move every 3-4 seconds on average across all the matches.
What is the average time for a 7 or 9ptr over the board in an ABT event (I'd
be curious) - so someone with experience step in please. My bet is that some
of those 7ptrs at ABT events exceed the longest (time wise) match we had. I
may be wrong.
I must admit the length of the matches and time required were better than I
expected. No undue delays or lengthy BRB's. I had no issues with
concentration. But then again longer match lengths likely work in my favor
in the long run.
I often play matches of length 17-25.
> not that I'm
> trying to make excuses,
Then don't. We had ample time between matches, and I went out of my way to
make sure they weren't all done in a week. 7 15pt matches over 5 weeks in my
opinion was not too burdensome.
> but I think that 11 points is probably the
> limit, at least for me (unless I'm playing an incredibly weak player,
> I suppose).
That is your limit. I can accept that. Just out of curiosity, in general who
do the shorter matches favor - The stronger or the weaker player?
Michael
PS: Bob, would you be willing to post the actual results in summary form
from Gnubg (Similar to what you did after the first 3 matches).
>> but I think that 11 points is probably the
>> limit, at least for me (unless I'm playing an incredibly weak player,
>> I suppose).
>
> That is your limit. I can accept that. Just out of curiosity, in general who
> do the shorter matches favor - The stronger or the weaker player?
>
> Michael
>
Funny thing is I originally asked for 13pt matches, but Monty requested 15pt
matches (which was fine by me).
When I said 1.25 hours that was just a perceived estimate. Luckily on WGC
the chat files show the times we started playing to the time we finished
(Only exception was fibs today where I knew we finished under an hour):
Match 1: 50 mins
Match 2: 40 mins
Match 3: 60 mins
Match 4: 65 mins
Match 5: 45 mins
Match 6: 50 mins
Match 7: 55 mins
Only one match took slightly longer than an hour. Average time for a match -
50-55 mins.
I bring this up because it bugs me a bit - I felt I played at a good pace
and, even with the few comments during play over the whole series, that I
didn't take excessively long to play these matches. I have played some 11
pointers that have had twice as many moves and taken 2 hours. I'm on the
record saying that match length was dictated by Monty, and I consider the
match times very good.
Mike
You are correct that some ABT matches will take longer. There is a
significant time savings though online due to no time used shaking
dice or rerolling cocked dice. Also forced moves being played
automatically can be another time saver.
Bob Koca
> You are correct that some ABT matches will take longer. There is a
> significant time savings though online due to no time used shaking
> dice or rerolling cocked dice. Also forced moves being played
> automatically can be another time saver.
>
Thanks Bob, that's what I expected.
Would you rate these matches as excessively long (Time wise for number of
points) for online matches? I think some might claim we actually played too
fast and may have actually missed some plays.
I only play online at dailygammon which is a turn based site so I
am not the best to ask.
Bob Koca
For many reasons I think the luck adjusted results are better to
look at for a series of games involving two players.
The most important reason is that it is not at all negatively
influenced by playing to the opponent.
Bob Koca
>Who was player A?
Take a guess.
> In any case, I'd be interested in continuing to
>play, but it would have to be 9 pt. matches for 1 Euro at TMG or XG.
>I've claimed that GNUBG error rate is not the "whole story" and that I
>play my human opponents in ways that might not be technically best
>because I'm trying to "trip" him or her "up."
This is priceless. If the tables were turned and Monty lost 2-5 with a
lower error rate he would be whining about the bad luck that he got
and talking about how his error rate was lower but he still lost.
A while back he played a very strong player "Progressive" on TMG.
Monty said that his error rate was 6 and Progressive's was 7 but he
lost due to bad luck and whined about how backgammon needs to change
the rules to eliminate bad luck. Maybe Progressive tried to trip up
Monty and that is why his error rate was a bit higher. Funny how Monty
does not consider that possibility.
Now that he is winning it is all due to his special skill at tripping
up opponents. I still would like to know why he puts "trip" and "up"
in quotation marks.
> How many 9 point
>matches would I have to win (let's say it's best of 15 matches) to be
>considered a better player, even if my error rate was not as good,
>according to GNUBG? Obviously, this can't be ascertained exactly -
>I'm just curious about opinions.
Notice how Monty does not directly give his opinion about whether he
won by skill or luck. He does seem to imply that he thinks that he won
by skill even though his error rate may have been the same or lower
than his opponent.
Does Monty think that winning 5 out of 7 matches is statistically
significant enough to determine who clearly is the best player?
Michael was generous enough to give Monty money up front just to
play. I still am puzzled by why on earth he would do that.
Would Monty put that same amount of money up in a rematch, winner
takes all?
Rich
There are many more reasons why ABT matches will take longer. You need
to set up the board after each game. There may be a call to the
tournament director over irregularities or questions about rules.
There may be a question/disagreement about the placement of checkers
after an opponent moves several checkers to test out different plays.
Also the need to count pips manually will dramatically slow down play.
When clocks are used to time matches, they usually give 5 minutes per
point per opponent. So for a 7 point match, a total of 70 minutes is
allotted. This is more than the longest 15 point match that Michael and
Monty played.
If Monty tired after an hour of play he would be limited to 7 point
matches in ABT tournaments which essentially would disqualify him from
playing.
Rich
>
>Bob Koca
My opinion (as of now): I can beat Petch consistently (in this kind of
format: best of 7 fifteen point matches), because I think I know how
to play him, assuming his level and style of play stays about the
same. However, it's not really worth my time at this point, beyond
playing a 9 pt. match for a Euro now and then. I don't play nearly as
much as I used to, and I'm not nearly as interested in it as I used to
be (on any level). Even Rich's comments, which used to be
entertaining, are just plain boring now. For those of you who take
great interest in this game, "more power to you." Unless the
organization of this game changes, and it becomes more like
"professional chess" (where top players get paid whether they win or
lose the tournament), I can't see any reason to get more "serious"
about it.
Regarding 5-2. Let's say you had a fair coin, and you flipped it 7 times and
the result came up: HTHHHTH are you saying you can conclude with certainty
that the dice is unfair? I'd conclude your sample set of 7 isn't a large
enough sample set to draw any conclusion.
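
As a rough check on the coin-flip argument above, here is a minimal Python sketch that computes how often a fair coin gives at least 5 heads in 7 flips. The function name is made up for illustration, and the calculation assumes independent flips; it says nothing about backgammon skill by itself.

```python
from math import comb


def prob_at_least(k_min: int, n: int, p: float = 0.5) -> float:
    """Probability of at least k_min successes in n independent trials,
    each succeeding with probability p (binomial tail)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))


# Fair coin, 7 flips, at least 5 heads: (21 + 7 + 1) / 128
p_5_of_7 = prob_at_least(5, 7)
print(f"P(at least 5 heads in 7 fair flips) = {p_5_of_7:.4f}")
```

A result that a fair coin produces more than one time in five is weak evidence on its own, which is the point being made about a 5-2 series.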
175 Euros was worth this post. And if you hadn't figured it out I was
PlayerA. I was actually being kind not to name names in that post to show
why I said we were relatively close without stating which player may have
the edge. I have been as gracious as I could be, well mannered, and pleasant
- unless you can show me otherwise.
I am an Intermediate advanced player who stated at the outset that on longer
matches my error rates usually put me around 1900 by GnuBg rating standard.
That is usually around a 7 snowie error rate. I played to my expectations.
I had an edge not only on checker play, but a huge edge on cube play. And
I'll say this - The longer the match with you the more interested in playing
you for money I'd be.
Why is it that PlayerB wished shorter matches, and PlayerA would prefer
longer ones? No explanation is needed on this, and I will simply state again:
Monty - Nicely played series, but it shows two things. I am the *more*
dominant player when it came to skill at present, and you were the more
dominant player on luck. *This* time around you went 5-2 - and that's my
conclusion.
On 5/10/08 11:02 PM, in article C44BD9EF.58D4D%mpe...@capp-sysware.com,
"Michael Petch" <mpe...@capp-sysware.com> wrote:
> HTHHHTH are you saying you can conclude with certainty
> that the dice is unfair?
"HTHHHTH are you saying you can conclude with certainty
that the dice is unfair?"
Should have read:
"HTHHHTH are you saying you can conclude with certainty
that the COIN is actually unfair?"
>My opinion (as of now): I can beat Petch consistently (in this kind of
>format: best of 7 fifteen point matches), because I think I know how
>to play him, assuming his level and style of play stays about the
>same.
You play at about the same error rate level, so the only way that you
can consistently beat him is to either play at a much higher level or
be lucky. Apparently every match that each of you won was won by the
luckier player.
Since you have no control over the luck you cannot possibly
*consistently* beat him. If you played another series of matches he
would be as likely to win five out of seven as you were.
But since you think that you know how to play him, why don't you share
with the group how you would do this. Cue Monty to tell us that he
does not want to give away any trade secrets which is funny
considering that Petch just gave him a considerable sum of money just
to play. It is the least that Monty could do under the circumstances.
Rich
> For many reasons I think the luck adjusted results are better to
> look at for a series of games involving two players.
> The most important reason is that it is not at all negatively
> influenced by playing to the opponent.
>
I agree with you here, but had I mentioned the results of that I think it
might have made things look more favorable to myself.
But since you brought it up (rightly so), when adjusted for luck I was a
favorite in all matches, ranging from 51%/49% to 60%/40%. I was about an
average 55.4% favorite for all the matches based on the luck adjusted
values.
1 Mpetch +5.23 (3ply Checker 2 ply cube)
2 Mpetch +4.23 (3ply Checker 2 ply cube)
3 Mpetch +1.73 (2ply Checker 2 ply cube)
4 Mpetch +4.72 (2ply Checker 2 ply cube)
5 Mpetch +8.92 (2ply Checker 2 ply cube)
6 Mpetch +10.62 (2ply Checker 2 ply cube)
7 Mpetch +2.13 (2ply Checker 2 ply cube)
Mike
The +10.62 means you are a 55.31% - 44.69% favorite and not a 60.62%
favorite correct?
Bob Koca
>
> When clocks are used to time matches, they usually give 5 minutes per
> point per opponent. So for a 7 point match, a total of 70 minutes is
> allotted. This is more than the longest 15 point match that Michael and
> Monty played.
>
The 5 minutes per point per player is from obsolete rules. Since about
a year ago clocked matches are nearly always played with Bronstein
clocks. Usually each player gets a total of 2 minutes per point to be
played of reserve time plus 12 seconds per move grace time. If the
player moves before the 12 seconds then no grace time is used up but
there is no extra bonus to moving faster than the 12 seconds.
So for a 7 point match each player gets 14 minutes reserve time. 10
games averaging 20 turns per player per game would be well above
average but will happen. That could take up to a worst case of
(2 x 10 x 20 x 1/5) = 80 minutes grace time + 28 minutes reserve time
= 108 minutes play time. Each player is usually allowed one short
break. There will be a couple of minutes setting up the pieces each
game. Any rulings could also cause delays.
Bob Koca
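
The worst-case clock arithmetic above can be sketched in a few lines of Python. The function and parameter names are invented for illustration; the defaults (2 minutes of reserve per point, 12 seconds of grace per move, two players) are the figures quoted in the post, and the sketch assumes every move uses its full grace time and both players exhaust their reserve.

```python
def bronstein_worst_case_minutes(
    match_length: int,
    games: int,
    turns_per_player_per_game: int,
    reserve_min_per_point: float = 2.0,
    grace_sec_per_move: float = 12.0,
    players: int = 2,
) -> float:
    """Upper bound on playing time: all grace time plus all reserve time."""
    # Grace time: every turn by every player uses the full grace allowance.
    grace_minutes = players * games * turns_per_player_per_game * grace_sec_per_move / 60.0
    # Reserve time: each player gets a bank proportional to the match length.
    reserve_minutes = players * match_length * reserve_min_per_point
    return grace_minutes + reserve_minutes


# 7 point match, 10 games, 20 turns per player per game
print(bronstein_worst_case_minutes(7, 10, 20))  # 80 grace + 28 reserve
```

Breaks, board setup and rulings are not included, so real elapsed time could be longer than this bound.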
So you think you would be one of the top players if you were to get
more serious ?!
It didn't happen for you in chess why would backgammon be different?
Curious, where does the money come from to pay the top players
whether they win or lose?
Bob Koca
>
> The +10.62 means you are a 55.31% - 44.69% favorite and not a 60.62%
> favorite correct?
>
As far as I know, +10.62% means 50%+10.62%. I wondered about this many moons
ago when I first started using Gnu, but this is the description:
"Luck adjusted result: The luck adjusted result is calculated as the actual
result plus the total unnormalised luck rate. This is also called "variance
reduction of skill" as described in Douglas Zare's excellent article Hedging
Toward Skill. This should give an unbiased measure of the strengths of the
players."
By this definition one person will have 50%+"Total unnormalized luck rate",
and the other will have 50%-"Total unnormalized luck rate"
Someone else may wish to jump in if my interpretation of this is incorrect.
Mike
> Someone else may wish to jump in if my interpretation of this is incorrect.
>
As a followup I decided to plug the computed "Luck based Fibs rating
difference" into the Fibs formula to determine the match winning chances of
each player, and it seems to suggest my interpretation is probably correct.
Assuming:
F = the probability of the favorite winning the match. (Calculated)
U = the probability of the underdog winning the match. (calculated)
F = U-1 (Calculated)
N = match length (Given)
D = the difference between the two ratings (Given)
And we compute U as U = 1/(10^(D*SQRT(N)/2000)+1)
In match #1 luck based Fibs rating Difference = 47.13
Therefore D=47.13
U = 1 / (10^(47.13*sqrt(15)/2000)+1)
U = .447655
Chance of underdog Winning - U * 100 = 44.77%
Chance of Favorite winning - (1-U) * 100 = 55.23%
Gnubg shows the calculated Luck Adjusted value for Match #1 as 5.23%.
This maps to how I perceive the values (50%+LAV, and 50%-LAV).
Mike
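
For anyone who wants to repeat the calculation above, here is a minimal Python sketch of the same formula, U = 1/(10^(D*sqrt(N)/2000)+1), with F = 1 - U. The function name is made up for illustration; the inputs are the Match #1 values quoted in the post (D = 47.13, N = 15).

```python
import math


def fibs_underdog_win_prob(rating_diff: float, match_length: int) -> float:
    """Underdog's match winning chance under the FIBS rating formula
    as written in the post: U = 1 / (10^(D*sqrt(N)/2000) + 1)."""
    return 1.0 / (10.0 ** (rating_diff * math.sqrt(match_length) / 2000.0) + 1.0)


u = fibs_underdog_win_prob(47.13, 15)
f = 1.0 - u  # favorite's chance
print(f"underdog {u * 100:.2f}%  favorite {f * 100:.2f}%")
# underdog 44.77%  favorite 55.23%
```

The favorite's 55.23% is 50% plus the 5.23 shown as the luck adjusted value for Match #1, which is the mapping being argued for.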
On 5/11/08 2:41 AM, in article C44C0D60.58D64%mpe...@capp-sysware.com,
"Michael Petch" <mpe...@capp-sysware.com> wrote:
> F = U-1 (Calculated)
Should have read:
F = 1-U (Calculated)
Agree your interpretation is correct. For one more piece of evidence
I played a 1 point match always making the worst move I could quickly
find. We would expect the bot to be close to 100% favorite and the
luck adjusted result was +47.59
Bob Koca
The key mistaken word in this statement is "consistently."
The outcome of this set of matches just proves that a (slightly) less
skilled PlayerB (Monty) can beat a (slightly) more skilled PlayerA
(MPetch) over a limited set of games.
It's noted that, being paid *before* beginning the series of matches,
your only incentive was, instead of "playing the opponent," to play
just your best technical game in order to confirm before this
newsgroup (or before God) your self-awarded qualifications.
This is what Rich wasn't able to realize -- Michael's move was aimed at
removing any extra-technical consideration for Monty, such as would
have arisen when playing for a stake and then "playing the opponent"
with sub-optimal play in order to exploit spotted flaws.
I think that statement of yours clearly demonstrates you're a self-
inflated player.
Snowie-assigned levels of play on each match show that you're
incapable of "consistently" achieving an expert level of play over
long matches (that is, longer than 9 points, your preferred match
length). From Michael's website:
#1 - 19 games Monty: casual
#2 - 9 games Monty: advanced
#3 - 15 games Monty: intermediate
#4 - 17 games Monty: casual
#5 - 13 games Monty: intermediate
#6 - 9 games Monty: casual
#7 - 13 games Monty: advanced
My Snowie analysis ranks you a bit higher but still far from being an
expert, let alone an "almost world class" player.
So you'll be awarded the second corynthian banner in the recent
history of this newsgroup.
> For those of you who take
>great interest in this game, "more power to you."
For someone who does not take "great interest" in backgammon, Monty
sure seems to spend a lot of time playing the game and posting to this
newsgroup. He has been bitching and moaning about the game for years
in this newsgroup. And yet he keeps playing and posting.
But I don't think that it is the game of backgammon that has as much
interest to Monty, as does believing that he has great expertise in
some human endeavor even though he just deludes himself into believing
this. Even when his belief flies in the face of objective data (error
rate, luck) he continues to persist in his delusion.
I mean how can he say with a straight face that he could
"consistently" beat Michael Petch in a series of 15 point matches when
his error rates, both cube and checker, were inferior? He used to
justify playing inferior moves because he played clearly inferior
players. But Michael Petch and Monty play at very close levels of
expertise so that justification no longer applies.
The reality is that Monty played a series of matches and demonstrated
that he plays at intermediate-advanced level. He is very far from
expert or World Class player, his delusional beliefs notwithstanding.
Rich
mpetch (Average of 7 ) 1852 +- 9 Moves: 133 Cubes: 15
Monty (Average of 7 ) 1824 +- 17 Moves: 146 Cubes: 30
Kees (Another thing just in player who went wild.)
> Here's a rating analysis of the series, using the same gnubg settings
> used to calibrate the rating estimate formulae built in gnubg. Listed
> are the estimated rating, the 5% confidence interval, and the
> contributions from checker play and cube play towards the lost rating
> points (2000 being calibrated as "perfect play").
>
> mpetch (Average of 7 ) 1852 +- 9 Moves: 133 Cubes: 15
> Monty (Average of 7 ) 1824 +- 17 Moves: 146 Cubes: 30
>
>
Thanks Kees. I have one question: doesn't GnuBG use 2050 as perfect
play? The numbers pretty much jibe except they are about 50 lower than what
Gnu would generally show.
I understand the Gnubg rating calculation is an estimate based on Checker
and Cube play. With that being said - An interesting thing to note is that
if you take 1824+17 you get 1841, and 1852-9=1843. Within the expected error
range we could be seen as pretty much even, with a slight edge in my favor.
I felt the difference in strength (In the series) was marginal enough to
say "We'd split the matches in the long term". But at present I think one
might be able to say one player may have a slight edge. I guess this edge
may be why I would like to see longer matches, and Monty would like to see
the shorter ones.
Mike
The object of the game is to win the match, not have a low error
rate. Note that the error rate has problems with its definition even
if the bot is perfect, which it is not.
Bob Koca
It should say 95% CI instead of 5%. A common misinderstanding with
confidence intervals may be at play here. Suppose that your lower
amount and Monty's upper amount actually did meet. For example, you
were at 1850 instead of 1852. That does not mean that there is only
95% confidence that your rating is higher. It is actually higher than
that as two unlikely things would have needed to happen. Your true
rating needs to be much lower than the prediction and also Monty's
much higher than predicted.
Bob Koca
Player                                  mpetch            Monty

Chequerplay statistics
Total moves                             1574              1569
Unforced moves                          1364              1338
Unmarked moves                          1422              1449
Moves marked doubtful                   93                64
Moves marked bad                        42                34
Moves marked very bad                   17                22
Error total EMG (MWC)                   -16.724           -17.095
Error rate mEMG (MWC)                   -12.3             -12.8
Chequerplay rating                      Intermediate      Intermediate

Luck statistics
Rolls marked very lucky                 0                 0
Rolls marked lucky                      88                76
Rolls unmarked                          1386              1410
Rolls marked unlucky                    58                46
Rolls marked very unlucky               13                3
Luck total EMG (MWC)                    +8.604            +19.822
Luck rate mEMG (MWC)                    +5.5              +12.6
Luck rating                             None              None

Cube statistics
Total cube decisions                    1222              1078
Close or actual cube decisions          254               245
Doubles                                 55                52
Takes                                   28                21
Passes                                  24                34
Missed doubles below CP (EMG (MWC))     24 (-1.184)       18 (-1.879)
Missed doubles above CP (EMG (MWC))     5 (-0.346)        11 (-1.256)
Wrong doubles below DP (EMG (MWC))      17 (-1.362)       15 (-1.581)
Wrong doubles above TG (EMG (MWC))      3 (-0.244)        4 (-0.407)
Wrong takes (EMG (MWC))                 0                 2 (-1.032)
Wrong passes (EMG (MWC))                8 (-1.391)        16 (-4.380)
Error total EMG (MWC)                   -4.528            -10.536
Error rate mEMG (MWC)                   -17.8             -43.0
Cube decision rating                    Intermediate      Awful!

Overall statistics
Error total EMG (MWC)                   -21.252           -27.630
Error rate mEMG (MWC)                   -13.1             -17.5
Snowie error rate                       -6.8              -8.8
Overall rating                          Intermediate      Intermediate
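
The rate lines in the statistics above can be reproduced from the totals, as the Python sketch below shows. The denominators are an inference from the numbers themselves, not a quotation from the GnuBG manual: chequer rate per unforced move, cube rate per close or actual cube decision, overall rate per both combined, and the Snowie-style rate per total move by both players. All names in the code are made up for illustration.

```python
def rate_per_thousand(error_total: float, decisions: int) -> float:
    """Error total divided by a decision count, scaled by 1000 (the 'm' in mEMG)."""
    return 1000.0 * error_total / decisions


# Figures copied from the statistics above.
players = {
    "mpetch": {"total_moves": 1574, "unforced": 1364, "close_cubes": 254,
               "chequer_err": -16.724, "cube_err": -4.528},
    "Monty": {"total_moves": 1569, "unforced": 1338, "close_cubes": 245,
              "chequer_err": -17.095, "cube_err": -10.536},
}

# Assumed Snowie-style denominator: all moves made by both players.
all_moves = sum(p["total_moves"] for p in players.values())


def summarize(p: dict) -> dict:
    overall_err = p["chequer_err"] + p["cube_err"]
    return {
        "chequer": rate_per_thousand(p["chequer_err"], p["unforced"]),
        "cube": rate_per_thousand(p["cube_err"], p["close_cubes"]),
        "overall": rate_per_thousand(overall_err, p["unforced"] + p["close_cubes"]),
        "snowie": rate_per_thousand(overall_err, all_moves),
    }


for name, p in players.items():
    print(name, {k: round(v, 1) for k, v in summarize(p).items()})
```

Under these assumptions the rounded values match the printed chequer, cube, overall and Snowie rates for both players.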
OH, I don't know what the default is, I set it to 2000.
Kees
> It should say 95% CI instead of 5%.
That statement is wrong.
>A common misinderstanding with confidence intervals may be at play
>here.
Perhaps, I don't know your level of misinderstanding.
Kees (Nonsense, for seen from allegations that findeth me culinary
territory assured of telling, must be, fiscally
neutral ground and militarism has come???)
All in all I think it's clear that both of you are evenly matched and
play at a really high level. You have a slight edge in cube handling
though.
Now, many would be eager to see whether you will go for another
series.
Regards
>
> All in all I think it's clear that both of you are evenly matched and
> play at a really high level. You have a slight edge in cube handling
> though.
>
Agreed
> Now, many would be eager to see whether you will go for another
> series.
>
Good idea, so how about we start with a new challenge. I am willing to play
Monty Nine Matches, each 25pts long for $150 USD/MATCH. Assuming such a
match takes no more than 3 hours that would work out to about $50/hour for
Monty.
$150 USD is reasonable since I already gave him $225. Since he is the
superior player with such a commanding ability to beat me, by his
statistical analysis of first 7 matches he'll only need a few hundred
dollars (This is assuming he is as good as he thinks he is). And since he is
the superior opponent - he should have a better chance of beating me in a
much longer match.
I've stacked the tables against myself! Free money for Monty.
Mike
I'm very eager to watch your matches, if they're going to happen, cuz
your words are a bit .... hmmm, don't know what to say.
Also we'll be delighted to see you both playing at FIBS regularly.
There's lot of tourneys and leagues and you can enjoy the friendly
community there. Hope to have the pleasure of playing with you too.
Regards and good luck to you both!
For years people have argued that the longer the match, the more likely
the more skillful player will win. That is a statement that would be hard
to refute. But I have always argued that a series of shorter matches
would be a far bigger test of skill than one long match or a shorter
series of longer matches. In a long match, most of the time the take
point decisions and price of gammons are pretty close to the same as
money games...around 25% and not that complex. But it's when you get
to under 5-away 5-away and the score starts changing that the checker
and cube play decisions can get extremely complex. And here is where
great skill comes in.
I would rather see a best of 9 five-point matches than 2 out of 3 13
point matches, or one 25 point match. I not only think the series of
shorter matches would require more skill to win, it sure would be a
lot more fun to watch. Picture a tennis match where the first player
to win 100 points wins the match. How boring that would be. But real
tennis is extremely exciting and full of strategy, partially because
within 4 points you suddenly have a "critical" point that is a
deuce or advantage point. Backgammon needs more of those critical
games for excitement, fun, and skill.
This was discussed on this group some time ago. (Search: "Longer
matches favor the favorite, right?").
It's reasonably straightforward to calculate the various
probabilities.
Take a 1900 rated player, playing a 1700 rated player (FIBS rating
system),
then we get:
1. 21 point match -> P(win match) = 74.2%
2. 7 point match  -> P(win match) = 64.8% --> P(win best of 3)  = 71.5%
3. 3 point match  -> P(win match) = 59.8% --> P(win best of 7)  = 70.7%
4. 1 point match  -> P(win match) = 55.7% --> P(win best of 21) = 70.2%
tansley
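
The figures in the list above can be reproduced with a short Python sketch. It assumes the FIBS rating formula gives the single-match winning chance for a 200 point rating difference, and that matches in a series are independent with the same winning chance each time. The function names are made up for illustration.

```python
import math


def fibs_favorite_win_prob(rating_diff: float, match_length: int) -> float:
    """Favorite's single-match chance under the FIBS rating formula."""
    underdog = 1.0 / (10.0 ** (rating_diff * math.sqrt(match_length) / 2000.0) + 1.0)
    return 1.0 - underdog


def best_of_series_win_prob(p_match: float, n_matches: int) -> float:
    """Chance of winning a majority of n_matches independent matches.
    Playing out all matches gives the same answer as stopping early."""
    need = n_matches // 2 + 1
    return sum(
        math.comb(n_matches, k) * p_match**k * (1.0 - p_match) ** (n_matches - k)
        for k in range(need, n_matches + 1)
    )


rating_diff = 1900 - 1700
for match_length, series in [(21, 1), (7, 3), (3, 7), (1, 21)]:
    p = fibs_favorite_win_prob(rating_diff, match_length)
    s = best_of_series_win_prob(p, series)
    print(f"{match_length:2d} pt: P(match) = {p:.1%}, P(best of {series}) = {s:.1%}")
```

The sketch only restates the formula's own prediction, so it inherits whatever inaccuracy the match-length adjustment in the rating formula has.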
Big assumption that the rating system adjustment for match length is
accurate. I have not seen evidence for that.
Bob Koca
> Big assumption that the rating system adjustment for match length is
> accurate. I have not seen evidence for that.
>
> Bob Koca
What I have learned, experienced and read from others is that the FIBS
rating formula is likely a better estimate over a longer match length and may
overestimate MWC for the shorter matches ( <= 3 ). I'd be interested in the
input from Kees or Mr Zare on this question.
Mike
PS: It's interesting how when you take away the personalities of what
prompted all these discussions, that the ensuing discussions of a technical
nature are actually quite interesting. If someone is wondering where the
value of my money went, part may have been for entertainment, but mostly I'd
say that the REAL discussions here were worth it.
Yes -- generally speaking.
But this set of matches was arranged under totally atypical
conditions, weren't they?
They were paid:
a) before beginning actual play
b) one way, regardless of the actual results
So you can't apply usual assumptions to this case.
I think Michael had a (more or less hidden) purpose other than merely
winning the matches. And the course Michael's argument is taking would
demonstrate that.
> Note that the error rate has problems with its definition
> even if the bot is perfect, which it is not.
Nonetheless it's the most reliable tool available to take relative
measures of both players' skills. Besides, it's likely that the same
distortions would roughly go both ways, so still giving a valid
*comparative* measure.
You are saying that Monty's only incentive was to play to minimize
the error rate and that is just ridiculous since Monty himself said he
used his "tricks". Clearly he had a will to win the matches, which is
incentive to make the plays to win the match regardless of error rate
consequences. Are you so enamored of money that you don't care about
winning a match if there is no money on the line and even take it so
far to think that everyone is like that also?
Bob Koca
Hi Guys,
********
PREFACE: After writing this version of War and Peace - some may decide to
ignore this post because of limited interest in this thread. If you are
going to close this message without reading but you are a GnuBG user please
see the valuable footnote (At the bottom) if you have an interest in the
Luck Adjusted Values that GnuBG computes.
********
I appreciate all the input. I'm going to give my view on what I was trying
to achieve with this first challenge.
I knew my playing level going into all this. I also suggested in a post
before the series that I likely averaged around 1900 (ratings usually fell
between 1875 and 1925). Based on all the available data on Monty (provided
by himself) I predicted he was the slightly better player at a 1925 level
(When I posted that number many felt he was likely not going to play at that
level, and SO FAR given the limited sample set that is currently accurate).
I felt that although I may be an underdog (a 25 ratings difference in a 15
pt match would have made him a slight favorite of about 52-53%), I felt I
was much better than the average skill level Monty claims to play for money
(See his posts on the types of fish he prefers).
I felt I would give him a run on skill. I knew full well he was likely to
use his Jedi Mind tricks. He has posted about them before. I was also well
aware of his tendency to drop clear takes when the match underdog may have
been slightly favored. I also surmised the cube was likely a weak point for
him.
Monty had very limited knowledge on how I played and I likely had more
knowledge of cube tendencies. As Phil Simborg (Who is definitely no slouch!)
pointed out, it's likely much harder to identify checker play weaknesses and
then exploit them (and know it can be consistently done against that
opponent). I agree with Phil that on cube play it's much easier depending on
relative strength.
What I expected, even with his Jedi mind tricks is that at a minimum he
should outplay me. If his mind tricks caused me to get into more difficult
positions, then I would have had to make worse errors in the following
play(s) to offset the equity he originally gave up. I would expect that if
Monty makes an error, that somewhere he gained equity in doing so. I would
have to make enough corresponding errors in the "difficult" position to
offset the deliberate error(s) he made to put me into the position.
So If he makes an error, and I play the position out better than he expected
he effectively gave me extra equity I wouldn't necessarily have had if he
made the right play.
If you look at the checker play error rate, I can make a case that he failed
to force me into enough tough positions to make clear and consistent errors
that would be net benefit to him. I ended up with the edge on checker play.
My opinion is he assumed I was a much weaker player, assumed I would blunder
consistently in the tough positions his mind tricks put me in, but in the end I
ended up with fewer errors. It makes no sense to me that if he successfully
got me into hard positions that I would end up with a lower error rate.
All this means that if I had played at a 1750 level, I wouldn't have been
shocked that he played at 1850. Sure he might have been a 1925 to start but
if he made deliberate errors, and I had played substantially worse - he
would have had a clear argument that he had a net gain so his errors work
the right thing.
Monty has still not provided any reason why I ended up with a lower checker
play error rate than he did if I was the one being successfully trapped into
making huge errors in the resulting positions.
When all is said and done, I believe Monty actually gave up checker play
equity that never materialized in the bad play he expected from me. But I'm
still open for him to identify specific scenarios he repeatedly played and
produced a successful trap.
On to cube play. This was an even more interesting situation. Paul noted in
some of his analysis of the matches that I had taken advantage of Monty's
cube flaws. He was very correct. I actually cubed early knowing it was an
error (and should have been a take) for me but Monty dropped. He did this
consistently in the first 3 matches.
Now Monty may say "He is the superior player" and that he dropped because he
felt he could make up the equity in future cube and checker play. This in fact
is valid, assuming you can outplay the opponent on future cube and checker
plays. Against a clearly inferior opponent I would consider doing what
Monty does (but not to the same degree) if I knew I could make up the
difference based on the skill differential alone. The error rates for cubes
and checker play suggest that he had no significant advantage considering
the level I actually ended up playing at (Better than his).
Unlike Monty, I am also willing to point out where playing to an opponent's
weakness doesn't always pay off. Again, one of my annotations is on a cube I
knew was bad, but Monty took (in fact I was surprised he got the position
right, so kudos to Monty). But I will argue that if you exclude missed
cubes (because we had similar levels of errors in that department - but
still an edge for me), and look at net gain in cube equity - I dominated
Monty. In fact Monty should have been able to fare much better against me
than he did. Telling stats are the amount of lost equity on 16 bad passes and
2 bad takes compared to the equity I lost with 8 bad passes and no bad
takes.
There was one place where Monty got me to drop and I told him I knew it was
a take. After I said it was a take (but dropped), Monty said something like
"I knew you would do that". At the time the score was heavily in my favor.
But there was no previous pattern of my dropping like that.
It just so happens this scenario occurred in one of the matches I won.
I will state for the record - plain and simple - I took advantage of
Monty's poor cube skills to gain equity that was a net benefit to myself.
Monty's mind tricks potentially resulted in a net deficit in checker play
and a net deficit in cube equity. I say potentially because no situations
have been shown to prove that he did this and/or was consistently successful.
Why is it I can provide direct examples, and he can't? Maybe Monty missed
the post match analysis and commentary from myself and Bob and Paul.
------
Let's talk luck. I really hate discussing it because Backgammon has a
significant luck factor. Dice are fickle and you can run hot one week and
cold another. We've all experienced it (Including Monty who complained about
some major whipping his opponents gave him with substantial luck factors -
and he posted about them on this NG). I accept luck as part of the normal
course of playing backgammon.
I'd like to draw your attention to this comment he made at the beginning of
Match #1's chat:
[1:05 PM] Monty> well, I'm curious to see how much luck is involved in this
series of matches
Well it's really funny, I tried to discuss luck with him in a previous post,
and what the bots thought when luck was factored out. It's simple: Monty won
with more luck than myself in the 5 matches he won. I had more luck in the 2
matches I won. So what does that prove? Absolutely nothing except that the
luckier player won.
So can we try to estimate what things may have been like if luck is factored
out? We can. GnuBG uses a variance reduction method across all the checker
moves to estimate the MWC and the ratings differential, and the result appears
as a "Luck Adjusted Value" (LAV) in the stats. I have posted the stats before and
Kee's (A mathematical whiz akin to Douglas Zare) also posted his results.
Basically my LAV put me as the match favorite in all the matches even the
ones he won (Please see an important footnote below regarding a potential
issue with LAV below). So am I sitting here saying "I had positive LAV's in
all the matches, therefore I will conclude I am the superior opponent"? NOT
AT ALL. Number one, the LAV is an estimate. It's a good estimate over
longer matches (15 pts qualifies) but it's not perfect. What it does suggest
is that there is more to the outcome of the matches than meets the eye.
There just isn't enough empirical evidence to draw a conclusion (whether
based on Matches won or who the match favorite is factoring out luck).
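To make the idea of "factoring out luck" concrete, here is a toy sketch in Python. It is not GnuBG's code, and the thread does not show how GnuBG computes the LAV. The sketch assumes that the luck of a roll is the equity after the actual roll minus the average equity over all equally likely rolls, and that the luck-adjusted result is the actual result minus the net luck. The function names and all numbers are hypothetical.

```python
def roll_luck(equities, actual_roll):
    """Luck of one roll (assumed definition): equity after the actual roll
    minus the mean equity over all equally likely rolls."""
    expected = sum(equities.values()) / len(equities)
    return equities[actual_roll] - expected


def luck_adjusted_result(actual_result, player_luck, opponent_luck):
    """Actual result minus net luck (player's total luck minus opponent's)."""
    net_luck = sum(player_luck) - sum(opponent_luck)
    return actual_result - net_luck


# Hypothetical numbers: a win worth +0.5 MWC that came with +0.35 net luck
# is only worth about +0.15 once the luck is subtracted.
adjusted = luck_adjusted_result(0.5, [0.3, 0.1], [0.05])
print(round(adjusted, 2))
```

Under these assumptions a player can win a match and still show a negative adjusted result, which is the sense in which the loser of a match can come out as the "favorite" once luck is removed.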
I believe this is where Monty and I differ. He believes (along with a
minority) that a sample set of seven in relation to matches won is a good
predictor of future results. This is a page out of "How to lie with
statistics". I mean taking the one statistic that favors yourself and ignoring
other factors and variables to reach a conclusion about future results. This
is a substantially biased way of using statistics. Monty WON *THIS* series.
That's clear. His predictions are not founded in present results.
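As a rough check on how little a seven-match sample can show, here is a minimal sketch that assumes every match is an independent coin flip between two equally skilled players. That is an assumption for illustration, not a claim about this series. The function name is hypothetical.

```python
from math import comb


def prob_at_least(wins, matches, p=0.5):
    """Probability of winning at least `wins` of `matches` independent
    matches when each match is won with probability `p`."""
    return sum(
        comb(matches, k) * p**k * (1 - p) ** (matches - k)
        for k in range(wins, matches + 1)
    )


one_player = prob_at_least(5, 7)   # a named player wins 5 or more of 7
either_player = 2 * one_player     # either player wins 5 or more of 7
print(f"{one_player:.3f} {either_player:.3f}")  # prints "0.227 0.453"
```

Under this assumption, a result of 5-2 or better for one particular player turns up about 23% of the time, and for one player or the other about 45% of the time, so a 5-2 score on its own says little about who is the stronger player.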
I believe I am taking a more objective look at as many stats as possible to
reach a conclusion that it's nearly a toss-up (Personal bias may suggest I
might be a slight favorite, but that's a far cry from Monty's claim of proof
that he WILL consistently beat me in this type of format - a bold claim, and
that he is CLEARLY the better player). I know many players who think they
are good until they play someone better. That's why Monty vs Stick (#63)
would be an interesting benchmark to see Monty's playing ability (Likely the
supposed mind tricks will be rendered useless in a skill based analysis).
I would have been willing to say "Monty, you are a better player than
myself" if he had won the series having my results, and that in a majority
of the matches his LAV was higher. That would have been easy for me to say.
And going into the series its actually what I expected. I wanted to give him
a run for his money (well my money) and the statistics simply ended up
"interesting".
Anyway, this is likely my last substantial post on the specific nature of
this series.
----------
Footnote: In order to be objective, I wish to point out that there may
actually be a flaw in the Luck Adjusted Value within GnuBG that I want Monty
and others reading to be aware of. In 2006 a change was made to rectify a
serious error in computing the Cube error rates. It wasn't until yesterday
(And this was a coincidence and not related to these Monty vs Mpetch
threads) that Christian Anthon realized that the Coefficients used on the
LAC calcs related to cube errors still applied. There is a bit of debate on
how significant things could change. The GnuBG developers with the help of
Kee's hope to compute/determine new coefficients. With that being said, do I
expect the LAV's to be substantially different? I may make a case that in 1
match (possibly 2) Monty *MAY* be a favorite (but it's hard to tell). If you
wish to monitor developments on this subject, the gnubg mailing list thread
can be found here:
http://lists.gnu.org/archive/html/bug-gnubg/2008-05/msg00008.html
Mike
On 5/13/08 1:27 PM, in article C44F47D9.58FE8%mpe...@capp-sysware.com,
"Michael Petch" <mpe...@capp-sysware.com> wrote:
> gain so his errors work
> the right thing.
Should be:
"gain so his errors WERE the right thing."
> Christian Anthon realized that the Coefficients used on the
> LAC calcs
Should be:
"Christian Anthon realized that the Coefficients used on the
LAV calcs"
Not exactly. I'm saying that Monty's only incentive was to "outplay"
Michael, just like he so many times told us he did to his weaker
opponents at BGR and TMG, before he switched to a more "opponent-
tailored" play (passing takeable races, holding games, gammonish
doubles, etc). A lower error rate would be the consequence, not the
incentive.
> ...and that is just ridiculous since Monty himself
> said he used his "tricks".
Ah ok, if Monty said so...
> Clearly he had a will to win the matches, which
> is incentive to make the plays to win the match
> regardless of error rate consequences.
I think he had a will to win the matches, but towards confirming
himself as an expert player. On that count, he failed. The poor notes
awarded by GNU to his overall play can't be due to the use of "tricks"
here and there -- it'd rather denote that he was "tricking" most of
the time!
> Are you so enamored of money that you don't care about
> winning a match if there is no money on the line...
Under the peculiar conditions of these matches, with no money on the
line Monty's motivation was to play his best (technically, not
psychologically). If you have any doubt about it, then please re-read
the very first question he opened this thread with:
"what is the GNUBG assessment of luck overall (as well as skill)?"
Doesn't that tell you something...
> ...and even take it so far to think that everyone
> is like that also?
Ohh, let's not get sentimental.
Having read that thread on the gnubg-mailing list, I think it's a
viable assumption that Christian has misunderstood your question at
that point. In gnubg there is the luck-based rating difference, and
there is the error-based fibs elo estimate. The latter is surely
affected by the 2006 change, and according to my understanding, the
former is most certainly not. The 2006 change is concerned with
changing the divisor in calculating error rates. The LAV calculation
does not deal with rates (but with total mwc), and does not explicitly
care about cube errors, either.
Maik
> changing the divisor in calculating error rates. The LAV calculation
Ugh, "divisor" should have been spelled "denominator".
On 5/14/08 2:35 AM, in article
e20c0cab-8e70-4130...@p25g2000hsf.googlegroups.com, "Maik
Stiebler" <stie...@onlinehome.de> wrote:
> On 13 Mai, 21:27, Michael Petch <mpe...@capp-sysware.com> wrote:
>> Footnote: In order to be objective, I wish to point out that there may
>> actually be a flaw in the Luck Adjusted Value within GnuBG that I want Monty
>> and others reading to be aware of. In 2006 a change was made to rectify a
>> serious error in computing the Cube error rates. It wasn't until yesterday
>> (And this was a coincidence and not related to these Monty vs Mpetch
>> threads) that Christian Anthon realized that the Coefficients used on the
>> LAC calcs related to cube errors still applied.
>
You're correct, it should only be the estimated fibs rating difference that
is inaccurate.
Yes but forced moves are a thoroughly bad idea. Sometimes a play
appears forced but there is a hidden alternative which is stronger
than the obvious move. If forced moves are played automatically, a
player is alerted to these hidden choices (by the non-appearance of
the seemingly-forced play) -- obviously this is undesirable.
Paul Epstein