I was told the info was on ICC but I have not seen it there or
anywhere else on the web. Anyone got any details? It seems a
reasonable adjustment (if perhaps a little high), as the old formula
was a bit low from my observation.
Yes, this is absolutely correct, though I'm not certain of the exact
implementation date. I think it is 1 September 2002, but I don't have my
source document to hand (the BCF Grading List, published 1 August 2002),
and to my considerable surprise I don't immediately see it on the BCF
website. The information is rock solid, though: I've double-checked
another site (www.sccu.ndo.co.uk) for the formula. And I know absolutely
that the implementation date is before Christmas: I am one of the
organisers of the Hastings Congress starting 28/12/2002, no one told us
of the imminent change, and we printed 5K entry forms in mid July with
the old formula, only to hear of the change two or three days later.
Paul Buswell
It was very unusual, under the old formula, to find a player with a BCF
grade above his FIDE grade after adjustment; so one could argue, as the
BCF has effectively done, that your observation concerning the old
formula is correct.
Interestingly, most players who have expressed an opinion want to scrap
the BCF system in favour of an Elo one. On the other hand, most graders
seem to want to keep the system they are used to.
One could say that the BCF grading system is logically flawed: e.g. if
two players play a thirty game drawn match in one grading period, and
these are the only graded games played by them, then their grades swap.
Also the expected score implicit in the BCF grade is not merely dubious,
it is wrong; viz:-
Recall that the grade earned over n games (ignoring junior player
adjustments and the +-40 band) is essentially:

    n * OurGrade = W(TheirGrade + 50) + D(TheirGrade) + L(TheirGrade - 50)   -- (1)

where W = number of wins, D = number of draws, L = number of losses, and
n = number of games. We also know that

    n = W + D + L   -- (2)

We wish to eliminate L, proceeding by Gaussian elimination. Taking
(1) - ((2) * (TheirGrade - 50)):

    n(50 + OurGrade - TheirGrade) = W(100) + D(50)   -- (3)

Now the expected score is W + D/2 (as draws only score a half), which is
the right-hand side of (3) divided by 100; hence, writing GradDiff for
OurGrade - TheirGrade, W + D/2 = (n/2){1 + GradDiff/50}. Or, for one game,

    Expected score = {1 + GradDiff/50} / 2   -- (4)

Note that (4) only makes sense for Abs(GradDiff) <= 50.
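The per-game rule and formula (4) can be sanity-checked numerically; a minimal sketch in Python (the function names are mine, not the BCF's):

```python
def bcf_game_points(their_grade, result):
    """Points credited for one game under the basic BCF rule:
    opponent's grade plus 50 for a win, unchanged for a draw,
    minus 50 for a loss."""
    return their_grade + {"win": 50, "draw": 0, "loss": -50}[result]

def bcf_expected_score(grad_diff):
    """Expected score per game from (4); only meaningful for
    abs(grad_diff) <= 50."""
    return (1 + grad_diff / 50) / 2

# Equal grades imply a 50% expected score, and a 25-point edge 75%:
assert bcf_expected_score(0) == 0.5
assert bcf_expected_score(25) == 0.75

# Consistency with (1): 6 wins, 3 draws and 1 loss against opponents
# graded 150 should produce the grade that (1) predicts.
W, D, L = 6, 3, 1
total = (W * bcf_game_points(150, "win")
         + D * bcf_game_points(150, "draw")
         + L * bcf_game_points(150, "loss"))
n = W + D + L
assert total / n == 175.0   # n * OurGrade equals the RHS of (1)
```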
I believe that John Nunn, many years ago, looked at the raw data; this is
a more realistic method of working out expected scores. If I recall
correctly, the conclusion reached was that playing someone 20 points
below oneself on average was the best method to boost one's grade.
Although what the pleasure is in constantly playing weaker opposition
escapes me. Another way of looking at it is that playing someone thirty
points below oneself, say, is bad news: firstly, because the game won't
be a pleasure; and secondly, if the above is correct, one's grade will
suffer.
The position of the graders is not as irrational as it may appear; the
exceptional cases mentioned, and hinted at, above do not occur in a
noticeable way. Remember, too, that the BCF grading system predates the
work of Professor Elo. If you feel strongly about it, you could
participate in the surveys being carried out by the BCF; serious
consideration is being given to a change.
Regards,
Simon.
Is that so? I acknowledge the 215 upper limit, but didn't think there was a
lower limit. I don't have source material to hand - what is yours for the 150
limit?
thanks
PB
Things change slowly in the United Kingdom.
If Great Britain were to switch to driving on the right-hand side of the
road, they'd do it gradually. They'd start by doing it just for trucks.
Bill Smythe
"Note: the current conversion assumes a minimum FIDE Elo of 2000."
And
(150 * 5) + 1250 = 2000.
Maybe things will change when FIDE go "sub-2000".
As a non-sequitur, this is a huge drop from the old correspondence of 175
BCF = 2000 FIDE.
I assume it is inclusive because this is the first figure in the table
immediately above, as well as the FIDE minimum.
The graph below the quote starts at 140 BCF on the abscissa, 2000 FIDE on
the ordinate; a visual inspection of this graph, assuming the raw data is
accurate, does tend to suggest a better fit with the new formula.
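For concreteness, here is how the two conversions look if one reconstructs them from the figures quoted in this thread (175 BCF = 2000 FIDE under the old correspondence; (150 * 5) + 1250 = 2000, with a 2000 FIDE floor, under the new one). The coefficients are my inference from those figures, not an official statement:

```python
def old_bcf_to_fide(bcf):
    # Old correspondence: 8 Elo points per BCF point, so 175 -> 2000.
    return 8 * bcf + 600

def new_bcf_to_fide(bcf):
    # New correspondence (reconstructed): 5 Elo points per BCF point,
    # so 150 -> 2000, subject to the stated minimum FIDE Elo of 2000.
    return max(5 * bcf + 1250, 2000)

assert old_bcf_to_fide(175) == 2000
assert new_bcf_to_fide(150) == 2000
assert new_bcf_to_fide(140) == 2000   # the floor applies below 150 BCF
assert new_bcf_to_fide(215) == 2325   # e.g. a 215 grade under the new formula
```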
Kind regards,
SS.
All your thoughts should be directed to his honour, the
unfortunate Henry Clay III of Texas. What kind of country is it if a
mayor can't have a quiet drink of beer without having his manhood (sic)
questioned?
Cheers,
Simon.
--------
When chapman billies leave the street,
And drouthy neebors neebors meet;
As market-days are wearing late,
An' folk begin to tak the gate;
While we sit bousing at the nappy,
An' getting fou and unco happy,
We think na on the lang Scots miles,
The mosses, waters, slaps, and styles,
That lie between us and our hame,
Whare sits our sulky, sullen dame,
Gathering her brows like gathering storm,
Nursing her wrath to keep it warm.
I don't think it's just a matter of being used to it --
it's also much simpler:
performance == average opponent + % score - 50
[ignoring, as you do, the +-40 limits and junior stuff].
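That one-liner, sketched out (assuming % score runs on a 0-100 scale, which is how BCF percentages are usually quoted):

```python
def bcf_performance(avg_opponent, percent_score):
    """performance == average opponent + % score - 50
    (ignoring the +-40 limits and junior stuff)."""
    return avg_opponent + percent_score - 50

# 50% against an average-200 field is a 200 performance:
assert bcf_performance(200, 50) == 200
# Consistent with the per-game +50/-50 rule: 6 wins and 4 losses
# against 150-graded opponents give (6*200 + 4*100) / 10 = 160.
assert bcf_performance(150, 60) == (6 * 200 + 4 * 100) / 10
```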
>One could say that the BCF grading system is logically flawed: e.g. if
>two players play a thirty game drawn match in one grading period, and
>these are the only graded games played by them, then their grades swap.
Umm. If I play 30 different opponents with an average
grade of, say, 200 and score 50%, then my grade will become 200,
which is surely the only sensible value. Why is it a *logical*
flaw if I play 1 opponent with a grade of 200 30 times and the
same happens? There is an argument that grades should "decay"
rather than "jump", but it's not logically *compelling*, just a
matter of choice. [Would you want a football league in which
last season's results slowly decayed rather than starting each
season afresh?] [And, of course, for a large majority of players,
those who play less than 30 games/year, the BCF system also
includes old grades, so it's a matter of scale rather than the
principle.]
>Also the expected score implicit in the BCF grade is not merely dubious,
>it is wrong; viz:-
> [...] Or for one game
> Expected score = {1 + GradDiff/50} / 2 - (4)
Yes, this is a rearrangement of the formula above.
> Note that (4) only makes sense for Abs(GradDiff) <= 50.
Yes; that's why there is the +-40 adjustment which we
have both ignored! There is really nothing useful to be done
with games where the players are too far apart, and certainly
nothing meaningful as far as grades are concerned. I don't know
whether I'm unusual in this respect, but I scarcely ever play
people that far away anyway -- perhaps once every five-ish years
-- so it's not going to affect my grade much no matter what.
>I believe that John Nunn, many years ago, looked at the raw data; this is
>a more realistic method of working out expected scores.
There was an article in BCM. His basic idea, IIRC, was
to list your opponents in grade order, move all the wins to the
"weak" end and the losses to the "strong" end, then the middle
of the draws was your performance.
> If I recall
>correctly, the conclusion reached was that playing someone 20 points
>below oneself on average was the best method to boost one's grade.
Personally, (a) those are the people I perform worst
against, and (b) what is the point in boosting one's grade to
the point where you're obviously not as good as your grade? ...
>Although what the pleasure is in constantly playing weaker opposition
>escapes me. Another way of looking at it is that playing someone thirty
>points below oneself, say, is bad news: firstly, because the game won't
>be a pleasure; and secondly, if the above is correct, one's grade will
>suffer.
... After all, these people who are *now* 30 points
below you are the ones who *used* to be 20 points below and
who helped you to boost your grade by 10 points!
>The position of the graders is not as irrational as it may appear; the
>exceptional cases mentioned, and hinted at, above do not occur in a
>noticeable way. Remember, too, that the BCF grading system predates the
>work of professor Elo. [...]
I certainly wouldn't claim that the BCF system is in
any way perfect, but it does have considerable merits. It's
very simple, and stable; anyone can work out their own grade
in a few moments from the above formula. My personal view is
that Elo has been oversold. It doesn't really offer anything
better to compensate for the complexity, which just gives a
spurious accuracy. It might be quite fun to calculate that in
a 10000-game match against Kasparov I expect to score 1.5 points,
or whatever [probably on the days when he's ill], but it's also
quite meaningless. I can see the advantages of "Glicko" for
doing retrospectives, but not for the day-to-day business of
calculating grades/ratings for ordinary chessplayers to use to
assess their own performance or to be in awe of or contemptuous
of their opponents.
--
Andy Walker, School of MathSci., Univ. of Nott'm, UK.
a...@maths.nott.ac.uk
Many thanks for your follow up.
> Umm. If I play 30 different opponents with an average
> grade of, say, 200 and score 50%, then my grade will become 200,
> which is surely the only sensible value. Why is it a *logical*
> flaw if I play 1 opponent with a grade of 200 30 times and the
> same happens? There is an argument that grades should "decay"
> rather than "jump", but it's not logically *compelling*, just a
> matter of choice. [Would you want a football league in which
> last season's results slowly decayed rather than starting each
> season afresh?] [And, of course, for a large majority of players,
> those who play less than 30 games/year, the BCF system also
> includes old grades, so it's a matter of scale rather than the
> principle.]
I had better explain what I mean here. "Tal" with a BCF grade of 270,
somewhat implausibly, plays a thirty game match with "Nezhmetdinov",
whose BCF is 245. Let us further suppose that the match is drawn: "Tal"
is now graded 245 and "Nezhmetdinov" 270. I find this irrational; or, as
you say, not logically compelling, but freakish nonetheless.
Also, I'm not sure what their grades should be in such a circumstance.
> Yes; that's why there is the +-40 adjustment which we
> have both ignored! There is really nothing useful to be done
> with games where the players are too far apart, and certainly
> nothing meaningful as far as grades are concerned. I don't know
> whether I'm unusual in this respect, but I scarcely ever play
> people that far away anyway -- perhaps once every five-ish years
> -- so it's not going to affect my grade much no matter what.
>
I play in the London, Middlesex, and 4NCL leagues. In the Middlesex
league, there are huge discrepancies in playing strength; of the thirteen
games played by me out of fourteen, I outgraded my opponent on ten
occasions (assuming one can describe a chess match as an occasion!). The
differences were:
+26, +6, +41, +43, +26, +49, +44, -2, -6, default win (fortunately, to
judge by the rest of their team it would have been another +50 job), -15,
+37, +45, +61. In some eight-board matches our board seven outgraded
their board one. In fact, on any board other than one, it is normal to
heavily outgrade the opponent. I tried to introduce a rolling scheme to
share the "plums" a bit, but the protests were too vociferous.
I suspect the problem is that too much strength is concentrated in the
top three teams; however, is this so unusual? It may be, since some London
clubs hire professionals (which is why we can't win the thing).
One big reason not to change the rating system is that the scope for
manipulation is less by definition; was it fifteen years ago that FIDE
introduced the "improvement" that the winner of an international
tournament couldn't lose rating points?
For myself, I don't hold a strong brief either way. There are too many
imponderables ("good opponent", "favourite opening", tiredness, and so
on) for one number to mean all that much. Nor do I believe that
switching to a "category" system is worth the effort. A final point is
that any form of UK chess organisation is a thankless, profitless task;
thus the opinion of the mass of graders should be given extra weight, a
consideration painfully ignored in the recent past.
> ...
> I don't think it's just a matter of being used to it --
> it's also much simpler:
>
> performance == average opponent + % score - 50
"Things should be as simple as possible...but not simpler" A. Einstein.
>
> [ignoring, as you do, the +-40 limits and junior stuff].
Well then, when these are included...how does the system look on the
"simplicity" measure?
>
> >One could say that the BCF grading system is logically flawed: e.g. if
> >two players play a thirty game drawn match in one grading period, and
> >these are the only graded games played by them, then their grades swap.
>
> Umm. If I play 30 different opponents with an average
> grade of, say, 200 and score 50%, then my grade will become 200,
> which is surely the only sensible value. Why is it a *logical*
> flaw if I play 1 opponent with a grade of 200 30 times and the
> same happens?
Because these are not analogous situations.
In the first case, it is an acceptable approximation to say that the
*other* players strengths remain constant, while the single player's
strength changes dramatically.
In the second case, this approximation is simply not justified.
*Particularly* when it is applied both ways!
> There is an argument that grades should "decay"
> rather than "jump", but it's not logically *compelling*, just a
> matter of choice.
I find it compelling.
> [Would you want a football league in which
> last season's results slowly decayed rather than starting each
> season afresh?]
Yes.
> ...
> Yes; that's why there is the +-40 adjustment which we
> have both ignored! There is really nothing useful to be done
> with games where the players are too far apart, and certainly
> nothing meaningful as far as grades are concerned.
Really? I disagree.
> I don't know
> whether I'm unusual in this respect, but I scarcely ever play
> people that far away anyway -- perhaps once every five-ish years
> -- so it's not going to affect my grade much no matter what.
Perhaps the design of a rating system should be based on more than your
personal experience.
> ...
--
Kenneth Sloan sl...@uab.edu
Computer and Information Sciences (205) 934-2213
University of Alabama at Birmingham FAX (205) 934-5473
Birmingham, AL 35294-1170 http://www.cis.uab.edu/info/faculty/sloan/
>a...@merlot.uucp (Dr A. N. Walker) writes:
>
>> ...
>> I don't think it's just a matter of being used to it --
>> it's also much simpler:
>>
>> performance == average opponent + % score - 50
>
>"Things should be as simple as possible...but not simpler" A. Einstein.
>
>>
>> [ignoring, as you do, the +-40 limits and junior stuff].
>
>Well then, when these are included...how does the system look on the
>"simplicity" measure?
Not much different. The +-40 limit says that if you play an opponent
more than 40 points higher than you, you assume he is exactly 40
points higher than you for the grading calculation. Similarly, a
player more than 40 points lower is assumed to be exactly 40 points
lower. This stops you increasing your grade by losing to a much
stronger player or lowering your grade by beating a much weaker
player.
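That clamping rule is a one-liner; a sketch (the names and the 1.0/0.5/0.0 result encoding are mine):

```python
def game_grade(own_grade, opp_grade, result):
    """Grade earned for one game; result is 1.0 (win), 0.5 (draw),
    0.0 (loss). The opponent is treated as at most 40 points above
    or below you for the grading calculation."""
    opp = max(own_grade - 40, min(own_grade + 40, opp_grade))
    return opp + (result - 0.5) * 100   # +50 win, +0 draw, -50 loss

# Losing to someone 60 points higher no longer raises your grade:
assert game_grade(150, 210, 0.0) == 140.0   # opponent clamped to 190
# Beating someone 60 points lower no longer lowers it:
assert game_grade(150, 90, 1.0) == 160.0    # opponent clamped to 110
```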
The junior stuff adds no complexity at all as it does not affect the
calculation of your grade. When the BCF calculate a junior's grade,
their published grade is enhanced by a few points (dependent on age)
from their calculated grade to take account of their expected
improvement over the next year. The enhanced grade is used in all
grading calculations and there is no need to know what the enhancement
was to calculate your own grade.
>
>>
>> >One could say that the BCF grading system is logically flawed: e.g. if
>> >two players play a thirty game drawn match in one grading period, and
>> >these are the only graded games played by them, then their grades swap.
>>
>> Umm. If I play 30 different opponents with an average
>> grade of, say, 200 and score 50%, then my grade will become 200,
>> which is surely the only sensible value. Why is it a *logical*
>> flaw if I play 1 opponent with a grade of 200 30 times and the
>> same happens?
>
>Because these are not analogous situations.
>
>In the first case, it is an acceptable approximation to say that the
>*other* players strengths remain constant, while the single player's
>strength changes dramatically.
>
>In the second case, this approximation is simply not justified.
>*Particularly* when it is applied both ways!
>
>> There is an argument that grades should "decay"
>> rather than "jump", but it's not logically *compelling*, just a
>> matter of choice.
Is the Elo system not flawed as well? Suppose a player rated 2300
plays 10 games against opposition averaging 2400 and scores 50%. If
the rating period ends at this point his rating is recalculated and
becomes 2321. If he then plays another 10 games in the next rating
period against opposition averaging 2400 and scores another 50%, his
rating changes to 2338 at the end of the second rating period. On the
other hand, if he plays all 20 games in the first rating period and
none in the second rating period his rating will be 2342 at the end of
both rating periods. It cannot be sensible that identical results give
different ratings dependent on the arbitrary points at which ratings
are recalculated.
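Those figures can be reproduced with the standard Elo update. The post doesn't say which K-factor is assumed; K = 15 (a common FIDE value at the time) matches them:

```python
def expected(own, opp):
    """Standard Elo expected score for one game."""
    return 1 / (1 + 10 ** ((opp - own) / 400))

def update(rating, opp, games, score, k=15):
    """Rate a batch of `games` games against average opposition `opp`,
    with total score `score`, in one calculation."""
    return round(rating + k * (score - games * expected(rating, opp)))

# Two 10-game periods, 50% each time:
r1 = update(2300, 2400, 10, 5.0)        # -> 2321
r2 = update(r1, 2400, 10, 5.0)          # -> 2338
# All 20 games rated in a single period:
r_all = update(2300, 2400, 20, 10.0)    # -> 2342
assert (r1, r2, r_all) == (2321, 2338, 2342)
```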
>
>I find it compelling.
>
>> [Would you want a football league in which
>> last season's results slowly decayed rather than starting each
>> season afresh?]
>
>Yes.
>
>> ...
>> Yes; that's why there is the +-40 adjustment which we
>> have both ignored! There is really nothing useful to be done
>> with games where the players are too far apart, and certainly
>> nothing meaningful as far as grades are concerned.
>
>Really? I disagree.
Surely both Elo and BCF systems only give statistically reliable
results if you play a variety of opponents, both stronger and weaker
than yourself, and not too different in strength from you. For the BCF
system, "too different" means more than 40 points, or an expected
score against the opponent of <10% or >90%.
>
>> I don't know
>> whether I'm unusual in this respect, but I scarcely ever play
>> people that far away anyway -- perhaps once every five-ish years
>> -- so it's not going to affect my grade much no matter what.
>
>Perhaps the design of a rating system should be based on more than your
>personal experience.
>
>> ...
Ian Thompson
Surely the 40 cutoff is just a hack to overcome a limitation of the
system, namely mapping what is probably a normal distribution to a linear
function.
> This stops you increasing your grade by losing to a much
> stronger player or lowering your grade by beating a much weaker
> player.
Objectively, one can play better when losing to a stronger opponent than
when beating a weaker one, but how to measure it? The answer must lie in
a series of games. For example, suppose someone graded 170 loses to a
225-graded opponent: his grade for that game is 160 (the opponent counts
as 210 under the +-40 band). Suppose, further, that he has a really good
season which comes out at 190; then perhaps he should have lost at 175
strength (the unclamped 225 - 50)? The system works here because one
game hardly matters, but, in this instance, it would work with or
without the +-40 band.
> The junior stuff adds no complexity at all as it does not affect the
> calculation of your grade.
This is simply untrue: there is a very minor, barely significant, boost
to adult grades. The BCF enhancements are:
    10 points for juniors under 11
     8 points for ages 11 - 14
     6 points for ages 15 - 17.
As you state, these are added to a junior's results and this final
figure is published. One's grade does depend upon how many juniors one
plays as a proportion of one's graded games; and I believe it to be
obvious that playing several of them with the bonus included results in
a grade one or two points higher than if the bonus did not exist.
Although to me, and I'm sure to most chess players in England, 120 is
the same as 121, and 200 is the same as 201.
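The size of the effect is easy to bound: if the enhancement is baked into a junior opponent's published grade, each game against that junior credits you with the bonus on top of his "raw" strength, so the inflation of your own grade is just (games vs juniors / total games) times the bonus. A sketch (age cut-offs as listed above; the exact boundary handling is my assumption):

```python
def junior_enhancement(age):
    """BCF enhancement added to a junior's calculated grade."""
    if age < 11:
        return 10
    if age <= 14:
        return 8
    if age <= 17:
        return 6
    return 0

def own_grade_inflation(games_vs_juniors, total_games, bonus):
    """Extra points on your own grade from opponents' baked-in bonuses."""
    return games_vs_juniors * bonus / total_games

assert junior_enhancement(10) == 10
assert junior_enhancement(13) == 8
assert junior_enhancement(16) == 6
assert junior_enhancement(20) == 0
# Five games against 11-14 year-olds out of thirty graded games:
assert round(own_grade_inflation(5, 30, 8), 1) == 1.3   # "one or two points"
```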
> Is the Elo system not flawed as well? Suppose a player rated 2300
> plays 10 games against opposition averaging 2400 and scores 50%. If
> the rating period ends at this point his rating is recalculated and
> becomes 2321. If he then plays another 10 games in the next rating
> period against opposition averaging 2400 and scores another 50%, his
> rating changes to 2338 at the end of the second rating period. On the
> other hand, if he plays all 20 games in the first rating period and
> none in the second rating period his rating will be 2342 at the end of
> both rating periods. It cannot be sensible that identical results give
> different ratings dependent on the arbitrary points at which ratings
> are recalculated.
Why should the results be identical when they occur in different time
periods with different starting grades? Intuitively, they should be
similar, which, according to your calculations, they are, but no more.
Also, one could argue that 50% against stronger opponents over 20 games
is better than the same percentage over 10 in the same period of time.
This strikes me as perfectly sensible, even in the light of too small a
sample of games.
> Surely both Elo and BCF systems only give statistically reliable
> results if you play a variety of opponents , both stronger and weaker
> than yourself, and not too different in strength to you. For the BCF
> system, "too different" means more than 40 points, or an expected
> score against the opponent of <10% or >90%.
That is your opinion; however, if you are right then it would be better
not to grade such mismatched games at all, not that that would be
popular.
Simon.
The issue of rating games one at a time, vs in large batches, is a
legitimate one, but has nothing to do with Elo as such. ANY rating system
can rate games either singly or in batches, with slightly different results.
USCF rates one TOURNAMENT at a time, which is a reasonable compromise in
most cases.
Bill Smythe
This is because there are too many games in a single rating period.
The same would be true of Elo systems. The solution (with Elo, BCF, or
anything else) is to rate the games one at a time, or a few at a time.
USCF rates one tournament at a time (typically 4-7 games). In Elo
terminology, the K-factor should never exceed 800 / N, where N is the number
of games being rated, else the new rating will actually overshoot the
performance rating, as in your example.
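The 800 / N bound falls straight out of the linear approximation to the expected score (slope 1/800 per rating point, as in the USCF-style formulas); the code below is my illustration, not USCF's:

```python
def new_rating_linear(old, opp, games, score, k):
    """Batch update using the linear per-game expected score
    E = 0.5 + (own - opp) / 800."""
    e = 0.5 + (old - opp) / 800
    return old + k * (score - games * e)

def performance_linear(opp, games, score):
    """Linear performance rating: average opponent + 800 * (pct - 0.5)."""
    return opp + 800 * (score / games - 0.5)

# A 30-game drawn match between players 200 points apart:
old, opp, n, s = 2700, 2500, 30, 15.0
perf = performance_linear(opp, n, s)                       # 2500.0
assert abs(new_rating_linear(old, opp, n, s, 800 / n) - perf) < 1e-9
assert new_rating_linear(old, opp, n, s, 32) < perf        # K > 800/N overshoots
assert perf < new_rating_linear(old, opp, n, s, 16) < old  # K < 800/N: in between
```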
Bill Smythe
> USCF rates one TOURNAMENT at a time, which is a reasonable compromise in
> most cases.
>
> Bill Smythe
In England a lot of chess is played in leagues rather than in
tournaments, each club fielding one or more teams against other clubs, so
an annual grade is more realistic.
Leagues make sense over a small geographical area. In a mini-continent,
such as the USA, playing in tournaments is a better option: I still
remember blanching during one of my visits to the States when it was
suggested we drive fifty miles to a restaurant as though it were next
door.
Regards,
Simon.
---------
This truth fand honest Tam O'Shanter,
As he frae Ayr ae night did canter:
(Auld Ayr, wham ne'er a town surpasses,
For honest men and bonie lasses.)
But then, a smaller K-factor (or whatever the equivalent terminology is for
BCF) should be used for players with a lot of games. Perhaps BCF isn't
doing this, so that post-period ratings are overshooting the performance
ratings.
Bill Smythe
I disagree with you, the problem is the dearth of games against other
players, which would ordinarily at least partly compensate for the
"Indian sign". I wouldn't pretend to know how any system copes with this.
My instinct tells me that the grading difference should narrow, although
the gap in favour of "Tal" shouldn't vanish, but I could be wrong.
One merit of the BCF system is that it is suited to a large number of
games.
> The same would be true of Elo systems.
Yes, roughly true. Take the Scottish Elo system: as I understand it, the
Scots use the following formula:

    New Grade = Old Grade + 800 * (Total Actual Points - Total Expected Points) / Number of Games Played

This is for graded players playing a minimum of thirty games.
The table they use looks like a normal distribution (see web page
http://www.users.globalnet.co.uk/~sca/ ); I can't be bothered to check,
so, to illustrate your point:
    200 gap = 0.757 expected for the stronger player; call it 3/4 so I
can do it in my head: then the grades roughly swap [e.g. 2700 -> 2500].
    100 gap = 0.636 (don't look, I'm calling it 5/8): then the stronger
player drops roughly 100.
    300 gap = 0.851 (can I call that 7/8?): then "Tal" loses roughly 300.
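Those back-of-the-envelope figures check out. Substituting the logistic curve for the Scottish lookup table (my approximation; it gives 0.760, 0.640 and 0.849 rather than the quoted 0.757, 0.636 and 0.851):

```python
def expected(own, opp):
    # Logistic stand-in for the Scottish expected-score table.
    return 1 / (1 + 10 ** ((opp - own) / 400))

def scottish_update(old, opp, games, score):
    """New Grade = Old Grade + 800 * (Actual - Expected) / Games."""
    return old + 800 * (score - games * expected(old, opp)) / games

# Thirty-game drawn matches; drop suffered by the stronger player is
# roughly the gap itself, so the grades roughly swap:
for gap, drop in [(200, 208), (100, 112), (300, 279)]:
    new = scottish_update(2700, 2700 - gap, 30, 15.0)
    assert round(2700 - new) == drop
```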
On balance I would have to agree with Dr Walker when he asserts, assuming
I have understood him correctly, that the extra effort for scant reward
hardly makes the change worthwhile; although the extremely large number
of mismatches is a problem, and not just in grading terms. I was not
amused at having to drag myself across London with a busted foot to play
off an adjournment when the result was dead obvious - roll on Quickplays!
There may be more than one problem here.
In the specific example you cited, the two players exchanged ratings after a
lengthy drawn match. Clearly, this could happen only if the K-factor were
too high, or too many games were being rated at the same time. The new
rating overshot the performance rating.
The new rating should always come out somewhere between the old rating and
the performance rating. In your example, the most that should have happened
was that both players would end up with the same rating, halfway between the
players' old ratings.
In USCF or Elo terms, the K-factor should never exceed 800 / N, where N is
the number of games currently being rated. Evidently (unless you are
misinterpreting something) the people who run the BCF system do not
understand this.
> One merit of the BCF system is that it is suited to a large number of
> games.
Not if they don't limit the K-factor (or whatever the equivalent BCF
terminology is).
Of course, you are correct that ratings are more accurate if most of a
player's opponents' ratings are reasonably close to his own. But even then,
with a large number of games, the K-factor should not be allowed to run
wild.
Bill Smythe
Yes. In the USA, virtually all tournaments are now played with a sudden
death final time control. About the slowest you'll find is 40 moves in 2
hours, then SD in 1 hour.
Bill Smythe
As for my misinterpreting things, I couldn't possibly comment.
Simon.
--------------
O Tam, had'st thou but been sae wise,
As taen thy ain wife Kate's advice!
She tauld thee weel thou was a skellum,
A blethering, blustering, drunken blellum;
That frae November till October,
Ae market-day thou was nae sober;
That ilka melder wi' the miller,
Thou sat as lang as thou had siller;
That ev'ry naig was ca'd a shoe on,
The smith and thee gat roaring fou on;
That at the Lord's house, even on Sunday,
Thou drank wi' Kirkton Jean till Monday.
She prophesied, that, late or soon,
Thou would be found deep drown'd in Doon,
Or catch'd wi' the warlocks in the mirk
By Alloway's auld, haunted kirk.
> In article <3d5d78c5...@news.freeserve.net>,
> I...@crookham.freeserve.co.uk says...
> > >
> > >Well then, when these are included...how does the system look on the
> > >"simplicity" measure?
> >
> > Not much different. The +-40 limit says that if you play an opponent
> > more than 40 points higher than you, you assume he is exactly 40
> > points higher than you for the grading calculation. Similarly, a
> > player more than 40 points lower is assumed to be exactly 40 points
> > lower.
>
> Surely the 40 cutoff is just a hack to overcome a limitation of the
> system, namely mapping what is probably a normal distribution to a linear
> function.
In the beginning, there is the 'S' curve.
Seekers after simplicity replace this with a line (matching the central
portion of the 'S' curve).
When the absurdity of this is pointed out, they truncate the line and
add two more (horizontal) lines, offset slightly from the asymptotes of
the original 'S' curve.
Leaving us with the "simpler" three-line approximation to the single
(oh, so complicated) 'S' curve.
Piecewise-linear approximations to "complicated" curves are not
necessarily bad things. The "explanations" that inevitably follow are,
however, pure drivel.
Read Elo.
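Sloan's three-line picture, sketched numerically (the 1/800 central slope and the 0.05 / 0.95 caps are illustrative choices of mine, not anyone's official constants):

```python
def s_curve(d):
    """The logistic 'S' curve for a rating difference d."""
    return 1 / (1 + 10 ** (-d / 400))

def three_lines(d):
    """Central line of slope 1/800 through (0, 0.5), truncated by two
    horizontal lines offset slightly from the asymptotes 0 and 1."""
    return min(0.95, max(0.05, 0.5 + d / 800))

assert three_lines(0) == s_curve(0) == 0.5        # they agree in the middle
assert abs(three_lines(100) - s_curve(100)) < 0.02
assert three_lines(500) == 0.95                   # the horizontal cap
assert 0 < s_curve(500) < 1                       # the true curve never saturates
```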
Amen.
Bill Smythe
Nevertheless, I'd bet dollars to donuts that the BCF system really is an Elo
system in disguise, or at least a linear approximation of it, whether you
realize it or not.
Many rating systems have been developed based on linear formulas, with an
upper limit on the rating difference to avoid absurdities. The people who
develop these systems may have little mathematical sophistication, but are
nevertheless, purely by accident, hitting on something close to mathematical
validity.
In fact, a popular approximation to the USCF system with K-factor 32 is the
old "16-plus-or-minus-4%" formula. In this case, it is necessary to limit
rating differences to 400, in order to prevent a player from gaining rating
points by losing a game, or losing points by winning. Sometimes the
limit was set to 350 instead, to ensure that, even with a large rating
difference, the higher-rated player would gain a point or two by
defeating a much lower-rated player.
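The equivalence described here is exact: "16 plus or minus 4%" is K = 32 against a linear expected score of slope 1/800. A sketch with the 400-point cap:

```python
def change_16_4(score, diff):
    """Per-game rating change under the old approximation: a winner
    gains 16 minus 4% of his rating advantage (diff = own - opponent),
    with the difference capped at 400 points."""
    diff = max(-400.0, min(400.0, diff))
    return 32 * (score - 0.5) - 0.04 * diff

def change_k32_linear(score, diff):
    """K = 32 against the linear expected score E = 0.5 + diff / 800."""
    diff = max(-400.0, min(400.0, diff))
    return 32 * (score - (0.5 + diff / 800))

# The two formulas agree everywhere:
for score in (1.0, 0.5, 0.0):
    for diff in (-500, -200, 0, 150, 500):
        assert abs(change_16_4(score, diff) - change_k32_linear(score, diff)) < 1e-9

# The cap stops a win against a much weaker player from losing points:
assert abs(change_16_4(1.0, 500)) < 1e-9   # 16 - 4% of 400 = 0
```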
But it's still Elo, whether you like it or not. I quote from "The Rating of
Chess Players Past and Present", Elo's classic 1978 work:
"1.13. In the chess world, rating systems have been used with varying
degrees of success for over twenty-five years. Those which have survived
share a common principle in that they combine the percentage score achieved
by a player with the rating of his competition. They use similar formulae
for the evaluation of performances and differ mainly in the elaboration of
the scales. The most notable are the Ingo (Hoesslinger 1948), the Harkness
(Harkness 1956), and the British Chess Federation (Clarke 1957) systems.
These received acceptance because they produced ranking lists which
generally agreed with the personal estimates made by knowledgeable chess
players."
I also quote from Professor Ken Sloan's recent post:
"In the beginning, there is the 'S' curve. .... Seekers after simplicity
replace this with a line matching the central portion of the 'S' curve.
.... When the absurdity of this is pointed out, they truncate the line and
add two more (horizontal) lines, offset slightly from the asymptotes of the
original 'S' curve. .... Leaving us with the 'simpler' three-line
approximation to the single (oh, so complicated) 'S' curve. ....
Piecewise-linear approximations to 'complicated' curves are not necessarily
bad things. The 'explanations' that inevitably follow are, however, pure
drivel. ....
Read Elo."
In other words, those who start out with linear formulas, then limit rating
differences to avoid absurd results, are, without even realizing it, setting
up an approximation to an Elo-type system. The K-factor is implicit
therein, whether it goes by that name or another. And it's important to
limit the K-factor, and/or the number of games rated in a batch, so that the
new rating does not overshoot the performance rating.
In the linear approximation to the USCF system, for example, if a player has
played many games, the K-factor can, essentially, be cut from 32 to 16 by
using an 8-plus-or-minus-2% approximation, in place of 16 and 4%.
If the recent posts contain any truth at all, it appears to me that BCF
desperately needs to recognize what its K-factor is, and make an adjustment
when appropriate.
And I agree with Professor Sloan -- you should read Elo.
Bill Smythe
> ...
> And I agree with Professor Sloan -- you should read Elo.
>
> Bill Smythe
Apparently, reading Elo is not sufficient.
The BCF system is *not* Elo.
0) take out your copy of Elo
1) open to the Index
2) find "British Chess Federation" in the index
3) read the sections cited
For those who might wish to defend the current BCF system, I would
greatly appreciate a gloss on why Elo's paragraphs 8.56 and 8.57 do not
apply.
For those who believe that BCF = "Elo of some sort", I say "Read Elo
(again, if necessary)"
That may depend on your definition.
If a linear approximation (three lines, as in your last post) is used
instead of the S-curve, does that, according to your definition, make a
system not Elo, even though it's still fairly close?
Also, if periodic performance ratings are used, instead of the Elo formula
for established players, does that make the system not Elo, even if
performance ratings are calculated in essentially the same way?
I would think that your answer to at least one of the above questions would
have to be Yes, to support your contention that BCF is not Elo.
> 0) take out your copy of Elo
> 1) open to the Index
> 2) find "British Chess Federation" in the index
> 3) read the sections cited
I did all of that before my last post. There were three sections cited.
The first section, which I quoted in its entirety, seems to suggest that BCF
is an Elo system, at least approximately. The second section points out
that BCF uses only performance ratings, eschewing the formula for
established ratings. The third section seems to bemoan the linear
approximation. So it boils down, again, to the two questions I asked above.
Bill Smythe
Your grade depends on the true (unknown) strength of your opponents
compared to their grades (which are only an estimate of that true
strength). If, on average, you play opponents whose true strength is
higher than their grade, then your grade will go down (assuming your
standard of play remains constant). This is what the junior
enhancement is designed to combat, because it is known that juniors
generally improve, so their grade (without enhancement) is an
underestimate of their true strength. If there were no junior
enhancement, there would be deflation in adult grades over time.
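That effect can be seen with the basic BCF formula quoted earlier in the
thread (your grade for the period is the average of opponent's grade +50,
+0, or -50 per win, draw, or loss). The sketch below ignores the junior
enhancement and the +-40 band, and the "one grade point equals one
percentage point" scoring model is an illustrative assumption.

```python
def bcf_grade(results):
    """Basic BCF performance grade: for each game, the opponent's grade
    plus 50 for a win, 0 for a draw, minus 50 for a loss, averaged over
    all games. Junior enhancement and the +-40 band are ignored here."""
    total = 0.0
    for opp_grade, score in results:   # score: 1 win, 0.5 draw, 0 loss
        total += opp_grade + 100.0 * (score - 0.5)
    return total / len(results)

# Toy model: your true strength is 150; your opponent is truly 140 but
# under-graded at 130. Being truly 10 points stronger, you score 60%,
# e.g. 6 wins and 4 losses in 10 games -- yet your computed grade comes
# out at 140, ten points below your true strength, because the opponent's
# under-grading is passed straight on to you.
games = [(130, 1)] * 6 + [(130, 0)] * 4
print(bcf_grade(games))
```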
>> Is the Elo system not flawed as well? Suppose a player rated 2300
>> plays 10 games against opposition averaging 2400 and scores 50%. If
>> the rating period ends at this point his rating is recalculated and
>> becomes 2321. If he then plays another 10 games in the next rating
>> period against opposition averaging 2400 and scores another 50%, his
>> rating changes to 2338 at the end of the second rating period. On the
>> other hand, if he plays all 20 games in the first rating period and
>> none in the second rating period his rating will be 2342 at the end of
>> both rating periods. It cannot be sensible that identical results give
>> different ratings dependent on the arbitrary points at which ratings
>> are recalculated.
>
>Why should the results be identical when they occur in different time
>periods with different starting grades? Intuitively, they should be
>similar, which, according to your calculations, they are, but no more.
>Also, one could argue that 50% against stronger opponents over 20 games
>is better than the same percentage over 10 in the same period of time.
>This strikes me as perfectly sensible, even in the light of too small a
>sample of games.
>
The results should be the same because you have had identical results
against opponents of exactly the same strength. Change the example
slightly. Take two players, both initially rated 2300. One plays the
20 games above in a single rating period and ends up with a rating of
2342. The other plays the games over two rating periods and ends up
with a rating of 2338. Is the first player a better player than the
second player? Obviously not, so their ratings should still be the
same.
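For what it's worth, the figures in the quoted example are reproduced
exactly by the logistic expected score with K = 15, rounding the published
rating to a whole point at the end of each period. Both K and the rounding
rule are assumptions inferred from the numbers, not taken from the original
post.

```python
def expected(player, opponent):
    """Elo logistic expected score for a single game."""
    return 1.0 / (1.0 + 10.0 ** ((opponent - player) / 400.0))

def new_rating(rating, opp_rating, games, score, k=15):
    """One rating-period update, rounded to a whole point at publication.
    K = 15 and the per-period rounding are assumptions that happen to
    reproduce the quoted figures."""
    return round(rating + k * (score - games * expected(rating, opp_rating)))

r1 = new_rating(2300, 2400, 10, 5.0)       # first period
r2 = new_rating(r1, 2400, 10, 5.0)         # second period
r_all = new_rating(2300, 2400, 20, 10.0)   # all 20 games in one period
print(r1, r2, r_all)                       # 2321 2338 2342
```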
>> Surely both Elo and BCF systems only give statistically reliable
>> results if you play a variety of opponents, both stronger and weaker
>> than yourself, and not too different in strength to you. For the BCF
>> system, "too different" means more than 40 points, or an expected
>> score against the opponent of <10% or >90%.
>
>That is your opinion; however, if you are right then it would be better
>not to grade such mismatched games at all, not that that would be
>popular.
>
>Simon.
Ian Thompson
>>"Things should be as simple as possible...but not simpler" A. Einstein.
Note that the BCF system has been in use for roughly
50 years with only very limited modifications; it has been
widely -- too widely! -- accepted by the players, it is very
stable, and it is easy for the players to self-check their
own grades. For me, that amounts to a proof-by-example that
it is not "simpler than possible", but is a good working system.
>>> [...] There is really nothing useful to be done
>>> with games where the players are too far apart, and certainly
>>> nothing meaningful as far as grades are concerned.
>>Really? I disagree.
For any good reason? What meaningful result could
be extracted from a [mis-]match between me and Kasparov,
or between me and J Rank Beginner? If Kasparov wins 10-0,
is that not so utterly expected as to have no significance
at all as far as my grade is concerned? If, mirabile dictu,
I win a game, is it not *far* more likely [as a matter of
chess rather than of statistics] that Kasparov was ill, or
did something *really* stupid, or lost concentration after
a series of trivial wins, or even that the result was a
"fix", than that suddenly I played a game at super-GM level?
Is losing 0-10 to Kasparov somehow a better result for me
than losing 0-10 to Anand or to Khalifman, since Kasparov
has the highest rating? Phooey! Any such results are just
froth, a distraction from assessing my [or Kasparov's] grade.
>>> I don't know
>>> whether I'm unusual in this respect, but I scarcely ever play
>>> people that far away anyway -- perhaps once every five-ish years
>>> -- so it's not going to affect my grade much no matter what.
>>Perhaps the design of a rating system should be based on more than your
>>personal experience.
Well, the BCF system was designed with no help from me
at all, though I'd like to think that my experiences and comments
have fed in to one or two of the minor modifications. But *unless*
my experiences are wildly untypical, then the fact remains that
doing something more or less clever with the handful of outliers
is not going to affect grades very much, and it is not worth losing
any sleep over the details.
Well, to win or lose your bet, we'd need to know what you
mean by "an Elo system". If you're desperate for your doughnuts,
then you could presumably claim that any rating system whatsoever
that produced results recognisably correlated with the results
produced by a "pure" Elo system was Elo-in-disguise. But the
BCF system was certainly not *influenced* by Elo in any way, and
it has been working pretty well over the decades.
>Many rating systems have been developed based on linear formulas, with an
>upper limit on the rating difference to avoid absurdities. The people who
>develop these systems may have little mathematical sophistication, but are
>nevertheless, purely by accident, hitting on something close to mathematical
>validity.
In the case of the BCF, that's an unnecessary [and false]
slur on the people concerned; and it's no accident that it works.
I can't speak for the other systems developed at around the same time.
>If the recent posts contain any truth at all, it appears to me that BCF
>desperately needs to recognize what its K-factor is, and make an adjustment
>when appropriate.
This is a system which has worked well for 50 years!
There may be some scope for minor tinkering, but there is
absolutely nothing "desperately" needed by way of adjustment.
It may well be that *if* someone plays and draws a 30-game match
against a much stronger opponent, *and* these are the only games
that either player plays during the entire season, *then* the
resulting grades are not what you or Ken might expect; but as
this has never ever happened in real life, it's irrelevant.
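For concreteness, the grade-swap in that hypothetical is easy to exhibit
with the basic formula (junior enhancement and the +-40 band ignored; the
grades 150 and 120 are illustrative):

```python
# One grading period containing only a 30-game drawn match.
# Basic BCF formula: grade = average over games of (opponent's grade
# +50 per win, +0 per draw, -50 per loss).
stronger, weaker = 150, 120
games = 30

# Every game is a draw, so each player's performance grade is simply the
# average of the opponent's grade over the period:
weaker_new = sum([stronger + 0] * games) / games
stronger_new = sum([weaker + 0] * games) / games
print(stronger_new, weaker_new)   # the two grades have swapped
```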