http://www.evanmiller.org/how-not-to-sort-by-average-rating.html
Evan Miller's (E.B.Wilson's) idea is interesting but
I think it is not the right idea for range voting quorum purposes.
Also, it quite likely is not the right idea for Miller's web-page purposes
either, in many situations anyhow.
The reason I say that: suppose Candidate Joe
recruits 100 rabid supporters and nobody else has heard of him.
Joe gets 100 scores of 9 on a 0-9 scale, and that is all.
That average is 9, but if every candidate had 100 fake zero scores
before voting starts, Joe's average would be only 4.5.
Meanwhile well known Candidate Bob gets millions of scores averaging 7.7
and this is affected negligibly by the 100 fake zero scores.
I think in that case Bob ought to win.
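The fake-vote arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name smoothed_average and its parameters are my own labels, and Bob's "millions" is taken as 2,000,000 purely as an assumed figure.

```python
def smoothed_average(total_score, num_votes, prior_votes=100, prior_score=0.0):
    """Average after mixing in `prior_votes` fake votes of value `prior_score`."""
    return (total_score + prior_votes * prior_score) / (num_votes + prior_votes)

# Joe: 100 real votes, all 9s, plus the 100 fake zeros.
joe = smoothed_average(total_score=100 * 9, num_votes=100)

# Bob: assumed 2,000,000 votes averaging 7.7 (the count is illustrative only).
bob_votes = 2_000_000
bob = smoothed_average(total_score=7.7 * bob_votes, num_votes=bob_votes)

print(f"Joe: {joe:.3f}")
print(f"Bob: {bob:.3f}")
```

Joe lands at 4.5 while Bob barely moves from 7.7, so Bob wins under this rule.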
OK... in such a situation, the Wilson/Miller formula
[which involves a term z/4n, meaning what? (z/4)*n or z/(4*n)?
I'm assuming the latter]
is, I think, going to give Joe a score of about 8.9.
[Other issues with the Wilson/Miller formula: why does Miller
say alpha/2 when he could say alpha? I guess it does not matter,
but it made stating his formula more complicated for no reason.
Also, the way he wrote his formula, it suffers from a lot of numerical
cancellation error, which he could avoid with the usual conjugacy trick
(multiply top and bottom by the conjugate) that is standard for such
situations.]
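For the record, here is a sketch of the lower bound computed both ways: the textbook form that subtracts two nearly equal quantities, and the rearranged form obtained by multiplying through by the conjugate. Since (center - radius)*(center + radius) = p^2 * (1 + z^2/n), the bound reduces to p^2/(center + radius). Assumptions of mine, not Miller's: scores on the 0-9 scale are mapped to a fraction p = average/9 and the bound is scaled back up by 9, and z = 1.96 is just one common choice. The function names are hypothetical.

```python
import math

def wilson_lower_naive(p, n, z=1.96):
    """Textbook form: (center - radius) / (1 + z^2/n)."""
    center = p + z * z / (2 * n)
    radius = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return (center - radius) / (1 + z * z / n)

def wilson_lower_stable(p, n, z=1.96):
    """Same bound rewritten as p^2 / (center + radius): no subtraction."""
    center = p + z * z / (2 * n)
    radius = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return p * p / (center + radius)

# Joe: 100 votes, all 9s, so p = 9/9 = 1 (assumed rescaling of the 0-9 scale).
joe_p, joe_n = 9 / 9, 100
joe_score = 9 * wilson_lower_stable(joe_p, joe_n)
print(f"Joe's Wilson-style score: {joe_score:.2f}")
```

The exact figure for Joe shifts with the choice of z, but with z = 1.96 it comes out above Bob's 7.7.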
The problem is that Wilson+Miller's underlying statistical model assumes
that Joe's voters are sampled from the same distribution as Bob's.
But that is exactly wrong, which in fact is the whole point and the
whole reason this manipulation attempt by Joe+supporters can work.
The Dirichlet approach I suggest (the 100 fake votes) is not about
confidence under a false assumption of "same distribution sampled;"
it instead is about smooth interpolation between "prior" and "posterior"
estimates. The 100 fake votes represent the prior.
You can see that the Wilson/Miller formula is blowing it on that whole
issue by considering what it does in the case of ZERO DATA.
In that case, the right thing to do is to use the prior estimate
(i.e. the fake votes).
The wrong thing to do, which Miller's formula does, is to divide zero
by zero, causing a machine crash.
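To make the zero-data case concrete, here is a small sketch. Both functions are hypothetical illustrations (the names and the z = 1.96 default are mine): the first is the fake-vote average, the second is a direct transcription of the Wilson lower bound formula with no guard for n = 0. In Python the unguarded version raises ZeroDivisionError rather than crashing the machine, but the point is the same.

```python
import math

def fake_vote_average(scores, prior_votes=100, prior_score=0.0):
    """Mean of the real scores plus `prior_votes` fake votes at `prior_score`."""
    return (sum(scores) + prior_votes * prior_score) / (len(scores) + prior_votes)

def wilson_lower_unguarded(positive, n, z=1.96):
    """Wilson lower bound, transcribed directly with no check for n == 0."""
    p = positive / n  # 0/0 when there are no votes
    center = p + z * z / (2 * n)
    radius = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return (center - radius) / (1 + z * z / n)

# No votes at all: the fake-vote average falls back to the prior (0.0 here).
print(fake_vote_average([]))

try:
    wilson_lower_unguarded(0, 0)
    crashed = False
except ZeroDivisionError:
    crashed = True
print("Wilson with zero data failed:", crashed)
```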