
Competition: Did I miss something?


Karen Sutton

Nov 17, 1997, 3:00:00 AM

Interesting news group...

I've been reviewing the games in the IF competition - eight so far - and
having a great time. I played the original Adventure on a PDP-11/70 in
the late 70s and have been an on-again /off-again IFer over the years
since (including graphical adventures). I learned about the IF
competition when I found (played, and thoroughly enjoyed!) Uncle
Zebulon's Will by Magnus Olsson (winner of the first IF competition, I
believe). I couldn't believe it was written by an amateur - and was
absolutely free! So excited was I that I just *had* to check out this
year's competition.

I thought I understood from the competition web pages that *anyone*
could play the games and be a judge, as long as one judged all games
according to the same criteria. But I also read something there about
games being overrated in the past. A scoring system was discussed where
Infocom games would probably rate between 7-8 on a scale of 1-10. Under
that rating system the games in the competition would likely not be
rated over a 6. So I'm a bit confused by the postings here about ratings
of 10. Am I doing the authors an injustice if I don't score above 6?
Can anyone help clear up my responsibilities as a judge?

I readily admit I know nothing about writing, developing, or programming
games of any sort. So I won't get involved in the programming-type
discussions here (like which interpreter is better/best and why). I
consider it enough that I figured out what to do to play these games!
(-: . But I love a good adventure story and I appreciate the efforts of
anyone desiring to entertain via this medium and willing to make the
effort. Especially when their offering is free! And if I can help
resurrect text adventures and have some fun too, I'm all for it.

I consider it a privilege just to have access to these games
free-for-nothing, regardless of what system they were written on or for.
That I can play them on a 68K Mac, a Power Mac, an old Laptop, and
on a mid-range Pentium is miracle enough. That most will play on half a
dozen other platforms too is un-fricking-believable! (I guess having a
version for an old Atari 800 that still runs is asking too much? (-; )
So when I run into one that won't play, my only upset is "bummer - I
don't get to see that one!" My *real* sadness is that more people aren't
aware of how easy they are to play and how totally absorbing. And what
minimal system requirements! *These* are the games that would really
make sense on Newtons and PalmPilots and the like - not to mention
breathing some new life into all those 286's out there... And I have to
guess that if a killer game could only be played on one platform and was
deemed the winner, it would find its way to other platforms.

My thanks to the competition producers, supporters, and authors for many
hours of free fun! Keep up the good work!

-Karen Sutton
sut...@europa.com


Stephen Granade

Nov 19, 1997, 3:00:00 AM

On Mon, 17 Nov 1997, Karen Sutton wrote:

> I thought I understood from the competition web pages that *anyone*
> could play the games and be a judge, as long as one judged all games
> according to the same criteria. But I also read something there about
> games being overrated in the past. A scoring system was discussed where
> Infocom games would probably rate between 7-8 on a scale of 1-10. Under
> that rating system the games in the competition would likely not be
> rated over a 6. So I'm a bit confused by the postings here about ratings
> of 10. Am I doing the authors an injustice if I don't score above 6?
> Can anyone help clear that up my responsibilities as a judge?

Your responsibilities are to play as many of the games as possible and
rate them however you see fit. As long as you use a large enough point
spread that you can differentiate between your most favorite and least
favorite games (i.e. don't use a two-point range), it won't matter; even
if your scores are lower than other judges', they still accurately
indicate _your_ rankings.

That doesn't really address your question, though. The original comments
("Infocom games are a 7 or 8") were about how games were scored in SPAG,
an interactive fiction e-zine. In SPAG all sorts of games are rated from
1-10, so any score should take into account how all other games had been
scored. In the competition you are rating the games against other
competition games only*, so the "Infocom=7-8" criterion doesn't really
matter.

Stephen

* Granted, there are other schemes, but this is how I judge the games, so
there.

--
Stephen Granade | Interested in adventure games?
sgra...@phy.duke.edu | Check out
Duke University, Physics Dept | http://interactfiction.miningco.com


Steve Bernard

Nov 19, 1997, 3:00:00 AM

In article <347126D3...@europa.com>,
Karen Sutton <sut...@europa.com> wrote:

[sniperinni]

> I thought I understood from the competition web pages that *anyone*
> could play the games and be a judge, as long as one judged all games
> according to the same criteria. But I also read something there about
> games being overrated in the past. A scoring system was discussed where
> Infocom games would probably rate between 7-8 on a scale of 1-10. Under
> that rating system the games in the competition would likely not be
> rated over a 6. So I'm a bit confused by the postings here about ratings
> of 10. Am I doing the authors an injustice if I don't score above 6?
> Can anyone help clear that up my responsibilities as a judge?

Ok, somebody smack me if I'm wrong, but here's my explanation:

1. Yes, anyone can judge as long as they judge as many as they can and
give scores from 1 to 10 in whole numbers. The actual criteria for
judging are left to the judge's discretion; there are no "official" stated
criteria for judging.

2. The problem of rating games too high comes from SPAG, not the
competition. In the first couple of SPAG issues, a couple games I can't
recall now (Klaustrophobia was one, I think) were given 9 and 9.5 ratings
which raised them to the top of SPAG's all-time top three. There were
some complaints that this didn't truly represent the gaming community in
general and that the reviewers were giving much higher ratings than the
games deserved.

Somebody stated that in a scale that was supposed to judge all I-F games
ever written, it wasn't quite right to rate these games so high that more
widely praised games like Trinity would have to be rated a 10 just to
beat them. The argument was that nobody has yet produced a perfect 10
game. Most of the Infocom games would probably rate around 7 or 8, with
the best of those around 9 at the highest, and 10 would be kept as the
ideal. That's why games have to be rated by at least three people to
make SPAG's top five nowadays.

3. The ratings for the competition aren't necessarily on par with SPAG's
ratings. Judges can assign scores as they see fit for whatever reason
they want. There is a lot of discussion here concerning what sort of
criteria should be considered and what shouldn't, but it's ultimately up
to the individual. Some judges will invariably be nicer and some meaner.
The more people who vote, the more the scores will represent a true
average.

I, for example, have a tendency in any circumstance (not just the
competition) to rate the games I like really high and those I don't
really low. Others, I imagine, tend to rate the bulk of games as average
with a few games at the extremes. For the sake of the competition, I'm
planning to rate them all and review all of my scores. I want to avoid
bunching in my scores by assigning the best game I played the 10 and the
worst the 1 and see how all the others fall in the middle. This, I
think, will prevent me from having a lot of ones and twos and a lot of
nines and tens.
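
One possible reading of that plan is a linear rescale of a judge's raw scores, so that the worst game lands on 1 and the best on 10, rounded to whole numbers as the competition requires. The sketch below is only an illustration under that assumption; the function name `rescale` and the sample scores are hypothetical, not anything the competition prescribes.

```python
def rescale(scores, low=1, high=10):
    """Linearly map raw scores so the worst becomes `low` and the best `high`."""
    worst, best = min(scores.values()), max(scores.values())
    span = best - worst
    if span == 0:
        # Every game got the same raw score, so there is nothing to stretch.
        return dict(scores)
    return {
        game: round(low + (score - worst) * (high - low) / span)
        for game, score in scores.items()
    }

# Hypothetical raw scores from one judge who only used the 3-8 range.
raw = {"A": 8, "B": 6, "C": 5, "D": 3}
stretched = rescale(raw)
print(stretched)
```

The order of the games is unchanged; only the spacing between them is stretched to fill the 1-10 range.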

I don't know if anyone else scores that way, but I figure with a lot of
judges everything will come out to be pretty fair.


> My thanks to the competition producers, supporters, and authors for many
> hours of free fun! Keep up the good work!

Me too... :)

>
> -Karen Sutton
> sut...@europa.com

-Steve Bernard Jr.

-------------------==== Posted via Deja News ====-----------------------
http://www.dejanews.com/ Search, Read, Post to Usenet

Adam Cadre

Nov 19, 1997, 3:00:00 AM

Karen Sutton wrote:
> I thought I understood from the competition web pages that *anyone*
> could play the games and be a judge, as long as one judged all games
> according to the same criteria. But I also read something there about
> games being overrated in the past. A scoring system was discussed
> where Infocom games would probably rate between 7-8 on a scale of
> 1-10. Under that rating system the games in the competition would
> likely not be rated over a 6. So I'm a bit confused by the postings
> here about ratings of 10. Am I doing the authors an injustice if I
> don't score above 6?

Now that I've sent in my ballot, I thought I'd share my take on
scoring. So, um, here goes.

When you judge a game, you're not comparing it to every IF game ever
published, or some absolute rubric -- you're indicating which game or
games you'd like to place highly in the comp, and which you'd prefer
placed low. Thus, it doesn't make much sense to me =not= to give a 10
to your favorite game and a 1 to your least favorite. After all, giving
your favorite game a 6 and your least favorite a 3 has the exact same
effect as giving your favorite game a 10 and your least favorite a 7,
or giving your favorite a 4 and your least favorite a 1. You might as
well maximize the range of grades you can give.
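
The shifting argument can be checked with a small sketch. It assumes the final standings come from averaging scores and that every judge rates every game; the ballots and the helper `rank_by_mean` are invented for illustration.

```python
from statistics import mean

def rank_by_mean(ballots):
    """Order game names from best to worst by mean score across ballots."""
    games = list(ballots[0])
    averages = {g: mean(b[g] for b in ballots) for g in games}
    return sorted(games, key=lambda g: averages[g], reverse=True)

# Hypothetical ballots: every judge scores all three games.
others = [{"A": 7, "B": 8, "C": 2}, {"A": 4, "B": 6, "C": 9}]
mine_low = {"A": 6, "B": 4, "C": 3}     # favorite gets 6, least favorite 3
mine_high = {"A": 10, "B": 8, "C": 7}   # the same ballot shifted up by 4

order_low = rank_by_mean(others + [mine_low])
order_high = rank_by_mean(others + [mine_high])
print(order_low, order_high)
```

Shifting a complete ballot adds the same amount to every game's average, so the final order is identical. The argument depends on the ballot being complete; a judge who skips games shifts only the games they rated.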

That's my take, anyway. Your mileage may vary, and be sure to change
your oil every three months.

-----
Adam Cadre, Durham, NC
http://www.duke.edu/~adamc | http://www.retina.net/~grignr
Sites newly revamped! Play the games! Read the stories! Hear the
music! Eat the pie! (Okay, well, you'll have to find your own pie.)

Lucian Paul Smith

Nov 19, 1997, 3:00:00 AM

Karen Sutton (sut...@europa.com) wrote:
: Interesting news group...

We like to think so,...

: I've been reviewing the games in the IF competition - eight so far - and
: having a great time.

Great!

: I thought I understood from the competition web pages that *anyone*
: could play the games and be a judge, as long as one judged all games
: according to the same criteria. But I also read something there about
: games being overrated in the past. A scoring system was discussed where
: Infocom games would probably rate between 7-8 on a scale of 1-10. Under
: that rating system the games in the competition would likely not be
: rated over a 6. So I'm a bit confused by the postings here about ratings
: of 10. Am I doing the authors an injustice if I don't score above 6?
: Can anyone help clear that up my responsibilities as a judge?

I think you're remembering two different judging criteria--that for SPAG,
and that for the contest. SPAG is a long-running IF-review 'zine, and the
reviews there are kept for posterity. Therefore, it becomes necessary to
rate games on an objective scale, so that all the games that have been
written in the past, and all the games that will be written in the future
can be compared on the same scale.

This is not true for the contest. Contest ratings are a one-off, and are
only used for this one batch of games. Therefore, to maximize your voting
power, you should rate the best game (or games) in the batch a '10' and
give the worst a '1'.

The purpose of linking to the SPAG rating system from the contest page was
to get you thinking about different aspects of the games to consider
when rating them. And if you want to follow that system religiously,
that's certainly also your prerogative. There are really no rules for how
to assign numbers to the various games--just suggestions.

Have fun!

-Lucian "Lucian" Smith

ct

Nov 19, 1997, 3:00:00 AM

In article <8799584...@dejanews.com>, Steve Bernard
<sber...@earthling.net> wrote:
>
> 1. Yes, anyone can judge as long as they judge as many as they can and
> give scores from 1 to 10 in whole numbers.
                           ^^^^^^^^^^^^^^^^

Can I just emphasise this, just in case anyone else wants to annoy me
with fractional votes? Thankyou, I knew I was up to it.

regards, ct "and a quarter"


Karen Sutton

Nov 19, 1997, 3:00:00 AM

Steve Bernard wrote:

> << mostly snipped >>
>
> 1. Yes, anyone can judge as long as they judge as many as they can and
> give scores from 1 to 10 in whole numbers. The actual criteria for
> judging is left to the judge's discretion; there is no "official"
> stated criteria for judging.
>
> -Steve Bernard Jr.
>
> -------------------==== Posted via Deja News ====-----------------------
> http://www.dejanews.com/ Search, Read, Post to Usenet

Thanks - that clears things up.

And thanks for the additional info - it helps to understand the history!

-Karen
sut...@europa.com


Mary K. Kuhner

Nov 19, 1997, 3:00:00 AM

In article <347126D3...@europa.com> Karen Sutton <sut...@europa.com> writes:

>I thought I understood from the competition web pages that *anyone*
>could play the games and be a judge, as long as one judged all games
>according to the same criteria. But I also read something there about
>games being overrated in the past. A scoring system was discussed where
>Infocom games would probably rate between 7-8 on a scale of 1-10. Under
>that rating system the games in the competition would likely not be
>rated over a 6. So I'm a bit confused by the postings here about ratings
>of 10. Am I doing the authors an injustice if I don't score above 6?
>Can anyone help clear that up my responsibilities as a judge?

It seems to me that Whizzard has been saying loud and clear
"Judge the games on any criteria you choose." So if you think a game is
a ten, give it a ten. It's not as if they're being rated against
the Infocom games, anyway (though I personally think there are several
games in Comp97 that would stack up just fine against the classics).

There may have been past discussions of more rigid scoring systems,
but I believe they have been abandoned.

Mary Kuhner mkku...@genetics.washington.edu

Matthew T. Russotto

Nov 19, 1997, 3:00:00 AM

In article <ant19170...@stu012.sjc.ox.ac.uk>,
ct <c...@comlab.ox.ac.uk> wrote:
}In article <8799584...@dejanews.com>, Steve Bernard
}<sber...@earthling.net> wrote:
}>
}> 1. Yes, anyone can judge as long as they judge as many as they can and
}> give scores from 1 to 10 in whole numbers.
} ^^^^^^^^^^^^^^^^
}
}Can I just emphasise this, just in case anyone else wants to annoy me
}with fractional votes? Thankyou, I knew I was up to it.

Bah, now I'm going to have to feed my ratings through some sort of
mu-law thing to compress their dynamic range 10:1 while still
retaining appropriate separations. Yuck.


--
Matthew T. Russotto russ...@pond.com
"Extremism in defense of liberty is no vice, and moderation in pursuit
of justice is no virtue."

ct

Nov 20, 1997, 3:00:00 AM

In article <64vohb$p...@wanda.vf.pond.com>, Matthew T. Russotto
<russ...@wanda.vf.pond.com> wrote:
> In article <ant19170...@stu012.sjc.ox.ac.uk>,
> ct <c...@comlab.ox.ac.uk> wrote:
> }In article <8799584...@dejanews.com>, Steve Bernard
> }<sber...@earthling.net> wrote:
> }>
> }> 1. Yes, anyone can judge as long as they judge as many as they can and
> }> give scores from 1 to 10 in whole numbers.
> } ^^^^^^^^^^^^^^^^
> }
> }Can I just emphasise this, just in case anyone else wants to annoy me
> }with fractional votes? Thankyou, I knew I was up to it.
>
> Bah, now I'm going to have to feed my ratings through some sort of
> mu-law thing to compress their dynamic range 10:1 while still
> retaining appropriate separations. Yuck.

I think you need to ingest more caffeine. Or pop more valium. Maybe both?

regards, ct

Second April

Nov 20, 1997, 3:00:00 AM

> This is not true for the contest. Contest ratings are a one-off, and are
> only used for this one batch of games. Therefore, to maximize your voting
> power, you should rate the best game (or games) in the batch a '10' and
> give the worst a '1'.

Um...I certainly agree in principle, but remember that lots of judges
aren't going to get to all of the games. If hypothetical intelligent
experienced Judge A rates all of the games, with scores ranging from 1 to
10, and judge B only gets to ten of them and they happen to be judge A's
bottom 10 or top 10, it'll skew things quite a bit if judge B rates his or
her games from 1 to 10. In theory, statistically, it'll even out; in
practice, I don't think our judging pool is all that big.
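
A small sketch of the skew described here, assuming each game's final score is the mean over only the judges who rated it. The two ballots and the helper `mean_scores` are hypothetical.

```python
from statistics import mean

def mean_scores(ballots):
    """Average each game's score over only the judges who rated it."""
    collected = {}
    for ballot in ballots:
        for game, score in ballot.items():
            collected.setdefault(game, []).append(score)
    return {game: mean(scores) for game, scores in collected.items()}

judge_a = {"G1": 10, "G2": 6, "G3": 3, "G4": 1}   # played everything
judge_b = {"G3": 10, "G4": 1}                      # played only A's bottom two

scores = mean_scores([judge_a, judge_b])
print(scores)
```

Because judge B stretched two weak games across the whole range, G3 ends up ahead of G2, a game judge B never played. With many judges the effect should wash out; with a small pool it may not.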

My experience, if I can discuss it sufficiently obliquely (I've gotten
through 26 of them now), is that there are a few (five for me) on the high
end, between 8 and 10, somewhat more (hmmm, seven, I think) on the low
end, between 1 and 3, and the rest are all in the middle. But I
encountered several that were on the low end right at the beginning, and
if I had been so pressed for time that I hadn't gotten much farther, I
still would have rated them where I did; I wouldn't have adjusted things
so that a game that screamed "2!" at me had a 5.

The moral is, well, judge the games by their merits; if you get through
every game, those merits are somewhat relative, but only somewhat.
(Considering size limitations, I feel comfortable saying that the ones
I've rated high are that high objectively--it's not a comparison to
Trinity or any other worthy full-length game, because the circumstances
are so different that they really can't be compared.)

Duncan Stevens
d-st...@nwu.edu
312-654-0280

The room is as you left it; your last touch--
A thoughtless pressure, knowing not itself
As saintly--hallows now each simple thing,
Hallows and glorifies, and glows between
The dust's gray fingers, like a shielded light.

--from "Interim," by Edna St. Vincent Millay

David Thornley

Nov 21, 1997, 3:00:00 AM

In article <654bsr$lnq$1...@joe.rice.edu>,
Lucian Paul Smith <lps...@rice.edu> wrote:
>
>ct (c...@comlab.ox.ac.uk) wrote:
>
>: > >Jim MacKenzie wrote:
>: > >
>: > >> Maybe we should do the final marking based on ordinals, and not on
>: > >> average scores, as they do in figure skating. Figure out which
>: > >> game Judge #1 rated #1, #2, #3, etc. and average the ranking, not
>: > >> the actual marks. This would eliminate the problem.
>
>: The idea itself is not a bad one, but is effectively what we have presently
>: if everybody uses the full range available to them. In general, I prefer
>: voting/counting systems to be as simple as possible so people (that is, me)
>: can understand _exactly_ what their vote means immediately.
>
>: Out of curiosity, what kind of averaging do figure skating use? I'm vaguely
>: tempted towards harmonic means, rather than arithmetic for averaging ranks,
>: but I suspect the stats behind it all is slightly more complicated than I'm
>: imagining atm.
>
>If I'm understanding the figure skating scenario correctly, there'd be a
>problem if we tried to apply it to the contest, since not everyone rates
>all the games. The result would be that someone who rated all the games
>would have a game ranked #35, while someone who rated only ten's worst
>game would be ranked #10. Averaging these numbers straight across the
>board would not yield correct results.
>
>*IF* we wanted to be more complicated (and that's a big 'if'), we try to
>come up with a way to rank the games on a relative scale, instead of on an
>absolute scale. This would eliminate even further the problem of people
>not playing all the games. The idea would be that you would take all
>combinations of two games, see who voted on both of them, and discover
>which game was rated higher than the other. So you might find that out of
>50 voters, 30 of them voted on both games #1 and #2, and game #1 was rated
>higher than game #2 for 20 of those votes. Therefore, game #1 is placed
>higher than game #2. Then you go to the combination of game #1 and game
>#3, and so on.
>

This way lies paradox! Suppose we have three voters, Floyd, Duffy,
and Trent (aka Tiffany), and three games, unimaginatively called
1, 2, and 3. The preferences are, from first to last:

Floyd: 1, 2, 3
Duffy: 2, 3, 1
Trent: 3, 1, 2

Notice that two people prefer game 1 to game 2, two people prefer
game 2 to game 3, and two people prefer game 3 to game 1. We can't
determine a ranking.
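
The cycle can be verified by tallying every pair of games, which is also the pairwise comparison proposed in the quoted text. This is a minimal sketch using the three ballots above; the helper names `prefers` and `pairwise_wins` are made up for the illustration.

```python
from itertools import permutations

# Preferences from the example, listed best to worst.
ballots = {
    "Floyd": [1, 2, 3],
    "Duffy": [2, 3, 1],
    "Trent": [3, 1, 2],
}

def prefers(order, x, y):
    """True if this voter ranks game x ahead of game y."""
    return order.index(x) < order.index(y)

def pairwise_wins(ballots):
    """Map (x, y) to the number of voters who rank x ahead of y."""
    games = sorted(next(iter(ballots.values())))
    return {
        (x, y): sum(prefers(order, x, y) for order in ballots.values())
        for x, y in permutations(games, 2)
    }

wins = pairwise_wins(ballots)
majority = len(ballots) / 2
beats = sorted(pair for pair, count in wins.items() if count > majority)
print(beats)  # [(1, 2), (2, 3), (3, 1)]
```

Each pair has a clear 2-to-1 majority, yet the majorities chain into a loop: 1 beats 2, 2 beats 3, and 3 beats 1.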

IIRC, there is no paradox-free way of setting up ordinal voting,
although it is possible to set up decision procedures that normally
yield results.

I think the current method is good enough, and I'd rather see that
continue rather than take the ordinal risks. Some people are going to
rate all the games on a 4-6 scale, perhaps, but then again some people
aren't going to rate on all the games.

The current system is fairly simple and intuitive. It doesn't force
judges to use the full 1-10 spread of numbers, but then it doesn't
force judges to rate games anyway. It also has the virtue that it
is working right now.


--
David H. Thornley | These opinions are mine. I
da...@thornley.net | do give them freely to those
http://www.thornley.net/~thornley/david/ | who run too slowly.

ct

Nov 21, 1997, 3:00:00 AM

Gerry Kevin Wilson <g...@pobox.com> wrote:

> Every year someone points out that the votes who use the entire
> range of 1 to 10 count slightly more than those that use a smaller range.
> Every year I think the same thing..."So?" People are aware of the fact.
> <snip> I operate under
> the assumption that the judges we get are mature enough, and intelligent
> enough to handle their own affairs, including rating the games however
> they want, and designating the scores in the manner of their choosing.

Very well said. (I must note that down for other electioneering tasks I
face.)

regards, ct "copyright? On Usenet?"


Adam Cadre

Nov 21, 1997, 3:00:00 AM

Michael Straight wrote:
> Maybe I *want* Adam's opinions to count more in the overall standings
> than mine.

Oh, if only everyone thought the way you did...

Dennis Matheson

Nov 21, 1997, 3:00:00 AM

Jim MacKenzie wrote:
>
>>snip<<
>
> Maybe we should do the final marking based on ordinals, and not on
> average scores, as they do in figure skating. Figure out which
> game Judge #1 rated #1, #2, #3, etc. and average the ranking, not
> the actual marks. This would eliminate the problem.
>
> Jim

Maybe, but how do you handle ties? There are 34 games and we are only
rating them from 1 to 10. I know I have assigned several games the same
rating. If a judge has two 10s and three 9s for example, which 10 gets
#1 and which gets #2? And how do you assign #3, 4, and 5 to the 9s?

This would work if we assigned ordinal rankings initially, but I don't
see how we could do it this time around. Any suggestions anyone?
--
"You can't run away forever, but there's nothing wrong with
getting a good head start" --- Jim Steinman

Dennis Matheson --- Dennis....@transquest.com
--- http://home.earthlink.net/~tanstaafl

ct

Nov 21, 1997, 3:00:00 AM

In article <HEcotaAE...@chrism.demon.co.uk>, Chris Marriott
<ch...@chrism.demon.co.uk> wrote:
> In article <ant21020...@stu012.sjc.ox.ac.uk>, ct
> <c...@comlab.ox.ac.uk> writes

> >Jim MacKenzie wrote:
> >
> >> Maybe we should do the final marking based on ordinals, and not on
> >> average scores, as they do in figure skating. Figure out which
> >> game Judge #1 rated #1, #2, #3, etc. and average the ranking, not
> >> the actual marks. This would eliminate the problem.
> >
> >You've heard of the phrases 'taking a running jump at yourself', and
> >'enjoy a long walk on a short pier', yes?
>
> Jim's idea seems like an *excellent* one to me. What do you have against
> it?

The extra effort involved in counting (ie having to rewrite my vote
counting program again)!

The idea itself is not a bad one, but is effectively what we have presently
if everybody uses the full range available to them. In general, I prefer
voting/counting systems to be as simple as possible so people (that is, me)
can understand _exactly_ what their vote means immediately.

Out of curiosity, what kind of averaging does figure skating use? I'm vaguely
tempted towards harmonic means, rather than arithmetic for averaging ranks,
but I suspect the stats behind it all are slightly more complicated than I'm
imagining atm.
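
For a feel of how the two averages differ on ranks, here is a short sketch using Python's standard `statistics` module. The rank lists are invented, and nothing here is a claim about how figure skating actually tabulates results.

```python
from statistics import harmonic_mean, mean

# Hypothetical ranks (1 = best) given to two games by three judges.
polarizing = [1, 1, 10]   # two judges loved it, one hated it
steady = [4, 4, 4]        # everyone placed it fourth

arith = {"polarizing": mean(polarizing), "steady": mean(steady)}
harm = {"polarizing": harmonic_mean(polarizing), "steady": harmonic_mean(steady)}
print(arith)
print(harm)
```

Under the arithmetic mean the two games tie at 4. The harmonic mean is pulled toward the small values, so the polarizing game comes out well ahead (about 1.43 against 4); it rewards first places more than it punishes a single bad placement.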

regards, ct "Me, do stats? Nah..."



Chris Marriott

Nov 21, 1997, 3:00:00 AM

In article <ant21020...@stu012.sjc.ox.ac.uk>, ct
<c...@comlab.ox.ac.uk> writes
>Jim MacKenzie wrote:
>
>> Maybe we should do the final marking based on ordinals, and not on
>> average scores, as they do in figure skating. Figure out which
>> game Judge #1 rated #1, #2, #3, etc. and average the ranking, not
>> the actual marks. This would eliminate the problem.
>
>You've heard of the phrases 'taking a running jump at yourself', and
>'enjoy a long walk on a short pier', yes?

Jim's idea seems like an *excellent* one to me. What do you have against
it?

Chris

----------------------------------------------------------------
Chris Marriott, Microsoft Certified Solution Developer.
SkyMap Software, U.K. e-mail: ch...@skymap.com
Visit our web site at http://www.skymap.com

Chris Marriott

Nov 21, 1997, 3:00:00 AM

In article <654372$or...@gcs.delta-air.com>, Dennis Matheson
<"Dennis..Matheson@"@transquest.?.com> writes

>Maybe, but how do you handle ties? There are 34 games and we are only
>rating them from 1 to 10. I know I have assigned several games the same
>rating. If a judge has two 10s and three 9s for example, which 10 gets
>#1 and which gets #2? And how do you assign #3, 4, and 5 to the 9s?

They'd be assigned equally. If you had scores of:

10, 9, 9, 8, 7, 7, 7, 6

they'd be ranked as:

1, 2, 2, 4, 5, 5, 5, 8
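
This tie handling (tied games share the best available rank, and the following rank skips ahead) can be sketched in a few lines. The function name `competition_ranks` is hypothetical.

```python
def competition_ranks(scores):
    """Rank scores so ties share the best rank and the next rank skips ahead."""
    ordered = sorted(scores, reverse=True)
    # index() returns the first position of a score in the sorted list,
    # so every tied score maps to the same rank.
    return [ordered.index(score) + 1 for score in scores]

scores = [10, 9, 9, 8, 7, 7, 7, 6]
ranks = competition_ranks(scores)
print(ranks)  # [1, 2, 2, 4, 5, 5, 5, 8]
```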

Neil Brown

Nov 21, 1997, 3:00:00 AM

At 03:05:09 on Fri, 21 Nov 1997, ct wrote:
>Jim MacKenzie wrote:
>
>> Maybe we should do the final marking based on ordinals, and not on
>> average scores, as they do in figure skating. Figure out which
>> game Judge #1 rated #1, #2, #3, etc. and average the ranking, not
>> the actual marks. This would eliminate the problem.
>
>You've heard of the phrases 'taking a running jump at yourself', and
>'enjoy a long walk on a short pier', yes?

Hmm. Someone should start going to bed earlier, methinks.

- NJB

John Francis

Nov 21, 1997, 3:00:00 AM

In article <654372$or...@gcs.delta-air.com>,
Dennis Matheson <"Dennis..Matheson@"@transquest..com> wrote:
>Jim MacKenzie wrote:
>>
>>>snip<<
>>
>> Maybe we should do the final marking based on ordinals, and not on
>> average scores, as they do in figure skating. Figure out which
>> game Judge #1 rated #1, #2, #3, etc. and average the ranking, not
>> the actual marks. This would eliminate the problem.
>>
>> Jim
>
>Maybe, but how do you handle ties? There are 34 games and we are only
>rating them from 1 to 10. I know I have assigned several games the same
>rating. If a judge has two 10s and three 9s for example, which 10 gets
>#1 and which gets #2? And how do you assign #3, 4, and 5 to the 9s?
>
>This would work if we assigned ordinal rankings initially, but I don't
>see how we could do it this time around. Any suggestions anyone?

Aaaaargh!

OK. I feel better now.
It's hard enough to decide whether a game is a "6", a "7" or an "8" now.
Ordinal ranking would mean that once I had decided it was worth a "7",
I'd still have to rank it against the four or five other games in that
category. That's not easy. And it gets harder and harder as time goes
by, because there are more games in any given scoring group, and some
of them were played quite a long time ago. Ordinal ranking is fine
for ten or twelve games, but with over thirty it's just too hard.

As long as everybody has a fairly wide point spread in their scores,
the current system seems the best one to me. Admittedly if I only
rank games between "3" and "8" (actually I've found a "9" now), then
my vote will have slightly less effect on the final outcome than a
judge who uses the full range of 1-10. This doesn't bother me. It
is no more iniquitous than weighting every judge equally, even though
one judge may be scrupulously honest and dispassionate, while another
could be biased and unfair. That's life.

It doesn't really matter too much how you do the actual scoring,
anyway. You'll still end up with two or three games out in front
of everybody else, a similar number bringing up the rear, and the
rest pretty much spread out in the middle. And it will be the same
two or three games at the front under any scoring system. If the
scores are very, very close maybe there will be some shuffling of
places at the top. But most of the time there will be the same
outright winner under all systems.

These are observations based on my experience when I was running an
internet-based competition with several hundred entrants. I tried
different scoring algorithms, but the results seemed pretty stable.

--
John Francis jfra...@sgi.com Silicon Graphics, Inc.
(650)933-8295 2011 N. Shoreline Blvd. MS 43U-991
(650)933-4692 (Fax) Mountain View, CA 94043-1389
Unsolicited electronic mail will be subject to a $100 handling fee.

Gerry Kevin Wilson

Nov 21, 1997, 3:00:00 AM

In article <654g8r$2m5$1...@darla.visi.com>
thor...@visi.com (David Thornley) wrote:

> The current system is fairly simple and intuitive. It doesn't force
> judges to use the full 1-10 spread of numbers, but then it doesn't
> force judges to rate games anyway. It also has the virtue that it
> is working right now.

Plus, we don't have any volunteers to do the vote tabulating in the, ahem,
statistically enhanced manner. When it comes to suggestions that are going
to make the competition "committee's" jobs that much harder, I don't even
begin to consider them unless we have someone who is gonna 'walk the
walk.' Otherwise I risk alienating the people who have very generously
donated their time, effort, cash, etc., to the contest. Any fool can see that
is the only way to do things when working on a strictly volunteer basis. If
you don't, you will 'toe-step' your way out of existence.

Besides the fact that ct (the current vote counter, and therefore more than
entitled to say "Push off" to anyone who wants to make his job that much
more difficult) doesn't want a more complicated system, I don't feel we
need one. Every year someone points out that the votes that use the entire
range of 1 to 10 count slightly more than those that use a smaller range.
Every year I think the same thing..."So?" People are aware of the fact.
If they choose to ignore it when voting, well, cry me a river. I operate under
the assumption that the judges we get are mature enough, and intelligent
enough to handle their own affairs, including rating the games however
they want, and designating the scores in the manner of their choosing.

---
G. Kevin Wilson: Freelance Writer and Game Designer. Resumes on demand.


Michael Straight

Nov 21, 1997, 3:00:00 AM


On Thu, 20 Nov 1997, Jim MacKenzie wrote:

> Michael Straight wrote:
> >
> > Also note that if I rate my highest game a 7 and my lowest game a 3 and
> > Adam gives his highest game a 10 and his lowest game a 1, then Adam's
> > opinions will count more in the overall standings than mine.
>
> Maybe we should do the final marking based on ordinals, and not on
> average scores, as they do in figure skating. Figure out which
> game Judge #1 rated #1, #2, #3, etc. and average the ranking, not
> the actual marks. This would eliminate the problem.

Who said it was a problem? Maybe I *want* Adam's opinions to count more
in the overall standings than mine. Some people may feel more or less
strongly about their rankings, and they can reflect that in how much they
spread out their scores.
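
To make the arithmetic concrete, here is a minimal Python sketch of the effect being discussed. Only the 7/3 and 10/1 ranges come from the quoted example; the game names, and the assumption that the two judges prefer opposite games, are made up for illustration.

```python
# Hypothetical two-judge, two-game ballot; the game names are made up.
ballots = {
    "Michael": {"Game A": 7, "Game B": 3},   # narrow range, prefers A
    "Adam":    {"Game A": 1, "Game B": 10},  # full range, prefers B
}

def mean_scores(ballots):
    """Arithmetic mean of every score each game received."""
    totals, counts = {}, {}
    for scores in ballots.values():
        for game, score in scores.items():
            totals[game] = totals.get(game, 0) + score
            counts[game] = counts.get(game, 0) + 1
    return {game: totals[game] / counts[game] for game in totals}

means = mean_scores(ballots)
print(means)  # Game A averages 4.0, Game B averages 6.5
```

With one vote each way, the judge who used the wider range decides the order, which is exactly the effect a voter can choose to use or not.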

SMTIRCAHIAGEHLT

Lucian Paul Smith

unread,
Nov 21, 1997, 3:00:00 AM11/21/97
to

ct (c...@comlab.ox.ac.uk) wrote:

: > >Jim MacKenzie wrote:
: > >
: > >> Maybe we should do the final marking based on ordinals, and not on
: > >> average scores, as they do in figure skating. Figure out which
: > >> game Judge #1 rated #1, #2, #3, etc. and average the ranking, not
: > >> the actual marks. This would eliminate the problem.

: The idea itself is not a bad one, but is effectively what we have presently
: if everybody uses the full range available to them. In general, I prefer
: voting/counting systems to be as simple as possible so people (that is, me)
: can understand _exactly_ what their vote means immediately.

: Out of curiosity, what kind of averaging does figure skating use? I'm vaguely
: tempted towards harmonic means, rather than arithmetic for averaging ranks,
: but I suspect the stats behind it all are slightly more complicated than I'm
: imagining atm.

If I'm understanding the figure skating scenario correctly, there'd be a
problem if we tried to apply it to the contest, since not everyone rates
all the games. The result would be that someone who rated all the games
would have a game ranked #35, while the worst game of someone who rated
only ten would be ranked #10. Averaging these numbers straight across the
board would not yield correct results.

*IF* we wanted to be more complicated (and that's a big 'if'), we could try
to come up with a way to rank the games on a relative scale, instead of on an
absolute scale. This would eliminate even further the problem of people
not playing all the games. The idea would be that you would take all
combinations of two games, see who voted on both of them, and discover
which game was rated higher than the other. So you might find that out of
50 voters, 30 of them voted on both games #1 and #2, and game #1 was rated
higher than game #2 for 20 of those votes. Therefore, game #1 is placed
higher than game #2. Then you go to the combination of game #1 and game
#3, and so on.
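
A rough Python sketch of that pairwise tally, under stated assumptions: the ballot layout (a dict of voter to game scores), the function name `head_to_head`, and the sample ballots are all hypothetical, and ties are simply counted for neither game.

```python
from itertools import combinations

def head_to_head(ballots):
    """For each pair of games, count the voters who rated both and
    how many preferred each one. Ties count for neither game."""
    games = sorted({g for scores in ballots.values() for g in scores})
    results = {}
    for a, b in combinations(games, 2):
        both = [s for s in ballots.values() if a in s and b in s]
        a_wins = sum(1 for s in both if s[a] > s[b])
        b_wins = sum(1 for s in both if s[b] > s[a])
        results[(a, b)] = (len(both), a_wins, b_wins)
    return results

# Made-up ballots: not every voter rates every game.
ballots = {
    "v1": {"g1": 8, "g2": 5, "g3": 6},
    "v2": {"g1": 7, "g2": 9},
    "v3": {"g2": 4, "g3": 4},
    "v4": {"g1": 3, "g3": 6},
}
results = head_to_head(ballots)
for pair, (n, a_wins, b_wins) in results.items():
    print(pair, "rated by", n, "voters:", a_wins, "vs", b_wins)
```

Only voters who rated both games enter each comparison, so someone who played ten games never drags down a game they did not play.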

Some consideration must be taken, however: What happens if the ratings
are as follows (simplified for 5 voters):

Voter #   Game #1 rating   Game #2 rating
-------   --------------   --------------
   1             3                4
   2             5                6
   3             2                3
   4             7                2
   5             9                1

In this scenario, voters 1, 2, and 3 rated game #2 higher by 1 point.
Voters 4 and 5 rated game #1 higher by lots of points. The average score
for game #1 is about 5. The average score for game #2 is about 3. 5 is
higher than 3, but three voters are more than two voters. See the
problem? Furthermore, what if voters 1-3 were voting on all 35 games,
while voters 4 and 5 were voting on only 10? Or what if that was
reversed? Should one person who *really* liked the first game better than
the second count more than two people who slightly preferred the second
over the first?
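
The conflict is easy to check in a few lines of Python. This is just the five-voter table above run through both methods; the variable names are made up.

```python
# Ratings from the five-voter table: (game #1, game #2) per voter.
ratings = [(3, 4), (5, 6), (2, 3), (7, 2), (9, 1)]

mean_1 = sum(r[0] for r in ratings) / len(ratings)
mean_2 = sum(r[1] for r in ratings) / len(ratings)
prefer_1 = sum(1 for a, b in ratings if a > b)
prefer_2 = sum(1 for a, b in ratings if b > a)

print(mean_1, mean_2)      # 5.2 3.2 -> game #1 wins on average score
print(prefer_1, prefer_2)  # 2 3     -> game #2 wins on head count
```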

So maybe we should take the number of games between the two. Include half
of the games rated the same, for convenience (so, for example, if voter #1
had given three other 3's and five other 4's, you'd say that game #1
was ranked five lower than game #2). You then might have a chart like:

Voter #   Games between #1 and #2
-------   -----------------------
   1        -5
   2        -2.5
   3        -1
   4        +5
   5        +4
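
One possible reading of "include half of the games rated the same" is a difference of midranks, where tied games share the average of the positions they occupy. The Python sketch below assumes that reading; the function names and the names of the "other" games are hypothetical. It does reproduce the -5 quoted for voter #1.

```python
def midrank(scores, game):
    """Position of `game` in the voter's ordering from worst to best,
    with tied games sharing the average of the positions they occupy."""
    s = scores[game]
    below = sum(1 for v in scores.values() if v < s)
    tied = sum(1 for v in scores.values() if v == s)  # includes `game`
    return below + (tied + 1) / 2

def games_between(scores, a, b):
    """Signed rank gap: negative when `a` sits below `b`."""
    return midrank(scores, a) - midrank(scores, b)

# Voter #1 from the example: game #1 gets a 3, game #2 gets a 4,
# plus three other 3's and five other 4's (the other names are made up).
voter1 = {"game1": 3, "game2": 4}
voter1.update({f"other3_{i}": 3 for i in range(3)})
voter1.update({f"other4_{i}": 4 for i in range(5)})

gap = games_between(voter1, "game1", "game2")
print(gap)  # -5.0
```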

But wait! Should we scale these values based on the total number of games
voted on? And what about the person who played one great game and eight
crappy ones, and so gave a 10 and eight 2's? Maybe we should scrap the
concept of 'games between' and instead weigh the difference (as in the
previous chart) by how many games the voter had played. This might yield
something like:

Voter #   (#1 - #2) * games rated
-------   -----------------------
   1      (3 - 4) * 35 = -35
   2      (5 - 6) * 30 = -30
   3      (2 - 3) * 20 = -20
   4      (7 - 2) * 10 = +50
   5      (9 - 1) *  6 = +48
                         -----
          Total: 98 - 85 = +13

This time game #1 has prevailed--but it was close! Had voter #5 given
the games a 7 and 2 like voter #4, game #2 would have prevailed by 5.
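
For anyone who wants to check the sums, here is the weighted chart as a small Python sketch. The tuple layout and the name `weighted_margin` are assumptions for illustration; the numbers are the ones in the chart.

```python
# (game #1 rating, game #2 rating, games rated) for each voter in the chart.
votes = [(3, 4, 35), (5, 6, 30), (2, 3, 20), (7, 2, 10), (9, 1, 6)]

def weighted_margin(votes):
    """Sum of (rating difference) * (number of games the voter rated)."""
    return sum((g1 - g2) * n for g1, g2, n in votes)

margin = weighted_margin(votes)  # positive means game #1 leads
# Variant: voter #5 gives the games a 7 and a 2 instead of a 9 and a 1.
alt_margin = weighted_margin(votes[:4] + [(7, 2, 6)])

print(margin)      # 13
print(alt_margin)  # -5
```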

Anyway, the upshot is that it's complex and controversial. For this year,
at least, I think we'd better just stick with the arithmetic mean, which,
honestly, will probably yield exactly the same results as this complex
formula. I might ask ct if he could send me the votes (just the numbers,
with game # instead of title, say) to see if I could fool around and write
a program to analyze them in one of the above ways, and see if it helped,
or made much of a difference.

Wow this was long.

-Lucian "Lucian" Smith

David A. Cornelson

unread,
Nov 22, 1997, 3:00:00 AM11/22/97
to

Actually, the current system is flawed from the beginning because we
inherently allow the judges to rate two or more games equal when in fact
our quest is to differentiate them, e.g., "this one sucks", "this one
REALLY sucks", instead of both of them being rated as, "putrid waste of
time and unbelievably bad spelling".

We should force voters to rank the games from 1 to 34 and give each
progressive rank the opposite number of points like they do in the college
football rankings (which seems entirely fair and accurate to me).

Therefore, for each first place vote, a game would receive 34 points and
for each second place vote, 33 points, and so on.
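
A minimal Python sketch of that points scheme, assuming each judge submits a complete best-to-worst list. The function name `rank_points` and the three-game sample rankings are made up to keep the example short.

```python
def rank_points(rankings, n_games=34):
    """Each ranking lists games from best to worst; first place earns
    n_games points, second place n_games - 1, and so on."""
    totals = {}
    for ranking in rankings:
        for place, game in enumerate(ranking, start=1):
            totals[game] = totals.get(game, 0) + (n_games - place + 1)
    return totals

# Made-up rankings of three games by three judges, for brevity.
rankings = [["A", "B", "C"], ["B", "A", "C"], ["A", "C", "B"]]
totals = rank_points(rankings, n_games=3)
print(totals)  # {'A': 8, 'B': 6, 'C': 4}
```

Since last place still earns one point, a game's total can never drop below the number of judges who ranked it.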

On the side, if you were to receive fewer points than the number of judges
for your entry, this would be bad.

The 1 to 10 thing is probably not the best way, although I believe we
should follow some of Lucian's strategies in order to ferret out the
exactly right and truly favorable solution.

I give this discussion a 9.43847387483, but the Tempest one on arts was
probably a 9.43847387484.

Den of Iniquity

unread,
Nov 23, 1997, 3:00:00 AM11/23/97
to

On Thu, 20 Nov 1997, ct wrote:

> Matthew T. Russotto wrote:
>> ct wrote:
>> }Steve Bernard wrote:
>> }> give scores from 1 to 10 in whole numbers.
>> }                           ^^^^^^^^^^^^^^^^
>> }Can I just emphasise this, just in case anyone else wants to annoy me
>> }with fractional votes? Thankyou, I knew I was up to it.
>> Bah, now I'm going to have to feed my ratings through some sort of
>> mu-law thing to compress their dynamic range 10:1 while still
>> retaining appropriate separations. Yuck.
>
>I think you need to ingest more caffeine. Or pop more valium. Maybe both?

Ccccccccccccccccccacccacccncan iiiii jjjj-j-j-j-jj-justtttttt
recccccommmendddddd thatt-t-tt-t
yy-y-yoouu d-don'tt't't
b 'b 'b 'both.
t-t-t-tttry

--
D'D'Dddddddd


Den of Iniquity

unread,
Nov 23, 1997, 3:00:00 AM11/23/97
to

On Fri, 21 Nov 1997, Adam Cadre wrote:

>Michael Straight wrote:
>> Maybe I *want* Adam's opinions to count more in the overall standings
>> than mine.
>
>Oh, if only everyone thought the way you did...

Then I'd withdraw my game from such a dreadfully fixed competition
immediately!

Oh hang on, I didn't enter.

Hum.

--
Den


Den of Iniquity

unread,
Nov 23, 1997, 3:00:00 AM11/23/97
to

On 21 Nov 1997, Lucian Paul Smith wrote:
[snip]
>Wow this was long.

Phew, it's a good job there are so many prizes and no really big ones or
there'd be no end of debate.

--
Den


Nick

unread,
Nov 29, 1997, 3:00:00 AM11/29/97
to

>
> We should force voters to rank the games from 1 to 34 and give each
> progressive rank the opposite number of points like they do in the college
> football rankings (which seems entirely fair and accurate to me).

There is the problem that if I think that game A is vastly better than
game B, which in turn is only slightly better than game C, I cannot express
this under your system.
Maybe this is unimportant, but information is lost by simple rankings.
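
A tiny Python sketch of the point, with made-up scores: two ballots with very different gaps collapse to the same ranking.

```python
def to_ranking(scores):
    """Order games from best to worst, discarding the size of the gaps."""
    return sorted(scores, key=scores.get, reverse=True)

# Two made-up ballots with very different gaps between games.
vastly_better = {"A": 10, "B": 3, "C": 2}
barely_better = {"A": 10, "B": 9, "C": 8}

same = to_ranking(vastly_better) == to_ranking(barely_better)
print(to_ranking(vastly_better), same)  # ['A', 'B', 'C'] True
```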
Nick

