Hi Jannis, All,
Great to hear from you! Thanks for your
suggestion.
There are many different ways of
scoring the various arena elements, and one way we have tried in
the past, particularly in the RoboCupRescue Robot League, is
manually weighting by level of difficulty, as you suggest. Of
course we are open to refining this, as no scoring system is
perfect. Here is a bit of background, just so we know what we're
trying to optimize for and why we *mostly* moved away from that.
The issue with manually deciding on
weightings is that the relative difficulty of the arena elements
is subjective. Some comparisons are obvious: the single-level
hurdle is harder than the pinwheel ramps, so surely the
single-level hurdle should be worth more points. But beyond that,
the choices get harder. How many points are the stairs worth
relative to the double hurdle? What is pushing a button on
dexterity worth relative to sand and gravel? What is a QR code
worth? Unfortunately, in the past this degenerated into different
groups arguing that the thing they did really well was really
hard and worth more points than the thing they didn't do so well
in, and things could become generally unpleasant.
More worryingly, we have directly observed this becoming a
cultural issue: teams from cultures more accustomed to being
vocal dominated the debate and pushed the competition into
favoring their approach, while teams from cultures more averse to
confrontation tended to simply disappear.
So it became a priority to reduce the
need for such debates and instead have the weightings
automatically adjust based on what teams were doing well.
In the preliminaries, we solve this
problem by normalizing all scores to the highest performance in
that test. It's not a perfect solution, of course, but it has the
nice property of giving teams an incentive to really push
performance in something that might have been neglected by other
teams.
An important detail to note is in the
rulebook[1], page 13, "Setting", paragraph 3, which states:
"Tests with multiple settings are
considered separately for the purpose of scoring."
What this means is that the scores for
hurdles at 1 layer should never appear in the same column as the
scores for hurdles at 2 layers. It also means that a team that
can do hurdles at 2 layers still needs to do the hurdles at 1
layer, because as far as scoring is concerned, they are
completely separate tests. We don't consider this a waste of
time: perhaps at 1 layer it's a speed run, while at 2 layers it's
a really impressive feat of engineering and control. If only one
team can do even a single 2-layer hurdle, this can create an
incentive even stronger than the suggested doubling of 2-layer
hurdle scores[2].
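To make the normalization concrete, here is a minimal sketch of how it could be computed. The team names, raw scores, and helper function are all invented for illustration; the actual scoring tooling may look quite different:

```python
# Sketch of preliminary scoring: each (test, setting) pair is its own
# column, and every column is normalized to the best raw score in it.
# All names and numbers below are made up for illustration.

def normalize_columns(raw):
    """raw: {(test, setting): {team: raw_score}} -> same shape, with
    each column scaled so the best team in that column gets 1.0."""
    normalized = {}
    for column, scores in raw.items():
        best = max(scores.values())
        normalized[column] = {
            team: (score / best if best > 0 else 0.0)
            for team, score in scores.items()
        }
    return normalized

raw = {
    ("hurdles", "1 layer"): {"A": 10, "B": 8, "C": 4},
    ("hurdles", "2 layers"): {"A": 0, "B": 3, "C": 0},  # only B manages it
}

result = normalize_columns(raw)
# Team B gets the full 1.0 in the 2-layer column despite a low raw
# score there, which is the stronger-than-doubling incentive mentioned
# above when only one team can do the harder setting.
```

Note how the two settings never share a column, so B's 2-layer laps can't be diluted by anyone's 1-layer performance.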
We have deliberately chosen the number
of permutations of apparatus and setting to suit the number of
teams and slots at a typical world championship (although we are
also open to suggestions for permutations to add or drop). I do
realize that regional opens sometimes don't have as many slots
available, whether due to limited time or a shortage of
experienced judges. In such a situation, the organizers have to
choose between a few compromise options that include:
1: Removing some options (e.g., only
having the harder settings or removing the easier apparatuses
altogether).
2: Weighting the harder setting (as you
suggest - it sounds like this was the choice made at the German
Open).
3: Having the number of laps at the
harder setting also count as the number of laps at the easier
setting. This can be an issue if the harder setting really slows
the robots down.
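Option 3 is simple enough to sketch. This assumes one plausible reading of the rule, namely that each lap completed at the harder setting also adds a lap to the easier setting's count; the team names and lap counts are invented:

```python
# Sketch of compromise option 3: laps at the harder setting also count
# as laps at the easier setting. Assumes the "each hard lap adds an
# easy lap" reading; all data below is invented for illustration.

def effective_easy_laps(easy_laps, hard_laps):
    """Combine per-team lap counts: easy laps plus hard laps, since a
    hard-setting lap also demonstrates the easy-setting capability."""
    return {
        team: easy_laps.get(team, 0) + hard_laps.get(team, 0)
        for team in set(easy_laps) | set(hard_laps)
    }
```

The caveat in option 3 shows up here: a team that runs only the slow, harder setting accumulates fewer total laps, so its easy-setting count can end up lower than if it had simply run the easier setting.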
Of course, none of these options is
ideal, and we recommend avoiding them if at all possible, but if
a regional competition is slot-limited, there may be no ideal
solution.
The finals are a different type of
competition and are actually intended to be a race. According to
the original plan, team places were to be decided by who finished
all of the arena elements first; points only matter if no one
finishes all of the arena elements. So as originally conceived,
it doesn't matter if some arena elements are easier than others:
all teams have to do them anyway, and if they're easy, all teams
simply spend less time doing them.
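The original ranking rule can be sketched as a single sort. The data structure and field names here are my own invention, and I've assumed non-finishers are ordered among themselves by points:

```python
# Sketch of the original finals ranking: teams that finish every arena
# element race on time; points only decide places among teams that
# don't finish everything. Field names are invented for illustration.

def rank_finals(teams):
    """teams: list of dicts with 'name', 'finished' (bool),
    'time' (seconds, lower is better), 'points' (higher is better)."""
    return sorted(
        teams,
        key=lambda t: (
            not t["finished"],                    # finishers rank first
            t["time"] if t["finished"] else 0.0,  # the race among finishers
            -t["points"],                         # points order the rest
        ),
    )
```

With this rule, an element's difficulty never needs a weight: it affects the outcome only through the time it costs every team.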
Not being able to traverse a particular
arena element in the finals (or only being able to traverse it in
one direction) carries a double penalty. First, those points
aren't available to the team, so they already can't win (unless
no team finishes). Second, the team wastes time having to detour
around that element.
So in a sense, the finals already give
us a natural 'normalization' that double-penalizes teams for not
being able to do something. We don't need to add complication by
accounting for how much each element is worth, and this also
avoids the aforementioned arguments about the relative difficulty
of each arena element.
We can still get arguments around how
the arena is laid out. This is where the folks administering the
competition need to make sure that all teams who reached the
finals have a say in the arena design and that all teams are able
to advocate for their interests, regardless of their background.
I'm particularly open to suggestions for improving this part of
the competition!