Adjudicator Allocation


Deepak Jois

Nov 1, 2007, 8:21:21 PM
to tabbie...@googlegroups.com, wudc...@googlegroups.com
Hi Klaas
We are having some internal discussions here in the Worlds committee
regarding Adjudicator allocation, and how it can reduce the manual
allocation and shifting around of adjudicators as much as possible.

I sent you a mail off the list regarding this, but I thought it would
be a good idea if you could post your reply here for the benefit of
all.

Could you please elaborate on the basic principles behind the
Simulated Annealing algorithm you are using, and what it intends to
achieve? Do you have a doc lying around somewhere that I can refer
to? What do you think are the strong points of the algorithm you are
using as opposed to the default one that was used at NTU?

Deepak

Meir Maor

Nov 2, 2007, 8:00:05 AM
to tabbie...@googlegroups.com, wudc...@googlegroups.com
I am strongly in favor of using Simulated Annealing or a genetic algorithm
to perform adjudicator allocation.
In fact, in tournaments which allow factoring in more than just a few
factors, this is a good approach for debater allocation as well. (WUDC rules
are very strict, which limits the effectiveness of clever algorithms as far
as debater/team allocation is concerned.)

The main idea behind these approaches is that you define your
criteria for measuring an allocation and then try to maximize that measure.
Obviously, testing all possible options is not feasible, and therefore we
use a heuristic search approach to find something as near as possible to
the optimum (and quite possibly the true optimum) in a reasonable time.
Such approaches include simulated annealing and genetic algorithms.
For simulated annealing one needs to define an evaluation function for
the allocation, also referred to as the energy function.
In a genetic algorithm all you need is to define a function comparing
two allocations (this can be done by comparing an energy function or
otherwise).
In both cases you need to be able to find an allocation similar to a given one.
For a genetic approach you also need to define a way to splice
together two allocations to get a new allocation similar to both
previous ones.
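
To make the two ingredients above concrete (an energy function and a way to find a similar allocation), here is a minimal, generic simulated annealing loop. It is a sketch in Python rather than Tabbie's PHP, it treats lower energy as better, and every name in it (simulated_annealing, swap_two, mismatch, the cooling schedule and its defaults) is an illustrative assumption, not something taken from Tabbie.

```python
import math
import random


def simulated_annealing(initial, energy, neighbour, steps=10_000,
                        start_temp=10.0, cooling=0.999, rng=None):
    """Minimise energy(state); returns (best_state, best_energy)."""
    rng = rng or random.Random()
    current, current_e = initial, energy(initial)
    best, best_e = current, current_e
    temp = start_temp
    for _ in range(steps):
        candidate = neighbour(current, rng)
        candidate_e = energy(candidate)
        delta = candidate_e - current_e
        # Always accept improvements; accept a worse state with
        # probability exp(-delta / temp), which shrinks as temp cools.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_e = candidate, candidate_e
            if current_e < best_e:
                best, best_e = current, current_e
        temp *= cooling  # geometric cooling (an assumed schedule)
    return best, best_e


def swap_two(state, rng):
    """Neighbour: the same allocation with two positions swapped."""
    i, j = rng.sample(range(len(state)), 2)
    new = list(state)
    new[i], new[j] = new[j], new[i]
    return tuple(new)


def mismatch(state):
    """Toy energy: distance from 'strongest judge in the top room'."""
    target = sorted(state, reverse=True)
    return sum(abs(a - b) for a, b in zip(state, target))


start = (40, 90, 60, 75, 55)  # toy judge scores, one per room
best, best_energy = simulated_annealing(
    start, mismatch, swap_two, steps=2000, rng=random.Random(0))
```

The loop never needs to know what an allocation looks like; it only calls energy and neighbour, which is why the tuning effort goes into those two functions and the temperature schedule.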

I myself have a lot of experience with tuning genetic algorithms for
allocation problems,
though from academic reading simulated annealing should work well when
the algorithm parameters are tuned properly.

As for adjudicator allocations, one needs to decide what we wish to achieve.
We obviously want good adjudicators to be chairs, and we may want stronger
panels in the top rooms; this needs to be formalized. In some cases we would
like panels to rotate, and we would like adjudicators to meet different
teams and different adjudicators.
In the late rounds it is common to give young adjudicators showing
promise a chance at chairing in the weaker rooms; this is easily done
by strengthening the bias towards stronger panels in the top rooms in
later rounds.
We also usually avoid having adjudicators, and specifically chairs, judge
teams from their own university. We may want to build panels made up of
adjudicators from many universities and countries.
We should not fear adding multiple criteria; we should simply
weight them carefully.
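
One way to read "weight them carefully" is as a weighted sum of per-room penalties. The sketch below (Python, illustrative only) covers three of the criteria mentioned: own-university conflicts, an extra penalty when the chair is the one conflicted, and panels that repeat a university. The class names, weight keys and weight values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Adjudicator:
    name: str
    university: str


@dataclass(frozen=True)
class Room:
    team_universities: tuple  # one entry per team in the room
    panel: tuple              # Adjudicators; panel[0] is the chair


def room_energy(room, weights):
    """Weighted sum of penalties for one room; lower is better."""
    conflicts = sum(
        1
        for adj in room.panel
        for uni in room.team_universities
        if adj.university == uni
    )
    chair_conflicts = sum(
        1 for uni in room.team_universities
        if room.panel[0].university == uni
    )
    # Diversity: count panellists who repeat a university already on the panel.
    repeated = len(room.panel) - len({adj.university for adj in room.panel})
    return (
        weights["own_university"] * conflicts
        + weights["own_university_chair"] * chair_conflicts
        + weights["repeated_university"] * repeated
    )


def allocation_energy(rooms, weights):
    """Total energy of an allocation: the sum over all rooms."""
    return sum(room_energy(room, weights) for room in rooms)


weights = {"own_university": 10, "own_university_chair": 5,
           "repeated_university": 1}
teams = ("Uni A", "Uni B", "Uni C", "Uni D")
bad = Room(teams, (Adjudicator("p", "Uni B"), Adjudicator("q", "Uni A"),
                   Adjudicator("r", "Uni A")))
clean = Room(teams, (Adjudicator("x", "Uni E"), Adjudicator("y", "Uni F"),
                     Adjudicator("z", "Uni G")))
```

Adding a new criterion then means adding one more term and one more weight, which keeps the trade-offs between criteria visible in a single place.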

Me

Klaas van Schelven

Nov 2, 2007, 11:34:49 AM
to wudc...@googlegroups.com, tabbie...@googlegroups.com
Gentlemen (I believe Tabbing is not very rich in ladies),

I hope I can reply to quite a number of concerns at once here...

Firstly, the current algorithm actually uses Simulated Annealing. Meir, you're more than welcome to use your experience to tune or radically change the algorithm. This was very much a first attempt, which contains a number of important first ideas, but can also be extended a lot.

For Nerds, the algorithm can be inspected here:
http://tabbie.svn.sourceforge.net/viewvc/tabbie/trunk/draw/adjudicator/simulated_annealing.php?view=markup

For everybody, the algorithm is currently tuned directly in the code, right here:
http://tabbie.svn.sourceforge.net/viewvc/tabbie/trunk/draw/adjudicator/simulated_annealing_config.php?view=markup
I would love to add some options to make tuning easier, but haven't found the time to do so yet. Anyone who wants to take this up is more than welcome to do so.... I was thinking of both predetermined options ("early tournament", "last round") and the option to tune each value.

Besides a number of values and a big licence, you will see some explanations in this file. They should answer most of your questions. I will repeat the key points here:

university_conflict: Penalty for each team-adjudicator pairing where the adjudicator is from that team's uni.
    High values: Uni conflicts will occur less
    Low values (down to 0): Uni conflicts matter less

chair_not_perfect: Penalty for each chair of less quality than 100. Total penalty = penalty * (100 - real value)
    High values: The best adjudicators will all be chairs
    Low values: Having the best people in chair is not as important

panel_steepness: Value between 0 and 1, reflecting the relation between panel strength and debate strength.
    High values (up to 1): Debate strength strictly relates to panel strength.
    Low values: All debates are considered equal.
    Further remarks: Slowly increasing this value during the tournament is expected to have a positive effect on the tournament.

panel_strength_not_perfect: Penalty for distance to this 'ideal average'.
    High values: Emphasis on getting panels on the 'right strength'
    Low values: Not so much emphasis on getting panels on the 'right strength'

adjudicator_met_adjudicator: Penalty for adjudicators meeting each other again. This penalty is multiplied by the number of times these two already were in one panel together.
    High values: Adjudicators are not put in panels with previous co-panellists

University conflict and team conflict are set to high values, meaning the algorithm will try hard to steer away both from conflicts between adjudicators and their unis and from any scratches that have been manually entered.

The chair_not_perfect value is currently medium. I recommend boosting this value at the beginning of the tournament, so that no "low chairs" get into the chair position. At a later stage of the tournament, lower-ranked people should get into chair positions to allow higher-ranked judges to make it to the higher rooms as panellists, so lower the value later.

I would personally recommend keeping the different adjudicator_met_xxx values high enough, because it will just make for a much better tournament if you have adjudicators fairly mixed up.

Panel Steepness and Panel Strength Not Perfect can be increased slowly throughout the tournament, to give better debaters better judges as the tournament progresses.
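
As a rough illustration of how these values could enter the energy, here is a small Python sketch. Only the chair formula, penalty * (100 - real value), comes from the config comments quoted above. The way panel_steepness blends an "all rooms equal" target with a "strictly by rank" target, and the use of a plain absolute distance, are assumptions made for this example; the real code may well compute these differently.

```python
def chair_penalty(chair_score, chair_not_perfect):
    """From the config comment: penalty * (100 - real value)."""
    return chair_not_perfect * (100 - chair_score)


def ideal_panel_strength(overall_mean, strict_target, panel_steepness):
    """ASSUMPTION: linear blend between two extremes.

    panel_steepness = 0 -> every room should get the overall mean.
    panel_steepness = 1 -> every room should get its strict,
                           rank-based target strength.
    """
    return (1 - panel_steepness) * overall_mean + panel_steepness * strict_target


def panel_strength_penalty(panel_scores, ideal, panel_strength_not_perfect):
    """ASSUMPTION: weight times absolute distance from the ideal average."""
    actual = sum(panel_scores) / len(panel_scores)
    return panel_strength_not_perfect * abs(actual - ideal)


# Top room of a tournament whose adjudicators average 60, where a strict
# ranking would give this room a panel averaging 90 (made-up numbers).
early = ideal_panel_strength(60, 90, panel_steepness=0.0)
late = ideal_panel_strength(60, 90, panel_steepness=0.5)
```

Under this reading, raising panel_steepness round by round moves the target for the top room away from the overall mean and towards the strongest available panel, which matches the advice to increase it slowly during the tournament.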

@Meir: "An alternative approach to increasing the PHP timeout is to work asynchronously," => this is correct. I'm not sure how easy it would be to do this in PHP (Tabbie's current tech, which will probably stay its tech until Worlds), but I guess it could be done. An alternative setup would be to make an "improve_allocation.php", which runs for a short while (a couple of secs), loads from the DB at the start and writes back to the DB at the end, and is called in a loop by some Ajax/JavaScript code until someone hits the stop button.
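
The "short run, called in a loop" idea can be sketched as a time-boxed step: load the allocation, improve it until a small time budget runs out, save it, and return. The Python sketch below only illustrates that shape; load, save and improve_once are hypothetical callables standing in for the DB access and a single annealing step, and nothing here is Tabbie code.

```python
import time


def improve_allocation(load, save, improve_once,
                       budget_seconds=2.0, clock=time.monotonic):
    """Run improvement steps for roughly budget_seconds, then persist.

    Returns the number of steps performed so the caller can report progress.
    """
    state = load()
    deadline = clock() + budget_seconds
    steps = 0
    while clock() < deadline:
        state = improve_once(state)
        steps += 1
    save(state)
    return steps


# Deterministic demo with a fake clock that advances by 1 on every call.
store = {"state": 10}
ticks = iter(range(100))
steps_done = improve_allocation(
    load=lambda: store["state"],
    save=lambda s: store.update(state=s),
    improve_once=lambda s: s - 1,   # stand-in for one annealing step
    budget_seconds=5,
    clock=lambda: next(ticks),
)
```

Because each call starts from whatever was last saved, the caller can simply repeat the request until the user presses stop, and no single request needs to outlive the server's timeout.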

OK, so much for my answers so far. I'd love to answer some more questions, but please play with it yourself for a sec as well (if possible, also including manipulation of the different values). This should give you a good view of what already works, especially since the algorithm tells you quite a bit about what it thinks it did well and not so well.

I also have quite a wishlist (or TODO list). I really don't feel like doing too much work on Tabbie alone, since I've pretty much been the only one since Deepak and AK left the program 4 years ago....  but if someone else also jumps in I'll certainly put in some more work.

* Tuning the algorithm
* It's currently not possible for the panel-sizes to change from debate to debate. The panel-sizes are random (i.e. max. diff = 1) at the start, but only swaps are considered as of yet. A "swap" from a big to a small panel without swapping someone back should also be possible.
* It would be nice to be able to show the scores (i.e. "energies") also after the human changes have been made. If for nothing else, so that you can see if you've accidentally introduced conflicts (scratches)
* The asynch stuff Meir talked about
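
The one-way "swap" from the wishlist could look roughly like the move below: take one adjudicator out of a panel and add them to another, so that panel sizes can change. This is a Python sketch under simplifying assumptions: panels are plain lists of names, the function name shift_adjudicator is made up, any panel above a minimum size may donate (not only bigger ones), and chair handling is ignored.

```python
import random


def shift_adjudicator(panels, rng, min_size=1):
    """Return a copy of panels with one adjudicator moved to another panel."""
    new_panels = [list(p) for p in panels]
    donors = [i for i, p in enumerate(new_panels) if len(p) > min_size]
    if not donors or len(new_panels) < 2:
        return new_panels  # no legal move; allocation is unchanged
    src = rng.choice(donors)
    dst = rng.choice([i for i in range(len(new_panels)) if i != src])
    moved = new_panels[src].pop(rng.randrange(len(new_panels[src])))
    new_panels[dst].append(moved)
    return new_panels


panels = [["A", "B", "C"], ["D", "E"]]
shifted = shift_adjudicator(panels, random.Random(1))
```

Used next to the existing swap move as a second kind of neighbour, a move like this would let the search trade panel size against the other penalties instead of fixing sizes up front.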

--- further future ----
* protecting judges from going crazy by counting the number of binrounds they've had, and taking them out of the bins after a while.
* more serious analysis of making the break-chances and stuff.

And a final point (that I think I've discussed before, but will mention again with Worlds coming up):

It is a long-standing custom to put the best judges in the rooms surrounding the break in the last round. This makes no sense to me. Why? Because of the way the break itself is folded, it is much more important (for determining the best teams for the final) to have the break itself ordered in the right way than to have the right people at the bottom of the break, if you have to make that choice. I will elaborate if the point comes up again.

regards,
Klaas

Meir Maor

Nov 3, 2007, 1:29:08 PM
to tabbie...@googlegroups.com
A note on conflicts and other unwelcome events: I believe almost all
these parameters should be factored in in a non-linear fashion.
For example, consider adjudicator-university conflicts. Suppose we have two options:
in the first, we have two adjudicators each meeting a team from their
own respective university,
and in the second, we have two adjudicators from the same university
adjudicating a single team from that university.
We will obviously prefer the former.
We can do this by using a non-linear factor: we can use the conflict count
squared for each conflict to reflect this.
The same may apply for bin rooms. We declare the bottom N rooms (not
including the first and possibly the second round) as bin rooms, and for each
judge we count how many times he met a bin room, square this value,
and then combine it with a linear weight with
the rest of the parameters.
Though obviously certain adjudicators are prone to hit bin rooms: the
adjudicators who just
barely qualify as chairs (I have personal experience in this position).
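
The two options above can be worked through in a few lines of Python. The sketch assumes conflicts are counted per team and then raised to a power before being summed; the function name and the per-team counting are illustrative choices, not taken from Tabbie.

```python
def conflict_penalty(conflicts_per_team, weight=1.0, exponent=2):
    """Sum of (conflict count ** exponent) over teams, times a linear weight."""
    return weight * sum(count ** exponent for count in conflicts_per_team)


spread = [1, 1]    # two teams, each judged by one own-university adjudicator
stacked = [2, 0]   # one team judged by two own-university adjudicators

linear_spread = conflict_penalty(spread, exponent=1)    # 2.0
linear_stacked = conflict_penalty(stacked, exponent=1)  # 2.0
squared_spread = conflict_penalty(spread)               # 2.0
squared_stacked = conflict_penalty(stacked)             # 4.0
```

With a linear count the two options cost the same, so the search has no reason to prefer either. With the squared count the stacked option costs twice as much, which is the preference described above. The same function applied to per-judge bin-room counts gives the bin-room variant.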


Me.

Klaas van Schelven

Nov 3, 2007, 4:39:21 PM
to tabbie...@googlegroups.com
Hi,

Yes, I remember having a similar thought.... I think I've implemented it for one of the factors (as I remember, the difference between the ideal room average panel strength and the actual panel strength). I guess this should be propagated to more parameters....

Klaas