Dear AXOLOTL participants,
The test phase evaluation results are now publicly available in our GitHub repository:
https://github.com/ltgoslo/axolotl24_shared_task/tree/main/results
There are three leaderboards:
1) scores for Subtask 1 evaluated with ARI
2) scores for Subtask 1 evaluated with F1
3) scores for Subtask 2 evaluated with BLEU and BERTScore
The teams are ranked by the average of their scores across all three
languages, including the surprise language (Finnish, Russian, and
German). This average is shown in the first column, `Fi-Ru-De`. For
convenience, we also show the average scores without the surprise
language in the `Fi-Ru` column. The remaining columns show the results
for the individual languages.
Note that for Subtask 1 we keep separate leaderboards for ARI and F1,
since these metrics focus on very different aspects of the task, and it
does not make sense to average across them.
For Subtask 2, we average across BLEU and BERTScore, since they aim to
measure the same aspect of the task; see the sketch below.
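To make the averaging concrete, here is a minimal Python sketch of how
the leaderboard columns are computed. The per-language scores are
made-up placeholders, not actual results, and the real evaluation
scripts may differ in details:

```python
# Minimal sketch of the leaderboard averaging (placeholder scores only).
subtask2 = {
    "Fi": {"BLEU": 0.30, "BERTScore": 0.85},
    "Ru": {"BLEU": 0.25, "BERTScore": 0.80},
    "De": {"BLEU": 0.20, "BERTScore": 0.78},  # De is the surprise language
}

# Subtask 2: BLEU and BERTScore measure the same aspect,
# so they are averaged into one score per language first.
per_lang = {lang: sum(m.values()) / len(m) for lang, m in subtask2.items()}

# Ranking column `Fi-Ru-De`: mean over all three languages.
fi_ru_de = sum(per_lang.values()) / len(per_lang)
# Convenience column `Fi-Ru`: mean without the surprise language.
fi_ru = (per_lang["Fi"] + per_lang["Ru"]) / 2

print(f"Fi-Ru-De: {fi_ru_de:.3f}  Fi-Ru: {fi_ru:.3f}")
```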
We are happy to answer any questions, but first: congratulations on
your great submissions! We are excited to read through your papers and
see you at the LChange workshop in August.
On behalf of the other organizers,
--
Andrey
Language Technology Group (LTG)
University of Oslo