Dear all,
Thank you so much for participating in the shared task. We will release the official rankings next week, once we have confirmed that we have the highest-scoring submission from each team.
The evaluation/test data with the gold-standard labels is now available on the website and linked below. Per the competition description, submissions were scored using Pearson's correlation with the 'Overall' column. If a 'pair_id' appeared in the submitted data twice, only its first score was used to compute the correlation. We hope this data will be useful for evaluating your models, for example by producing breakdowns for different language pairs.
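For teams who want to reproduce the official score locally, the procedure above can be sketched as follows. The column names 'pair_id' and 'Overall' come from the task description; the name of the submission's prediction column ('score') is an assumption for illustration.

```python
import pandas as pd

def score_submission(submission: pd.DataFrame, gold: pd.DataFrame) -> float:
    """Sketch of the official scoring: Pearson's correlation between the
    submitted predictions and the gold 'Overall' column.

    NOTE: the prediction column name 'score' is an assumed placeholder,
    not an official column name from the task.
    """
    # If a pair_id appears more than once, keep only its first score.
    submission = submission.drop_duplicates(subset="pair_id", keep="first")
    # Align predictions with the gold labels by pair_id.
    merged = gold.merge(submission, on="pair_id", how="inner")
    # Pearson's correlation between predictions and gold scores.
    return merged["score"].corr(merged["Overall"], method="pearson")
```

The same merged frame can be grouped (e.g. by a language-pair column, if present in the gold data) to produce per-pair breakdowns.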
Please save the file with a .csv extension:
We're excited to see your papers and code. We strongly encourage everyone to submit a paper regardless of your score, and please also consider sharing your code on GitHub or elsewhere.
Best wishes,
Scott on behalf of all task organizers