The evaluation phase of the PARSEME shared task 1.2 on semi-supervised identification of verbal MWEs has just started!
We have released the blind test data for all 14 languages on our public Gitlab repo:
You can also use the larger unannotated corpora available here (they are allowed in the closed track as well):
This year's focus is on unseen VMWEs: the general ranking will emphasize results on unseen VMWEs.
The deadline for the submission of results has been extended to July 6 (anywhere in the world).
Results submission is to be made on the MWE-LEX softconf page:
Results must be a single compressed archive ("zip") with one folder per language, named according to the
2-letter language code (e.g. GA/ for Irish).
Each output must be named test.system.cupt and conform to the .cupt format, which you can check with the validation script:
./validate_cupt.py --input test.system.cupt
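For concreteness, the packaging steps above could be sketched as follows; the language codes (GA, PL), the placeholder file contents, and the location of validate_cupt.py are all assumptions, to be replaced with your own system's outputs and a local checkout of the validator.

```shell
# Hypothetical sketch: lay out one folder per language (2-letter code),
# each containing a single test.system.cupt prediction file.
mkdir -p submission/GA submission/PL
printf '# your system output in .cupt format\n' > submission/GA/test.system.cupt
printf '# your system output in .cupt format\n' > submission/PL/test.system.cupt

# Check each file with the shared task's validator before packaging
# (assumes validate_cupt.py from the PARSEME repo is in the current directory):
#   for f in submission/*/test.system.cupt; do ./validate_cupt.py --input "$f"; done

# Build the single compressed archive to upload
# (here via Python's stdlib zipfile command-line interface):
python3 -m zipfile -c submission.zip submission/
```

The resulting submission.zip contains one folder per language, as required; remember to build a separate archive per track if you enter both.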
If you participate in both the closed and open tracks, please make distinct submissions for each.
Each team can submit 2 results per track, i.e. at most 4 in total (with one result per language in each submission).
It is not mandatory to cover all languages, but if you do not, your macro-averages will not be comparable to those of other systems.
Subscribe and use the participants' mailing list if you find a bug or if you have questions:
Agata, Ashwini, Bruno, Carlos, Jakub, Marie