Dear AXOLOTL'24 participants,
We have updated the Russian training and development datasets once
again, making them cleaner and more consistent.
The most important changes:
1) Old 19th-century spelling in definitions and examples is now
(automatically) converted to modern Russian orthography. For example,
"Нѣтъ, по струнамъ" becomes "Нет, по струнам". We hope this will focus
the task on semantics rather than on the peculiarities of historical
orthography.
2) Some erroneous sense ids are fixed.
3) Many OCR and parsing errors in definitions and usage examples are
manually fixed.
4) Redundant instances are removed.
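For those curious what the conversion in item 1 roughly involves, here is
a minimal Python sketch. It is not the script we used; the names
`modernize`, `LETTER_MAP` and `FINAL_HARD_SIGN` are made up for
illustration, and it only covers the letters removed by the 1918 reform
and the word-final hard sign. Other changes, such as the old adjective
endings (-аго, -яго), are not handled here.

```python
import re

# Hypothetical mapping: letters dropped by the 1918 reform -> modern letters.
# Escapes are used to avoid confusion with look-alike Latin characters.
LETTER_MAP = str.maketrans({
    "\u0463": "е", "\u0462": "Е",  # yat (ѣ, Ѣ)
    "\u0456": "и", "\u0406": "И",  # decimal i (і, І)
    "\u0473": "ф", "\u0472": "Ф",  # fita (ѳ, Ѳ)
    "\u0475": "и", "\u0474": "И",  # izhitsa (ѵ, Ѵ)
})

# A hard sign at the very end of a word (preceded by a letter) is dropped;
# a hard sign inside a word, as in "объявление", is left untouched.
FINAL_HARD_SIGN = re.compile(r"(?<=\w)[ъЪ]\b")


def modernize(text: str) -> str:
    """Apply a simplified pre-reform -> modern orthography conversion."""
    text = text.translate(LETTER_MAP)
    return FINAL_HARD_SIGN.sub("", text)


print(modernize("Нѣтъ, по струнамъ"))  # prints: Нет, по струнам
```

If your own external resources still use the old spelling, a
normalization step along these lines may help align them with the
updated datasets.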
The same changes were applied to the held-out test set, so there will be
no surprises when we publish it on March 25.
This week, we will finalize our Codalab instance (where you will submit
your test set predictions) and announce it on this mailing list.
Good luck!
> ------------------------------------------------------------------------
> *From:* axolo...@googlegroups.com <axolo...@googlegroups.com> on
> *Sent:* 09 March 2024 18:40:19
> *To:* AXOLOTL-24
> *Subject:* [axolotl] updates to Russian datasets