Dataset creation for GEM in local language

A Shvets

Jun 7, 2021, 5:59:31 AM
to gem-benchmark
Dear GEM organizers,

The management of the company I work for is interested in creating an open-source dataset in French, similar to CommonGen. I would therefore like to ask what requirements this dataset must meet in order to be included in the GEM benchmark.

Thank you in advance for your answer,
Anna

Sebastian Gehrmann

Jun 7, 2021, 8:55:27 AM
to A Shvets, gem-benchmark
Hi Anna, 

Our plan is to have a yearly selection process for GEM (similar to our initial one) that decides which datasets we want to focus on. This year's process will start after the workshop in August. While I do not dictate the criteria, I assume that we will prefer datasets with interesting and challenging test splits, good documentation, and non-English languages.

Besides that, even if a dataset is not selected for the main challenges, it is fairly easy to make a task compatible with our evaluation framework and with our framework for creating challenge sets. This can be done independently (and we are happy to help).

Best
Sebastian
