Final call for papers and shared task submissions for the Workshop on Generation, Evaluation, and Metrics (GEM) at ACL ’21
Update April 22: Our paper submission deadline has been extended to May 3! Please submit your papers at this SoftConf link. The shared task submission deadline is May 14.
Call for Participation
Natural language generation is one of the most active research fields in NLP. As such, the number of available datasets, metrics, models, and evaluation strategies is rising rapidly. Consequently, new models are often evaluated on different Anglo-centric tasks with incompatible evaluation setups. With GEM, we aim to tackle this problem by standardizing and improving the corpora on which NLG models are evaluated, and by supporting the development of better evaluation approaches. In our shared task, models will be applied to a wide set of NLG tasks. The shared task covers challenges that measure specific generation aspects, such as content selection and planning, surface realization, paraphrasing, simplification, and others. To avoid hill-climbing on automated metrics, a second part of the shared task focuses on an in-depth analysis of submitted model outputs across both human and automatic evaluation, with the aim of uncovering shortcomings and opportunities for progress. The GEM Workshop is a SIGGEN-endorsed event.
The shared task is described in depth here: https://gem-benchmark.com/shared_task.
It includes two parts:
In the first part, participants are encouraged to apply their model to as many of the included tasks as possible and submit their formatted outputs. We provide GEM-specific test sets that will be used to evaluate specific generation aspects.
In the second part, all submitted and baseline outputs will be released for an evaluation shared task. Participants can submit analyses and evaluations of the model outputs.
During the GEM workshop, shared task participants will come together to discuss their findings, which will inform future iterations of GEM.
Call for Papers
All papers are allowed unlimited space for references and appendices. For papers associated with the shared task, we additionally highly encourage publishing the code used to generate the results. We ask for papers in the following categories:
- System Descriptions
Participants of the modeling shared task are invited to submit a system description of 4-8 pages.
- System Evaluation Descriptions
Participants of the evaluation shared task are invited to submit a paper of 4-8 pages describing their analysis approach and findings.
- Research Papers
We welcome papers discussing any of the following topics:
Automatic evaluation of NLG systems
Creating challenge sets for NLG corpora
Critiques of benchmarking efforts (including ours)
Crowdsourcing strategies to improve the inclusiveness of NLG research
Measuring progress in NLG / What should a GEM 2.0 look like
Modeling and data-augmentation strategies for training effective and/or efficient NLG systems that can be applied to a wide range of tasks
Standardizing human evaluation and making it more robust
We additionally invite every group that contributed to the creation and organization of GEM to submit a description of their considerations and contributions.
These submissions can take either of the following forms:
Archival Papers: Papers describing original and unpublished work can be submitted in either a short (4-page) or a long (8-page) format.
Non-Archival Abstracts: To discuss work already presented or under review at a peer-reviewed venue, we allow the submission of 2-page abstracts.
Please note that we are not looking for submissions that focus on specific modeling challenges or introduce new model architectures, which would fit better into conferences like ACL or INLG.
All submissions should conform to ACL 2021 style guidelines. Archival long and short paper submissions must be anonymized. Abstracts and shared task submission descriptions should include author information. Please submit your papers at the SoftConf link.
✅February 2 First Call for Shared Task Submissions and Papers, Release of the Training Data
May 3 (extended from April 26) Workshop Paper Due Date (excl. shared tasks)
May 28 Notification of Acceptance (excl. shared tasks)
June 7 Camera-ready papers due (excl. shared tasks)
Shared Task Dates
✅February 2 Release of the training data
✅March 29 Release of the test sets
May 14 Modeling submissions due
✅April 2 (moved from March 29) Release of the baseline outputs
May 17 Release of the submission outputs
System Descriptions and Analyses
June 11 System Descriptions and Analyses due
June 25 Notification of Acceptance (shared task)
July 9 Camera-ready papers and task descriptions due
August 5-6 Workshop Dates
The workshop is organized by
Antoine Bosselut (Stanford University)
Esin Durmus (Cornell University)
Varun Prashant Gangal (Carnegie Mellon University)
Sebastian Gehrmann (Google Research)
Yacine Jernite (Hugging Face)
Laura Perez-Beltrachini (University of Edinburgh)
Samira Shaikh (UNC Charlotte)
Wei Xu (Georgia Tech)