On behalf of the co-organizers, I was hoping you could share this CFP for our NeurIPS workshop on "Human Evaluation of Generative Models" with your labs, departments, institution, and anyone else in your network. We would also love for you to submit your work in progress. The details of the workshop and submission instructions are as follows:
Website: https://humaneval-workshop.github.io/
Workshop Date: December 3, 2022
Submission: https://openreview.net/group?id=NeurIPS.cc/2022/Workshop/HEGM
Submission Deadline: September 15, 2022, 23:59 GMT
Contact: hegm-w...@lists.andrew.cmu.edu
Description:
Rapid advances in generative models for both language and vision have made
these models increasingly popular in both the public and private sectors.
For example, governments use generative models such as chatbots to better
serve citizens. As such, it is critical that we not only evaluate whether these
models are safe enough to deploy, but also ensure that the evaluation systems
themselves are reliable. Oftentimes, humans are used to evaluate these models.
Our goal is to call attention to the question of how best to perform reliable
human evaluations of generative models. Through this discussion, we aim to
highlight cutting-edge research and engage stakeholders in dialogue on
how to address these challenges from their perspectives. Critical considerations
of safe deployment include reproducibility and trustworthiness of an evaluation,
assessment of human-AI interaction when predictions lead to policy decisions,
and value-alignment of these systems.
In partnership with the Day One Project, the Federation of American Scientists'
impact-driven policy think tank that helps subject matter experts become
policy entrepreneurs, we will select a few papers with clear policy implications
and recommendations, invite their authors to write policy memos, and work to implement
those policy recommendations. Finally, we will capture the discussions that take place
during our panels in a paper summarizing the workshop's recommendations,
and we will seek to publish that work for the scholarly record.
Topics of interest include but are not limited to the following:
- Experimental design and methods for human evaluations
- Role of human evaluation in the context of value alignment of large generative models
- Designing testbeds for evaluating generative models
- Reproducibility of human evaluations
- Ethical considerations in human evaluation of computational systems
- Quality assurance for human evaluation
- Issues in the meta-evaluation of automatic metrics via correlation with human evaluations
- Methods for assessing the quality and the reliability of human evaluations
Organizers:
Divyansh Kaushik (Carnegie Mellon University)
Jennifer Hsia (Carnegie Mellon University)
Jessica Huynh (Carnegie Mellon University)
Yonadav Shavit (Harvard University)
Samuel R. Bowman (New York University)
Ting-Hao 'Kenneth' Huang (Penn State University)
Douwe Kiela (Hugging Face)
Zachary Lipton (Carnegie Mellon University)
Eric Smith (Facebook AI Research)
Important dates:
Submission deadline: September 15, 2022
Acceptance notifications: October 20, 2022
Camera-ready deadline: November 3, 2022
Workshop date: December 3, 2022