Call for Participation: The Third Shared Task on Explanation Regeneration at TextGraphs-15

Alexander Panchenko

Mar 4, 2021, 6:32:53 AM
to AIST Conference - Analysis of Images, Networks and Texts
We invite participation in the 3rd Shared Task on Explanation Regeneration
associated with the 15th Workshop on Graph-Based Natural Language Processing
(TextGraphs-15) co-located with NAACL-2021: https://competitions.codalab.org/competitions/29228 

All systems participating in the shared task will be invited to submit system description papers.
Each system description paper will be peer-reviewed by two other participating teams and will be presented as a poster at the main workshop: https://sites.google.com/view/textgraphs2021.

# Overview

Multi-hop inference is the task of combining more than one piece of information 
to solve an inference task, such as question answering.  This can take many 
forms, from combining free-text sentences read from books or the web, to 
combining linked facts from a structured knowledge base.  The Shared Task 
on Explanation Regeneration asks participants to develop methods that 
reconstruct large explanations for science questions, using a corpus of 
gold explanations that provides supervision and instrumentation for this 
multi-hop inference task.  Each explanation is represented as an 
"explanation graph", a set of atomic facts (between 1 and 16 per explanation, 
drawn from a knowledge base of 9,000 facts) that, together, form a detailed 
explanation of the reasoning required to answer and explain a question.

Explanation Regeneration is a stepping-stone towards general multi-hop inference 
over language.  In this shared task we frame explanation regeneration as a 
ranking task, where the inputs to a given system consist of questions and their 
correct answers. Participating systems must then rank atomic facts from a 
provided semi-structured knowledge base such that the combination of top-ranked 
facts provides a detailed explanation for the answer.  This requires combining 
scientific and common-sense/world knowledge with compositional inference.
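
To make the ranking setup concrete, a minimal (unofficial) baseline might score 
every fact in the knowledge base against the concatenated question and answer 
text, e.g. with TF-IDF cosine similarity.  The snippet below is only a sketch 
under those assumptions; the facts shown are placeholders for the ~9,000-fact 
knowledge base:

```python
# Sketch of a minimal TF-IDF ranking baseline (not the official baseline):
# score each knowledge-base fact against the question + answer text and
# sort facts by cosine similarity.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

facts = [
    "a girl means a human girl",
    "humans are living organisms",
    "an apple is a kind of fruit",
    # ... the real knowledge base contains ~9,000 facts
]
question = "Which of the following is an example of an organism taking in nutrients?"
answer = "a girl eating an apple"
query = question + " " + answer

vectorizer = TfidfVectorizer().fit(facts + [query])
fact_vectors = vectorizer.transform(facts)
query_vector = vectorizer.transform([query])

# Higher cosine similarity -> fact is ranked closer to the top.
scores = cosine_similarity(query_vector, fact_vectors)[0]
for i in sorted(range(len(facts)), key=lambda i: -scores[i]):
    print(f"{scores[i]:.3f}  {facts[i]}")
```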

While large language models (BERT, ERNIE) achieved the highest performance in 
the 2019 and 2020 shared tasks, substantially advancing the state-of-the-art 
over previous methods, absolute performance remains modest, highlighting the 
difficulty of generating detailed explanations through multi-hop reasoning.

# New for the 2021 edition

Many-fact multi-hop inference is challenging because there are often multiple 
ways of assembling a good explanation for a given question.  This 2021 
instantiation of the shared task focuses on the theme of determining relevance 
versus completeness in large multi-hop explanations.  To this end, this year 
we include a very large dataset of approximately 250,000 expert-annotated 
relevancy ratings for facts ranked highly by baseline language models from 
previous years (e.g. BERT, RoBERTa).

Submissions using a variety of methods (graph-based or otherwise) are 
encouraged.  Submissions that evaluate how well existing models designed for 
2-hop multi-hop question answering datasets (e.g. HotpotQA, QASC) perform 
at many-fact multi-hop explanation regeneration are also welcome.

# Example

For example, for the question: "Which of the following is an example of an
organism taking in nutrients?" with the correct answer: "a girl eating an
apple", an ideal system would rank the following explanatory statements
at the top of its ranked list of facts:

1. A girl means a human girl.
2. Humans are living organisms.
3. Eating is when an organism takes in nutrients in the form of food.
4. Fruits are kinds of foods.
5. An apple is a kind of fruit.
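
Earlier editions of the shared task (2019 and 2020) scored such rankings with 
Mean Average Precision (MAP) against the set of gold explanation facts; the 
mean of the per-question value over the test set gives the final score.  As a 
minimal illustration (the fact identifiers are made up), average precision for 
one question can be computed as:

```python
def average_precision(ranked_fact_ids, gold_fact_ids):
    """Average precision of one ranked fact list against the gold explanation."""
    gold = set(gold_fact_ids)
    hits, precision_sum = 0, 0.0
    for rank, fact_id in enumerate(ranked_fact_ids, start=1):
        if fact_id in gold:
            hits += 1
            precision_sum += hits / rank  # precision measured at each gold hit
    return precision_sum / len(gold) if gold else 0.0

# Gold explanation: facts f1, f3, f5; system ranking: f1, f2, f3, f4, f5.
print(average_precision(["f1", "f2", "f3", "f4", "f5"], ["f1", "f3", "f5"]))
# (1/1 + 2/3 + 3/5) / 3 ≈ 0.756
```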

# Important Dates

* 2021-02-15: Training data release
* 2021-03-10: Test data release; Evaluation start
* 2021-03-24: Evaluation end
* 2021-04-01: System description paper deadline
* 2021-04-13: Deadline for reviews of system description papers
* 2021-04-15: Author notifications
* 2021-04-26: Camera-ready description paper deadline
* 2021-06-11: TextGraphs-15 workshop

# Data

The data used in this shared task contains approximately 5,100 questions drawn
from the AI2 Reasoning Challenge (ARC) dataset (Clark et al., 2018), together
with explanation sentences for their correct answers drawn from the
WorldTree V2 corpus (Xie et al., 2020; Jansen et al., 2018).  For 2021, this
has been augmented with a new set of approximately 250k expert-generated relevancy
ratings.

The knowledge base supporting these questions contains approximately 9,000 facts.
To encourage a variety of solving methods, the knowledge base is available both as
plain-text sentences (unstructured) as well as semi-structured tables. Facts are a
combination of scientific knowledge as well as common-sense/world knowledge.
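
To get a feel for the semi-structured variant, one can flatten each table row 
back into a plain-text fact.  The sketch below assumes one tab-separated file 
per table under a `tables/` directory, with metadata columns marked by 
bracketed headers; both conventions are assumptions about WorldTree-style 
files, not a specification of the official release:

```python
import csv
import glob

# Hypothetical layout: one TSV file per table, one fact per row.
# Metadata columns are assumed to have bracketed headers (e.g. "[SKIP]");
# non-metadata cells are joined into a sentence-like fact string.
facts = []
for path in glob.glob("tables/*.tsv"):  # assumed directory name
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f, delimiter="\t")
        header = next(reader)
        keep = [i for i, h in enumerate(header) if not h.startswith("[")]
        for row in reader:
            cells = [row[i].strip() for i in keep if i < len(row) and row[i].strip()]
            if cells:
                facts.append(" ".join(cells))

print(f"Loaded {len(facts)} facts from semi-structured tables")
```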

Please see the shared task website for more details:
https://competitions.codalab.org/competitions/29228