Multisentence context in WI+locness dataset

Yoav Katz

Feb 20, 2019, 8:54:48 AM

to BEA 2019 Shared Task: Grammatical Error Correction

Hi,

I understand the shared task is evaluated at the sentence level. However, the WI+locness M2 files are generated from corrections of complete essays, so some corrections may depend on previous sentences.

For example:

Original text (A.dev.json):
{"text": "Susan is a little dragon. Her skin is colored red and green, red dots ontop of the green to be more precise. She does that every day after school. Of course, she has also a little brother. His skin is colored red, just like the father.", "userid": "23782", "id": "1-159067", "cefr": "A1.ii", "edits": [[0, [[59, 60, ";"], [70, 75, "on top"], [162, 170, "also has"], [224, 227, "their"], [234, 234, "'s"]]]]}
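For readers unfamiliar with the JSON layout, here is a minimal sketch of how the character-offset edits in this record could be applied to the essay text. It assumes each entry in "edits" is [annotator_id, [[start, end, replacement], ...]], as in the record quoted here; the function name `apply_char_edits` and the handling of a null replacement are my own assumptions, not part of any official tool.

```python
# The record quoted in this post, written as a Python dict.
record = {
    "text": "Susan is a little dragon. Her skin is colored red and green, "
            "red dots ontop of the green to be more precise. She does that "
            "every day after school. Of course, she has also a little brother. "
            "His skin is colored red, just like the father.",
    "edits": [[0, [[59, 60, ";"], [70, 75, "on top"], [162, 170, "also has"],
                   [224, 227, "their"], [234, 234, "'s"]]]],
}


def apply_char_edits(text, edits):
    """Apply [start, end, replacement] character-offset edits to text."""
    # Work from right to left so that earlier offsets remain valid.
    for start, end, replacement in sorted(edits, key=lambda e: (e[0], e[1]), reverse=True):
        if replacement is None:
            # Assumption: a null replacement marks an error with no correction given.
            continue
        text = text[:start] + replacement + text[end:]
    return text


annotator_id, edits = record["edits"][0]
corrected = apply_char_edits(record["text"], edits)
print(corrected)
```

With this record, the last sentence comes out as "His skin is colored red, just like their father's."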

M2 file (A.dev.gold.bea19.m2):

S His skin is colored red , just like the father .
A 8 9|||R:DET|||their|||REQUIRED|||-NONE-|||0
A 10 10|||M:NOUN:POSS|||'s|||REQUIRED|||-NONE-|||0
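To make the token offsets in this M2 block concrete, here is a rough sketch that applies the A lines to the S line. It assumes the usual M2 field order (span, error type, correction, then three further fields ending in the annotator id) and that "noop" edits use the span -1 -1; `apply_m2_edits` is a hypothetical helper, not the official scorer.

```python
def apply_m2_edits(block, annotator="0"):
    """Apply the A-line edits of one M2 block (token offsets) to its S line."""
    lines = block.strip().splitlines()
    tokens = lines[0][2:].split()  # drop the leading "S "
    edits = []
    for line in lines[1:]:
        if not line.startswith("A "):
            continue
        fields = line[2:].split("|||")
        start, end = map(int, fields[0].split())
        correction, who = fields[2], fields[-1]
        # Skip other annotators and "noop" edits (assumed to use span -1 -1).
        if who != annotator or start < 0:
            continue
        edits.append((start, end, correction.split()))
    # Apply from right to left so earlier token offsets stay valid.
    for start, end, replacement in sorted(edits, key=lambda e: e[:2], reverse=True):
        tokens[start:end] = replacement
    return " ".join(tokens)


m2_block = """S His skin is colored red , just like the father .
A 8 9|||R:DET|||their|||REQUIRED|||-NONE-|||0
A 10 10|||M:NOUN:POSS|||'s|||REQUIRED|||-NONE-|||0"""

print(apply_m2_edits(m2_block))
```

For the block above this gives the tokenized sentence "His skin is colored red , just like their father 's ."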

At the sentence level, the natural correction is "His skin is colored red, just like his father's." I'm unsure how common this situation is.

In general, do you plan to address corrections to the dataset?

Yoav Katz
ka...@il.ibm.com
IBM Research

BEA 2019 Shared Task Organisers

Feb 20, 2019, 9:43:46 AM

to BEA 2019 Shared Task: Grammatical Error Correction

Hi Yoav,

You're absolutely right that some corrections depend on previous sentences. I'm also not aware of any work on anaphor errors in GEC, so this is definitely something we'd like to encourage.

Although this may not be easy in M2 format, since we lose the information about where each essay begins and ends, I think it'd still be reasonable to take just the context of the previous sentence into account to resolve many of these cases.
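As a rough illustration of the previous-sentence idea, the sketch below pairs every S line in an M2 file with the S line before it. The helper name `sentences_with_context` is hypothetical. Because the M2 file carries no essay boundaries, the first sentence of each essay will be paired with the last sentence of the preceding essay, so a system should treat the context as a hint only.

```python
def sentences_with_context(m2_text):
    """Yield (previous_sentence, sentence) pairs from the S lines of an M2 file."""
    previous = None  # the very first sentence has no context
    for line in m2_text.splitlines():
        if line.startswith("S "):
            sentence = line[2:]
            yield previous, sentence
            previous = sentence


sample = """S Of course , she has also a little brother .
A 4 6|||R:WO|||also has|||REQUIRED|||-NONE-|||0

S His skin is colored red , just like the father .
A 8 9|||R:DET|||their|||REQUIRED|||-NONE-|||0
"""

for context, sentence in sentences_with_context(sample):
    print(repr(context), "->", sentence)
```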

As for corrections to the dataset, I actually think the example you gave is already correct. The "father" is the father of both Susan and her brother, so he's "their" father. ;)

More generally, however, we are unlikely to make any corrections to the dataset. GEC is known to be a highly subjective task, so while there are almost certainly some genuine errors in the data, others probably come down to personal preference. We will try to account for this in the test data by having multiple sets of annotations per sentence.

Yoav Katz

Feb 21, 2019, 10:10:11 AM

to BEA 2019 Shared Task: Grammatical Error Correction

I agree that a sentence's context could be interesting in real GEC scenarios, so the problem could be defined to include it.

To facilitate this, and to make training and evaluation more robust, perhaps you could consider adding empty lines between sentences from different contexts. This could also be done in the M2 file with an 'S' line with no additional text. It would be up to the user to decide if and how to use the context to improve the correction process.

Without a clear indication of the context scope, some strange artifacts could be introduced by mixing the contexts of two unrelated pieces of text.
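To show what I mean, here is a small sketch of how a reader could group M2 blocks into essays if such a marker existed. The bare "S" line is only the proposal made in this message, not something present in the released files, and `split_essays` is a hypothetical helper.

```python
def split_essays(m2_text):
    """Group M2 sentence blocks into essays, splitting on a bare 'S' line."""
    essays, current = [], []
    for block in m2_text.strip().split("\n\n"):
        block = block.strip()
        if block == "S":  # hypothetical boundary marker proposed in this thread
            if current:
                essays.append(current)
            current = []
        elif block:
            current.append(block)
    if current:
        essays.append(current)
    return essays


sample = """S Susan is a little dragon .

S She does that every day after school .

S

S This sentence starts a different essay .
"""

for number, essay in enumerate(split_essays(sample), start=1):
    print(number, len(essay))
```

A system could then reset its context at each essay boundary instead of carrying a sentence over from an unrelated text.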

Yoav

On Wednesday, February 20, 2019 at 16:43:46 UTC+2, BEA 2019 Shared Task Organisers wrote:

BEA 2019 Shared Task Organisers

Feb 21, 2019, 5:16:49 PM

to BEA 2019 Shared Task: Grammatical Error Correction

So you're right that M2 format doesn't make it easy to handle cross-sentence errors, but we also don't really want to change the format halfway through the shared task when people have already been training systems.

It's also not so easy to simply add an empty S line between the texts in each file as each corpus has a different raw format, so multiple scripts would need to be changed. We'd then also have to arrange to upload new files with our distributors (we're not hosting all the corpora ourselves).

I'm also not convinced it's worth it because I'd estimate only about 3% of all errors are anaphor errors, and would expect most of them can be resolved by checking either the same or previous sentence at most. Still, nobody has actually looked at this as far as I know, so by all means let us know if you find lots of strange artefacts!

Chris

Yoav Katz

Feb 25, 2019, 8:35:37 AM

to BEA 2019 Shared Task: Grammatical Error Correction

I understand, and I agree it is the right approach at this stage of the task.

writeto...@gmail.com

Mar 6, 2019, 5:42:21 AM

to BEA 2019 Shared Task: Grammatical Error Correction

Will sentences in test data be randomly shuffled?

BEA 2019 Shared Task Organisers

Mar 6, 2019, 7:32:27 PM

to BEA 2019 Shared Task: Grammatical Error Correction

No.

We will only shuffle the test data at the essay level, e.g. B essay, C essay, A essay, N essay, A essay.

This is to make sure you don't know the level of the essays (e.g. ordered ABCN as in the dev data). The sentences in each essay will not be shuffled.
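The kind of shuffling described here can be sketched as follows. This is only an illustration and not the script used for the test data; the (level, sentences) tuples, the function name `shuffle_essays` and the seed are all hypothetical.

```python
import random


def shuffle_essays(essays, seed=0):
    """Return the essays in random order; sentences inside each essay keep their order."""
    shuffled = list(essays)  # copy so the input list is left untouched
    random.Random(seed).shuffle(shuffled)
    return shuffled


# Each essay is a (level, sentences) pair, ordered A, B, C, N as in the dev data.
essays = [
    ("A", ["A sentence 1", "A sentence 2"]),
    ("B", ["B sentence 1", "B sentence 2"]),
    ("C", ["C sentence 1", "C sentence 2"]),
    ("N", ["N sentence 1", "N sentence 2"]),
]

for level, sentences in shuffle_essays(essays):
    print(level, sentences)
```

Only the outer list is permuted, so consecutive sentences from the same essay stay adjacent and in their original order.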
