service and dialog order in SGD dataset?

Michael White

Mar 30, 2021, 7:57:12 PM

to gem-benchmark

Hi folks

A couple questions re the schema guided dialogue dataset:

1. It appears that the service is missing from the input. However, the service appears to be essential in some cases to the meaning of what is to be generated. For example, NOTIFY_SUCCESS (which has no arguments) means something different depending on whether the service is renting a car vs. transferring funds etc., which can be seen in the varying references for this act. Without the service in the input, it's impossible to reliably get a sensible output. Can this be added to the datasets?
2. The training set has turns from complete dialogues in order, but the validation and test sets do not (they contain just isolated turns). Kale & Rastogi's EMNLP paper has a section where they experiment with adding context (up to several turns) to the input and find that this can increase BLEU scores. The current setup makes it impossible to experiment with adding more context to the input. Can that be changed? Or can one of the test sets have this condition?

Note that it does appear to be possible to use the dialog and turn IDs to link back to the original SGD dataset, so in principle it would be possible for participants to do these augmentations. Not sure whether that is supposed to be allowed in the shared task though.
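
For anyone who wants to try that linking, here is a minimal sketch of what it might look like. It assumes the original SGD dialogues are loaded as a list of dicts with `dialogue_id`, `services`, and `turns` keys, and that the turn ID is a zero-based index into `turns`; both are assumptions to check against the actual files, and the helper name `lookup_service_and_context` and the toy dialogue are made up for illustration.

```python
from typing import Dict, List, Tuple


def lookup_service_and_context(
    dialogues: List[Dict],
    dialog_id: str,
    turn_id: int,
    max_context: int = 3,
) -> Tuple[List[str], List[str]]:
    """Return (services, preceding utterances) for one isolated turn.

    Assumes turn_id is a zero-based index into the dialogue's turns
    (an assumption, not something confirmed in this thread).
    """
    by_id = {d["dialogue_id"]: d for d in dialogues}
    dialogue = by_id[dialog_id]
    # Keep at most max_context utterances immediately before the target turn.
    start = max(0, turn_id - max_context)
    context = [t["utterance"] for t in dialogue["turns"][start:turn_id]]
    return dialogue["services"], context


# Toy dialogue, invented for illustration only.
toy_dialogues = [
    {
        "dialogue_id": "toy_0",
        "services": ["RentalCars_1"],
        "turns": [
            {"speaker": "USER", "utterance": "I need a rental car."},
            {"speaker": "SYSTEM", "utterance": "Where do you want to pick it up?"},
            {"speaker": "USER", "utterance": "In Phoenix."},
            {"speaker": "SYSTEM", "utterance": "Your car is reserved."},
        ],
    }
]

services, context = lookup_service_and_context(toy_dialogues, "toy_0", 3, max_context=2)
```

Building the ID lookup once outside the function would be cheaper when augmenting a whole split; it is rebuilt on every call here only to keep the sketch short.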

Thanks!

Mike

Sebastian Gehrmann

Mar 31, 2021, 10:14:03 AM

to Michael White, gem-benchmark

Hi Mike,

Thank you for the questions! Adding the service is indeed something we should do.

For the second question, we had quite a few discussions about adding context. The current plan is to show the full dialog in the human evaluation, but to keep modeling as easy as possible by not requiring it. However, if you prefer to model the context as well, we can add it back into the representation alongside the service.

Best

Sebastian

--
You received this message because you are subscribed to the Google Groups "gem-benchmark" group.
To unsubscribe from this group and stop receiving emails from it, send an email to gem-benchmar...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/gem-benchmark/5ecb203f-40fd-43af-a838-2c2cfa05c506n%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Michael White

Mar 31, 2021, 10:52:43 AM

to Sebastian Gehrmann, gem-benchmark

Hi Sebastian

Thanks for the quick response! I think it would be great to make both the service and dialog context available in the input. Kale & Rastogi's paper does not include any analysis of why the context is apparently helpful, so it would be good to encourage research in that direction.

Best

Mike

Sebastian Gehrmann

Mar 31, 2021, 10:54:17 AM

to Michael White, gem-benchmark

Hi Mike,

I'll add both and will update this thread once they are integrated in our data loader(s).

Best

Sebastian

gehrmann

Apr 8, 2021, 9:19:05 AM

to gem-benchmark

Hi Mike,

The latest Hugging Face Datasets version now includes two extra fields for schema-guided dialog: 'context' and 'service'. We propagated the changes through all of the challenge sets as well.
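
As a rough illustration of how the two new fields could feed into a model input, here is a minimal sketch. Only 'context' and 'service' are named in this thread; the `dialog_acts` field, its `act`/`slot`/`values` shape with `act` as a plain string, the separator format, and the helper name `build_input` are all assumptions for illustration, so check them against the loaded examples before relying on them.

```python
from typing import Dict


def build_input(example: Dict, max_context: int = 2) -> str:
    """Linearize one example into a single input string.

    'service' and 'context' are the fields mentioned in the thread;
    'dialog_acts' and its layout are assumed for this sketch.
    """
    acts = []
    for da in example.get("dialog_acts", []):
        slot = da.get("slot", "")
        values = ", ".join(da.get("values", []))
        acts.append(f"{da['act']}({slot}={values})" if slot else f"{da['act']}()")

    # Guard against max_context == 0, since a [-0:] slice would keep everything.
    turns = example.get("context", [])
    context = turns[-max_context:] if max_context > 0 else []

    parts = [f"service: {example['service']}"]
    if context:
        parts.append("context: " + " | ".join(context))
    parts.append("acts: " + " ".join(acts))
    return " ; ".join(parts)


# Toy example, invented for illustration only.
toy_example = {
    "service": "Banks_1",
    "context": ["I want to send money.", "How much?", "Fifty dollars."],
    "dialog_acts": [{"act": "NOTIFY_SUCCESS", "slot": "", "values": []}],
}

with_context = build_input(toy_example)
without_context = build_input(toy_example, max_context=0)
```

Putting the service first means that an argument-free act such as NOTIFY_SUCCESS is no longer ambiguous between, say, a car rental and a funds transfer.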

I am working now on propagating the changes to TFDS as well.

Best

Sebastian

Michael White

Apr 8, 2021, 1:39:42 PM

to gehrmann, gem-benchmark

Awesome 😎 will check it out!

Many thanks

Mike
