Why does the T5 baseline for ToTTo not match the result in <<Text-to-Text Pre-Training for Data-to-Text Tasks>>?

Knight Zhang

Mar 9, 2021, 7:54:19 AM
to gem-benchmark
Hello everyone,

I noticed that the paper <<Text-to-Text Pre-Training for Data-to-Text Tasks>>, which used T5-3B on the ToTTo dataset, achieved a BLEU of 49.5. In GEM, however, the BLEU score is 42.2, which is much lower, even lower than the BERT-to-BERT baseline from the original paper <<ToTTo: A Controlled Table-To-Text Generation Dataset>>.

Can anyone tell me the reason? Is GEM using a smaller T5 model, or did they use the whole table as input instead of only the highlighted cells?
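To make the second question concrete, here is a minimal sketch of the two input formats being contrasted. The `linearize` function and the `<cell>` markup are illustrative assumptions, not the exact ToTTo or GEM preprocessing; the point is only that serializing the whole table versus only the highlighted cells gives the model very different inputs, which could plausibly move BLEU by several points.

```python
# Hypothetical sketch: whole-table vs highlighted-cells linearization.
# Field markup and function names are illustrative, not the actual
# ToTTo/GEM preprocessing code.

def linearize(table, highlighted=None):
    """Flatten table cells into a single text string for a T5-style model.

    If `highlighted` (a set of (row, col) indices) is given, only those
    cells are serialized; otherwise the whole table is.
    """
    parts = []
    for r, row in enumerate(table):
        for c, cell in enumerate(row):
            if highlighted is None or (r, c) in highlighted:
                parts.append(f"<cell> {cell} </cell>")
    return " ".join(parts)

table = [["Name", "Year"], ["Alice", "2020"], ["Bob", "2021"]]

# Whole table as input (all six cells serialized).
full_input = linearize(table)

# Only the highlighted cells as input (here, row 1 only).
sub_input = linearize(table, highlighted={(1, 0), (1, 1)})
```

If one system trains on `full_input` and another on `sub_input`, their scores are not directly comparable even with the same model size.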

Thanks a lot!