some questions about the dataset

张澈

Dec 6, 2022, 1:50:03 AM

to valueeval

I have some questions about the dataset.

As you said in your last mail, the new dataset contains a test set without labels, and you will keep those labels secret until after the final submission deadline.

Do you mean that the final score is calculated from predictions on this test set submitted via TIRA? But what if someone labels the test set themselves and uses it as training data for their models?

Is this behavior allowed?

Waiting for your reply, thanks very much.

张澈

18181...@qq.com

Johannes Kiesel

Dec 6, 2022, 2:13:37 AM

to valu...@googlegroups.com

Hi,

That would generally be considered unfair behavior. Use the training and
validation sets for training and validation/optimization. The test set
should, as usual, show the performance of the approaches on unseen data.

Good that you asked!

Regards,
Johannes

(sorry for sending you this mail twice, did not notice you sent it to
the list)

--
Johannes Kiesel

Bauhaus-Universität Weimar
Bauhausstr. 9a, Room 106
99423 Weimar, Germany

Phone: +49 (0)3643 - 58 3720

Johannes Kiesel

Dec 6, 2022, 3:46:46 AM

to 张澈, valu...@googlegroups.com

You are right; that would be unfair. And very sad, especially since this
shared task is about human values.

We cannot prevent this if someone is set on cheating. We could
have done so by keeping the arguments from the test set hidden and thus
removing the possibility of uploading run submissions and forcing you
all to use Docker submissions to TIRA. Now, there are many good reasons
to go for Docker submissions, but I also know that for some
participants, it is much easier to upload a file. I think it is
irresponsible to make the life of people harder out of fear of someone
cheating.

And there is nothing gained from a team being first by cheating. If that
team cannot explain their good results in their paper, it will become
apparent pretty fast what they did. Especially if asked to open source
their code.

This is a *shared* task. We are working on this together, trying to find
out which approaches work and which do not. Of course, it is nice to
rank high on the leaderboard. But what will remain after SemEval is over
are the insights you all gathered. And a team that places last with some
exciting approach (and maybe some easy-to-fix but fatal problem) will
have more impact than the team in first place that just used
cutting-edge models. Of course, we will thus also highlight the exciting
approaches in the task overview paper that we organizers will write.

So, please do not worry about someone cheating. Instead, think about
clever and novel ideas for tackling the task.

And have a nice day!
Johannes

On 06.12.22 09:18, 张澈 wrote:
>
> I see. But if someone does this, labeling the test set themselves or
> with their team and using it for training, how will you prevent that? I
> think it's unfair.

