Questions on the dataset

Yanan Cai

Sep 12, 2023, 5:36:00 PM

to unlearning-challenge

Hello,

If we understand correctly, the dataset used for scoring the public and private leaderboards will not be made public. Will you publish a sample dataset with a similar distribution for testing?

We think it's important to first look at some statistics on the dataset used for unlearning, and having access to the data would be more realistic for a real-world unlearning scenario. We understand that we can still develop a solution by using another dataset and getting feedback from notebook submissions, but it would be great to learn more about the motivation behind keeping the dataset hidden and not providing a sample.

Another question on the dataset:

- From the distribution, most images belong to class 0. Does class 0 refer to "unknown age" or a real age range?

Thanks!

etriantafillou

Sep 13, 2023, 10:25:02 AM

to unlearning-challenge

Hi,

Thank you for the questions!

No, we won't publish any dataset at this point. There are several reasons behind this decision, including the sensitive nature of this data, as well as our desire to prevent overfitting, as is common in code competitions. Given this, we won't reveal any additional information at this time.