Question about Track 1


xxxcs

Oct 25, 2024, 8:47:53 PM
to clas2024-updates
Dear Organizers,

Sorry to bother you, but I would like to express a concern about the black-box setting for the testing phase, which seems to make the results depend heavily on luck, or on who can guess the hidden models, rather than on the algorithm itself.

I understand that the black-box setting is meant to prevent double jailbreaking, but double jailbreaking can easily be detected by checking the winners' code and responses. The problems caused by the black-box setting seem harder to fix than double jailbreaking itself.

Thank you in advance for considering my concern.

Thanks,
Participants

xxxcs

Oct 25, 2024, 9:23:23 PM
to clas2024-updates
Sorry for the repeated email. More specifically, the white-box setting creates a fair environment in which teams' algorithms can be compared directly. However, I am concerned that the black-box setting can easily create an unfair competition, one that depends on luck, information gaps, randomness from multiple factors, and so on. I appreciate the efforts made by the CLAS team for this wonderful competition, but I also want to express my concern for your consideration.

Zhen Xiang

Oct 25, 2024, 10:20:25 PM
to clas2024-updates
Dear Participants,

Thank you for your participation and for providing your thoughts. In the testing phase, we keep the evaluation model unreleased to avoid double jailbreaking.

As for the held-out model, its purpose is to ensure the transferability of the attack. We expect participants to consider this important attack property during attack development. This policy was announced at the beginning of the development phase, and at this stage we will not change it. Thank you for your understanding.

Best,
Organizers

xxxcs

Oct 25, 2024, 10:53:39 PM
to clas2024-updates
Thank you for your consideration. I understand the policy, but the black-box setting does cause further issues: for example, it can create an unfair environment, which makes the competition's results less convincing (since they depend on luck, information gaps, and other unknown factors). I do not think I am alone in this view.

xxxcs

Oct 25, 2024, 11:05:42 PM
to clas2024-updates
Also, due to bias in the judging mechanism, the black-box setting is not guaranteed to evaluate transferability correctly.

Zhen Xiang

Oct 25, 2024, 11:09:00 PM
to clas2024-updates
Dear Participants,

We appreciate your participation and also understand your disappointment about your current ranking. Our point is that the policy was announced before the development phase. If you had concerns about the policy, there was an opportunity to discuss them before submissions for the testing phase closed. At this stage, it is too late for us to change the policies.

We also want all participants to know that the held-out model and the unreleased evaluation model are known to only a very few members of the organizing team. We can guarantee that we did not release this information to any of the teams.

Best,
Organizers

xxxcs

Oct 26, 2024, 12:48:58 AM
to clas2024-updates

I fully understand your point and appreciate your efforts. I wonder if the judging mechanism could be released after the competition, so that I can determine whether the problem originates from bias in the black-box judging mechanism or from the attack method itself. Thanks.

Zhen Xiang

Oct 26, 2024, 1:09:56 AM
to clas2024-updates
Dear Participants,

The evaluation model and the held-out model will both be released after the announcement of the winning teams.

Best,
Organizers
