About Track I evaluation

Zhen Xiang

Oct 22, 2024, 10:58:34 PM
to clas2024-updates
Dear Participants,

We have noticed some ambiguities regarding the Track I evaluation. Here are some clarifications.

1. There are two models used to test the jailbreak prompts you submit: one is Gemma-2B-it; the other is not released.

2. There is a third model used for evaluation/judging, which is also not released.

3. The maximum number of injected tokens is 100, as measured by the Gemma-2B-it tokenizer. If a prompt exceeds this limit, it will receive a zero score; this will not affect the evaluation of your other prompts. (A sketch for checking the count locally follows after this list.)

4. We have cleared the leaderboard and restored the submission attempts for all teams.
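
For convenience, here is a minimal sketch of one way to check the injected-token count locally before submitting. It assumes the limit applies to the injected text alone, encoded without special tokens; the exact counting convention used by the evaluation pipeline may differ slightly.

```python
# Minimal sketch: count injected tokens with the Gemma-2B-it tokenizer.
from transformers import AutoTokenizer

MAX_INJECTED_TOKENS = 100

# Loading this tokenizer requires accepting the Gemma license on HuggingFace.
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2b-it")

def injected_token_count(injected_text: str) -> int:
    # Assumption: only the tokens of the injected text itself are counted,
    # without special tokens (BOS/EOS).
    return len(tokenizer.encode(injected_text, add_special_tokens=False))

prompt = "..."  # your injected text here
n = injected_token_count(prompt)
print(f"{n} injected tokens; within limit: {n <= MAX_INJECTED_TOKENS}")
```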

Please feel free to let us know if you have any further questions. Thanks again for your participation.

Best,
Organizers

Avinaash Anand K.

Oct 23, 2024, 12:25:44 AM
to Zhen Xiang, clas2024-updates
Dear Organizing Team,

Thank you for the recent clarification regarding Track I evaluation. We have a specific query regarding point #3 (token limit enforcement):

1. Measured with the Gemma tokenizer on HuggingFace, our submitted prompts averaged approximately 93 added tokens, within the specified 100-token limit.

2. However, in the evaluation results for Model 1 (Gemma-2B-it), our scores consistently showed a pattern suggesting that all jailbreak scores were set to 0. Specifically, the final scores were approximately 0.16*S(M), which would occur only if added_token_count > 100 (see the sketch after this list).

3. Interestingly, this pattern is not present in the results for the unreleased model.
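
To make the inference in point 2 concrete, here is a small sketch under an assumed scoring form; the formula and the 0.84 weight are our guesses, with only the 0.16 coefficient on S(M) taken from the observed results.

```python
# Hypothetical reconstruction (assumption, not the official scoring rule):
#     score = w_j * J + 0.16 * S_M
# where J is the jailbreak score and S_M is the stealthiness term S(M).
def combined_score(J: float, S_M: float, w_j: float = 0.84) -> float:
    # w_j = 0.84 is a guessed weight chosen so the two terms sum to 1.
    return w_j * J + 0.16 * S_M

S_M = 0.90  # made-up example value for S(M)
print(combined_score(J=0.0, S_M=S_M))  # 0.144 = 0.16 * S(M): the pattern we see
print(combined_score(J=0.5, S_M=S_M))  # 0.564: what any nonzero J would yield
```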

Given this discrepancy, we wonder whether the token-counting issue previously identified during the development phase might have resurfaced in the Gemma-2B-it evaluation pipeline.

We would greatly appreciate your guidance on this matter.

Thanks,
Participants