Dear Organizers,
We have observed that the new evaluation model 'gemma-2b-it' fails to produce evaluation scores in the format required by the evaluation prompt.
In our baseline test during the testing phase, only 1 out of 100 responses contained the expected "#thescore: " marker; the remaining responses yielded invalid scores.
For example, the model typically outputs only "#score", or refuses to respond at all.
This has caused significant problems for our experiments. Could you please look into whether it can be fixed? That would help our experiments greatly.
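In case it helps with diagnosis, below is a minimal sketch of the lenient extraction we can use as a client-side workaround (the function name and the exact tag variants are our assumptions, not part of the official evaluation code). It tolerates "#score"-style tags as well as the expected "#thescore: " marker, but it cannot recover responses that contain no number at all, such as refusals:

```python
import re

def extract_score(text):
    """Try to recover an integer score from a judge response.

    Accepts the expected "#thescore: N" format, but also tolerates
    variants such as "#score: N" or "#score N" (hypothetical variants
    we have seen in gemma-2b-it outputs). Returns None when no score
    can be recovered, e.g. when the model refuses to answer.
    """
    match = re.search(r"#(?:the)?score\s*:?\s*(\d+)", text)
    return int(match.group(1)) if match else None
```

Even with this workaround, most gemma-2b-it responses remain unscorable, which is why we are raising the issue with you.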