Question about Track 1 Testing Phase Released Model


Yiqi Yang

Oct 21, 2024, 12:54:04 AM
to clas2024-updates
Dear Organizers,
We observe that the new evaluation model 'gemma-2b-it' fails to output the evaluation score in the format required by the evaluation prompt.
In the baseline test for the testing phase, only 1 out of 100 responses contains "#thescore: "; the rest yield invalid scores.
For example, the model typically outputs only "#score" or refuses to respond at all.
This has caused us a lot of trouble. Could this be fixed? It would greatly help our experiments.
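
For readers who want to reproduce this kind of check, here is a minimal sketch of counting valid judge responses. It assumes a response is valid only if it contains the "#thescore:" marker followed by an integer; the regex and the function names are hypothetical and are not the competition's actual parser.

```python
import re
from typing import List, Optional

# Assumption: a valid response contains "#thescore:" followed by an integer.
SCORE_RE = re.compile(r"#thescore:\s*(\d+)")

def extract_score(response: str) -> Optional[int]:
    """Return the integer score if the marker is present, else None."""
    match = SCORE_RE.search(response)
    return int(match.group(1)) if match else None

def count_valid(responses: List[str]) -> int:
    """Count the responses from which a score can be extracted."""
    return sum(extract_score(r) is not None for r in responses)
```

Under this assumption, a response consisting only of "#score" or a refusal is counted as invalid.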

Thanks,
Participants

Zhen Xiang

Oct 21, 2024, 3:10:17 AM
to clas2024-updates
Dear Participants,

Thanks for your question. In the testing phase, the judging model will not be released. Please refer to our announcement for more information: https://www.llmagentsafetycomp24.com/getting-started/

Best,
Organizers
