Query about Track I Evaluation


Koo Roi

Oct 5, 2024, 11:35:06 PM
to clas2024-updates
Dear Organizers of the Competition,
We have encountered two issues while using the evaluation code for the jailbreak track.
First, sometimes the harmful responses cause the judging LLM to refuse to respond, resulting in an incorrect score of 0 for ‘score_model_evaluate’. Will this issue be addressed?
Second, we have observed that the judging LLM occasionally experiences exceptions if responses are longer than 4096 tokens, which interrupts the judging process. Will you intervene and re-evaluate when such exceptions occur?
Thank you for your attention to these matters.

Zhen Xiang

Oct 7, 2024, 2:30:08 PM
to clas2024-updates
Dear Participants,

Thank you for your question. For the first question, we will use a more powerful LLM for judging, which will significantly alleviate this issue. For the second question, we will set a limit on the evaluation length during the testing phase.

Best,
Organizers
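
For readers following the thread, the two issues above (a judge refusal being scored as 0, and an exception on responses over 4096 tokens) could be guarded against locally along these lines. This is only a sketch under stated assumptions, not the competition's evaluation code: `truncate_response`, `parse_judge_score`, `safe_judge`, and the refusal phrases are hypothetical, and whitespace splitting stands in for whatever tokenizer the real judge uses.

```python
import re
from typing import Callable, Optional

MAX_EVAL_TOKENS = 4096  # length mentioned in the thread; the tokenization is assumed

# Hypothetical refusal phrases; a real judge may word its refusals differently.
REFUSAL_MARKERS = ("i'm sorry", "i cannot", "i can't", "i am unable")


def truncate_response(text: str, max_tokens: int = MAX_EVAL_TOKENS) -> str:
    """Keep at most max_tokens whitespace-separated tokens (a crude tokenizer stand-in)."""
    return " ".join(text.split()[:max_tokens])


def parse_judge_score(judge_output: str) -> Optional[int]:
    """Return the first integer in the judge output, or None if the judge refused."""
    lowered = judge_output.lower()
    if any(marker in lowered for marker in REFUSAL_MARKERS):
        return None  # a refusal is "no score", not a score of 0
    match = re.search(r"\d+", judge_output)
    return int(match.group()) if match else None


def safe_judge(response: str, judge: Callable[[str], str]) -> Optional[int]:
    """Truncate the response, call the judge, and return None instead of raising."""
    try:
        return parse_judge_score(judge(truncate_response(response)))
    except Exception:
        return None  # flag for re-evaluation rather than aborting the whole run
```

Returning `None` rather than 0 keeps refused or failed judgments distinguishable from genuinely low scores, so they can be retried or reported separately.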

SnigdhaChandan Khilar

Oct 8, 2024, 4:48:17 AM
to Zhen Xiang, clas2024-updates
Hi Organizers,

What is the total number of submissions a team can make during the development phase?

Zhen Xiang

Oct 8, 2024, 9:56:52 PM
to clas2024-updates
Dear Participants,

Each team may make at most 15 submissions per track during the development phase and at most 5 submissions per track during the testing phase.

Best,
Organizers

SnigdhaChandan Khilar

Oct 10, 2024, 2:21:23 AM
to Zhen Xiang, clas2024-updates
Hi Organizers,

My score on the leaderboard has not been updated since my second submission.
