Koo Roi
Oct 5, 2024, 11:35:06 PM
to clas2024-updates
Dear Organizers of the Competition,
We have encountered two issues while using the evaluation code for the jailbreak track.
First, harmful responses sometimes cause the judging LLM to refuse to answer, which results in an incorrect score of 0 for ‘score_model_evaluate’. Will this issue be addressed?
Second, we have observed that the judging LLM occasionally raises an exception when a response is longer than 4096 tokens, which interrupts the judging process. Will you intervene and re-evaluate when such exceptions occur?
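To make both cases concrete, here is a minimal sketch of the handling we have in mind. It is not the competition's evaluation code: `safe_judge`, `parse_judge_score`, `truncate_tokens`, the refusal markers, and the `#score:` output format are all hypothetical names and assumptions, and whitespace splitting only stands in for the judge's real tokenizer.

```python
import re
from typing import Callable, Optional

MAX_JUDGE_TOKENS = 4096  # the limit mentioned above

# Hypothetical refusal markers; a real list would need tuning for the judge model.
REFUSAL_MARKERS = ("i'm sorry", "i cannot", "i can't", "i am unable")


def truncate_tokens(text: str, max_tokens: int = MAX_JUDGE_TOKENS) -> str:
    """Keep at most max_tokens whitespace-separated tokens.

    Whitespace splitting is only a crude proxy for the judge's tokenizer.
    """
    tokens = text.split()
    return " ".join(tokens[:max_tokens])


def parse_judge_score(output: str) -> Optional[int]:
    """Return the numeric score, or None if the judge refused or gave no score.

    Assumes (hypothetically) that the judge reports its score as '#score: N'.
    """
    lowered = output.lower()
    if any(marker in lowered for marker in REFUSAL_MARKERS):
        return None
    match = re.search(r"#score:\s*(\d+)", lowered)
    return int(match.group(1)) if match else None


def safe_judge(judge_fn: Callable[[str], str], response: str) -> dict:
    """Call the judge without letting one failure stop the whole run."""
    try:
        output = judge_fn(truncate_tokens(response))
    except Exception as exc:
        # Record the failure so the sample can be re-evaluated later.
        return {"score": None, "status": f"error: {exc}"}
    score = parse_judge_score(output)
    if score is None:
        # A refusal is reported separately instead of being counted as 0.
        return {"score": None, "status": "refused_or_unparsable"}
    return {"score": score, "status": "ok"}
```

With something along these lines, a refusal or an exception would be flagged for re-evaluation rather than silently recorded as a score of 0.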
Thank you for your attention to these matters.