Dear LLM CTF Participants,
We hope the Defense phase is going well. We have a few important updates to share with you:
Bug Fixes and Budget Replenishment
We experienced a regression in the API from January 3rd 16:47 UTC+1 to January 8th 10:15 UTC+1. The new “Debug defense” feature caused multiple consecutive “assistant” messages to be sent to the model. This is now fixed and the chat works as before: only the final, filtered assistant message is part of the chat conversation.
We understand that this may have impacted your team’s ability to test and refine your strategies. To address this, we have replenished the team budgets affected during this period. All teams received an additional $3 for GPT-3.5 and $5 for Llama 2.
Extended Deadlines
Given the above, we are extending the deadlines by 2 days. The new important dates are as follows:
- Defense submission deadline: January 17th
- Reconnaissance phase begins: January 12th
- Evaluation phase begins: January 27th
- Evaluation and Reconnaissance phases deadline: March 2nd
- Winners announced: March 4th
All deadlines are in Anywhere on Earth (UTC-12).
Defense Submissions for Different Models
In response to community feedback, we have decided to allow defenders to submit separate defenses for the gpt-3.5-turbo and llama-2-70b-chat models. This corresponds better to the real-world scenarios this competition intends to investigate.
The /defense/{id}/submit endpoint will be adjusted accordingly in a few days to take the model for which the defense has been developed as an argument. You will receive an additional email when it’s ready. If you want to submit the same defense for both models, please create two separate (but identical) defenses with distinct IDs and call /defense/{id}/submit twice, once per defense, with a different model each time. Teams are free to adjust their strategies accordingly and submit different defenses for each model.
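For illustration only, the two-submission flow described above could be prepared roughly as follows. This is a minimal sketch under stated assumptions, not the final API: the base URL, the `model` parameter name, and the helper `build_submit_requests` are hypothetical, since the updated endpoint has not been released yet. The sketch only builds the request descriptions and sends nothing.

```python
# Hypothetical placeholder; the real API base URL is not given in this email.
BASE_URL = "https://example.invalid/api"

# Model names as stated in this announcement.
MODELS = ("gpt-3.5-turbo", "llama-2-70b-chat")


def build_submit_requests(defense_ids):
    """Build one submit request per model.

    defense_ids maps a model name to the ID of the defense submitted for it.
    Each model must use its own defense ID, even if the defenses are identical.
    """
    ids = list(defense_ids.values())
    if len(set(ids)) != len(ids):
        raise ValueError("each model needs a defense with a distinct ID")

    requests = []
    for model, defense_id in defense_ids.items():
        if model not in MODELS:
            raise ValueError(f"unknown model: {model}")
        requests.append({
            "url": f"{BASE_URL}/defense/{defense_id}/submit",
            # Assumption: the model is passed as a parameter named "model".
            "params": {"model": model},
        })
    return requests
```

A team submitting the same defense for both models would call it with two distinct IDs, for example `build_submit_requests({"gpt-3.5-turbo": "id-a", "llama-2-70b-chat": "id-b"})`, and then send each resulting request once the endpoint is available.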
If two defenses from the same team are eligible for prizes, we will only award the prize to the better one. We understand that some teams may have spent time ensuring transferability and are thus losing some of that work; we apologize for this. However, this is partly offset by a better chance of winning a prize (two entries) with the same work. If a team submits two defenses, the fact that these two defenses are from the same team will not be hardcoded on the leaderboard. Of course, teams are also allowed to submit only a single defense with their preferred model.
Thank you for your engagement and contributions to making the LLM CTF useful.
Best regards,
The LLM CTF Organizing Team