[CfP] 10k in prizes for agent reliability! CAR-bench Challenge @ IJCAI-ECAI 2026, Bremen

Lukas Stappen

May 26, 2026, 10:32:43 AM
to Machine Learning News

Dear colleagues, 

📢 Call for Participation: CAR-bench Challenge @ IJCAI-ECAI 2026, Bremen

The first competition on LLM agent reliability. Build an agent that completes multi-turn tasks across 58 tools and 19 domain policies in an automotive voice assistant setting and, critically, also knows when to refuse, clarify, or admit it can't help. Baseline frontier LLMs manage only 58% consistency. Can your agent harness, planning, self-verification, or reliability design do better?

Website: https://car-bench.github.io/car-bench/

Co-organized by
Elisabeth André (Univ. Augsburg)
Lukas Stappen (BMW)
Patrick Dreisch (Anthropic)
Natalia Vassilieva (Cerebras)
Raj Tumuluri (OpenStream.ai)
Erik Cambria (NTU Singapore)
Iryna Gurevych (TU Darmstadt)
Varin Sikka (Stanford)
Johannes Kirmayr (BMW / Univ. Augsburg)

Prizes:

  • (Open Track) $5,000 prize pool in Anthropic API credits, sponsored by Anthropic
  • (Cerebras Track) 2 Codex Pro 12-month subscriptions, sponsored by OpenAI
  • Additional award sponsorship by OpenStream.ai

Two tracks:

  • Open Track: Any model, any framework. Ranked by consistency (Pass^3) on a hidden test set. Best Innovation Award for novel reliability design.
  • Cerebras Fast-Reasoning Track: Build agents on gpt-5.3-codex-spark served by Cerebras. The idea: turn record-breaking inference speed into performance gains through inference-time compute such as deeper reasoning, self-verification, or search within a fixed time budget. 15 teams, sponsored Codex Pro access.
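
For readers new to the metric: the announcement does not define Pass^3, so the sketch below assumes it follows the common pass^k convention, where a task only counts if all k independent runs succeed, averaged over tasks. The function name, task IDs, and results are hypothetical and are not taken from the challenge starter kit; the official evaluation may differ.

```python
from math import comb
from typing import Dict, List


def pass_hat_k(results: Dict[str, List[bool]], k: int = 3) -> float:
    """Estimate pass^k: the chance that k independent runs of a task ALL
    succeed, averaged over tasks.

    Assumed definition (common pass^k convention), not the official scorer.
    """
    scores = []
    for task_id, runs in results.items():
        n = len(runs)
        if n < k:
            raise ValueError(f"task {task_id!r} has {n} runs, need at least {k}")
        c = sum(runs)  # number of successful runs for this task
        # Fraction of k-run subsets in which every run succeeded.
        scores.append(comb(c, k) / comb(n, k))
    if not scores:
        raise ValueError("no tasks given")
    return sum(scores) / len(scores)


# Hypothetical results: three runs per task.
example = {
    "task_a": [True, True, True],     # all runs pass, contributes 1
    "task_b": [True, False, True],    # one failure, contributes 0
    "task_c": [False, False, False],  # contributes 0
    "task_d": [True, True, True],     # contributes 1
}
print(pass_hat_k(example, k=3))  # 0.5
```

Under this reading, an agent that succeeds on two of three runs for every task would score 0, which is why consistency is harder to achieve than average accuracy.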

Winners receive the opportunity to present at IJCAI-ECAI 2026 and co-author the competition report paper. 

Key dates (AoE):

  • May 25: Competition opens (data, baseline, starter kit)
  • Jul 19: Final evaluation (rankings determined)
  • Jul 26: Technical report deadline (4 pages, IJCAI format)
  • Aug 15–21: Presentations at IJCAI-ECAI 2026 

The competition runs online.

CAR-bench: accepted at ACL 2026 Main · HF Paper of the Day · 1st at UC Berkeley AgentX

Website: https://car-bench.github.io/car-bench/ 

Please share with colleagues and students who might be interested. 

Best regards,

CAR-bench organizing team
