[rl-list] [NeurIPS2025 competition] Call for contributions - NeurIPS 2025 E2LM Competition


Reda Alami

Jul 6, 2025, 11:33:49 AM7/6/25
to rl-...@googlegroups.com, mouadh....@tii.ae

Call for contributions
NeurIPS 2025 E2LM Competition: Early Training Evaluation of Language Models

 

Join us in building benchmarks that capture early-stage reasoning & scientific knowledge in LLMs!

The development of Large Language Models (LLMs) typically begins with a series of ablation experiments, in which various model architectures, data mixtures, and training hyperparameters are systematically evaluated. This phase is commonly referred to as the early stages of training. During this period, researchers primarily monitor two key metrics: the training loss curve and evaluation scores. However, existing evaluation benchmarks often fail to provide meaningful or discriminative signals during these initial stages, when LLMs have been trained on relatively few tokens (on the order of 200B), making it challenging to derive conclusive insights from ongoing experiments.

In this competition, we want to build new benchmarks together that effectively capture relevant signals during the early training stages of LLMs, specifically for the scientific knowledge domain.

How to participate

The competition will be hosted on a dedicated Hugging Face organization. To register for the competition, please follow this registration link 👉 https://e2lmc.github.io/registration.
Participants will submit their solutions, built on the lm-evaluation-harness library, through a Hugging Face Space. An active leaderboard will be maintained during the competition to track promising submissions.

The size of the models makes them easily runnable for everyone on free-tier Google Colab GPUs. We also provide a comprehensive starting kit, including several notebooks, to get started with the competition.

 

Evaluation metrics

Each submission will be evaluated using three different scores:

  • Signal Quality: smooth, meaningful learning curves
  • Ranking Consistency: stable model rankings across training (up to 1T tokens)
  • Scientific Compliance: benchmarks should accurately reflect scientific knowledge and reasoning
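
To make one of these scores concrete, here is a toy sketch of how ranking consistency could be measured: the Spearman rank correlation of model orderings between consecutive checkpoints, averaged over training. This is an illustrative assumption, not the competition's official scoring code; the `ranking_consistency` helper and the example scores are hypothetical.

```python
def spearman(xs, ys):
    """Spearman rank correlation between two equal-length score lists (no ties)."""
    def ranks(vs):
        order = sorted(range(len(vs)), key=lambda i: vs[i])
        r = [0.0] * len(vs)
        for rank, i in enumerate(order):
            r[i] = float(rank)
        return r
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    mean = (n - 1) / 2.0
    cov = sum((a - mean) * (b - mean) for a, b in zip(rx, ry))
    var = sum((a - mean) ** 2 for a in rx)
    return cov / var if var else 1.0

def ranking_consistency(scores_by_checkpoint):
    """Average Spearman correlation of model rankings between consecutive
    checkpoints; 1.0 means the ranking never changes during training."""
    pairs = zip(scores_by_checkpoint, scores_by_checkpoint[1:])
    corrs = [spearman(a, b) for a, b in pairs]
    return sum(corrs) / len(corrs)

# Hypothetical benchmark scores for 3 models at 4 training checkpoints.
scores = [
    [0.26, 0.24, 0.25],   # early: near-random, ordering still noisy
    [0.30, 0.33, 0.28],
    [0.35, 0.40, 0.31],
    [0.41, 0.47, 0.34],   # later: ordering has stabilized
]
print(ranking_consistency(scores))  # → 0.5
```

A benchmark that separates models early would push this value toward 1.0 even at small token counts, which is exactly the behavior the competition is looking for.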

 

Timeline

Competition kick-off: July 14
July 14 – August 18: Warm-up phase
August 18 – October 27: Development phase
October 27 – November 7: Final evaluation
December 6/7: Winning solutions presentation @NeurIPS 2025


 
💰 Prizes


🏆 $6,000 – 1st place
🥈 $4,000 – 2nd place
🥉 $2,000 – 3rd place
🎓 $2,000 × 2 – Best student submissions
🎓 Winning entries will be showcased at a dedicated workshop @NeurIPS 2025

More details on the competition website: https://e2lmc.github.io/
Register for the competition: https://lnkd.in/euqjzJcx
Read the competition proposal: https://lnkd.in/eu9TKsVh


 
