Just published in Nat.Comm.: "Mapping global dynamics of benchmark creation and saturation in artificial intelligence"

Matthias Samwald

Nov 21, 2022, 4:10:29 AM

to AI Evaluation

Hi everyone,

Hopefully of interest to some: we just published a large-scale analysis of AI benchmark data in the journal Nature Communications:

Mapping global dynamics of benchmark creation and saturation in artificial intelligence

Abstract: Benchmarks are crucial to measuring and steering progress in artificial intelligence (AI). However, recent studies raised concerns over the state of AI benchmarking, reporting issues such as benchmark overfitting, benchmark saturation and increasing centralization of benchmark dataset creation. To facilitate monitoring of the health of the AI benchmarking ecosystem, we introduce methodologies for creating condensed maps of the global dynamics of benchmark creation and saturation. We curate data for 3765 benchmarks covering the entire domains of computer vision and natural language processing, and show that a large fraction of benchmarks quickly trends towards near-saturation, that many benchmarks fail to find widespread utilization, and that benchmark performance gains for different AI tasks are prone to unforeseen bursts. We analyze attributes associated with benchmark popularity, and conclude that future benchmarks should emphasize versatility, breadth and real-world utility.

Paper: https://www.nature.com/articles/s41467-022-34591-0

Short-form summaries:

https://twitter.com/matthiassamwald/status/1591104555974299649

https://sigmoid.social/@matthiassamwald/109341410103106546

Jose Hernandez-Orallo

Mar 7, 2023, 3:42:26 AM

to ai-...@googlegroups.com

Dear all,

We're happy to inform you that the event:

"Predictable AI: Evaluation, Anticipation and Control"

https://www.predictable-ai.org/march2023event

will be broadcast tomorrow. If you would like to follow some of the sessions,
here is the YouTube link:

https://www.youtube.com/watch?v=oG6mPc7Q4Xg

Best wishes,

Jose.
