Call for participants: ICME2025 Audio Encoder Challenge


Wenwu Wang

Mar 3, 2025, 12:17:51 PM
to ml-...@googlegroups.com

The IEEE International Conference on Multimedia & Expo (ICME) 2025 Audio Encoder Capability Challenge
Overview
The ICME 2025 Audio Encoder Capability Challenge, hosted by Xiaomi, University of Surrey, and Dataocean AI, aims to rigorously evaluate audio encoders in real-world downstream tasks.
This challenge imposes NO restrictions on model size or the scale of training data, and training based on existing pre-trained models is allowed.
Participants are invited to submit pre-trained encoders that convert raw audio waveforms into continuous embeddings. These encoders will undergo comprehensive testing across diverse tasks spanning speech, environmental sounds, and music. The evaluation will emphasize real-world usability and leverage an open-source evaluation system.
Participants are welcome to independently test and optimize their models. However, the final rankings will be determined based on evaluations conducted by the organizers.

Registration
To participate, registration is required. Please complete the registration form before April 1, 2025. Note that this does not mean the challenge starts on April 1, 2025; the challenge began on February 7, 2025.
For any other information about registration, please send an email to: 2025ic...@dataoceanai.com
Submission
  1. Clone the audio encoder template from the GitHub repository.
  2. Implement your own audio encoder following the instructions in README.md within the cloned repository. The implementation must pass all checks in audio_encoder_checker.py provided in the repository.
  3. Before the submission deadline, April 30, 2025, email the following files to the organizers at 2025ic...@dataoceanai.com:
  • a ZIP file containing the complete repository
  • a technical report (PDF format, no more than 6 pages) describing your implementation
The pre-trained model weights can either be included in the ZIP file or downloaded automatically from external sources (e.g., Hugging Face) at runtime. If you choose the latter approach, please implement the automatic download in your encoder implementation (see the illustrative sketch below).
While there are no strict limits on model size, submitted models must run successfully in a Google Colab T4 environment (16 GB NVIDIA Tesla T4 GPU, 12 GB of system RAM).
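
For orientation, below is a minimal sketch of what an encoder wrapper mapping raw waveforms to continuous embeddings might look like in PyTorch. The class name, the choice of backbone (facebook/wav2vec2-base), and the assumed 16 kHz input rate are illustrative assumptions, not the official API; the authoritative interface is whatever README.md and audio_encoder_checker.py in the challenge repository specify.

    # Illustrative sketch only; follow the challenge template for the real interface.
    import torch
    import torch.nn as nn
    from transformers import AutoModel


    class MyAudioEncoder(nn.Module):
        """Wraps a pre-trained backbone and maps raw waveforms to embeddings."""

        def __init__(self, model_name: str = "facebook/wav2vec2-base"):
            super().__init__()
            # Weights may be bundled in the submission ZIP or, as here,
            # fetched automatically from Hugging Face at runtime.
            self.backbone = AutoModel.from_pretrained(model_name)
            self.sample_rate = 16_000  # assumed input rate for this sketch

        @torch.no_grad()
        def forward(self, waveform: torch.Tensor) -> torch.Tensor:
            # waveform: (batch, num_samples) raw audio at self.sample_rate
            # returns:  (batch, num_frames, embed_dim) continuous embeddings
            return self.backbone(waveform).last_hidden_state


    if __name__ == "__main__":
        encoder = MyAudioEncoder()
        audio = torch.randn(1, 16_000)   # one second of dummy audio
        print(encoder(audio).shape)      # e.g. torch.Size([1, 49, 768])
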
More details can be found on the challenge webpage:

Thanks for your attention. Sorry for cross-posting. 

Best wishes,
 
Wenwu
 
 
--
Wenwu Wang

Professor of Signal Processing and Machine Learning,
Centre for Vision Speech and Signal Processing (CVSSP)

Associate Head of External Engagement, 
School of Computer Science and Electronic Engineering

AI Fellow,
Surrey Institute for People Centred AI

University of Surrey
Guildford, GU2 7XH
United Kingdom
Phone: +44 (0) 1483 686039
Fax: +44 (0) 1483 686031
Email: w.w...@surrey.ac.uk

Wenwu Wang

Nov 10, 2025, 11:49:23 AM
to ml-...@googlegroups.com
You are warmly invited to take part in the IEEE ICASSP 2026 Grand Challenge (GC)
GC-2: Automatic Song Aesthetics Evaluation
Organized by: Ting Dang, Haohe Liu, Hao Liu, Hexin Liu, Lei Xie, Huixin Xue, Wei Xue, Guobin Ma, Hao Shi, Yui Sudo, Jixun Yao, Ruibin Yuan, Jingyao Wu, Wenwu Wang
ICASSP 2026 website: https://2026.ieeeicassp.org/
Title: Automatic Song Aesthetics Evaluation Challenge
Short description: Recent advances in generative music models have enabled automatic song creation with impressive quality and diversity, powering applications from virtual artists to movie dubbing. However, evaluating the aesthetic quality of generated songs, which involves capturing factors such as emotional expressiveness, musicality, and listener enjoyment, remains a key challenge. Existing metrics often fail to reflect human perception. To address this gap, the Automatic Song Aesthetics Evaluation Challenge invites participants to develop models that predict human ratings of song aesthetics based solely on audio. This competition aims to establish a standardized benchmark for assessing musical aesthetics in song generation, with human-annotated datasets and a focus on listener-centered criteria. By bridging signal processing, affective computing, and machine learning, this challenge seeks to drive progress toward more human-aligned music generation and evaluation.
Timeline
  • September 01, 2025: Registration Opens
  • September 10, 2025: Train set and baseline system release
  • November 10, 2025: Test set release
  • November 20, 2025: Results Submission Deadline
  • December 07, 2025: Paper submission (2-page papers)
  • January 11, 2026: Paper acceptance notification
  • January 18, 2026: Camera-ready paper submission
  • May 4-8, 2026: ICASSP 2026 in Spain