We are pleased to invite you to participate in the ESDD2: Environment-Aware Speech and Sound Deepfake Detection Challenge, which will be hosted in conjunction with IEEE ICME 2026.
Audio recorded in real-world environments often consists of two components: (i) speech, meaning the linguistically meaningful speech produced by the primary foreground speaker, and (ii) environmental sound, meaning any non-speech background sound or non-target speech. With recent advances in text-to-speech, voice conversion, and other generation models, either component can now be modified independently. Such component-level manipulations are more difficult to detect than whole-audio deepfakes, because the remaining unaltered component may mislead detection systems and often sounds more natural to human listeners.
To address this challenge, ESDD2 focuses on component-level spoofing, where speech and environmental sounds may be independently manipulated or synthesized, creating a more challenging and realistic detection scenario.
We warmly invite researchers from both academia and industry to participate in this challenge and explore robust and effective solutions for these critical deepfake detection tasks. Participants in the Grand Challenge are encouraged to submit short papers (up to 4 pages); accepted papers will be published in the conference workshop proceedings.
Participation Process
Challenge website: https://sites.google.com/view/esdd-challenge/esdd-challenges/esdd-2/description
Evaluation plan: https://arxiv.org/pdf/2601.07303
Important Dates
Jan. 10, 2026 – Registration Opens
Apr. 30, 2026 – Paper Submission Deadline
Organizers
Prof. Ming Li, Duke Kunshan University, China
Sponsors and Awards
OfSpectrum, Inc. (https://ofspectrum.com/), an AI company based in the U.S., sponsors this challenge and will provide a USD 1,000 prize to the first-place winner.