Session 17: IDLM: Inverse-distilled Diffusion Language Models

Diffusion LLM

May 15, 2026, 2:57:35 PM
to diffus...@googlegroups.com

Hello folks,

Diffusion Language Models (DLMs) have recently achieved strong results in text generation. However, their multi-step sampling leads to slow inference, limiting practical use. 

To address this, the authors extend Inverse Distillation, a technique originally developed to accelerate continuous diffusion models, to the discrete setting. However, this extension introduces both theoretical and practical challenges.

To overcome these challenges, the authors first provide a theoretical result demonstrating that their inverse formulation admits a unique solution, thereby ensuring valid optimization. They then introduce gradient-stable relaxations to support effective training. 

As a result, experiments on multiple DLMs show that their method, Inverse-distilled Diffusion Language Models (IDLM), reduces the number of inference steps by 4× to 64×, while preserving the teacher model’s entropy and generative perplexity.

This Monday, David Li and Nikita Gushchin will present their jointly led paper, which was recently accepted at ICML 2026.

Title: IDLM: Inverse-distilled Diffusion Language Models


Meeting Link: click here

Time: May 18 (Monday) 1pm ET / 10am PT / 7pm CEST / 10:30pm IST

Paper: [2602.19066] IDLM: Inverse-distilled Diffusion Language Models 


Prior knowledge: 

Fundamentals of discrete diffusion (video by Sasha Rush)

The Diffusion Duality (video by our reading group)


Abstract: Diffusion Language Models (DLMs) have recently achieved strong results in text generation. However, their multi-step sampling leads to slow inference, limiting practical use. To address this, we extend Inverse Distillation, a technique originally developed to accelerate continuous diffusion models, to the discrete setting. Nonetheless, this extension introduces both theoretical and practical challenges. From a theoretical perspective, the inverse distillation objective lacks uniqueness guarantees, which may lead to suboptimal solutions. From a practical standpoint, backpropagation in the discrete space is non-trivial and often unstable. To overcome these challenges, we first provide a theoretical result demonstrating that our inverse formulation admits a unique solution, thereby ensuring valid optimization. We then introduce gradient-stable relaxations to support effective training. As a result, experiments on multiple DLMs show that our method, Inverse-distilled Diffusion Language Models (IDLM), reduces the number of inference steps by 4x-64x, while preserving the teacher model's entropy and generative perplexity.


Yours truly,

Subham, Justin, Zhihan

Website, Twitter, Discord, YouTube

Diffusion LLM

May 18, 2026, 6:11:00 PM
to Diffusion-llms
Hi folks, we just uploaded the recording of today's session. Make sure to check it out: https://www.youtube.com/watch?v=RZ6_huata1Y