Session 16: Unifying Masked Diffusion Models with Various Generation Orders and Beyond

Diffusion LLM

May 9, 2026, 6:12:46 PM

to diffus...@googlegroups.com

Hello folks,

AR generates left-to-right; masked diffusion generates in any order; and block diffusion generates block-wise left-to-right, with random order within each block. Can we unify all these frameworks and further learn the generation order jointly with token prediction?
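To make the three generation orders above concrete, here is a minimal Python sketch. It is an illustration only, not code from the paper: the function names are hypothetical, and it assumes exactly one token is revealed per step, whereas real masked diffusion samplers often reveal several tokens at once.

```python
import random

def ar_order(n):
    """Autoregressive: positions are generated strictly left to right."""
    return list(range(n))

def mdm_order(n, rng):
    """Masked diffusion (simplified): a uniformly random permutation of positions."""
    order = list(range(n))
    rng.shuffle(order)
    return order

def block_order(n, block_size, rng):
    """Block diffusion (simplified): blocks left to right, random order inside each block."""
    order = []
    for start in range(0, n, block_size):
        block = list(range(start, min(start + block_size, n)))
        rng.shuffle(block)
        order.extend(block)
    return order

rng = random.Random(0)
n = 8
print("AR:   ", ar_order(n))
print("MDM:  ", mdm_order(n, rng))
print("Block:", block_order(n, 4, rng))
```

In this toy version, `block_size=1` reduces to the AR order and `block_size=n` reduces to the any-order case, which hints at why a single framework might cover all three.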

The authors propose OeMDM, a unified masked diffusion framework that can express various generation orders, and LoMDM, which jointly learns the generation order and the diffusion model.

Everything comes down to the scheduler: by making the forward and reverse schedulers maximally flexible, it becomes possible to describe all generation orders, even learnable generation orders, within the masked diffusion framework.
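One way to picture the scheduler idea is to give every position its own masking schedule. The sketch below is a toy under my own assumptions, not the paper's parameterization: `schedule(i, n, t)` is a hypothetical function returning the probability that position `i` is still unmasked at forward time `t` in [0, 1], and the generation order is read off by reversing the sampled masking times.

```python
import random

def linear_schedule(i, n, t):
    """Same schedule at every position: P(still unmasked at time t) = 1 - t."""
    return 1.0 - t

def left_to_right_schedule(i, n, t):
    """Position-dependent step schedule: position i stays unmasked until t = (n - i) / n."""
    return 1.0 if t < (n - i) / n else 0.0

def sample_unmask_order(schedule, n, rng, num_steps=64):
    """Sample a forward masking time per position, then reverse it into a generation order."""
    mask_time = []
    for i in range(n):
        u = rng.random()  # u lies in [0, 1)
        # First grid time at which the survival probability drops to u or below.
        k = next(k for k in range(num_steps + 1)
                 if schedule(i, n, k / num_steps) <= u)
        mask_time.append(k / num_steps)
    # The reverse process runs from t = 1 down to t = 0, so positions masked
    # latest in the forward process are generated first.
    return sorted(range(n), key=lambda i: -mask_time[i])

rng = random.Random(0)
print("shared linear schedule:", sample_unmask_order(linear_schedule, 8, rng))
print("staggered schedule:    ", sample_unmask_order(left_to_right_schedule, 8, rng))
```

With the shared linear schedule the order comes out random (ties on the time grid are broken by position index), while the staggered step schedule deterministically recovers the left-to-right order. A learnable order would replace these fixed functions with something trained, which this sketch does not attempt.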

LoMDM achieves SOTA among discrete diffusion models across all benchmarks, and even outperforms block diffusion models, which strongly benefit from left-to-right bias!

This Monday, Chunsan Hong will present his paper, which received a Spotlight at ICML 2026.

Title: Unifying Masked Diffusion Models with Various Generation Orders and Beyond

Meeting Link: click here

Time: May 11 (Monday) 1pm ET / 10am PT / 7pm CEST / 10:30pm IST

Paper: [2602.02112] Unifying Masked Diffusion Models with Various Generation Orders and Beyond

Prior knowledge:

Fundamentals of discrete diffusion (video by Sasha Rush)

Abstract: Masked diffusion models (MDMs) are a potential alternative to autoregressive models (ARMs) for language generation, but generation quality depends critically on the generation order. Prior work either hard-codes an ordering (e.g., blockwise left-to-right) or learns an ordering policy for a pretrained MDM, which incurs extra cost and can yield suboptimal solutions due to the two-stage optimization. Motivated by this, we propose order-expressive masked diffusion model (OeMDM) for a broad class of diffusion generative processes with various generation orders, enabling the interpretation of MDM, ARM, and block diffusion in a single framework. Furthermore, building on OeMDM, we introduce learnable-order masked diffusion model (LoMDM), which jointly learns the generation ordering and diffusion backbone through a single objective from scratch, enabling the diffusion model to generate text in context-dependent ordering. Empirically, we confirm that LoMDM outperforms various discrete diffusion models across multiple language modeling benchmarks.

Yours truly,

Subham, Justin, Zhihan

Website, Twitter, Discord, YouTube

Diffusion LLM

May 11, 2026, 12:00:36 PM

to diffus...@googlegroups.com

This is happening in 1 hour!!

Gentle reminder: see you all at 1pm ET / 10am PT / 7pm CEST / 10:30pm IST

Meeting Link: click here

Today's paper: [2602.02112] Unifying Masked Diffusion Models with Various Generation Orders and Beyond

Diffusion LLM

May 16, 2026, 8:24:11 AM

to Diffusion-llms

Hi folks, a recording of the paper's presentation is now available on YouTube! Make sure to check it out: https://youtu.be/wktMa0sfdos
