Hello folks,
Discrete diffusion models are powerful, but out of the box they give little control over the target distribution.
Discrete Feynman-Kac Correctors address this with Sequential Monte Carlo, which reshapes the sampled distribution at inference time, with no retraining, by annealing it, composing multiple models, or tilting it with an external reward function. Applications include improving code generation, sampling the Ising model across a range of temperatures, and generating higher-quality protein sequences.
This Monday, Mohsin Hasan and Viktor Ohanesian will co-present their jointly led paper Discrete Feynman-Kac Correctors.
Title: Discrete Feynman-Kac Correctors
Meeting Link: click here
Time: Mar 16 (Monday) 1pm ET / 10am PT / 6pm CET / 10:30pm IST
Paper: Discrete Feynman-Kac Correctors (arXiv:2601.10403)
Prior knowledge:
Fundamentals of discrete diffusion (video by Sasha Rush)
Abstract: Discrete diffusion models have recently emerged as a promising alternative to the autoregressive approach for generating discrete sequences. Sample generation via gradual denoising or demasking processes allows them to capture hierarchical non-sequential interdependencies in the data. These custom processes, however, do not assume a flexible control over the distribution of generated samples. We propose Discrete Feynman-Kac Correctors, a framework that allows for controlling the generated distribution of discrete masked diffusion models at inference time. We derive Sequential Monte Carlo (SMC) algorithms that, given a trained discrete diffusion model, control the temperature of the sampled distribution (i.e. perform annealing), sample from the product of marginals of several diffusion processes (e.g. differently conditioned processes), and sample from the product of the marginal with an external reward function, producing likely samples from the target distribution that also have high reward. Notably, our framework does not require any training of additional models or fine-tuning of the original model. We illustrate the utility of our framework in several applications including: efficient sampling from the annealed Boltzmann distribution of the Ising model, improving the performance of language models for code generation and amortized learning, as well as reward-tilted protein sequence generation.
Yours truly,
Subham, Justin, Zhihan