Research & Papers

LIFT method boosts diffusion language models by up to 3x on AIME reasoning benchmarks

Diffusion language models get smarter with learnability-informed training – up to 3x gains on AIME

Deep Dive

A new paper from researchers at Texas A&M University and collaborators introduces LIFT, a fine-tuning method designed to improve reasoning capabilities in diffusion language models (DLMs). While supervised fine-tuning (SFT) works well for autoregressive models, it often degrades DLM performance. The team’s analysis reveals that vanilla SFT ignores “learnability”: which tokens are easy or hard to learn at different masking stages. Rare tokens are difficult to learn when most of the input is masked, while common tokens offer little training value when the input is mostly unmasked.
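To make the "masking stage" idea concrete, here is a minimal Python sketch of how a masked-diffusion SFT step is commonly set up: tokens are hidden at random according to a masking ratio, and the loss is averaged uniformly over the hidden positions. This is an illustrative assumption about the vanilla baseline, not code from the paper; the names `mask_tokens`, `vanilla_sft_loss`, and the `[MASK]` placeholder are hypothetical.

```python
import random

MASK = "[MASK]"  # hypothetical placeholder for a masked position


def mask_tokens(tokens, mask_ratio, rng):
    """Hide each token independently with probability mask_ratio.

    Returns the noisy sequence and the indices of the hidden positions.
    """
    noisy, targets = [], []
    for i, tok in enumerate(tokens):
        if rng.random() < mask_ratio:
            noisy.append(MASK)
            targets.append(i)
        else:
            noisy.append(tok)
    return noisy, targets


def vanilla_sft_loss(token_losses, targets):
    """Average the per-token losses over masked positions, all weighted equally.

    Assumption: this uniform average stands in for "vanilla SFT", which
    does not distinguish easy tokens from hard ones.
    """
    if not targets:
        return 0.0
    return sum(token_losses[i] for i in targets) / len(targets)


if __name__ == "__main__":
    rng = random.Random(0)
    tokens = ["the", "answer", "is", "therefore", "17"]
    for ratio in (0.9, 0.1):  # heavy masking vs. light masking
        noisy, targets = mask_tokens(tokens, ratio, rng)
        print(ratio, noisy, targets)
```

In this setup, a rare token such as the final number is treated the same whether almost everything around it is hidden or almost nothing is, which is the gap the learnability analysis points at.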

LIFT addresses this by scheduling token learning based on the information available at each diffusion time step: easy tokens are targeted early, when masking is heavy, and hard tokens later, when more context is visible. Tested on six reasoning benchmarks, LIFT outperforms existing SFT baselines, achieving up to 3x relative improvements on AIME’24 and AIME’25, two challenging math reasoning sets. The method is efficient and adds no inference overhead. The code is open source, offering a practical recipe for boosting DLM reasoning through smarter, learnability-aware fine-tuning.
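The scheduling idea can be illustrated with a toy loss weighting. The Python sketch below is not LIFT's actual schedule, which this summary does not specify; it only shows one way the stated principle could be expressed. It assumes each token has a hypothetical difficulty score in [0, 1] (for example, derived from rarity) and a masking ratio in [0, 1]; the functions `learnability_weight` and `weighted_loss` are invented for illustration.

```python
def learnability_weight(difficulty, mask_ratio):
    """Toy weight that peaks when token difficulty matches available context.

    Assumption: the "ideal" difficulty at a given step is 1 - mask_ratio,
    so heavy masking favors easy tokens and light masking favors hard ones.
    """
    target = 1.0 - mask_ratio
    return max(0.0, 1.0 - abs(difficulty - target))


def weighted_loss(token_losses, difficulties, mask_ratio):
    """Weighted average of per-token losses using the toy weights above."""
    weights = [learnability_weight(d, mask_ratio) for d in difficulties]
    total = sum(weights)
    if total == 0.0:
        return 0.0
    return sum(w * l for w, l in zip(weights, token_losses)) / total


if __name__ == "__main__":
    # Hypothetical values: one easy token (0.1) and one hard token (0.9).
    for ratio in (0.9, 0.1):
        easy = learnability_weight(0.1, ratio)
        hard = learnability_weight(0.9, ratio)
        print(f"mask_ratio={ratio}: easy={easy:.2f}, hard={hard:.2f}")
```

Because a weighting like this only changes how the training loss is computed, it leaves the decoding procedure untouched, in line with the claim that the method adds no inference overhead.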

Key Points
  • Vanilla SFT for diffusion language models can degrade performance by overlooking when tokens are easy or hard to learn.
  • LIFT aligns training with masking level: it learns easy tokens when the input is heavily masked and hard tokens when more context is available.
  • LIFT achieves up to 3x relative gains on AIME'24 and AIME'25 reasoning benchmarks over existing SFT baselines.

Why It Matters

A smarter fine-tuning recipe for diffusion language models that could unlock better reasoning without extra inference cost.