Aman Gokrani — LessWrong

7mo

7mo

What Happens When You Try to Merge AR Reasoning Into Diffusion Models

Diffusion language models have drawn a lot of attention lately. Models like LLaDA, Dream 7B, Mercury, and SDAR perform on par with autoregressive models on standard benchmarks while offering 2-4x faster inference through parallel token generation. For a deeper dive into how diffusion LMs work, see our previous post. SDAR...