[d1] Scaling Reasoning in dLLMs via RL (LLMs/Diffusion, 2026. 5. 7. 18:31)
https://dllm-reasoning.github.io/
d1: Scaling Reasoning in Diffusion Large Language Models via Reinforcement Learning
The table below shows a detailed performance comparison across benchmarks and generation sequence lengths. d1-LLaDA consistently outperforms all other models, and diffu-GRPO alone yields better performance than SFT alone. Table: Model performance on GS
(NeurIPS 2025 spotlight)
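d1's diffu-GRPO stage builds on GRPO-style reinforcement learning, which replaces a learned critic with group-relative advantages: several completions are sampled per prompt, and each reward is normalized against its group's statistics. A minimal sketch of that advantage computation (the function name and epsilon value are illustrative, not from the paper):

```python
import statistics

def group_relative_advantages(rewards, eps=1e-4):
    """GRPO-style advantages: normalize each sampled completion's reward
    by the mean and std of its group, so no value/critic model is needed."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]

# Example: 4 sampled completions for one prompt, binary correctness reward.
advs = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Correct completions get positive advantages and incorrect ones negative, and the advantages of a group sum to zero, which is what makes the signal usable without a baseline network.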