[VRPO] LLaDA 1.5: Variance-Reduced Preference Optimization for Large Language Diffusion Models
LLMs/Diffusion, 2026. 5. 10. 15:12
(ICLR 2026, withdrawn)
