[KL-Adaptive DPG w/ baseline] RL and DM for Fine-tuning LMs
LLMs/Diffusion, 2026. 5. 9. 15:29
It has been quite a while since I read this one as well, so I reread it.
(NeurIPS 2022)





















