RL/RL
f-DPG
밤 편지
2024. 12. 23. 14:11
* f-divergence
* f-divergence examples (KL-divergence, Total Variation Distance)
* Aligning LMs with Preferences through f-divergence Minimization
* Algorithm