Research/...
-
Aligning Language Models with Preferences through f-divergence Minimization
Research/... | 2024. 12. 23. 07:13
https://arxiv.org/pdf/2302.08215 (Jun 2023, ICML 2023)
Ta-da~ it's a complete gift set~! Everything rallies under f-divergence. Here, take your Christmas present~!
It's an obvious thing to say, but life turns out completely differently depending on how you set your goal, and changing the destination can lead us in an entirely different direction. It's genuinely fascinating how a model's behavior changes depending on the objective function (loss function). (More concretely: with which measure do we approximate the target distribution, and how does the convergence behavior change depending on the metric?) Why does reading this paper make me so happy? ㅜㅜ To read this paper..
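(A quick side note of mine, not part of the excerpt, assuming the standard textbook definition:) for a convex generator $f$ with $f(1) = 0$, the f-divergence between distributions $p$ and $q$ is

$$D_f(p \,\|\, q) = \mathbb{E}_{x \sim q}\!\left[ f\!\left( \tfrac{p(x)}{q(x)} \right) \right],$$

and the familiar cases follow from the choice of generator: $f(t) = t \log t$ gives the forward KL $\mathrm{KL}(p \,\|\, q)$, $f(t) = -\log t$ gives the reverse KL $\mathrm{KL}(q \,\|\, p)$, and Jensen-Shannon has its own generator. As I read the paper, aligning the policy $\pi_\theta$ to a preference-shaped target $p^*$ then comes down to choosing which $D_f(p^* \,\|\, \pi_\theta)$ to minimize.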
-
[cdpg] Controlling Conditional Language Models without Catastrophic Forgetting
Research/... | 2024. 12. 22. 08:49
https://arxiv.org/pdf/2112.00791 (Jun 2022, ICML 2022)
Abstract
Machine learning is shifting towards general-purpose pretrained generative models, trained in a self-supervised manner on large amounts of data, which can then be applied to solve a large number of tasks. However, due to their generic training methodology, these models often fail to meet some of the downstream requirements (e.g., halluc..
-
(3/3) GAN, F-Divergence, IPM
Research/... | 2024. 12. 22. 00:34
1. Example of Parallel Line Density
2. Wasserstein Distance with GAN
3. Kantorovich-Rubinstein Duality
4. Wasserstein as Primal LP
5. Wasserstein as Dual LP
6. Property of Dual LP on Wasserstein Distance
7. Lipschitz Continuity
8. Dual Problem of Wasserstein Distance
9. Kantorovich-Rubinstein Duality & Wasserstein GAN
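(My side note on items 3, 8, and 9, assuming the standard statement of the result:) the Kantorovich-Rubinstein duality rewrites the 1-Wasserstein distance as a supremum over 1-Lipschitz critics,

$$W_1(p, q) = \sup_{\|f\|_L \le 1} \; \mathbb{E}_{x \sim p}[f(x)] - \mathbb{E}_{x \sim q}[f(x)],$$

which is the form WGAN optimizes, with a critic network standing in for $f$ and the Lipschitz constraint enforced by weight clipping or a gradient penalty.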
-
(1/3) GAN, F-Divergence, IPM
Research/... | 2024. 12. 20. 07:37
https://www.youtube.com/playlist?list=PLzZ7PPT4KK5oQ4io5Fead9j_ksLIokrri
1. Loss Function of GAN
2. Jensen Shannon Divergence
3. Training of GAN
4. Theoretical Results of GAN
5. Mode Collapse
6. Conditional Generative Adversarial Network
7. Adding Latent Variable to GAN
8. InfoGAN
9. Comparison between Conditional GAN & InfoGAN
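(Again my own note on items 1 and 2, assuming the original formulation from the GAN paper:) the minimax loss is

$$\min_G \max_D \; \mathbb{E}_{x \sim p_{\mathrm{data}}}[\log D(x)] + \mathbb{E}_{z \sim p_z}[\log(1 - D(G(z)))],$$

and with the discriminator held at its optimum the generator's objective reduces to $2\,\mathrm{JS}(p_{\mathrm{data}} \,\|\, p_G) - \log 4$, which is why the Jensen-Shannon divergence shows up right after the loss function in this playlist.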
-
On Reinforcement Learning and Distribution Matching for Fine-Tuning Language Models with no Catastrophic Forgetting
Research/... | 2024. 12. 17. 09:30
https://arxiv.org/pdf/2206.00761
https://github.com/naver/gdc
(Nov 2022, NeurIPS 2022)
Abstract
The availability of large pre-trained models is changing the landscape of Machine Learning research and practice, moving from a “training from scratch” to a “finetuning” paradigm. While in some applications the goal is to “nudge” the pre-trained distribution towards preferred outputs, in others it is to st..
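(My note, not from the abstract; the two formulations below are the standard ones as I understand them, so take the exact forms as an assumption:) KL-regularized RL fine-tuning maximizes

$$\mathbb{E}_{x \sim \pi_\theta}[r(x)] - \beta\, \mathrm{KL}(\pi_\theta \,\|\, \pi_0),$$

whose optimum is the target $p^*(x) \propto \pi_0(x)\, \exp\!\big(r(x)/\beta\big)$, while the distribution-matching view starts from that same $p^*$ and minimizes a divergence such as $\mathrm{KL}(p^* \,\|\, \pi_\theta)$ directly; how these two routes relate, and which one keeps the model close to $\pi_0$ without catastrophic forgetting, is what the title is pointing at.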