Posts in the 'Research/RL_DeepMind' category

PPO & RLHF & DPO

Research/RL_DeepMind 2024. 8. 11. 16:11

https://www.youtube.com/watch?v=SgC6AZss478&list=PLs8w1Cdi-zvYviYYw_V3qe6SINReGF5M-&index=1
https://www.youtube.com/watch?v=TjHH_--7l8g&list=PLs8w1Cdi-zvYviYYw_V3qe6SINReGF5M-&index=2
https://www.youtube.com/watch?v=Z_JUqJBpVOk&list=PLs8w1Cdi-zvYviYYw_V3qe6SINReGF5M-&index=3
https://www.youtube.com/watch?v=k2pD3k1485A&list=PLs8w1Cdi-zvYviYYw_V3qe6SINReGF5M-&index=4
The idea is that we have a Trans..

[6/6] Policy Gradient Methods

Research/RL_DeepMind 2024. 8. 10. 17:23

https://www.youtube.com/watch?v=e20EY4tFC_Q&list=PLzvYlJMoZ02Dxtwe-MmH4nOB5jYlMGBjr&index=6
Policy gradient methods take a more direct approach to the problem statement of RL, and as a result, many of the most effective models come from this category. For example, Proximal Policy Optimization is a type of policy gradient method, and it is OpenAI's go-to RL algorithm. In fact, that's what they use t..

[5/6] Function Approximation

Research/RL_DeepMind 2024. 8. 10. 13:04

https://www.youtube.com/watch?v=Vky0WVh_FSk&list=PLzvYlJMoZ02Dxtwe-MmH4nOB5jYlMGBjr&index=5

[4/6] Temporal Difference Learning

Research/RL_DeepMind 2024. 8. 10. 12:13

https://www.youtube.com/watch?v=AJiG3ykOxmY&list=PLzvYlJMoZ02Dxtwe-MmH4nOB5jYlMGBjr&index=4

[3/6] Monte Carlo and Off-Policy Methods

Research/RL_DeepMind 2024. 8. 10. 02:10

https://www.youtube.com/watch?v=bpUszPiWM7o&list=PLzvYlJMoZ02Dxtwe-MmH4nOB5jYlMGBjr&index=3

[2/6] Bellman Equations, Dynamic Programming, Generalized Policy Iteration

Research/RL_DeepMind 2024. 8. 9. 22:33

https://www.youtube.com/watch?v=_j6pvGEchWU&list=PLzvYlJMoZ02Dxtwe-MmH4nOB5jYlMGBjr&index=2

[1/6] Reinforcement Learning, by the Book

Research/RL_DeepMind 2024. 8. 9. 22:17

https://www.youtube.com/watch?v=NFo9v_yKQXA&list=PLzvYlJMoZ02Dxtwe-MmH4nOB5jYlMGBjr&index=1

[Lecture 12] (2/2) Deep Reinforcement Learning

Research/RL_DeepMind 2024. 8. 9. 16:18

https://www.youtube.com/watch?v=cVzvNZOBaJ4&list=PLqYmG7hTraZDVH599EItlEWsUOsJbAodm&index=12
https://storage.googleapis.com/deepmind-media/UCL%20x%20DeepMind%202021/Lecture%2012-%20Deep%20RL%201%20.pdf
In this section, I want to give you some insight into what happens when ideas from reinforcement learning are combined with deep learning, both in terms of how known RL issues manifest when using deep..

ABOUT ME

밤에 쓰는 편지