Miscellaneous (already overloaded list, with further updates to come)
*RL/RL, 2025. 8. 23. 12:33
* Google Research
https://research.google/blog/?search=reinforcement&
* OpenAI Spinning Up
https://spinningup.openai.com/en/latest/
* Policy Gradient Algorithms
https://lilianweng.github.io/posts/2018-04-08-policy-gradient/
* Advantage Actor Critic (A2C) & Generalized Advantage Estimation (GAE)
https://balajiai.github.io/high_variance_in_policy_gradients
* PPO - Clipped Surrogate Objective Function
https://huggingface.co/learn/deep-rl-course/unit8/visualize
Model-based RL
* MBRL-Lib: A Modular Library for Model-based Reinforcement Learning
https://arxiv.org/pdf/2104.10159
https://github.com/facebookresearch/mbrl-lib
* Model-based RL: A Survey
https://arxiv.org/pdf/2006.16712
Offline RL
* An Optimistic Perspective on Offline Reinforcement Learning
https://offline-rl.github.io/
https://www.youtube.com/watch?v=qgZPZREor5I
* Offline RL: Tutorial, Review
https://arxiv.org/pdf/2005.01643
* Decision Transformer: RL via Sequence Modeling
https://arxiv.org/pdf/2106.01345
https://huggingface.co/blog/decision-transformers
* Online Decision Transformer
https://arxiv.org/pdf/2202.05607
* Language Models in RL
https://huggingface.co/learn/deep-rl-course/unitbonus3/language-models