Miscellaneous (already overloaded list, with further updates to come)

*RL/RL 2025. 8. 23. 12:33

* Google Research

https://research.google/blog/?search=reinforcement&

Latest News from Google Research Blog - Google Research


* OpenAI Spinning Up

https://spinningup.openai.com/en/latest/

Welcome to Spinning Up in Deep RL! — Spinning Up documentation


* Policy Gradient Algorithms

https://lilianweng.github.io/posts/2018-04-08-policy-gradient/

Policy Gradient Algorithms


* Advantage Actor Critic (A2C) & Generalized Advantage Estimation (GAE)

https://balajiai.github.io/high_variance_in_policy_gradients

High Variance in Policy gradients


* PPO - Clipped Surrogate Objective Function

https://huggingface.co/learn/deep-rl-course/unit8/visualize

Visualize the Clipped Surrogate Objective Function - Hugging Face Deep RL Course

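A quick note to self on two of the items above, GAE and the PPO clipped surrogate objective. The block below is only a minimal sketch in plain Python, not code taken from any of the linked pages. The function names `gae_advantages` and `ppo_clip_objective`, the default values for `gamma`, `lam`, and `clip_eps`, and the convention that `values` carries one extra bootstrap entry at the end are all assumptions made for illustration.

```python
import math


def gae_advantages(rewards, values, gamma=0.99, lam=0.95):
    """Generalized Advantage Estimation for one trajectory.

    Assumption: `values` has len(rewards) + 1 entries, the last one being
    the bootstrap value of the state after the final step (0.0 if terminal).
    """
    advantages = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the discounted sum of later TD errors.
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * values[t + 1] - values[t]  # TD error
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages


def ppo_clip_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective (to be maximized).

    `new_logp` and `old_logp` are per-sample log-probabilities of the taken
    actions under the current and the data-collecting policy.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        ratio = math.exp(lp_new - lp_old)  # pi_new(a|s) / pi_old(a|s)
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Taking the minimum keeps the pessimistic (lower) of the two terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


if __name__ == "__main__":
    adv = gae_advantages([1.0, 1.0, 1.0], [0.0, 0.0, 0.0, 0.0])
    print(adv)
    print(ppo_clip_objective([0.0] * 3, [0.0] * 3, adv))
```

With `lam=0` the estimator reduces to the one-step TD error, and with `lam=1` it becomes the discounted return minus the value baseline, which is the bias/variance trade-off the A2C/GAE post discusses. In the clipped objective, a probability ratio outside `[1 - clip_eps, 1 + clip_eps]` stops contributing extra gain once it would push the policy further in the favorable direction.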

Model-based RL

* MBRL-Lib: A Modular Library for Model-based Reinforcement Learning

https://arxiv.org/pdf/2104.10159

https://github.com/facebookresearch/mbrl-lib

GitHub - facebookresearch/mbrl-lib: Library for Model Based RL


* Model-based RL: A Survey

https://arxiv.org/pdf/2006.16712

Offline RL

* An Optimistic Perspective on Offline Reinforcement Learning

https://offline-rl.github.io/


https://www.youtube.com/watch?v=qgZPZREor5I

* Offline RL: Tutorial, Review

https://arxiv.org/pdf/2005.01643

* Decision Transformer: RL via Sequence Modeling

https://arxiv.org/pdf/2106.01345

https://huggingface.co/blog/decision-transformers

Introducing Decision Transformers on Hugging Face 🤗


* Online Decision Transformer

https://arxiv.org/pdf/2202.05607

* Language Models in RL

https://huggingface.co/learn/deep-rl-course/unitbonus3/language-models

Language models in RL - Hugging Face Deep RL Course

