Introductory RL Resources

2024-1-19

Must-read papers and related resources for getting started with reinforcement learning.

Expert-guided reading series

强化学习炼丹手记 (Reinforcement Learning Alchemy Notes)

Paper notes on RL/DL:

https://www.zhihu.com/column/c_1121374124756480000

Chinese-language notes on RL papers, written by an expert:

https://github.com/2019ChenGong/RL-Paper-notes

Classic introductory RL papers

Value-based methods

Typically used for problems with discrete action spaces.
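One way to see why value-based methods pair naturally with discrete actions: acting greedily means taking a maximum over actions, which is a plain argmax when the actions can be enumerated. The snippet below is a minimal sketch of epsilon-greedy action selection in pure Python. The function name `epsilon_greedy` and its parameters are hypothetical and not taken from any of the papers listed here.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick an action index from a list of per-action Q-value estimates.

    q_values: one estimated Q-value per discrete action (assumed input).
    epsilon:  probability of choosing a uniformly random action.
    """
    if rng.random() < epsilon:
        # Explore: sample any action uniformly at random.
        return rng.randrange(len(q_values))
    # Exploit: argmax over the enumerable action set.
    return max(range(len(q_values)), key=lambda a: q_values[a])


# With epsilon = 0 the choice is purely greedy.
greedy_action = epsilon_greedy([0.1, 0.9, 0.3], epsilon=0.0)
```

With a continuous action space this argmax has no simple enumeration, which is one motivation for the policy-based methods listed further down.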

DQN

Playing Atari with Deep Reinforcement Learning (2013)
Human-level control through deep reinforcement learning (2015, published in Nature)

Double DQN (DDQN)

Deep Reinforcement Learning with Double Q-learning

Dueling DQN

Dueling Network Architectures for Deep Reinforcement Learning

Policy-based methods

Applicable to both discrete and continuous action spaces.

stochastic policy:

A3C:

Asynchronous Methods for Deep Reinforcement Learning (2016)

TRPO:

Trust Region Policy Optimization (2015)

PPO:

Proximal Policy Optimization Algorithms (2017/08 v2)

SAC:

Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
Soft Actor-Critic Algorithms and Applications
Soft Actor-Critic for Discrete Action Settings

deterministic policy:

DDPG

Continuous control with deep reinforcement learning (ICLR 2016, DeepMind)

TD3

Addressing Function Approximation Error in Actor-Critic Methods

Tricks

GAE

High-Dimensional Continuous Control Using Generalized Advantage Estimation
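For orientation before reading the paper, here is a minimal sketch of the GAE recursion: the TD residual is delta_t = r_t + gamma * V(s_{t+1}) - V(s_t), and the advantage accumulates backward as A_t = delta_t + gamma * lambda * A_{t+1}. The function name `gae_advantages`, the default values of `gamma` and `lam`, and the assumption of a single trajectory with no episode boundary inside it are illustrative choices, not something specified in this post.

```python
def gae_advantages(rewards, values, last_value, gamma=0.99, lam=0.95):
    """Generalized advantage estimates for one trajectory segment.

    rewards:    r_0 .. r_{T-1}
    values:     V(s_0) .. V(s_{T-1}) from a critic (assumed given)
    last_value: V(s_T), the bootstrap value; use 0.0 if s_T is terminal
    """
    advantages = [0.0] * len(rewards)
    next_value = last_value
    running = 0.0
    # Walk backward so each step can reuse the advantage of the next step.
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * next_value - values[t]  # TD residual
        running = delta + gamma * lam * running
        advantages[t] = running
        next_value = values[t]
    return advantages
```

Setting `lam=0` reduces this to the one-step TD residual, while `lam=1` gives the discounted Monte Carlo return minus the value baseline; values in between trade bias against variance.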

Retrace

Safe and efficient off-policy reinforcement learning (DeepMind 2016)

Author: Zaber
Link: https://blog.zaberlab.com/article/rl_intro
License: This article is licensed under CC BY-NC-SA 4.0. Please credit the source when reposting.
