Reinforcement learning

Jimmy (xiaoke) Shen

Jul 30, 2022


Screenshot from [1]

Based on [1], RL algorithms can be grouped into three families:

Value-based
Policy-based
Combined value-based and policy-based (for example, actor-critic methods)
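
To make the three families concrete, here is a minimal sketch on a toy problem. Everything in it is an assumption made for illustration and does not come from [1]: the two-armed bandit with fixed rewards, the function names, and the hyperparameters. The value-based learner estimates action values and acts greedily on them, the policy-based learner adjusts softmax action preferences directly with a REINFORCE-style update, and the combined learner uses a learned value as a baseline for the policy update.

```python
import math
import random

# Hypothetical 2-armed bandit used only for illustration: arm 1 always pays more.
REWARDS = [0.0, 1.0]


def softmax(prefs):
    """Turn action preferences into a probability distribution."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def value_based(steps=500, alpha=0.1, epsilon=0.1, seed=0):
    """Learn action values Q(a); the policy is implicit (epsilon-greedy on Q)."""
    rng = random.Random(seed)
    q = [0.0, 0.0]
    for _ in range(steps):
        if rng.random() < epsilon:
            a = rng.randrange(2)                    # explore
        else:
            a = max(range(2), key=lambda i: q[i])   # exploit current estimates
        q[a] += alpha * (REWARDS[a] - q[a])         # move Q(a) toward the reward
    return q


def policy_based(steps=500, lr=0.1, seed=0):
    """Learn policy parameters directly (REINFORCE-style, no value function)."""
    rng = random.Random(seed)
    theta = [0.0, 0.0]
    for _ in range(steps):
        probs = softmax(theta)
        a = 0 if rng.random() < probs[0] else 1
        r = REWARDS[a]
        for i in range(2):
            # Gradient of log pi(a) w.r.t. theta[i] for a softmax policy.
            grad_log = (1.0 if i == a else 0.0) - probs[i]
            theta[i] += lr * r * grad_log
    return softmax(theta)


def actor_critic(steps=500, lr=0.1, alpha=0.1, seed=0):
    """Combine both: a policy (actor) updated with a learned value baseline (critic)."""
    rng = random.Random(seed)
    theta = [0.0, 0.0]
    v = 0.0  # critic: running estimate of the expected reward
    for _ in range(steps):
        probs = softmax(theta)
        a = 0 if rng.random() < probs[0] else 1
        advantage = REWARDS[a] - v
        v += alpha * advantage
        for i in range(2):
            grad_log = (1.0 if i == a else 0.0) - probs[i]
            theta[i] += lr * advantage * grad_log
    return softmax(theta), v


if __name__ == "__main__":
    print("value-based Q:", value_based())
    print("policy-based pi:", policy_based())
    print("actor-critic (pi, v):", actor_critic())
```

In this sketch all three learners end up preferring arm 1. The difference is what each one stores: action values, policy parameters, or both.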

Some good books and tutorials can be found in [2][3][4].

This article is not complete yet; I will keep updating it. If you have any suggestions, please leave a comment. Thanks!

References

[1] Lihong Yi, video lecture (in Chinese)

[3] Lectures by David Silver, DeepMind

[4] Lecture by John Schulman (first author of PPO), OpenAI


Written by Jimmy (xiaoke) Shen

MLE/SWE @meta
