Reinforcement learning
Jul 30, 2022
Following [1], RL algorithms can be grouped into three families:
- Value-based
- Policy-based
- Combined value- and policy-based (commonly called actor-critic)
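To make the distinction concrete, here is a minimal sketch of one update step from each of the first two families. It is an illustration under stated assumptions, not code taken from [1]: the value-based step is a tabular Q-learning update, the policy-based step is a REINFORCE-style update for a softmax policy over action preferences, and all function names, the learning rates, and the discount factor are hypothetical choices made for this example. Combined methods typically keep both pieces and use a learned value estimate in place of the raw return in the policy update.

```python
import math


def q_learning_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.99):
    """Value-based step (illustrative): move Q(s, a) toward the TD target
    r + gamma * max_a' Q(s', a'). `q` is a list of per-state action-value lists."""
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])
    return q


def softmax(prefs):
    """Turn action preferences into probabilities (shifted by the max for stability)."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_update(prefs, action, ret, lr=0.1):
    """Policy-based step (illustrative): gradient ascent on ret * log pi(action).
    For a softmax policy, d log pi(a) / d prefs[k] = 1[k == a] - pi(k)."""
    probs = softmax(prefs)
    return [
        p + lr * ret * ((1.0 if k == action else 0.0) - probs[k])
        for k, p in enumerate(prefs)
    ]


# Hypothetical toy data: 2 states, 2 actions, all values start at zero.
q_table = [[0.0, 0.0], [0.0, 0.0]]
q_table = q_learning_update(q_table, state=0, action=1, reward=1.0, next_state=1)

# Hypothetical single-state policy with two actions; action 0 earned return 1.0.
preferences = reinforce_update([0.0, 0.0], action=0, ret=1.0)
```

The value-based step never represents a policy directly; the agent acts by picking the action with the highest Q-value. The policy-based step never stores values; it raises the probability of actions that led to high returns.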
Some good books and tutorials can be found in [2][3][4].
This article is not complete yet; I will keep updating it. If you have any suggestions, please leave a comment. Thanks!
References
[1] Lihong Yi, video (in Chinese)
[2] textbook
[3] Lectures by David Silver, DeepMind
[4] Lecture by John Schulman (first author of PPO), OpenAI