
Following [1], RL algorithms can be grouped into three families:
- Value-based
- Policy-based
- Both value-based and policy-based (for example, actor-critic methods)
Some nice books and tutorials can be found in [2][3][4].
This article is not complete yet; I will keep updating it. If you have any suggestions, please leave a comment. Thanks!
References
[1] Lihong Yi, video lecture (in Chinese)
[2] textbook
[3] Lectures by David Silver (DeepMind)
[4] Lecture by John Schulman (first author of PPO), OpenAI
I have struggled with paper writing for a long time, especially with the abstract. Professor Liang Huang’s lecture helped a lot with how to write up papers. The lecture notes can be found here. In this article, I focus on how to write a paper abstract.

In the following section, I’d like to analyze several of Professor Huang’s papers, and some other papers, based on the rules above.
Papers from Professor Huang’s team
grep: searching only a specified file type
For example, to search only Python files under the current directory (-r recurses, -i ignores case):
grep -ri "hello_world" --include="*.py" .
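To see the filter in action, here is a small self-contained sketch. It assumes a grep that supports `--include` (GNU grep does) and uses a throwaway directory from `mktemp -d`; the file names `a.py` and `notes.txt` and the variable names are made up for the demo.

```shell
# Build a throwaway directory with one Python file and one text file.
# Both contain the search string, in different letter cases.
demo_dir=$(mktemp -d)
printf 'print("Hello_World")\n' > "$demo_dir/a.py"
printf 'hello_world in notes\n' > "$demo_dir/notes.txt"

# -r recurses, -i ignores case, -l prints only the names of matching files.
# Quoting "*.py" keeps the shell from expanding the glob before grep sees it.
matches=$(grep -ril "hello_world" --include="*.py" "$demo_dir")
echo "$matches"

# Clean up the demo directory.
rm -rf "$demo_dir"
```

Only the path to `a.py` should be listed: `notes.txt` also contains the string, but the `--include` filter skips it.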