Reinforcement Learning Course
Markov Decision Processes and Value Functions
A collection of 3 posts
RL Part 2: Markov decision processes, returns, policies, and value functions.
RL Part 1: Agents, environments, rewards, and why RL is different from supervised learning.
A series of technical deep dives on reinforcement learning, covering the fundamentals and background, classical techniques, MDPs, Bellman equations, deep RL methods, the use of RL to train modern language models, agentic RL, and more.