Reinforcement Learning

How to Beat GRPO Without Touching Model Weights

Berkeley beat GRPO by 10 points with 35× fewer rollouts and no GPU training.

May 2

Reinforcement Learning

How Top AI Labs Are Building RL Agents in 2026

The era of not writing custom reward functions.

Apr 28

Reinforcement Learning Course

A series of technical deep dives on reinforcement learning, covering fundamentals and background, classical techniques, MDPs, Bellman equations, deep RL methods, how RL is used to train modern language models, agentic RL, and more.

Apr 25