Markov Decision Processes and Value Functions
RL Part 2: Markov decision processes, returns, policies, and value functions.