Foundations of Reinforcement Learning
RL Part 1: Agents, environments, rewards, and why RL is different from supervised learning.
385 posts published
Diffusion LLMs Part 2: How dLLMs scale to 100B parameters, the inference stack that makes them fast, hands-on code, and when to actually use them.
...explained with exact prompts and usage!
A first-principles walk through agent memory (open-source).
Diffusion LLMs Part 1: Understanding how diffusion language models work from first principles, the math behind masked diffusion, and why they represent a fundamentally different approach to text generation.
Reduce token costs and improve performance...and how to use it with Claude!
A deep dive into what Anthropic, OpenAI, Perplexity and LangChain are actually building.
An exploration of real-world MLOps and LLMOps case studies, examining why reliable ML and AI engineering matters and how it translates into business outcomes.
A complete guide to CLAUDE.md, custom commands, skills, agents, and permissions, and how to set them up properly.
LLMOps Part 14: An overview of the fundamentals of LLM serving, including API-based access, inference with vLLM, and practical decisions.
LLMOps Part 13: Exploring the mechanics of LLM inference, from prefill and decode phases to KV caching, batching, and optimization techniques that improve latency and throughput.
LLMOps Part 12: Understanding LLM fine-tuning, parameter-efficient methods like LoRA and QLoRA, and alignment techniques such as RLHF, DPO, and GRPO.
...explained visually!
A case study on how Claude achieves a 92% cache hit rate.
LLMOps Part 11: Understanding evaluation of conversational LLM systems, tool evaluations, tracing with Langfuse, and automated red teaming.
LLMOps Part 10: Understanding model benchmarks, LLM application evaluation, and tooling.
LLMOps Part 9: A foundational guide to the evaluation of LLM applications, covering challenges and a practical taxonomy of evaluation methods.
LLMOps Part 8: A concise overview of memory and dynamic, temporal context in LLM systems, covering short- and long-term memory, dynamic context injection, and some common context failure modes in agentic applications.
LLMOps Part 7: A conceptual overview of context engineering, covering context types, context construction principles, and retrieval-centric techniques for building high-signal inputs.
LLMOps Part 6: Exploring prompt versioning, defensive prompting, and techniques such as verbalized sampling, role prompting and more.
LLMOps Part 5: An introduction to prompt engineering (a subset of context engineering), covering prompt types, the prompt development workflow, and key techniques in the field.
LLMOps Part 4: An exploration of key decoding strategies, sampling parameters, and the general lifecycle of LLM-based applications.
LLMOps Part 3: A focused look at the core ideas behind the attention mechanism, transformer and mixture-of-experts architectures, and model pretraining and fine-tuning.
Tools, prompts and resources form the three core capabilities of the MCP framework. Capabilities are essentially the features or functions that the server makes available. * Tools: Executable actions or functions that the AI (host/client) can invoke (often with side effects or external API calls). * Resources: Read-only data sources that
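The tools-vs-resources split above can be sketched as the JSON-RPC 2.0 messages MCP exchanges. The method names (`tools/call`, `resources/read`) follow the MCP specification; the tool name, arguments, and resource URI below are hypothetical examples, not real server capabilities:

```python
import json

# Hedged sketch: an MCP client invoking a server tool via JSON-RPC 2.0.
# Tools are executable actions, so the client sends a name plus arguments.
tool_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "get_weather",              # hypothetical tool
        "arguments": {"city": "Paris"},     # hypothetical arguments
    },
}

# Resources, by contrast, are read-only data addressed by a URI.
resource_request = {
    "jsonrpc": "2.0",
    "id": 2,
    "method": "resources/read",
    "params": {"uri": "file:///logs/app.log"},  # hypothetical resource
}

print(json.dumps(tool_request))
```

Note the asymmetry: a tool call carries arguments and may have side effects, while a resource read just names what data to fetch.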
At its heart, MCP follows a client-server architecture (much like the web or other network protocols). However, the terminology is tailored to the AI context. There are three main roles to understand: the Host, the Client, and the Server. Host The Host is the user-facing AI application, the environment where
Without MCP, adding a new tool or integrating a new model was a headache. If you had three AI applications and three external tools, you might end up writing nine different integration modules (each AI × each tool) because there was no common standard. This doesn’t scale. Developers of
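The combinatorial blow-up described above is easy to make concrete. The app and tool names below are hypothetical placeholders; the point is the counting:

```python
# Without a common protocol, each (app, tool) pair needs its own adapter.
apps = ["Claude Desktop", "Cursor", "ChatGPT"]   # hypothetical AI applications
tools = ["GitHub", "Slack", "Postgres"]          # hypothetical external tools

adapters_without_mcp = len(apps) * len(tools)    # one module per pair: 3 * 3 = 9
adapters_with_mcp = len(apps) + len(tools)       # each side implements MCP once: 3 + 3 = 6

print(adapters_without_mcp, adapters_with_mcp)   # 9 6
```

With ten apps and ten tools the gap widens to 100 pairwise adapters versus 20 protocol implementations, which is the scaling argument the standard rests on.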
Imagine you only know English. To get info from a person who only knows: * French, you must learn French. * German, you must learn German. * And so on. In this setup, learning even 5 languages will be a nightmare for you. But what if you add a translator that understands all