Building Blocks of LLMs: Attention, Architectural Designs and Training
LLMOps Part 3: A focused look at the core ideas behind the attention mechanism, transformer and mixture-of-experts architectures, and model pretraining and fine-tuning.
Tools, prompts, and resources form the three core capabilities of the MCP framework. Capabilities are the features or functions that the server makes available. * Tools: Executable actions or functions that the AI (host/client) can invoke, often with side effects or external API calls. * Resources: Read-only data sources that...
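The tool/resource split above can be sketched in plain Python. This is a hypothetical illustration of the idea, not the official MCP SDK: the `Server`, `tool`, and `resource` names are assumptions chosen to mirror the concepts (invocable actions vs. read-only data keyed by URI).

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical sketch of an MCP-style capability registry.
# Names are illustrative, not the official MCP SDK API.

@dataclass
class Server:
    tools: dict = field(default_factory=dict)       # invocable actions
    resources: dict = field(default_factory=dict)   # read-only data sources

    def tool(self, name: str):
        """Register an executable action the client can invoke."""
        def register(fn: Callable):
            self.tools[name] = fn
            return fn
        return register

    def resource(self, uri: str):
        """Register a read-only data source identified by a URI."""
        def register(fn: Callable):
            self.resources[uri] = fn
            return fn
        return register

server = Server()

@server.tool("get_weather")
def get_weather(city: str) -> str:
    # A tool: in practice this might call an external API (stubbed here).
    return f"Sunny in {city}"

@server.resource("docs://readme")
def readme() -> str:
    # A resource: returns data only, no side effects.
    return "Project README contents"

print(server.tools["get_weather"]("Paris"))   # Sunny in Paris
print(server.resources["docs://readme"]())    # Project README contents
```

The key distinction the sketch makes concrete: tools are looked up by name and *invoked* with arguments, while resources are looked up by URI and simply *read*.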
At its heart, MCP follows a client-server architecture (much like the web or other network protocols). However, the terminology is tailored to the AI context. There are three main roles to understand: the Host, the Client, and the Server. Host: The Host is the user-facing AI application, the environment where...
Without MCP, adding a new tool or integrating a new model was a headache. If you had three AI applications and three external tools, you might end up writing nine different integration modules (each AI × each tool) because there was no common standard. This doesn't scale. Developers of...
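The nine-module count follows directly from the pairing: without a shared protocol, every (application, tool) pair needs its own adapter, so integrations grow multiplicatively; with a common standard, each side implements the protocol once and growth becomes additive. A two-line sanity check:

```python
# Bespoke integrations without a common standard: one adapter per
# (application, tool) pair -- multiplicative growth.
apps, tools = 3, 3
without_standard = apps * tools   # 3 x 3 = 9 adapters

# With a shared protocol such as MCP, each app and each tool
# implements the protocol once -- additive growth.
with_standard = apps + tools      # 3 + 3 = 6 implementations

print(without_standard, with_standard)  # 9 6
```

At 10 apps and 10 tools the gap widens to 100 adapters versus 20 protocol implementations, which is the scaling argument the teaser is making.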
Imagine you only know English. To get info from a person who only knows: * French, you must learn French. * German, you must learn German. * And so on. In this setup, learning even five languages would be a nightmare. But what if you add a translator that understands all...
LLMOps Part 2: A detailed walkthrough of tokenization, embeddings, and positional representations, building the foundational translation layer that enables LLMs to process and reason over text.
AI Agents Crash Course—Part 17 (with implementation).
AI Agents Crash Course—Part 16 (with implementation).
LLMOps Part 1: An overview of AI engineering and LLMOps, and the core dimensions that define modern AI systems.
A comprehensive guide to Opik, an open-source LLM evaluation and observability framework.
MLOps Part 18: A hands-on guide to CI/CD in MLOps with DVC, Docker, GitHub Actions, and GitOps-based Kubernetes delivery on Amazon EKS.
MLOps Part 17: ML monitoring in practice with Evidently, Prometheus and Grafana, stitched into a FastAPI inference service with drift reports, metrics scraping, and dashboards.
AI Agents Crash Course—Part 15 (with implementation).
MLOps Part 16: A comprehensive overview of drift detection using statistical techniques, and how logging and observability keep ML systems healthy.
MLOps Part 15: Understanding the EKS lifecycle, getting hands-on with AWS setup, and deploying a simple ML inference service on Amazon EKS.
MLOps Part 14: Understanding AWS cloud platform, and zooming into EKS.
...explained step-by-step with code.
Chat with videos and get precise timestamps.
...that actually prevents hallucinations (explained visually).
MLOps Part 13: An overview of cloud concepts that matter, from virtualization and storage choices to VPC, load balancing, identity, and observability.
MLOps Part 12: An introduction to Kubernetes, plus a practical walkthrough of deploying a simple FastAPI inference service using Kubernetes.
Multi-agent (100% local).
Generate realistic data using existing data (100% local).
...with hands-on implementation.
MLOps Part 11: A practical guide to taking models beyond notebooks, exploring serialization formats, containerization, and serving predictions using REST and gRPC.