Building Blocks of LLMs: Decoding, Generation Parameters, and the LLM Application Lifecycle
LLMOps Part 4: An exploration of key decoding strategies, sampling parameters, and the general lifecycle of LLM-based applications.
LLMOps Part 3: A focused look at the core ideas behind the attention mechanism, transformer and mixture-of-experts architectures, and model pretraining and fine-tuning.
LLMOps Part 2: A detailed walkthrough of tokenization, embeddings, and positional representations, building the foundational translation layer that enables LLMs to process and reason over text.
AI Agents Crash Course—Part 17 (with implementation).
AI Agents Crash Course—Part 16 (with implementation).
LLMOps Part 1: An overview of AI engineering and LLMOps, and the core dimensions that define modern AI systems.
A comprehensive guide to Opik, an open-source LLM evaluation and observability framework.
MLOps Part 18: A hands-on guide to CI/CD in MLOps with DVC, Docker, GitHub Actions, and GitOps-based Kubernetes delivery on Amazon EKS.
MLOps Part 17: ML monitoring in practice with Evidently, Prometheus and Grafana, stitched into a FastAPI inference service with drift reports, metrics scraping, and dashboards.
AI Agents Crash Course—Part 15 (with implementation).
MLOps Part 16: A comprehensive overview of drift detection using statistical techniques, and how logging and observability keep ML systems healthy.
MLOps Part 15: Understanding the EKS lifecycle, getting hands-on with AWS setup, and deploying a simple ML inference service on Amazon EKS.
MLOps Part 14: Understanding the AWS cloud platform, and zooming into EKS.
MLOps Part 13: An overview of cloud concepts that matter, from virtualization and storage choices to VPC, load balancing, identity, and observability.
MLOps Part 12: An introduction to Kubernetes, plus a practical walkthrough of deploying a simple FastAPI inference service using Kubernetes.
MLOps Part 11: A practical guide to taking models beyond notebooks, exploring serialization formats, containerization, and serving predictions using REST and gRPC.
MLOps Part 10: A comprehensive guide to model compression covering knowledge distillation, low-rank factorization, and quantization, followed by ONNX and ONNX Runtime as the bridge from training frameworks to fast, portable production inference.
MLOps Part 9: A deep dive into model fine-tuning and compression, specifically pruning and related improvements.
MLOps Part 8: A systems-first guide to model development and optimizing performance with disciplined hyperparameter tuning.
MLOps Part 7: An applied look at distributed data processing with Spark and workflow orchestration and scheduling with Prefect.
MLOps Part 6: A deep dive into sampling, class imbalance, and data leakage; plus a hands-on Feast feature store demo.
MLOps Part 5: A detailed walkthrough of data engineering for MLOps, covering data sources, format performance trade-offs, and ETL/ELT pipelines.
MLOps Part 4: A practical walkthrough of W&B-powered reproducibility.
MLOps Part 3: A practical exploration of reproducibility and versioning, covering deterministic training, data and model versioning, and experiment tracking.
MLOps Part 2: A deeper look at the ML lifecycle, plus a minimal train-to-API and containerization demo using FastAPI and Docker.
MLOps Part 1: An introduction to machine learning in production, covering pitfalls, system-level concerns, and an overview of the full ML lifecycle.
MCP Part 9: Building a full-fledged research assistant with MCP and LangGraph.
MCP Part 8: Integration of the model context protocol (MCP) with LangGraph, LlamaIndex, CrewAI, and PydanticAI.
MCP Part 7: A deep dive into sandboxing and why MCP needs it.
A detailed guide to vector databases and their utility in LLMs, along with a hands-on demo.