

TODAY'S ISSUE
TODAY’S DAILY DOSE OF DATA SCIENCE
A Visual Guide to Agent2Agent (A2A) Protocol
Agentic applications require both A2A and MCP.
- MCP provides agents with access to tools.
- A2A allows agents to connect with other agents and collaborate in teams.
Today, let's clearly understand what A2A is and how it can work with MCP.
If you don't know about MCP servers, we covered them recently in the newsletter here:
In a gist:
- The Agent2Agent (A2A) protocol lets AI agents connect to other agents.
- The Model Context Protocol (MCP) lets AI agents connect to tools and APIs.
So while two agents talk to each other over A2A, each of them may itself be communicating with MCP servers.
In that sense, they do not compete with each other.
To explain further, Agent2Agent (A2A) enables multiple AI agents to work together on tasks without directly sharing their internal memory, thoughts, or tools.
Instead, they communicate by exchanging context, task updates, instructions, and data.
Essentially, AI applications can model A2A agents as MCP resources, represented by their Agent Card (more on that shortly).
Using this, AI agents connected to an MCP server can discover new agents to collaborate with and connect to them via the A2A protocol.
Remote agents that support A2A must publish a JSON "Agent Card" detailing their capabilities and authentication requirements.
Clients use this to find and communicate with the best agent for a task.
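To make that concrete, here is a rough sketch (in Python, with illustrative field names rather than the official A2A schema) of the kind of information an Agent Card carries:

```python
# Illustrative sketch of an A2A Agent Card. Field names are assumptions for
# demonstration, not the official schema -- consult the A2A spec for the
# exact format.
import json

agent_card = {
    "name": "currency-agent",                      # hypothetical remote agent
    "description": "Converts amounts between currencies.",
    "url": "https://agents.example.com/currency",  # where the agent is reachable
    "version": "1.0.0",
    "capabilities": {"streaming": True},           # supported interaction modes
    "authentication": {"schemes": ["bearer"]},     # how clients must authenticate
    "skills": [
        {
            "id": "convert-currency",
            "name": "Convert currency",
            "description": "Convert an amount from one currency to another.",
        }
    ],
}

# A client discovers remote agents by fetching cards like this (typically from
# a well-known URL on the agent's host), then routes a task to the agent whose
# skills best match it.
print(json.dumps(agent_card, indent=2))
```

The important part is that the card is machine-readable metadata: a client can read it to decide what the agent can do and how to authenticate with it before sending any task.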
There are several things that make A2A powerful:
- Secure collaboration
- Task and state management
- Capability discovery
- Interoperability between agents built with different frameworks (LlamaIndex, CrewAI, etc.)
Additionally, it can integrate with MCP.
While it's still new, it's a welcome step toward standardizing agent-to-agent collaboration, similar to what MCP does for agent-to-tool interaction.
What are your thoughts?
We shall cover this from an implementation perspective soon.
Stay tuned!
IN CASE YOU MISSED IT
LoRA/QLoRA: Explained From a Business Lens
Consider the size difference between BERT-large (~340M parameters) and GPT-3 (175B parameters).
I have fine-tuned BERT-large several times on a single GPU using traditional fine-tuning.

But this is impossible with GPT-3, which has 175B parameters. That's 350GB of memory just to store model weights under float16 precision.
This means that if OpenAI used traditional fine-tuning within its fine-tuning API, it would have to maintain one model copy per user:
- If 10 users fine-tuned GPT-3 → they need 3500 GB to store model weights.
- If 1000 users fine-tuned GPT-3 → they need 350k GB to store model weights.
- If 100k users fine-tuned GPT-3 → they need 35 million GB to store model weights.
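For reference, these numbers come straight from multiplying the parameter count by 2 bytes per parameter (float16) and then by the number of fine-tuned copies:

```python
# Back-of-the-envelope storage cost: 175B parameters at 2 bytes each (float16).
params = 175e9
bytes_per_param = 2
one_copy_gb = params * bytes_per_param / 1e9
print(f"one copy: {one_copy_gb:,.0f} GB")  # 350 GB

for users in (10, 1_000, 100_000):
    # one full fine-tuned copy per user under traditional fine-tuning
    print(f"{users:>7,} users: {users * one_copy_gb:,.0f} GB")
```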
And the problems don't end there:
- OpenAI bills solely based on usage. What if someone fine-tunes the model for fun or learning purposes but never uses it?
- Since a request can come anytime, should they always keep the fine-tuned model loaded in memory? Wouldn't that waste resources since several models may never be used?
LoRA (+ QLoRA and other variants) neatly solved this critical business problem.
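The core idea: keep the pretrained weights frozen and train only a small low-rank update, so each user's fine-tune becomes a tiny adapter rather than a full 350 GB copy of the model. Here's a minimal, illustrative PyTorch sketch (not any specific library's implementation):

```python
# Minimal LoRA-style layer (illustrative sketch). The pretrained weight stays
# frozen; only the small low-rank matrices A and B are trained, so each
# "fine-tuned model" is just a tiny set of adapter weights.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, in_features, out_features, rank=8, alpha=16):
        super().__init__()
        self.base = nn.Linear(in_features, out_features)
        self.base.weight.requires_grad_(False)  # freeze pretrained weights
        self.base.bias.requires_grad_(False)
        # Low-rank update: delta_W = B @ A has rank * (in + out) parameters
        # instead of in * out.
        self.A = nn.Parameter(torch.randn(rank, in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(out_features, rank))
        self.scaling = alpha / rank

    def forward(self, x):
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scaling

layer = LoRALinear(1024, 1024)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
total = sum(p.numel() for p in layer.parameters())
print(f"trainable: {trainable:,} of {total:,} parameters")  # ~16K of ~1.07M
```

Serving also gets cheaper: the base model stays loaded once, and per-user adapters can be swapped in on demand.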
ROADMAP
From local ML to production ML
Once a model has been trained, we move to productionizing and deploying it.
If ideas related to production and deployment intimidate you, here’s a quick roadmap for you to upskill (assuming you know how to train a model):
- First, you would have to compress the model and productionize it. Read these guides:
- Reduce its size with Model Compression techniques.
- Supercharge PyTorch Models with TorchScript.
- If you use sklearn, learn how to optimize them with tensor operations.
- Next, you move to deployment. Here's a beginner-friendly hands-on guide that teaches you how to deploy a model, manage dependencies, set up a model registry, etc.
- Although you would have tested the model locally, it is still wise to test it in production. There are risk-free (or low-risk) methods to do that. Learn what they are and how to implement them here.
This roadmap should set you up pretty well, even if you have NEVER deployed a single model before, since everything is practical and implementation-driven.
THAT'S A WRAP
No-Fluff Industry ML Resources to Succeed in DS/ML Roles

At the end of the day, all businesses care about impact. That’s it!
- Can you reduce costs?
- Drive revenue?
- Can you scale ML models?
- Predict trends before they happen?
We have discussed several other topics (with implementations) in the past that align with these goals.
Here are some of them:
- Learn sophisticated graph architectures and how to train them on graph data in this crash course.
- So many real-world NLP systems rely on pairwise context scoring. Learn scalable approaches here.
- Run large models on small devices using Quantization techniques.
- Learn how to generate prediction intervals or sets with strong statistical guarantees to increase trust, using Conformal Predictions.
- Learn how to identify causal relationships and answer business questions using causal inference in this crash course.
- Learn how to scale and implement ML model training in this practical guide.
- Learn 5 techniques with implementation to reliably test ML models in production.
- Learn how to build and implement privacy-first ML systems using Federated Learning.
- Learn 6 techniques with implementation to compress ML models.
All these resources will help you cultivate key skills that businesses and companies care about the most.
SPONSOR US
Advertise to 600k+ data professionals
Our newsletter puts your products and services directly in front of an audience that matters — thousands of leaders, senior data scientists, machine learning engineers, data analysts, etc., around the world.