

TODAY'S ISSUE
TODAY’S DAILY DOSE OF DATA SCIENCE
[Hands-on] Build an AI Agent With Human-like Memory
If a memory-less AI Agent is deployed in production, every interaction with that Agent will be a blank slate.
- It doesn’t matter if the user mentioned their name five seconds ago… it’s forgotten.
- If the Agent solved an issue in the last session, it won’t remember it now.
With memory, your Agent becomes context-aware and genuinely usable in production.
Today, let us build an AI Agent with human-like memory. We have added a video above if you prefer that format.
Here’s our tech stack:
- ​Open-source Graphiti​ (by Zep) as the memory layer for our AI agent.
- AutoGen for agent orchestration.
- Ollama to locally serve Qwen 3.
Here’s the system overview:
- The user submits a query.
- The Agent saves the conversation and extracts facts into memory.
- The Agent retrieves the relevant facts and summarizes them.
- The Agent uses these facts and the chat history to produce informed responses.
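The loop above can be sketched in plain, dependency-free Python. Note that everything here is a naive stand-in for illustration: in the real system, fact extraction and retrieval are performed by Graphiti/Zep with an LLM and graph/semantic search, not keyword matching.

```python
# Minimal sketch of the memory loop: save turns, extract facts,
# retrieve relevant facts for a new query. All logic is a toy
# stand-in for what Graphiti/Zep actually does.

class MemoryStore:
    def __init__(self):
        self.history = []   # raw conversation turns
        self.facts = []     # extracted facts

    def save_turn(self, role, text):
        self.history.append((role, text))
        self.facts.extend(extract_facts(text))

    def retrieve(self, query):
        # Naive keyword overlap; Graphiti uses a temporal knowledge graph.
        words = set(query.lower().split())
        return [f for f in self.facts if words & set(f.lower().split())]

def extract_facts(text):
    # Toy extractor: treat "my X is Y" statements as facts.
    text = text.lower()
    if text.startswith("my ") and " is " in text:
        return [text]
    return []

memory = MemoryStore()
memory.save_turn("user", "My name is Avi")
memory.save_turn("user", "What's the weather like?")
print(memory.retrieve("What is my name?"))  # → ['my name is avi']
```

Small talk produces no facts, so only genuinely informative turns grow the memory; that is the same intuition behind extracting facts rather than storing raw transcripts.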
If you prefer a video, here's a detailed walkthrough:
Implementation
Integrating Memory with Agent
Let’s dive into the code!
Setup LLM
We'll use a locally served Qwen 3 via Ollama.
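AutoGen can talk to a locally running Ollama server through Ollama's OpenAI-compatible endpoint. A minimal config sketch is below; the model tag, port, and temperature are assumptions, so adjust them to your local setup.

```python
# Assumed setup: Ollama running locally and serving Qwen 3 on its
# default port. The model tag ("qwen3") may differ on your machine.
llm_config = {
    "config_list": [
        {
            "model": "qwen3",                         # assumed Ollama model tag
            "base_url": "http://localhost:11434/v1",  # Ollama's OpenAI-compatible API
            "api_key": "ollama",                      # placeholder; Ollama ignores it
        }
    ],
    "temperature": 0.7,
}
```

This dict is then passed as `llm_config` when constructing AutoGen agents.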

Initialise Zep Client
We're leveraging Zep's foundational memory layer to give our AutoGen agent persistent, long-term memory across conversations.

Create User Session
Create a Zep client session for the user, which the agent will use to manage memory. A user can have multiple sessions!
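Conceptually, a session is just a unique id that groups one conversation's turns under a user. A minimal sketch follows; the Zep SDK call shown in the comment is the rough shape of the registration step, so verify the exact method name against your SDK version.

```python
import uuid

user_id = "user_avi"            # hypothetical user identifier
session_id = uuid.uuid4().hex   # unique id for this chat session

# With the Zep Python SDK, the session would be registered roughly like:
#   zep_client.memory.add_session(session_id=session_id, user_id=user_id)
# (method name is an assumption; check Zep's docs for your version)

print(session_id)
```

Because sessions are keyed by id, the same `user_id` can own many sessions, which is what lets Zep connect facts across a user's separate conversations.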

Define Zep Conversable Agent
Our Zep memory agent builds on AutoGen's ConversableAgent, pulling live memory context from Zep Cloud on each user query.
It stays efficient by reusing the session we just created.
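The core of a memory-aware ConversableAgent is simple: before each reply, fetch memory context and fold it into the system prompt. A dependency-free sketch of that assembly step is below; the function name and prompt wording are ours, not AutoGen's or Zep's.

```python
def build_system_message(base_prompt, memory_facts):
    """Prepend retrieved memory context to the agent's system prompt.

    In the real agent this runs on every user query, with
    `memory_facts` coming from Zep's retrieval step.
    """
    if not memory_facts:
        return base_prompt
    context = "\n".join(f"- {fact}" for fact in memory_facts)
    return f"{base_prompt}\n\nRelevant facts about the user:\n{context}"

msg = build_system_message(
    "You are a helpful assistant.",
    ["The user's name is Avi", "The user prefers concise answers"],
)
print(msg)
```

Keeping memory injection at the prompt level means the underlying LLM needs no changes; any model served by Ollama works unmodified.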

Setting up Agents
We initialize the ConversableAgent and a stand-in human agent (a user proxy) to manage the chat interaction.

Handle Agentic Chat
The Zep Conversable Agent then produces a coherent, personalized response, seamlessly weaving retrieved memory into the ongoing conversation.
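One chat turn ties the pieces together: persist the user message, retrieve relevant memory, and call the LLM with an enriched prompt. Here is a sketch with the LLM injected as a plain callable so the flow stays visible; all names are illustrative, and in practice the callable is the Qwen 3 call through AutoGen.

```python
def handle_turn(user_msg, memory, llm):
    """One agentic chat turn: save → retrieve → respond.

    `memory` needs .save_turn(role, text) and .retrieve(query);
    `llm` is any callable mapping a prompt string to a reply string.
    """
    memory.save_turn("user", user_msg)                  # persist the new message
    facts = memory.retrieve(user_msg)                   # pull relevant memory
    prompt = f"Known facts: {facts}\nUser: {user_msg}"  # enrich the prompt
    reply = llm(prompt)
    memory.save_turn("assistant", reply)                # remember our own answer
    return reply

# Toy stand-ins to show the flow end to end:
class ToyMemory:
    def __init__(self):
        self.turns = []
        self.facts = ["user's name is Avi"]
    def save_turn(self, role, text):
        self.turns.append((role, text))
    def retrieve(self, query):
        return self.facts if "name" in query.lower() else []

echo_llm = lambda prompt: f"(reply using: {prompt.splitlines()[0]})"
print(handle_turn("What's my name?", ToyMemory(), echo_llm))
```

Saving the assistant's own reply back into memory is what makes the next turn aware of everything said so far, not just the user's side.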

Streamlit UI
We created a streamlined Streamlit UI to ensure smooth and simple interactions with the Agent.

Visualize Knowledge Graph
We can interactively map users’ conversations across multiple sessions with Zep Cloud's UI. This powerful tool allows us to visualize how knowledge evolves through a graph.

Done!
We have equipped our AI Agent with a SOTA memory layer.
​Find the complete code in the GitHub repository →​
We recommend watching the video attached at the top for better understanding!
As a reminder: by default, Agents forget everything after each task. The open-source memory toolkit ​Graphiti by Zep​, which we used today, lets Agents build and query temporally aware knowledge graphs!
​Check the GitHub repo here →​ (don’t forget to star)
Thanks for reading!
ROADMAP
From local ML to production ML
Once a model has been trained, we move to productionizing and deploying it.
If ideas related to production and deployment intimidate you, here’s a quick roadmap for you to upskill (assuming you know how to train a model):
- First, you would have to compress the model and productionize it. Read these guides:
- Reduce its size with ​Model Compression techniques​.
- ​Supercharge ​​PyTorch Models​​ With TorchScript.​
- If you use sklearn, learn how to ​optimize them with tensor operations​.
- Next, you move to deployment. ​Here’s a beginner-friendly hands-on guide​ that teaches you how to deploy a model, manage dependencies, set up model registry, etc.
- Although you would have tested the model locally, it is still wise to test it in production. There are risk-free (or low-risk) methods to do that. ​Learn what they are and how to implement them here​.
This roadmap should set you up well even if you have NEVER deployed a single model before, since everything is practical and implementation-driven.
THAT'S A WRAP
No-fluff industry ML resources to succeed in DS/ML roles

At the end of the day, all businesses care about impact. That’s it!
- Can you reduce costs?
- Drive revenue?
- Can you scale ML models?
- Predict trends before they happen?
We have covered several topics (with implementations) in the past that build exactly these skills.
Here are some of them:
- Learn sophisticated graph architectures and how to train them on graph data in this crash course.
- So many real-world NLP systems rely on pairwise context scoring. Learn scalable approaches here.
- Run large models on small devices using Quantization techniques.
- Learn how to generate prediction intervals or sets with strong statistical guarantees for increasing trust using Conformal Predictions.
- Learn how to identify causal relationships and answer business questions using causal inference in this crash course.
- Learn how to scale and implement ML model training in this practical guide.
- Learn 5 techniques with implementation to reliably test ML models in production.
- Learn how to build and implement privacy-first ML systems using Federated Learning.
- Learn 6 techniques with implementation to compress ML models.
All these resources will help you cultivate key skills that businesses and companies care about the most.
SPONSOR US
Advertise to 600k+ data professionals
Our newsletter puts your products and services directly in front of an audience that matters — thousands of leaders, senior data scientists, machine learning engineers, data analysts, etc., around the world.