
TODAY'S ISSUE
TODAY’S DAILY DOSE OF DATA SCIENCE
A crash course on RAG systems - Part 2
Last week, we started a crash course on building RAG systems.
Part 2 is now available, and it builds on the foundations laid in Part 1.
Read here: A Crash Course on Building RAG Systems – Part 2 (With Implementation).
Why care?
Over the last few weeks, we have spent plenty of time understanding the key components of real-world NLP systems (like the deep dives on bi-encoders and cross-encoders for context pair similarity scoring).
RAG is another key NLP system that gained massive attention because it addresses a core limitation of LLMs: injecting up-to-date, domain-specific knowledge at query time instead of baking it into the model's weights.
More specifically, if you know how to build a reliable RAG system, you can often bypass the effort and cost of fine-tuning LLMs.
That’s a considerable cost saving for enterprises.
And at the end of the day, all businesses care about impact. That’s it!
- Can you reduce costs?
- Can you drive revenue?
- Can you scale ML models?
- Can you predict trends before they happen?
Thus, the objective of this crash course is to help you implement reliable RAG systems, understand their underlying challenges, and develop the expertise to build RAG apps on top of LLMs, a skill every industry cares about right now.
- Read the first part here →
- Read the second part here →
Of course, if you have never worked with LLMs, that’s okay. We cover everything in a practical and beginner-friendly way.
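As a preview, the core retrieve-then-generate pattern at the heart of RAG can be sketched in a few lines. This is a hypothetical, illustrative sketch only: the retriever below is a toy keyword-overlap scorer (real systems use bi-encoders and vector databases, as covered in the course), and the resulting prompt would be sent to an actual LLM.

```python
# Minimal retrieve-then-generate (RAG) sketch.
# Illustrative stubs only: the retriever is a toy keyword-overlap
# scorer, and the built prompt would go to a real LLM in practice.

def _words(text: str) -> set[str]:
    """Lowercase and strip basic punctuation for crude keyword matching."""
    return {w.strip(".,?!").lower() for w in text.split()}

def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Rank documents by keyword overlap with the query; return the top k."""
    return sorted(documents, key=lambda d: len(_words(query) & _words(d)), reverse=True)[:k]

def build_prompt(query: str, context: list[str]) -> str:
    """Augment the user query with retrieved context before the LLM call."""
    ctx = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{ctx}\n\nQuestion: {query}"

docs = [
    "RAG retrieves relevant documents at query time.",
    "Fine-tuning updates model weights on new data.",
    "Transformers process text with attention.",
]
query = "How does RAG retrieve relevant documents?"
print(build_prompt(query, retrieve(query, docs)))
```

The key point is that the LLM itself is never retrained; only the prompt changes, which is exactly why RAG sidesteps fine-tuning costs.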
IN CASE YOU MISSED IT
Extend the context length of LLMs
- GPT-3.5-turbo had a context window of 4,096 tokens.
- Later, GPT-4 took that to 8,192–32,768 tokens.
- Claude 2 reached 100,000 tokens.
- Llama 3.1 → 128,000 tokens.
- Gemini → 1M+ tokens.
We have been making great progress in extending the context window of LLMs.
But how?
Earlier this week, we covered techniques that help unlock larger context windows.
Read the techniques to extend the context length of LLMs here →
IN CASE YOU MISSED IT
Building a 100% local multi-agent Internet research assistant with OpenAI Swarm & Llama 3.2
Recently, OpenAI released Swarm.
It’s an open-source framework designed to manage and coordinate multiple AI agents in a highly customizable way.
AI agents are autonomous systems that can reason, plan, identify relevant sources, extract information from them when needed, take actions, and even self-correct when something goes wrong.
We published a practical and hands-on demo of this in the newsletter. We built an internet research assistant app that:
- Accepts a user query.
- Searches the web for relevant information.
- And turns it into a well-crafted article.
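The shape of that pipeline can be sketched with plain Python functions. Note this is a hedged, simplified sketch: `search_web` and `write_article` below are illustrative stubs, not the actual implementation, which coordinates agents with OpenAI Swarm and generates text locally with Llama 3.2.

```python
# Plain-Python sketch of the query -> search -> article pipeline.
# `search_web` and `write_article` are illustrative stubs; the real demo
# coordinates agents with OpenAI Swarm and uses Llama 3.2 for generation.

def search_web(query: str) -> list[str]:
    """Stub: a real research agent would call a search tool and return snippets."""
    return [f"Snippet about {query} from source {i}" for i in range(1, 4)]

def write_article(query: str, snippets: list[str]) -> str:
    """Stub: a real writer agent would prompt an LLM with the research notes."""
    notes = "\n".join(f"- {s}" for s in snippets)
    return f"# {query.title()}\n\nResearch notes:\n{notes}"

def research_assistant(query: str) -> str:
    """Coordinate the two steps: research first, then write."""
    return write_article(query, search_web(query))

print(research_assistant("local LLM agents"))
```

Splitting the work into a researcher step and a writer step mirrors the multi-agent design: each agent has one narrow job, and a coordinator hands results from one to the next.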
The demo is shown below: