A crash course on RAG systems - Part 2

Beginner-friendly and with implementation.



TODAY’S DAILY DOSE OF DATA SCIENCE

A crash course on RAG systems - Part 2

Last week, we started a crash course on building RAG systems.

Part 2 is now available; it builds on the foundations laid in Part 1.

Read here: A Crash Course on Building RAG Systems – Part 2 (With Implementation).

Why care?

Over the last few weeks, we have spent plenty of time understanding the key components of real-world NLP systems (like the deep dives on bi-encoders and cross-encoders for context pair similarity scoring).

RAG is another important NLP system, one that gained massive attention because it addresses a core challenge of working with LLMs: grounding them in knowledge they were never trained on.

More specifically, if you know how to build a reliable RAG system, you can bypass the challenge and cost of fine-tuning LLMs.

That’s a considerable cost saving for enterprises.

And at the end of the day, all businesses care about impact. That’s it!

  • Can you reduce costs?
  • Drive revenue?
  • Can you scale ML models?
  • Predict trends before they happen?

Thus, the objective of this crash course is to help you implement reliable RAG systems, understand the underlying challenges, and develop expertise in building RAG apps on LLMs, which every industry cares about now.

Of course, if you have never worked with LLMs, that’s okay. We cover everything in a practical and beginner-friendly way.
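To make the core idea concrete, here is a minimal, hypothetical sketch of the retrieve-then-augment loop at the heart of RAG. It substitutes a toy bag-of-words similarity for a real neural retriever, and the documents and stopword list are made up purely for illustration:

```python
import math
import re
from collections import Counter

# A made-up minimal stopword list, just for this toy example.
STOPWORDS = {"what", "is", "the", "a", "an", "our", "on", "of"}

def embed(text: str) -> Counter:
    # Toy "embedding": a bag-of-words count vector.
    # Real RAG systems use a neural bi-encoder here.
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(t for t in tokens if t not in STOPWORDS)

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    # Rank documents by similarity to the query; keep the top-k.
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    # Augment the prompt with retrieved context before calling the LLM.
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "Our refund policy allows returns within 30 days.",
    "The office is closed on public holidays.",
    "Shipping takes 3-5 business days.",
]
print(build_prompt("What is the refund policy?", docs))
```

The key point the sketch shows: instead of fine-tuning the model on your documents, you retrieve the relevant ones at query time and inject them into the prompt.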

IN CASE YOU MISSED IT

Extend the context length of LLMs

  • GPT-3.5-turbo had a context window of 4,096 tokens.
  • Later, GPT-4 took that to 8,192–32,768 tokens.
  • Claude 2 reached 100,000 tokens.
  • Llama 3.1 → 128,000 tokens.
  • Gemini → 1M+ tokens.

We have been making great progress in extending the context window of LLMs.

But how?

We covered techniques that help us unlock larger context windows earlier this week.
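As a taste of how such techniques work, here is a hypothetical sketch of one well-known idea, position interpolation for rotary position embeddings (RoPE) — not necessarily the exact set covered in the post. Positions beyond the trained window are rescaled so the model only ever sees position values in the range it was trained on:

```python
# Sketch of position interpolation for RoPE (an illustrative example,
# not the newsletter's exact implementation).

def rope_angle(pos: float, dim_pair: int = 0, d_model: int = 64) -> float:
    # Standard RoPE rotation angle for one (sin, cos) dimension pair.
    freq = 10000 ** (-2 * dim_pair / d_model)
    return pos * freq

def interpolated_angle(pos: int, trained_len: int, target_len: int,
                       dim_pair: int = 0) -> float:
    # Scale the position index down by trained_len / target_len so that
    # target_len positions are squeezed into the trained range.
    scale = trained_len / target_len
    return rope_angle(pos * scale, dim_pair)

# Position 8192 in a 4x-extended window maps to trained position 2048.
a = interpolated_angle(8192, trained_len=4096, target_len=16384)
b = rope_angle(2048)
print(a == b)
```

In practice this rescaling is combined with a small amount of fine-tuning, but the core trick is just the position remapping above.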

Read the techniques to extend the context length of LLMs here →

IN CASE YOU MISSED IT

Building a 100% local multi-agent Internet research assistant with OpenAI Swarm & Llama 3.2

Recently, OpenAI released Swarm.

It’s an open-source framework designed to manage and coordinate multiple AI agents in a highly customizable way.

AI agents are autonomous systems that can reason, plan, identify relevant sources, extract information from them when needed, take actions, and even correct themselves if something goes wrong.

We published a practical and hands-on demo of this in the newsletter. We built an internet research assistant app that:

  • Accepts a user query.
  • Searches the web about it.
  • And turns it into a well-crafted article.
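The three-step flow above can be sketched, hypothetically, as a plain pipeline with the search and writer steps stubbed out. The real app wires these steps up as Swarm agents with a live search tool and a local Llama 3.2 model; all function names here are made up:

```python
# Hypothetical sketch of the research-assistant flow:
# query -> web search -> article.

def search_web(query: str) -> list[dict]:
    # Stub: a real implementation would call a search API here.
    return [
        {"title": "Result A", "snippet": f"Background on {query}."},
        {"title": "Result B", "snippet": f"Recent findings about {query}."},
    ]

def write_article(query: str, results: list[dict]) -> str:
    # Stub "writer agent": a real implementation would prompt an LLM
    # (e.g. a local Llama 3.2) with the gathered snippets.
    body = "\n\n".join(f"## {r['title']}\n{r['snippet']}" for r in results)
    return f"# {query.title()}\n\n{body}"

def research_assistant(query: str) -> str:
    results = search_web(query)           # step 2: search the web
    return write_article(query, results)  # step 3: draft the article

article = research_assistant("vector databases")
print(article)
```

An agent framework like Swarm adds the coordination layer on top of this: handing the task between a search agent and a writer agent, with each able to call tools and retry.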

The demo (a 45-second video) is included in the linked post.

Learn how to build this Agent here →

THAT'S A WRAP

No-Fluff Industry ML Resources to Succeed in DS/ML Roles

At the end of the day, all businesses care about impact: reducing costs, driving revenue, scaling ML models, and predicting trends before they happen.

We have discussed several other topics (with implementations) in the past that build exactly these skills.

Here are some of them:

  • Learn sophisticated graph architectures and how to train them on graph data in this crash course.
  • So many real-world NLP systems rely on pairwise context scoring. Learn scalable approaches here.
  • Run large models on small devices using Quantization techniques.
  • Learn how to generate prediction intervals or sets with strong statistical guarantees for increasing trust using Conformal Predictions.
  • Learn how to identify causal relationships and answer business questions using causal inference in this crash course.
  • Learn how to scale and implement ML model training in this practical guide.
  • Learn 5 techniques with implementation to reliably test ML models in production.
  • Learn how to build and implement privacy-first ML systems using Federated Learning.
  • Learn 6 techniques with implementation to compress ML models.

All these resources will help you cultivate key skills that businesses and companies care about the most.

Our newsletter puts your products and services directly in front of an audience that matters: thousands of leaders, senior data scientists, machine learning engineers, data analysts, etc., around the world.

Get in touch today β†’


Join the Daily Dose of Data Science Today!

A daily column with insights, observations, tutorials, and best practices on data science.

Get Started!