A crash course on RAG systems - Part 2

Beginner-friendly and with implementation.



TODAY’S DAILY DOSE OF DATA SCIENCE

A crash course on RAG systems - Part 2

Last week, we started a crash course on building RAG systems.

Part 2 is now available; it builds on the foundations laid in Part 1.

Read here: A Crash Course on Building RAG Systems – Part 2 (With Implementation).

Why care?

Over the last few weeks, we have spent plenty of time understanding the key components of real-world NLP systems (like the deep dives on bi-encoders and cross-encoders for context pair similarity scoring).

RAG is another important NLP system, one that gained massive attention because it addresses a core challenge of working with LLMs: grounding them in knowledge they were never trained on.

More specifically, if you know how to build a reliable RAG system, you can bypass the challenge and cost of fine-tuning LLMs.

That’s a considerable cost saving for enterprises.

And at the end of the day, all businesses care about impact. That’s it!

  • Can you reduce costs?
  • Drive revenue?
  • Can you scale ML models?
  • Predict trends before they happen?

Thus, the objective of this crash course is to help you implement reliable RAG systems, understand the underlying challenges, and develop expertise in building RAG apps on LLMs, which every industry cares about now.

Of course, if you have never worked with LLMs, that’s okay. We cover everything in a practical and beginner-friendly way.
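To make the core idea concrete, here is a minimal, hypothetical sketch of the retrieve-then-augment loop at the heart of RAG. It substitutes a toy bag-of-words similarity for a real neural retriever, and the documents and stopword list are made up purely for illustration:

```python
import math
import re
from collections import Counter

# A made-up minimal stopword list, just for this toy example.
STOPWORDS = {"what", "is", "the", "a", "an", "our", "on", "of"}

def embed(text: str) -> Counter:
    # Toy "embedding": a bag-of-words count vector.
    # Real RAG systems use a neural bi-encoder here.
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(t for t in tokens if t not in STOPWORDS)

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    # Rank documents by similarity to the query; keep the top-k.
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    # Augment the prompt with retrieved context before calling the LLM.
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "Our refund policy allows returns within 30 days.",
    "The office is closed on public holidays.",
    "Shipping takes 3-5 business days.",
]
print(build_prompt("What is the refund policy?", docs))
```

The key point the sketch shows: instead of fine-tuning the model on your documents, you retrieve the relevant ones at query time and inject them into the prompt.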

IN CASE YOU MISSED IT

Extend the context length of LLMs

  • GPT-3.5-turbo had a context window of 4,096 tokens.
  • Later, GPT-4 took that to 8,192–32,768 tokens.
  • Claude 2 reached 100,000 tokens.
  • Llama 3.1 → 128,000 tokens.
  • Gemini → 1M+ tokens.

We have been making great progress in extending the context window of LLMs.

But how?

We covered techniques that help us unlock larger context windows earlier this week.
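As a taste of how such techniques work, here is a hypothetical sketch of one well-known idea, position interpolation for rotary position embeddings (RoPE) — not necessarily the exact set covered in the post. Positions beyond the trained window are rescaled so the model only ever sees position values in the range it was trained on:

```python
# Sketch of position interpolation for RoPE (an illustrative example,
# not the newsletter's exact implementation).

def rope_angle(pos: float, dim_pair: int = 0, d_model: int = 64) -> float:
    # Standard RoPE rotation angle for one (sin, cos) dimension pair.
    freq = 10000 ** (-2 * dim_pair / d_model)
    return pos * freq

def interpolated_angle(pos: int, trained_len: int, target_len: int,
                       dim_pair: int = 0) -> float:
    # Scale the position index down by trained_len / target_len so that
    # target_len positions are squeezed into the trained range.
    scale = trained_len / target_len
    return rope_angle(pos * scale, dim_pair)

# Position 8192 in a 4x-extended window maps to trained position 2048.
a = interpolated_angle(8192, trained_len=4096, target_len=16384)
b = rope_angle(2048)
print(a == b)
```

In practice this rescaling is combined with a small amount of fine-tuning, but the core trick is just the position remapping above.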

Read the techniques to extend the context length of LLMs here →

IN CASE YOU MISSED IT

Building a 100% local multi-agent Internet research assistant with OpenAI Swarm & Llama 3.2

Recently, OpenAI released Swarm.

It’s an open-source framework designed to manage and coordinate multiple AI agents in a highly customizable way.

AI agents are autonomous systems that can reason, plan, identify relevant sources, extract information from them when needed, take actions, and even correct themselves if something goes wrong.

We published a practical and hands-on demo of this in the newsletter. We built an internet research assistant app that:

  • Accepts a user query.
  • Searches the web about it.
  • And turns it into a well-crafted article.
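The three-step flow above can be sketched, hypothetically, as a plain pipeline with the search and writer steps stubbed out. The real app wires these steps up as Swarm agents with a live search tool and a local Llama 3.2 model; all function names here are made up:

```python
# Hypothetical sketch of the research-assistant flow:
# query -> web search -> article.

def search_web(query: str) -> list[dict]:
    # Stub: a real implementation would call a search API here.
    return [
        {"title": "Result A", "snippet": f"Background on {query}."},
        {"title": "Result B", "snippet": f"Recent findings about {query}."},
    ]

def write_article(query: str, results: list[dict]) -> str:
    # Stub "writer agent": a real implementation would prompt an LLM
    # (e.g. a local Llama 3.2) with the gathered snippets.
    body = "\n\n".join(f"## {r['title']}\n{r['snippet']}" for r in results)
    return f"# {query.title()}\n\n{body}"

def research_assistant(query: str) -> str:
    results = search_web(query)           # step 2: search the web
    return write_article(query, results)  # step 3: draft the article

article = research_assistant("vector databases")
print(article)
```

An agent framework like Swarm adds the coordination layer on top of this: handing the task between a search agent and a writer agent, with each able to call tools and retry.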

The demo (a 45-second video) is included in the linked post.

Learn how to build this Agent here →

THAT'S A WRAP

No-Fluff Industry ML Resources to Succeed in DS/ML Roles

At the end of the day, all businesses care about impact: reducing costs, driving revenue, scaling ML models, and predicting trends before they happen.

We have discussed several other topics (with implementations) in the past that build exactly these skills.

Here are some of them:

  • Learn sophisticated graph architectures and how to train them on graph data in this crash course.
  • So many real-world NLP systems rely on pairwise context scoring. Learn scalable approaches here.
  • Run large models on small devices using Quantization techniques.
  • Learn how to generate prediction intervals or sets with strong statistical guarantees for increasing trust using Conformal Predictions.
  • Learn how to identify causal relationships and answer business questions using causal inference in this crash course.
  • Learn how to scale and implement ML model training in this practical guide.
  • Learn 5 techniques with implementation to reliably test ML models in production.
  • Learn how to build and implement privacy-first ML systems using Federated Learning.
  • Learn 6 techniques with implementation to compress ML models.

All these resources will help you cultivate key skills that businesses and companies care about the most.

Our newsletter puts your products and services directly in front of an audience that matters: thousands of leaders, senior data scientists, machine learning engineers, data analysts, etc., around the world.

Get in touch today β†’


Join the Daily Dose of Data Science Today!

A daily column with insights, observations, tutorials, and best practices on data science.

Get Started!