Contrastive Learning Using Siamese Networks

Building a face unlock system.

Task

As an ML engineer, you are responsible for building a face unlock system.

Today, we shall cover the overall idea and do the implementation tomorrow.

Let’s look through some possible options:

1) How about a simple binary classification model?

Output 1 if the true user is opening the mobile; 0 otherwise.

Initially, you can ask the user to input facial data to train the model.

But that’s where you identify the problem.

All samples will belong to “Class 1.”

Now, you can’t ask the user to find someone to volunteer for “Class 0” samples since it’s too much hassle for them.

Not only that, you also need diverse “Class 0” samples. Samples from just one or two faces might not be sufficient.

The next possible solution you think of is…

Maybe ship some negative samples (Class 0) to the device to train the model.

Might work.

But then you realize another problem:

What if another person wants to use the same device?

Since all new samples will belong to the “new face” during adaptation, what if the model forgets the first face?


2) How about transfer learning?

This is extremely useful when:

  • The task of interest has less data.
  • But a related task has abundant data.

This is how you think it could work in this case:

  • Train a neural network model (base model) on some related task → This will happen before shipping the model to the user’s device.
  • Next, replace the last few layers of the base model with untrained layers and ship it to the device.

It is expected that the first few layers would have learned to identify the key facial features.

From there on, training on the user’s face won’t require much data.
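Concretely, that plan might be sketched like this, using an ImageNet-pretrained ResNet-18 as an illustrative stand-in for the base model (the real base model would be trained on a face-related task):

```python
import torch.nn as nn
from torchvision import models

# A minimal sketch: take a base model pretrained on a related task and
# replace its final layer with a fresh binary head (1 = true user, 0 = not).
base = models.resnet18(weights="IMAGENET1K_V1")  # stand-in for the base model
for p in base.parameters():
    p.requires_grad = False  # freeze the layers that learned general features

# The new, untrained head (its parameters are trainable by default) would be
# fine-tuned on-device with the user's facial data.
base.fc = nn.Linear(base.fc.in_features, 1)
```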

But yet again, you realize that you will run into the same problems you observed with the binary classification model, since the new layers are still designed to predict 1 or 0.


Solution: Contrastive learning using Siamese Networks

At its core, a Siamese network determines whether two inputs are similar.

It does this by learning to map both inputs to a shared embedding space:

  • If the distance between the embeddings is LOW, they are similar.
  • If the distance between the embeddings is HIGH, they are dissimilar.

They are beneficial for tasks where the goal is to compare two data points rather than to classify them into predefined categories/classes.
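To make this concrete, here is a minimal PyTorch sketch of a Siamese setup; the CNN backbone, the input size, and the embedding dimension are illustrative assumptions, not a prescribed architecture:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SiameseEncoder(nn.Module):
    """One shared encoder; both inputs of a pair pass through the SAME weights."""
    def __init__(self, embedding_dim=64):
        super().__init__()
        # Illustrative CNN backbone for small grayscale face crops (1x100x100).
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.fc = nn.LazyLinear(embedding_dim)  # infers flattened size on first call

    def forward(self, x):
        x = self.conv(x)
        x = x.flatten(start_dim=1)
        return self.fc(x)

encoder = SiameseEncoder()
img_a = torch.randn(8, 1, 100, 100)  # a batch of 8 face crops
img_b = torch.randn(8, 1, 100, 100)
emb_a, emb_b = encoder(img_a), encoder(img_b)  # same network, two embeddings
distance = F.pairwise_distance(emb_a, emb_b)   # low => similar, high => dissimilar
```

Note that both inputs go through the exact same `encoder`; that weight sharing is what makes the network "Siamese."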

This is how it will work in our case.

Create a dataset of face pairs (sketched in code after this list):

  • If a pair belongs to the same person, the true label will be 0.
  • If a pair belongs to different people, the true label will be 1.

Then, for every pair:

  • Pass both inputs through the same network to generate two embeddings.
  • If the true label is 0 (same person) → minimize the distance between the two embeddings.
  • If the true label is 1 (different person) → maximize the distance between the two embeddings.
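As a quick sketch, pair creation from a per-person image collection might look like this; the data layout (an `images_by_person` dict) and the one-positive-one-negative sampling per person are assumptions for illustration:

```python
import random

def make_pairs(images_by_person):
    """images_by_person: dict mapping person_id -> list of face images.
    Returns (img1, img2, label) triples: label 0 = same person, 1 = different.
    Assumes at least two people, each with at least two images."""
    pairs = []
    people = list(images_by_person)
    for person in people:
        imgs = images_by_person[person]
        # Positive pair (label 0): two images of the same person.
        a, b = random.sample(imgs, 2)
        pairs.append((a, b, 0))
        # Negative pair (label 1): one image of this person, one of another.
        other = random.choice([p for p in people if p != person])
        pairs.append((random.choice(imgs), random.choice(images_by_person[other]), 1))
    return pairs
```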

After creating this data, define a Siamese network like the one sketched above.

Contrastive loss (defined below) helps us train such a model:

L = (1 − y) · D² + y · max(0, margin − D)²

where:

  • y is the true label.
  • D is the distance between the two embeddings.
  • margin is a hyperparameter, typically greater than 1.
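Translated directly into PyTorch, the loss can be sketched as follows (the class name and the default margin of 2.0 are assumptions):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ContrastiveLoss(nn.Module):
    def __init__(self, margin=2.0):
        super().__init__()
        self.margin = margin

    def forward(self, emb_a, emb_b, y):
        # D: Euclidean distance between the two embeddings.
        D = F.pairwise_distance(emb_a, emb_b)
        # y = 0 (same person): penalize D^2, pulling embeddings together.
        # y = 1 (different):   penalize max(0, margin - D)^2, pushing them
        #                      apart until D reaches the margin.
        loss = (1 - y) * D.pow(2) + y * torch.clamp(self.margin - D, min=0).pow(2)
        return loss.mean()
```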

Here’s how this particular loss function helps.

When y = 0 (same person), the loss reduces to:

L = D²

This value is minimized when D is close to 0, so training pulls the two embeddings closer together.

When y = 1 (different people), the loss reduces to:

L = max(0, margin − D)²

This value is minimized when D is close to the margin (or beyond it), so training pushes the two embeddings further apart.

This way, we can ensure that:

  • when the inputs are similar, they lie closer in the embedding space.
  • when the inputs are dissimilar, they lie far in the embedding space.
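A quick numeric check (assuming a margin of 2.0) makes this push/pull behavior concrete:

```python
def contrastive_loss(y, D, margin=2.0):
    # (1 - y) * D^2 + y * max(0, margin - D)^2
    return (1 - y) * D**2 + y * max(0.0, margin - D) ** 2

# Same person (y=0): loss grows as the embeddings drift apart.
print(contrastive_loss(0, 0.1))  # ~0.01 -> already close, near-zero loss
print(contrastive_loss(0, 1.5))  # 2.25  -> large loss, pull them together

# Different people (y=1): loss vanishes once D reaches the margin.
print(contrastive_loss(1, 0.5))  # 2.25  -> large loss, push them apart
print(contrastive_loss(1, 2.5))  # 0.0   -> beyond the margin, no penalty
```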

Siamese Networks in face unlock

Here’s how it will help in the face unlock application.

First, you will train the model on several image pairs using contrastive loss.

This model (likely after model compression) will be shipped to the user’s device.
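The training phase could be sketched like this, reusing the `SiameseEncoder` and `ContrastiveLoss` sketches from above; `pair_loader` is an assumed DataLoader yielding `(img_a, img_b, label)` batches:

```python
import torch

# Assumes SiameseEncoder and ContrastiveLoss (sketched earlier) are in scope,
# and that pair_loader yields (img_a, img_b, label) batches of face pairs.
encoder = SiameseEncoder()
criterion = ContrastiveLoss(margin=2.0)
optimizer = torch.optim.Adam(encoder.parameters(), lr=1e-3)

for epoch in range(10):
    for img_a, img_b, label in pair_loader:
        emb_a, emb_b = encoder(img_a), encoder(img_b)  # shared weights
        loss = criterion(emb_a, emb_b, label.float())
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```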

During the setup phase, the user will provide facial data, from which a user embedding is created.

This embedding will be stored in the device’s memory.

Next, when the user wants to unlock the mobile, a new embedding can be generated and compared against the available embedding:

  • Action: Unlock the mobile if the distance is small.
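In code, setup and unlock each reduce to a single forward pass. This sketch assumes the trained encoder from above, and the threshold is an illustrative number you would tune on genuine/impostor validation pairs:

```python
import torch
import torch.nn.functional as F

THRESHOLD = 0.8  # assumed value; tune on held-out genuine/impostor pairs

@torch.no_grad()
def enroll(encoder, setup_image):
    """Setup phase: compute and store the user's embedding once.
    setup_image is assumed to be a (C, H, W) tensor."""
    return encoder(setup_image.unsqueeze(0)).squeeze(0)

@torch.no_grad()
def try_unlock(encoder, camera_image, stored_embedding):
    """Unlock attempt: embed the new image and compare distances."""
    new_embedding = encoder(camera_image.unsqueeze(0)).squeeze(0)
    distance = F.pairwise_distance(new_embedding.unsqueeze(0),
                                   stored_embedding.unsqueeze(0)).item()
    return distance < THRESHOLD  # small distance => same person => unlock
```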

Done!

Note that no further training is required here, unlike the earlier case of binary classification.

Also, what if multiple people want to add their face IDs?

No problem.

Create another embedding for the new user.

During unlock, compare the incoming user against all stored embeddings.
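Extending the unlock check to multiple enrolled users is just a minimum over the stored embeddings; here is a sketch using `torch.cdist` to compute all distances at once (the threshold is again an assumed value):

```python
import torch

@torch.no_grad()
def try_unlock_multi(encoder, camera_image, stored_embeddings, threshold=0.8):
    """stored_embeddings: (num_users, dim) tensor, one row per enrolled face."""
    new_emb = encoder(camera_image.unsqueeze(0))         # (1, dim)
    distances = torch.cdist(new_emb, stored_embeddings)  # (1, num_users)
    return distances.min().item() < threshold  # unlock if ANY user matches
```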

That was simple, wasn’t it?

I am ending today’s issue here, but tomorrow, we shall discuss a simple implementation of Siamese Networks using PyTorch.

Until then, here’s some further hands-on reading to learn how to build on-device ML applications:

👉 Over to you: Siamese Networks are not the only way to solve this problem. What other architectures can work?

IN CASE YOU MISSED IT

Build a Multi-agent Research Assistant With SwarmZero

After using OpenAI’s Swarm, we realized several limitations.

One major shortcoming is that it isn’t suited for production use cases since the project is only meant for experimental purposes.

SwarmZero solves this.

We recently shared a practical and hands-on demo of this.

We’ll build a PerplexityAI-like research assistant app that:

  • Accepts a user query.
  • Searches the web about it.
  • And turns it into a well-crafted article, which can be saved as a PDF, a Google Doc, a Confluence page, and more.

Learn how to build multi-agent applications with SwarmZero →

THAT'S A WRAP

No-Fluff Industry ML Resources to Succeed in DS/ML Roles

At the end of the day, all businesses care about impact. That’s it!

  • Can you reduce costs?
  • Drive revenue?
  • Can you scale ML models?
  • Predict trends before they happen?

We have discussed several other topics (with implementations) in the past that align with these goals.

Here are some of them:

  • Learn sophisticated graph architectures and how to train them on graph data in this crash course.
  • So many real-world NLP systems rely on pairwise context scoring. Learn scalable approaches here.
  • Run large models on small devices using Quantization techniques.
  • Learn how to generate prediction intervals or sets with strong statistical guarantees for increasing trust using Conformal Predictions.
  • Learn how to identify causal relationships and answer business questions using causal inference in this crash course.
  • Learn how to scale and implement ML model training in this practical guide.
  • Learn 5 techniques with implementation to reliably test ML models in production.
  • Learn how to build and implement privacy-first ML systems using Federated Learning.
  • Learn 6 techniques with implementation to compress ML models.

All these resources will help you cultivate key skills that businesses and companies care about the most.


