Sep 28, 2024 Machine Learning

Contrastive Learning Using Siamese Networks

Building a face unlock system.

Avi Chawla

TODAY’S DAILY DOSE OF DATA SCIENCE

Task

As an ML engineer, you are responsible for building a face unlock system.

Today, we shall cover the overall idea and do the implementation tomorrow.

Let’s look through some possible options:

1) How about a simple binary classification model?

Output 1 if the true user is opening the mobile; 0 otherwise.

Initially, you can ask the user to input facial data to train the model.

But that’s where you identify the problem.

All samples will belong to “Class 1.”

Now, you can’t ask the user to find someone to volunteer for “Class 0” samples since it’s too much hassle for them.

Not only that, you also need diverse “Class 0” samples. Samples from just one or two faces might not be sufficient.

The next possible solution you think of is…

Maybe ship some negative samples (Class 0) to the device to train the model.

Might work.

But then you realize another problem:

What if another person wants to use the same device?

Since all new samples will belong to the “new face” during adaptation, what if the model forgets the first face?

2) How about transfer learning?

This is extremely useful when:

The task of interest has little data.
But a related task has abundant data.

This is how you think it could work in this case:

Train a neural network model (base model) on some related task → This will happen before shipping the model to the user’s device.
Next, replace the last few layers of the base model with untrained layers and ship it to the device.

It is expected that the first few layers would have learned to identify the key facial features.

From there on, training on the user’s face won’t require much data.

Yet again, you realize that you will run into the same problems you observed with the binary classification model, since the new layers will still be designed around predicting 1 or 0.

Solution: Contrastive learning using Siamese Networks

At its core, a Siamese network determines whether two inputs are similar.

It does this by learning to map both inputs to a shared embedding space (the blue layer above):

If the distance between the embeddings is LOW, they are similar.
If the distance between the embeddings is HIGH, they are dissimilar.
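
To make the LOW/HIGH distinction concrete, here is a minimal sketch that compares embeddings by distance. The article does not say which distance is used, so Euclidean distance is an assumption here, and the three-dimensional vectors are invented purely for illustration:

```python
import math

def euclidean_distance(a, b):
    """Euclidean (L2) distance between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical embeddings, invented for illustration only.
anchor = [0.0, 0.0, 0.0]
same_person = [0.0, 0.3, 0.4]
other_person = [3.0, 4.0, 0.0]

print(euclidean_distance(anchor, same_person))   # small value: treated as similar
print(euclidean_distance(anchor, other_person))  # large value: treated as dissimilar
```

What counts as "low" or "high" depends on the trained network, so in practice the cut-off is tuned on held-out pairs.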

Siamese networks are useful for tasks where the goal is to compare two data points rather than to classify them into predefined categories/classes.

This is how it will work in our case:

Create a dataset of face pairs:

If a pair belongs to the same person, the true label will be 0.
If a pair belongs to different people, the true label will be 1.

After creating this data, define a network like this:

Pass both inputs through the same network to generate two embeddings.
If the true label is 0 (same person) → minimize the distance between the two embeddings.
If the true label is 1 (different person) → maximize the distance between the two embeddings.
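
The pair-labeling convention (0 for the same person, 1 for different people) could be sketched as follows. The function name `make_pairs`, the dictionary layout, and the file names are hypothetical, and exhaustively enumerating every pair is only one possible strategy; real pipelines usually sample pairs instead:

```python
from itertools import combinations

def make_pairs(faces_by_person):
    """Return (image_a, image_b, label) tuples: 0 = same person, 1 = different."""
    pairs = []
    people = sorted(faces_by_person)
    # Positive pairs: any two images of the same person get label 0.
    for person in people:
        for img_a, img_b in combinations(faces_by_person[person], 2):
            pairs.append((img_a, img_b, 0))
    # Negative pairs: an image of one person with an image of another gets label 1.
    for person_a, person_b in combinations(people, 2):
        for img_a in faces_by_person[person_a]:
            for img_b in faces_by_person[person_b]:
                pairs.append((img_a, img_b, 1))
    return pairs

# Hypothetical file names standing in for real face images.
faces = {
    "alice": ["alice_1.jpg", "alice_2.jpg"],
    "bob": ["bob_1.jpg", "bob_2.jpg"],
}
pairs = make_pairs(faces)
for pair in pairs:
    print(pair)
```

Note that negative pairs quickly outnumber positive ones as more people are added, which is one reason to sample pairs rather than enumerate them.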

Contrastive loss (defined below) helps us train such a model:
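
One widely used form of the contrastive loss, consistent with the symbols defined next, is sketched here. The squared terms and the absence of a 1/2 scaling factor are assumptions rather than details confirmed by the text:

```latex
L(y, D) = (1 - y)\, D^{2} + y \,\bigl[\max(0,\ \text{margin} - D)\bigr]^{2}
```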

where:

y is the true label.
D is the distance between two embeddings.
margin is a positive hyperparameter (often set to 1 or more) that defines how far apart dissimilar embeddings must be before they stop being penalized.

Here’s how this particular loss function helps:

When y=0 (same person), the loss grows with D, so it is minimum when D is close to 0, leading to a low distance between the embeddings.

When y=1 (different people), the loss penalizes any D smaller than the margin, so it is minimum once D reaches the margin value, leading to more distance between the embeddings.
This way, we can ensure that:

when the inputs are similar, they lie closer in the embedding space.
when the inputs are dissimilar, they lie far apart in the embedding space.
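
As a worked illustration of this behavior, the sketch below evaluates the loss for a single pair at a few distances. It assumes the common form (1 - y)·D² + y·max(0, margin - D)², and the margin of 2.0 is an illustrative choice, not a value taken from the article:

```python
def contrastive_loss(y, distance, margin):
    """Contrastive loss for one pair; y = 0 (same person) or 1 (different people)."""
    same_term = (1 - y) * distance ** 2                    # active only when y = 0
    different_term = y * max(0.0, margin - distance) ** 2  # active only when y = 1
    return same_term + different_term

margin = 2.0  # illustrative value, an assumption
for d in (0.0, 1.0, 2.0, 3.0):
    print(
        f"D={d}: same-person loss={contrastive_loss(0, d, margin)}, "
        f"different-people loss={contrastive_loss(1, d, margin)}"
    )
```

For same-person pairs the loss rises as D grows, while for different-people pairs it falls to zero once D reaches the margin and stays there, so already well-separated pairs contribute nothing further.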

Siamese Networks in face unlock

Here’s how it will help in the face unlock application.

First, you will train the model on several image pairs using contrastive loss.

This model (likely after model compression) will be shipped to the user’s device.

During the setup phase, the user will provide facial data, which will create a user embedding:

This embedding will be stored in the device’s memory.
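
The article does not say how several setup photos become one stored embedding. One simple possibility, shown here purely as an assumption, is to average the embeddings the trained network produces for each photo. The function name `enroll` and the two-dimensional vectors are hypothetical:

```python
def enroll(sample_embeddings):
    """Average several embeddings of one user into a single stored embedding."""
    if not sample_embeddings:
        raise ValueError("need at least one embedding to enroll a user")
    dim = len(sample_embeddings[0])
    if any(len(e) != dim for e in sample_embeddings):
        raise ValueError("all embeddings must have the same dimension")
    count = len(sample_embeddings)
    # Element-wise mean across all setup samples.
    return [sum(e[i] for e in sample_embeddings) / count for i in range(dim)]

# Hypothetical embeddings the network might emit for three setup photos.
setup_embeddings = [[0.9, 0.1], [1.1, -0.1], [1.0, 0.0]]
user_embedding = enroll(setup_embeddings)
print(user_embedding)
```

Storing every setup embedding separately and comparing against each of them would be an equally valid design; averaging just keeps the stored data small.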

Next, when the user wants to unlock the mobile, a new embedding can be generated and compared against the available embedding:

Action: Unlock the mobile if the distance is small.

Done!

Note that no further training is required on the device here, unlike the earlier binary classification approach.

Also, what if multiple people want to add their face IDs?

No problem.

Create another embedding for the new user.

During unlock, compare the incoming user against all stored embeddings.
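
The unlock step with several enrolled users could look like the sketch below. Euclidean distance, the threshold value, the user names, and the function name `try_unlock` are all assumptions made for illustration:

```python
import math

def distance(a, b):
    """Euclidean distance between two embeddings (an assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def try_unlock(new_embedding, stored_embeddings, threshold):
    """Return the closest enrolled user if within the threshold, else None."""
    best_user, best_distance = None, float("inf")
    for user, embedding in stored_embeddings.items():
        d = distance(new_embedding, embedding)
        if d < best_distance:
            best_user, best_distance = user, d
    return best_user if best_distance <= threshold else None

# Hypothetical stored embeddings and threshold; real values come from the
# trained network and from tuning on validation pairs.
stored = {"owner": [1.0, 0.0], "family_member": [0.0, 1.0]}
THRESHOLD = 0.5

print(try_unlock([0.9, 0.1], stored, THRESHOLD))  # near the "owner" embedding
print(try_unlock([5.0, 5.0], stored, THRESHOLD))  # far from every stored embedding
```

Adding a user only adds an entry to `stored`; the network itself is never retrained.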

That was simple, wasn’t it?

I am ending today’s issue here, but tomorrow, we shall discuss a simple implementation of Siamese Networks using PyTorch.

Until then, here’s some further hands-on reading to learn how to build on-device ML applications:

Learn how to build privacy-first ML systems (with implementations): Federated Learning: A Critical Step Towards Privacy-Preserving Machine Learning.
Learn how to compress ML models and reduce costs: Model Compression: A Critical Step Towards Efficient Machine Learning.

👉 Over to you: Siamese Networks are not the only way to solve this problem. What other architectures can work?

IN CASE YOU MISSED IT

Build a Multi-agent Research Assistant With SwarmZero

After using OpenAI’s Swarm, we realized several limitations.

One major shortcoming is that it isn’t suited for production use cases since the project is only meant for experimental purposes.

SwarmZero solves this.

We recently shared a practical and hands-on demo of this.

We’ll build a PerplexityAI-like research assistant app that:

Accepts a user query.
Searches the web about it.
And turns it into a well-crafted article, which can be saved as a PDF, in a Google Doc, on a Confluence page, and more.

Learn how to build multi-agent applications with SwarmZero →
