Introduction
A typical blueprint of any real-world machine learning (ML) project looks like the following:
1. Formulate the problem statement.
2. Get the management's approval.
3. Gather the required datasets.
4. Explore the gathered data.
5. Start building a model.
6. Make improvements.
7. Validate the model.
8. Test the model.
9. Improve it.

Once you are satisfied:

10. Productionise the ideal model.
11. Proceed with deployment.
12. Set up logging methods.
13. Hand over the model.
14. Go back to step 1.
Of course, the above process can be more comprehensive, but the overall blueprint, from idea inception to handing the solution over to the team it was built for, is almost the same across projects.
Also, in the above process:
- Steps 1-9 mainly highlight development in the local environment.
- Steps 10-13 are inclined towards the production environment.

To elaborate further, during the local development phase (steps 1-9), the model goes through rigorous engineering and testing to ensure its accuracy, robustness, and generalizability. We do this all the time.
Testing using validation/test sets is critical in this phase, as it helps identify and rectify issues before the model is sent for productionisation (which demands considerable engineering effort).
What is productionisation?
For more context, productionisation is the phase where an ideal model is prepared for deployment in a production setting.
For instance, if we developed the model in Python, but the server we intend to deploy it on runs a different language, such as C++ or Java, then making the model compatible with that environment is what productionisation involves.

We discussed this in the following article (you can read it after this one):

This article discussed TorchScript, which we can use to convert Python-developed PyTorch models into a format compatible with other languages.
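To make this concrete, here is a minimal sketch of converting a PyTorch model with TorchScript. The toy model and file name are placeholders, not the linked article's actual example:

```python
import torch
import torch.nn as nn

# A toy model standing in for whatever we actually trained (hypothetical)
class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 1)

    def forward(self, x):
        return torch.sigmoid(self.fc(x))

model = TinyNet().eval()

# Compile the model to TorchScript, a serialized, Python-independent format
scripted = torch.jit.script(model)

# The saved artifact can be loaded from C++ (LibTorch) without Python
scripted.save("tiny_net.pt")
```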
Moreover, it's possible that we leveraged some classical ML models from sklearn. But these models are not production-friendly because sklearn is built on top of NumPy, which can only run on a single core of a CPU.

As a result, they provide sub-optimal performance.
The techniques we discussed in the following article help us make these models more production-friendly (you can read it after this article):

This article discussed techniques to convert sklearn models to tensor computations, which can be parallelized and also loaded onto a GPU (if needed).
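One open-source library that performs exactly this kind of conversion is Hummingbird; whether it is the tool the linked article used is my assumption, so treat this as a minimal sketch:

```python
from hummingbird.ml import convert
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Train a classical sklearn model
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# Convert the fitted model into tensor computations (PyTorch backend)
hb_model = convert(clf, "pytorch")

# Predictions now run as parallelizable tensor ops;
# uncomment to move the computation to a GPU if one is available
# hb_model.to("cuda")
preds = hb_model.predict(X)
```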
One more example: if the model is to be deployed on edge devices, we may want to reduce its size (discussed below).
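As a quick preview of one such technique, dynamic quantization in PyTorch stores selected layer weights in int8, shrinking the model roughly 4x for those layers. The model below is a placeholder:

```python
import torch
import torch.nn as nn

# Hypothetical model we want to shrink for an edge device
model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 2))

# Dynamically quantize the Linear layers' weights to int8
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)
```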

In a nutshell, the two primary objectives of the productionisation phase are to optimize the model for deployment and ensure its robustness and reliability.
This involves testing the model against various edge cases and scenarios to ensure that it can handle unexpected inputs and situations gracefully.
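For instance, a simple guard-and-test pattern for unexpected inputs might look like this; the stub model and wrapper are hypothetical illustrations, not a prescribed design:

```python
import math

class StubModel:
    # Stand-in for any trained model (hypothetical)
    def predict(self, rows):
        return [sum(row) for row in rows]

def safe_predict(model, features):
    # Reject malformed inputs instead of letting the model crash
    if features is None or any(f is None or math.isnan(f) for f in features):
        return {"prediction": None, "error": "invalid input"}
    return {"prediction": model.predict([features])[0], "error": None}

# Edge-case checks: unexpected inputs are handled gracefully
assert safe_predict(StubModel(), [1.0, float("nan")])["error"] == "invalid input"
assert safe_predict(StubModel(), None)["error"] == "invalid input"
assert safe_predict(StubModel(), [1.0, 2.0])["prediction"] == 3.0
```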
Once the model is fully productionised, it is deployed to the production environment, where it begins to serve predictions to end-users or other systems.
Project over?
Not yet!
Ideal deployment strategy
If we already have a model running in production, it could be a terrible idea to instantly replace the previous model with the updated model.

Instead, a more conservative and reliable strategy is to test the model in production (yes, on real-world incoming data) before completely substituting/discarding the previous version of the model.
Testing a model in production might appear risky, but ML teams do it all the time, and it isn't that complicated.
In the upcoming section, we shall discuss five commonly used techniques to test ML models in production.
We shall also implement these strategies, and in order to do that, we shall be using Modelbit, which we discussed in the following article:

It’s okay if you haven’t read it yet. We will do a quick overview of the model deployment steps in Modelbit.
Let’s begin!
Modelbit deployment demo
The core objective behind model deployment is to obtain an API endpoint for our deployed model, which can later be used for inference purposes:

Modelbit lets us seamlessly deploy ML models directly from our Python notebooks (or Git, as we will see later in this article) and obtain a REST API.
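Once a model is deployed, calling that REST API is a plain HTTP POST. The workspace and deployment names below are hypothetical, and the exact request schema depends on your inference function, so treat this as a sketch:

```python
import requests

# Hypothetical endpoint; Modelbit shows the real URL after deployment
url = "https://<your-workspace>.app.modelbit.com/v1/my_deployment/latest"

# Batch requests wrap inputs under a "data" key, one row per input,
# with each row led by an identifier
response = requests.post(url, json={"data": [[1, 5.1, 3.5, 1.4, 0.2]]})
print(response.json())
```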

Process workflow
Since Modelbit is a relatively new service, let’s understand the general workflow to generate an API endpoint when deploying a model with Modelbit.
The image below depicts the steps involved in deploying models with Modelbit:

- Step 1) We connect the Jupyter kernel to Modelbit.
- Step 2) Next, we train the ML model.
- Step 3) We define the inference function. Simply put, this function contains the code that will be executed at inference. Thus, it will be responsible for returning the prediction.
- Step 4) [OPTIONAL] Here, we specify the version of Python and other open-source libraries we used while training the model.
- Step 5) Lastly, we send it for deployment.
Once done, Modelbit returns the API endpoint, which we can integrate into any application to serve end-users.
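Condensed into code, the five steps look roughly like this. This is a hedged sketch based on Modelbit's public API; the model, function name, and package versions are placeholders:

```python
import modelbit
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# Step 1) Connect the Jupyter kernel to Modelbit
mb = modelbit.login()

# Step 2) Train the ML model
X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000).fit(X, y)

# Step 3) Define the inference function executed on every request
def predict_species(sl: float, sw: float, pl: float, pw: float) -> int:
    return int(model.predict([[sl, sw, pl, pw]])[0])

# Steps 4-5) Optionally pin the environment, then deploy
mb.deploy(
    predict_species,
    python_version="3.10",
    python_packages=["scikit-learn==1.3.2"],
)
```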
Let’s implement this!
Prerequisites
First, we must install the Modelbit package. We can use pip to do this:
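```bash
pip install modelbit
```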
Done!
Also, to deploy and view our deployed models in the Modelbit dashboard, we must create a Modelbit account here: https://app.modelbit.com/signup.
Now, we can implement the steps depicted in the workflow above.
Step 1) Connect Jupyter kernel to Modelbit
First, we connect our Jupyter kernel to Modelbit. This is done as follows:
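Assuming the standard Modelbit login flow, the connection is typically established like this:

```python
import modelbit

# Prints an authentication link; opening it connects this kernel to Modelbit
mb = modelbit.login()
```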