5 Must-Know Ways to Test ML Models in Production (Implementation Included)

A beginner-friendly guide to model testing.

5 Must-Know Ways to Test ML Models in Production (Implementation Included)


A typical blueprint of any real-world machine learning (ML) looks like the following:

  1. Formulate the problem statement
  2. Get the management’s approval
  3. Gather the required datasets
  4. Explore the gathered data
  5. Start building a model
  6. Make improvements
  7. Validate the model
  8. Test the model
  9. Improve it

Once you are satisfied:

  1. Productionize the ideal model
  2. Proceed with deployment
  3. Set up logging methods
  4. Handover the model
  5. Go back to step 1

Of course, the above process can be a bit more comprehensive, but the overall blueprint from idea inception to handover to the team you built that solution for is almost the same across projects.

Also, in the above process:

  • Steps 1-9 mainly highlight development in the local environment.
  • Steps 10-13 are inclined towards the production environment.

To elaborate further, during the local development phase (steps 1-9), the model goes through rigorous engineering and testing to ensure its accuracy, robustness, and generalizability. We do this all the time.

Testing using validation/test sets is critical in this phase as it helps identify and rectify any issues before the model is sent for productionisation (which demands considerable engineering efforts).

What is productionisation?

For more context, productionisation is the phase where an ideal model is prepared for deployment in a production setting.

For instance, if we developed the model in Python, but the server we intend to deploy our model on runs any other language except Python, like C++ or Java, then making it compatible with such environment configuration is what productionisation involves.

We discussed this here in the following article (you can read it after this article):

PyTorch Models Are Not Deployment-Friendly! Supercharge Them With TorchScript.
Eliminating the dependence of PyTorch models on Python.

This article discussed the use of TorchScript, which we can use to convert Python-developed PyTorch models into formats that are compatible with other languages.

Moreover, it's possible that we leveraged some classical ML models from sklearn. But these models are not production-friendly because sklearn is built on top of NumPy, which can only run on a single core of a CPU.

As a result, it provides sub-optimal performance.

The techniques we discussed in the following article help us make these models more production-friendly (you can read it after this article):

Sklearn Models are Not Deployment Friendly! Supercharge Them With Tensor Computations.
Speed up sklearn model inference up to 50x with GPU support.

This article discussed techniques to convert sklearn models to tensor computations, which can be parallelized and can also be loaded on a GPU (if needed)

One more example could be that if the model is to be deployed on edge devices, we may want to reduce its size (discussed below).

Model Compression: A Critical Step Towards Efficient Machine Learning
Four critical ways to reduce model footprint and inference time.

In a nutshell, the two primary objectives of the productionisation phase are to optimize the model for deployment and ensure its robustness and reliability.

This involves testing the model against various edge cases and scenarios to ensure that it can handle unexpected inputs and situations gracefully.

Once the model is fully productionised, it is deployed to the production environment, where it begins to serve predictions to end-users or other systems.

Project over?

Not yet!

Ideal deployment strategy

If we already have a model running in production, it could be a terrible idea to instantly replace the previous model with the updated model.

Instead, a more conservative and reliable strategy is to test the model in production (yes, on real-world incoming data) before completely substituting/discarding the previous version of the model.

Testing a model in production might appear risky, but ML teams do it all the time, and it isn't that complicated.

In the upcoming section, we shall discuss five commonly used techniques to test ML models in production.

We shall also implement these strategies, and in order to do that, we shall be using Modelbit, which we discussed in the following article:

Deploy, Version Control, and Manage ML Models Right From Your Jupyter Notebook with Modelbit
Deployment has possibly never been so simple.

It’s okay if you haven’t read it yet. We will do a quick overview of the model deployment steps in Modelbit.

Note: Modelbit is a commercial product. However, this article is not influenced by any external motivation (or sponsorships) to use their service. I have personally found Modelbit’s deployment procedures to be immensely simple. Everything demonstrated in this article can be done with a free Modelbit account.

Let’s begin!

Modelbit deployment demo

Feel free to skip this section if you already know how Modelbit works.

The core objective behind model deployment is to obtain an API endpoint for our deployed model, which can be later used for inference purposes:

Modelbit lets us seamlessly deploy ML models directly from our Python notebooks (or Git, as we would see ahead in this article) and obtain a REST API.

Process workflow

Since Modelbit is a relatively new service, let’s understand the general workflow to generate an API endpoint when deploying a model with Modelbit.

The image below depicts the steps involved in deploying models with Modelbit:

  • Step 1) We connect the Jupyter kernel to Modelbit.
  • Step 2) Next, we train the ML model.
  • Step 3) We define the inference function. Simply put, this function contains the code that will be executed at inference. Thus, it will be responsible for returning the prediction.
  • Step 4) [OPTIONAL] Here, we specify the version of Python and other open-source libraries we used while training the model.
  • Step 5) Lastly, we send it for deployment.

Once done, Modelbit returns the API endpoint, which we can integrate into any of the applications and serve end-users with.

Let’s implement this!


First, we must install the Modelbit package first.

We can use pip to install the Modelbit package:


Also, to deploy and view our deployed models in the Modelbit dashboard, we must create a Modelbit account as well here: https://app.modelbit.com/signup.

Now, we can implement the steps depicted in the earlier animation.

Step 1) Connect Jupyter kernel to Modelbit

First, we connect our Jupyter kernel to Modelbit. This is done as follows:

Join the Daily Dose of Data Science Today!

A daily column with insights, observations, tutorials, and best practices on data science.

Get Started!
Join the Daily Dose of Data Science Today!

Great! You’ve successfully signed up. Please check your email.

Welcome back! You've successfully signed in.

You've successfully subscribed to Daily Dose of Data Science.

Success! Check your email for magic link to sign-in.

Success! Your billing info has been updated.

Your billing was not updated.