A Crash Course on Causality – Part 1

A guide to building robust decision-making systems in businesses with causal inference.

“Because” is possibly one of the most powerful words in business decision-making.

  • “Our customer satisfaction improved because we introduced personalized recommendations.”
  • “The energy consumption dropped because of the new efficiency standards implemented.”

Backing an observation or insight with causality gives you the confidence to use the word “because” in business and everyday discussions.

Identifying these causal relationships is vital, and doing so requires an inspection that goes beyond typical correlation analysis, which almost anyone can do these days.

Thus, in this two-part article series, we shall dive into the details of causality and understand some of the most widely used techniques at the forefront of business decision-making, so that you can add value to your job and projects with your diversified skill set.

If you aspire to make valuable contributions to your data science job, this series will be super helpful.

A motivating example

It is well known that while correlation can show a relationship between two variables, it doesn’t imply that one causes the other.

For instance:

  • Studying correlation over the whole year is likely to suggest that ice cream sales and AC sales are correlated. But this doesn’t mean eating ice cream causes people to buy air conditioners, or vice versa. Both are influenced by a third factor: temperature.

I’m not suggesting that companies should act on this idea, but consider it for a moment. We know that temperature is a causal factor for AC sales. Thus, it would be tempting for AC companies to launch campaigns that increase temperatures: this would boost their sales, and it would be a pretty effective strategy. That is the power of causal analysis.

On the flip side, if AC companies relied only on correlation analysis for decision-making, they might notice a correlation with ice cream sales. But would they really succeed if they tried to boost ice cream sales to increase their own? No, right? This example highlights why understanding causality is so important in decision-making.
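A minimal simulation (with made-up numbers) illustrates how a confounder like temperature can make two causally unrelated quantities look strongly correlated, and how the association vanishes once we control for the confounder:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical daily data: temperature drives BOTH sales series,
# but neither series influences the other.
temperature = rng.normal(25, 8, size=365)
ice_cream_sales = 50 + 3 * temperature + rng.normal(0, 5, size=365)
ac_sales = 20 + 2 * temperature + rng.normal(0, 5, size=365)

# Strong correlation despite no causal link between the two series.
corr = np.corrcoef(ice_cream_sales, ac_sales)[0, 1]
print(f"Raw correlation: {corr:.2f}")

# Control for the confounder: regress each series on temperature
# and correlate the residuals. The spurious association disappears.
res_ice = ice_cream_sales - np.poly1d(np.polyfit(temperature, ice_cream_sales, 1))(temperature)
res_ac = ac_sales - np.poly1d(np.polyfit(temperature, ac_sales, 1))(temperature)
partial_corr = np.corrcoef(res_ice, res_ac)[0, 1]
print(f"Correlation after controlling for temperature: {partial_corr:.2f}")
```

The raw correlation comes out high, while the residual (partial) correlation sits near zero, which is exactly the pattern a confounder produces.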

If we could accurately establish causality, a company could substantially optimize its operations.

In this series, we shall cover four statistical tools that provide a scientific basis for using the word “because.”

Only by rigorously establishing causality can you justifiably use the word “because.”

My experience with Causality

In 2021, I was a data scientist at Mastercard and mentored an intern. Causal inference was not something my team and I had deep experience in, so we were still exploring.

Causal inference is a field of study focused on understanding the cause-and-effect relationships between variables. Unlike correlation, which only indicates that two variables are correlated, causal inference seeks to determine whether changes in one variable directly cause changes in another.

Here's how we decided to approach it, but first, let me give you some context.

Mastercard handles millions of transactions per day. Every transaction goes through a fraud-detection model. After the authentication phase, whether the transaction is approved depends on the binary output of this model:

  • Fraud $\rightarrow$ reject the transaction.
  • Non-fraud $\rightarrow$ approve the transaction.

Now, once a label has been assigned to a transaction, it becomes a fact in this universe. Of course, the prediction may be wrong, but it has become a fact, and we cannot change it.

In the case of transactions classified as non-fraud, we (Mastercard) ascertained whether the model made the correct prediction based on whether the cardholder reached out to their bank.

So, let's say a transaction happened through your card right now, but you did not do it. However, the model classified it as non-fraud.

You would contact your bank, claiming you did not make that transaction. Of course, the bank will block your card immediately, but they may not fully trust your claim, because you might be committing friendly fraud.

Friendly fraud, also known as chargeback fraud, occurs when a customer makes a legitimate purchase but later disputes the charge, claiming it was not executed by them. In this scenario, the bank needs to differentiate between genuine fraud and potential friendly fraud. I have had experience working on a friendly fraud use case too. I can share what I learned in another article if you wish to learn more. Let me know.

Assuming it is not a case of friendly fraud, Mastercard waits for about 30-45 days to know (receive the label) whether a transaction classified as non-fraud was actually a fraud or not. Banks usually take this long to get back to Mastercard with a true label.

In other words, the feedback exists far into the future.

As we would see ahead, a big part of causal inference also revolves around counterfactual learning. As the name suggests:

  • Counter $\rightarrow$ Refers to something that is opposite or different.
  • Factual $\rightarrow$ Relates to actual events or facts that have occurred.
  • Learning $\rightarrow$ Involves acquiring knowledge or understanding through study and experience.

Counterfactual learning involves analyzing what would have happened under different circumstances. It helps us understand the impact of actions had we taken a different decision in the past.

For example, in the context of fraud detection:

  • Counterfactual $\rightarrow$ What if the transaction had been classified as fraud instead of non-fraud?
  • Learning $\rightarrow$ Gaining insights from these counterfactual scenarios to improve future decision-making and model accuracy.

We wanted to study this:

If a transaction was originally classified as fraud, what would have happened if it was classified as non-fraud instead? Would the cardholder have contacted the bank, claiming it was a fraud?

As you may have already understood, the trickiest thing here is that we never get to observe the alternative reality. Once a transaction has been classified as fraud, that classification becomes a fact; we cannot go back in time, reverse the decision, and observe the universe again.
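A toy potential-outcomes table (all values hypothetical) makes this concrete: each transaction has an outcome under both possible decisions, but we only ever observe the one under the decision actually taken, so the counterfactual column stays empty:

```python
# Toy potential-outcomes table (all values hypothetical).
# Each row conceptually has an outcome under BOTH decisions, but in
# reality only the outcome under the decision taken is observed;
# the other potential outcome is fundamentally missing (None).
transactions = [
    {"id": 1, "decision": "reject",  "outcome_reject": "blocked", "outcome_approve": None},
    {"id": 2, "decision": "approve", "outcome_reject": None,      "outcome_approve": "no complaint"},
    {"id": 3, "decision": "approve", "outcome_reject": None,      "outcome_approve": "chargeback"},
]

observed_outcomes = []
for t in transactions:
    # We only ever see the factual outcome; the counterfactual stays None.
    observed = t["outcome_reject"] if t["decision"] == "reject" else t["outcome_approve"]
    observed_outcomes.append(observed)
    print(f"txn {t['id']}: observed outcome = {observed!r}")
```

This “missing half” of the table is precisely what counterfactual learning and the statistical tools ahead try to reason about.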

This is why addressing questions of causality necessitates rigorous statistical tools, which we will explore in the article ahead.

Potential Outcome Model

Historically, causality estimation found its first major use in the medical industry, where it was primarily applied to treatment evaluation.

More specifically, it was used to evaluate whether a specific treatment would cause sick patients' health to improve.

  • Expose some patients to a treatment.
  • Keep the other patients unexposed.
  • Measure the difference in outcome between the two groups while ensuring the overall conditions stayed the same.
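The three steps above can be sketched as a difference-in-means estimate on simulated data (all numbers hypothetical). Under random assignment, the difference in average outcomes between the exposed and unexposed groups estimates the treatment effect:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 1000

# Step 1 & 2: randomly assign patients to treatment (1) or control (0).
treated = rng.integers(0, 2, size=n)

# Simulate a health score: baseline noise plus a true effect of +2.0
# for treated patients (the effect we hope to recover).
true_effect = 2.0
outcome = rng.normal(50, 5, size=n) + true_effect * treated

# Step 3: measure the difference in mean outcome between the groups.
ate_estimate = outcome[treated == 1].mean() - outcome[treated == 0].mean()
print(f"Estimated treatment effect: {ate_estimate:.2f}")
```

Because assignment is random, the two groups are comparable on average, so the estimate lands close to the true effect of 2.0.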

So, before jumping to the core causality techniques, we need to understand a commonly used framework to analyze causality.

It's called the Potential Outcome Model.

Let’s define some notations before we proceed ahead:
