Why Do We Use log-loss To Train Logistic Regression?

👉

Do you remember the time you first learned about logistic regression?

In most cases, its loss function — log-loss, is introduced out of nowhere, without proper intuition, understanding, and most importantly, without answering the question “Why log-loss”?

$$ \text{log-loss} = - \sum_{i=1}^{N} y_{i} \cdot log(\hat y_{i}) + (1-y_{i}) \cdot log(1 - \hat y_{i}) $$

In fact, while drafting this blog, I did a quick Google search, and disappointingly, NONE of the top-ranked results discussed this.

The question is: Why do we specifically minimize the log-loss to train logistic regression? What is so special about it? Where does it come from?

$$ \text{log-loss} = - \sum_{i=1}^{N} y_{i} \cdot log(\hat y_{i}) + (1-y_{i}) \cdot log(1 - \hat y_{i}) $$

Let’s dive in!

Background

Before understanding the origin and utility of this loss function in logistic regression, it is immensely crucial to know how we model data while using logistic regression.

In other words, let’s understand how we frame its modeling mathematically.

Read the full article

Already have an account? Sign in

Why Do We Use log-loss To Train Logistic Regression?

Background

Read the full article

Read next

Conformal Predictions: Build Confidence in Your ML Model's Predictions

Quantization: Optimize ML Models to Run Them on Tiny Hardware

HDBSCAN: The Supercharged Version of DBSCAN — An Algorithmic Deep Dive

Join the Daily Dose of Data Science Today!