Study Card: Log Loss

Direct Answer

Key Terms

Example

Imagine building a model to predict whether a customer will click on an online advertisement. The model outputs the probability of a click (e.g., 0.8 means an 80% chance of clicking). If the actual outcome is a click (1) and the model predicts 0.1, the log loss is high, because the model assigned only a 10% probability to what actually happened. If the true label is "no click" (0) and the model predicts 0.2, the model was mostly right, so the log loss is much lower.

Numeric Example (log is the natural logarithm): For Actual label = 1 and Prediction = 0.9: Log loss = -log(0.9) ≈ 0.105. For Actual label = 0 and Prediction = 0.1: Log loss = -log(1 - 0.1) = -log(0.9) ≈ 0.105. For Actual label = 1 and Prediction = 0.1: Log loss = -log(0.1) ≈ 2.303.
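
The numbers above can be checked with a short sketch. The helper name `sample_log_loss` is hypothetical and not part of any library; it assumes a binary label (0 or 1), a predicted probability strictly between 0 and 1, and the natural logarithm.

```python
import math


def sample_log_loss(y_true, p_pred):
    """Log loss for one binary example (hypothetical helper).

    Assumes y_true is 0 or 1 and 0 < p_pred < 1.
    """
    # Only one of the two terms is non-zero, depending on the label.
    return -(y_true * math.log(p_pred) + (1 - y_true) * math.log(1 - p_pred))


# The three cases from the numeric example
for y, p in [(1, 0.9), (0, 0.1), (1, 0.1)]:
    print(f"label={y}, prediction={p}: log loss = {sample_log_loss(y, p):.3f}")
```

The third case shows the asymmetry: a confident wrong prediction costs roughly twenty times more than a confident correct one.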

Code Implementation

import numpy as np
from sklearn.metrics import log_loss
import matplotlib.pyplot as plt

# Example predictions (probabilities)
y_pred = np.array([0.1, 0.3, 0.7, 0.9, 0.2, 0.8])

# Example true labels (0 or 1)
y_true = np.array([0, 0, 1, 1, 0, 1])

# Calculate log loss
loss = log_loss(y_true, y_pred)
print(f"Log Loss: {loss}")

# Plotting to illustrate the concept
y_prob = np.linspace(0.01, 0.99, 100)

plt.figure(figsize=(8,5))
plt.plot(y_prob, -np.log(y_prob), label='Actual class is 1, prediction is less than 1', color='blue')
plt.plot(y_prob, -np.log(1-y_prob), label='Actual class is 0, prediction is greater than 0', color = 'red')

plt.title('Log loss vs predicted probability x')
plt.xlabel('Prediction(x)')
plt.ylabel('Log Loss')
plt.legend()
plt.grid(True)

# Zoomed plots to show what happens close to x=0 and x=1
# (each zoom gets its own figure, so the full-range plot above keeps its x-axis)
plt.figure(figsize=(8,5))
plt.plot(y_prob, -np.log(1-y_prob), label='Actual class is 0, prediction is greater than 0', color='red')
plt.title('-log(1-x) for small x values')
plt.xlim([0, 0.2])
plt.xlabel('Prediction(x)')
plt.ylabel('Log Loss')
plt.grid(True)

plt.figure(figsize=(8,5))
plt.plot(y_prob, -np.log(y_prob), label='Actual class is 1, prediction is less than 1', color='blue')
plt.title('-log(x) for x close to 1')
plt.xlim([0.8, 1])
plt.xlabel('Prediction(x)')
plt.ylabel('Log Loss')
plt.grid(True)
plt.show()
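
As a cross-check on the `log_loss` call above, the following is a minimal sketch of the averaged binary log loss using only the standard library. The helper `mean_log_loss` and its `eps` default are assumptions made for illustration, not the library's implementation. Clipping keeps predictions of exactly 0 or 1 from producing log(0), which is the same reason the plots sample x from 0.01 to 0.99.

```python
import math


def mean_log_loss(y_true, y_pred, eps=1e-15):
    """Average binary log loss (hypothetical helper; eps is an assumed value)."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        # Clip so that log(0) never occurs for predictions of exactly 0 or 1.
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


# Same labels and predictions as in the example above
y_true = [0, 0, 1, 1, 0, 1]
y_pred = [0.1, 0.3, 0.7, 0.9, 0.2, 0.8]
print(f"Mean log loss: {mean_log_loss(y_true, y_pred):.4f}")  # about 0.2284
```

For these inputs no clipping is triggered, so the result should agree with the value printed by `log_loss(y_true, y_pred)`.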

Related Concepts