Not every business question involves predicting a continuous number like revenue or cost. Many of the most important decisions revolve around binary outcomes: Will a customer churn or stay? Will a loan default or not? Will a prospect convert or bounce? Linear regression is ill-suited for these problems because it can produce predicted probabilities outside the 0-to-1 range, and its assumptions about normally distributed errors do not hold for binary data.
Logistic regression solves this by modeling the probability that the outcome equals 1 (the event of interest) using a mathematical function that naturally stays between 0 and 1. Instead of fitting a straight line through the data, logistic regression fits an S-shaped curve called the logistic function (or sigmoid).
The logistic function is P(y = 1) = e^(b0 + b1*x) / (1 + e^(b0 + b1*x)), entered in Excel as `=EXP(b0+b1*x)/(1+EXP(b0+b1*x))`, where P(y = 1) is the probability of the event occurring, b0 is the intercept, and b1 is the coefficient for predictor x.
The logit function, logit(p) = ln(p / (1 − p)), converts a probability p into log odds; in Excel it is `=LN(p/(1-p))`. This transformation is the link function that makes logistic regression a linear model on the log-odds scale.
The logit function is the inverse of the logistic function. While the logistic function maps any real number to a probability between 0 and 1, the logit maps a probability back to the real line. This duality is what makes logistic regression work: the model is linear in the log-odds, but non-linear in the probability space.
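The round trip is easy to check numerically. A minimal sketch, using an arbitrary starting probability of 0.8:

```
A1: 0.8                        starting probability p
A2: =LN(A1/(1-A1))             logit: LN(0.8/0.2) ≈ 1.386 (log odds)
A3: =EXP(A2)/(1+EXP(A2))       logistic: maps 1.386 back to the original 0.8
```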
In Excel, logistic regression coefficients are estimated with the Solver add-in, which searches for the intercept and slope values that maximize the log-likelihood of the observed 0/1 outcomes. (The Analysis ToolPak's regression tool fits only linear models, so it cannot estimate a logistic model directly.)
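A minimal sketch of the Solver setup, assuming a single predictor in A2:A101, the 0/1 outcome in B2:B101, and coefficient cells E1 (b0) and E2 (b1); all cell locations are illustrative:

```
E1: 0                                           initial guess for b0
E2: 0                                           initial guess for b1
C2: =EXP($E$1+$E$2*A2)/(1+EXP($E$1+$E$2*A2))    predicted P(y=1) for row 2
D2: =B2*LN(C2)+(1-B2)*LN(1-C2)                  row 2's log-likelihood contribution
F1: =SUM(D2:D101)                               total log-likelihood
```

Fill C2 and D2 down through row 101, then run Solver with objective F1 set to Max, changing cells E1:E2, using the GRG Nonlinear solving method.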
NorthStar Enterprises wants to predict customer churn — whether a customer will leave within the next quarter. The outcome is binary: churned (1) or retained (0). Using logistic regression, the team models churn probability based on purchase frequency, months since last order, customer satisfaction score, and contract type.
A linear regression would predict values like -0.15 or 1.30 for some customers — impossible probabilities. Logistic regression constrains every prediction to a valid probability between 0 and 1.
Interpreting logistic regression coefficients requires thinking on the odds scale rather than the probability scale. Each coefficient represents the change in the log odds of the outcome for a one-unit increase in the predictor. To make this more intuitive, we convert to the odds ratio by exponentiating the coefficient.
The odds ratio is OR = e^b, computed in Excel as `=EXP(b)`. OR > 1 means the odds increase with the predictor; OR < 1 means the odds decrease.
A useful shortcut: to express the percentage change in odds, compute (OR − 1) × 100. An OR of 1.35 means a 35% increase in odds; an OR of 0.71 means a 29% decrease in odds.
In NorthStar's churn model, the coefficient for purchase frequency (monthly purchases) is b = −0.34. The odds ratio is:
OR = e^(−0.34) ≈ 0.71
This means each additional purchase per month reduces the odds of churn by 29%. Customers who buy more frequently are substantially less likely to leave. This finding supports NorthStar's loyalty program, which incentivizes repeat purchases.
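To verify in Excel, assuming the coefficient −0.34 sits in cell B2 (an illustrative location):

```
C2: =EXP(B2)             odds ratio: EXP(-0.34) ≈ 0.712
D2: =(EXP(B2)-1)*100     percentage change in odds: about -28.8%, i.e. a 29% decrease
```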
Odds ratios are the natural currency of logistic regression interpretation. An OR of 2.5 does not mean the probability is 2.5 times higher — it means the odds are 2.5 times higher. The distinction between odds and probability matters: odds can exceed 1, but probability cannot.
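To translate between the two scales: odds o correspond to probability p = o / (1 + o), and probability p corresponds to odds p / (1 − p). For example, assuming odds of 2.5 in cell A1:

```
A1: 2.5                  odds of 2.5 (i.e., 2.5-to-1)
A2: =A1/(1+A1)           probability: 2.5/3.5 ≈ 0.714
A3: =A2/(1-A2)           back to odds: 0.714/0.286 = 2.5
```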
Unlike linear regression, where R-squared measures model fit, logistic regression uses classification-based metrics. After the model produces predicted probabilities, we apply a threshold (typically 0.5) to convert probabilities into predicted classes: above the threshold is predicted as 1, below as 0. Comparing these predictions to actual outcomes produces the confusion matrix.
| | Predicted: Stay | Predicted: Churn |
|---|---|---|
| Actual: Stay | TN = 430 | FP = 70 |
| Actual: Churn | FN = 63 | TP = 237 |
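One way to produce these counts in Excel, assuming actual outcomes (0/1) in B2:B801, predicted probabilities in C2:C801, and the threshold in F1; all cell references are hypothetical:

```
F1: 0.5                                        classification threshold
D2: =IF(C2>=$F$1,1,0)                          predicted class (fill down to row 801)
G2: =COUNTIFS($B$2:$B$801,0,$D$2:$D$801,0)     TN: actual stay, predicted stay
H2: =COUNTIFS($B$2:$B$801,0,$D$2:$D$801,1)     FP: actual stay, predicted churn
G3: =COUNTIFS($B$2:$B$801,1,$D$2:$D$801,0)     FN: actual churn, predicted stay
H3: =COUNTIFS($B$2:$B$801,1,$D$2:$D$801,1)     TP: actual churn, predicted churn
```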
Accuracy is the share of all predictions that are correct: `=(TP+TN)/(TP+TN+FP+FN)`, where TP = true positives, TN = true negatives, FP = false positives, and FN = false negatives.
Sensitivity measures the share of actual positives the model catches, `=TP/(TP+FN)`, while specificity measures the share of actual negatives it correctly identifies, `=TN/(TN+FP)`. Applied to the confusion matrix above, NorthStar's churn model achieved the following on the test data: accuracy = 667/800 ≈ 83.4%, sensitivity = 237/300 = 79.0%, and specificity = 430/500 = 86.0%.
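Continuing with the hypothetical layout above (TN in G2, FP in H2, FN in G3, TP in H3):

```
G5: =(H3+G2)/(H3+G2+H2+G3)    accuracy: (237+430)/800 ≈ 83.4%
G6: =H3/(H3+G3)               sensitivity: 237/(237+63) = 79.0%
G7: =G2/(G2+H2)               specificity: 430/(430+70) = 86.0%
```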
The slightly lower sensitivity means some churning customers slip through. Since the cost of losing a customer (lifetime value) far exceeds the cost of a retention offer, NorthStar decided to lower the classification threshold from 0.5 to 0.4 — catching more true churners at the expense of some false alarms.
The cost of false negatives versus false positives should drive your choice of classification threshold. In churn prediction, missing a churner (false negative) is typically far more expensive than sending a retention offer to a loyal customer (false positive). Adjust the threshold to align with business priorities, not just to maximize overall accuracy.
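One way to make that alignment concrete is to price out the errors each candidate threshold produces. A minimal sketch, assuming a hypothetical $900 cost per missed churner and $40 per unneeded retention offer, with the error counts for one threshold in B2:B3:

```
B2: 63                     false negatives at this threshold
B3: 70                     false positives at this threshold
B5: =B2*900+B3*40          total error cost: 63*900 + 70*40 = $59,500 (assumed costs)
```

Rebuild the confusion matrix at each candidate threshold (0.5, 0.4, and so on) and choose the one that minimizes this total cost rather than the one that maximizes accuracy.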
Logistic regression extends regression analysis to binary outcomes, opening the door to classification problems that pervade business decision-making.
Binary Outcomes: Logistic regression models the probability of a binary event using the logistic (sigmoid) function, keeping predictions between 0 and 1.
Interpretation: Coefficients represent changes in log odds. Convert to odds ratios (e^b) for intuitive interpretation. OR > 1 increases odds; OR < 1 decreases odds.
Classification: The confusion matrix, accuracy, sensitivity, specificity, and AUC measure model performance. Adjust the classification threshold based on the relative cost of false positives vs. false negatives.