Simple linear regression models the relationship between one predictor and one outcome. In practice, business outcomes are rarely driven by a single factor. Multiple regression extends the model to include two or more predictors, allowing us to estimate each predictor's effect while holding the others constant.
The predicted value of the dependent variable is a linear combination of all predictors:
ŷ = b0 + b1x1 + b2x2 + … + bkxk

where b0 is the intercept, b1, b2, …, bk are the partial regression coefficients, and x1, x2, …, xk are the predictor variables. In Excel, the coefficients come from the LINEST() array formula or from Data Analysis ToolPak > Regression.
Each coefficient bj represents the expected change in the dependent variable for a one-unit increase in xj, holding all other predictors constant. This "holding constant" interpretation is what distinguishes multiple regression from running separate simple regressions.
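For readers working outside Excel, a minimal Python sketch of fitting a multiple regression by ordinary least squares is shown below; the data, variable names, and the use of NumPy's lstsq in place of LINEST() are illustrative assumptions, not part of the chapter's dataset.

import numpy as np

# Hypothetical data: rows are observations, columns are the predictors x1 and x2.
X = np.array([[10.0, 3], [12.0, 4], [15.0, 4], [18.0, 5], [20.0, 6], [22.0, 6]])
y = np.array([120.0, 135.0, 150.0, 172.0, 188.0, 195.0])  # outcome variable

# Prepend a column of ones so the first fitted coefficient is the intercept b0.
A = np.column_stack([np.ones(len(X)), X])
coefs, *_ = np.linalg.lstsq(A, y, rcond=None)
b0, b1, b2 = coefs
print(f"intercept = {b0:.2f}, b1 = {b1:.2f}, b2 = {b2:.2f}")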
NorthStar wants to predict quarterly revenue (in thousands) using four predictors: advertising spend (x1, in thousands), number of salespeople (x2), average product price (x3, in dollars), and an economic index (x4, 0–100 scale). Fitting the model to 20 quarters of historical data yields:
Revenue = −120 + 2.8(Ad Spend) + 340(Salespeople) + 0.95(Avg Price) + 15.2(Econ Index)
The coefficient b2 = 340 means that each additional salesperson is associated with a $340,000 increase in quarterly revenue, holding the other three predictors constant.
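To see how the fitted equation is used for prediction, the short sketch below plugs one quarter's values into NorthStar's equation; the predict_revenue name and the sample inputs are hypothetical.

def predict_revenue(ad_spend, salespeople, avg_price, econ_index):
    # Coefficients from NorthStar's fitted model; the result is in $ thousands.
    return -120 + 2.8 * ad_spend + 340 * salespeople + 0.95 * avg_price + 15.2 * econ_index

# Hypothetical quarter: $200k ad spend, 8 salespeople, $55 average price, index of 70.
print(predict_revenue(200, 8, 55, 70))  # about 4276, i.e. roughly $4.28M predicted revenue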
R² (coefficient of determination) measures the proportion of variance in the dependent variable explained by the set of predictors. However, R² has a fundamental flaw: it never decreases when you add a predictor, even if that predictor is useless noise.
Adjusted R² penalizes the model for each additional predictor, only increasing when a new predictor improves the model more than expected by chance alone.
Adjusted R² = 1 − (1 − R²)(n − 1) / (n − k − 1)

where R² is the unadjusted coefficient of determination, n is the sample size, and k is the number of predictors. In Excel: =1-(1-RSQ)*(n-1)/(n-k-1), or read it directly from the Regression output.
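The adjustment is also easy to script; a minimal Python helper, shown here with arbitrary example values, might look like this.

def adjusted_r2(r2, n, k):
    # Penalize R-squared for the number of predictors k, given sample size n.
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

print(adjusted_r2(0.75, 30, 3))  # example values chosen only for illustration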
NorthStar's 4-predictor model yields R² = 0.81 and Adjusted R² = 0.76. An analyst adds a fifth predictor (average employee tenure), and R² rises slightly to 0.812, but Adjusted R² drops to about 0.74.
The decline in Adjusted R² signals that the fifth predictor adds complexity without meaningful explanatory power. NorthStar should keep the simpler 4-predictor model.
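Plugging the chapter's figures (n = 20 quarters) straight into the Adjusted R² formula confirms the comparison.

print(1 - (1 - 0.810) * (20 - 1) / (20 - 4 - 1))  # 4 predictors: about 0.76
print(1 - (1 - 0.812) * (20 - 1) / (20 - 5 - 1))  # 5 predictors: about 0.74, despite the higher R-squared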
When two or more predictors are highly correlated with each other, the regression suffers from multicollinearity. This inflates the standard errors of the coefficients, making individual predictors appear non-significant even when the overall model is strong.
The VIF quantifies how much the variance of a coefficient is inflated due to collinearity with other predictors. It is calculated by regressing each predictor on all other predictors.
VIFi = 1 / (1 − Ri²)

where Ri² is the R-squared from regressing predictor xi on all other predictors; in Excel, =1/(1-RSQ(auxiliary)). A VIF of 1 means no collinearity; a VIF above 5 (or, by a looser convention, 10) signals problematic multicollinearity.
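For analysts working in Python rather than Excel, a rough sketch of the VIF calculation by auxiliary regressions could look like the following; the vif function and the simulated data (with marketing built to be nearly collinear with ad spend) are assumptions for illustration, not the chapter's NorthStar dataset.

import numpy as np

def vif(X):
    # One VIF per column of X (rows = observations, columns = predictors).
    n, k = X.shape
    vifs = []
    for i in range(k):
        target = X[:, i]
        others = np.delete(X, i, axis=1)
        A = np.column_stack([np.ones(n), others])      # auxiliary design matrix
        coefs, *_ = np.linalg.lstsq(A, target, rcond=None)
        resid = target - A @ coefs
        r2_aux = 1 - (resid @ resid) / ((target - target.mean()) @ (target - target.mean()))
        vifs.append(1 / (1 - r2_aux))                  # VIF_i = 1 / (1 - R_i^2)
    return vifs

# Simulated predictors: marketing is deliberately almost a copy of ad spend.
rng = np.random.default_rng(0)
ad_spend = rng.normal(100, 20, 40)
marketing = 0.9 * ad_spend + rng.normal(0, 5, 40)
econ_index = rng.normal(50, 10, 40)
print(vif(np.column_stack([ad_spend, marketing, econ_index])))  # first two VIFs come out high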
NorthStar's analyst notices that advertising spend (x1) and marketing budget (a new candidate predictor) are highly correlated — both measure how much the company invests in promotion. An auxiliary regression of advertising spend on marketing budget yields R² = 0.917.
VIF = 1 / (1 − 0.917) = 12.0
With VIF = 12, including both predictors severely inflates the standard errors. The analyst drops marketing budget and retains advertising spend, which has a clearer business interpretation.
Multiple regression is powerful but requires careful model building. Always check VIF for multicollinearity, compare R² vs. Adjusted R² when adding predictors, and remember that each coefficient represents the effect of that predictor holding all others constant. Adding more predictors is not always better; parsimony and interpretability matter.
This chapter introduced multiple regression, the distinction between R² and Adjusted R², and the problem of multicollinearity.
Multiple Regression: Predicts an outcome from multiple predictors. Each coefficient measures the effect of one predictor, holding all others constant.
Adjusted R²: Penalizes for adding useless predictors. Use it to compare models with different numbers of predictors.
Multicollinearity: When predictors are highly correlated, VIF > 5 or 10 signals problems. Address by removing or combining correlated variables.