When a one-way ANOVA rejects H0, we know that at least one group mean differs from the others — but we do not know which pairs of means are significantly different. Post-hoc (Latin for "after this") tests perform pairwise comparisons while controlling the familywise error rate, the probability of making at least one Type I error across all comparisons.
With k groups there are k(k−1)/2 possible pairwise comparisons. Running individual t-tests inflates the overall Type I error rate. For example, with 4 groups there are 6 comparisons; at α = 0.05 each, the familywise error rate climbs to roughly 1 − (0.95)^6 ≈ 0.26 — far above the intended 5%.
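The inflation described above can be checked directly. A minimal sketch in plain Python (the function name is ours, not from the text):

```python
from math import comb

def familywise_error(k: int, alpha: float = 0.05) -> float:
    """Probability of at least one Type I error across all
    k(k-1)/2 pairwise comparisons, assuming independent tests."""
    m = comb(k, 2)                        # number of pairwise comparisons
    return 1 - (1 - alpha) ** m

print(comb(4, 2))                         # 6 comparisons for 4 groups
print(round(familywise_error(4), 3))      # ≈ 0.265
```

The independence assumption is a simplification (pairwise comparisons share data), but it illustrates why uncorrected t-tests are not acceptable after ANOVA.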
In Chapter 2, NorthStar's ANOVA rejected H0 for employee satisfaction across 4 divisions (F(3,36) = 5.77, p = 0.0025). HR now needs to determine which specific pairs of divisions differ. With 4 groups, there are 4(3)/2 = 6 pairwise comparisons to evaluate.
The Tukey HSD test compares every pair of group means. A pair is declared significantly different if the absolute difference between their means exceeds a critical threshold called the Honestly Significant Difference.
HSD = q · √(MSW / n)

where q is the critical value from the Studentized Range table (based on k groups and dfW), MSW is the within-group mean square from ANOVA, and n is the common sample size per group. In Excel, compute each pairwise difference with =ABS(mean1-mean2) and compare it to the HSD threshold.
For each pair of groups (i, j): if |x̄i − x̄j| > HSD, the difference is statistically significant. Otherwise, we have insufficient evidence that those two means differ.
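As a sketch of the computation for NorthStar's data: MSW can be reconstructed from the Chapter 2 ANOVA output as (SST − SSB)/dfW = (2919.50 − 947.50)/36 ≈ 54.78, with n = 10 per group. The value q ≈ 3.81 is an assumed table lookup for k = 4, dfW = 36, α = 0.05 (verify against your own Studentized Range table):

```python
from math import sqrt

k, df_w, n = 4, 36, 10           # groups, within-group df, per-group sample size
ssb, sst = 947.50, 2919.50       # from the Chapter 2 ANOVA table
msw = (sst - ssb) / df_w         # within-group mean square ≈ 54.78
q = 3.81                         # assumed Studentized Range critical value (k=4, df=36, α=0.05)

hsd = q * sqrt(msw / n)          # Honestly Significant Difference
print(round(hsd, 2))             # ≈ 8.92
```

Any pair of division means differing by more than about 8.9 satisfaction points would then be flagged as significantly different.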
A statistically significant ANOVA result tells us that group means differ, but not how much. Effect size quantifies the magnitude of the difference. The most common effect size for one-way ANOVA is eta-squared (η²), which represents the proportion of total variance explained by the grouping variable.
η² = SSB / SST (in Excel: =SSB/SST from the ANOVA output), where SSB is the between-group sum of squares and SST is the total sum of squares from the ANOVA table.
Cohen's guidelines for interpreting η²: approximately 0.01 is a small effect, 0.06 a medium effect, and 0.14 a large effect.
From Chapter 2, NorthStar's ANOVA produced SSB = 947.50 and SST = 2919.50. The effect size is:
η² = 947.50 / 2919.50 = 0.325
This is a large effect — division membership explains about 32.5% of the variance in employee satisfaction. This is not only statistically significant but also practically meaningful. HR should invest in understanding why Corporate Services scores so much higher.
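The same computation in a short script, using Cohen's conventional 0.01/0.06/0.14 cutoffs for the interpretive label:

```python
ssb, sst = 947.50, 2919.50       # from the Chapter 2 ANOVA table

eta_sq = ssb / sst               # proportion of variance explained

# Cohen's conventional benchmarks: 0.01 small, 0.06 medium, 0.14 large
if eta_sq >= 0.14:
    label = "large"
elif eta_sq >= 0.06:
    label = "medium"
elif eta_sq >= 0.01:
    label = "small"
else:
    label = "negligible"

print(round(eta_sq, 3), label)   # 0.325 large
```

Because 0.325 clears the 0.14 benchmark by a wide margin, the "large effect" conclusion does not hinge on the exact cutoff chosen.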
Always report effect size alongside p-values. A small p-value tells you the result is unlikely due to chance; η² tells you whether the effect is large enough to matter in practice. With large samples, even trivial differences can be "significant." Effect size provides the practical context that p-values alone cannot.
This chapter covered two essential follow-ups to a significant ANOVA: post-hoc pairwise comparisons and effect size measurement.
Post-Hoc Tests: After rejecting H0 in ANOVA, use Tukey HSD (equal n), Bonferroni (conservative), or Scheffé (most conservative) to identify which specific pairs of means differ.
Tukey HSD: Compare each |x̄i − x̄j| to HSD = q · √(MSW/n). Pairs exceeding the threshold are significantly different.
Eta-Squared: η² = SSB/SST measures the proportion of variance explained. Always report alongside p-values.