All the tests we have covered so far — Z-tests, t-tests, ANOVA — deal with continuous, numerical data. But many business questions involve categorical data: defect types, customer segments, satisfaction levels. The chi-square test is designed for exactly these situations.
The goodness-of-fit test asks whether the observed frequency distribution of a single categorical variable matches an expected (hypothesized) distribution. For example, if defects should be equally distributed across four categories, do the actual counts support that assumption?
The test statistic is:
χ² = Σ (Oᵢ − Eᵢ)² / Eᵢ
where Oᵢ is the observed count for category i, Eᵢ is the expected count, and the sum is over all k categories. For goodness of fit, df = k − 1.
In Excel, =CHISQ.TEST(actual_range, expected_range) returns the p-value for this test.
GreatLakes tracks defects in four categories: dimensional, surface finish, material, and assembly. Management expects defects to be equally distributed across all four categories (25% each). In a recent audit of 160 defects, the observed counts were: 45 dimensional, 32 surface finish, 58 material, and 25 assembly. Does this distribution differ significantly from the expected equal split?
| Category | Observed (O) | Expected (E) | (O − E)² / E |
|---|---|---|---|
| Dimensional | 45 | 40 | 0.625 |
| Surface Finish | 32 | 40 | 1.600 |
| Material | 58 | 40 | 8.100 |
| Assembly | 25 | 40 | 5.625 |
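The arithmetic in the table above can be reproduced with a short script. This is a minimal sketch, not a prescribed method: the variable names are illustrative, and the 5% significance level is an assumption, since the example does not state one. The critical value 7.815 is the standard chi-square table value for α = 0.05 and df = 3.

```python
# Goodness-of-fit statistic for the GreatLakes defect audit (counts from the table above).
observed = {"Dimensional": 45, "Surface Finish": 32, "Material": 58, "Assembly": 25}

n = sum(observed.values())   # 160 defects in total
k = len(observed)            # 4 categories
expected = n / k             # equal split: 40 per category

# Sum of (O - E)^2 / E over all categories
chi_square = sum((o - expected) ** 2 / expected for o in observed.values())
df = k - 1

# Assumed significance level: alpha = 0.05.
# 7.815 is the chi-square critical value for alpha = 0.05, df = 3,
# the same number =CHISQ.INV.RT(0.05, 3) gives in Excel.
CRITICAL_005_DF3 = 7.815

print(f"chi-square = {chi_square:.3f}, df = {df}")
print("Reject H0" if chi_square > CRITICAL_005_DF3 else "Fail to reject H0")
```

The four contributions sum to χ² = 15.95, which exceeds 7.815, so at the assumed 5% level the equal-split hypothesis is rejected. The material category contributes the most (8.100), followed by assembly (5.625).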
While the goodness-of-fit test examines one categorical variable, the test of independence examines the relationship between two categorical variables. The data are organized in a contingency table (also called a cross-tabulation), where rows represent one variable and columns represent the other.
Under the assumption of independence, the expected count for each cell is:
E = (row total × column total) / grand total
In Excel: =(row_total * col_total) / grand_total
The degrees of freedom are df = (r − 1)(c − 1), where r is the number of rows and c is the number of columns.
GreatLakes surveys 200 employees about job satisfaction across three shifts. Management wants to know whether satisfaction level is related to shift assignment, or whether the distributions are similar across shifts.
| | Day | Evening | Night | Row Total |
|---|---|---|---|---|
| Satisfied | 50 | 30 | 10 | 90 |
| Neutral | 20 | 25 | 15 | 60 |
| Dissatisfied | 10 | 15 | 25 | 50 |
| Col Total | 80 | 70 | 50 | 200 |
| | Day | Evening | Night |
|---|---|---|---|
| Satisfied | 36.0 | 31.5 | 22.5 |
| Neutral | 24.0 | 21.0 | 15.0 |
| Dissatisfied | 20.0 | 17.5 | 12.5 |
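The two tables above hold the observed counts and the expected counts under independence. The following sketch recomputes the expected counts and then the test statistic. The variable names are illustrative, and the 5% significance level is an assumption; 9.488 is the standard chi-square table value for α = 0.05 and df = 4.

```python
# Test of independence for the GreatLakes satisfaction survey.
# Rows: Satisfied, Neutral, Dissatisfied. Columns: Day, Evening, Night.
observed = [
    [50, 30, 10],   # Satisfied
    [20, 25, 15],   # Neutral
    [10, 15, 25],   # Dissatisfied
]

row_totals = [sum(row) for row in observed]         # [90, 60, 50]
col_totals = [sum(col) for col in zip(*observed)]   # [80, 70, 50]
grand_total = sum(row_totals)                       # 200

# Expected count per cell: (row total * column total) / grand total
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Sum of (O - E)^2 / E over every cell of the table
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)   # (3 - 1)(3 - 1) = 4

# Assumed significance level: alpha = 0.05 (critical value for df = 4).
CRITICAL_005_DF4 = 9.488

print(f"chi-square = {chi_square:.3f}, df = {df}")
print("Reject H0" if chi_square > CRITICAL_005_DF4 else "Fail to reject H0")
```

For these data the statistic is approximately 31.75, well above 9.488, so at the assumed 5% level satisfaction and shift do not appear to be independent. The largest single contribution comes from dissatisfied night-shift employees (25 observed against 12.5 expected).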
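The chi-square test is widely applicable, but it does have important requirements: the data must come from a random sample, the observations must be independent, every expected count must be at least 5, and the data must be raw counts rather than percentages.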
If your contingency table is 2×2 and any expected count is less than 5, use Fisher’s exact test instead. For larger tables with some small expected counts, you can often combine similar categories to increase expected frequencies above the threshold of 5.
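The rule of thumb above can be turned into a quick pre-check before running the test. This is only a sketch of that decision rule: the function names `smallest_expected_count` and `suggest_test` and the returned labels are hypothetical, chosen for illustration.

```python
def smallest_expected_count(observed):
    """Return the smallest expected cell count under independence."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return min(r * c / grand_total for r in row_totals for c in col_totals)


def suggest_test(observed, threshold=5):
    """Apply the expected-count rule of thumb to a contingency table."""
    if smallest_expected_count(observed) >= threshold:
        return "chi-square"
    is_2x2 = len(observed) == 2 and len(observed[0]) == 2
    # 2x2 tables with a small expected count: Fisher's exact test.
    # Larger tables: consider combining similar categories first.
    return "fisher-exact" if is_2x2 else "combine-categories"


# Observed counts from the GreatLakes satisfaction survey.
satisfaction = [
    [50, 30, 10],
    [20, 25, 15],
    [10, 15, 25],
]
print(smallest_expected_count(satisfaction))  # smallest expected count is 12.5
print(suggest_test(satisfaction))             # chi-square
```

In the satisfaction survey the smallest expected count is 12.5 (dissatisfied, night shift), so the chi-square test is appropriate without combining categories.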
Chi-square tests work with counts, not percentages. Always verify that all expected cell frequencies are at least 5 before interpreting the results. The chi-square test tells you whether an association exists, but not how strong it is. To see which cells drive the result, examine the standardized residuals; to measure the overall strength of the association, compute Cramér’s V.
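Cramér’s V is not defined elsewhere in this chapter, so the sketch below uses its standard definition, V = √(χ² / (n × min(r − 1, c − 1))), applied to the satisfaction survey counts. The function names are illustrative.

```python
import math


def chi_square_statistic(observed):
    """Chi-square statistic for a contingency table of raw counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return sum(
        (o - r * c / n) ** 2 / (r * c / n)
        for row, r in zip(observed, row_totals)
        for o, c in zip(row, col_totals)
    )


def cramers_v(observed):
    """Cramér's V: sqrt(chi-square / (n * min(r - 1, c - 1)))."""
    n = sum(sum(row) for row in observed)
    smaller_dim = min(len(observed) - 1, len(observed[0]) - 1)
    return math.sqrt(chi_square_statistic(observed) / (n * smaller_dim))


# Observed counts from the GreatLakes satisfaction survey.
satisfaction = [
    [50, 30, 10],
    [20, 25, 15],
    [10, 15, 25],
]
print(f"Cramér's V = {cramers_v(satisfaction):.3f}")
```

For the survey data V comes out at roughly 0.28 on a scale from 0 (no association) to 1 (perfect association). How to label a value of that size depends on the convention you adopt, so treat any cutoffs as a judgment call.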
In this final chapter of STATS200, we covered the chi-square family of tests for categorical data. Here is what you should take away:
Goodness of Fit: Tests whether the observed distribution of a single categorical variable matches an expected distribution. Uses df = k − 1.
Test of Independence: Tests whether two categorical variables are related using a contingency table. Expected counts are computed as (row total × column total) / grand total. Uses df = (r − 1)(c − 1).
Assumptions: Random sample, independent observations, all expected counts ≥ 5. Always use raw counts, never percentages.
Excel: Use =CHISQ.TEST(actual_range, expected_range) to get the p-value directly. Use =CHISQ.INV.RT(alpha, df) to find the critical value.
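Outside Excel, the same p-value can be obtained from the upper-tail probability of the chi-square distribution. The sketch below uses only the Python standard library and the closed-form expressions that exist for integer degrees of freedom. The function name `chi2_upper_tail` is hypothetical, and the example reuses the defect-audit statistic (χ² = 15.95, df = 3) from the goodness-of-fit example.

```python
import math


def chi2_upper_tail(x, df):
    """P(X >= x) for a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{j=0}^{df/2 - 1} (x/2)^j / j!
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= half / j
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) plus a finite correction series
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)
    for j in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * j + 1)
    return total


# p-value for the defect audit: chi-square = 15.95 with df = 3
p_value = chi2_upper_tail(15.95, 3)
print(f"p-value = {p_value:.5f}")
```

As a sanity check, the function returns approximately 0.05 at the familiar critical values 7.815 (df = 3) and 9.488 (df = 4). For the defect audit the p-value is far below 0.05, consistent with rejecting the equal-split hypothesis.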
Congratulations — you have completed all seven chapters of STATS200: Business Statistics. You now have a solid foundation in descriptive statistics, hypothesis testing, confidence intervals, regression, ANOVA, and chi-square tests. Return to the course overview to review any chapter.