Chapter 4

Confidence Intervals


4.1 What Is a Confidence Interval?

In inferential statistics we rarely know the true population parameter. Instead, we collect a sample and use it to estimate the parameter. A single number computed from the sample — such as the sample mean — is called a point estimate. It is our best single guess, but it comes with no built-in measure of uncertainty.

A confidence interval (CI) extends the point estimate into a range of plausible values for the population parameter. Rather than stating “the average lead time is 4.2 days,” we say “we are 95% confident that the true average lead time is between 3.9 and 4.5 days.” The interval communicates both our estimate and the precision of that estimate.

The 95% Confidence Level

A 95% confidence interval does not mean there is a 95% probability that the true parameter lies inside this specific interval. The true parameter is fixed — it either falls inside or it does not. Instead, the “95%” refers to the procedure: if we were to draw many random samples and build a CI from each one, approximately 95 out of every 100 intervals would contain the true parameter.
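
The repeated-sampling idea can be checked with a small simulation. The sketch below is illustrative only: the population values (mean 4.2, standard deviation 0.9), the normal population, the trial count, and the function name `coverage_rate` are all assumptions chosen for the demonstration, and σ is treated as known so that a simple z-interval can be used.

```python
import math
import random
from statistics import NormalDist, mean


def coverage_rate(true_mean=4.2, sigma=0.9, n=36, confidence=0.95,
                  trials=2000, seed=1):
    """Return the fraction of simulated intervals that contain true_mean.

    All default values are hypothetical; sigma is assumed known, so each
    interval is x_bar +/- z* * sigma / sqrt(n).
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sigma / math.sqrt(n)

    hits = 0
    for _ in range(trials):
        # Draw one random sample and build its interval
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        x_bar = mean(sample)
        if x_bar - margin <= true_mean <= x_bar + margin:
            hits += 1
    return hits / trials


print(f"Share of intervals containing the true mean: {coverage_rate():.3f}")
```

The printed share should land close to 0.95, but not exactly on it, because the number of simulated samples is finite. Any single interval in the loop either contains 4.2 or it does not; the 95% describes the loop as a whole.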

Margin of Error

Every confidence interval has the form: point estimate ± margin of error. The margin of error captures how far the interval extends on either side of the point estimate. A smaller margin of error means a more precise estimate — but achieving it requires a larger sample size, lower confidence level, or less variability in the data.

⚠ Common Misconception

Wrong: “There is a 95% chance that the population mean is inside this interval.”

Right: “If we repeated this sampling procedure many times, 95% of the resulting intervals would contain the population mean.” The confidence level describes the long-run success rate of the method, not the probability for any single interval.

4.2 Confidence Interval for a Mean

When constructing a confidence interval for a population mean, the formula depends on whether the population standard deviation is known or unknown.

Z-Interval (Population Standard Deviation Known)

When the population standard deviation (σ) is known — rare in practice but common in textbooks — we use the standard normal (Z) distribution. The critical value Z* corresponds to the desired confidence level (e.g., Z* = 1.96 for 95%).

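Z-Interval for a Mean
CI = x̄ ± Z* × (σ / n^{1/2})
where x̄ is the sample mean, Z* is the critical value, σ is the population standard deviation, and n is the sample size.
📊 Excel: =CONFIDENCE.NORM(alpha, std_dev, n) returns the margin of error, not the interval itself. Use alpha = 1 − confidence level (0.05 for 95%), then add the result to and subtract it from x̄.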
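
As a rough sketch, the z-interval (sample mean ± Z* × σ / √n) can be computed with Python's standard library. The function name `z_interval` is hypothetical, and the inputs borrow the lead-time numbers used elsewhere in this chapter while pretending that σ = 0.9 is known, which is an assumption made only for illustration.

```python
import math
from statistics import NormalDist


def z_interval(x_bar, sigma, n, confidence=0.95):
    """Return (lower, upper) for a z-interval; sigma must be the KNOWN population SD."""
    # Two-sided critical value, e.g. about 1.96 for 95% confidence
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sigma / math.sqrt(n)
    return x_bar - margin, x_bar + margin


# Hypothetical inputs: mean 4.2 days, sigma 0.9 days (assumed known), n = 36
low, high = z_interval(4.2, 0.9, 36)
print(f"95% z-interval: ({low:.3f}, {high:.3f})")
```

With these inputs the interval comes out at roughly (3.906, 4.494).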

T-Interval (Population Standard Deviation Unknown)

In most real-world situations, we do not know σ and must estimate it with the sample standard deviation s. This additional uncertainty means we use the t-distribution instead of the Z-distribution. The t-distribution is wider (heavier tails), especially for small samples, reflecting the extra estimation error. As the sample size grows, the t-distribution approaches the standard normal.
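
To see how the t critical value shrinks toward the Z value as the sample grows, the sketch below compares a few two-sided 95% critical values. The t* values are copied from a standard t-table rather than computed, because Python's standard library has no t-distribution; the chosen degrees of freedom are arbitrary examples.

```python
from statistics import NormalDist

# Two-sided 95% critical values from a standard t-table (looked up, not computed)
T_STAR_95 = {5: 2.571, 10: 2.228, 30: 2.042, 35: 2.030, 100: 1.984}

# Standard normal critical value for 95% confidence (about 1.960)
z_star = NormalDist().inv_cdf(0.975)

for df, t_star in T_STAR_95.items():
    # The ratio shows how much wider the t-interval is than a z-interval
    print(f"df = {df:>3}   t* = {t_star:.3f}   t*/Z* = {t_star / z_star:.3f}")
```

At df = 5 the t-interval is about 31% wider than a z-interval built from the same standard error; by df = 100 the difference is close to 1%.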

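T-Interval for a Mean
CI = x̄ ± t* × (s / n^{1/2})
where x̄ is the sample mean, t* is the critical value from the t-distribution with df = n − 1, s is the sample standard deviation, and n is the sample size.
📊 Excel: =CONFIDENCE.T(alpha, std_dev, n) returns the margin of error, not the interval itself. Pass the sample standard deviation s as std_dev, then add the result to and subtract it from x̄.
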
✎ Worked Example: 95% CI for Mean Lead Time
1. GreatLakes Manufacturing sampled n = 36 shipments and found a mean lead time of x̄ = 4.2 days with a sample standard deviation of s = 0.9 days.
2. Since σ is unknown, use the t-interval. With df = 36 − 1 = 35, the critical value for 95% confidence is t* ≈ 2.030.
3. Calculate the margin of error:
ME = t* × (s / n^{1/2}) = 2.030 × (0.9 / 36^{1/2}) = 2.030 × (0.9 / 6) = 2.030 × 0.15 = 0.3045
4. Build the interval:
CI = 4.2 ± 0.3045 = (3.8955, 4.5045)
5. Result: We are 95% confident that GreatLakes Manufacturing's true average lead time is between 3.90 and 4.50 days.

🏪 GreatLakes Manufacturing

The 95% CI of (3.90, 4.50) days tells operations management that plausible values for the true average lead time run from about 3.9 to 4.5 days, all well below 5 days. If the company's service-level agreement guarantees delivery within 5 days, this interval provides strong evidence that the process is meeting the target on average. It says nothing about individual shipments, some of which may still take longer than 5 days.
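
The worked example above can be reproduced in a few lines of Python. This is a minimal sketch: the function name `t_interval` is hypothetical, and t* = 2.030 is taken from step 2 of the example rather than computed (a statistics package such as SciPy could supply it, but that is not assumed here).

```python
import math


def t_interval(x_bar, s, n, t_star):
    """Return (margin, (lower, upper)) for a t-interval given a looked-up t*."""
    standard_error = s / math.sqrt(n)   # s / sqrt(n)
    margin = t_star * standard_error    # ME = t* x standard error
    return margin, (x_bar - margin, x_bar + margin)


# Values from the GreatLakes Manufacturing example
n = 36          # sample size
x_bar = 4.2     # sample mean lead time (days)
s = 0.9         # sample standard deviation (days)
t_star = 2.030  # 95% critical value for df = 35, taken from the text

margin, (low, high) = t_interval(x_bar, s, n, t_star)
print(f"Margin of error: {margin:.4f} days")
print(f"95% CI: ({low:.2f}, {high:.2f}) days")
```

Carrying the unrounded margin of error through to the final step, and rounding only once at the end, avoids the small discrepancies that appear when intermediate values are rounded first.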

✓ Check Your Understanding
To cut the margin of error in half, you must:
A) Double the sample size
B) Quadruple the sample size
C) Double the confidence level
D) Reduce the standard deviation

4.3 Factors Affecting CI Width

Understanding what makes a confidence interval wider or narrower is critical for designing studies and interpreting results. Three factors control the width of a confidence interval:

1. Sample Size (n)

Increasing the sample size reduces the standard error (s / n^{1/2}), which shrinks the margin of error. Because n appears under a square root, you must quadruple the sample size to cut the margin of error in half. This diminishing-returns relationship is one of the most important practical insights in statistics.
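
The square-root relationship is easy to verify numerically. The sketch below holds s fixed at 0.9 (an illustrative value) and prints the standard error for a few hypothetical sample sizes; the function name `standard_error` is an assumption. Note that in a t-interval the critical value t* also shrinks slightly as n grows, so quadrupling n cuts the margin of error by a little more than half.

```python
import math


def standard_error(s, n):
    """Standard error of the mean: s / sqrt(n)."""
    return s / math.sqrt(n)


s = 0.9  # illustrative sample standard deviation
for n in (36, 72, 144, 576):
    print(f"n = {n:>3}   standard error = {standard_error(s, n):.4f}")
```

Doubling n from 36 to 72 reduces the standard error by only about 29%, while going from 36 to 144 halves it.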

2. Confidence Level

A higher confidence level (e.g., 99% vs. 95%) requires a larger critical value, which widens the interval. There is always a trade-off between confidence and precision. A 99% CI is wider than a 95% CI built from the same data — you pay for greater confidence with less precision.
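
The trade-off can be made concrete by comparing critical values. The sketch below uses standard normal critical values for simplicity (t critical values follow the same pattern and are somewhat larger); the function name `z_critical` and the standard error of 0.15 are illustrative assumptions.

```python
from statistics import NormalDist


def z_critical(confidence):
    """Two-sided standard normal critical value for a given confidence level."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


standard_error = 0.15  # illustrative: 0.9 / sqrt(36)
for level in (0.90, 0.95, 0.99):
    z_star = z_critical(level)
    width = 2 * z_star * standard_error  # full width of the interval
    print(f"{level:.0%} confidence   Z* = {z_star:.3f}   width = {width:.3f}")
```

Moving from 95% to 99% confidence raises Z* from about 1.960 to about 2.576, which widens the interval by roughly 31% with no change in the data.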

3. Standard Deviation (Variability)

More variable data produces wider intervals. If the data values are spread far from the mean, our estimate of the mean is less precise. Reducing measurement variability (through better processes or more consistent sampling) narrows the CI without requiring a larger sample.

[Interactive chart: How Sample Size Affects Confidence Interval Width]

💡 Key Takeaway

The three levers for CI width are sample size, confidence level, and data variability. In practice, analysts most often increase sample size to narrow an interval, since changing the confidence level or reducing inherent variability may not be feasible.

4.4 Chapter Summary

In this chapter we moved from point estimates to interval estimates, learning how confidence intervals quantify the uncertainty inherent in sampling. Here is what you should take away:

💡 Chapter 4 Summary

Point vs. Interval Estimates: A point estimate is a single number; a confidence interval provides a range of plausible values plus a measure of how confident we are.

Z- and T-Intervals: Use Z when σ is known, t when σ is unknown. The t-interval is wider to account for the extra estimation uncertainty.

Width Factors: Larger n narrows the CI; higher confidence widens it; more variability widens it. Quadrupling n halves the margin of error.

Interpretation: A 95% CI means 95% of similarly constructed intervals would contain the true parameter — it does not mean there is a 95% probability the parameter is in this specific interval.

📋 Chapter 4 — Formula Reference
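Measure | Formula | Excel Function
Z-Interval | x̄ ± Z* × (σ / n^{1/2}) | =CONFIDENCE.NORM(alpha, std_dev, n)
T-Interval | x̄ ± t* × (s / n^{1/2}) | =CONFIDENCE.T(alpha, std_dev, n)
Margin of Error | t* × (s / n^{1/2}) | =CONFIDENCE.T(alpha, s, n)
Standard Error | s / n^{1/2} | =s/SQRT(n)
Note: CONFIDENCE.NORM and CONFIDENCE.T return the margin of error; the interval is x̄ ± that value.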