Every business decision begins with understanding the data at hand. Before building predictive models or running hypothesis tests, analysts must first describe what the data looks like. This is the domain of descriptive statistics — a collection of methods for summarizing, organizing, and visualizing raw data so that meaningful patterns emerge.
Descriptive statistics answer fundamental questions: What is a typical value in our dataset? How much do the observations spread out? Are there unusually high or low values that deserve attention? These questions may seem simple, but the answers form the foundation of every advanced analysis that follows.
In a business context, descriptive statistics are the first tool managers reach for. A regional sales director looking at quarterly revenue figures across 50 territories does not start with regression analysis — they start by computing averages, sorting from high to low, and plotting bar charts. Descriptive statistics turn a sprawling spreadsheet into a concise story.
Consider the difference between receiving a list of 10,000 daily transactions versus receiving a single summary: average transaction value is $47, with a standard deviation of $12, and a median of $44. The summary instantly communicates what would take hours to absorb from the raw data. Descriptive statistics compress information without losing the essential shape of the data.
Throughout this chapter, we will follow a single running example — LakeFront Retail Co. — to see how each concept applies to a real business scenario.
LakeFront Retail Co. is a mid-size retail chain operating 12 stores around the Great Lakes region. Management wants to understand weekly revenue patterns across all stores to decide where to invest in expansion, which locations need support, and how consistent performance is across the chain.
The dataset below shows weekly revenue (in thousands of dollars) for each of the 12 stores. This data will serve as our working example throughout the entire chapter.
| Store | Revenue ($K) |
|---|---|
| Duluth | 48 |
| Thunder Bay | 52 |
| Marquette | 41 |
| Traverse City | 63 |
| Green Bay | 55 |
| Milwaukee | 71 |
| Chicago | 89 |
| Gary | 44 |
| Toledo | 38 |
| Cleveland | 61 |
| Erie | 47 |
| Buffalo | 57 |
The first question we typically ask about a dataset is: what is a typical value? Measures of central tendency give us a single number that represents the “center” or “middle” of the data. The three most common measures are the mean, median, and mode.
The mean is the most widely used measure of central tendency. It is computed by adding all values in the dataset and dividing by the number of observations. The mean takes every data point into account, which makes it a comprehensive summary — but also makes it sensitive to extreme values.
In formula form, $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$, where $\bar{x}$ is the sample mean, $x_i$ is each individual observation, and $n$ is the total number of observations. In Excel, the mean is computed with =AVERAGE(range).
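To make the arithmetic concrete, here is a minimal Python sketch of the same calculation using the store revenues from the table above. Python's statistics module is simply one convenient tool; the Excel formula gives the same result.

```python
from statistics import mean

# Weekly revenue ($K) for the 12 LakeFront stores, in table order
revenues = [48, 52, 41, 63, 55, 71, 89, 44, 38, 61, 47, 57]

# Mean: sum of all observations divided by the number of observations
manual_mean = sum(revenues) / len(revenues)
print(manual_mean)       # 55.5
print(mean(revenues))    # 55.5, the same result via the statistics module
```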
The median is the middle value when data are arranged in order from smallest to largest. If the dataset has an odd number of observations, the median is the single middle value. If the dataset has an even number, the median is the average of the two middle values. Because the median depends only on position — not magnitude — it is resistant to outliers.
In Excel, the median is computed with =MEDIAN(range). By hand, for an even number of observations $n$, sort the data and average the values at positions $n/2$ and $(n/2)+1$.
The mode is the value that occurs most frequently in a dataset. In continuous data (like revenue figures), it is common for no value to repeat, in which case we say the data has no mode. The mode is most useful for categorical or discrete data, such as the most popular product category sold across stores.
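A similar sketch, under the same assumptions as the one above, computes the median and checks for a mode. The product-category list at the end is purely hypothetical, included only to show the kind of data for which the mode is genuinely useful.

```python
from statistics import median, multimode

revenues = [48, 52, 41, 63, 55, 71, 89, 44, 38, 61, 47, 57]

# Median: middle of the sorted data; with n = 12 (even), it is the
# average of the two middle values (52 and 55)
print(median(revenues))       # 53.5

# Mode: most frequent value(s). Every revenue figure appears exactly once,
# so multimode() returns all 12 values, i.e. there is no single mode.
print(multimode(revenues))

# The mode is more useful for categorical data, e.g. a (hypothetical)
# list of each store's best-selling product category
categories = ["Apparel", "Grocery", "Apparel", "Electronics", "Apparel",
              "Grocery", "Electronics", "Apparel", "Grocery", "Apparel",
              "Grocery", "Apparel"]
print(multimode(categories))  # ['Apparel']
```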
LakeFront's mean weekly revenue is $55.5K, but the median is $53.5K. The mean is pulled higher by the Chicago store, which generates $89K per week — significantly more than any other location. This $2K gap between mean and median tells management that the revenue distribution is right-skewed: a few high-performing stores push the average above the typical store's revenue.
For planning purposes, the median ($53.5K) may be a better representation of what a “typical” LakeFront store earns each week.
The mean uses every data point in its calculation, making it sensitive to outliers. The median depends only on the middle position, making it robust. When data is skewed, the median often provides a more representative measure of the “typical” value.
Knowing the center of a dataset is only half the picture. Two datasets can share the same mean yet look entirely different. Imagine two retail chains that both average $55K in weekly revenue per store — one might have stores clustered between $50K and $60K, while the other has stores ranging from $20K to $90K. The spread (or variability) of data tells us how much individual values differ from one another and from the center.
The most common measures of spread are the range, variance, standard deviation, and interquartile range (IQR).
The range is the simplest measure of spread: the difference between the maximum and minimum values. It is easy to compute but only considers two data points, making it highly sensitive to outliers.
In Excel, the range is computed as =MAX(range)-MIN(range).

While the range only uses two values, the variance considers how far every data point deviates from the mean. It is computed by finding each observation's deviation from the mean, squaring those deviations (to eliminate negative signs), summing them, and dividing by n − 1 (for a sample). The standard deviation is the square root of the variance, bringing the measure back to the original units.
We divide by n − 1 rather than n because we are working with a sample, not the entire population. This correction (called Bessel's correction) produces an unbiased estimate of the population variance.
In formula form, $s^2 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n-1}$, where $s^2$ is the sample variance, $x_i$ is each observation, $\bar{x}$ is the sample mean, and $n$ is the sample size. In Excel: =VAR.S(range).
The standard deviation $s = \sqrt{s^2}$ is the square root of the variance, expressed in the same units as the data. In Excel: =STDEV.S(range).
| Store | $x_i$ | $x_i - \bar{x}$ | $(x_i - \bar{x})^2$ |
|---|---|---|---|
| Duluth | 48 | −7.5 | 56.25 |
| Thunder Bay | 52 | −3.5 | 12.25 |
| Marquette | 41 | −14.5 | 210.25 |
| Traverse City | 63 | 7.5 | 56.25 |
| Green Bay | 55 | −0.5 | 0.25 |
| Milwaukee | 71 | 15.5 | 240.25 |
| Chicago | 89 | 33.5 | 1122.25 |
| Gary | 44 | −11.5 | 132.25 |
| Toledo | 38 | −17.5 | 306.25 |
| Cleveland | 61 | 5.5 | 30.25 |
| Erie | 47 | −8.5 | 72.25 |
| Buffalo | 57 | 1.5 | 2.25 |
| **Sum of squared deviations** | | | 2,241.00 |

Dividing by n − 1 = 11 gives a sample variance of $s^2 \approx 203.7$, and taking the square root yields a sample standard deviation of $s \approx 14.3$, or about $14.3K.
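The short Python sketch below, again using only the standard statistics module, reproduces this hand calculation; both variance() and stdev() apply the same n − 1 correction as Excel's =VAR.S and =STDEV.S.

```python
from statistics import mean, variance, stdev

revenues = [48, 52, 41, 63, 55, 71, 89, 44, 38, 61, 47, 57]

# Reproduce the table: squared deviations from the mean, then their sum
xbar = mean(revenues)                                  # 55.5
sum_sq_dev = sum((x - xbar) ** 2 for x in revenues)    # 2241.0

# Sample variance divides by n - 1 (Bessel's correction), matching =VAR.S
n = len(revenues)
print(sum_sq_dev / (n - 1))     # ~203.7
print(variance(revenues))       # same value from the statistics module

# Sample standard deviation is the square root, matching =STDEV.S
print(stdev(revenues))          # ~14.3
```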
A standard deviation of approximately $14.3K tells LakeFront's management that most stores generate weekly revenue within roughly one standard deviation of the mean — between about $41K and $70K. Stores outside this range (Chicago at $89K, Toledo at $38K) deserve special attention: Chicago as a model for success, and Toledo as a candidate for operational review.
Standard deviation measures how much individual data points typically differ from the mean. Adding a constant to every value shifts the center but does not change the spread — the distances between points remain the same. Standard deviation changes only under operations that stretch or shrink those distances, such as multiplying every value by a constant, which scales the standard deviation by the absolute value of that constant.
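As a quick check of this property, here is a small sketch under the same hypothetical setup as the earlier snippets: a constant shift leaves the standard deviation unchanged, while a constant multiplier scales it.

```python
from statistics import stdev

revenues = [48, 52, 41, 63, 55, 71, 89, 44, 38, 61, 47, 57]
s = stdev(revenues)                              # ~14.27

# Adding a constant (say, a flat $5K boost to every store) shifts the center
# but leaves all pairwise distances, and therefore the spread, unchanged
shifted = [x + 5 for x in revenues]
print(round(s, 2), round(stdev(shifted), 2))     # 14.27 14.27

# Multiplying every value by a constant scales the spread by that constant:
# converting from $K to dollars multiplies the standard deviation by 1,000
scaled = [x * 1000 for x in revenues]
print(round(stdev(scaled) / s))                  # 1000
```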
Numbers alone rarely tell the full story. A well-designed visualization can reveal patterns, outliers, and distributions that summary statistics might obscure. In business settings, charts and graphs are often the primary way analysis gets communicated to stakeholders who may not be comfortable interpreting raw numbers.
The type of chart you choose depends on what you want to communicate:

- **Bar charts** compare values across categories, such as revenue by store.
- **Histograms** show the shape of a distribution, revealing skew and clustering.
- **Scatter plots** display the relationship between two variables.
For LakeFront's data, a bar chart sorted by revenue is the most immediate way to compare store performance. Let us look at one now.
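A minimal sketch of how such a sorted bar chart could be built is shown below, assuming Python with matplotlib (one convenient charting tool, not one prescribed by the chapter; the same chart is straightforward to produce in Excel).

```python
import matplotlib.pyplot as plt

# Store names and weekly revenue ($K), from the LakeFront table
stores = ["Duluth", "Thunder Bay", "Marquette", "Traverse City",
          "Green Bay", "Milwaukee", "Chicago", "Gary", "Toledo",
          "Cleveland", "Erie", "Buffalo"]
revenues = [48, 52, 41, 63, 55, 71, 89, 44, 38, 61, 47, 57]

# Sort stores from highest to lowest revenue so top and bottom
# performers are immediately visible
pairs = sorted(zip(stores, revenues), key=lambda p: p[1], reverse=True)
labels = [store for store, _ in pairs]
values = [rev for _, rev in pairs]

plt.figure(figsize=(10, 5))
plt.bar(labels, values)
plt.ylabel("Weekly revenue ($K)")
plt.title("LakeFront Retail Co.: weekly revenue by store (sorted)")
plt.xticks(rotation=45, ha="right")
plt.tight_layout()
plt.show()
```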
LakeFront's regional VP uses this sorted bar chart to quickly identify the top and bottom performers. The visualization instantly reveals that Chicago is a clear outlier at the top, with revenue more than double that of Toledo. The chart also shows a natural grouping: three stores above $60K (Chicago, Milwaukee, Traverse City), six stores between $44K and $57K forming the core, and three stores below $48K that may need attention.
Individual summary statistics are useful, but they become powerful when combined into a complete descriptive profile. Reporting only the mean without a measure of spread can be misleading — it hides how much variation exists in the data. Similarly, reporting only the range tells you about the extremes but nothing about the center.
A best-practice descriptive summary includes:

- the sample size,
- a measure of center (mean and median),
- a measure of spread (standard deviation, and often the range or IQR),
- the minimum and maximum values, and
- a supporting visualization.
Let us compile LakeFront's complete summary dashboard.

| Statistic | Value |
|---|---|
| Number of stores (n) | 12 |
| Mean weekly revenue | $55.5K |
| Median weekly revenue | $53.5K |
| Standard deviation | $14.3K |
| Minimum (Toledo) | $38K |
| Maximum (Chicago) | $89K |
| Range | $51K |
This dashboard tells a complete story: LakeFront's typical store earns mid-$50Ks weekly, but there is meaningful variation ($14.3K std dev) across the chain. The gap between mean and median suggests right skew driven by top performers like Chicago. Management should investigate both the high performers (to replicate success) and the lowest performers (to identify improvement opportunities).
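For analysts who prefer to script the profile rather than build it by hand, a short sketch along these lines could work. Note that the exact IQR figure depends on which quartile convention is used; the "inclusive" method passed to quantiles() here is an assumption, not a requirement of the chapter.

```python
from statistics import mean, median, stdev, quantiles

revenues = [48, 52, 41, 63, 55, 71, 89, 44, 38, 61, 47, 57]

# Quartiles; the "inclusive" method is one of several common conventions
q1, _, q3 = quantiles(revenues, n=4, method="inclusive")

profile = {
    "Number of stores (n)": len(revenues),          # 12
    "Mean ($K)": round(mean(revenues), 1),          # 55.5
    "Median ($K)": round(median(revenues), 1),      # 53.5
    "Std dev ($K)": round(stdev(revenues), 1),      # 14.3
    "Min ($K)": min(revenues),                      # 38 (Toledo)
    "Max ($K)": max(revenues),                      # 89 (Chicago)
    "Range ($K)": max(revenues) - min(revenues),    # 51
    "IQR ($K)": round(q3 - q1, 2),                  # depends on quartile method
}

for label, value in profile.items():
    print(f"{label:>20}: {value}")
```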
Always report both a measure of center and a measure of spread. The mean without context is incomplete — you need standard deviation or IQR to understand how representative that mean actually is. A complete descriptive profile, paired with a well-chosen visualization, gives stakeholders the information they need to make sound decisions.
In this chapter, we covered the foundational tools of descriptive statistics — the methods that transform raw data into meaningful summaries. Here is what you should take away:
Central Tendency: The mean, median, and mode each capture a different sense of the “center.” The mean is comprehensive but sensitive to outliers; the median is robust to skewed data; the mode is best for categorical variables.
Spread: Range, variance, and standard deviation quantify how much data values differ from one another. Standard deviation is the most commonly reported measure because it is in the same units as the data.
Visualization: Charts translate numbers into visual patterns. Choose the chart type based on what you want to communicate — comparisons (bar chart), distributions (histogram), or relationships (scatter plot).
The Big Picture: Descriptive statistics are not an end in themselves — they are the essential first step that guides every subsequent analysis. A thorough descriptive profile combines center, spread, and visualization to tell a complete, honest story about your data.