The Central Limit Theorem: Statistics Applied

This article is an excerpt from the Shortform book guide to "Naked Statistics" by Charles Wheelan.

What is the central limit theorem in statistics? What can the central limit theorem tell us about the distribution of the sample mean?

The central limit theorem states that the means of large, properly drawn samples will be approximately normally distributed around the mean of the larger population, so the mean of any one such sample will usually be close to the population mean. Therefore, we can confidently make inferences about a population from a sample or about a sample from a population, and we can compare samples to each other.
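To see this in action, here is a minimal simulation sketch. The population is invented for illustration (a skewed set of 20,000 values), and the sample size and number of samples are arbitrary choices. The idea is that even though the population is lopsided, the means of repeated samples should cluster around the population mean, with a spread close to the population standard deviation divided by the square root of the sample size.

```python
import random
import statistics

def sample_means(population, sample_size, num_samples, seed=0):
    """Draw repeated random samples and return the mean of each one."""
    rng = random.Random(seed)
    return [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(num_samples)
    ]

# Hypothetical, deliberately skewed population (exponential, mean near 50).
rng = random.Random(42)
population = [rng.expovariate(1 / 50) for _ in range(20_000)]

means = sample_means(population, sample_size=100, num_samples=1_000)

pop_mean = statistics.mean(population)
pop_sd = statistics.pstdev(population)

print(f"population mean:          {pop_mean:.2f}")
print(f"mean of sample means:     {statistics.mean(means):.2f}")
print(f"spread of sample means:   {statistics.stdev(means):.2f}")
print(f"predicted standard error: {pop_sd / 100 ** 0.5:.2f}")
```

If you plot a histogram of `means`, it should look roughly bell-shaped even though the population itself is not.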

Let’s explore each of these general applications of inferential statistics with an example.

Making Inferences About a Population From a Sample

The central limit theorem in statistics allows us to ask research questions that would otherwise be unanswerable.

For example, many people believe that participating in gymnastics at a young age stunts girls’ growth. Does it? You can’t possibly collect data on every female gymnast to find out. Still, with the central limit theorem and inferential statistics, you can use a sample of gymnasts to test the null hypothesis: “There is no difference in height as an adult between female gymnasts and non-gymnasts.”
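As a rough sketch of what such a test could look like, the code below compares the mean heights of two samples using a normal approximation, which the central limit theorem makes reasonable for large samples. The function name `two_sample_z_test` and the heights are invented for illustration. Both groups are simulated from the same made-up distribution, so the output says nothing about real gymnasts.

```python
import random
from statistics import NormalDist, mean, stdev

def two_sample_z_test(sample_a, sample_b):
    """Return (difference in means, z statistic, two-sided p-value).

    Relies on the normal approximation, so it suits large samples only.
    """
    diff = mean(sample_a) - mean(sample_b)
    # Standard error of the difference between two independent sample means.
    std_error = (
        stdev(sample_a) ** 2 / len(sample_a)
        + stdev(sample_b) ** 2 / len(sample_b)
    ) ** 0.5
    z = diff / std_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return diff, z, p_value

# Invented adult heights in centimeters, NOT real measurements.
rng = random.Random(1)
gymnasts = [rng.gauss(162.0, 6.5) for _ in range(200)]
non_gymnasts = [rng.gauss(162.0, 6.5) for _ in range(200)]

diff, z, p_value = two_sample_z_test(gymnasts, non_gymnasts)
print(f"difference in means: {diff:.2f} cm, z = {z:.2f}, p = {p_value:.3f}")

# 0.05 is a conventional threshold, chosen here only for illustration.
if p_value < 0.05:
    print("Reject the null hypothesis of no height difference.")
else:
    print("Fail to reject the null hypothesis of no height difference.")
```

A small p-value would mean that a difference this large would be unlikely if the null hypothesis were true. It would not, by itself, show that gymnastics caused the difference.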

Explicit Sampling Methods

Describing how samples were collected is an important part of making a study replicable, and replicability is a feature of high-quality research. If a study is replicable, other researchers can verify results, and trust in the study increases. In contrast, no one can verify the results when a study isn’t replicable, and the study becomes less trustworthy. Replication is why scientific papers include a materials and methods section.

For example, as an ornithologist reading a study about local bird populations, you might raise an eyebrow if the study reports a much higher population density for a particular species than you’d estimate for your region. Without an explicit description of sampling techniques, you’d have no idea how the researchers arrived at those numbers. Did they count the number of nests in a given quadrant? Did they sit in one spot for a given amount of time and count birds? Did they play bird sounds and tally responses? Without context for their sampling methods, the researchers’ results might become meaningless to your critical eye.

Making Inferences About a Sample From a Population

Since we know that properly drawn samples tend to look like their underlying population, we can also use the central limit theorem to make inferences about the composition of a sample taken from a given population. For example, say the organizers of a fun-run want to know how long they should give participants to finish their two-mile course. They can’t know the exact pace of each participant who will show up at the race, but they can use average running paces from the general population to assume that the majority of participants will finish the course between the 20- and 30-minute marks.
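The organizers’ reasoning can be sketched in a few lines. The figures below are assumptions made up for illustration: a population average of 25 minutes for two miles, a standard deviation of 5 minutes, and finishing times that are roughly normally distributed. The helper `sample_mean_range` is likewise hypothetical.

```python
from statistics import NormalDist

# Hypothetical population figures for a two-mile run, in minutes.
POP_MEAN = 25.0
POP_SD = 5.0

# Assumes individual finishing times are roughly normally distributed.
finish_times = NormalDist(POP_MEAN, POP_SD)
share_20_to_30 = finish_times.cdf(30) - finish_times.cdf(20)
print(f"share expected to finish in 20-30 minutes: {share_20_to_30:.0%}")

def sample_mean_range(n, confidence=0.95):
    """Range expected to contain the average finish time of n random runners."""
    std_error = POP_SD / n ** 0.5  # standard error shrinks as n grows
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return POP_MEAN - z * std_error, POP_MEAN + z * std_error

low, high = sample_mean_range(100)
print(f"average finish time of 100 runners: {low:.1f} to {high:.1f} minutes")
```

Under these assumptions, a bit more than two-thirds of individual runners fall in the 20-to-30-minute window, while the average time of a 100-person field is pinned down much more tightly. This only holds if the people who show up resemble a random draw from the population. A race that mostly attracts competitive runners would not.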

Marketing and Data

A common example of using population-level data to make inferences about samples of people is targeted marketing, and marketing aimed at women is one such case. Data shows that female consumers make up to 80% of all purchasing decisions and that women make the plurality of couples’ decisions. Therefore, a great deal of current marketing targets women rather than men, on the logic that ads reaching a sample of female shoppers will yield higher returns than ads reaching a sample of male shoppers.
