This article is an excerpt from the Shortform book guide to "Naked Statistics" by Charles Wheelan.

What is central tendency in statistics? What are the different ways to measure central tendency?

Central tendency is a descriptive statistic that represents the middle of a data set. There are three main statistical measures of central tendency: the mean, median, and mode. Each of these measures describes a slightly different central position within a data set.

Let’s examine the two measures Wheelan focuses on: the mean and the median.

## Central Tendency: The “Middle” of a Data Set

Some of the most basic descriptive statistics are measures of central tendency, or what Wheelan refers to as the “middle” of a data set.

We talk about averages, one measure of central tendency, all the time. But as we’ll see, there are two main ways to communicate the midpoint of a data set: the mean (what we usually refer to as the average) and the median. As statistics students, we should understand the difference between the two and when to use one over the other.

### The Mean (Average)

The average, or mean, of a data set is the sum of all of the values in the data set divided by the number of data points.

For example, if you wanted to know the average number of cookies you eat each time you open a package, you would record the number of cookies you eat at each sitting, add those numbers up, and divide the total by the number of cookie-eating events.

The sum of the values in your data set: 15 + 8 + 6 + 10 + 9 = 48

Divided by the number of data points (5): 48 / 5 = 9.6

You average 9.6 cookies per sitting.
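The same arithmetic can be sketched in a few lines of Python. The cookie counts are the ones from the example above; the names `cookies_per_sitting` and `mean` are illustrative choices, not something from Wheelan’s book.

```python
# Cookies eaten at each of five sittings (figures from the example above).
cookies_per_sitting = [15, 8, 6, 10, 9]


def mean(values):
    """Return the arithmetic mean: the sum of the values divided by their count."""
    if not values:
        raise ValueError("mean requires at least one data point")
    return sum(values) / len(values)


total = sum(cookies_per_sitting)       # 15 + 8 + 6 + 10 + 9
average = mean(cookies_per_sitting)    # total divided by 5 sittings

print(f"Total cookies: {total}")
print(f"Sittings: {len(cookies_per_sitting)}")
print(f"Mean per sitting: {average}")
```

Python’s standard library also offers `statistics.mean`, which gives the same result for this data.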

#### Limitations of Using the Mean

Wheelan cautions that the mean can be a misleading figure because it doesn’t convey the influence of outliers in a data set. (An outlier is a data point that is numerically far from others in the same data set.) In other words, a few “extreme” pieces of data can skew the mean in either direction, giving us a warped sense of the average.

For example, a store manager may report that her monthly sales of Easter egg chocolates averaged roughly \$300 over the last year. However, her monthly sales data shows that she sold \$3,000 worth of chocolate eggs in April, while sales in each of the other 11 months were between zero and \$25. In this data set, the month of April is an outlier, and the mean of roughly \$300 doesn’t give an accurate picture of typical chocolate egg sales for the store.

### The Median

The median is another way to measure central tendency and, unlike the mean, it is largely unaffected by outliers. To find it, we put the data set in ascending order and locate the point that divides it in half. The median is the middle value of the ordered data set (or the average of the two middle values if the data set has an even number of data points).
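As a rough sketch of that procedure, the function below sorts the data, then picks the middle value or averages the two middle values. The odd-length list reuses the cookie counts from the mean example; the even-length list is a made-up set of numbers included only to exercise the second branch.

```python
def median(values):
    """Return the middle value of the data, or the average of the two
    middle values when the number of data points is even."""
    if not values:
        raise ValueError("median requires at least one data point")
    ordered = sorted(values)          # ascending order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]           # odd count: single middle value
    return (ordered[mid - 1] + ordered[mid]) / 2  # even count: average of two


# Cookie counts from the mean example; sorted: 6, 8, 9, 10, 15
odd_example = [15, 8, 6, 10, 9]

# Hypothetical six-value data set; sorted: 2, 6, 8, 9, 10, 15
even_example = [8, 6, 10, 15, 9, 2]

print(median(odd_example))
print(median(even_example))
```

For everyday use, `statistics.median` in the standard library follows the same rule.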

Back to our chocolate eggs example, our ordered data set might look like this:

Because there are 12 monthly figures, an even number, we calculate the median by taking the average of the two middle values, \$4 and \$10, which is \$7. So the median chocolate egg sales figure is \$7, which is a very different figure from the mean of roughly \$300, even though both are measures of central tendency.
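To see how one outlier pulls the mean but not the median, here is a small sketch using Python’s `statistics` module. The monthly figures are hypothetical, not Wheelan’s or Shortform’s data. They were chosen only to fit the description above: April at \$3,000, every other month between \$0 and \$25, and middle values of \$4 and \$10. The mean it prints will therefore not match the article’s figure exactly.

```python
from statistics import mean, median

# HYPOTHETICAL monthly sales in dollars, January through December.
# Only the April figure and the two middle values (4 and 10) come from the text.
monthly_sales = [0, 2, 12, 3000, 25, 10, 4, 1, 0, 3, 15, 20]

# The same year with the April outlier left out (index 3 is April).
without_april = monthly_sales[:3] + monthly_sales[4:]

print(f"Mean with April:      {mean(monthly_sales):.2f}")
print(f"Median with April:    {median(monthly_sales):.2f}")
print(f"Mean without April:   {mean(without_april):.2f}")
print(f"Median without April: {median(without_april):.2f}")
```

Under these assumed figures, dropping April changes the mean by a couple of hundred dollars, while the median moves by only a few dollars.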

### Communicating Central Tendency

Since the mean and median can communicate different messages about the “middle” of a dataset, it’s important to keep the difference between them in mind when communicating and interpreting statistics. Wheelan explains that it’s common for people to share the mean instead of the median, or vice versa, to suit their goals.

For example, say the beach authorities at a fictional beach were collecting data on the number of jellyfish stings swimmers suffered each week throughout the summer. The data might look something like this:

(Shortform note: In this example, the dataset is naturally ordered, so we don’t need to order it to determine the median.)

The mean number of jellyfish stings is 42. The median number of stings is zero. Beach authorities could either say:

A) “Visit our beach! The mean number of weekly stings per 500 swimmers throughout the summer is only 42!”

or

B) “Visit our beach! The median number of weekly stings throughout the summer is zero!”

Neither of these statements is incorrect, but they convey different messages to prospective swimmers. The beach authorities are sure to advertise option B over option A because option B makes the beach look more attractive.
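A data set can easily have a mean of 42 and a median of zero at the same time. The sketch below shows one way that can happen. The weekly counts are hypothetical, invented only so that the mean and median match the figures quoted above; the ten-week season length is also an assumption.

```python
from statistics import mean, median

# HYPOTHETICAL weekly jellyfish stings over an assumed ten-week summer.
# Most weeks have no stings; the last few weeks account for all of them.
weekly_stings = [0, 0, 0, 0, 0, 0, 0, 20, 150, 250]

sting_free_weeks = sum(1 for stings in weekly_stings if stings == 0)

print(f"Mean weekly stings:   {mean(weekly_stings)}")
print(f"Median weekly stings: {median(weekly_stings)}")
print(f"Weeks with no stings: {sting_free_weeks} of {len(weekly_stings)}")
```

Both summary figures are accurate for this list, yet neither reveals that all of the stings fall in the final three weeks.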

This example highlights two of Wheelan’s cautions about descriptive statistics.

First, neither the mean nor the median tells prospective visitors the “story” behind the dataset, which suggests a “jellyfish season” at the beach that might be worth planning a visit around. But again, when we condense the real world into a statistic, this nuance is lost.

Second, this example showcases how statistics make it possible to mislead people without actually lying. Many readers would likely read option B and interpret it as reassurance that they can visit the beach at any time and are highly unlikely to get stung by a jellyfish. As we can see, in mid-September, this is simply not true.


#### Darya Sinusoid
