This article is an excerpt from the Shortform book guide to "How to Lie With Statistics" by Darrell Huff. Shortform has the world's best summaries and analyses of books you should be reading.
How do liars manipulate statistics to make them more favorable? How can hard numbers possibly be skewed?
Manipulating statistics isn’t as difficult as one may think. Although statistics are hard numbers and lying about them isn’t legal, that doesn’t mean they can’t be skewed or framed in a way to make the presenter look better.
Keep reading for the top eight ways that statistics are manipulated.
How Liars Manipulate Statistics
When searching for the truth, statistics are appealing—they seem like hard, believable numbers, and they’re necessary for expressing certain information, such as census data. Manipulating statistics seems impossible, right? Wrong.
Statistics aren’t as objective as they seem. People with an agenda (advertisers, companies, anyone with something to sell) often don’t even have to lie outright. Statistics is a flexible enough field that would-be liars can make their case with implications, omissions, and distraction, rather than outright falsehoods.
Here are the top eight techniques that anybody with an agenda can use to manipulate statistics in their favor:
Technique #1: Citing Misleading “Averages”
The first technique is using the word “average” without specifying what kind of average a figure represents. Each kind is calculated differently and gives different information (and a different impression) about the data:
Average Type #1: Mean. This number is the result of adding up all the values in the sample and then dividing by the number of values.
This is a useful average for liars to use because it allows them to:
- Make the number look bigger and better. If a university wants to attract students, the larger the average income of its graduates, the more attractive it looks to prospective students. Even if there are just a few high salaries, the math will make the mean look higher than any of the other averages.
- Hide inequality. If 90 employees at a company are paid $20,000 a year and the boss is paid $200,000, the mean is ((90*20,000)+(1*200,000))/91=21,978. The mean doesn’t show that one person is paid a lot more than everyone else.
In turn, hiding that they’re using the mean, by simply using the more general “average” to describe the figure, benefits liars by obscuring the fact that they’re using such an unreliable calculation.
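The pay example above can be checked in a few lines. This is only a sketch: the median is brought in here as one of the other kinds of average, purely for comparison, and the variable names are illustrative.

```python
from statistics import mean, median

# 90 employees at $20,000 plus one boss at $200,000 (figures from the example above)
salaries = [20_000] * 90 + [200_000]

mean_pay = mean(salaries)      # pulled upward by the single large salary
median_pay = median(salaries)  # the middle value once the salaries are sorted

print(f"mean:   ${mean_pay:,.0f}")
print(f"median: ${median_pay:,.0f}")
```

The mean comes out near $21,978 even though 90 of the 91 people earn $20,000, which is what the median reports.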
Technique #2: Giving Precise Figures to Appear More Reputable
Another number-fudging technique is to include a decimal to make a figure look more precise and therefore reputable. (For example, reading that most people sleep 7.84 hours a night sounds a lot more impressive than “about eight hours.”)
Liars can get decimals by doing calculations (for example, calculating the mean) on inexact figures that weren’t measured to the decimal point.
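As a rough illustration of how this happens, the sketch below averages some made-up survey answers that were only ever given to the nearest hour. The data and variable names are hypothetical, not figures from the book.

```python
from statistics import mean

# Hypothetical survey answers, each given only to the nearest whole hour
reported_hours = [8, 7, 9, 8, 7, 8, 8]

raw_mean = mean(reported_hours)   # the division produces a long string of decimals
impressive = round(raw_mean, 2)   # looks precise, but the inputs never were
honest = round(raw_mean)          # matches the precision of the original answers

print(f"impressive-sounding figure: {impressive} hours")
print(f"figure the data supports:   about {honest} hours")
```

The two decimal places come from the arithmetic, not from any extra care in measuring.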
Technique #3: Using Percentages to Hide Numbers and Calculations
Like decimals, giving percentages instead of raw figures can make numbers look more precise and reputable than they really are.
Here are some additional ways liars manipulate percentages and their associated terms for their gain:
1. Hiding raw numbers and small sample sizes. Percentages don’t give any indication of the absolute value of raw figures, so liars can use them to mask unfavorable numbers or suspiciously small sample sizes.
2. Using different bases. Because percentages don’t give any indication of the raw figures (bases) used to calculate them, liars can compare percentages calculated off different bases to distort their results.
Liars can also combine percentages and averages while manipulating bases to mask the real data even more. For example, if milk has gone down from $2 a pint to $1, but bread has gone up from $1 to $2, liars can massage percentage math and choose different bases to prove the cost of living has gone up or down, depending on their agenda. To show costs went up, they can decide that last year’s prices were the base (100%). Milk’s price has halved (50%) and bread’s price has doubled (200%). The average of 50% and 200% is 125%, so prices have increased by 25% since last year.
To show costs went down, they can decide that this year is the base year (100%). With this base, milk used to cost 200% of today’s price and bread used to cost 50% of it. The average is again 125%, but since the base is now this year, the figure describes last year’s prices: They appear to have been 25% higher, so the liar can claim the cost of living has fallen.
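3. Adding up percentages. Percentages calculated from different bases can’t be meaningfully added or subtracted. For example, if labor costs and material costs each rise 10%, the total of the two rises 10%, not 20%.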
4. Giving percentage points instead of percentages to confuse people. Percentage points are the difference between two percentages. For instance, the difference between 5% and 7% is two percentage points. If a liar doesn’t want to report how much money her company made, and her return on investment was 3% last year and 6% this year, she might say “return on investment rose three percentage points.” A three-point increase sounds much smaller than a doubling, even though they mean the same thing in this case.
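Two of the percentage tricks above, switching bases and quoting percentage points, can be checked with a little arithmetic. The sketch below uses the milk, bread, and return-on-investment figures from the examples; the function and variable names are illustrative assumptions.

```python
last_year = {"milk": 2.00, "bread": 1.00}  # prices from the example above
this_year = {"milk": 1.00, "bread": 2.00}

def average_index(prices, base):
    """Average of each item's price as a percentage of its base-period price."""
    ratios = [prices[item] / base[item] * 100 for item in prices]
    return sum(ratios) / len(ratios)

# Base = last year: this year's prices average 125% of the base ("costs went up")
index_base_last = average_index(this_year, base=last_year)
# Base = this year: last year's prices average 125% of the base ("costs went down")
index_base_this = average_index(last_year, base=this_year)

# The cost of one pint of milk plus one loaf of bread in each year
basket_last = sum(last_year.values())
basket_this = sum(this_year.values())

# Percentage points versus percent change for a return that goes from 3% to 6%
roi_last, roi_this = 3.0, 6.0
point_change = roi_this - roi_last                        # in percentage points
relative_change = (roi_this - roi_last) / roi_last * 100  # in percent

print(index_base_last, index_base_this)
print(basket_last, basket_this)
print(point_change, relative_change)
```

Both base choices produce 125%, while the cost of buying one of each item is $3 in both years. The same return is a rise of three percentage points and a 100% increase.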
Technique #4: Using the Most Favorable Form
The fourth technique is to report numbers in whatever form most exaggerates or minimizes them, whichever will further a liar’s agenda. For example, return on sales, return on investment, and increase or decrease in profits are all ways of reporting how much money a company made. Most people won’t realize that each type of measure tells only part of the story. For example, if you buy a stock every morning for $99 and sell it in the afternoon for $100, you’re making only 1% on total sales, which doesn’t sound like a great return. However, over 30 days, you’re making about 30% on total money invested, a much better-sounding prospect.
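The stock example can be worked through as follows. This is a minimal sketch that assumes the same $99 is reinvested each day and ignores fees; the variable names are illustrative.

```python
buy_price = 99.0
sell_price = 100.0
trading_days = 30

profit = (sell_price - buy_price) * trading_days  # $1 a day for 30 days
total_sales = sell_price * trading_days           # $100 of sales each day
money_invested = buy_price                        # assumption: the same $99 is reused daily

return_on_sales = profit / total_sales * 100
return_on_investment = profit / money_invested * 100

print(f"return on sales:      {return_on_sales:.1f}%")
print(f"return on investment: {return_on_investment:.1f}%")
```

The same $30 of profit is about 1% of sales and about 30% of the money invested, so the choice of measure decides how the result sounds.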
Technique #5: Omitting Statistical Qualifiers
The fifth technique is to leave out information that puts caveats on a figure’s accuracy or further explains it. There are four types of information liars often neglect to include with their figures:
1. Probable error. Probable error is a measure of how reliable a figure is, expressed as a range that the true result will fall between. (It’s impossible to find the single number that represents the true result because measuring systems aren’t perfectly accurate.) If you’re presented with a single figure, and aren’t given any indication of how reliable it is or what the probable error is, it may not be accurate at all.
2. Degree of significance. The degree of significance is a measure of how likely it is that results are due to chance. In most cases, for a figure to be statistically significant, the degree needs to be no more than 5%. This means that if chance alone were at work, a result this strong would turn up no more than 5 times out of 100. If the degree isn’t given, it may be higher than 5%, which means chance alone could plausibly explain the results.
3. What the comparison is to. Some stats promise to “triple” the effectiveness of a product, or offer “25% more,” but don’t say what they’re compared against. A granola bar that contains 25% more protein than a competitor’s, versus a bar that contains 25% more protein than a rock, are two entirely different things.
4. Negligibility. While there may be mathematical differences between figures, sometimes, these differences are so small they don’t make any practical difference—but liars fail to point this out.
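To see why a missing probable error matters, the sketch below turns two single figures into ranges and checks whether they overlap. The scores and the probable error of 3 are hypothetical, and the helper names are made up for illustration.

```python
def probable_range(figure, probable_error):
    """Return the (low, high) range implied by a figure and its probable error."""
    return (figure - probable_error, figure + probable_error)

def ranges_overlap(a, b):
    """True if two (low, high) ranges share at least one value."""
    return a[0] <= b[1] and b[0] <= a[1]

# Hypothetical test scores, each reported with a probable error of 3 points
score_a = probable_range(98, 3)
score_b = probable_range(101, 3)

overlap = ranges_overlap(score_a, score_b)
print(score_a, score_b, "overlap" if overlap else "no overlap")
```

Reported as bare numbers, 101 looks higher than 98. Reported as ranges, the two figures overlap, so the difference may not be real.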
Technique #6: Citing Semi-Related Figures
If liars can’t prove something, sometimes, they’ll prove something else that sounds like it’s the same as what they were trying to prove.
In some cases, the semi-related figure can actually give a more accurate picture of the situation than the direct figure.
Technique #7: Treating Correlation as Causation
The next technique involves pushing the idea that if there’s a relationship between two factors, one of them caused the other, and whichever factor is most favorable to a liar’s argument is the cause.
- For example, one study found that smokers got lower grades at college. An anti-smoking activist with an agenda might report this as “If you stop smoking, your grades will improve.”
This is misleading because:
1. It’s often impossible to know which factor is the cause and which is the effect.
- For example, people struggling with the stress of bad grades could be driven to smoking for relief: In other words, bad grades could be the cause of smoking, not the effect of it.
Sometimes, the two factors are so interrelated they both act as both cause and effect.
2. Both factors may be effects of some other cause. While the relationship between the factors is real, the cause-and-effect is uncertain.
- For example, maybe people who like socializing more than studying are both more likely to smoke and more likely to get low grades.
3. The relationship between the two may be only due to chance.
4. Even if there is a real cause-and-effect relationship, that doesn’t mean it applies to everyone. Correlations are tendencies.
- For example, while it’s fairly conclusive that people who get a post-secondary education have higher incomes than those who don’t, that doesn’t mean that you will make more money if you go to college than if you don’t. (It’s also unknown if the people who make more money and also went to college might still have higher salaries even if they hadn’t gone to college—college attracts bright, rich people who already had a better chance at making more money.)
5. Correlations can be produced by changing habits and trends over time, rather than by the factor you think they’re caused by.
- For example, older women tend to walk with their toes farther apart than younger women. This is because posture trends changed over the years, not because women’s posture necessarily changes as they age (which is what some people may assume).
An advanced version of this technique is to presume that the correlation extends beyond the data. For example, a study might find more rain results in better crops, and someone might assume that this correlation holds in all circumstances: in other words, that more and more rain always results in better crops. However, that relationship doesn’t hold forever. If the rain is so heavy it causes floods, the crops will suffer.
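The second point in the list above, a hidden common cause, can be illustrated with a small simulation. Everything in it is a made-up assumption: a "socializing" score drives both the chance of smoking and the grade, and smoking itself never enters the grade calculation.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the run is repeatable

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

smokes, grades = [], []
for _ in range(2000):
    socializing = random.random()  # hidden common cause, between 0 and 1
    # More socializing makes smoking more likely
    smokes.append(1 if random.random() < socializing else 0)
    # The grade depends on socializing and noise only; smoking is never used
    grades.append(90 - 30 * socializing + random.gauss(0, 5))

r = pearson(smokes, grades)
print(f"correlation between smoking and grades: {r:.2f}")
```

The correlation comes out clearly negative even though, by construction, quitting smoking would change nobody’s grade in this toy model.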
Technique #8: Manipulating Graphics
The final technique is to present figures in graphics that distort them. Liars can:
1. Truncate the graph. To make changes look larger than they are, liars remove the empty space on a graph so that the part the data occupies is the only part shown. This will make the slope of a line look steeper, or the difference between bars look greater.
2. Add more divisions to the y-axis. Like truncation, this will visibly amplify the differences between measures.
3. Leave the graph labels and numbers out. To be meaningful, diagrams and graphs need labels and numbers; otherwise, it’s impossible to know what they show.
4. In bar graphs, use illustrations instead of bars. In a bar chart, the height of the bar is what indicates the measurement. When you replace a bar with an illustration (say, a bag of money) and increase its height, all the other dimensions scale proportionally. Increasing the width and depth (if 3-D) of the image makes the differences between the two images look much larger.
5. In before-and-after photos, change multiple things about the subject after the before picture has been taken to make the change look more significant than it really is.
6. In maps, use the many variables to create visual illusions. Since maps include many features (legends, border, different-sized regions, and so on), they’re excellent tools for misdirection.
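The effect of truncating an axis or swapping bars for pictures can be put into numbers. The sketch below uses hypothetical values (100 last year, 110 this year) and an illustrative helper; it computes what the eye compares rather than drawing a chart.

```python
def visible_height_ratio(small, large, axis_start):
    """How many times taller the larger bar is drawn when the axis starts at axis_start."""
    return (large - axis_start) / (small - axis_start)

last_year, this_year = 100, 110  # hypothetical values: a 10% rise

full_axis = visible_height_ratio(last_year, this_year, axis_start=0)
truncated = visible_height_ratio(last_year, this_year, axis_start=95)

# A picture scaled to twice the height also doubles in width (and in depth if 3-D)
height_scale = 2
area_scale = height_scale ** 2    # what a flat picture appears to show
volume_scale = height_scale ** 3  # what a 3-D picture appears to show

print(f"bar drawn {full_axis:.1f}x taller with the axis starting at 0")
print(f"bar drawn {truncated:.1f}x taller with the axis starting at 95")
print(f"2x height becomes {area_scale}x area, or {volume_scale}x volume")
```

With the axis starting at 95, a 10% rise is drawn as a bar three times as tall. A moneybag drawn twice as tall covers four times the area and suggests eight times the volume.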