What Is Selection Bias in Research?—Explained

This article is an excerpt from the Shortform book guide to "Naked Statistics" by Charles Wheelan.


What is selection bias in research methodology? How does selection bias affect research findings?

Selection bias occurs when individuals chosen to partake in a study are not representative of the population of interest. Selection bias can be subtle—if researchers aren’t cognizant of selection bias when developing data collection methods, the fact that a sample is not truly random might go unnoticed.

Here’s why it’s important to watch out for selection bias when collecting data for the purposes of research.

Selection Bias

What is selection bias in research? Selection bias happens when a sample is not random, so certain subsets of the population are over- or underrepresented. Because the bias can be subtle, researchers who don't account for it when designing their data collection methods may never notice that their sample isn't truly random.

For example: Say you wanted to collect data on people’s political leanings before an election, and you decided to collect your data at an art show outside of town. You might think that your sample was random because the art show was a public event, the crowd was a mix of people from different parts of town, and people of all ages were represented. However, it’s likely that your data would be biased towards the opinions of wealthier residents because the people at the art show can afford the cars they used to drive out of town and the art for sale.
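To make the art show example concrete, here is a minimal Python sketch. Every number in it is an assumption chosen for illustration, not a figure from Wheelan's book: 20% of residents are wealthy, wealthy residents support a hypothetical Candidate A at 70% versus 40% for everyone else, and wealthy residents are five times as likely to be at the art show. The function name simulate_poll is likewise made up for this example.

```python
import random


def simulate_poll(seed=0, population_size=50_000, sample_size=1_000):
    """Compare a simple random sample with a sample drawn at the art show."""
    rng = random.Random(seed)

    # Build a hypothetical town: each resident is (is_wealthy, supports_a).
    population = []
    for _ in range(population_size):
        wealthy = rng.random() < 0.20                             # assumed: 20% wealthy
        supports_a = rng.random() < (0.70 if wealthy else 0.40)   # assumed support rates
        population.append((wealthy, supports_a))

    # The quantity we actually want to estimate.
    true_share = sum(s for _, s in population) / population_size

    # Simple random sample: every resident is equally likely to be polled.
    random_sample = rng.sample(population, sample_size)
    random_estimate = sum(s for _, s in random_sample) / sample_size

    # Art show sample: wealthy residents are assumed 5x as likely to be polled.
    # (rng.choices draws with replacement, which is fine for a sketch this size.)
    weights = [5.0 if wealthy else 1.0 for wealthy, _ in population]
    art_show_sample = rng.choices(population, weights=weights, k=sample_size)
    art_show_estimate = sum(s for _, s in art_show_sample) / sample_size

    return true_share, random_estimate, art_show_estimate


if __name__ == "__main__":
    true_share, random_estimate, art_show_estimate = simulate_poll()
    print(f"True support:        {true_share:.1%}")
    print(f"Random sample:       {random_estimate:.1%}")
    print(f"Art show sample:     {art_show_estimate:.1%}")
```

Under these assumptions, true support sits near 46%, and the random sample should land close to it. The art show sample should come out around 57%, because more than half of the people polled there are wealthy even though wealthy residents make up only a fifth of the town.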

Selection bias can also happen when people are able to self-select into (or out of) a study. When the sample consists of whoever feels strongly enough to participate, the data is skewed from the start. For example, if you were to stand on the sidewalk with a banner promoting a local dog park and ask “random” passersby to take your survey, dog lovers strongly in favor of a dog park would likely be overrepresented in your results, since they are the ones most likely to take the time to come over.
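The arithmetic behind self-selection can be sketched in a few lines of Python. The response rates below are assumptions invented for the dog park example, and observed_support is a hypothetical helper name, not something from the book.

```python
def observed_support(true_support, p_respond_supporter, p_respond_other):
    """Expected share of supporters among people who choose to respond.

    true_support:         fraction of the whole population in favor
    p_respond_supporter:  chance that a supporter stops to take the survey
    p_respond_other:      chance that anyone else stops to take the survey
    """
    responding_supporters = true_support * p_respond_supporter
    responding_others = (1 - true_support) * p_respond_other
    return responding_supporters / (responding_supporters + responding_others)


if __name__ == "__main__":
    # Assumed: 35% of residents favor the dog park; 30% of supporters stop
    # at the banner, but only 5% of everyone else does.
    share = observed_support(0.35, 0.30, 0.05)
    print(f"Support among respondents: {share:.0%}")
```

With these assumed rates, about 76% of respondents favor the dog park even though only 35% of residents do. If both groups responded at the same rate, the survey would recover the true 35%, which is why unequal willingness to participate is the source of the bias.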

The Modern Anti-Vax Movement

Wheelan references the anti-vax movement as an example of the detrimental effects of a lack of data literacy. But the study that sparked the current anti-vax movement is also an example of problematic selection bias.

Andrew Wakefield’s 1998 study published in The Lancet proposed an unfounded link between the MMR vaccine and autism, which promoted fear and suspicion among parents and largely precipitated the modern wave of anti-vaccination sentiment. Wakefield’s proposed link between vaccines and autism has been discredited in several research studies. However, the panic that his study caused remains.

In addition to Wakefield’s findings being discredited, his research methods have also drawn intense criticism. The study was problematic for several reasons, only two of which we’ll cover.

First, the sample size was 12 children; as Wheelan highlights throughout the book, large sample sizes increase the reliability of results. Second, the families of those 12 children were recruited into the study through an anti-MMR vaccine campaign. As Wheelan explains, in addition to being large, reliable samples should be random and representative. That rules out the self-selection of participants with a strong personal interest in the results of the study.
