This article is an excerpt from the Shortform book guide to "Everybody Lies" by Seth Stephens-Davidowitz.
What is the book Everybody Lies about? What should you take away from the book?
In Everybody Lies, Seth Stephens-Davidowitz argues that people willingly confess all of their secrets in their Google searches and other web activity. This information can be found through big data and can be used for the greater good.
Read below for a brief overview of the book Everybody Lies by Seth Stephens-Davidowitz.
Everybody Lies Book Overview
In his book Everybody Lies, Seth Stephens-Davidowitz explains big data’s potential to revolutionize social science research. The book’s central premise is that people reveal more about themselves when making web searches than they would ever reveal in public or in a traditional survey. Stephens-Davidowitz argues that by harnessing data from web searches and similar sources, scientists gain access to entirely new insights into issues like sexuality, racism, and health. He suggests that these insights can inform better social policies, improve institutions like education and health care, promote social equity, and bring hidden injustice to light.
Stephens-Davidowitz has a Ph.D. in economics and has worked as a data scientist at Google and as a contributor to The New York Times. Everybody Lies draws on his research using Google search data as well as data from PornHub, Wikipedia, and more. Though the book contains many surprising and interesting findings from this research, its real purpose is to explain the benefits and drawbacks of big data research. Stephens-Davidowitz says he hopes to inspire readers to enter the field of data science, much as Steven Levitt and Stephen J. Dubner’s Freakonomics inspired him to do the same.
Before we get into the specifics of how to use big data well, let’s look at the bigger picture of what big data is and why Stephens-Davidowitz says we should care about it. Though data science might seem arcane, Stephens-Davidowitz argues that it’s really an extension of our natural intuition and of the kinds of studies social scientists have always been interested in.
What Is “Big Data”?
Stephens-Davidowitz explicitly refuses to define “big data,” arguing that the only way to do so is by assigning an arbitrary numerical cutoff—in other words, by deciding that if you have at least X data points, you have big data. While it’s fair to say that big data is a somewhat fluid concept, other data experts accept that there are some key characteristics that we can use to pin down “big data.” Chief among these characteristics are the “three Vs”:
- Volume: The sheer amount of data. As you’d expect, big data implies very large data sets, typically measured in terabytes or petabytes. In refusing to define big data, Stephens-Davidowitz is mostly refusing to identify a specific volume at which data becomes big.
- Velocity: With big data, new information comes in fast. If your data set consists of Twitter posts, for example, you’re taking in an average of 6,000 new tweets per second. Note that not all of the data Stephens-Davidowitz uses is high velocity. Google search results are; historical census databases aren’t. But as we’ll see later in this guide, big data techniques can yield new insights into older data that has accumulated gradually over time.
- Variety: Big data comes in many forms (text, video, audio, and so on) and doesn’t fit neatly into a standardized database. Stephens-Davidowitz does talk about data variety, as we’ll see, though he doesn’t explicitly say that variety is one of the defining characteristics of big data.
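To put the velocity figure in perspective, here is a minimal Python sketch that scales the 6,000-tweets-per-second average quoted above to daily and yearly totals. It assumes the rate stays constant, which real traffic does not, and the constant and function names are illustrative rather than taken from the book.

```python
# Back-of-the-envelope arithmetic for the "velocity" example.
# Assumption: a constant average of 6,000 tweets per second.
TWEETS_PER_SECOND = 6_000
SECONDS_PER_DAY = 60 * 60 * 24  # 86,400 seconds
DAYS_PER_YEAR = 365


def tweets_per_day(rate_per_second: int) -> int:
    """Scale a per-second rate to a full day."""
    return rate_per_second * SECONDS_PER_DAY


def tweets_per_year(rate_per_second: int) -> int:
    """Scale a per-second rate to a 365-day year."""
    return tweets_per_day(rate_per_second) * DAYS_PER_YEAR


if __name__ == "__main__":
    print(f"Per day:  {tweets_per_day(TWEETS_PER_SECOND):,}")
    print(f"Per year: {tweets_per_year(TWEETS_PER_SECOND):,}")
```

At that rate, a single day produces roughly half a billion new data points, which is why volume and velocity tend to go hand in hand.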
How to Use Data Well
Despite big data’s potential advantages, Stephens-Davidowitz acknowledges that it’s easy to use it ineffectively, for example by obsessing over the sheer size of your dataset without thinking about what that data can actually do for you. To get the most out of big data, Stephens-Davidowitz says you should focus on its four main benefits: new types of information, unprecedented honesty, high resolution, and easy cause-effect analysis.
Data’s Drawbacks and Dangers
Even though Stephens-Davidowitz is openly enthusiastic about data studies, he’s aware that data has drawbacks and limitations and can lead to great harm if used unethically. Stephens-Davidowitz identifies two drawbacks (false corrections and data for data’s sake) and two dangers (exploitation and minority reports).