Everybody Lies: Big Data, New Data, and What the Internet Can Tell Us About Who We Really Are

After signing my book contract, I had a clear vision of how the book should be structured. Near the start, you may recall, I described a scene at my family’s Thanksgiving table. My family members debated my sanity and tried to figure out why I, at thirty-three, couldn’t seem to find the right girl.

The conclusion to this book, then, practically wrote itself. I would meet and marry the girl. Better still, I would use Big Data to meet the right girl. Perhaps I could weave in tidbits from the courting process throughout. Then the story would all come together in the conclusion, which would describe my wedding day and double as a love letter to my new wife.

Unfortunately, life didn’t match my vision. Locking myself in my apartment and avoiding the world while writing a book probably didn’t help my romantic life. And I, alas, still need to find a wife. More important, I needed a new conclusion.

I pored over many of my favorite books in trying to find what makes a great conclusion. The best conclusions, I concluded, bring to the surface an important point that has been there all along, hovering just beneath the surface. For this book, that big point is this: social science is becoming a real science. And this new, real science is poised to improve our lives.

In the beginning of Part II, I discussed Karl Popper’s critique of Sigmund Freud. Popper, I noted, didn’t think that Freud’s wacky vision of the world was scientific. But I didn’t mention something about Popper’s critique. It was actually far broader than just an attack on Freud. Popper didn’t think any social scientist was particularly scientific. Popper was simply unimpressed with the rigor of what these so-called scientists were doing.

What motivated Popper’s crusade? When he interacted with the best intellectuals of his day—the best physicists, the best historians, the best psychologists—Popper noted a striking difference. When the physicists talked, Popper believed in what they were doing. Sure, they sometimes made mistakes. Sure, they sometimes were fooled by their subconscious biases. But physicists were engaged in a process that was clearly finding deep truths about the world, culminating in Einstein’s Theory of Relativity. When the world’s most famous social scientists talked, in contrast, Popper thought he was listening to a bunch of gobbledygook.

Popper is hardly the only person to have made this distinction. Just about everybody agrees that physicists, biologists, and chemists are real scientists. They utilize rigorous experiments to find how the physical world works. In contrast, many people think that economists, sociologists, and psychologists are soft scientists who throw around meaningless jargon so they can get tenure.

To the extent this was ever true, the Big Data revolution has changed that. If Karl Popper were alive today and attended a presentation by Raj Chetty, Jesse Shapiro, Esther Duflo, or (humor me) myself, I strongly suspect he would not have the same reaction he had back then. To be honest, he might be more likely to question whether today’s great string theorists are truly scientific or just engaging in self-indulgent mental gymnastics.

If a violent movie comes to a city, does crime go up or down? If more people are exposed to an ad, do more people use the product? If a baseball team wins when a boy is twenty, will he be more likely to root for them when he’s forty? These are all clear questions with clear yes-or-no answers. And in the mountains of honest data, we can find them.

This is the stuff of science, not pseudoscience.

This does not mean the social science revolution will come in the form of simple, timeless laws.

Marvin Minsky, the late MIT scientist and one of the first to study the possibility of artificial intelligence, suggested that psychology got off track by trying to copy physics. Physics had success finding simple laws that held in all times and all places.

Human brains, Minsky suggested, may not be subject to such laws. The brain, instead, is likely a complex system of hacks—one part correcting mistakes in other parts. The economy and political system may be similarly complex.

For this reason, the social science revolution is unlikely to come in the form of neat formulas, such as E = mc². In fact, if someone is claiming a social science revolution based on a neat formula, you should be skeptical.

The revolution, instead, will come piecemeal, study by study, finding by finding. Slowly, we will get a better understanding of the complex systems of the human mind and society.


A proper conclusion sums up, but it also points the way to more things to come.

For this book, that’s easy. The datasets I have discussed herein are revolutionary, but they have barely been explored. There is so much more to be learned. Frankly, the overwhelming majority of academics have ignored the data explosion caused by the digital age. The world’s most famous sex researchers stick with the tried and true. They ask a few hundred subjects about their desires; they don’t ask sites like PornHub for their data. The world’s most famous linguists analyze individual texts; they largely ignore the patterns revealed in billions of books. The methodologies taught to graduate students in psychology, political science, and sociology have been, for the most part, untouched by the digital revolution. The broad, mostly unexplored terrain opened by the data explosion has been left to a small number of forward-thinking professors, rebellious grad students, and hobbyists.

That will change.

For every idea I have talked about in this book, there are a hundred ideas just as important ready to be tackled. The research discussed here is the tip of the tip of the iceberg, a scratch on the scratch of the surface.

So what else is coming?

For one, a radical expansion of the methodology that was used in one of the most successful public health studies of all time. In the mid-nineteenth century, John Snow, a British physician, was interested in what was causing a cholera outbreak in London.

His ingenious idea: he mapped every cholera case in the city. When he did this, he found the disease was largely clustered around one particular water pump. This suggested the disease spread through germ-infested water, disproving the then-conventional idea that it spread through bad air.

Big Data—and the zooming in that it allows—makes this type of study easy. For any disease, we can explore Google search data or other digital health data. We can find if there are any tiny pockets of the world where prevalence of this disease is unusually high or unusually low. Then we can see what these places have in common. Is there something in the air? The water? The social norms?
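To make the "tiny pockets" idea concrete, here is a minimal sketch of one way such a search might begin. It is an illustration only, not a method described in this book: the region names and rates are made up, and using a z-score cutoff to define "unusually high or unusually low" is an assumption.

```python
from statistics import mean, stdev


def unusual_regions(rates, threshold=2.0):
    """Return regions whose rate is at least `threshold` standard
    deviations above or below the mean of all regions.

    `rates` maps a region name to a disease-related rate, for example
    searches per 100,000 residents (a hypothetical measure).
    """
    values = list(rates.values())
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        return {}  # every region has the same rate, so nothing stands out
    flagged = {}
    for region, rate in rates.items():
        z = (rate - mu) / sigma
        if abs(z) >= threshold:
            flagged[region] = round(z, 2)
    return flagged


# Hypothetical, made-up rates for eight regions.
rates = {
    "A": 10.0, "B": 11.0, "C": 9.0, "D": 10.5,
    "E": 9.5, "F": 10.0, "G": 30.0, "H": 10.2,
}

flagged = unusual_regions(rates)
print(flagged)  # only region "G" is flagged with these numbers
```

Flagging a region is just the first step. The more interesting work, as with Snow's water pump, is asking what the flagged places have in common.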

Seth Stephens-Davidowitz's books