Everybody Lies: Big Data, New Data, and What the Internet Can Tell Us About Who We Really Are

This book is about Big Data, in general. But this chapter has mostly emphasized Google searches, which I have argued reveal a hidden world very different from the one we think we see. So are other Big Data sources digital truth serum, as well? The fact is, many Big Data sources, such as Facebook, are often the opposite of digital truth serum.

On social media, as in surveys, you have no incentive to tell the truth. On social media, much more so than in surveys, you have a large incentive to make yourself look good. Your online presence is not anonymous, after all. You are courting an audience and telling your friends, family members, colleagues, acquaintances, and strangers who you are.

To see how biased data pulled from social media can be, consider the relative popularity of the Atlantic, a respected, highbrow monthly magazine, versus the National Enquirer, a gossipy, often-sensational magazine. Both publications have similar average circulations, selling a few hundred thousand copies. (The National Enquirer is a weekly, so it actually sells more total copies.) There are also a comparable number of Google searches for each magazine.

However, on Facebook, roughly 1.5 million people either like the Atlantic or discuss articles from the Atlantic on their profiles. Only about 50,000 like the Enquirer or discuss its contents.

ATLANTIC VS. NATIONAL ENQUIRER POPULARITY COMPARED BY DIFFERENT SOURCES

Circulation: Roughly 1 Atlantic for every 1 National Enquirer

Google Searches: 1 Atlantic for every 1 National Enquirer

Facebook Likes: 27 Atlantic for every 1 National Enquirer



For assessing magazine popularity, circulation data is the ground truth. Google data comes close to matching it. And Facebook data is overwhelmingly biased against the trashy tabloid, making it the worst data for determining what people really like.
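The comparison can be made concrete with a little arithmetic. The sketch below is only an illustration, not a method from the book: it takes the three Atlantic-to-Enquirer ratios reported above, treats circulation as the ground truth, and expresses each source's ratio as a multiple of it. The function name `skew_versus_ground_truth` and the variable names are hypothetical, chosen for this example.

```python
# Atlantic-to-Enquirer ratios as reported in the comparison above.
# Circulation is treated as the ground truth, following the text.
ratios = {
    "Circulation": 1.0,
    "Google Searches": 1.0,
    "Facebook Likes": 27.0,
}


def skew_versus_ground_truth(ratios, ground_truth="Circulation"):
    """Return each source's ratio divided by the ground-truth ratio.

    A value of 1.0 means the source agrees with circulation; larger values
    mean the source overstates the Atlantic relative to the Enquirer.
    (Hypothetical helper, written for this illustration only.)
    """
    baseline = ratios[ground_truth]
    return {source: ratio / baseline for source, ratio in ratios.items()}


skew = skew_versus_ground_truth(ratios)
for source, factor in skew.items():
    print(f"{source}: {factor:.0f}x the circulation ratio")

# The rounded Facebook counts quoted in the text (about 1.5 million versus
# about 50,000) give a ratio in the same neighborhood as the reported 27.
rounded_facebook_ratio = 1_500_000 / 50_000
print(f"Ratio from the rounded counts: {rounded_facebook_ratio:.0f} to 1")
```

Google searches land at 1x, in line with circulation, while Facebook likes come out at 27x. The rounded counts give 30 to 1, which is consistent with the reported figure given that both counts are approximate.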

And as with reading preferences, so with life. On Facebook, we show our cultivated selves, not our true selves. I use Facebook data in this book, in fact in this chapter, but always with this caveat in mind.


To gain a better understanding of what social media misses, let’s return to pornography for a moment. First, we need to address the common belief that the internet is dominated by smut. This isn’t true. The majority of content on the internet is nonpornographic. For instance, of the top ten most visited websites, not one is pornographic. So the popularity of porn, while enormous, should not be overstated.

That said, a close look at how we like and share pornography makes it clear that Facebook, Instagram, and Twitter provide only a limited window into what’s truly popular on the internet. Large subsets of the web operate with massive popularity but little social presence.

The most popular video of all time, as of this writing, is Psy’s “Gangnam Style,” a goofy pop music video that satirizes trendy Koreans. It’s been viewed about 2.3 billion times on YouTube alone since its debut in 2012. And its popularity is clear no matter what site you are on. It’s been shared across different social media platforms tens of millions of times.

The most popular pornographic video of all time may be “Great Body, Great Sex, Great Blowjob.” It’s been viewed more than 80 million times. In other words, for every thirty views of “Gangnam Style,” there has been at least one view of “Great Body, Great Sex, Great Blowjob.” If social media gave us an accurate view of the videos people watched, “Great Body, Great Sex, Great Blowjob” should be posted millions of times. But this video has been shared on social media only a few dozen times and always by porn stars, not by average users. People clearly do not feel the need to advertise their interest in this video to their friends.

Facebook is digital brag-to-my-friends-about-how-good-my-life-is serum. In Facebook world, the average adult seems to be happily married, vacationing in the Caribbean, and perusing the Atlantic. In the real world, a lot of people are angry, on supermarket checkout lines, peeking at the National Enquirer, ignoring the phone calls from their spouse, whom they haven’t slept with in years. In Facebook world, family life seems perfect. In the real world, family life is messy. It can occasionally be so messy that a small number of people even regret having children. In Facebook world, it seems every young adult is at a cool party Saturday night. In the real world, most are home alone, binge-watching shows on Netflix. In Facebook world, a girlfriend posts twenty-six happy pictures from her getaway with her boyfriend. In the real world, immediately after posting this, she Googles “my boyfriend won’t have sex with me.” And, perhaps at the same time, the boyfriend watches “Great Body, Great Sex, Great Blowjob.”





DIGITAL TRUTH        DIGITAL LIES
Searches             Social media posts
Views                Social media likes
Clicks               Dating profiles
Swipes





THE TRUTH ABOUT YOUR CUSTOMERS




In the early morning of September 5, 2006, Facebook introduced a major update to its home page. The early versions of Facebook had only allowed users to click on profiles of their friends to learn what they were doing. The website, considered a big success, had at the time 9.4 million users.

But after months of hard work, engineers had created something they called “News Feed,” which would provide users with updates on the activities of all their friends.

Users immediately reported that they hated News Feed. Ben Parr, a Northwestern undergraduate, created “Students Against Facebook news feed.” He said that “news feed is just too creepy, too stalker-esque, and a feature that has to go.” Within a few days, the group had 700,000 members echoing Parr’s sentiment. One University of Michigan junior told the Michigan Daily, “I’m really creeped out by the new Facebook. It makes me feel like a stalker.”

David Kirkpatrick tells this story in his authorized account of the website’s history, The Facebook Effect: The Inside Story of the Company That Is Connecting the World. He dubs the introduction of News Feed “the biggest crisis Facebook has ever faced.” But Kirkpatrick reports that when he interviewed Mark Zuckerberg, cofounder and head of the rapidly growing company, the CEO was unfazed.

The reason? Zuckerberg had access to digital truth serum: numbers on people’s clicks and visits to Facebook. As Kirkpatrick writes:

Zuckerberg in fact knew that people liked the News Feed, no matter what they were saying in the groups. He had the data to prove it. People were spending more time on Facebook, on average, than before News Feed launched. And they were doing more there—dramatically more. In August, users viewed 12 billion pages on the service. But by October, with News Feed under way, they viewed 22 billion.
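To put the quoted page-view figures in proportion, here is a minimal sketch of the percentage change between the two months. It uses only the two numbers Kirkpatrick reports; the function name `percent_change` and the variable names are hypothetical, introduced for this example.

```python
# Page views quoted from Kirkpatrick's account, in billions.
august_views = 12
october_views = 22


def percent_change(before, after):
    """Relative change from `before` to `after`, as a percentage."""
    return (after - before) / before * 100


growth = percent_change(august_views, october_views)
print(f"Page views rose by about {growth:.0f}% from August to October")
```

On these figures, page views rose by roughly 83 percent in two months, which is the kind of behavioral signal the protest groups could not convey.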

Seth Stephens-Davidowitz's books