It’s not hard to locate in this data at least a faint echo of Freud’s Oedipal complex. He hypothesized a near-universal desire in childhood, which is later repressed, for sexual involvement with opposite-sex parents. If only the Viennese psychologist had lived long enough to turn his analytic skills to PornHub data, where interest in opposite-sex parents seems to be borne out by adults—with great explicitness—and little is repressed.
Of course, PornHub data can’t tell us for certain who people are fantasizing about when watching such videos. Are they actually imagining having sex with their own parents? Google searches offer further clues that plenty of people do have such desires.
Consider all searches of the form “I want to have sex with my . . .” The number one way to complete this search is “mom.” Overall, more than three-fourths of searches of this form are incestuous. And this is not due to the particular phrasing. Searches of the form “I am attracted to . . . ,” for example, are even more dominated by admissions of incestuous desires. Now I concede—at the risk of disappointing Herr Freud—that these are not particularly common searches: a few thousand people every year in the United States admitting an attraction to their mother. Someone would also have to break the news to Freud that Google searches, as will be discussed later in this book, sometimes skew toward the forbidden.
But still. There are plenty of inappropriate attractions that people have that I would have expected to have been mentioned more frequently in searches. Boss? Employee? Student? Therapist? Patient? Wife’s best friend? Daughter’s best friend? Wife’s sister? Best friend’s wife? None of these confessed desires can compete with mom. Maybe, combined with the PornHub data, that really does mean something.
And Freud’s general assertion that sexuality can be shaped by childhood experiences is supported elsewhere in Google and PornHub data, which reveals that men, at least, retain an inordinate number of fantasies related to childhood. According to searches from wives about their husbands, some of the top fetishes of adult men are the desire to wear diapers and wanting to be breastfed, particularly, as discussed earlier, in India. Moreover, cartoon porn—animated explicit sex scenes featuring characters from shows popular among adolescent boys—has achieved a high degree of popularity. Or consider the occupations of women most frequently searched for in porn by men. Men who are 18–24 years old search most frequently for women who are babysitters. As do 25–64-year-old men. And men 65 years and older. And for men in every age group, teacher and cheerleader are both in the top four. Clearly, the early years of life seem to play an outsize role in men’s adult fantasies.
I have not yet been able to use all this unprecedented data on adult sexuality to figure out precisely how sexual preferences form. Over the next few decades, other social scientists and I will be able to create new, falsifiable theories on adult sexuality and test them with actual data.
Already I can predict some basic themes that will undoubtedly be part of a data-based theory of adult sexuality. It is clearly not going to be the identical story to the one Freud told, with his particular, well-defined, universal stages of childhood and repression. But, based on my first look at PornHub data, I am absolutely certain the final verdict on adult sexuality will feature some key themes that Freud emphasized. Childhood will play a major role. So will mothers.
It likely would have been impossible to analyze Freud in this way ten years ago. It certainly would have been impossible eighty years ago, when Freud was still alive. So let’s think through why these data sources helped. This exercise can help us understand why Big Data is so powerful.
Remember, we have said that just having mounds and mounds of data by itself doesn’t automatically generate insights. Data size, by itself, is overrated. Why, then, is Big Data so powerful? Why will it create a revolution in how we see ourselves? There are, I claim, four unique powers of Big Data. This analysis of Freud provides a good illustration of them.
You may have noticed, to begin with, that we’re taking pornography seriously in this discussion of Freud. And we are going to utilize data from pornography frequently in this book. Somewhat surprisingly, porn data is rarely utilized by sociologists, most of whom are comfortable relying on the traditional survey datasets they have built their careers on. But a moment’s reflection shows that the widespread use of porn—and the search and views data that comes with it—is the most important development in our ability to understand human sexuality in, well . . . Actually, it’s probably the most important ever. It is data that Schopenhauer, Nietzsche, Freud, and Foucault would have drooled over. This data did not exist when they were alive. It did not exist a couple decades ago. It exists now. There are many unique data sources, on a range of topics, that give us windows into areas about which we could previously just guess. Offering up new types of data is the first power of Big Data.
The porn data and the Google search data are not just new; they are honest. In the pre-digital age, people hid their embarrassing thoughts from other people. In the digital age, they still hide them from other people, but not from the internet and in particular sites such as Google and PornHub, which protect their anonymity. These sites function as a sort of digital truth serum—hence our ability to uncover a widespread fascination with incest. Big Data allows us to finally see what people really want and really do, not what they say they want and say they do. Providing honest data is the second power of Big Data.
Because there is now so much data, there is meaningful information on even tiny slices of a population. We can compare, say, the number of people who dream of cucumbers versus those who dream of tomatoes. Allowing us to zoom in on small subsets of people is the third power of Big Data.
Big Data has one more impressive power—one that was not utilized in my quick study of Freud but could be in a future one: it allows us to undertake rapid, controlled experiments. This allows us to test for causality, not merely correlations. These kinds of tests are mostly used by businesses now, but they will prove a powerful tool for social scientists. Allowing us to do many causal experiments is the fourth power of Big Data.
Now it is time to unpack each of these powers and explore exactly why Big Data matters.
3
DATA REIMAGINED
At 6 A.M. on a particular Friday of every month, the streets of most of Manhattan will be largely desolate. The stores lining these streets will be closed, their façades covered by steel security gates, the apartments above dark and silent.
The floors of Goldman Sachs, the global investment banking institution in lower Manhattan, on the other hand, will be brightly lit, its elevators taking thousands of workers to their desks. By 7 A.M. most of these desks will be occupied.