There are thousands of searches every year, for example, for “I hate cold weather,” “People are annoying,” and “I am sad.” Of course, those thousands of Google searches for “I am sad” represent only a tiny fraction of the hundreds of millions of people who feel sad in a given year. Searches expressing thoughts, rather than looking for information, my research has found, are made by only a small sample of everyone for whom that thought comes to mind. Similarly, my research suggests that the seven thousand searches by Americans every year for “I regret having children” represent a small sample of those who have had that thought.
Kids are obviously a huge joy for many, probably most, people. And, despite my mom’s fear that “you and your stupid data analysis” are going to limit her number of grandchildren, this research has not changed my desire to have kids. But that unseemly regret is interesting—and another aspect of humanity that we tend not to see in the traditional datasets. Our culture is constantly flooding us with images of wonderful, happy families. Most people would never consider having children as something they might regret. But some do. They may admit this to no one—except Google.
THE TRUTH ABOUT SEX
How many American men are gay? This is a legendary question in sexuality research. Yet it has been among the toughest questions for social scientists to answer. Psychologists no longer believe Alfred Kinsey’s famous estimate—based on surveys that oversampled prisoners and prostitutes—that 10 percent of American men are gay. Representative surveys now tell us about 2 to 3 percent are. But sexual preference has long been among the subjects upon which people have tended to lie. I think I can use Big Data to give a better answer to this question than we have ever had.
First, more on that survey data. Surveys tell us there are far more gay men in tolerant states than in intolerant states. For example, according to a Gallup survey, the proportion of the population that is gay is almost twice as high in Rhode Island, the state with the highest support for gay marriage, as in Mississippi, the state with the lowest support for gay marriage.
There are two likely explanations for this. First, gay men born in intolerant states may move to tolerant states. Second, gay men in intolerant states may not divulge that they are gay; they are even more likely to lie.
Some insight into explanation number one—gay mobility—can be gleaned from another Big Data source: Facebook, which allows users to list what gender they are interested in. About 2.5 percent of male Facebook users who list a gender of interest say they are interested in men; that corresponds roughly with what the surveys indicate. And Facebook too shows big differences in the gay population in states with high versus low tolerance: Facebook has the gay population more than twice as high in Rhode Island as in Mississippi.
Facebook also can provide information on how people move around. I was able to code the hometown of a sample of openly gay Facebook users. This allowed me to directly estimate how many gay men move out of intolerant states into more tolerant parts of the country. The answer? There is clearly some mobility—from Oklahoma City to San Francisco, for example. But I estimate that men packing up their Judy Garland CDs and heading to someplace more open-minded can explain less than half of the difference in the openly gay population in tolerant versus intolerant states.*
In addition, Facebook allows us to focus in on high school students. This is a special group, because high school boys rarely get to choose where they live. If mobility explained the state-by-state differences in the openly gay population, these differences should not appear among high school users. So what does the high school data say? There are far fewer openly gay high school boys in intolerant states. Only two in one thousand male high school students in Mississippi are openly gay. So it ain’t just mobility.
If a similar number of gay men are born in every state and mobility cannot fully explain why some states have so many more openly gay men, the closet must be playing a big role. Which brings us back to Google, with which so many people have proved willing to share so much.
Might there be a way to use porn searches to test how many gay men there really are in different states? Indeed, there is. Countrywide, I estimate—using data from Google searches and Google AdWords—that about 5 percent of male porn searches are for gay-male porn. (These would include searches for such terms as “Rocket Tube,” a popular gay pornographic site, as well as “gay porn.”)
And how does this vary in different parts of the country? Overall, there are more gay porn searches in tolerant states compared to intolerant states. This makes sense, given that some gay men move out of intolerant places into tolerant places. But the differences are not nearly as large as the differences suggested by either surveys or Facebook. In Mississippi, I estimate that 4.8 percent of male porn searches are for gay porn, far higher than the numbers suggested by either surveys or Facebook and reasonably close to the 5.2 percent of pornography searches that are for gay porn in Rhode Island.
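To make the size of that gap concrete, here is a minimal sketch of the comparison. The two search shares are the estimates quoted above. The value 2.0 is only a stand-in for the "almost twice" (surveys) and "more than twice" (Facebook) ratios, since the exact figures are not given here, and the variable names are purely illustrative.

```python
# Estimated share of male porn searches that are for gay porn (percent),
# as quoted in the text.
mississippi_share = 4.8
rhode_island_share = 5.2

# How many times larger the Rhode Island share is than the Mississippi share.
search_ratio = rhode_island_share / mississippi_share

# Assumption: 2.0 stands in for the roughly twofold Rhode Island-to-Mississippi
# gap in the openly gay population reported by surveys and Facebook.
open_population_ratio = 2.0

# Percentage by which Rhode Island exceeds Mississippi on each measure.
search_gap_pct = (search_ratio - 1) * 100
open_gap_pct = (open_population_ratio - 1) * 100

print(f"Search-based gap: about {search_gap_pct:.0f} percent")
print(f"Openly gay gap:   about {open_gap_pct:.0f} percent")
```

On these numbers, the search-based gap between the two states is in the single digits in percentage terms, while the openly gay gap is around 100 percent.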
So how many American men are gay? This measure of pornography searches by men—roughly 5 percent are same-sex—seems a reasonable estimate of the true size of the gay population in the United States. And there is another, less straightforward way to get at this number. It requires some data science. We could utilize the relationship between tolerance and the openly gay population. Bear with me a bit here.
My preliminary research indicates that in a given state every 20 percentage points of support for gay marriage means about one and a half times as many men from that state will identify openly as gay on Facebook. Based on this, we can estimate how many men born in a hypothetically fully tolerant place (where, say, 100 percent of people supported gay marriage) would be openly gay. My estimate is that about 5 percent would be, which fits the data from porn searches nicely. The closest thing we might have to boys growing up in a fully tolerant environment is high school boys in California’s Bay Area. About 4 percent of them are openly gay on Facebook. That seems in line with my calculation.
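The extrapolation can be sketched in a few lines. This is one possible reading of the relationship, not necessarily the exact calculation behind the estimate: it assumes the one-and-a-half-times effect compounds, so that each additional 20 percentage points of support multiplies the openly gay share by 1.5 again. The function name and the example inputs (a state with 2 percent openly gay men and 55 percent support) are hypothetical and chosen only for illustration.

```python
def extrapolate_open_rate(base_rate_pct, base_support_pct,
                          target_support_pct=100.0,
                          multiplier=1.5, step_points=20.0):
    """Project the openly gay share (percent) at a target level of support.

    Assumes a compounding relationship: every `step_points` percentage
    points of support for gay marriage multiplies the share by `multiplier`.
    """
    steps = (target_support_pct - base_support_pct) / step_points
    return base_rate_pct * multiplier ** steps


# Hypothetical inputs, NOT figures from the text: a state where 2 percent of
# men are openly gay and 55 percent of people support gay marriage.
example = extrapolate_open_rate(base_rate_pct=2.0, base_support_pct=55.0)
print(f"Projected openly gay share at 100 percent support: {example:.1f} percent")
```

With these illustrative inputs the projection lands near 5 percent, but the result depends entirely on the baseline chosen and on the assumption that the effect compounds all the way up to full support.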
I should note that I have not yet been able to come up with an estimate of same-sex attraction for women. The pornography numbers are less useful here, since far fewer women watch pornography, making the sample less representative. And of those who do, even women who are primarily attracted to men in real life seem to enjoy viewing lesbian porn. Fully 20 percent of videos watched by women on PornHub are lesbian.
Five percent of American men being gay is an estimate, of course. Some men are bisexual; some—especially when young—are not sure what they are. Obviously, you can’t count this as precisely as you might the number of people who vote or attend a movie.