The Undoing Project: A Friendship that Changed the World

At first glance it resembles a poem. What it was, in fact, was early fodder for his and Danny’s next article, which would also be their first attempt to put their thinking in such a way that it might directly influence the world outside of their discipline. Before returning to Israel, they had decided to write a paper about how people made predictions. The difference between a judgment and a prediction wasn’t as obvious to everyone as it was to Amos and Danny. To their way of thinking, a judgment (“he looks like a good Israeli army officer”) implies a prediction (“he will make a good Israeli army officer”), just as a prediction implies some judgment—without a judgment, how would you predict? In their minds, there was a distinction: A prediction is a judgment that involves uncertainty. “Adolf Hitler is an eloquent speaker” is a judgment you can’t do much about. “Adolf Hitler will become chancellor of Germany” is, at least until January 30, 1933, a prediction of an uncertain event that eventually will be proven either right or wrong. The title of their next paper was “On the Psychology of Prediction.” “In making predictions and judgments under uncertainty,” they wrote, “people do not appear to follow the calculus of chance or the statistical theory of prediction. Instead, they rely on a limited number of heuristics which sometimes yield reasonable judgments and sometimes lead to severe and systematic error.”

Viewed in hindsight, the paper looks to have more or less started with Danny’s experience in the Israeli army. The people in charge of vetting Israeli youth hadn’t been able to predict which of them would make good officers, and the people in charge of officer training school hadn’t been able to predict who among the group they were sent would succeed in combat, or even in the routine day-to-day business of leading troops. Danny and Amos had once had a fun evening trying to predict the future occupations of their friends’ small children, and had surprised themselves by the ease, and the confidence, with which they had done it. Now they sought to test how people predicted—or, rather, to dramatize how people used what they now called the representativeness heuristic to predict.

To do this, however, they needed to give them something to predict.

They decided to ask their subjects to predict the future of a student, identified only by some personality traits, who would go on to graduate school. Of the then nine major courses of graduate study in the United States, which would he pursue? They began by asking their subjects to estimate the percentage of students in each course of study. Here were their average guesses:

Business: 15 percent

Computer Science: 7 percent

Engineering: 9 percent

Humanities and Education: 20 percent

Law: 9 percent

Library Science: 3 percent

Medicine: 8 percent

Physical and Life Sciences: 12 percent

Social Science and Social Work: 17 percent

For anyone trying to predict which area of study any given person was in, those percentages should serve as a base rate. That is, if you knew nothing at all about a particular student, but knew that 15 percent of all graduate students were pursuing degrees in business administration, and were asked to predict the likelihood that the student in question was in business school, you should guess “15 percent.” Here was a useful way of thinking about base rates: They were what you would predict if you had no information at all.

Now Danny and Amos sought to dramatize what happened when you gave people some information. But what kind of information? Danny spent a day inside the Oregon Research Institute stewing over the question—and became so engrossed by his task that he stayed up all night creating what at the time seemed like the stereotype of a graduate student in computer science. He named him “Tom W.”

Tom W. is of high intelligence, although lacking in true creativity. He has a need for order and clarity, and for neat and tidy systems in which every detail finds its appropriate place. His writing is rather dull and mechanical, occasionally enlivened by somewhat corny puns and by flashes of imagination of the sci-fi type. He has a strong drive for competence. He seems to have little feel and little sympathy for other people and does not enjoy interacting with others. Self-centered, he nonetheless has a deep moral sense.

They would ask one group of subjects—they called it the “similarity” group—to estimate how “similar” Tom was to the graduate students in each of the nine fields. That was simply to determine which field of study was most “representative” of Tom W.

Then they would hand a second group—what they called the “prediction” group—this additional information:

The preceding personality sketch of Tom W. was written during Tom’s senior year in high school by a psychologist, on the basis of projective tests. Tom W. is currently a graduate student. Please rank the following nine fields of graduate specialization in order of the likelihood that Tom W. is now a graduate student in each of these fields.

They would not only give their subjects the sketch but inform them that it was a far from reliable description of Tom W. That it had been written by a psychologist, for a start; they would further tell subjects that the assessment had been made years earlier. What Amos and Danny suspected—because they had tested it first on themselves—is that people would essentially leap from the similarity judgment (“that guy sounds like a computer scientist!”) to some prediction (“that guy must be a computer scientist!”) and ignore both the base rate (only 7 percent of all graduate students were computer scientists) and the dubious reliability of the character sketch.

The first person to arrive for work on the morning Danny finished his sketch was an Oregon researcher named Robyn Dawes. Dawes was trained in statistics and legendary for the rigor of his mind. Danny handed him the sketch of Tom W. “He read it over and he had a sly smile, as if he had figured it out,” said Danny. “And he said, ‘Computer scientist!’ After that I wasn’t worried about how the Oregon students would fare.”

The Oregon students presented with the problem simply ignored all objective data and went with their gut sense, and predicted with great certainty that Tom W. was a computer scientist. Having established that people would allow a stereotype to warp their judgment, Amos and Danny then wondered: If people are willing to make irrational predictions based on that sort of information, what kind of predictions might they make if we give them totally irrelevant information? As they played with this idea—they might increase people’s confidence in their predictions by giving them any information, however useless—the laughter to be heard from the other side of the closed door must have grown only more raucous. In the end, Danny created another character. This one he named “Dick”:

Dick is a 30 year old man. He is married with no children. A man of high ability and high motivation, he promises to be quite successful in his field. He is well liked by his colleagues.

Then they ran another experiment. It was a version of the book bag and poker chips experiment that Amos and Danny had argued about in Danny’s seminar at Hebrew University. They told their subjects that they had picked a person from a pool of 100 people, 70 of whom were engineers and 30 of whom were lawyers. Then they asked them: What is the likelihood that the selected person is a lawyer? The subjects correctly judged it to be 30 percent. And if you told them that you were doing the same thing, but from a pool that had 70 lawyers in it and 30 engineers, they said, correctly, that there was a 70 percent chance the person you’d plucked from it was a lawyer. But if you told them you had picked not just some nameless person but a guy named Dick, and read them Danny’s description of Dick—which contained no information whatsoever to help you guess what Dick did for a living—they guessed there was an equal chance that Dick was a lawyer or an engineer, no matter which pool he had emerged from. “Evidently, people respond differently when given no specific evidence and when given worthless evidence,” wrote Danny and Amos. “When no specific evidence is given, the prior probabilities are properly utilized; when worthless specific evidence is given, prior probabilities are ignored.”*

There was much more to “On the Psychology of Prediction”—for instance, they showed that the very factors that caused people to become more confident in their predictions also led those predictions to be less accurate. And in the end it returned to the problem that had interested Danny since he had first signed on to help the Israeli army rethink how it selected and trained incoming recruits:

Michael Lewis's books