Representativeness heuristic

The representativeness heuristic is used when making judgments about the probability of an event by assessing how representative it is, in character and essence, of a known prototypical event. It is one of a group of heuristics (simple rules governing judgment or decision-making) proposed by psychologists Amos Tversky and Daniel Kahneman in the early 1970s, who defined representativeness as "the degree to which [an event] (i) is similar in essential characteristics to its parent population, and (ii) reflects the salient features of the process by which it is generated". The representativeness heuristic works by comparing an event to a prototype or stereotype that we already have in mind. For example, if we see a person who is dressed in eccentric clothes and reading a poetry book, we might be more likely to think that they are a poet than an accountant. This is because the person's appearance and behavior are more representative of the stereotype of a poet than of an accountant.

The representativeness heuristic can be a useful shortcut in some cases, but it can also lead to errors in judgment. For example, if we only see a small sample of people from a particular group, we might overestimate the degree to which they are representative of the entire group. Heuristics are described as "judgmental shortcuts that generally get us where we need to go – and quickly – but at the cost of occasionally sending us off course." Heuristics are useful because they use effort-reduction and simplification in decision-making.

When people rely on representativeness to make judgments, they are likely to judge wrongly because the fact that something is more representative does not actually make it more likely. The representativeness heuristic is simply described as assessing similarity of objects and organizing them based around the category prototype (e.g., like goes with like, and causes and effects should resemble each other). This heuristic is used because it is an easy computation. The problem is that people overestimate its ability to accurately predict the likelihood of an event. Thus, it can result in neglect of relevant base rates and other cognitive biases.

Determinants of representativeness
The representativeness heuristic is more likely to be used when the judgment or decision to be made has certain features.

Similarity
When judging the representativeness of a new stimulus/event, people usually pay attention to the degree of similarity between the stimulus/event and a standard/process. It is also important that the shared features be salient. Nilsson, Juslin, and Olsson (2008) found this to be influenced by the exemplar account of memory (concrete examples of a category are stored in memory), so that new instances were classified as representative if they were highly similar to a category as well as if they were frequently encountered. Several examples of similarity have been described in the representativeness heuristic literature, with a focus on medical beliefs. People often believe that medical symptoms should resemble their causes or treatments. For example, people long believed that ulcers were caused by stress, due to the representativeness heuristic, when in fact bacteria cause ulcers. In a similar line of thinking, in some alternative medicine beliefs patients have been encouraged to eat organ meat that corresponds to their medical disorder. Use of the representativeness heuristic can be seen in even simpler beliefs, such as the belief that eating fatty foods makes one fat. Even physicians may be swayed by the representativeness heuristic when judging similarity, for example in diagnoses: one study found that clinicians use the representativeness heuristic in making diagnoses by judging how similar patients are to the stereotypical or prototypical patient with that disorder.

Randomness
Irregularity and local representativeness affect judgments of randomness. Things that do not appear to have any logical sequence are regarded as representative of randomness and thus more likely to occur. For example, THTHTH as a series of coin tosses would not be considered representative of randomly generated coin tosses as it is too well ordered.
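The coin-toss example can be checked directly: with a fair coin and independent tosses, every specific sequence of six tosses is equally probable, however orderly it looks. The following is a minimal Python sketch under those assumptions; the helper name `sequence_probability` and the irregular comparison sequence are introduced here for illustration only.

```python
from fractions import Fraction

def sequence_probability(sequence: str, p_heads: Fraction = Fraction(1, 2)) -> Fraction:
    """Probability of one exact sequence of independent tosses ('H' or 'T')."""
    prob = Fraction(1)
    for toss in sequence:
        prob *= p_heads if toss == "H" else 1 - p_heads
    return prob

ordered = sequence_probability("THTHTH")    # looks "too regular" to be random
irregular = sequence_probability("HTTHTH")  # looks more "random"
print(ordered, irregular)  # both are 1/64
```

The orderly sequence is judged less representative of randomness, but it is exactly as likely as any other specific sequence of the same length.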

Local representativeness is an assumption wherein people rely on the law of small numbers, whereby small samples are perceived to represent their population to the same extent as large samples. A small sample which appears randomly distributed would reinforce the belief, under the assumption of local representativeness, that the population is randomly distributed. Conversely, a small sample with a skewed distribution would weaken this belief. If a coin toss is repeated several times and the majority of the results are "heads", the assumption of local representativeness will cause the observer to believe the coin is biased toward "heads".

Tom W.
In a study done in 1973, Kahneman and Tversky divided their participants into three groups:


 * "Base-rate group", who were given the instructions: "Consider all the first-year graduate students in the U.S. today. Please write down your best guesses about the percentage of students who are now enrolled in the following nine fields of specialization." The nine fields given were business administration, computer science, engineering, humanities and education, law, library science, medicine, physical and life sciences, and social science and social work.
 * "Similarity group", who were given a personality sketch. "Tom W. is of high intelligence, although lacking in true creativity. He has a need for order and clarity, and for neat and tidy systems in which every detail finds its appropriate place. His writing is rather dull and mechanical, occasionally enlivened by somewhat corny puns and by flashes of imagination of the sci-fi type. He has a strong drive for competence. He seems to feel little sympathy for other people and does not enjoy interacting with others. Self-centered, he nonetheless has a deep moral sense." The participants in this group were asked to rank the nine fields given to the base-rate group in terms of how similar Tom W. is to the prototypical graduate student of each field.
 * "Prediction group", who were given the same personality sketch as the similarity group, but were also given the information "The preceding personality sketch of Tom W. was written during Tom's senior year in high school by a psychologist, on the basis of projective tests. Tom W. is currently a graduate student. Please rank the following nine fields of graduate specialization in order of the likelihood that Tom W. is now a graduate student in each of these fields."

The likelihood judgments were much closer to the similarity judgments than to the estimated base rates. The findings supported the authors' predictions that people make predictions based on how representative (similar) something is, rather than on relative base rate information. For example, more than 95% of the participants said that Tom would be more likely to study computer science than education or humanities, even though the base rate estimates for education and humanities were much higher than those for computer science.

The taxicab problem
In another study done by Tversky and Kahneman, subjects were given the following problem:

A cab was involved in a hit and run accident at night. Two cab companies, the Green and the Blue, operate in the city. 85% of the cabs in the city are Green and 15% are Blue.

A witness identified the cab as Blue. The court tested the reliability of the witness under the same circumstances that existed on the night of the accident and concluded that the witness correctly identified each one of the two colours 80% of the time and failed 20% of the time.

What is the probability that the cab involved in the accident was Blue rather than Green knowing that this witness identified it as Blue?

Most subjects gave probabilities over 50%, and some gave answers over 80%. The correct answer, found using Bayes' theorem, is lower than these estimates:
 * There is a 12% probability (0.12 = 0.15 × 0.80) that the blue cab is (correctly) identified by the witness as blue.
 * There is a 17% probability (0.17 = 0.85 × 0.20) that the green cab is (incorrectly) identified by the witness as blue.
 * There is therefore a 29% probability (0.29 = 0.12 + 0.17) that the cab is identified by the witness as blue.
 * This results in a 41% probability (0.41 ≈ 0.12 ÷ 0.29) that the cab identified as blue was actually blue.

This result follows from Bayes' theorem, which states:

$$P(B|I) = \frac{P(I | B)\, P(B)}{P(I)}.$$

where:

P(x) - a probability of x,

B - the cab was blue,

I - the cab is identified by the witness as blue,

B | I - the cab that is identified as blue, was blue,

I | B - the cab that was blue, is identified by the witness as blue.
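The calculation above can be reproduced in a few lines of Python. This is a minimal sketch; the function name `posterior` and its parameter names are introduced here for illustration, and the denominator P(I) is expanded as P(I | B) P(B) + P(I | not B) P(not B).

```python
def posterior(prior: float, hit_rate: float, false_alarm_rate: float) -> float:
    """P(B | I) from Bayes' theorem.

    prior            = P(B): base rate of blue cabs
    hit_rate         = P(I | B): witness says "blue" when the cab is blue
    false_alarm_rate = P(I | not B): witness says "blue" when the cab is green
    """
    # P(I): total probability that the witness says "blue"
    p_identified = hit_rate * prior + false_alarm_rate * (1 - prior)
    return hit_rate * prior / p_identified

p_blue = posterior(prior=0.15, hit_rate=0.80, false_alarm_rate=0.20)
print(round(p_blue, 2))  # 0.41
```

Raising the prior to 0.5 (equal numbers of Green and Blue cabs) would make the posterior equal to the witness's 80% reliability, which suggests that answers near 80% treat the base rate as if it were irrelevant.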



Representativeness is also cited as a factor in related effects such as the gambler's fallacy, the regression fallacy and the conjunction fallacy.

Base rate neglect and base rate fallacy
The use of the representativeness heuristic will likely lead to violations of Bayes' theorem:

$$P(H|D) = \frac{P(D | H)\, P(H)}{P(D)}.$$

However, judgments by representativeness only look at the resemblance between the hypothesis and the data, thus inverse probabilities are equated:

$$P(H|D)=P(D|H)$$

As can be seen, the base rate P(H) is ignored in this equation, leading to the base rate fallacy. A base rate is a phenomenon's basic rate of incidence. The base rate fallacy describes how people do not take the base rate of an event into account when solving probability problems. This was explicitly tested by Dawes, Mirels, Gold and Donahue (1993) who had people judge both the base rate of people who had a particular personality trait and the probability that a person who had a given personality trait had another one. For example, participants were asked how many people out of 100 answered true to the question "I am a conscientious person" and also, given that a person answered true to this question, how many would answer true to a different personality question. They found that participants equated inverse probabilities (e.g., $$P(\text{conscientious} \mid \text{neurotic}) = P(\text{neurotic} \mid \text{conscientious})$$) even when it was obvious that they were not the same (the two questions were answered immediately after each other).
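A small numerical illustration shows why inverse probabilities generally differ. The counts below are invented purely for illustration and are not taken from the Dawes et al. study; the variable names are likewise hypothetical.

```python
# Hypothetical counts for 100 respondents (made up for illustration only).
total = 100
conscientious = 80  # answered true to "I am a conscientious person"
neurotic = 20       # answered true to a second, hypothetical question
both = 10           # answered true to both questions

p_neurotic_given_conscientious = both / conscientious  # 10 / 80 = 0.125
p_conscientious_given_neurotic = both / neurotic       # 10 / 20 = 0.5

# Bayes' theorem links the two through the base rates P(H) and P(D).
via_bayes = p_neurotic_given_conscientious * (conscientious / total) / (neurotic / total)
print(p_neurotic_given_conscientious, p_conscientious_given_neurotic)
```

The two conditional probabilities coincide only when the two base rates are equal, which is why equating them amounts to ignoring the base rates.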

A medical example is described by Axelsson. Say a doctor performs a test that is 99% accurate, and you test positive for the disease. However, the incidence of the disease is 1/10,000. Your actual risk of having the disease is about 1%, because the population of healthy people is so much larger than the population with the disease. This statistic often surprises people, due to the base rate fallacy, as many people do not take the basic incidence into account when judging probability. Research by Maya Bar-Hillel (1980) suggests that perceived relevancy of information is vital to base-rate neglect: base rates are only included in judgments if they seem equally relevant to the other information.
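The figure of roughly 1% in the medical example can be verified with Bayes' theorem. The sketch below assumes that "99% accurate" means both a 99% true-positive rate (sensitivity) and a 99% true-negative rate (specificity); the source does not state this explicitly, and the function name is introduced here for illustration.

```python
def p_disease_given_positive(incidence: float, sensitivity: float, specificity: float) -> float:
    """P(disease | positive test) via Bayes' theorem."""
    true_positives = sensitivity * incidence                # sick and test positive
    false_positives = (1 - specificity) * (1 - incidence)   # healthy but test positive
    return true_positives / (true_positives + false_positives)

# Assumption: "99% accurate" = 99% sensitivity and 99% specificity.
risk = p_disease_given_positive(incidence=1 / 10_000, sensitivity=0.99, specificity=0.99)
print(f"{risk:.2%}")  # 0.98%, i.e. roughly 1%
```

Out of every 10,000 people tested, about one true case is outnumbered by roughly 100 false positives, which is why the posterior stays near 1%.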

Some research has explored base rate neglect in children, as there was a lack of understanding about how these judgment heuristics develop. The authors of one such study wanted to understand how the heuristic develops, whether it differs between social judgments and other judgments, and whether children use base rates when they are not using the representativeness heuristic. The authors found that the use of the representativeness heuristic as a strategy begins early on and is consistent. They also found that children initially use idiosyncratic strategies to make social judgments and use base rates more as they get older, but that the use of the representativeness heuristic in the social arena also increases as they get older. Among the children surveyed, base rates were more readily used in judgments about objects than in social judgments. After that research was conducted, Davidson (1995) explored how the representativeness heuristic and conjunction fallacy in children related to children's stereotyping. Consistent with previous research, children based their responses to problems on base rates when the problems contained nonstereotypic information or when the children were older. There was also evidence that children commit the conjunction fallacy. Finally, as children got older, they used the representativeness heuristic on stereotyped problems, and so made judgments consistent with stereotypes. Thus, there is evidence that even children use the representativeness heuristic, commit the conjunction fallacy, and disregard base rates.

Research suggests that use or neglect of base rates can be influenced by how the problem is presented, which reminds us that the representativeness heuristic is not a "general, all purpose heuristic", but may have many contributing factors. Base rates may be neglected more often when the information presented is not causal. Base rates are used less if there is relevant individuating information. Groups have been found to neglect base rate more than individuals do. Use of base rates differs based on context. Research on use of base rates has been inconsistent, with some authors suggesting a new model is necessary.

Conjunction fallacy
A group of undergraduates were provided with a description of Linda, modelled to be representative of an active feminist. Participants were then asked to evaluate the probability of her being a feminist, the probability of her being a bank teller, or the probability of her being both a bank teller and a feminist. Probability theory dictates that the probability of being both a bank teller and a feminist (the conjunction of two sets) must be less than or equal to the probability of being a feminist and less than or equal to the probability of being a bank teller: a conjunction cannot be more probable than one of its constituents. However, participants judged the conjunction (bank teller and feminist) as being more probable than being a bank teller alone. Some research suggests that the conjunction error may partially be due to subtle linguistic factors, such as inexplicit wording or semantic interpretation of "probability". The authors argue that both logic and language use may relate to the error, and that it should be more fully investigated.
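The conjunction rule used in the Linda problem follows from the product rule of probability. Writing T for "Linda is a bank teller" and F for "Linda is a feminist" (symbols introduced here for illustration):

$$P(T \cap F) = P(T)\, P(F \mid T) \le P(T), \quad \text{since } 0 \le P(F \mid T) \le 1.$$

Swapping the roles of T and F gives $$P(T \cap F) \le P(F)$$ in the same way, so no description of Linda, however representative of a feminist, can make the conjunction more probable than "bank teller" alone.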

Disjunction fallacy
According to probability theory, the disjunction of two events is at least as likely as either of the events individually. For example, the probability of being either a physics or a biology major is at least as high as the probability of being a physics major. However, when a personality description (data) seems to be very representative of a physics major (e.g., pocket protector) over a biology major, people judge that it is more likely for this person to be a physics major than a natural sciences major (which is a superset of physics).
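The disjunction rule can be written out in one line. With A for "physics major" and B for "biology major" (symbols introduced here for illustration):

$$P(A \cup B) = P(A) + P(B) - P(A \cap B) = P(A) + P(B \cap \neg A) \ge P(A).$$

The same reasoning applies to any superset: if every physics major is a natural sciences major, then the probability of being a natural sciences major cannot be smaller than the probability of being a physics major.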

Evidence that the representativeness heuristic may cause the disjunction fallacy comes from Bar-Hillel and Neter (1993). They found that people judge a person who is highly representative of being a statistics major (e.g., highly intelligent, does math competitions) as being more likely to be a statistics major than a social sciences major (superset of statistics), but they do not think that he is more likely to be a Hebrew language major than a humanities major (superset of Hebrew language). Thus, only when the person seems highly representative of a category is that category judged as more probable than its superordinate category. These incorrect appraisals remained even in the face of losing real money in bets on probabilities.

Insensitivity to sample size
The representativeness heuristic is also employed when subjects estimate the probability of a specific parameter of a sample. If the parameter highly represents the population, the parameter is often given a high probability. This estimation process usually ignores the impact of the sample size.

A concept proposed by Tversky and Kahneman provides an example of this bias in a problem about two hospitals of differing size.

"Approximately 45 babies are born in the large hospital while 15 babies are born in the small hospital. Half (50%) of all babies born in general are boys. However, the percentage changes from 1 day to another. For a 1-year period, each hospital recorded the days on which >60% of the babies born were boys. The question posed is: Which hospital do you think recorded more such days?


 * The larger hospital (21)
 * The smaller hospital (21)
 * About the same (that is, within 5% of each other) (53)"

The values shown in parentheses are the number of students choosing each answer.

The results show that more than half the respondents selected the wrong answer (third option). This is due to the respondents ignoring the effect of sample size. The respondents most likely selected the third option because the same statistic represents both the large and small hospitals. According to statistical theory, a statistic computed from a small sample can deviate considerably more from its expected value than one computed from a large sample. Therefore, the large hospital is more likely to stay close to the nominal value of 50% on any given day, and the smaller hospital should record more days on which more than 60% of the babies born were boys.
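The effect of sample size in the hospital problem can be quantified with the binomial distribution. The sketch below assumes that the birth counts (45 and 15) are per day, that each birth is independently a boy with probability 0.5, and that the daily counts are exact rather than approximate; the function name is introduced here for illustration.

```python
from math import comb

def p_more_than_60_percent_boys(births: int, p_boy: float = 0.5) -> float:
    """P(more than 60% of `births` are boys) under a binomial model."""
    threshold = (births * 3) // 5  # largest boy count that is NOT above 60%
    return sum(
        comb(births, k) * p_boy**k * (1 - p_boy) ** (births - k)
        for k in range(threshold + 1, births + 1)
    )

small = p_more_than_60_percent_boys(15)  # needs at least 10 boys out of 15
large = p_more_than_60_percent_boys(45)  # needs at least 28 boys out of 45
print(f"small hospital: {small:.3f}, large hospital: {large:.3f}")
```

Under these assumptions the small hospital exceeds 60% boys on roughly 15% of days, more than twice as often as the large hospital, so over a year it records considerably more such days.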