Flynn effect

The Flynn effect is the substantial and long-sustained increase in both fluid and crystallized intelligence test scores that were measured in many parts of the world over the 20th century, named after researcher James Flynn (1934–2020). When intelligence quotient (IQ) tests are initially standardized using a sample of test-takers, by convention the average of the test results is set to 100 and their standard deviation is set to 15 or 16 IQ points. When IQ tests are revised, they are again standardized using a new sample of test-takers, usually born more recently than the first; the average result is set to 100. When the new test subjects take the older tests, in almost every case their average scores are significantly above 100.
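The renorming arithmetic described above can be sketched in a few lines of Python. This is only an illustration: the raw scores are made-up numbers, and `to_iq` is a hypothetical helper, not any test publisher's scoring procedure (real tests use age-banded norm tables rather than a single linear transformation).

```python
from statistics import mean, pstdev

def to_iq(raw, norm_sample, sd_points=15):
    """Scale a raw score against a norm sample so that the sample mean
    maps to 100 and one sample standard deviation maps to sd_points."""
    mu = mean(norm_sample)
    sigma = pstdev(norm_sample)
    return 100 + sd_points * (raw - mu) / sigma

# Hypothetical raw scores (items answered correctly) on the SAME older test.
old_norm_sample = [20, 24, 26, 28, 30, 32, 34, 36, 40]  # original standardization sample
new_cohort = [24, 28, 30, 32, 34, 36, 38, 40, 44]       # later-born test-takers

# By construction, the original sample averages 100 on its own norms.
old_cohort_mean_iq = mean(to_iq(r, old_norm_sample) for r in old_norm_sample)

# The later cohort, scored against the OLD norms, lands above 100.
new_cohort_mean_iq = mean(to_iq(r, old_norm_sample) for r in new_cohort)

print(round(old_cohort_mean_iq, 1), round(new_cohort_mean_iq, 1))
```

Once the test is renormed on the later cohort, that cohort's mean is reset to 100, which is why the gain is only visible when new test-takers are scored against old norms.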

Test score increases have been continuous and approximately linear from the earliest years of testing to the present. For example, a 2009 study found that British children's average scores on the Raven's Progressive Matrices test rose by 14 IQ points from 1942 to 2008. Similar gains have been observed in many other countries in which IQ testing has long been widely used, including other Western European countries, as well as Japan and South Korea.

Numerous explanations of the Flynn effect have been proposed, such as more effective education, alongside skepticism concerning its implications. Similar improvements have been reported for semantic and episodic memory. Some research suggests that there may be an ongoing reversed Flynn effect (i.e., a decline in IQ scores) in Norway, Denmark, Australia, Britain, the Netherlands, Sweden, Finland, and German-speaking countries. This is said to have started in the 1990s and to be occurring despite the average performance of 15-year-olds in those same countries ranking above the international average on the OECD Programme for International Student Assessment in reading, mathematics, and science in 2000, 2003, 2006, 2009, 2012, 2015, and 2018. In certain cases, this apparent reversal may be due to cultural changes which render parts of intelligence tests obsolete. Meta-analyses indicate that, overall, the Flynn effect continues, either at the same rate, or at a slower rate in developed countries.

Origin of term
The Flynn effect is named for James Robert Flynn, who did much to document it and promote awareness of its implications. The term was coined by Richard Herrnstein and Charles Murray in their 1994 book The Bell Curve. Flynn stated that, if asked, he would have named the effect after Read D. Tuddenham who "was the first to present convincing evidence of massive gains on mental tests using a nationwide sample" in a 1948 article.

Although the general term for the phenomenon—referring to no researcher in particular—continues to be "secular rise in IQ scores", many textbooks on psychology and IQ testing have now followed the lead of Herrnstein and Murray in calling the phenomenon the Flynn effect.

Rise in IQ
IQ tests are updated periodically. For example, the Wechsler Intelligence Scale for Children (WISC), originally developed in 1949, was updated in 1974, 1991, 2003, and again in 2014. The revised versions are standardized based on the performance of test-takers in standardization samples. A standard score of IQ 100 is defined as the mean performance of the standardization sample. Thus one way to see changes in norms over time is to conduct a study in which the same test-takers take both an old and a new version of the same test. Doing so confirms IQ gains over time. Some IQ tests, for example, tests used for military draftees in NATO countries in Europe, report raw scores, and those also confirm a trend of rising scores over time. The average rate of increase seems to be about three IQ points per decade in the United States, as scaled by the Wechsler tests. The increasing test performance over time appears on every major test, in every age range, at every ability level, and in every modern industrialized country, although not necessarily at the same rate as in the United States. The increase was continuous and roughly linear from the earliest days of testing to the mid-1990s. Though the effect is most associated with IQ increases, similar gains have been found in attention and in semantic and episodic memory.

Ulric Neisser estimated that using the IQ values of 1997, the average IQ of the United States in 1932, according to the first Stanford–Binet Intelligence Scales standardization sample, was 80. Neisser states that "Hardly any of them would have scored 'very superior', but nearly one-quarter would have appeared to be 'deficient.'" He also wrote that "Test scores are certainly going up all over the world, but whether intelligence itself has risen remains controversial."
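Neisser's "nearly one-quarter" figure can be checked with a rough calculation. The sketch below assumes that scores are normally distributed with a standard deviation of 15, and that "deficient" means below 70 and "very superior" means above 130; these cut-offs are assumptions made here for illustration and are not stated in the text above.

```python
from statistics import NormalDist

# Assumption: the 1932 sample, scored on 1997 norms, is normally
# distributed with mean 80 and standard deviation 15.
scores_1932_on_1997_norms = NormalDist(mu=80, sigma=15)

# Assumed category cut-offs: below 70 and above 130.
share_deficient = scores_1932_on_1997_norms.cdf(70)
share_very_superior = 1 - scores_1932_on_1997_norms.cdf(130)

print(f"share below 70:  {share_deficient:.1%}")
print(f"share above 130: {share_very_superior:.3%}")
```

Under these assumptions, about a quarter of the distribution falls below 70, while well under 0.1% falls above 130, which is consistent with the quoted description.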

Trahan et al. (2014) found that the effect was about 2.93 points per decade, based on both Stanford–Binet and Wechsler tests; they also found no evidence the effect was diminishing. In contrast, Pietschnig and Voracek (2015) reported, in their meta-analysis of studies involving nearly 4 million participants, that the Flynn effect had decreased in recent decades. They also reported that the magnitude of the effect was different for different types of intelligence ("0.41, 0.30, 0.28, and 0.21 IQ points annually for fluid, spatial, full-scale, and crystallized IQ test performance, respectively"), and that the effect was stronger for adults than for children.
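The two meta-analyses report rates in different units (points per decade versus points per year). A small conversion, using only the figures quoted above, puts them on the same scale; the variable names are illustrative.

```python
# Annual gains reported by Pietschnig and Voracek (2015), in IQ points per year.
annual_gain = {
    "fluid": 0.41,
    "spatial": 0.30,
    "full-scale": 0.28,
    "crystallized": 0.21,
}

# Convert to IQ points per decade for comparison with Trahan et al. (2014).
per_decade = {domain: round(rate * 10, 2) for domain, rate in annual_gain.items()}

trahan_per_decade = 2.93

for domain, rate in per_decade.items():
    print(f"{domain:>12}: {rate} points per decade")
print(f"Trahan et al.: {trahan_per_decade} points per decade")
```

On this scale the full-scale estimate (2.8 points per decade) is close to the 2.93 reported by Trahan et al., while fluid measures rise about twice as fast as crystallized ones.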

Raven (2000) found that, as Flynn suggested, data interpreted as showing a decrease in many abilities with increasing age must be re-interpreted as showing that there has been a dramatic increase of these abilities with the date of birth. On many tests this occurs at all levels of ability.

Some studies have found the gains of the Flynn effect to be particularly concentrated at the lower end of the distribution. Teasdale and Owen (1989), for example, found the effect primarily reduced the number of low-end scores, resulting in an increased number of moderately high scores, with no increase in very high scores. In another study, two large samples of Spanish children were assessed with a 30-year gap. Comparison of the IQ distributions indicated that the mean IQ scores on the test had increased by 9.7 points (the Flynn effect), that the gains were concentrated in the lower half of the distribution and negligible in the top half, and that the gains gradually decreased as the IQ of the individuals increased. Some studies have found a reverse Flynn effect with declining scores for those with high IQ.

In 1987, Flynn took the position that the very large increase indicates that IQ tests do not measure intelligence but only a minor sort of "abstract problem-solving ability" with little practical significance. He argued that if IQ gains did reflect intelligence increases, there would have been consequent changes of our society that have not been observed (a presumed non-occurrence of a "cultural renaissance"). By 2012 Flynn no longer endorsed this view of intelligence, having elaborated and refined his view of what rising IQ scores meant.

Precursors to Flynn's publications
Earlier investigators had discovered rises in raw IQ test scores in some study populations, but had not published general investigations of that issue in particular. Historian Daniel C. Calhoun cited earlier psychology literature on IQ score trends in his book The Intelligence of a People (1973). R. L. Thorndike drew attention to rises in Stanford-Binet scores in a 1975 review of the history of intelligence testing. In 1982, Richard Lynn recorded an increase in average IQ among the population of Japan.

Intelligence
There is debate about whether the rise in IQ scores also corresponds to a rise in general intelligence, or only a rise in special skills related to taking IQ tests. Because children attend school longer now and have become much more familiar with the testing of school-related material, one might expect the greatest gains to occur on such school content-related tests as vocabulary, arithmetic or general information. Just the opposite is the case: abilities such as these have experienced relatively small gains and even occasional decreases over the years. Meta-analytic findings indicate that Flynn effects occur for tests assessing both fluid and crystallized abilities. For example, Dutch conscripts gained 21 points during only 30 years, or 7 points per decade, between 1952 and 1982. This rise in IQ test scores is not wholly explained by an increase in general intelligence. Studies have shown that while test scores have improved over time, the improvement is not fully correlated with latent factors related to intelligence. Other researchers argue that the IQ gains described by the Flynn effect are due in part to increasing intelligence, and in part to increases in test-specific skills. One study suggested that the IQ gains reflected changes in modes of thinking that better reflected cognitive skills assessed by IQ tests rather than raw intelligence itself.

Schooling and test familiarity
The average duration of schooling has increased steadily. However, a criticism of this explanation is that, in the United States, when older and younger subjects with similar educational levels are compared, the IQ gains between them appear almost undiminished.

Many studies find that children who do not attend school score drastically lower on the tests than their regularly attending peers. During the 1960s, when some Virginia counties closed their public schools to avoid racial integration, compensatory private schooling was available only for White children. On average, the scores of African-American children who received no formal education during that period decreased at a rate of about six IQ points per year.

Another explanation is an increased familiarity of the general population with tests and testing. For example, children who take the very same IQ test a second time usually gain five or six points. However, this seems to set an upper limit on the effects of test sophistication. One problem with this explanation and others related to schooling is that in the US, the groups with greater test familiarity show smaller IQ increases.

Early intervention programs have shown mixed results. Some preschool (ages 3–4) intervention programs like "Head Start" do not produce lasting changes of IQ, although they may confer other benefits. The "Abecedarian Early Intervention Project", an all-day program that provided various forms of environmental enrichment to children from infancy onward, showed IQ gains that did not diminish over time. The IQ gain in the experimental group relative to the control group was 4.4 points. These gains persisted until at least age 21.

Citing a high correlation between rising literacy rates and gains in IQ, David Marks has argued that the Flynn effect is caused by changes in literacy rates.

Generally more stimulating environment
Still another theory is that the general environment today is much more complex and stimulating. One of the most striking 20th-century changes in the human intellectual environment has come from the increase of exposure to many types of visual media. From pictures on the wall to movies to television to video games to computers, each successive generation has been exposed to richer optical displays than the one before and may have become more adept at visual analysis. This would explain why visual tests like the Raven's have shown the greatest increases. An increase only of particular forms of intelligence would explain why the Flynn effect has not caused a "cultural renaissance too great to be overlooked."

In 2001, William Dickens and James Flynn presented a model for resolving several contradictory findings regarding IQ. They argue that the measure "heritability" includes both a direct effect of the genotype on IQ and also indirect effects such that the genotype changes the environment, thereby affecting IQ. That is, those with a greater IQ tend to seek stimulating environments that further increase IQ. These reciprocal effects result in gene–environment correlation. The direct effect could initially have been very small, but feedback can create large differences in IQ. In their model, an environmental stimulus can have a very great effect on IQ, even for adults, but this effect also decays over time unless the stimulus continues (the model could be adapted to include possible factors, like nutrition during early childhood, that may cause permanent effects). The Flynn effect can be explained by a generally more stimulating environment for all people. The authors suggest that any program designed to increase IQ may produce long-term IQ gains if that program teaches children how to replicate the types of cognitively demanding experiences that produce IQ gains outside the program. To maximize lifetime IQ, the programs should also motivate them to continue searching for cognitively demanding experiences after they have left the program.
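The feedback idea can be illustrated with a toy simulation. This is a minimal sketch and not the equations published by Dickens and Flynn: the update rule, the function `simulate`, and all parameter values are assumptions chosen only to show two qualitative behaviours described above, namely a small direct effect being amplified by feedback and a temporary stimulus whose effect decays once it stops.

```python
def simulate(direct_effect, feedback, persistence, stimulus, years, stimulus_years):
    """Toy reciprocal-effects loop (illustrative only).

    Each year the environment partly persists, partly responds to the
    current IQ deviation (people seek out matching environments), and
    receives an outside stimulus for the first `stimulus_years` years.
    The IQ deviation is the direct effect plus the environmental effect.
    """
    env = 0.0
    iq_dev = direct_effect
    path = []
    for year in range(years):
        outside = stimulus if year < stimulus_years else 0.0
        env = persistence * env + feedback * iq_dev + outside
        iq_dev = direct_effect + env
        path.append(iq_dev)
    return path

# 1) A 1-point direct effect, with no outside stimulus, grows through feedback.
amplified = simulate(direct_effect=1.0, feedback=0.4, persistence=0.5,
                     stimulus=0.0, years=200, stimulus_years=0)

# 2) A 10-year outside stimulus raises IQ, then the gain fades after it ends.
boosted = simulate(direct_effect=0.0, feedback=0.4, persistence=0.5,
                   stimulus=1.0, years=60, stimulus_years=10)

print(round(amplified[-1], 2))                   # long-run deviation from a 1-point direct effect
print(round(boosted[9], 2), round(boosted[-1], 2))  # end of stimulus vs. 50 years later
```

With these arbitrary parameters the long-run deviation is (1 - persistence) / (1 - persistence - feedback) = 5 times the direct effect, and the stimulus-driven gain shrinks by 10% a year once the stimulus is removed. The loop only converges when persistence + feedback is below 1.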

Flynn in his 2007 book What Is Intelligence? further expanded on this theory. Environmental changes resulting from modernization—such as more intellectually demanding work, greater use of technology, and smaller families—have meant that a much larger proportion of people are more accustomed to manipulating abstract concepts such as hypotheses and categories than a century ago. Substantial portions of IQ tests deal with these abilities. Flynn gives, as an example, the question 'What do a dog and a rabbit have in common?' A modern respondent might say they are both mammals (an abstract, or a priori answer, which depends only on the meanings of the words dog and rabbit), whereas someone a century ago might have said that humans catch rabbits with dogs (a concrete, or a posteriori answer, which depended on what happened to be the case at that time).

Nutrition
Improved nutrition is another possible explanation. Today's average adult from an industrialized nation is taller than a comparable adult of a century ago. That increase of stature, likely the result of general improvements in nutrition and health, has been at a rate of more than a centimeter per decade. Available data suggest that these gains have been accompanied by analogous increases in head size, and by an increase in the average size of the brain. This argument had been thought to suffer the difficulty that groups who tend to be of smaller overall body size (e.g. women, or people of Asian ancestry) do not have lower average IQs.

A 2005 study presented data supporting the nutrition hypothesis, which predicts that gains will occur predominantly at the low end of the IQ distribution, where nutritional deprivation is probably most severe. An alternative interpretation of skewed IQ gains could be that improved education has been particularly important for this group.

A century ago, nutritional deficiencies may have limited body and organ functionality, including skull volume. The first two years of life are a critical time for nutrition. The consequences of malnutrition can be irreversible and may include poor cognitive development, educability, and future economic productivity. On the other hand, Flynn has pointed to 20-point gains on Dutch military (Raven's type) IQ tests across cohorts tested in 1952, 1962, 1972, and 1982. He observed that the Dutch 18-year-olds tested in 1962 had a major nutritional handicap: they were either in the womb or recently born during the great Dutch famine of 1944, when German troops monopolized food and 18,000 people died of starvation. Yet, concludes Flynn, "they do not show up even as a blip in the pattern of Dutch IQ gains. It is as if the famine had never occurred." It appears that the effects of diet are gradual, taking effect over decades (affecting the mother as well as the child) rather than a few months.

In support of the nutritional hypothesis, it is known that, in the United States, the average height before 1900 was about 10 cm (~4 inches) shorter than it is today. Possibly related to the Flynn effect is a similar change of skull size and shape during the last 150 years. A Norwegian study found that height gains were strongly correlated with intelligence gains until the cessation of height gains in military conscript cohorts towards the end of the 1980s. Both height and skull size increases probably result from a combination of phenotypic plasticity and genetic selection over this period. With only five or six human generations in 150 years, time for natural selection has been very limited, suggesting that increased skeletal size resulting from changes in population phenotypes is more likely than recent genetic evolution.

It is well known that micronutrient deficiencies change the development of intelligence. For instance, one study has found that iodine deficiency causes a fall, on average, of 12 IQ points in China.

Scientists James Feyrer, Dimitra Politi, and David N. Weil have found that, in the U.S., the proliferation of iodized salt increased IQ by 15 points in some areas. Journalist Max Nisen has stated that, with this type of salt becoming popular, "the aggregate effect has been extremely positive."

Daley et al. (2003) found a significant Flynn effect among children in rural Kenya, and concluded that nutrition was one of the hypothesized explanations that best explained their results (the others were parental literacy and family structure).

Infectious diseases
Eppig, Fincher, and Thornhill (2011), in a study of different US states, found that states with a higher prevalence of infectious diseases had lower average IQ. The effect remained after controlling for the effects of wealth and educational variation.

Atheendar Venkataramani (2010) studied the effect of malaria on IQ in a sample of Mexicans. Malaria eradication during the birth year was associated with increases in IQ. It also increased the probability of employment in a skilled occupation. The author suggests that this may be one explanation for the Flynn effect and that this may be an important explanation for the link between national malaria burden and economic development. A literature review of 44 papers states that cognitive abilities and school performance were shown to be impaired in sub-groups of patients (with either cerebral malaria or uncomplicated malaria) when compared with healthy controls. Studies comparing cognitive functions before and after treatment for acute malarial illness continued to show significantly impaired school performance and cognitive abilities even after recovery. Malaria prophylaxis was shown to improve cognitive function and school performance in clinical trials when compared to placebo groups.

Heterosis
Heterosis, or hybrid vigor associated with historical reductions of the levels of inbreeding, has been proposed by Michael Mingroni as an alternative explanation of the Flynn effect. However, James Flynn has pointed out that even if everyone mated with a sibling in 1900, subsequent increases in heterosis would not be a sufficient explanation of the observed IQ gains.

Reduction of lead in gasoline
One study found that the drop in blood lead levels in the United States from the 1970s to 2007 correlated with a 4–5-point increase in IQ.

Possible end of progression
Jon Martin Sundet and colleagues (2004) examined scores on intelligence tests given to Norwegian conscripts between the 1950s and 2002. They found that the increase of scores of general intelligence stopped after the mid-1990s and declined in numerical reasoning sub-tests.

Teasdale and Owen (2005) examined the results of IQ tests given to Danish male conscripts. Between 1959 and 1979 the gains were 3 points per decade. Between 1979 and 1989 the increase approached 2 IQ points. Between 1989 and 1998 the gain was about 1.3 points. Between 1998 and 2004 IQ declined by about the same amount as it gained between 1989 and 1998. They speculate that "a contributing factor in this recent fall could be a simultaneous decline in proportions of students entering 3-year advanced-level school programs for 16–18-year-olds." The same authors in a more comprehensive 2008 study, again on Danish male conscripts, found that there was a 1.5-point increase between 1988 and 1998, but a 1.5-point decrease between 1998 and 2003/2004.
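Because the periods in the 2005 study differ in length, the slowdown is easier to see when each change is expressed per decade. The sketch below uses only the figures quoted above; the 1959–1979 total is derived from 3 points per decade over two decades, and "approached 2 IQ points" is treated as approximately 2.0, which is a simplifying assumption.

```python
# (start_year, end_year, approximate total change in IQ points)
periods = [
    (1959, 1979, 6.0),   # 3 points per decade over two decades
    (1979, 1989, 2.0),   # "approached 2 IQ points" (rounded, assumption)
    (1989, 1998, 1.3),
    (1998, 2004, -1.3),  # decline about equal to the previous gain
]

def per_decade(start, end, change):
    """Express a total change over [start, end] as points per decade."""
    return change / (end - start) * 10

rates = [round(per_decade(*period), 2) for period in periods]
net_change = sum(change for _, _, change in periods)

for (start, end, _), rate in zip(periods, rates):
    print(f"{start}-{end}: {rate:+.2f} points per decade")
print(f"net change 1959-2004: {net_change:+.1f} points")
```

Expressed this way, the rate of gain falls in each successive period and turns negative after 1998, although scores in 2004 remain well above their 1959 level.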

In Australia, the IQ of 6–12-year-olds as measured by the Colored Progressive Matrices showed no increase from 1975 to 2003.

In the United Kingdom, a study by Flynn (2009) found that tests carried out in 1980 and again in 2008 show that the IQ score of an average 14-year-old dropped by more than two points over the period. For the upper half of the results, the performance was even worse: average IQ scores declined by six points. However, children aged between five and 10 saw their IQs increase by up to half a point a year over the three decades. Flynn argues that the abnormal drop in British teenage IQ could be due to youth culture having "stagnated" or even dumbed down. Researcher Richard House, commenting on the study, also points to computer culture displacing book reading, as well as a tendency towards teaching to the test.

Bratsberg & Rogeberg (2018) present evidence that the Flynn effect in Norway reversed among cohorts born between 1962 and 1991, and that both the original rise in mean IQ scores and their subsequent decline within this period can be observed within families consisting of native-born parents and their children, indicating that environmental factors were the likely cause of these changes. Because IQ data was only available for male Norwegians, who were subject to military conscription, years of schooling were used as an approximation for female IQ to support this conclusion.

One possible explanation of a worldwide decline in intelligence, suggested by the World Health Organization and the Forum of International Respiratory Societies' Environmental Committee, is an increase in air pollution, which now affects over 90% of the world's population.

IQ group differences
If the Flynn effect has ended in developed nations but continues in less developed ones, this would tend to diminish national differences in IQ scores.

Also, if the Flynn effect has ended for the majority in developed nations, it may still continue for minorities, especially for groups like immigrants where many may have received poor nutrition during early childhood or have had other disadvantages. A study in the Netherlands found that children of non-Western immigrants had improvements for g, educational achievements, and work proficiency compared to their parents, although there were still remaining differences compared to ethnic Dutch.

In the United States, the IQ gap between black and white people was gradually closing over the last decades of the 20th century, as black test-takers increased their average scores relative to white test-takers. For instance, Vincent reported in 1991 that the black–white IQ gap was decreasing among children, but that it was remaining constant among adults. Similarly, a 2006 study by Dickens and Flynn estimated that the difference between mean scores of black people and white people closed by about 5 or 6 IQ points between 1972 and 2002, a reduction of about one-third. In the same period, the educational achievement disparity also diminished. Reviews by Flynn and Dickens, Mackintosh, and Nisbett et al. all concluded that the gradual closing of the gap was a real phenomenon.

Flynn has commented that he never claimed that the Flynn effect has the same causes as observed differences in average IQ test performance between black and white people, but that it shows that environmental factors can create IQ differences of a magnitude similar to that gap. Wicherts et al. had previously suggested a similar interpretation in a 2004 paper. Flynn also argued that his findings undermine the so-called Spearman's hypothesis, which holds that differences in the g factor are the major driver of the black–white IQ gap.