AI winter

In the history of artificial intelligence, an AI winter is a period of reduced funding and interest in artificial intelligence research. The field has experienced several hype cycles, followed by disappointment and criticism, followed by funding cuts, followed by renewed interest years or even decades later.

The term first appeared in 1984 as the topic of a public debate at the annual meeting of AAAI (then called the "American Association for Artificial Intelligence"). Roger Schank and Marvin Minsky, two leading AI researchers who had experienced the "winter" of the 1970s, warned the business community that enthusiasm for AI had spiraled out of control in the 1980s and that disappointment would certainly follow. They described a chain reaction, similar to a "nuclear winter", that would begin with pessimism in the AI community, followed by pessimism in the press, followed by a severe cutback in funding, followed by the end of serious research. Three years later the billion-dollar AI industry began to collapse.

There were two major winters, approximately 1974–1980 and 1987–2000, and several smaller episodes, including the following:


 * 1966: failure of machine translation
 * 1969: criticism of perceptrons (early, single-layer artificial neural networks)
 * 1971–75: DARPA's frustration with the Speech Understanding Research program at Carnegie Mellon University
 * 1973: large decrease in AI research in the United Kingdom in response to the Lighthill report
 * 1973–74: DARPA's cutbacks to academic AI research in general
 * 1987: collapse of the LISP machine market
 * 1988: cancellation of new spending on AI by the Strategic Computing Initiative
 * 1990s: many expert systems were abandoned
 * 1990s: end of the Fifth Generation computer project's original goals

Enthusiasm and optimism about AI have generally increased since their low point in the early 1990s. Beginning about 2012, interest in artificial intelligence (and especially the sub-field of machine learning) from the research and corporate communities led to a dramatic increase in funding and investment, leading to the current AI boom.

Machine translation and the ALPAC report of 1966
Natural language processing (NLP) research has its roots in the early 1930s and began its existence with the work on machine translation (MT). However, significant advancements and applications began to emerge after the publication of Warren Weaver's influential memorandum in 1949. The memorandum generated great excitement within the research community. In the following years, notable events unfolded: IBM embarked on the development of the first machine, MIT appointed its first full-time professor in machine translation, and several conferences dedicated to MT took place. The culmination came with the public demonstration of the IBM-Georgetown machine, which garnered widespread attention in respected newspapers in 1954.

As in every AI boom that was later followed by an AI winter, the media tended to exaggerate the significance of these developments. Headlines about the IBM-Georgetown experiment included phrases like "The bilingual machine," "Robot brain translates Russian into King's English," and "Polyglot brainchild." However, the actual demonstration involved the translation of a curated set of only 49 Russian sentences into English, with the machine's vocabulary limited to just 250 words. To put things into perspective, a 2006 study by Paul Nation found that humans need a vocabulary of around 8,000 to 9,000 word families to comprehend written texts with 98% accuracy.

During the Cold War, the US government was particularly interested in the automatic, instant translation of Russian documents and scientific reports. The government aggressively supported efforts at machine translation starting in 1954. Another factor that propelled the field of mechanical translation was the interest shown by the Central Intelligence Agency (CIA). During that period, the CIA firmly believed in the importance of developing machine translation capabilities and supported such initiatives. They also recognized that this program had implications that extended beyond the interests of the CIA and the intelligence community.

At the outset, the researchers were optimistic. Noam Chomsky's new work in grammar was streamlining the translation process and there were "many predictions of imminent 'breakthroughs'". However, researchers had underestimated the profound difficulty of word-sense disambiguation. In order to translate a sentence, a machine needed to have some idea what the sentence was about; otherwise it made mistakes. An apocryphal example is "the spirit is willing but the flesh is weak." Translated back and forth with Russian, it became "the vodka is good but the meat is rotten." Later researchers would call this the commonsense knowledge problem.

By 1964, the National Research Council had become concerned about the lack of progress and formed the Automatic Language Processing Advisory Committee (ALPAC) to look into the problem. They concluded, in a famous 1966 report, that machine translation was more expensive, less accurate and slower than human translation. After spending some 20 million dollars, the NRC ended all support. Careers were destroyed and research ended.

Machine translation followed the same path as NLP, from rule-based approaches through statistical approaches to neural network approaches, which by 2023 had culminated in large language models.

The failure of single-layer neural networks in 1969
Simple networks or circuits of connected units, including Walter Pitts and Warren McCulloch's neural network for logic and Marvin Minsky's SNARC system, failed to deliver the promised results and were abandoned in the late 1950s. Following the success of programs such as the Logic Theorist and the General Problem Solver, algorithms for manipulating symbols seemed more promising as a means to achieve logical reasoning, which was viewed at the time as the essence of intelligence, either natural or artificial.

Interest in perceptrons, invented by Frank Rosenblatt, was kept alive only by the sheer force of his personality. He optimistically predicted that the perceptron "may eventually be able to learn, make decisions, and translate languages". Mainstream research into perceptrons ended partially because the 1969 book Perceptrons by Marvin Minsky and Seymour Papert emphasized the limits of what perceptrons could do. While it was already known that multilayered perceptrons were not subject to this criticism, nobody in the 1960s knew how to train a multilayered perceptron. Backpropagation was still years away.
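A commonly cited example of such a limit is the exclusive-or (XOR) function, which no single-layer perceptron can represent because its two classes cannot be separated by a straight line. The following is a minimal sketch in Python, not code from the book or from Rosenblatt's work; the function name train_perceptron and its parameters are illustrative assumptions. It applies the classic perceptron learning rule to AND (linearly separable) and to XOR (not linearly separable).

```python
from itertools import product


def train_perceptron(samples, epochs=100, lr=1.0):
    """Train a single-layer, two-input perceptron with the classic learning rule.

    Returns (weights, bias, converged). `converged` is True only if a full
    pass over the samples produced no misclassification.
    All names here are hypothetical and chosen for illustration.
    """
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        errors = 0
        for (x1, x2), target in samples:
            predicted = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            delta = target - predicted
            if delta != 0:
                # Move the decision line toward the misclassified point.
                w[0] += lr * delta * x1
                w[1] += lr * delta * x2
                b += lr * delta
                errors += 1
        if errors == 0:
            return w, b, True
    return w, b, False


inputs = list(product([0, 1], repeat=2))
AND = [((a, b), a & b) for a, b in inputs]
XOR = [((a, b), a ^ b) for a, b in inputs]

_, _, and_converged = train_perceptron(AND)
_, _, xor_converged = train_perceptron(XOR)

print("AND learned:", and_converged)
print("XOR learned:", xor_converged)
```

Under these assumptions the AND run settles on a separating line after a few passes, while the XOR run never completes an error-free pass however many epochs are allowed. A network with a hidden layer can represent XOR, but training it requires a method such as backpropagation.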

Major funding for projects using neural network approaches was difficult to find in the 1970s and early 1980s. Important theoretical work continued despite the lack of funding. The "winter" of the neural network approach came to an end in the mid-1980s, when the work of John Hopfield, David Rumelhart and others revived large-scale interest. Rosenblatt did not live to see this, however, as he died in a boating accident shortly after Perceptrons was published.

The Lighthill report
In 1973, Professor Sir James Lighthill was asked by the UK Parliament to evaluate the state of AI research in the United Kingdom. His report, now called the Lighthill report, criticized the utter failure of AI to achieve its "grandiose objectives". He concluded that nothing being done in AI could not be done in other sciences. He specifically mentioned the problem of "combinatorial explosion" or "intractability", which implied that many of AI's most successful algorithms would grind to a halt on real-world problems and were only suitable for solving "toy" versions.
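The scale of a combinatorial explosion can be seen in a small calculation. The sketch below is a hypothetical illustration, not an example taken from the Lighthill report: it counts the candidate routes that an exhaustive search would have to examine to find the shortest round trip through a set of cities. The function name brute_force_tours is an assumption made for this example.

```python
import math


def brute_force_tours(n_cities):
    """Count distinct round trips through n_cities (n_cities >= 3).

    The start city is fixed and a route and its reverse count as one tour,
    giving (n - 1)! / 2 candidates for an exhaustive search.
    """
    return math.factorial(n_cities - 1) // 2


for n in (5, 10, 15, 20):
    print(f"{n:>2} cities: {brute_force_tours(n):,} candidate tours")
```

A five-city "toy" problem has 12 candidate tours, and ten cities already have 181,440; at twenty cities the count exceeds 10^16, so a search that simply enumerates candidates stops being practical long before the problem reaches real-world size.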

The report was contested in a debate broadcast in the BBC "Controversy" series in 1973. The debate "The general purpose robot is a mirage" from the Royal Institution was Lighthill versus the team of Donald Michie, John McCarthy and Richard Gregory. McCarthy later wrote that "the combinatorial explosion problem has been recognized in AI from the beginning".

The report led to the near-complete dismantling of AI research in the UK. AI research continued in only a few universities (Edinburgh, Essex and Sussex). Research would not revive on a large scale until 1983, when Alvey (a research project of the British Government) began to fund AI again from a war chest of £350 million in response to the Japanese Fifth Generation Project (see below). Alvey had a number of UK-only requirements which did not sit well internationally, especially with US partners, and lost Phase 2 funding.

DARPA's early 1970s funding cuts
During the 1960s, the Defense Advanced Research Projects Agency (then known as "ARPA", now known as "DARPA") provided millions of dollars for AI research with few strings attached. J. C. R. Licklider, the founding director of DARPA's computing division, believed in "funding people, not projects" and he and several successors allowed AI's leaders (such as Marvin Minsky, John McCarthy, Herbert A. Simon or Allen Newell) to spend it almost any way they liked.

This attitude changed after the passage of the Mansfield Amendment in 1969, which required DARPA to fund "mission-oriented direct research, rather than basic undirected research". Pure undirected research of the kind that had gone on in the 1960s would no longer be funded by DARPA. Researchers now had to show that their work would soon produce some useful military technology. AI research proposals were held to a very high standard. The situation was not helped when the Lighthill report and DARPA's own study (the American Study Group) suggested that most AI research was unlikely to produce anything truly useful in the foreseeable future. DARPA's money was directed at specific projects with identifiable goals, such as autonomous tanks and battle management systems. By 1974, funding for AI projects was hard to find.

AI researcher Hans Moravec blamed the crisis on the unrealistic predictions of his colleagues: "Many researchers were caught up in a web of increasing exaggeration. Their initial promises to DARPA had been much too optimistic. Of course, what they delivered stopped considerably short of that. But they felt they couldn't in their next proposal promise less than in the first one, so they promised more." The result, Moravec claimed, was that some of the staff at DARPA lost patience with AI research. "It was literally phrased at DARPA that 'some of these people were going to be taught a lesson [by] having their two-million-dollar-a-year contracts cut to almost nothing!'" Moravec told Daniel Crevier.

While the autonomous tank project was a failure, the battle management system (the Dynamic Analysis and Replanning Tool) proved to be enormously successful, saving billions in the first Gulf War, repaying all of DARPA's investment in AI and justifying DARPA's pragmatic policy.

The SUR debacle
One account describes the program as follows:

"In 1971, the Defense Advanced Research Projects Agency (DARPA) began an ambitious five-year experiment in speech understanding. The goals of the project were to provide recognition of utterances from a limited vocabulary in near-real time. Three organizations finally demonstrated systems at the conclusion of the project in 1976. These were Carnegie-Mellon University (CMU), who actually demonstrated two systems [HEARSAY-II and HARPY]; Bolt, Beranek and Newman (BBN); and System Development Corporation with Stanford Research Institute (SDC/SRI)."

"The system that came closest to satisfying the original project goals was the CMU HARPY system. The relatively high performance of the HARPY system was largely achieved through 'hard-wiring' information about possible utterances into the system's knowledge base. Although HARPY made some interesting contributions, its dependence on extensive pre-knowledge limited the applicability of the approach to other signal-understanding tasks."

DARPA was deeply disappointed with researchers working on the Speech Understanding Research program at Carnegie Mellon University. DARPA had hoped for, and felt it had been promised, a system that could respond to voice commands from a pilot. The SUR team had developed a system which could recognize spoken English, but only if the words were spoken in a particular order. DARPA felt it had been duped and, in 1974, it cancelled a three-million-dollar-a-year contract.

Many years later, several successful commercial speech recognition systems would use the technology developed by the Carnegie Mellon team (such as hidden Markov models) and the market for speech recognition systems would reach $4 billion by 2001.

For a description of Hearsay-II, see Hearsay-II, "The Hearsay-II Speech Understanding System: Integrating Knowledge to Resolve Uncertainty", and "A Retrospective View of the Hearsay-II Architecture", which appear in Blackboard Systems.

Reddy gives a review of progress in speech understanding at the end of the DARPA project in a 1976 article in Proceedings of the IEEE.

Contrary view
Thomas Haigh argues that activity in the domain of AI did not slow down, even as funding from the DoD was being redirected, mostly in the wake of congressional legislation meant to separate military and academic activities. He contends that professional interest was in fact growing throughout the 1970s. Using the membership count of ACM's SIGART, the Special Interest Group on Artificial Intelligence, as a proxy for interest in the subject, the author writes: "(...) I located two data sources, neither of which supports the idea of a broadly based AI winter during the 1970s. One is membership of ACM's SIGART, the major venue for sharing news and research abstracts during the 1970s. When the Lighthill report was published in 1973 the fast-growing group had 1,241 members, approximately twice the level in 1969. The next five years are conventionally thought of as the darkest part of the first AI winter. Was the AI community shrinking? No! By mid-1978 SIGART membership had almost tripled, to 3,500. Not only was the group growing faster than ever, it was increasing proportionally faster than ACM as a whole which had begun to plateau (expanding by less than 50% over the entire period from 1969 to 1978). One in every 11 ACM members was in SIGART."

The collapse of the LISP machine market
In the 1980s, a form of AI program called an "expert system" was adopted by corporations around the world. The first commercial expert system was XCON, developed at Carnegie Mellon for Digital Equipment Corporation, and it was an enormous success: it was estimated to have saved the company 40 million dollars over just six years of operation. Corporations around the world began to develop and deploy expert systems, and by 1985 they were spending over a billion dollars on AI, most of it on in-house AI departments. An industry grew up to support them, including software companies like Teknowledge and Intellicorp (KEE), and hardware companies like Symbolics and LISP Machines Inc., which built specialized computers, called LISP machines, that were optimized to process the programming language LISP, the preferred language for AI research in the USA.

In 1987, three years after Minsky and Schank's prediction, the market for specialized LISP-based AI hardware collapsed. Workstations by companies like Sun Microsystems offered a powerful alternative to LISP machines, and companies like Lucid offered a LISP environment for this new class of workstations. The performance of these general workstations became an increasingly difficult challenge for LISP machines. Companies like Lucid and Franz Inc. offered increasingly powerful versions of LISP that were portable to all UNIX systems. For example, benchmarks were published showing workstations maintaining a performance advantage over LISP machines. Later, desktop computers built by Apple and IBM would also offer a simpler and more popular architecture on which to run LISP applications. By 1987, some of them had become as powerful as the more expensive LISP machines. The desktop computers had rule-based engines such as CLIPS available. These alternatives left consumers with no reason to buy an expensive machine specialized for running LISP. An entire industry worth half a billion dollars was replaced in a single year.

By the early 1990s, most commercial LISP companies had failed, including Symbolics, LISP Machines Inc. and Lucid Inc. Other companies, like Texas Instruments and Xerox, abandoned the field. A small number of customer companies (that is, companies using systems written in LISP and developed on LISP machine platforms) continued to maintain systems. In some cases, this maintenance involved the assumption of the resulting support work.

Slowdown in deployment of expert systems
By the early 1990s, the earliest successful expert systems, such as XCON, proved too expensive to maintain. They were difficult to update, they could not learn, they were "brittle" (i.e., they could make grotesque mistakes when given unusual inputs), and they fell prey to problems (such as the qualification problem) that had been identified years earlier in research in nonmonotonic logic. Expert systems proved useful, but only in a few special contexts. Another problem was the computational hardness of truth maintenance for general knowledge. KEE used an assumption-based approach supporting multiple-world scenarios that was difficult to understand and apply.

The few remaining expert system shell companies were eventually forced to downsize and search for new markets and software paradigms, like case-based reasoning or universal database access. The maturation of Common Lisp saved many systems, such as ICAD, which found application in knowledge-based engineering. Other systems, such as Intellicorp's KEE, moved from LISP to a C++ variant on the PC and helped establish object-oriented technology (including providing major support for the development of UML; see UML Partners).

The end of the Fifth Generation project
In 1981, the Japanese Ministry of International Trade and Industry set aside $850 million for the Fifth Generation computer project. Their objectives were to write programs and build machines that could carry on conversations, translate languages, interpret pictures, and reason like human beings. By 1991, the impressive list of goals penned in 1981 had not been met. According to HP Newquist in The Brain Makers, "On June 1, 1992, The Fifth Generation Project ended not with a successful roar, but with a whimper." As with other AI projects, expectations had run much higher than what was actually possible.

Strategic Computing Initiative cutbacks
In 1983, in response to the Fifth Generation project, DARPA again began to fund AI research through the Strategic Computing Initiative. As originally proposed, the project would begin with practical, achievable goals, which even included artificial general intelligence as a long-term objective. The program was under the direction of the Information Processing Technology Office (IPTO) and was also directed at supercomputing and microelectronics. By 1985 it had spent $100 million and 92 projects were underway at 60 institutions, half in industry, half in universities and government labs. AI research was well-funded by the SCI.

Jack Schwartz, who ascended to the leadership of IPTO in 1987, dismissed expert systems as "clever programming" and cut funding to AI "deeply and brutally", "eviscerating" SCI. Schwartz felt that DARPA should focus its funding only on those technologies which showed the most promise; in his words, DARPA should "surf", rather than "dog paddle", and he felt strongly that AI was not "the next wave". Insiders in the program cited problems in communication, organization and integration. A few projects survived the funding cuts, including a pilot's assistant and an autonomous land vehicle (which were never delivered) and the DART battle management system, which (as noted above) was successful.

AI winter of the 1990s and early 2000s
A survey of reports from the early 2000s suggests that AI's reputation was still poor:


 * Alex Castro, quoted in The Economist, 7 June 2007: "[Investors] were put off by the term 'voice recognition' which, like 'artificial intelligence', is associated with systems that have all too often failed to live up to their promises."
 * Patty Tascarella in Pittsburgh Business Times, 2006: "Some believe the word 'robotics' actually carries a stigma that hurts a company's chances at funding."
 * John Markoff in the New York Times, 2005: "At its low point, some computer scientists and software engineers avoided the term artificial intelligence for fear of being viewed as wild-eyed dreamers."

Many researchers in AI in the mid-2000s deliberately called their work by other names, such as informatics, machine learning, analytics, knowledge-based systems, business rules management, cognitive systems, intelligent systems, intelligent agents or computational intelligence, to indicate that their work emphasized particular tools or was directed at a particular sub-problem. Although this may have been partly because they considered their field to be fundamentally different from AI, it is also true that the new names helped to procure funding by avoiding the stigma of false promises attached to the name "artificial intelligence".

In the late 1990s and early 21st century, AI technology became widely used as an element of larger systems, but the field was rarely credited for these successes. In 2006, Nick Bostrom explained that "a lot of cutting edge AI has filtered into general applications, often without being called AI because once something becomes useful enough and common enough it's not labeled AI anymore." Rodney Brooks stated around the same time that "there's this stupid myth out there that AI has failed, but AI is around you every second of the day."

AI spring (2015-2020) followed by a short winter (2020-2022)
There was an unprecedented burst of research activity in AI from 2015 to 2020 (see Figure). Publications mentioning the keyword "artificial intelligence" increased from 169,000 in 2014 to 590,000 in 2019, with double-digit year-over-year growth rates that rose from 12% in 2016 to 45% in 2018 and exceeded 30% in each year from 2017 to 2019. A closely parallel trend is reflected in publications mentioning the keywords "machine learning" and "neural network". This growth was unprecedented in both numbers and impact, having a revolutionizing effect on most AI applications.

The explosion in the worldwide availability of cheap, dynamically scalable, large-scale cloud computing may have fueled this boom in AI research publications. The availability of cheap cloud computing meant that even college students could deploy and experiment with large AI models.

The explosive boom from 2015 to 2020 was followed by an equally spectacular decline, with the number of publications falling by more than 33% year on year from 2021 to 2023. This was only the second instance in the history of AI research when publications fell by more than 20%, the first one occurring during the first AI winter.

The causes of this last "AI winter" are not completely clear. A portion of it could potentially be attributed to the extraordinary circumstances created by the COVID-19 pandemic, which brought about a widespread re-orientation of research focus and funding across academic domains, including AI. However, a significant part of it was probably only the correctional slowdown (regression towards the mean) after the unprecedented explosive growth from 2015 to 2020.

Current AI spring (2022-present)
AI has reached the highest levels of interest and funding in its history in the past few years by every possible measure, including publications, patent applications, total investment ($50 billion in 2022), and job openings (800,000 U.S. job openings in 2022). The successes of the current "AI spring" or "AI boom" include advances in language translation (in particular, Google Translate), image recognition (spurred by the ImageNet training database) as commercialized by Google Image Search, and game-playing systems such as AlphaZero (chess champion), AlphaGo (go champion), and Watson (Jeopardy champion). A turning point came in 2012, when AlexNet (a deep learning network) won the ImageNet Large Scale Visual Recognition Challenge with half as many errors as the second-place winner.

The 2022 release of OpenAI's AI chatbot ChatGPT, which as of January 2023 had over 100 million users, has reinvigorated the discussion about artificial intelligence and its effects on the world.