Index Thomisticus

The Index Thomisticus was a digital humanities project begun in the 1940s that created a concordance to 179 texts centering around Thomas Aquinas. Led by Roberto Busa, the project indexed 10,631,980 words over the course of 34 years, initially onto punched cards. It is considered a pioneering project in the field of digital humanities.

Project
Busa began the project in 1946. In 1949, IBM agreed to sponsor the project until its completion and assigned Paul Tasman, one of its executives, to work with Busa. Busa selected 179 texts centering around Thomas Aquinas to be converted into machine-readable form. Of these, 118 were written by Aquinas; the remaining 61 were either at one point misattributed to him or were attempts to complete a work that Aquinas had left unfinished.

A significant part of the project was data entry, which was carried out by a team of female keypunch operators whose precision was instrumental to the project's success. The text was punched between 1950 and 1966. The operators worked in Gallarate, Italy, and the project peaked in size in 1962 with 70 workers. After the punching was complete, the data was lemmatised in a semi-automatic process.
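The idea of a lemmatised concordance can be illustrated with a minimal sketch. It is not a reconstruction of the project's actual procedure, which ran on punched cards and tape and relied on human review. The lemma table, the two-line sample text, and the function name `build_concordance` are all invented for illustration.

```python
import re
from collections import defaultdict

# Hypothetical lemma table: inflected form -> dictionary headword.
# A real table would be far larger and curated by hand.
LEMMAS = {"est": "sum", "sunt": "sum", "deus": "deus", "dei": "deus"}


def build_concordance(lines, lemmas):
    """Map each lemma to the (line_number, line_text) pairs where it occurs."""
    concordance = defaultdict(list)
    for number, text in enumerate(lines, start=1):
        # Lowercase the line and split it into alphabetic tokens.
        for token in re.findall(r"[a-z]+", text.lower()):
            # Words missing from the table are filed under their own form.
            lemma = lemmas.get(token, token)
            concordance[lemma].append((number, text))
    return dict(concordance)


# Toy sample text, made up for this example.
SAMPLE = ["Deus est", "Omnia sunt dei"]
CONCORDANCE = build_concordance(SAMPLE, LEMMAS)
```

In this sketch, looking up `CONCORDANCE["sum"]` returns both sample lines, because "est" and "sunt" are filed under the same headword. Grouping inflected forms in this way is what distinguishes a lemmatised concordance from a plain word index.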

The completed project indexed a total of 10,631,980 words in fifty-six volumes totalling over 70,000 pages: ten volumes of indexes, thirty-one volumes of concordances of Aquinas's works, eight volumes of concordances of related authors, and seven volumes that reprinted the source texts. The seven volumes reprinting the source texts were sold separately. The first volume was published in 1974, and publication was completed in 1980. The project used a total of 1,500 km of tape and took an estimated 10,000 hours of computer work and 1 million hours of human work to complete. The Index was released on CD-ROM in 1992, and a website was launched in 2005.

Reception, impact, and legacy
A review of the project published in Computers and the Humanities described it as "as innovative and fascinating a reference work as the technology that made it possible." A 1993 review described the project as the "second largest printed work of this century"; the same review called it "excessive" and asked what its purpose was, going on to describe it as "the most pedantic work ever written". In 2020, The Economist described it as "the creation story of the digital humanities." An article in Umanistica Digitale stated that "the project developed for the first time, methods for dealing with unstructured language". The project influenced later work such as Key Word in Context, and it is sometimes listed as one of the earliest instances of an e-book.