Venice Time Machine

The Venice Time Machine is a large international project launched by the École Polytechnique Fédérale de Lausanne (EPFL) and the Ca' Foscari University of Venice in 2012. It aims to build a collaborative multidimensional model of Venice by creating an open digital archive of the city's cultural heritage covering more than 1,000 years of evolution. The project aims to trace the circulation of news, money and commercial goods, as well as migration and artistic and architectural patterns, among others, to create a Big Data of the Past. If completed, it would be the largest database ever created from Venetian documents. The project is an example of Digital Humanities, the new area of scholarly activity that has emerged in the Digital Age.

The project's widespread critical acclaim led to the submission of a European counterpart proposal to the European Commission in April 2016. The Venice Time Machine forms the technological basis of the proposed European Time Machine.

The first full reconstruction of Venice, showing the evolution of the city between 900 and 2000, was presented at the Venice Biennale of Architecture in 2018. The Venice Time Machine model of the city in 1750 was also used for an exhibition at the Grand Palais in Paris in September 2018.

Organisation and funding
The Venice Time Machine project was launched by EPFL and the Ca' Foscari University of Venice in 2012. It includes collaboration from major Venetian patrimonial institutions: the State Archive in Venice, the Marciana Library, the Istituto Veneto and the Cini Foundation. The project is currently supported by the READ (Recognition and Enrichment of Archival Documents) European project, the SNF project Linked Books and the ANR-SNF project GAWS. The international board includes renowned scholars from Stanford, Columbia, Princeton, and Oxford. In 2014, the Lombard Odier Foundation joined the Venice Time Machine project as a financial partner.

Technology and tools
The State Archives of Venice contain a massive amount of hand-written documentation in languages evolving from medieval times to the 20th century. An estimated 80 km of shelves are filled with over a thousand years of administrative documents, ranging from birth registrations, death certificates and tax statements to maps and urban planning designs. These documents are often very delicate, and some are in a fragile state of conservation. The diversity, amount and accuracy of the Venetian administrative documents are unique in Western history. By combining this mass of information, it is possible to reconstruct large segments of the city's past: complete biographies, political dynamics, or even the appearance of buildings and entire neighborhoods.

Scanning
Paper documents are turned into high-resolution digital images with the help of scanning machines. Different types of documents impose various constraints on the type of scanning machines that can be used and on the speed at which a document can be scanned. In partnership with industry, EPFL is working on a semi-automatic, robotic scanning unit capable of digitizing about 1000 pages per hour. Multiple units of this kind will be built to create an efficient digitization pipeline adapted to ancient documents. Another solution currently being explored at EPFL involves scanning books without turning the pages at all. This technique uses X-ray synchrotron radiation produced by a particle accelerator.

Transcription
The graphical complexity and diversity of hand-written documents make transcription a daunting task. For the Venice Time Machine, scientists are currently developing novel algorithms that can transform images into probable words. The images are automatically broken down into sub-images that potentially represent words. Each sub-image is compared to other sub-images and classified according to the shape of the word it features. Each time a new word is transcribed, it allows millions of other occurrences of that word shape in the database to be recognized.
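The idea of grouping sub-images by shape and then spreading one transcription to the whole group can be illustrated with a minimal sketch. It is not the project's algorithm: the greedy clustering rule, the function names `cluster_by_shape` and `propagate_labels`, the two-number shape descriptors and the sample word are all assumptions made for illustration.

```python
from math import dist

def cluster_by_shape(features, threshold):
    """Greedy clustering: an item joins the first cluster whose first
    member lies within `threshold`; otherwise it starts a new cluster."""
    clusters = []  # each cluster is a list of item indices
    for i, vec in enumerate(features):
        for members in clusters:
            if dist(vec, features[members[0]]) <= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters

def propagate_labels(clusters, known):
    """Copy a known transcription to every member of its cluster.
    `known` maps an item index to a transcription."""
    labels = {}
    for members in clusters:
        label = next((known[i] for i in members if i in known), None)
        if label is not None:
            for i in members:
                labels[i] = label
    return labels

# Made-up shape descriptors standing in for real image features.
features = [(0.10, 0.90), (0.12, 0.88), (0.80, 0.20), (0.11, 0.91), (0.79, 0.22)]
clusters = cluster_by_shape(features, threshold=0.05)

# A single manual transcription (hypothetical word) labels its whole cluster.
labels = propagate_labels(clusters, known={0: "ducato"})
print(clusters)
print(labels)
```

In this toy run, transcribing item 0 also labels items 1 and 3, while the second cluster stays unlabeled until one of its members is transcribed.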

Text processing
The strings of probable words are then turned into possible sentences by a text processor. This step is accomplished by using, among other tools, algorithms inspired by protein structure analysis that can identify recurring patterns.
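The source does not name the algorithms used. As one possible illustration of pattern finding borrowed from biological sequence analysis, the sketch below applies Smith-Waterman local alignment (a standard bioinformatics technique, swapped in here as an assumption) to word sequences instead of amino acids. The function name, scoring values and sample phrases are hypothetical.

```python
def local_alignment(a, b, match=2, mismatch=-1, gap=-1):
    """Smith-Waterman local alignment over two token sequences.
    Returns (best score, aligned slice of a, aligned slice of b)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the score matrix; scores never drop below zero (local alignment).
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(0, diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until the score reaches zero.
    i, j = best_pos
    end_i, end_j = i, j
    while i > 0 and j > 0 and score[i][j] > 0:
        diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if score[i][j] == diag:
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    return best, a[i:end_i], b[j:end_j]

# Made-up word sequences standing in for two transcribed lines.
line_a = "in nomine domini amen anno".split()
line_b = "actum in nomine domini venetiis".split()
best, part_a, part_b = local_alignment(line_a, line_b)
print(best, part_a, part_b)
```

The shared run of words is recovered even though it sits at different positions in the two lines, which is the property that makes alignment useful for spotting recurring formulas.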

Connecting data
The real wealth of the Venetian archives lies in the connectedness of their documentation. Keywords shared between different types of documents link them together, which makes the data searchable. Linking the keywords found in sentences organizes the information into giant graphs of interconnected data, making it possible to cross-reference vast amounts of data and allowing new aspects of information to emerge.
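A keyword graph of this kind can be sketched in a few lines. This is an illustration under stated assumptions, not the project's data model: the record identifiers, the keywords and the function names `build_index` and `connected_documents` are all invented for the example.

```python
from collections import defaultdict, deque

def build_index(records):
    """Map each keyword to the set of documents that mention it."""
    index = defaultdict(set)
    for doc_id, keywords in records.items():
        for kw in keywords:
            index[kw].add(doc_id)
    return index

def connected_documents(records, index, start):
    """Breadth-first search over documents, hopping via shared keywords."""
    seen = {start}
    queue = deque([start])
    while queue:
        doc = queue.popleft()
        for kw in records[doc]:
            for other in index[kw]:
                if other not in seen:
                    seen.add(other)
                    queue.append(other)
    return seen

# Invented records: document id -> keywords extracted from it.
records = {
    "tax_1740": {"Contarini", "San Polo"},
    "birth_1712": {"Contarini"},
    "map_1750": {"San Polo", "Rialto"},
    "will_1601": {"Morosini"},
}
index = build_index(records)
linked = connected_documents(records, index, "birth_1712")
print(sorted(linked))
```

Starting from a birth record, the search reaches a tax statement through a shared family name and then a map through a shared place name, which is the kind of cross-referencing between document types described above.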

The Digital Humanities Laboratory of EPFL announced on 1 March 2016 the development of REPLICA, a new search engine for the study and enhanced use of the Venetian cultural heritage to be online by the end of 2016.

Praise

 * Interdisciplinarity and internationalism. Major Venetian patrimonial institutions, academic institutions and professors from different disciplines and institutions across the world are collaborating on this collective effort. According to the Venice Time Machine page, three hundred researchers and students from different disciplines (Natural Sciences, Engineering, Computer Science, Architecture, History and History of Art) have collaborated on the project.
 * Development of technology. The programme faces multiple technical challenges associated with converting the unique and vast cultural heritage into a digital archive. Mass digitization not only requires the systematic scanning of ancient manuscripts, but also the automatic processing of different hand-writing styles, as well as the analysis of Latin and several other languages as they evolve through time. Researchers of EPFL working on the Venice Time Machine project have, for instance, presented a methodology to analyze linguistic changes by studying 200 years of Swiss newspaper archives.
 * Democratization of knowledge and culture. The project seeks to open up knowledge and history to a wider audience through a virtual database that anyone can access, thus strengthening the link between scholars and the wider public. Conversely, Digital Humanities aims to reduce barriers to contributing and sharing knowledge and data by allowing a wider public to take part in the collection of data. An elite group of scholars and professionals should no longer be the only ones who can contribute and disseminate cultural and historical knowledge, and Digital Humanities seeks to end this exclusivity.

Criticism

 * Skewed audience. The whole project, along with the development of technology it entails, seems to be aimed at a purely Western audience. Both the Venice Time Machine and the subsequent European Time Machine are centered on European history, culture and patrimonial heritage. Nothing has been done so far to include the cultural history of other regions. Although the project and Digital Humanities are still in their early stages, this suggests that more value is given to European history.
 * Content selection. The scientists and researchers who develop the project's datasets still have the power to select the information presented to the audience, which goes against the initiative's goal of knowledge democratization. They are thus in a position to curate the content and educational information of the Venetian database.
 * Business opportunity in disguise. Previous similar initiatives suggest that creating a link between scholars and the wider public represents a business opportunity for those who control such a data platform. For instance, Google Books and Google Scholar have helped Google pursue its long-term strategy of changing how users search for both scholarly and popular books, and of making the digital a key means of finding knowledge, information and the historic past.
 * Ethical issues regarding Big Data. Although the data collected mainly concerns people who lived in the past, the same ethical issues arise as with Big Data in general. Data collection is not always guaranteed to be anonymous: for instance, "if an individual's patterns are unique enough, outside information can be used to link the data back to an individual". As technology continues to advance, the effectiveness of current anonymization procedures is likely to decrease, according to Joshua Fairfield. Researchers may find that requiring consent from the families concerned is not cost-effective.

Other consequences

 * The programme seeks to develop numerous tools and technologies that question and challenge the role of historians and humanists altogether. In their contribution "Humanities in the Digital Age", Alan Liu and William G. Thomas III identify a paradigm shift in which technological tools are increasingly becoming indispensable. They believe that humanists should shape the long-term digital future of the humanities and must therefore be proactive, to avoid having the digital infrastructure built for them.