Archie (search engine)

Archie is a tool for indexing FTP archives, allowing users to more easily identify specific files. It is considered the first Internet search engine. The original implementation was written in 1990 by Alan Emtage, then a postgraduate student at McGill University in Montreal, Canada. Archie was superseded by other, more sophisticated search engines, including Jughead and Veronica, which were search engines for the Gopher protocol. These were in turn superseded by search engines like Yahoo! in 1995 and Google in 1998. Work on Archie ceased in the late 1990s. A legacy Archie server was maintained for historical purposes in Poland, at the Interdisciplinary Centre for Mathematical and Computational Modelling at the University of Warsaw, until 2023.

With assistance from the University of Warsaw, a new Archie server was created and opened for public access at The Serial Port, a web-based computer museum, on 11 May 2024.

Origin
Archie first appeared in 1986, while Emtage was the systems manager at the McGill University School of Computer Science. His predecessor had tried to persuade the institution to connect to the Internet, but because of the high cost (roughly $35,000 per year for a slow link to Boston), it had been difficult to convince the appropriate parties that the investment was worthwhile.

The name derives from the word "archive" without the "v". Emtage has said that, contrary to popular belief, there was no association with Archie Comics. Despite this, other early Internet search technologies such as Jughead and Veronica were named after characters from the comics. Anarchie, one of the earliest graphical FTP clients, was named for its ability to perform Archie searches.

Function
The earliest versions of Archie simply searched a list of public anonymous File Transfer Protocol (FTP) sites using the Telnet protocol and created index files that were available via FTP. To view the contents of a file, a user first had to download it. The indexes were updated on a regular basis: Archie contacted each site roughly once a month, so as not to waste too many resources of the remote servers, and requested a listing. These listings were stored in local files to be searched using the Unix grep command.
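As a rough illustration of searching stored listings in the manner of grep, the following Python sketch matches a regular expression against each line of a set of per-site listings. It is not Archie's actual code: the host names, paths, and the function name search_listings are hypothetical, and the listings are held in memory rather than in local files.

```python
import re

# Hypothetical stored listings, keyed by host name.
# The hosts and paths are invented for illustration.
LISTINGS = {
    "ftp.example.org": [
        "/pub/gnu/emacs-18.59.tar.Z",
        "/pub/tex/README",
    ],
    "ftp.example.net": [
        "/mirrors/gnu/gcc-1.40.tar.Z",
        "/pub/games/README",
    ],
}


def search_listings(pattern, listings):
    """Return (host, path) pairs whose listing line matches the pattern."""
    regex = re.compile(pattern)
    return [
        (host, path)
        for host, paths in listings.items()
        for path in paths
        if regex.search(path)
    ]


if __name__ == "__main__":
    # Print every listing line that contains "README".
    for host, path in search_listings(r"README", LISTINGS):
        print(host, path)
```

A query returns where a matching file is located, not the file itself; the user would still need to fetch it from the listed host over FTP.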

The developers populated the engine's servers with databases of anonymous FTP host directories. Because these listings fed a searchable database of FTP sites, the database could be used to find specific file names. Archie neither recognized natural-language requests nor indexed the content inside the files, so users had to know the name of the file they wanted. The ability to index the content inside the files was later introduced by Gopher.
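The name-only nature of such a database can be sketched as a lookup table from file name to the sites that hold it. This is a minimal Python sketch under stated assumptions, not a description of Archie's real database format: the sample hosts and paths and the names build_name_index and lookup are hypothetical.

```python
import posixpath
from collections import defaultdict

# Hypothetical directory listings gathered from anonymous FTP hosts.
SAMPLE = {
    "ftp.example.org": ["/pub/tex/README", "/pub/gnu/emacs-18.59.tar.Z"],
    "ftp.example.net": ["/pub/games/README"],
}


def build_name_index(listings):
    """Map each file name (the last path component) to its (host, path) locations."""
    index = defaultdict(list)
    for host, paths in listings.items():
        for path in paths:
            index[posixpath.basename(path)].append((host, path))
    return dict(index)


def lookup(name, index):
    """Return every known location of an exactly named file, or an empty list."""
    return index.get(name, [])


INDEX = build_name_index(SAMPLE)

if __name__ == "__main__":
    print(lookup("README", INDEX))
    # A word describing a file's contents finds nothing, because only names are indexed.
    print(lookup("typesetting", INDEX))
```

Because only names are stored, a query succeeds only when the user already knows what the file is called.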

Development
Emtage and Bill Heelan wrote a script allowing people to log in and search the collected information using the Telnet protocol at the host "archie.mcgill.ca" [132.206.2.3]. Later, more efficient front and back ends were developed, and the system grew from a local tool into a network-wide resource and then into a popular service available from multiple sites around the Internet. The collected data would be exchanged between neighbouring Archie servers. The servers could be accessed in multiple ways: using a local client (such as archie or xarchie); telnetting to a server directly; sending queries by electronic mail; and later via a World Wide Web interface. At the peak of its popularity, the Archie search engine accounted for 50% of Montreal Internet traffic.

In 1992, Emtage and Peter Deutsch, with some financial help from McGill University, formed Bunyip Information Systems to offer a licensed commercial version of the Archie search engine, which was used by millions of people worldwide. Heelan followed them to Bunyip soon after, where he, together with Bibi Ali and Sandro Mazzucato, significantly updated the Archie database and indexed web pages. Work on the search engine ceased in the late 1990s, and the company dissolved in 2003.