Wikipedia:Wikipedia Signpost/2022-03-27/From the archives


In honor of World Backup Day on March 31, this month we look back at when early backups of Wikipedia were recovered, and reflect on the importance of keeping data safe. The following Signpost article, by Jarry1250, originally appeared in the 20 December 2010 edition as "Bugs, Repairs, and Internal Operational News".

Old Wikipedia archive uncovered
This is the new WikiPedia! The idea here is to write a complete encyclopedia from scratch, without peer review process, etc. Some people think that this may be a hopeless endeavor, that the result will necessarily suck. We aren't so sure. So, let's get to work!

Tim Starling, a developer and system administrator working for the Wikimedia Foundation, announced this week his discovery of backups of Wikipedia pages from February, March and August 2001, which, he said, had been assumed to be permanently lost. Though it was originally thought possible that the later backups might include early revisions of Wikipedias in other major languages such as French and German, it now seems that the ad hoc nature of the backups means that they cover only the English-language "WikiPedia".

"I've long been interested in Wikipedia's history, and I've tried in the past to locate such backups", he said. "I asked various people who might have had one. I had given up hope." However, he uncovered two UseModWiki files which contained a record of every change made to Wikipedia from January 15 to August 17, 2001. The files are available for download; Brian Mingus has created an online index of the articles in the encyclopedia at the time (including, for example, revisions to the 'boat' article), as has Wikipedia researcher Joseph Reagle, who also compiled a list of the top 20 Wikipedia contributors from these early stages. Given the discovery's timing, weeks before the tenth anniversary of the English-language (and first) Wikipedia, interesting snippets (such as the one at the top of this article) are also being collated on a special "Wikipedia in the Beginning" page.

As for what happens now with the revisions, which are invaluable for correctly crediting contributors to the project in line with its copyleft licence, Tim writes, "I'm developing a script which will import the dump into a modified MediaWiki instance, the idea being that I can then export XML from it [i.e. transform it into a modern style database dump] ... I'm not sure when that will be."