User:Emijrp/Wikipedia Archive


 * If you are looking for an offline reader of Wikipedia, go to the Offline versions section. If you want to archive your wiki, read more at WikiTeam. If you want to create a mirror of Wikipedia dumps, read Mirroring Wikimedia project XML dumps.

This unofficial Wikipedia Archive is a project to keep a record of backups of Wikimedia projects, check their status, and encourage the creation of better and more regular copies.

This page contains links to backups and other historical information about Wikipedia, Wiktionary, Wikibooks, Wikisource, Wikiquote, Wikiversity, Wikinews, Wikispecies, Commons, Wikidata and Meta. There is also some data about its predecessors and other old Internet encyclopedia projects that are now closed: Interpedia, GNUPedia and Nupedia.

Wikipedia pages can be edited by everyone, and their histories are saved and publicly available, but it is usually hard to find very old material. I hope you enjoy this page. Nostalgia time!

Please, if you have enough free disk space, consider downloading Wikimedia dumps. You can use this downloader (it requires GNU/Linux) to get the full-history 7zip files for all projects. You cannot read these files easily, but you can preserve them for posterity. I think that we must save a copy of Wikipedia in every country around the world.

If you are an admin of a free-licensed wiki, consider publishing backups of the text articles and images (read tutorial). Every year, dozens of wikis disappear around the globe. WikiTeam is trying to avoid that destruction.

You are welcome to improve this page or leave a message on the talk page.

Permit mirror sites
"When information is available on the web only at one site, its availability is vulnerable. A local problem—a computer crash, an earthquake or flood, a budget cut, a change in policy of the school administration—could cut off access for everyone forever. To guard against loss of the encyclopedia's material, we should make sure that every piece of the encyclopedia is available from many sites on the Internet, and that new copies can be put up if some disappear.

There is no need to set up an organization or a bureaucracy to do this, because Internet users like to set up “mirror sites” which hold duplicate copies of interesting web pages. What we must do in advance is ensure that this is legally permitted.

Therefore, each encyclopedia article and each course should explicitly grant irrevocable permission for anyone to make verbatim copies available on mirror sites. This permission should be one of the basic stated principles of the free encyclopedia.

Some day there may be systematic efforts to ensure that each article and course is replicated in many copies—perhaps at least once on each of the six inhabited continents. This would be a natural extension of the mission of archiving that libraries undertake today. But it would be premature to make formal plans for this now. It is sufficient for now to resolve to make sure people have permission to do this mirroring when they get around to it."

- Richard M. Stallman

Wikimedia dumps
You can download several kinds of dumps of Wikipedia and its sister projects: the most recent ones, offline-readable versions, historical dumps, and so on.

Latest dumps
The official download website offers backups for every Wikimedia project and every language:
 * XML & SQL
 * Static HTML (outdated)
 * DVD distributions (only a few languages and not all articles/images)

The dumps are usually huge XML files. You can parse them using the xmlreader.py script from pywikipediabot, or mediawiki-utilities. These tools are intended mostly for researchers. If you are looking for an offline reader of Wikipedia, go to the Offline versions section.
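Because the dump files are so large, they are normally read as a stream rather than loaded whole. The following is a minimal sketch using only Python's standard library, not the pywikipediabot or mediawiki-utilities API. It assumes the usual export layout of `<page>` elements containing a `<title>` and one or more `<revision>` elements; the sample data, the namespace URI in it, and the helper names `local_name` and `iter_pages` are illustrative assumptions.

```python
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(source):
    """Yield (title, revision_count) for each <page> in a dump stream."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if local_name(elem.tag) != "page":
            continue
        title = None
        revisions = 0
        for child in elem:
            name = local_name(child.tag)
            if name == "title":
                title = child.text
            elif name == "revision":
                revisions += 1
        yield title, revisions
        # Discard the finished page's children to keep memory use low.
        elem.clear()


# Tiny hand-written sample standing in for a real dump (assumed layout).
SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page>
    <title>Nupedia</title>
    <revision><id>1</id><text>first</text></revision>
    <revision><id>2</id><text>second</text></revision>
  </page>
  <page>
    <title>Interpedia</title>
    <revision><id>3</id><text>only</text></revision>
  </page>
</mediawiki>"""

pages = list(iter_pages(io.BytesIO(SAMPLE)))
print(pages)
```

For a real bz2 dump, the same function should work on a file object opened with `bz2.open(path, "rb")`. 7z files are not supported by the standard library and would need an external tool to decompress them first.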

Mirrors
Currently there is only one working mirror of the dumps.wikimedia.org site. It is hosted on Your.org servers.

WikiTeam periodically uploads dumps to a collection at the Internet Archive.

Offline versions


Currently there are several options for reading an offline copy of Wikipedia:


 * Kiwix (using openZIM files) (website) (files)
 * Okawix (website) (torrents) (files)
 * Wikipedia on CD/DVD:
 * English:
 * 2008/9 Wikipedia Selection for Schools Torrent TPB
 * http://thepiratebay.org/torrent/5337159/Wikipedia_on_CD_KiWix_WPCD_WPRV_0.5_English_2007-04-06
 * 2006 Wikipedia CD Selection ISO (or ZIP)
 * Malayalam:
 * Malayalam Wikisource on CD (download) (browse) (torrent)
 * Malayalam Wikipedia on CD (download) (browse) (torrent)
 * How-to
 * Spanish:
 * CDPedia & DVDPedia http://python.org.ar/pyar/Proyectos/CDPedia
 * http://www.wiki-web.es/mediawiki-offline-reader/
 * iPhone:
 * wiki2touch (it works with bzip2 files)
 * Offline Wiki: http://offline-wiki.googlecode.com/git/app.html

More information is available at Wikipedia on CD/DVD and Version 1.0 Editorial Team.

Historical dumps

 * See also: https://dumps.wikimedia.org/archive

This is a list of links to old Wikimedia dumps and other related collections of raw data (usually many gigabytes compressed, sometimes expanding to terabytes).


 * English Wikipedia dump from 2010-03-12 full history in a single file.
 * English Wikipedia dump from 2008-01-03 at Internet Archive. All files.
 * English Wikipedia dump from 2007-04-02 at Internet Archive. Only pages-meta-history in bz2.
 * English Wikipedia dump from 2006-11-04 at Internet Archive. All files except the large bz2 (the 7z version is included).
 * English Wikipedia dump from 2006-08-16. Only pages-meta-history in 7z format.
 * English Wikipedia dump from 2004-12-23 at Internet Archive. SQL.
 * Dump of Wikipedia from December 20, 2001 at Internet Archive. Only bz2, around 30 MB. Also available as nostalgia in Wikipedia official dumps page.
 * And very old dumps at Internet Archive (the first available is from 2003). Not really for use, only for archivers and old data hoarders. You can download some of the tiniest SQL files (I think Internet Archive only saves files < 500 KB or so). More info.
 * Oldest Wikipedia backup, discovered by Tim Starling: http, mirror, torrent

Torrents

 * See also: Data dump torrents.


 * Burnbit: "enwiki", "pages-articles",

Page view logs
The visit logs for Wiki[mp]edia projects (originally named the Domas visits logs) record the number of views of every wiki page per hour. There were some issues while collecting this data in the past; please read this and this to learn how reliable the data is.
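To give an idea of how the hourly counts can be used, here is a minimal sketch that adds up views per page across several hours. It assumes each log line has four space-separated fields (project code, page title, view count, bytes transferred), which is not stated on this page and should be checked against the files themselves. The sample lines and the helper names `parse_line` and `total_views` are made up for illustration.

```python
from collections import Counter


def parse_line(line):
    """Split one log line into (project, title, views); None if malformed."""
    parts = line.rstrip("\n").split(" ")
    if len(parts) != 4:
        return None
    project, title, views, _size = parts
    if not views.isdigit():
        return None
    return project, title, int(views)


def total_views(lines, project):
    """Sum the view counts per page title for one project code."""
    counts = Counter()
    for line in lines:
        parsed = parse_line(line)
        if parsed and parsed[0] == project:
            counts[parsed[1]] += parsed[2]
    return counts


# Invented sample lines for two hours (assumed field layout).
SAMPLE_HOUR_1 = [
    "en Main_Page 120 4096\n",
    "en Nupedia 3 2048\n",
    "es Wikipedia 7 1024\n",
]
SAMPLE_HOUR_2 = [
    "en Main_Page 80 4096\n",
    "broken line\n",  # malformed lines are skipped
]

views = total_views(SAMPLE_HOUR_1 + SAMPLE_HOUR_2, "en")
print(views.most_common(1))
```

If the real files are gzip-compressed, as hourly logs of this kind usually are, they could be fed to `total_views` line by line through `gzip.open(path, "rt", encoding="utf-8", errors="replace")`.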

You can download the packages from Wikimedia servers, and we periodically upload the recent ones to the Internet Archive (thanks to User:Hydriz). There is a version of these files by Erik Zachte, which contains some (not all) derived data. The mirror links at IA are listed below:


 * 2007: Dec (12.5 GB). Total: 12.5 GB. There are no visit logs before 2007-12-09 18:00:00 UTC.


 * 2008: Jan (19.5 GB), Feb (18.5 GB), Mar (19.6 GB), Apr (20.6 GB), May (28.6 GB), Jun (34.1 GB), Jul (33.7 GB), Aug (33.8 GB), Sep (34.3 GB), Oct (35.4 GB), Nov (34.7 GB), Dec (36.8 GB). Total: 349.5 GB


 * 2009: Jan (38.7 GB), Feb (34.7 GB), Mar (40.5 GB), Apr (38.4 GB), May (40.5 GB), Jun (39.4 GB), Jul (39.5 GB), Aug (40.2 GB), Sep (27.6 GB), Oct (43.4 GB), Nov (43.3 GB), Dec (42.6 GB; no projectcounts). Total: 468.8 GB


 * 2010: Jan (42.6 GB; no projectcounts), Feb (41.9 GB; no projectcounts), Mar (45.4 GB; no projectcounts), Apr (42.1 GB), May (42.5 GB), Jun (33.7 GB), Jul (34.6 GB), Aug (46.6 GB), Sep (46.3 GB), Oct (47.0 GB), Nov (46.2 GB; no projectcounts), Dec (47.3 GB). Total: 516.1 GB


 * 2011: Jan (48.9 GB), Feb (48.0 GB), Mar (51.2 GB), Apr (48.9 GB), May (52.0 GB), Jun (51.2 GB), Jul (52.7 GB), Aug (53.7 GB), Sep (48.4 GB), Oct (58.1 GB), Nov (56.4 GB), Dec (51.3 GB). Total: 621.6 GB


 * 2012: Jan (61.1 GB), Feb (50.4 GB), Mar (56.1 GB), Apr (56.2 GB), May (58.7 GB), Jun (57.6 GB), Jul (59.7 GB), Aug (59.9 GB), Sep (61.7 GB), Oct (64.3 GB), Nov (63.9 GB), Dec (65.3 GB). Total: 725.8 GB


 * 2013: Jan (69.1 GB), Feb (61.3 GB), Mar (68.2 GB), Apr (64.1 GB), May (68.6 GB), Jun (66.7 GB), Jul (64.6 GB), Aug (65.6 GB), Sep (62.7 GB), Oct (67.4 GB), Nov (63.3 GB), Dec (65.9 GB). Total: 787.5 GB


 * 2014: Jan (67.2 GB), Feb (64.1 GB), Mar (70.9 GB), Apr (68.5 GB), May (66.8 GB), Jun (66.4 GB), Jul (70.5 GB), Aug (71.7 GB), Sep (75.4 GB), Oct (71.1 GB), Nov (66.1 GB), Dec (67.5 GB). Total: 826.1 GB.


 * 2015: Jan (68.2 GB), Feb (59.9 GB), Mar (68.3 GB), Apr (68.2 GB), May (70.1 GB), Jun (57.8 GB), Jul (64.7 GB), Aug (62.7 GB), Sep (64.0 GB), Oct (63.0 GB), Nov (62.6 GB), Dec (64.5 GB). Total: 774.1 GB.


 * 2016: Jan (65.7 GB), Feb (59.5 GB), Mar (67.6 GB), Apr (69.0 GB), May (63.2 GB), Jun (61.7 GB), Jul (60.9 GB), Aug (8.09 GB). Total: 455.8 GB.

An updated version of the pageview statistics, which includes the mobile sites, was later made available. It is stored in a different directory and contains more data than the files above.


 * 2014: Sep (26.7 GB), Oct (93.8 GB), Nov (88.4 GB), Dec (89.8 GB). Total: 298.7 GB. There are no visit logs before 2014-09-23 00:00:00 UTC.


 * 2015: Jan (91.4 GB), Feb (81.0 GB), Mar (91.6 GB), Apr (91.9 GB), May (95.3 GB), Jun (79.8 GB), Jul (86.9 GB), Aug (86.3 GB), Sep (86.7 GB), Oct (85.9 GB), Nov (84.8 GB), Dec (87.4 GB). Total: 1049.0 GB.


 * 2016: Jan (90.7 GB), Feb (82.6 GB), Mar (92.7 GB), Apr (93.3 GB), May (87.7 GB), Jun (85.3 GB), Jul (85.5 GB), Aug (11.3 GB). Total: 629.1 GB.

After 2016-08-05 05:00:00 UTC, the pageview statistics stopped being generated, as they were all replaced by a new pageviews dataset.

Daily user pageviews for all local Wikipedias (assembled by Dušan Kreheľ, separately for each local Wikipedia, in d0cmf format, starting from 2015-07) are stored on archive.org (link).

(About uploading those files to the IA:  ; or a way better solution.)

Image tarballs
There is a copy of almost all Wikimedia Commons files up to 2013 at Internet Archive (about 34 TB). In 2012, some image dumps were published for some Wikipedia languages. Another one from 2005 only covers English Wikipedia images.

The best pictures of some years are also available for download from the Internet Archive (2006, 2007, 2008, 2009, 2010, 2011).

Help seed the garden of knowledge
We store everything on the Internet Archive, but who archives the archivers themselves? You, of course! We believe in distributed preservation. Is it really needed? Yes, it turns out nobody but you wants to mirror the decade-long work of millions of volunteers.

Take a look at how much free space you have on your hard disk: probably tens or hundreds of GB. Pick a torrent that fits from those below, click the link and keep seeding! If possible, also sign the table row.

Mailing lists archives

 * The Wikimedia mailing lists archives in gzip
 * The Nupedia mailing lists:
 * Our first article message by Larry Sanger:
 * A: aerospace-l, anthro-l, anomphen-l, archae-l, architect-l, astronomy-l,
 * M: music-l,
 * Z: zoology-l,
 * Complete list
 * Complete archives in mbox format


 * Larry Sanger mails about Nupedia/Wikipedia in Usenet

Not Wiki[mp]edia
Other free projects which allow downloading their dumps:
 * Wikia: http://wiki-stats.wikia.com
 * Citizendium: http://en.citizendium.org/wiki/CZ:Downloads
 * OpenStreetMap: https://wiki.openstreetmap.org/wiki/Database_dump
 * OmegaWiki: http://www.omegawiki.org/Development