Talk:List of Web archiving initiatives

Comment - need to add more initiatives & repository URLs
I do not appreciate this list as it stands: it would be essential to have repository URLs for those initiatives that offer public access to archived content. For more initiatives, see e.g. http://archiv.twoday.net/topics/Webarchivierung --92.72.202.15 (talk) 15:42, 31 March 2011 (UTC)

Need to define terms like "Archived Contents (millions)"
I actually like the list, since there isn't anything else much like it anywhere else. Does anyone know what the "Archived Contents (millions)" column means? Is that the number of URLs present in the repository, or the number of snapshots of all URLs over time, or is it something else? Edsu (talk) 18:35, 28 October 2013 (UTC)

Need to normalize to FTE
The "Number of employees" column makes the initiatives difficult to compare. It would be better if the figures were normalized to full-time equivalents (FTE). --PeterKz (talk) 19:18, 6 February 2014 (UTC)

Need to define most terms in this article
Most of the terms in this article come from the research paper:

"A survey on web archiving initiatives published by the Portuguese Web Archive team."

The terms are not clear to the typical Wikipedia reader, being too obscure and academic in tone.

For example:
 * In the lead: "web archiving initiatives, archived data, and access methods", which could each be explained at the start of each table.
 * Most of the column headers in each table need to be defined and clarified. See remarks in prior sections.
 - Lentower (talk) 13:09, 29 April 2014 (UTC)

Archive.is is not notable
The site's notability has not been established and cannot be, so it should not be included in Wikipedia. It is referenced only to a primary source, i.e., the site itself, and that site does not even exist. Do not revert my edit without establishing a reason, thanks. ~ R . T . G  13:37, 2 September 2014 (UTC)
 * I improved the entry using new site name and notable source (CNET Japan). Also note that this site is the second most popular archive site after Wayback Machine according to Alexa. 88.246.46.189 (talk) 14:12, 2 September 2014 (UTC)
 * That whole article is nothing more than an advertisement. It gives no reason why they are writing about it.  It doesn't establish notability.  ~  R . T . G  14:28, 2 September 2014 (UTC)
 * MOS:SAL does not say that notability is a metric by which one includes something on a list.— Ryūlóng ( 琉竜 ) 14:50, 2 September 2014 (UTC)


 * Alexa ranks it as having overtaken the Library of Congress in the last two months. There's not much point trying to argue about Alexa, but for a site with no content and scant advertising to outdo all of the sites which have been running for twenty years with massive libraries of content unrelated to archiving?  There are apparently sites which provide dummy traffic for about $5 per 10,000.  That's not very expensive for some people.  They have 2,000,000 website snapshots.  That is all they have.  My opinion is that they've got their instant save button incorporated into some sort of malware this year and compounded it with the traffic they are getting from the controversy on this site but hey, that's just an opinion.  I shouldn't even do that some would say. (troll on my new disciple)  ~  R . T . G  14:59, 2 September 2014 (UTC)
 * (MOS:SAL, paragraph 2 "Being articles, stand-alone lists are subject to Wikipedia's content policies, such as verifiability, no original research, neutral point of view, and what Wikipedia is not, as well as the notability guidelines.") ~ R . T . G  15:03, 2 September 2014 (UTC)
 * Your "opinion" is no more than WP:IDL. 88.246.46.189 (talk) 15:21, 2 September 2014 (UTC)
 * The entry on Archive.is seems to be verified, not original research, and neutrally written. Stop tilting at windmills across the project RTG.— Ryūlóng ( 琉竜 ) 16:57, 2 September 2014 (UTC)

Aleph Archives
Do we need to have producers of archive software in the list? For example, "Aleph Archives" is not an archive at all; it is a commercial company developing and selling software for archives. 88.246.46.189 (talk) 15:25, 2 September 2014 (UTC)
 * I would suppose that it is an initiative: even if that initiative is not the archiving itself, it is a highly notable archiving-related initiative. One improvement might be to split the software off into a small section, but with such a long list on a subject I have no familiarity with, that would be much easier for someone who knows about archive software.  Even if that section has only one entry at the moment, it still reflects a division of that type in the information.  ~  R . T . G  15:22, 14 November 2014 (UTC)

External links modified
Hello fellow Wikipedians,

I have just added archive links to one external link on List of Web archiving initiatives. Please take a moment to review my edit. If necessary, add after the link to keep me from modifying it. Alternatively, you can add to keep me off the page altogether. I made the following changes:
 * Added archive https://web.archive.org/20120925004220/http://www.nyu.edu/library/bobst/research/tam/webarchive.html to http://www.nyu.edu/library/bobst/research/tam/webarchive.html/

When you have finished reviewing my changes, please set the checked parameter below to true to let others know.

Cheers.—cyberbot II  Talk to my owner :Online 21:23, 1 February 2016 (UTC)

Inclusion criteria
In order to make this list easily maintainable and follow Wikipedia's policies and guidelines, I suggest only including entries with their own Wikipedia articles. --Ronz (talk) 17:22, 9 February 2017 (UTC)

External links modified
Hello fellow Wikipedians,

I have just modified 2 external links on List of Web archiving initiatives. Please take a moment to review my edit. If you have any questions, or need the bot to ignore the links, or the page altogether, please visit this simple FaQ for additional information. I made the following changes:
 * Added tag to http://210.82.118.162:9090/webarchive
 * Added archive https://web.archive.org/web/20130927195054/http://ediasporas.ticmigrations.fr/ to http://ediasporas.ticmigrations.fr/
 * Added tag to http://digital.cacak-dis.rs/english/web-archive-of-cacak/

When you have finished reviewing my changes, you may follow the instructions on the template below to fix any issues with the URLs.

Cheers.— InternetArchiveBot  (Report bug) 04:09, 28 December 2017 (UTC)

Megalodon.jp
Megalodon.jp needs to be mentioned. It is minor in comparison to perma.cc / webrecorder.io or even archive.is (which, ironically, a user above said is not relevant), but I think it should be here --Jakeukalane (talk) 13:16, 26 March 2019 (UTC)

The first column of the table is a mess
The first column of the table in List of Web archiving initiatives, the "Name" column, is a mashup of plain text names with endnotes, wikilinks, red links, and external links. There is no apparent regularity in formatting here. I propose that the column should include either (preferably) wikilinks or plain text names with endnotes, no red links (per WP:WTAF) and no external links (per WP:EL). Above, in "Inclusion criteria", Ronz suggested including only "entries with their own Wikipedia articles" (i.e. wikilinks) but that suggestion apparently did not go anywhere. What I am proposing is a step toward some kind of regularity. Biogeographist (talk) 12:05, 10 June 2019 (UTC)
 * Go ahead. This list is a mess. --Ronz (talk) 15:24, 10 June 2019 (UTC)

Pay/Freemium/Free
Need a column that specifies pay structure(s): Pay is entirely non-free (Arkiwera). Freemium is a mix of free and pay services (Conifer/Rhizome). Free is entirely free (WaybackMachine). -- Green  C  22:03, 12 April 2024 (UTC)

Mix of Public and Open Initiatives, Specialized Scoped "Initiatives", and Commercial Service Providers
The list doesn't seem particularly useful with its mix of scope and listed content.

I would find it much more appropriate and useful if it were split into three lists:

 * 1) Public and open initiatives where anyone can archive (broadly) any web page, like the Wayback Machine
 * 2) Specialized initiatives, like the numerous listed self-funded and self-scoped government initiatives that archive only their own domains
 * 3) (Commercial) service providers like Aleph Archives (whose inclusion and applicability was discussed in another thread)

Kissaki (talk) 08:44, 9 June 2024 (UTC)


 * I would expect Web archive to include anything from the web, not only specific subdomains or archiving service providers. See Web and World Wide Web. Thus, I would even find it appropriate to remove anything that does not intend to archive a broader section of the web from this list titled "List of Web archiving initiatives". Kissaki (talk) 08:49, 9 June 2024 (UTC)