User talk:GreenC/WaybackMedic

ia save command url
Since this bot is fixing Internet Archive URLs in archive-url, you should consider adding another task. There is a legitimate URL that looks like this:

That command should never be used in cs1|2 templates because each time a reader clicks it, another copy of the current version of the targeted website is saved at the Internet Archive. When Module:Citation/CS1 detects these URLs, it emits an error message and disables the archive link in the final rendering of the citation. cs1|2 templates with this particular error are placed in an error-tracking category.

The fix that I would suggest for this bot would be to change  to.

—Trappist the monk (talk) 15:35, 28 May 2016 (UTC)
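
A minimal sketch of how a bot might flag such URLs before deciding on a fix. It assumes the save command has the form `https://web.archive.org/save/<target-url>`; the function names are hypothetical and are not taken from WaybackMedic or Module:Citation/CS1.

```python
import re

# Assumption: a save-command URL is the Wayback host followed by /save/
# and then the target URL. Names below are illustrative only.
SAVE_URL_RE = re.compile(
    r"^https?://(?:web\.)?archive\.org/save/(?P<target>.+)$",
    re.IGNORECASE,
)


def is_save_command_url(archive_url):
    """Return True if archive_url looks like a Wayback save command."""
    return SAVE_URL_RE.match(archive_url.strip()) is not None


def save_command_target(archive_url):
    """Return the target URL embedded in a save command, or None."""
    match = SAVE_URL_RE.match(archive_url.strip())
    return match.group("target") if match else None
```

For example, `is_save_command_url("https://web.archive.org/save/http://example.com/page")` is true, while an ordinary timestamped snapshot URL is left alone.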

Thanks, good info. After this bot is finished running, I hope to make a version #2 that will sweep all IA links (the current version covers only a limited subset), and the problems listed in that category can be part of it. I believe /*/ is also an error and should be replaced with the closest available snapshot (which can be retrieved via the IA API). -- Green C 17:27, 28 May 2016 (UTC)
 * I disagree. The closest available snapshot (not clear to me what that actually means: closest to what?) may be a 404; the content of the 'closest' may not be the same as, or even substantially similar to, the content of the original site on the date that the citation was added to the article.  That is why the module doesn't just replace   with a timestamp concocted from archive-date.  Replacement of   is a task that is best accomplished by humans who can evaluate the content of the archived page to see that it supports the content of the Wikipedia article.
 * —Trappist the monk (talk) 17:40, 28 May 2016 (UTC)
 * Closest to the accessdate, i.e. the date the citation was added to the article. The IA API handles 404s: it only returns snapshots with the requested status codes (200s etc.). -- Green C 18:48, 28 May 2016 (UTC)
 * A robot should not be making the choice of which snapshot should be linked by a cs1|2 template. If that were the case, again, the module could do it and we wouldn't need archive-url and archive-date.  Fixing the   urls is sufficient.
 * —Trappist the monk (talk) 23:55, 28 May 2016 (UTC)
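
For reference, the "closest snapshot" lookup discussed above could be sketched roughly as follows, assuming the public Wayback Availability endpoint at `https://archive.org/wayback/available` with its `url` and `timestamp` query parameters. The helper names are hypothetical, and the sketch only builds the request and reads a response; it makes no claim about how WaybackMedic does this.

```python
import json
from urllib.parse import urlencode

# Assumption: the Wayback Availability API endpoint and its JSON layout
# ("archived_snapshots" -> "closest"). Helper names are illustrative.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_query(url, accessdate):
    """Build the lookup URL; accessdate is a 'YYYYMMDD' string."""
    params = urlencode({"url": url, "timestamp": accessdate})
    return AVAILABILITY_ENDPOINT + "?" + params


def closest_snapshot(response_text):
    """Return (snapshot_url, timestamp) from an API response, or None."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return closest["url"], closest["timestamp"]
```

The result is only the snapshot nearest in time to the requested date. It says nothing about whether the archived content still supports the article text, which is the objection raised above.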

At 1970s energy crisis, I just hand-corrected two citations where WaybackMedic inserted incorrect ref links that didn't match the cited content. Seems like a dangerous thing for a bot to be doing without human verification. At a minimum, I'd suggest teaching WaybackMedic to leave detailed Talk page messages in these cases, as Cyberbot II did at Talk:1970s energy crisis. —Patrug (talk) 02:40, 31 May 2016 (UTC)

Adding archive date
I'm not sure whether the bot already does this (I don't see it listed as a function), but it may be useful to have it extract archivedate from the archive URL when the parameter is missing, as editors sometimes forget to include it. nyuszika7h (talk) 17:20, 3 June 2016 (UTC)


 * This task is done by Cyberbot II. The two bots are similar enough that it might be worth trying to merge them, to combine the best Wayback-related features of both. As another example, Cyberbot II describes its edits on the article's Talk page for human verification, which I also recommended for WaybackMedic in the section above. —Patrug (talk) 07:47, 8 June 2016 (UTC)
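
A minimal sketch of the extraction suggested in this section, assuming Wayback URLs of the form `https://web.archive.org/web/<14-digit timestamp>/<original-url>`. The function name is hypothetical; this is not the code used by either bot.

```python
import re
from datetime import date

# Assumption: the snapshot timestamp is the 14-digit YYYYMMDDhhmmss path
# segment after /web/, optionally followed by a modifier such as "id_".
WAYBACK_TS_RE = re.compile(
    r"^https?://(?:web\.)?archive\.org/web/(\d{14})(?:[a-z]{2}_)?/",
    re.IGNORECASE,
)


def archive_date_from_url(archive_url):
    """Return the snapshot date encoded in a Wayback URL, or None."""
    match = WAYBACK_TS_RE.match(archive_url.strip())
    if not match:
        return None
    stamp = match.group(1)
    try:
        # Only the date part is needed for an archive-date parameter.
        return date(int(stamp[0:4]), int(stamp[4:6]), int(stamp[6:8]))
    except ValueError:
        # Digits that do not form a real calendar date.
        return None
```

URLs without a full timestamp, such as the /*/ form mentioned in the first section, yield `None` and would be left for other handling.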

False dead link
In this edit, the bot removed the archive link and marked the ref with dead link. The archive link was bad, but the original URL was still good. Could the bot check the value of the deadurl parameter? If it indicates that the original URL is still live, the bot should refrain from adding the dead-link template. — Gorthian (talk) 10:24, 6 June 2016 (UTC)
 * Yes, it shouldn't have added a dead link. I'll have to research what happened there. The bot doesn't rely on the deadurl value because it could be out of date; instead it checks the site's return codes. Thanks for the notice. -- Green C 13:26, 6 June 2016 (UTC)
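
The intended behaviour described in this section can be sketched as a small decision function. This is an illustration under stated assumptions, not the bot's actual logic: the function name is hypothetical, and treating any 2xx or 3xx status as "alive" is an assumption that ignores soft 404s.

```python
def should_tag_dead_link(original_status, archive_ok):
    """Decide whether a citation should get a dead-link tag.

    original_status: HTTP status from a live check of the original URL,
                     or None if the request failed entirely.
    archive_ok:      True if the archive URL still serves a usable snapshot.
    """
    # Assumption: 2xx and 3xx responses count as a working original URL.
    original_alive = original_status is not None and 200 <= original_status < 400
    if archive_ok:
        # A working archive link covers the citation either way.
        return False
    # Bad archive link: tag only when the original is also unreachable.
    return not original_alive
```

In the case reported here (bad archive link, original URL returning 200), `should_tag_dead_link(200, False)` is `False`, so no dead-link template would be added.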