Wikipedia:Bots/Requests for approval/MelonBot 5


 * The following discussion is an archived debate. Please do not modify it. Subsequent comments should be made in a new section. The result of the discussion was Withdrawn by operator.

MelonBot
Operator: Happy‑melon

Automatic or Manually Assisted: Supervised automatic

Programming Language(s): Python using pywikipedia

Function Summary: Reorganisation of Peer review archives

Edit period(s) (e.g. Continuous, daily, one time run): one very long run

Edit rate requested: 5-10 edits per minute

Already has a bot flag (Y/N): Y

Function Details: See User:MelonBot/PR

Discussion
How would it re-organise them, and why? ~  Dreamy   §  20:15, 29 January 2008 (UTC)
 * Currently peer reviews are started at Wikipedia:Peer review/articlename, and are 'archived' only by removing them from PR once they've been inactive for a month or so. If a user wants to ask for another review of the same article they are instructed to move the old review page to Wikipedia:Peer review/articlename/archive1 and replace the resulting redirect with a new review request. This of course breaks all the links to the old review, from, , etc. In addition, a quick check of Special:Prefixindex reveals that some people's interpretation of "archiving" ranges from "archive1" to "Archive 1" to "Attempt 1" to "1" to "date" to a variety of other weird and wonderful formats, all of which make old peer reviews almost impossible to reliably locate. Geometry guy is working on a new set of templates for peer review which will automatically create new peer reviews at the lowest-numbered N for which Wikipedia:Peer review/articlename/archiveN does not already exist, so all new peer reviews will be archived in an orderly fashion. But someone or something has to go through the 6000-some old peer reviews and sort out the mess that has been created by five years of avoiding-the-problem-and-hoping-it-will-go-away. Happy‑melon 21:30, 29 January 2008 (UTC)
 * That seems alright then... You have been  ~  Dreamy   §  22:13, 29 January 2008 (UTC)
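The "lowest-numbered N" rule described above can be sketched in a few lines of Python. This is only an illustration, not the actual template logic or MelonBot's code: the function name `next_archive_title` and the `existing_titles` set are hypothetical stand-ins for a real page-existence check.

```python
def next_archive_title(article, existing_titles):
    """Return the first /archiveN title that is not already taken.

    `existing_titles` is a hypothetical set of page titles that exist;
    a real bot or template would ask the wiki instead.
    """
    base = f"Wikipedia:Peer review/{article}/archive"
    n = 1
    # Walk upwards from archive1 until a free slot is found.
    while f"{base}{n}" in existing_titles:
        n += 1
    return f"{base}{n}"


existing = {
    "Wikipedia:Peer review/Foo/archive1",
    "Wikipedia:Peer review/Foo/archive2",
}
print(next_archive_title("Foo", existing))
print(next_archive_title("Bar", existing))
```

With the sample set, "Foo" gets /archive3 and "Bar", which has no reviews yet, gets /archive1.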

For the record, I have argued that this job is not a good use of resources. The pages are where they are, and are linked at relevant PR archives, FAC pages, and WikiProject pages. (Well, some links are a problem because the pages get moved and the links are not updated, but it's not clear how any automated script is going to fix that.)

There are some 5000 pages at Wikipedia:Peer review/Articlename. Moving these all to /archive1 would involve moving 5000 pages around and creating 5000 redirects. These are linked usually at least once in a PR archive and once in ArticleHistory, so if only those are updated (and not project links and user talk pages), we're still talking about 20000 edits.
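One way to read the estimate above, as a rough sketch: each page costs one move, one redirect left behind, and one edit per incoming link that gets updated. The breakdown into these three terms is an assumption about how the figure was reached, and `estimate_edits` is a hypothetical helper.

```python
def estimate_edits(pages, links_per_page):
    """Rough edit count: moves + redirects created + link updates."""
    moves = pages                          # one move per review page
    redirects = pages                      # each move leaves a redirect behind
    link_updates = pages * links_per_page  # e.g. PR archive + ArticleHistory
    return moves + redirects + link_updates


print(estimate_edits(pages=5000, links_per_page=2))
```

With 5000 pages and two links each, this gives the 20000 quoted above; project and user talk page links would push it higher.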

And what's the benefit? Basically, as I understand it, the benefit of making the filenames nice (and presumably keeping them sequential in time) is to allow an archive template to find them all - although which archive template is not clear at this point. But the assumption is that a user needs to move a page to make a new PR - which is not the case. A template can easily find the next available page for the user; I coded up an example and Geometry guy already implemented it at WP:GAR.

If you want to get the second and later PRs in order (ie, the ones with a /something already), that's only a few hundred pages, and could probably be justified. But I see no reason to move the first PR from Wikipedia:Peer review/Articlename. By convention, that can be the first one; it doesn't need a /archive1. Gimmetrow 02:12, 30 January 2008 (UTC)
 * I understand Gimmetrow's objections, and I would be highly reluctant to complete such a major change on a smaller or less well-supported Wiki than Wikipedia. But it is not our job to worry about performance - we should do whatever we as editors think is necessary to improve this encyclopedia. We've sent the job queue up to over half a million standardising cleanup templates. We've had scripts run to revert thousands of bot edits when they've proved to be problematic. Bots from CFD/W edit thousands of pages daily to change category tags when often all that's being done is to change a category name from "Articles About Foo" to "Articles about Foo". Many smaller changes have been made to the Wiki, but we've made larger changes too, and we'll continue to make them as we deem them necessary. Unlike the template standardisation, this is not something that will happen instantaneously - even with MelonBot running at full speed, it will take days if not weeks to complete. And most importantly of all, it's something that, if we do it right, only has to be done once. <b style="color:forestgreen;">Happy</b>‑<b style="color:darkorange;">melon</b> 11:37, 30 January 2008 (UTC)
 * I don't know what you're referring to by a smaller wiki. I am first and foremost objecting to the moves of pages that are just fine where they are: WP:PR/Articlename. I see no need to move those; they aren't difficult to find to begin with, and they don't help the template system (which is my idea and prototype, by the way). All the subpages at /try this and /archive that are a somewhat different matter. Furthermore, there would be quite a few issues to sort out. 1) Are you, in fact, updating links after any moves? 2) Which ones? 3) How would you keep the subpages in sequential order? Gimmetrow 21:01, 30 January 2008 (UTC)
 * Essentially I am saying that yes, perhaps on a smaller wiki this would be a large enough operation to have serious performance considerations - here on Wikipedia, with 11 million pages, it's barely a drop in the ocean. I'm not claiming that reviews at <tt>Wikipedia:Peer review/articlename</tt> are hard to find by hand, only that having pages at this location makes them utterly impossible to archive easily.  The simple fact is that when an editor wants to start another peer review for an article which already has one at <tt>Wikipedia:Peer review/articlename</tt>, they 1) have to move the review somewhere else, at which point it can very easily get lost depending on where they move it to, and 2) they really need to update all incoming links to the old peer review, which of course never gets done.  The net effect is that, yes, peer reviews do get lost under the current system, and that is something that can be corrected in this easy, if large-scale, tidying operation.
 * In response to your technical questions, all links are updated after each move, including transclusions and piped links; and pages will be kept in sequential order by doing the non-standard archives first and in alphabetical order. I can't think of any realistic archive format which would result in a newer peer review appearing earlier in the list than an older one for the same title.  <b style="color:forestgreen;">Happy</b>‑<b style="color:darkorange;">melon</b> 21:26, 30 January 2008 (UTC)
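The ordering rule described above ("non-standard archives first and in alphabetical order") might look roughly like the following. This is a hedged sketch, not MelonBot's code: `plan_moves`, its parameters, and the choice to move the un-suffixed page last are assumptions made for illustration.

```python
PREFIX = "Wikipedia:Peer review/"


def plan_moves(article, subpages, root_exists=True):
    """Return (old_title, new_title) pairs for one article's reviews.

    `subpages` is a hypothetical list of existing subpage names such as
    "Archive 1" or "Try 2".
    """
    base = PREFIX + article
    # Non-standard archives first, in (case-insensitive) alphabetical order.
    ordered = [f"{base}/{name}" for name in sorted(subpages, key=str.casefold)]
    # Assumption: the un-suffixed page holds the newest review, so it goes last.
    if root_exists:
        ordered.append(base)
    moves = []
    for n, old in enumerate(ordered, start=1):
        new = f"{base}/archive{n}"
        if old != new:  # pages already at the right title are left alone
            moves.append((old, new))
    return moves


for old, new in plan_moves("Foo", ["Archive 2", "Archive 1"]):
    print(old, "->", new)
```

The sketch does not check whether a target title is already occupied by a different page; a real run would need to detect that case and stop.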

(←) I promised to comment here. Gimmetrow, Happymelon and I all understand that Gimmetrow's clever template idea means that in future all peer reviews will be placed, ab initio, in the next free /archiveN page. I have written the templates to do this, and there is a pretty good chance they will go online tomorrow (I even hoped to do it tonight, but one has to test these things). That means that whatever job is being contemplated here is a one-off, and it is worth getting one-offs right. My own perspective is that a one-off demand on the servers is worthwhile as long as it does something to make life easier for human volunteers on an ongoing basis.

As Gimmetrow rightly points out, the main issue is the hodgepodge of peer review archive pages. There are merely hundreds of these, and they need to be fixed. After this, what remains are the Peer review/ARTICLE NAME pages. These are not the oldest peer reviews in general, but the most recent, although in most cases, there has only been one peer review, and so the two concepts (oldest/most recent) are the same. However, at the very least, Peer review/ARTICLE NAME pages for which there are Peer review/ARTICLE NAME/something pages do need to be archived, to record them as peer reviews more recent than the archived peer reviews. Otherwise they will be hard to find in the new peer review system which stores all peer reviews under archived names.

I agree with Gimmetrow that those Peer review/ARTICLE NAME pages for which there is no archived peer review could become the de facto archive zero original peer review, but this will just confuse human editors. As I say, if there is a temporary fix involving server time which will make life easier for human volunteers on an ongoing basis, then it should not be opposed.

My main concern about use of resources is the effort that HappyMelon will be devoting to make this work. But HappyMelon clearly believes quite strongly that this is a worthwhile thing to do, and so has my fullest support.

This discussion is very helpful, however. Let's also see what issues emerge from the test, and allow HappyMelon to adapt his proposal accordingly. Geometry guy 22:56, 30 January 2008 (UTC)


 * (after ec) Under the new system, an editor will initiate a PR with <tt></tt> on the talk page, which will expand to peerreview and link to <tt>Wikipedia:Peer review/articlename</tt> if that page doesn't exist. If a page exists at <tt>Wikipedia:Peer review/articlename</tt>, the editor does not do anything different. The editor types exactly the same thing: <tt></tt> on the talk page, and the template finds <tt>Wikipedia:Peer review/articlename/archive1</tt>. Nothing will ever be moved, nor links updated, again. I also don't want HappyMelon to do unnecessary work. I've implemented this system at portal peer review (with PPR and portalpeerreview). Gimmetrow 23:03, 30 January 2008 (UTC)
 * That's a nice idea, but if you want to do something, do it: newPR is a redlink, whereas PR has been developed; it has been really hard work getting this to the implementation stage without disrupting the continuity of the PR process. It may go online tomorrow, but only if I can be sure it won't mess up the peer review process. There will still be problems when I do it anyway. That takes time and effort. It makes me admire all the more the time and effort you devote to the work you do with GimmeBot, but I hope you likewise appreciate it takes time and effort to implement your template idea at PR smoothly; and I'm doing it. Geometry guy 23:44, 30 January 2008 (UTC)
 * I was doing this on portal peer review, with the idea that once the bugs were worked out, it would be a quick transition to PR. Gimmetrow 00:04, 31 January 2008 (UTC)
 * Great. I'm similarly transitioning from GAR experience to PR (although I am already experienced in automating the PR page). It is not a quick transition, but I hope we can compare notes: maybe this weekend. Geometry guy 00:10, 31 January 2008 (UTC)

Some specific cases: Dealing with all this seems like a lot of work to code, and I just don't see the benefit. Gimmetrow 23:18, 30 January 2008 (UTC)
 * 1) What do you do if an article has WP:PR/Article/Archive1 and WP:PR/Article/Try 1 and WP:PR/Article/archive1?
 * 2) What do you do with all the WP:PR/Article pages that are already pre-made for the next PR, but have no real content?
 * 3) What if WP:PR/Article from 2 years ago was mentioned to 100 members of a project - do you update all those links?
 * 4) When you're updating links, can you update links written as Wikipedia:Peer_review/My_article/Try_1 and Wikipedia:Peer review/My article/Try 1 (and other variations)?
 * 5) What about redirects to the PR subpage, which may be linked in PR archives or ArticleHistory?
 * 1) An exception is raised and the move performed manually.
 * 2) If an edit by GimmeBot is detected to the page, a manual interrupt requires human confirmation that the move is OK. If GimmeBot is the only editor, the title is automatically skipped.
 * 3) All links are updated. We might as well do this properly.
 * 4) Yes
 * 5) Redirects are also diverted - they would be fixed anyway by the DoubleRedirect bots
 * <b style="color:forestgreen;">Happy</b>‑<b style="color:darkorange;">melon</b> 20:45, 31 January 2008 (UTC)
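On the question of link variations (underscores versus spaces, and similar), one common approach is to normalise every title before comparing it. The sketch below illustrates that idea only and is not taken from MelonBot; `normalize_title` is a hypothetical helper, and it assumes the usual MediaWiki conventions that underscores and spaces are interchangeable and that the first letter of a title is capitalised.

```python
def normalize_title(title):
    """Reduce equivalent spellings of a page title to one canonical form."""
    # Underscores and spaces are treated as the same character.
    title = title.replace("_", " ")
    # Collapse runs of whitespace and trim the ends.
    title = " ".join(title.split())
    # Assumed convention: the first letter is case-insensitive.
    return title[:1].upper() + title[1:]


a = normalize_title("Wikipedia:Peer_review/My_article/Try_1")
b = normalize_title("Wikipedia:Peer review/My article/Try 1")
print(a == b)
```

Comparing normalised forms lets both spellings from the question match the same target when links are rewritten.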

Well you can see a half-dozen complete move operations in MelonBot's recent contributions. I can't see any problems that I haven't fixed, although there are a wide variety of mistakes, most of which have only occurred once. I'm going to keep running nice and slowly and try and pick up on any new errors - if anyone spots any, drop a line here or on either my or MelonBot's talk page. <b style="color:forestgreen;">Happy</b>‑<b style="color:darkorange;">melon</b> 20:45, 31 January 2008 (UTC)


 * It is looking good. Meanwhile, I've implemented the changes at PR so that all future peer reviews will be stored permanently in /archiveN pages: no more PR page moves after this! Geometry guy 09:47, 1 February 2008 (UTC)
 * Are you sure it's a good idea to do that before we've finished all the moves? Any new repeat requests during the changeover period will probably end up out of order.  <b style="color:forestgreen;">Happy</b>‑<b style="color:darkorange;">melon</b> 10:06, 1 February 2008 (UTC)
 * Those are low-frequency, and I'm keeping an eye out for them. I needed to implement the changes to simplify the archiving process. Geometry guy 10:25, 1 February 2008 (UTC)
 * On the other hand, this has convinced me that we really do have to deal with all 6000 pages. Even when a Wikipedia process is relatively straightforward, editors often fail to follow the instructions (if they read them) correctly. When the process is inherently confusing, mistakes are more frequent and someone has to pick up the pieces. It simply isn't acceptable that the Wikipedia:Peer review/ARTICLE NAME pages are a mixture of hard and soft redirects, and peer reviews old and new. It is just too confusing to leave some of them as a de facto archive0: they will be in constant danger of being overwritten or randomly moved to archive pages as regular editors struggle to figure out what is going on.
 * So I can only express my gratitude to Happy-Melon for being willing to sort this mess out once and for all. Geometry guy 19:37, 2 February 2008 (UTC)

BAGAssistanceNeeded Given that I've "kept running nice and slowly" through 17,000 edits with very few mistakes, perhaps a closure is warranted :D?? <b style="color:forestgreen;">Happy</b>‑<b style="color:darkorange;">melon</b> 10:02, 19 February 2008 (UTC)
 * 25,401 edits later, the run is done. I guess you missed the boat on approving this one :D <b style="color:forestgreen;">Happy</b>‑<b style="color:darkorange;">melon</b> 18:23, 20 February 2008 (UTC)
 * Well, I guess you finished it, and my estimate of the edit count was about right. Now that it's done, it's good that it was done. Gimmetrow 05:30, 24 February 2008 (UTC)
 * Can we close this now, then? SQL <sup style="font-size: 5pt;color:#999">Query me!</sup>  08:48, 1 March 2008 (UTC)
 * Whoops. Because the task is finished,  -- Cobi(t 09:25, 1 March 2008 (UTC)


 * The above discussion is preserved as an archive of the debate. Please do not modify it. Subsequent comments should be made in a new section.