Wikipedia:Bots/Requests for approval/ListasBot


 * The following discussion is an archived debate. Please do not modify it. Subsequent comments should be made in a new section. The result of the discussion was Approved.

ListasBot
Operator: Matt (talk)

Automatic or Manually Assisted: Automatic, supervised

Programming Language(s): C#, current source code is here

Function Overview: Helps to reduce the backlog at Category:Biography articles without listas parameter.

Edit period(s): Not quite sure how to describe this yet. "On demand, probably weekly or twice a week" is the best way to describe it right now. I'll elaborate more in the Function Details section.

Already has a bot flag (Y/N): N

Function Details: This is a bot that runs under AWB. It traverses the list of articles in the Biography articles without listas parameter category. These pages are, by definition, articles that have a WPBiography tag but do not have a listas attribute as part of that tag. The bot does the following things with each page:
 * Looks for a DEFAULTSORT tag, or a listas parameter as part of another template. If it finds one, it:
    1. Removes any DEFAULTSORT tags it finds from the page.
    2. Converts the name so that it does not use any non-alphabetical characters, other than a hyphen.
    3. Adds the listas parameter to the WPBiography template.
    4. Syncs all the other listas parameters to the one being inserted into the WPBiography tag.
    5. Removes any duplicate listas parameters in the WPBiography tag.
 * If it doesn't find one:
    * It looks at the namespace of the page.
    * If the namespace is a file talk page, user page, or user talk page, it skips it and moves on to the next page.
    * If the namespace is a category talk page, it uses the de-Unicodified, de-punctuated name of the page (minus the "Category talk:" prefix), then proceeds to step 3 in the list above.
    * If the title of the article is a single word, it uses the de-Unicodified, de-punctuated name of the page (minus the prefix, such as "Talk:"), then proceeds to step 3 in the list above.
 * If it can't find a WPBiography tag in the page (which happens when the page was created using instead of ), it appends  to the end of the article.
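
The decision steps above can be sketched roughly as follows. This is an illustration in Python, not the bot's actual C# source. The function names, the regular expressions, and the list of skipped prefixes are all hypothetical, and keeping spaces and commas in the sort key (so that "Smith, John" survives) is an assumption that goes beyond the "hyphen only" wording above. The sketch only picks a listas value; it does not edit any templates.

```python
import re
import unicodedata

# Hypothetical prefixes for the namespaces the bot is described as skipping.
SKIPPED_PREFIXES = ("File talk:", "User:", "User talk:")

# Hypothetical patterns; real wikitext has more variants than these cover.
DEFAULTSORT_RE = re.compile(r"\{\{\s*DEFAULTSORT\s*[:|]\s*([^{}]*?)\s*\}\}")
LISTAS_RE = re.compile(r"\|\s*listas\s*=\s*([^|{}]*)")


def clean_sort_key(name):
    """De-Unicodify and de-punctuate a name.

    Assumption: letters, hyphens, spaces and commas are kept.
    """
    ascii_name = (
        unicodedata.normalize("NFKD", name)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    letters_only = re.sub(r"[^A-Za-z ,\-]", "", ascii_name)
    return " ".join(letters_only.split())


def existing_sort_key(wikitext):
    """Return a DEFAULTSORT value or the first non-empty listas value, if any."""
    match = DEFAULTSORT_RE.search(wikitext)
    if match and match.group(1):
        return match.group(1)
    for match in LISTAS_RE.finditer(wikitext):
        value = match.group(1).strip()
        if value:
            return value
    return None


def choose_listas(title, wikitext):
    """Pick a listas value for the page, or None if the page should be skipped."""
    found = existing_sort_key(wikitext)
    if found:
        return clean_sort_key(found)
    if title.startswith(SKIPPED_PREFIXES):
        return None
    if title.startswith("Category talk:"):
        return clean_sort_key(title[len("Category talk:"):])
    bare_title = title.split(":", 1)[-1]  # drop a prefix such as "Talk:"
    if len(bare_title.split()) == 1:
        return clean_sort_key(bare_title)
    return None  # multi-word title with no existing sort key: leave for a human
```

For example, `choose_listas("Talk:Madonna", "{{WPBiography}}")` falls through to the single-word rule, while a multi-word title with no DEFAULTSORT or listas anywhere on the page is left alone.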

As to the edit periods, apparently AWB's list of articles that it will work on is limited to 25,000 articles, and this category has a backlog of over 300,000 pages. I've brought this up on the AWB feature request page. However, with this limit in place, the bot would only do a large amount of good the first time around, and any subsequent runs would probably tax the wiki servers more than the good they would do.

Discussion
Call me naive, but is there any reason why the WPBiography listas parameter should have priority over other tags on the page? There's not really a hierarchy of projects. Q T C 00:51, 13 March 2009 (UTC)
 * Well, the best reason I can think to give you is that not having a listas parameter in the WPBiography tag causes the tag to be included in the category I mentioned above that I don't really feel like typing out again (which is listed right at the top of WP:BACKLOG). I suppose there's nothing wrong with leaving the other listas and DEFAULTSORT tags in, as long as everything matches -- if there's any mismatch, it shows up as an error on the page.  I figured that I'd have mine remove them from the page so that there's less chance of a mismatch happening (and it would probably solve some existing mismatches in the process).  That, and it makes the page shorter and easier to maintain.  Matt (talk) 00:58, 13 March 2009 (UTC)
 * Yes, but won't removing the listas parameter from other templates have the possibility of adding the pages right back into another maintenance category, WPOtherProject without listas? Q  T C 01:13, 13 March 2009 (UTC)
 * Hmmm, good point. So would the proper strategy here be just to make sure that all the listas tags match?  Any qualms about removing DEFAULTSORT from those pages?

To answer the first point, about singling out WP Biog, from what I have read in the discussions about project banners:
 * None of the other projects have created a category for the pages in their category that do not have the listas parameter.
 * All of the other projects banners have been "fixed" so that the listas parameter is no longer required.
 * Up until very recently, within the past three months, no one seemed to care about the problem.

As for the other points:
 * If another banner has a listas parameter that is different than that of WP Biog, the least that can happen is that a red warning will appear on the page that there is a DEFAULTSORT conflict. I believe but am not certain that the disaster that used to happen no longer will happen.  (It used to be that if the WP Greece banner followed the WP Biog banner and created a DEFAULTSORT conflict the WP Greece banner would explode.)
 * However, there may be some banners that have not been fixed and as there is no hierarchy of projects, it would probably be a good idea for all the banners on a page to have the same listas parameter.
 * The DEFAULTSORT template should only be on the main space, the article. There it is extremely handy -- it sets the sort order for the article in all the categories that appear below it.  In the case of biography articles that are about real people the DEFAULTSORT template should be replaced with the Lifetime template but that is a matter for the extremely picky and I am not yet that close to the edge.

When this bot is perfected, or its results are monitored very closely, it would be a great help in eliminating a major problem and keeping the problem from being more than a minor irritation.
 * JimCubb (talk) 16:07, 13 March 2009 (UTC)
 * Sounds really good to me. What percentage of the 300,000 do you think it'll "fix"? - Jarry1250 (t, c) 17:04, 13 March 2009 (UTC)
 * Admittedly, not as many as I would have liked. I've been testing it out as an assisted editing tool, and out of about 5,000 pages, it only found about 300 to fix.  Matt (talk) 00:08, 14 March 2009 (UTC)
 * ADDENDUM: Sorry to everyone if I'm confusing you by adding/changing stuff in while the debate is going on. However, with the changes I made, there's the potential to fix several thousand pages in total.  Matt (talk) 04:18, 14 March 2009 (UTC)
 * Several thousand pages fixed is several thousand pages that humans won't have to waste their time on. We'll see if any problems surface. - Jarry1250 (t, c) 08:37, 14 March 2009 (UTC)
 * Bot performed as expected.  Matt (talk) 23:14, 14 March 2009 (UTC)

Extrapolating from the test as an assisted editing tool, it should fix about 18,000 pages. If it can only do half that many we should all be thrilled.
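
The arithmetic behind that figure, using the rough numbers quoted in this discussion (about 300 fixes out of about 5,000 pages, against a backlog of roughly 300,000), can be checked with a minimal sketch. The function name is hypothetical, and the estimate assumes the sample's hit rate holds across the whole category.

```python
def extrapolate_fixes(sample_size, sample_fixed, backlog):
    """Scale the hit rate seen in a sample up to the whole backlog."""
    return backlog * sample_fixed // sample_size


# About 300 of 5,000 sampled pages were fixable, a 6% hit rate.
estimate = extrapolate_fixes(5_000, 300, 300_000)
print(estimate)  # 18000
```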

How many pages did it look at in the 20-edit trial in order to find 20 that it could fix?
 * JimCubb (talk) 23:32, 14 March 2009 (UTC)
 * It looked at about 120 total. The first 5 were category talk pages, which were kinda given to it as "gimmes" to make sure it would handle the category talk pages correctly.  I think there are close to 2,000 category talk pages on the list.  Matt (talk) 01:13, 15 March 2009 (UTC)


 * Given the backlog, I think that there is simply no reason not to just be grateful and approve. At the very least, this removes the imaginary backlog from the backlog figures, allowing a more productive, targeted approach towards the remaining, real backlog. - Jarry1250 (t, c) 11:21, 15 March 2009 (UTC)


 * The above discussion is preserved as an archive of the debate. Please do not modify it. Subsequent comments should be made in a new section.