Wikipedia:Bots/Requests for approval/PaievBot


 * The following discussion is an archived debate. Please do not modify it. Subsequent comments should be made in a new section. The result of the discussion was Approved.

PaievBot
Operator: Paiev Discuss!

Automatic or Manually Assisted: Pseudo-automatic; requires operator input and then acts based on it. Unsupervised.

Programming Language(s): PHP, using SQL's handy SxWiki framework.

Function Summary: bot adds a given template text to talk pages of articles in a given category and sub-categories.

Edit period(s) (e.g. Continuous, daily, one time run): Daily

Edit rate requested: 6 edits per minute

Already has a bot flag (Y/N): N

Function Details: The bot is given a few parameters from an HTML form: template to be added, edit summary, main category, and partial template to search for (e.g. if the template is, this parameter would be {{WikiProject X ). It grabs a list of articles in the main category, checks the talk page for the template and adds it if necessary, and then grabs a list of sub-categories in the main category and repeats until every category/sub-category has been searched.
 * Note that this does not necessarily apply to just adding in templates, but in its broadest form can be used to make mass, conditional edits to articles and/or their talk pages in a given category, although the intention is to add or modify templates.


 * Also note that in order to minimize/eliminate the tagging of unrelated articles, two constructs will be used: a list of "whitelist" and "blacklist" subcategories, in which all articles will be tagged or not tagged, respectively, and a maximum search depth parameter, where any article deeper than n subcategories will not be tagged. A combination of the two can minimize unwanted changes.


 * I can see quite a few applications of this bot's functionality. For example, this and this, two requests for bot work recently, could be done by the bot. Paiev Discuss! 05:56, 19 December 2007 (UTC)
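
The talk-page check described in the function details can be sketched as follows. This is a minimal illustration in Python, not the bot's own PHP/SxWiki code, which is not shown in this request. The function names, the plain substring match, and the choice to put the template at the top of the talk page are all assumptions.

```python
def needs_template(talk_text: str, partial: str) -> bool:
    """Return True when the partial template string is absent from the talk page."""
    # Assumption: a plain, case-sensitive substring test is enough to detect
    # an existing banner such as "{{WikiProject X".
    return partial not in talk_text


def tag_talk_page(talk_text: str, template: str, partial: str) -> str:
    """Return the talk page text with the template added only if it is missing."""
    if not needs_template(talk_text, partial):
        return talk_text  # already tagged, leave the page untouched
    # Assumption: the template goes on its own line at the top of the page.
    return template + "\n" + talk_text


if __name__ == "__main__":
    tagged = "{{WikiProject X|class=Stub}}\nSome discussion."
    print(tag_talk_page(tagged, "{{WikiProject X}}", "{{WikiProject X"))
    print(tag_talk_page("Some discussion.", "{{WikiProject X}}", "{{WikiProject X"))
```

Searching for a partial string rather than the full template means a page that already carries the banner with extra parameters is still recognised as tagged.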

Discussion
That seems quite handy! — Preceding unsigned comment added by 76.116.177.115 (talk • contribs) 23:13, December 14, 2007

Do you support WikiProject Banner Shell? -- maelgwn - talk 07:09, 15 December 2007 (UTC)


 * Yes, provided that the template is already on the page and that the template to be added in supports the nested= parameter. It will not add the template to pages with many templates (something on the to-do list). Paiev Discuss! 02:18, 16 December 2007 (UTC)

Recursively following subcategories is not a great idea; the category structure is not set up that way. In addition to the possibility of loops (which are permitted, but discouraged), it's possible to find unrelated articles buried deep in subcategories. For example, if you started in the 'Music' category, you would include Theon of Smyrna as an eventual descendant. — Carl (CBM · talk) 15:57, 15 December 2007 (UTC)


 * I realized that there was going to be some completely unrelated material if you go deep enough; therefore, I programmed in a "whitelist" and "blacklist" of categories. Categories on the whitelist will have the entire category and all of its subcategories (and their subcategories, etc) searched. Categories on the "blacklist" will not be searched. These are optional parameters to the bot. Furthermore, there is a maximum depth that can be searched, so that categories more than n degrees away from the starting category will not be searched. I believe that use of the latter combined with some use of the former should be enough to keep unrelated information to a minimum, though I'm completely open to suggestions. Paiev Discuss! 02:18, 16 December 2007 (UTC)


 * The most reliable way to handle this is to have the person who requests the tagging just give you a list of articles, for which he or she is responsible for verifying the correctness. — Carl (CBM · talk) 04:35, 16 December 2007 (UTC)


 * The problem that I see with that is that a list of articles for a large category may number into the hundreds; furthermore, generating that list would require quite a bit of time. My idea with the "blacklisting" of subcategories is that subcategories that have no bearing on the template/main category can be identified and blacklisted quickly, reducing false positives to a minimum while minimizing the time required. A list of articles requires quite a bit of time from the person collecting them unless it's a very small category.


 * For example, let's say that the bot was adding to articles in Category:Geography of the United Kingdom (example taken from bot requests page). A cursory glance through leads me to the conclusion that the subcategories Category:Time in the United Kingdom, Category:British toponymy, and Category:Royal Geographical Society should all probably be blacklisted, while Category:Reservoirs in the United Kingdom should be whitelisted (assuming I'm interpreting the scope/aims of this wikiproject correctly). I would start the bot with those settings, and then, while it was busy adding the template to all talk pages of articles in the main category/subcategories and to the whitelisted subcategory's subcategories, I would open up another window and poke around the other subcategories, black/whitelisting as necessary. When the first request finished, I would run the second one to finish the task.


 * That was probably a much longer response than needed containing more than necessary. Anyway, in my opinion writing up a list of articles would be rather lengthy and time-consuming, although it is indeed the way to go for absolute maximum reliability. It amounts to hundreds of articles for the aforementioned example, however. Paiev Discuss! 06:08, 16 December 2007 (UTC)


 * I don't quite understand. What's the difference between a whitelisted subcategory and an ordinary subcategory (Counties of the United Kingdom in your example)? My concern here is that there is a potential for a lot of manual cleanup after the bot runs, if it accidentally tags too many articles. — Carl (CBM · talk) 12:45, 16 December 2007 (UTC)


 * In the aforementioned example, the bot would first search/tag all the articles in the main category and one subcategory deep (the depth variable would be set to one), except for the "blacklisted" categories, which it would not search/tag. In addition, it would search through all the articles/subcategories of any whitelisted categories recursively and tag them, regardless of the depth variable. Any categories that are on the whitelist will have all of their articles/subcategories searched recursively and tagged because from what I can tell all of the articles in the category and its subcategories fit the template, while an ordinary subcategory would not be exhaustively searched but only searched to a certain depth because I haven't verified that articles beyond that depth merit the template. Hope that's a bit less confusing. - Paiev Discuss! 18:18, 16 December 2007 (UTC)


 * That's much more clear, thanks. You might want to edit the bot description at the top of the request to reflect this changed functionality. Would you take care of selecting the whitelist/blacklist categories, or would you expect the person making the request to do so? — Carl (CBM · talk) 15:13, 18 December 2007 (UTC)


 * I edited the function summary very slightly and added on an indented section to the function details. As for the whitelist/blacklist, I would probably do it myself very conservatively at first (so if I was unsure whether or not something belonged, I would blacklist it). Then, I would take the blacklist and present it to somebody involved with the task and have them look over it, and then run the bot with a revised list. If, however, I was given a list or if the search space was relatively small, I would just use a list given by someone involved.


 * It's been a few days since the last BAG edit; any comments/questions/trial approval/rejection/other? I hope you don't mind me using the assistance needed template. Paiev Discuss! 05:56, 19 December 2007 (UTC)
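
The search rules discussed in the thread above (a maximum depth, a blacklist that is never searched, and a whitelist that is searched recursively regardless of depth) can be sketched roughly as below. This is a Python illustration under stated assumptions, not the bot's actual code. The in-memory `graph`, the function name, and the toy category and article names are invented. Here `max_depth=1` means the main category plus one level of subcategories, a blacklisted category is skipped even inside a whitelisted subtree, and each category is visited only once (the first path to reach it wins), which also guards against the category loops mentioned earlier.

```python
from collections import deque


def collect_articles(graph, start, max_depth, whitelist=frozenset(), blacklist=frozenset()):
    """Return articles to tag, walking the category graph breadth-first.

    graph maps a category name to a pair (articles, subcategories).
    """
    seen = {start}                      # visited categories; prevents endless loops
    queue = deque([(start, 0, False)])  # (category, depth, inside whitelisted subtree)
    articles, listed = [], set()
    while queue:
        category, depth, unlimited = queue.popleft()
        members, subcategories = graph.get(category, ((), ()))
        for article in members:
            if article not in listed:   # an article may sit in several categories
                listed.add(article)
                articles.append(article)
        for sub in subcategories:
            if sub in blacklist or sub in seen:
                continue
            sub_unlimited = unlimited or sub in whitelist
            if not sub_unlimited and depth + 1 > max_depth:
                continue                # too deep for an ordinary subcategory
            seen.add(sub)
            queue.append((sub, depth + 1, sub_unlimited))
    return articles


# Toy data, invented for illustration only.
toy_graph = {
    "Main": (["A"], ["Counties", "Reservoirs", "Time"]),
    "Counties": (["B"], ["County towns"]),
    "County towns": (["C"], []),
    "Reservoirs": (["D"], ["Reservoirs by region"]),
    "Reservoirs by region": (["E"], ["Main"]),  # deliberate loop back to the start
    "Time": (["F"], []),
}

if __name__ == "__main__":
    print(collect_articles(toy_graph, "Main", 1,
                           whitelist={"Reservoirs"}, blacklist={"Time"}))
```

In the toy run, article C is left out because "County towns" lies beyond the depth limit and is not whitelisted, F is left out because "Time" is blacklisted, and E is included because it sits under the whitelisted "Reservoirs".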

It would be interesting to see your blacklist/whitelist as well for the job. -- maelgwn - talk 08:10, 19 December 2007 (UTC)


 * Trial completed. There were no problems that I saw other than a few issues with the one page that had on it, and those issues have been fixed. I set it to go two levels deep (so the main category as well as its subcategories) with a whitelist of "Urban areas of the United Kingdom;Non-Christian religious placenames in Britain;" and a blacklist of "Maps of the United Kingdom;Time in the United Kingdom;British toponymy;Royal Geographical Society;" (I wasn't sure whether a few of these were relevant, so I blacklisted them. Better safe than sorry is the policy that I will be adopting with this bot; I can always ask the relevant people about something). The lists are semicolon-delimited. Paiev Discuss! 02:22, 21 December 2007 (UTC)
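
For illustration, semicolon-delimited lists like the ones quoted above could be split into category names along these lines. This is a hedged Python sketch; the function name is hypothetical, and dropping the empty entry left by the trailing semicolon is an assumption about how the bot treats its input.

```python
def parse_category_list(raw: str) -> list[str]:
    """Split a semicolon-delimited string into trimmed, non-empty category names."""
    # Assumption: surrounding whitespace is not significant, and the empty
    # entry produced by a trailing semicolon should be ignored.
    return [name.strip() for name in raw.split(";") if name.strip()]


if __name__ == "__main__":
    whitelist = parse_category_list(
        "Urban areas of the United Kingdom;"
        "Non-Christian religious placenames in Britain;"
    )
    print(whitelist)
```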

-- maelgwn - talk 05:49, 21 December 2007 (UTC)


 * The above discussion is preserved as an archive of the debate. Please do not modify it. Subsequent comments should be made in a new section.