Wikipedia:Bots/Requests for approval/QEDKbot


 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was

QEDKbot
Operator:

Time filed: 17:04, Saturday, February 8, 2020 (UTC)

Function overview: Deleting and nominating empty categories under WP:CSD.

Automatic, Supervised, or Manual: Automatic

Programming language(s): Python (mostly Pywikibot and mwAPI)

Source code available: Not yet, will release on GitHub once deployed

Links to relevant discussions (where appropriate): See Bots/Requests for approval/AnkitAWB 2, a previous version of this task run using AWB but tagged for deletion instead of deleting. Test runs were successful, with one out of all nominated categories not being deleted (not due to a bot error). Advertised on WP:AN: Administrators' noticeboard/Archive317

Edit period(s): Twice daily (deletion), every 4 days (tagging)

Estimated number of pages affected: ~90k (<140k) (excludes hiddencats and includes all other categories with 0 members, even category redirects and possibly empty categories)

Namespace(s): Categories

Exclusion compliant (Yes/No): Yes

Adminbot (Yes/No): Yes

Function details:

General:
 * The bot will go over all categories with no members.
 * If the category does not exist, it will skip the page (this is necessary due to a lag in DB replicas).
 * If it exists, it will check that the category has 0 members.

Tagging:
 * If the category is a category redirect, it will check for backlinks.
 * If it has a talk page and 1 backlink, or no talk page and 0 backlinks, it adds the category to Category:Empty categories with no backlinks, which other editors can assess for CSD#G6. Many cat-redirects of this nature are implausible typos (which would be eligible for R3 in the article namespace) or were meant to serve a utility where there is none.
 * If the category is tagged with, , , (Cf* to be accurate) or its redirecting templates, it will skip the page.
 * If the category is not any of the above, it will nominate it for deletion under CSD#C1.

Deletion:
 * It will check the Category:Empty categories awaiting deletion category.
 * If the latest revision is from 7 days ago and the category does not meet any of the above criteria (of being possibly empty or a redirect category), it will delete the page. If the category has a talk page, it will delete that under CSD#G8.

If possible, I'd like a split trial where we nominate a fixed number of categories and delete a fixed number of categories.
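
The tagging steps listed above can be read as a small decision function. The following is only a sketch of that reading, not the bot's actual code (which was unreleased at the time of filing): `CategoryInfo`, `Action` and `decide_tagging` are hypothetical names, the inputs are assumed to have been fetched from the API beforehand, and it assumes that the backlink rule applies only to category redirects and that a redirect with other backlinks is left alone.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    SKIP = "skip"
    TRACK = "add to Category:Empty categories with no backlinks"
    TAG_C1 = "nominate under CSD#C1"


@dataclass
class CategoryInfo:
    """Hypothetical snapshot of one category, filled in by the caller."""
    exists: bool
    member_count: int
    is_redirect: bool
    has_talk_page: bool
    backlink_count: int
    has_exclusion_template: bool  # "possibly empty" style tags and CSD tags


def decide_tagging(cat: CategoryInfo) -> Action:
    # Replica lag can list pages that are already gone, so re-check existence.
    if not cat.exists:
        return Action.SKIP
    # The listing may be stale, so re-check that the category is still empty.
    if cat.member_count != 0:
        return Action.SKIP
    if cat.is_redirect:
        # A talk page accounts for one backlink; anything beyond that means
        # the redirect is still linked from somewhere (assumed reading).
        expected = 1 if cat.has_talk_page else 0
        return Action.TRACK if cat.backlink_count == expected else Action.SKIP
    if cat.has_exclusion_template:
        return Action.SKIP
    return Action.TAG_C1
```

Keeping the decision separate from the API calls means each rule can be checked against hand-written cases before the bot touches a live page.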

Discussion

 * This bot appears to have edited since this BRFA was filed. Bots may not edit outside their own or their operator's userspace unless approved or approved for trial. AnomieBOT ⚡ 18:40, 8 February 2020 (UTC)
 * Does "nominate it for deletion" mean nominate it for WP:CFD? Also note what the bot above me said. Jo-Jo Eumerus (talk) 18:47, 8 February 2020 (UTC)
 * No, the bot will nominate it for CSD#C1. Apologies for the error above, all reverted now, and logging in userspace only. --qedk (t 桜 c) 18:49, 8 February 2020 (UTC)
 * Preliminary testing here: User:QEDKbot/Catlog, looks good to me. Once the deletion bit is set up, will log that too. --qedk (t 桜 c) 12:19, 9 February 2020 (UTC)
 * Category redirects are not eligible for C1. Where is the community approval for this task as required by WP:ADMINBOT? — JJMC89 (T·C) 21:02, 9 February 2020 (UTC)
 * It does not nominate any redirect, disambiguation, CSD-nominated or non-empty category for deletion - only sorts them into a tracking category and nominates the rest for deletion which would meet C1. --qedk (t 桜 c) 21:14, 9 February 2020 (UTC)
 * Either way, posted to AN. --qedk (t 桜 c) 21:20, 9 February 2020 (UTC)
 * I know it doesn't, but you say the purpose of the tracking category is for assess[ment] for CSD#C1 by other editors. Assessment for C1 isn't needed since they are never eligible for C1. — JJMC89 (T·C) 21:33, 9 February 2020 (UTC)
 * All category redirects aren't useful and can be deleted under G6 (if not C1), that's also what the template says. --qedk (t 桜 c) 21:39, 9 February 2020 (UTC)
 * Assessment of category redirects should be done through WP:RFD, like all redirects. If there is a consensus that G6 can be applied to delete category redirects, please post a link to the discussion. C1 does not apply as category redirects are not categories, empty or otherwise. WP:G8 can apply if the target has been deleted. Ivanvector (Talk/Edits) 16:24, 10 February 2020 (UTC)
 * I'm unsure what you're talking about; category redirects are categories, even if they are soft redirects to other categories. Assessment of all pages in the category namespace is done through WP:CFD. This is also evidenced by a lot of these category redirects having backlinks to CFD (none to RFD afaik), where they were discussed and redirected in lieu of deletion. WP:CATRED is a subsection of CfD. --qedk (t 桜 c) 16:36, 10 February 2020 (UTC)
 * Also, regarding the last bot run for categories, a lot of them were deleted under C1 or G6. G6 applies to all technical deletions or uncontroversial maintenance tasks, thus including deleting category redirects that are not useful. --qedk (t 桜 c) 16:40, 10 February 2020 (UTC)
 * Well, they do come up at WP:RFD from time to time. Probably getting off-topic to your bot now, but (IMO) G6 shouldn't be used where another criterion applies. "Category redirects that are not useful" seems like a subjective criterion to me (who determines they're not useful?) but a redirect that points to a deleted page already qualifies for G8 deletion, that's what I meant. More specifically, a redirect that points at an empty category would be deleted (G8) when the category is deleted (C1), so determining the redirect's utility is a moot point. Ivanvector (Talk/Edits) 17:36, 10 February 2020 (UTC)
 * Well, it would still be administrator discretion, the bot does not delete or tag for CSD, any category redirects, it is true that there's multiple routes it could go to. --qedk (t 桜 c) 19:19, 10 February 2020 (UTC)
 * I am aware of many maintenance categories which are tagged as such but not as empty category (in fact, one of which I removed from a page recently at ), as the already-provided category templates communicate mostly duplicate information. --Izno (talk) 04:15, 10 February 2020 (UTC)
 * From preliminary testing, such categories are mostly not empty and should otherwise be eligible under C1? --qedk (t 桜 c) 05:48, 10 February 2020 (UTC)
 * I didn't see it from the task list you posted at AN, but will your bot disregard maintenance categories that are intended to be empty? Ivanvector (Talk/Edits) 16:24, 10 February 2020 (UTC)
 * As far as I am aware, maintenance categories that can be empty at a given time are tagged with, so the bot will skip all of them. --qedk (t 桜 c) 16:30, 10 February 2020 (UTC)
 * Perfect. Ivanvector (Talk/Edits) 17:36, 10 February 2020 (UTC)
 * Question: what will happen when an editor adds an article to the category after the category has been added to Category:Empty categories awaiting deletion? Marcocapelle (talk) 22:53, 10 February 2020 (UTC)
 * Even after the list is fetched via the API, the bot performs two checks for each category: 1) "does the page exist?" 2) "does the category have no members?", failing either of which, the bot simply skips doing any action on the page. --qedk (t 桜 c) 07:57, 11 February 2020 (UTC)
 * Ok thanks. Then I have no objection. Marcocapelle (talk) 17:40, 11 February 2020 (UTC)
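
The two re-checks described in the reply above, combined with the 7-day wait from the function details, could look roughly like the sketch below. It is an illustration only: `should_delete` and its parameters are hypothetical, the values are assumed to be fetched fresh from the API immediately before acting, and "from 7 days ago" is interpreted here as "at least 7 days old".

```python
from datetime import datetime, timedelta, timezone

# Assumed interpretation: the latest revision must be at least this old.
WAIT = timedelta(days=7)


def should_delete(exists: bool, member_count: int,
                  last_revision: datetime, now: datetime) -> bool:
    """Return True only if the category still passes every check right now."""
    # Check 1: does the page exist?
    if not exists:
        return False
    # Check 2: does the category have no members? If an editor has added a
    # page since tagging, this fails and the bot takes no action.
    if member_count != 0:
        return False
    # Deletion-only check: the waiting period must have elapsed.
    return now - last_revision >= WAIT


now = datetime(2020, 2, 20, tzinfo=timezone.utc)
tagged = now - timedelta(days=8)
print(should_delete(True, 0, tagged, now))  # still empty after the wait
print(should_delete(True, 1, tagged, now))  # repopulated since tagging
```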


 * See User:QEDKbot/Deletion catlog for an overview of the other function. --qedk (t 桜 c) 19:28, 13 February 2020 (UTC)


 * I'm currently the admin who deals with empty categories the most frequently (that is, every day) and UnitedStatesian is the primary editor who tags empty categories for deletion. The current system works fine and it's unclear to me how this bot would assist the daily work that we do. I don't see what problem this is solving. Does this bot conflict with BernsteinBot that is run by MZMcBride that we currently rely on? I don't know why MZMcBride is not included in this discussion or any of the editors and admins who work with categories, especially empty categories.
 * The primary problem we currently have is maintenance categories (normally categories organized by day) that do not appear on the Empty Category list because they are now excluded. This was a decision made by MZMcBride that I don't agree with but without this exclusion, empty categories for future dates were appearing on the list. Now, it is much more time-consuming for me and other admins to go through these maintenance categories, category by category, looking for empty categories from days that have passed. In general, we have a problem with categories tagged G6 that don't appear in deletion categories. This has been brought up on the Technical Village Pump multiple times and we've been told that WMF has been working on this issue for years but it is a low priority. If you could resolve this problem, your bot would be a welcome addition. Liz Read! Talk! 04:08, 14 February 2020 (UTC)


 * The bot does not even edit in the same namespace as BernsteinBot, so I doubt a conflict would occur (unsure what kind of conflict you're referring to; I'm guessing the kind where they edit in the same areas?). BernsteinBot pulls a very specific set of empty categories for database reports; mine does not. It pulls all categories which have no members, checks that they exist and have no members, and filters them in if they do not have any backlinks. Now, coming to deletion: the bot automatically determines when the cats were included and filters them out if they do not meet the C1 criteria. It also automatically detects whether a category meets the said C1 criteria and deletes it after an appropriate amount of time passes; every aspect is automated. This bot does not use database reports, but rather fetches all categories with 0 members via the API. This includes the maintenance categories you stated above; however, if said maintenance categories are tagged with, it will skip them, since the bot cannot determine whether a given maintenance category should exist. Now, coming to the final aspect: there is a new tracking category, Category:Empty categories with no backlinks, which tracks categories that have no utility, so editors can now choose to trawl through them, identify categories that are no longer needed, and tag/delete them. --qedk (t 桜 c) 10:15, 14 February 2020 (UTC)
 * Could you reflag this bot for purposes of testing (and because I don't want to flood RecentChanges), I requested a self-removal (with no issues), so I hope it's not an issue now. But, just for the sake of confirmation, this bot will only edit in its own userspace, so you needn't be worried about fallout. --qedk (t 桜 c) 10:22, 14 February 2020 (UTC)
 * , done.  Maxim (talk)  13:52, 14 February 2020 (UTC)
 * BAGAssistanceNeeded --qedk (t 桜 c) 19:57, 14 February 2020 (UTC)

Regarding the source code, "Not yet, will release on GitHub once deployed" is a bit weird.

The categories system is pretty bad overall. The quarter-assed support for category redirects is part of the problem. The distinction between a category description page existing and the category being populated is also part of the problem. The two issues are specific to MediaWiki and should be resolved there, in my opinion. The current practice of having bots auto-creating and auto-deleting maintenance (and non-maintenance) categories constantly is silly and unnecessary. --MZMcBride (talk) 00:25, 16 February 2020 (UTC)
 * Production code is a bit different from development code and I don't want to polish it until it's up for trial; most of what will happen in the trial can be seen in the logs: User:QEDKbot/Catlog, User:QEDKbot/Deletion catlog. I honestly do not know if WMF will fix anything, so this bot is the best I've got. --qedk (t 桜 c) 08:33, 16 February 2020 (UTC)


 * Unless it seems unreasonable to those involved, 40 tagging and 10 deletions seem like decent numbers, as neither will overload the system if there's an issue. Primefac (talk) 21:33, 23 February 2020 (UTC)
 * Did you intend to admin-flag the bot as well or is there more process to it? Also, does "40 tagging" signify 40 CSD tags or 40 categorizations? --qedk (t 桜 c) 06:04, 24 February 2020 (UTC)
 * Forgot, fixed. Not really bothered what type of edit is made (to answer your second question), though I suppose a few of each type would be reasonable as proof of concept. Primefac (talk) 11:12, 24 February 2020 (UTC)
 * If you're here because the bot is misbehaving, use the task disable page (listed on the userpage) as an alternative to blocking. There is a small lag, so please have some patience. With thanks. --qedk (t 桜 c) 20:26, 24 February 2020 (UTC)
 * Actually finished a few days ago but was too busy to write a full report, so a quick run-down of things:
 * User:QEDKbot/Catlog, User:QEDKbot/Deletion catlog are some rough logs of what happened (deletion logs more accurate ofc)
 * Deletion looks good to go! No complaints, I think 10 deletions (excluding talk pages deleted under G8, I missed that out), all solid on a review.
 * Tagging category as empty, had an issue where the bot double-tagged, fixed in a later version. Good to go now!
 * Tagging for deletion needs some work, I have to fine-tune the regex further and ensure proper matches.
 * In that aspect, I'm asking for an extended trial, 100 taggings, and another 20 deletions, since deletion is the sensitive aspect and I don't want the bot to mess up later. Courtesy ping to . --qedk (t 桜 c) 21:34, 5 March 2020 (UTC) Adding a note that I've reverted all false positives (mostly due to the C1 mistaggings) and I will be there to clean up if necessary. --qedk (t 桜 c) 21:38, 5 March 2020 (UTC)
 * BAGAssistanceNeeded --qedk (t 心 c) 16:12, 17 March 2020 (UTC)
 * Is it really necessary for the bot to make so many edits to its userspace? * Pppery * it has begun... 23:49, 17 March 2020 (UTC)
 * It's a cronjob for a task that is disabled in categoryspace (see above; so it only updates the userspace). It makes 48 edits per day, is that a lot at all? --qedk (t 心 c) 08:33, 18 March 2020 (UTC)
 * It is, compared to the number of edits the bot makes that are not in userspace. * Pppery * it has begun... 13:21, 18 March 2020 (UTC)
 * Nvm, I forgot that I created a config page for this specific purpose onwiki. Shut off for a while now. --qedk (t 心 c) 11:05, 20 March 2020 (UTC)
 * As requested, please do 100 taggings and 20 deletions. Primefac (talk) 22:08, 22 March 2020 (UTC)
 * BAGAssistanceNeeded The bot needs a +sysop. --qedk (t 心 c) 18:59, 26 March 2020 (UTC)
 * ✅, with my apologies for forgetting Primefac (talk) 21:47, 26 March 2020 (UTC)
 * Please begin the log summaries per the standard conventions, like:
 * to aid scripts often used to parse deletion logs for generating statistics. Also, this bot has performed a lot more than 20 deletions that it was approved for. SD0001 (talk) 18:12, 4 April 2020 (UTC)


 * The 20 deletions don't include talk pages of categories; it's the "category deletion" mandate that is being tested, and the G8 is merely consequential. As for the number of categories, the bot seems to have performed 21, which is one page above the limit (is that an issue?). And as for the edit summary bit, I will update it on the next run. -- qedk ( t  愛  c ) 21:34, 4 April 2020 (UTC)
 * Thanks for the clarification. I don't think 1 page above the limit is an issue (though I'm no BAG).
 * As an enhancement, I'm wondering whether you can also handle monthly maintenance category deletions (probably with a separate BRFA)? Such cats (example Category:Articles_with_dead_external_links_from_October_2010) are automatically G6-nominated by the template when the category is empty. But presently, deletion has to be done by human admins, even though it is a purely mechanical task. SD0001 (talk) 05:07, 14 April 2020 (UTC)
 * I can do it sure, but I don't understand the entire procedure, could you point me to the template and the category where it gets put after that nomination so I can take a look, thanks! -- qedk ( t  愛  c ) 15:20, 14 April 2020 (UTC)
 * didn't you work on automatically deleting DMCs? I found User:AnomieBOT/source/tasks/DatedCategoryDeleterTest.pm. ‑‑Trialpears (talk) 17:41, 14 April 2020 (UTC)
 * I ran a logging-only task to see if it seemed worthwhile to have AnomieBOT III do those deletions. But (at the time anyway) human admins were getting to them rapidly enough that it didn't seem worth seeking approval. Anomie⚔ 21:18, 14 April 2020 (UTC)
 * Ok, good to know! Perhaps your code will come in handy here then.


 * Courtesy ping to . Apologies for the slowness, real-life caught up with me and I also wanted to be thorough with getting things right. Everything looks okay now (the last run was unsupervised and had no issues). -- qedk ( t  愛  c ) 06:52, 14 May 2020 (UTC)
 * BAGAssistanceNeeded -- qedk ( t  愛  c ) 08:18, 20 May 2020 (UTC)
 * Primefac (talk) 18:02, 22 May 2020 (UTC)

Update: following a discussion, specifically this subsection, the bot task has been amended to specify the bot's approved tasks:
 * Tag empty categories with db-c1 if they meet the following criteria:
 * not a category redirect
 * does not have an "exclusion template", such as pec, as described in the original discussion
 * is not a WikiProject assessment category or other valid maintenance category determined to be potentially empty and not yet exempted via the previous point
 * Delete empty categories tagged with db-c1 that meet the criteria described in the original discussion

The bot has also been rate-limited to no more than 100 db-c1 taggings in a week initially to avoid flooding the C1 category.
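
The weekly cap of 100 db-c1 taggings mentioned in this update could be enforced with a simple counter over a time window. The sketch below is an assumption about one way to do it, not a description of how QEDKbot implements the limit: `WeeklyTagLimiter` is a hypothetical name, and a rolling 7-day window is assumed because the update does not say whether "a week" means a calendar week or a rolling one.

```python
from collections import deque
from datetime import datetime, timedelta, timezone


class WeeklyTagLimiter:
    """Allow at most `limit` taggings within any rolling `window`."""

    def __init__(self, limit: int = 100,
                 window: timedelta = timedelta(days=7)) -> None:
        self.limit = limit
        self.window = window
        self._stamps = deque()  # timestamps of recent taggings, oldest first

    def try_tag(self, now: datetime) -> bool:
        # Forget taggings that have aged out of the window.
        cutoff = now - self.window
        while self._stamps and self._stamps[0] <= cutoff:
            self._stamps.popleft()
        # Refuse once the cap is reached; the caller skips the category.
        if len(self._stamps) >= self.limit:
            return False
        self._stamps.append(now)
        return True


limiter = WeeklyTagLimiter()
start = datetime(2020, 8, 28, tzinfo=timezone.utc)
allowed = sum(limiter.try_tag(start) for _ in range(150))
print(allowed)  # number of taggings permitted out of 150 attempts
```

A real bot would need to persist the timestamps between runs (for example on its on-wiki config page or in a local file), since a cron-driven process does not keep state in memory.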

Significant concerns about the running of this bot task should be directed to WP:BOTN following the normal procedures. Updates and amendments to this updated task will be indicated here. Primefac (talk) 17:48, 28 August 2020 (UTC)


 * The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.