Wikipedia:Bots/Requests for approval/HasteurBot 2


 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Approved.

HasteurBot 2
Operator:

Time filed: 01:04, Monday July 29, 2013 (UTC)

Automatic, Supervised, or Manual: Automatic (originally listed as Manual)

Programming language(s): python/pywikipedia, SQLite (for maintaining who has been notified about what article DB)

Source code available:

Function overview: To traverse Category:AfC_submissions_by_date and its descendant subcategories (starting with October 2008) to
 * 1) Identify AfC submissions (using the page prefix Wikipedia talk:Articles for creation/) that have not been edited in more than 180 days
 * 2) Perform a null edit on the page so that templates and categories may be re-evaluated and populate the Category:G13 eligible AfC submissions category if appropriate
 * 3) Give notice to the page author that the page is eligible for deletion and could be deleted in the near future
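A minimal sketch of the eligibility test in step 1, assuming the last-edit timestamp has already been fetched from the wiki as a timezone-aware UTC datetime. The names `AFC_PREFIX`, `STALE_AFTER`, and `is_stale_afc` are hypothetical and are not taken from the bot's source.

```python
from datetime import datetime, timedelta, timezone

# Page prefix named in the function overview.
AFC_PREFIX = "Wikipedia talk:Articles for creation/"
# Staleness threshold from the function overview (more than 180 days).
STALE_AFTER = timedelta(days=180)


def is_stale_afc(title, last_edit, now=None):
    """Return True if `title` is an AfC submission last edited more than 180 days ago.

    `last_edit` and `now` are timezone-aware datetimes (assumed UTC).
    """
    if now is None:
        now = datetime.now(timezone.utc)
    if not title.startswith(AFC_PREFIX):
        return False
    return now - last_edit > STALE_AFTER
```

Passing `now` explicitly keeps the check deterministic, which makes it easier to test a batch against a fixed cut-off date.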

Links to relevant discussions (where appropriate):

Edit period(s): Edits in driver data batch sets (Year-Month invocations) to prevent any one edit set overflowing the memory heap of the process. Once the bot has given notice to a user that a specific article could be removed (unique Submission-Author key), the bot will not notify again. Initially, the bot will be run by hand and monitored, but once notices have gone out for the majority of the backlog of 80 thousand stale AfCs, the bot will transition into a twice-daily automated maintenance process to notify as other pages become eligible. If a category has truly been emptied (no potentially eligible pages ever), configuration of the driver data will be modified to not look at that category any more.
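The Year-Month batching described above could be driven by something like the following sketch. It assumes one invocation per (year, month) pair and a configurable set of emptied months to skip. The function name `month_batches` and the `skip` parameter are illustrative only.

```python
from datetime import date


def month_batches(start, end, skip=frozenset()):
    """Yield (year, month) pairs from `start` to `end` inclusive, oldest first.

    `skip` holds (year, month) pairs whose categories are known to be empty,
    mirroring the idea of removing emptied categories from the driver data.
    """
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        if (year, month) not in skip:
            yield year, month
        month += 1
        if month > 12:
            year, month = year + 1, 1
```

Processing one month per invocation keeps each run's working set small, which is the stated reason for batching.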

Estimated number of pages affected: Currently there are 80 thousand declined AfC submissions, so I expect there to be 80 thousand null edits and 80 thousand user page notifications. As new pages become eligible, this number will go progressively up.

Exclusion compliant (Yes/No): Yes. The bot will honor the afd/prod opt-outs on user talk pages (though I think that this is a bad idea) using Template:Bots (which I notice doesn't have the opt-out logic) and will obey exclusions directed at itself (HasteurBot) globally.
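As a rough illustration of exclusion compliance, the sketch below checks a talk page's wikitext for `{{nobots}}` or `{{bots|allow=...|deny=...}}` before posting. It is a simplified reading of the Template:Bots convention, not the bot's actual code: it looks only at the first such template, treats any `{{nobots}}` as a blanket denial, and does not handle the message-type opt-outs (afd/prod) mentioned above. The function name `may_edit` is hypothetical.

```python
import re

# Matches {{nobots}} or {{bots|...}}; group 2 holds the raw parameter string, if any.
BOTS_RE = re.compile(r"\{\{\s*(nobots|bots)\s*(?:\|([^}]*))?\}\}", re.IGNORECASE)


def may_edit(wikitext, bot_name):
    """Return True if `bot_name` is not excluded by a bots/nobots template."""
    match = BOTS_RE.search(wikitext)
    if match is None:
        return True  # no exclusion template present
    if match.group(1).lower() == "nobots":
        return False  # simplification: any {{nobots}} denies every bot

    # Parse "key=a,b,c" parameters into lowercase sets.
    rules = {}
    for part in (match.group(2) or "").split("|"):
        key, sep, value = part.partition("=")
        if sep:
            rules[key.strip().lower()] = {
                v.strip().lower() for v in value.split(",") if v.strip()
            }

    bot = bot_name.lower()
    if "deny" in rules:
        denied = rules["deny"]
        if "none" in denied:
            return True
        return not ("all" in denied or bot in denied)
    if "allow" in rules:
        allowed = rules["allow"]
        if "none" in allowed:
            return False
        return "all" in allowed or bot in allowed
    return True  # bare {{bots}} allows everyone
```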

Already has a bot flag (Yes/No): No

Function details: After I vented my frustrations regarding the stalled progress on getting Bots/Requests for approval/HasteurBot approved, it was suggested in IRC that it might be useful to send out a nudge at the 180-day mark (or after) so that users are aware that CSD:G13 is going to start being enforced and have time to do something about it. This also has the side benefit of starting to populate the Category:G13 eligible AfC submissions category by having all the templates re-evaluated on the null edit.

Depending on the acceptance of this task, I may withdraw the HasteurBot 1 task and re-work it to only nominate submissions where the author has been given 30 days since the bot gave the nudge notice to the author. This gives an editor the 180 days from the last edit to the nudge notice and an extra 30 days after that notice before the bot could take the step of nominating.

Discussion
Could you include the newest expired articles in your trial/initial run, so there is a higher chance that the users respond and maybe give feedback? Please also use a descriptive summary with links and all, since this is more than likely to deal with new/inexperienced users. Can you also post the message you are posting? If you are using Db-afc-notice, I strongly suggest making a bot version for that. Also, how are you handling blocked users? (Also, if I forget, since this isn't explicitly mentioned in edit periods, please keep a reasonable edit rate when approved.) — HELL KNOWZ  ▎TALK 21:11, 31 July 2013 (UTC)
 * AFC_TITLE being the title/location of the article

AFC_TITLE concern
Hi there, I'm HasteurBot. I just wanted to let you know that AFC_TITLE, a page you created, has not been edited in at least 180 days. The Articles for Creation space is not an indefinite storage location for content that is not appropriate for articlespace.

If your submission is not edited soon, it could be nominated for deletion. If you would like to attempt to save it, you will need to improve it.

If the deletion has already occurred, instructions on how you may be able to retrieve it are available at WP:REFUND/G13.

Thank you for your diligence. ~
 * Thanks Hasteur (talk) 22:07, 31 July 2013 (UTC)
 * So, the virtual hosting service I was using to develop is Globally-ipblocked. No edits have ever come from it, but I'm in the process of petitioning for a labs shell account and setting it up.  I will have more to report as soon as that's finished Hasteur (talk) 22:51, 31 July 2013 (UTC)


 * Trial complete. Running from Toolserver now, Special:Contribs/HasteurBot
 * From User talk:NathanDodson - User talk:Bargie-Bong: First run that had an exception that I fixed. Terminated before my limit
 * From User talk:Imsimplyalice - User talk:65.129.178.234: Second run. Terminated at a self-imposed 5 nags.
 * Reserving remainder (21 edits) for additional testing as required. Because the bot is not flagged bot, it appeared to insert arbitrary wait times to throttle the submissions. Hasteur (talk) 23:55, 31 July 2013 (UTC)


 * I support the idea of this bot. I am not entirely convinced of the actual process of it indiscriminately tagging G13 drafts for deletion. I've been going through and reviewing the drafts that the job queue has been taking its sweet time populating the category with.  I'm personally finding that about 30-40% of the drafts "possibly could" be turned into actual articles. Here's what I think the bot needs to do...
 * Null edit all of the drafts that should currently be eligible so that us humans will have a fully populated category to work from.
 * Notify the creators and submitters (may be multiple submitters if there were multiple declines) that the draft is currently eligible for G13 as an abandoned draft and offer some links and whatnot to find more sources to possibly meet the notability threshold. There was talk about it sending out nudges at the 3 month mark, and I'm not opposed to that as well.
 * This should be done at the 6 month mark, unless
 * The bot can determine that the draft found its way into main article space another way.
 * An article with the same name exists
 * A simple search makes a 75% or greater match of the draft in article space
 * The bot can determine that the draft is a copyvio (could parse the page for URLs and test them against the page using duplication detector or copyvios tool that already exist)
 * This should cause the bot to blank the page and tag it G12 & G13
 * It is a blank, nonsense, non-English submission
 * 30 days later, if none of the above criteria were met forcing an instant nomination, and no edits have been made, it should be tagged as G13 and a notice sent to the creator/submitters.
 * Just brainstorming some ideas to help this bot be more useful and productive. I think that these things would alleviate some of the concerns I have seen the community raise.  I also think that setting such a low initial limit isn't that productive if it is doing all of these other tests as well.  You can ping me here with, or find me on  if you have any questions or need clarification of my thoughts (I honestly wouldn't blame you). Technical 13 (talk) 02:18, 1 August 2013 (UTC)
 * Please read the description... This task only nags the creator that their AfC submission is 180 days old. The other task would nominate for G13. I believe there are copyvio bots out there already, so that should be handled by those bots.  Rather than package everything and the sink into this, let the individual bots take care of what they handle best.  What about AfC submissions that are named funny and when they were copy pasted into articlespace someone went and fixed the title?  G12 and the other logic does take more specialized and heuristic logic. For the time being, nudging the creators of the submission is all that this request is concerned with. Hasteur (talk) 03:01, 1 August 2013 (UTC)
 * Copyvio is a complex task and probably beyond the scope of this for now, I also don't think this should be imposed here, as it is applicable for many places/tasks on Wiki and we cannot verify this everywhere. Even if it is copyvio, it'll get deleted as stale anyway. I do like the idea of checking versus article space, although also not mandatory, just in case there are non-obvious copy-paste problems (like users circumventing the declined AfC process). However, I do think it would be prudent to do something about submissions that have the article of the same name, have some sort of manual review; Hasteur, I believe a simple "article page exists" check would do. Perhaps an extra maintenance category or list for review and manual CSD tagging or fixing. As a sidenote, blanked submissions can be interpreted as "author request deletion" though. However detecting non-English and such is beyond the scope of bots without manual review, we wouldn't approve that for an automatic task. — HELL KNOWZ  ▎TALK 13:27, 1 August 2013 (UTC)
 * My bad, I had thought had retracted the other request and merged it into this one. I was just trying to prevent tagging drafts as G13 without a manual review of the draft unless there was a way to test that there was no need for review, such as... See above. Those suggestions are obviously not needed for a bot that simply null edits (which is a special api request that does not require adding any blank lines anywhere in the draft, so I'm not sure what that concern was below) and notifies of impending doom. Technical 13 (talk) 18:16, 4 August 2013 (UTC)
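The simple "article page exists" check suggested in this thread might look roughly like the sketch below. It assumes the candidate article title is whatever follows the AfC prefix, and that the caller supplies a `page_exists` callable (hypothetical, for example a thin wrapper around a wiki API lookup). Flagged drafts would go to a manual review list rather than being acted on automatically.

```python
AFC_PREFIX = "Wikipedia talk:Articles for creation/"


def mainspace_title(afc_title):
    """Return the article-space title implied by an AfC page title, or None."""
    if not afc_title.startswith(AFC_PREFIX):
        return None
    # An empty remainder (the bare prefix) has no candidate title.
    return afc_title[len(AFC_PREFIX):] or None


def needs_manual_review(afc_title, page_exists):
    """Return True if an article with the same name already exists.

    `page_exists` is a caller-supplied callable mapping a title to a bool.
    """
    candidate = mainspace_title(afc_title)
    return candidate is not None and page_exists(candidate)
```

This only catches exact-title matches. Drafts that were copied into article space under a corrected title, as raised earlier in the thread, would not be detected.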

The task currently says "Manual" (all edits reviewed before made, bot is just an assisted editing tool), but BRFA task 1 for bot is Automatic, how does that work? — HELL KNOWZ  ▎TALK 13:29, 1 August 2013 (UTC)
 * I guess I had the definitions wrong in my head. I will manually invoke the bot to crawl the categories (python g13_nudge_bot.py -from:AfC_submissions_by_date/01_February_2013).  From there the bot will edit on its own because a driver data set has been provided.  Task 1 is completely automatic (python g13_nom_bot.py) because it reads the "who was notified about what submission at what time" records that go into the SQL database, picks up the (up to) 50 oldest notified submissions that are not yet nominated by the bot, and evaluates them for continued G13 worthiness. Hasteur (talk) 13:43, 1 August 2013 (UTC)
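The "up to 50 oldest notified submissions that are not yet nominated" selection could be expressed as a single SQLite query. The sketch below is an assumption about how such a table might be laid out. The table name `notifications`, its columns, and the function `oldest_pending` are hypothetical, and timestamps are assumed to be stored as ISO 8601 UTC strings so that text ordering matches chronological ordering.

```python
import sqlite3

# Hypothetical schema: one row per (submission page, notified user) pair.
SCHEMA = """
CREATE TABLE IF NOT EXISTS notifications (
    page        TEXT NOT NULL,
    user        TEXT NOT NULL,
    notified_at TEXT NOT NULL,
    nominated   INTEGER NOT NULL DEFAULT 0,
    PRIMARY KEY (page, user)
)
"""


def oldest_pending(conn, limit=50):
    """Return up to `limit` notified-but-not-nominated rows, oldest notice first."""
    cursor = conn.execute(
        "SELECT page, user, notified_at FROM notifications "
        "WHERE nominated = 0 ORDER BY notified_at ASC LIMIT ?",
        (limit,),
    )
    return cursor.fetchall()
```

Each returned row would still need to be re-checked against the live page before any nomination, since the draft may have been edited after the notice went out.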

Just to confirm, you do skip the pages with already existing messages about the same draft? Also, why are you using level 3 headings instead of 2? And I wonder why it says "Thank you for your diligence" -- what diligence is that? They left the draft stale, if anything it's the opposite, so saying that is patronizing at best. It's best to be on the safe side with messages to new users. — HELL KNOWZ  ▎TALK 21:19, 1 August 2013 (UTC)
 * After saving the notice to the user's page I write a record into the sqlite database that says what Submission and User were notified for the nudge. If I traverse the same category again, I skip over any (Page-User) sets I already have as being notified. I just typed in the notice and I'll change it to a level 2 heading. Would Thank you for your attention be better? I'm open to using whatever language works. Hasteur (talk) 21:34, 1 August 2013 (UTC)
 * I meant a notice posted by someone else about the same submission, like using Db-afc-notice or even a regular CSD. And I think that message would be fine. — HELL KNOWZ  ▎TALK 21:38, 1 August 2013 (UTC)
 * Hrm... It would only notify if the page was 180 days old, so it would exclude ones where the page had recently been edited (to add a nomination). I hope we don't have CSD nominations that are sitting around for 180 days that my bot would be picking up on. Seems like an extraordinary edge case. Hasteur (talk) 22:42, 1 August 2013 (UTC)
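The skip-if-already-notified bookkeeping described in this thread could be sketched as follows, with the (page, user) pair as the primary key so that a repeat traversal cannot record the same nudge twice. The table layout and the function names `open_log`, `already_notified`, and `record_notice` are assumptions for illustration. This sketch only tracks the bot's own notices, so notices left by other editors or tools, as raised above, are not detected.

```python
import sqlite3


def open_log(path=":memory:"):
    """Open (or create) the hypothetical notification log."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS notifications ("
        " page TEXT NOT NULL,"
        " user TEXT NOT NULL,"
        " notified_at TEXT NOT NULL,"
        " nominated INTEGER NOT NULL DEFAULT 0,"
        " PRIMARY KEY (page, user))"
    )
    return conn


def already_notified(conn, page, user):
    """Return True if this (page, user) pair has been nudged before."""
    row = conn.execute(
        "SELECT 1 FROM notifications WHERE page = ? AND user = ?", (page, user)
    ).fetchone()
    return row is not None


def record_notice(conn, page, user, notified_at):
    """Log a nudge; return True if newly recorded, False if the pair already existed."""
    cursor = conn.execute(
        "INSERT OR IGNORE INTO notifications (page, user, notified_at) VALUES (?, ?, ?)",
        (page, user, notified_at),
    )
    conn.commit()
    return cursor.rowcount == 1
```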


 * Trial complete: Updated heading level and "Thank you for your attention" changes made. From 23:08 to 23:27.  One oddity in the batch  which, as part of the null edit portion, moved the location of the Category section because categories are at the absolute bottom (HTML comments trip the line).  To correct this, instead of doing the addtext at the bottom of the page, which has the potential to relocate categories, the null edit will be done at the top of the page. Hasteur (talk) 23:43, 1 August 2013 (UTC)
 * *poke* Completed the trial fairly quickly, just waiting on your approval to get the backlog started on the notifying. Hasteur (talk) 18:07, 4 August 2013 (UTC)


 * You can run this task for as many notifications as you need for the task #1 trial. I prefer to approve both at the same time. Also have a minor issue below. — HELL KNOWZ  ▎TALK 18:33, 4 August 2013 (UTC)
 * So I'm provisionally authorized to start seeding my database with say "January 2013 submission" notifications? Just want to make sure and know if I could start nudging the really old submissions (Like 2009). Thanks Hasteur (talk) 18:43, 4 August 2013 (UTC)
 * You can consider it approved for the purposes of testing. Maybe hold off on mass scale notifications for a bit. — HELL KNOWZ  ▎TALK 18:53, 4 August 2013 (UTC)

- moving categories around from other messages? — HELL KNOWZ  ▎TALK 18:33, 4 August 2013 (UTC)
 * That incidental came from the addtext code I picked up from the pywikibot logic.  bypasses the relocate code for interwikis/categories/etc. if I'm adding text at the bottom of the page. Hasteur (talk) 18:41, 4 August 2013 (UTC)

Ok, I seeded in some extra noms/whatnot. Standing by for the first task. I'll add more as we need to. Hasteur (talk) 02:25, 5 August 2013 (UTC)

Approved as a supplementary/prerequisite task for the BRFA #1. See my full comments there that are applicable to this BRFA as well. Again, there are a few useful suggestions here that I hope the botop will consider implementing, but they are currently out of this BRFA's scope. — HELL KNOWZ  ▎TALK 18:52, 18 August 2013 (UTC)


 * The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.