Wikipedia:Bots/Requests for approval/BHGbot 5


 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was

BHGbot 5
Operator:

Time filed: 17:56, Wednesday, March 25, 2020 (UTC)

Function overview: Diffuse Category:Directorial debut films into by-year and by-decade categories.

Automatic, Supervised, or Manual: automatic

Programming language(s): AutoWikiBrowser

Source code available: See AWB settings files linked below

Links to relevant discussions (where appropriate): Wikipedia talk:WikiProject Film

Edit period(s): one time run

Estimated number of pages affected: about 6,235. Category:Directorial debut films contains ~6,330 pages (live tally= ), of which ~98 will be skipped because they are not currently in a  (see Petscan query for live count of pages to be skipped)

Namespace(s): Article

Exclusion compliant (Yes/No): Yes

Function details: If the article does not contain, then skip it (actual regex is a little more sophisticated).

If it does contain, then:
 * 1) add a , using the   value from
 * 2) remove
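
As a rough illustration of the skip/add/remove logic above, here is a minimal Python sketch. It is not the AWB settings file, and the real regex is described as more sophisticated. The patterns rest on an assumption, not something stated above: that the year is read from an existing `[[Category:YYYY films]]` tag and that the new category follows the `Category:1977 directorial debut films` naming pattern. The function name `diffuse` is hypothetical.

```python
import re
from typing import Optional

# ASSUMPTION: the year comes from an existing [[Category:YYYY films]] tag.
YEAR_FILMS = re.compile(
    r"\[\[\s*Category\s*:\s*(\d{4}) films\s*(?:\|[^\]]*)?\]\]", re.IGNORECASE
)
# The undiffused parent category, with an optional sort key captured in group 1.
PARENT = re.compile(
    r"\[\[\s*Category\s*:\s*Directorial debut films\s*(\|[^\]]*)?\]\]",
    re.IGNORECASE,
)


def diffuse(wikitext: str) -> Optional[str]:
    """Return the edited wikitext, or None if the page should be skipped."""
    year_match = YEAR_FILMS.search(wikitext)
    if year_match is None or PARENT.search(wikitext) is None:
        return None  # no year category (or no parent category): skip the page
    year = year_match.group(1)

    def swap(match: "re.Match[str]") -> str:
        sort_key = match.group(1) or ""  # keep any existing sort key
        return f"[[Category:{year} directorial debut films{sort_key}]]"

    # Replace the parent category in place: one removal plus one addition.
    return PARENT.sub(swap, wikitext, count=1)
```

For example, a page tagged with both `[[Category:1977 films]]` and `[[Category:Directorial debut films]]` would have the second tag rewritten to `[[Category:1977 directorial debut films]]`, while a page with no year category would be skipped.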

I have uploaded the AWB settings file to Bots/Requests for approval/BHGbot 5/AWB settings. Note that I have included a few example articles in the list, some with a  and some without. They are just a sample, and not in any way a full list.

The  categories to be populated do not currently exist. There will be approximately 140 of them in all, including by-year, by-decade and by-century categories, and the container cats. I will create them in advance, using AWB, and do not seek BOT authorisation for that task.

Note that I recently created and populated Category:Debut novels by date+subcats, using the same methodology as I propose here, so I know that this works.

The discussion at Wikipedia talk:WikiProject Film only started 24 hours ago (at 17:44, 24 March 2020). Three editors have posted so far, all supporting the idea, but the discussion should probably run for a few days before a final decision is made. I am opening the BRFA now to allow technical evaluation to proceed in parallel with the is-this-a-good-idea discussion. -- Brown HairedGirl  (talk) • (contribs) 17:56, 25 March 2020 (UTC)

Discussion

 * Note: I have notified WT:WikiProject Film of this BRFA. -- Brown HairedGirl  (talk) • (contribs) 18:09, 25 March 2020 (UTC)
 * Looks good to me. Let me know when you are ready for a trial. -- The SandDoctor Talk 08:02, 31 March 2020 (UTC)
 * in this case, a trial would be problematic, because it would be populating about 130 categories which do not currently exist. So a trial would:
 * cause WP:REDNOT problems, or
 * need to be promptly reverted, or
 * require the creation of at least some of a hierarchy of categories which would need to be deleted if the task doesn't proceed. There will need to be a Category:Directorial debut films by date mirroring Category:Debut novels by date, and the minimum number of categories needed to avoid WP:REDNOT after doing just one film is 7. (e.g. Category:1977 directorial debut films would require parents Category:Directorial debut films by year, Category:1970s directorial debut films, Category:Directorial debut films by decade, Category:20th-century directorial debut films, Category:Directorial debut films by century, and Category:Directorial debut films by date.)
 * None of those options seems to me to be a good idea.
 * I have done unsaved tests using the AWB settings provided, and they run fine.
 * So it seems to me that the best way to conduct a trial would be for a BAG member to test it themself using AWB, using either the sample pages already listed in Bots/Requests for approval/BHGbot 5/AWB settings, or any other subset of Category:Directorial debut films. This is an easy one to check, because in each case the change is simply the removal of one category and the addition of another.
 * Alternatively, I could do a small set and then promptly self-revert, but I don't like cluttering page histories in that way.
 * But would it be too big an ask to seek approval without trial, by taking as the test my previous work diffusing Category:Debut novels? See e.g. these 50 edits which use exactly the same methodology. --  Brown HairedGirl  (talk) • (contribs) 10:42, 31 March 2020 (UTC)
 * Hope you are well. I appreciate that you may have other priorities, but I was just wondering about the progress of this. Thanks.  Lugnuts  Fire Walk with Me 12:43, 2 April 2020 (UTC)
 * Thanks, @Lugnuts. I'm doing fine, and I hope you are well too in these tough times. (I am praying for John Prine and Michael Rosen, both of whom are very ill ... and hoping that my Wiki friends stay safe).
 * Basically, I am ready to roll whenever BAG gives its approval, and at 5 edits/minute the job will take about 21 hours. There's just this issue of a trial run, on which I await a reply from TheSandDoctor. But since the virus is disrupting everyone's lives, I expect that responses may not be superfast, and my experience is that BAG usually has a frustratingly high latency even in good times. --  Brown HairedGirl  (talk) • (contribs) 12:58, 2 April 2020 (UTC)
 * So far so good, fingers crossed it stays like that. Thanks for the update.  Lugnuts  Fire Walk with Me 13:06, 2 April 2020 (UTC)
 * Am I correct that the plan is to create these subcats regardless of whether the bot is approved? If so, there's no reason to revert anything: just create the cats that need to be created during the trial and leave the untouched ones empty. Primefac (talk) 16:39, 2 April 2020 (UTC)
 * @Primefac, I don't intend to create these categories unless the job is done by bot. Populating them manually will be too big a job and too error-prone, so I will have no part in creating or populating these categories unless the bot does it. --  Brown HairedGirl  (talk) • (contribs) 17:01, 2 April 2020 (UTC)
 * Great, sounds like there shouldn't be any issues with this bot run, so we'll keep things relatively small as proof-of-concept. I believe and I are around for most of today so assuming things go according to plan I suspect that this task should be approved by the end of the day. Primefac (talk) 17:38, 2 April 2020 (UTC)
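
The seven-category minimum mentioned earlier in this discussion (Category:1977 directorial debut films plus its six parents) can be sketched as a small Python helper. This is only an illustration: the function names are hypothetical, and the century calculation assumes the strict convention in which a century runs from year 1 to year 100, which may not match how the category tree actually treats years ending in 00.

```python
from typing import List


def ordinal(n: int) -> str:
    """Return n with its English ordinal suffix, e.g. 20 -> '20th'."""
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"


def required_categories(year: int) -> List[str]:
    """List the categories that must exist before one film of `year` is moved."""
    decade = year // 10 * 10
    # ASSUMPTION: strict century numbering (1901-2000 is the 20th century).
    century = (year - 1) // 100 + 1
    return [
        f"Category:{year} directorial debut films",
        "Category:Directorial debut films by year",
        f"Category:{decade}s directorial debut films",
        "Category:Directorial debut films by decade",
        f"Category:{ordinal(century)}-century directorial debut films",
        "Category:Directorial debut films by century",
        "Category:Directorial debut films by date",
    ]
```

Calling `required_categories(1977)` reproduces the seven names listed in the example above, which is why even a one-film trial cannot avoid creating part of the hierarchy.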


 * 25 edits: see contribs list
 * I fed it the sample list that I included in Bots/Requests for approval/BHGbot 5/AWB settings, then added on the first 100 pages from Category:Directorial debut films. I stopped it after 25 edits, which meant processing 28 pages, including 3 skipped:
 * 2 Coelhos
 * 2 Cool 2 Be 4gotten
 * 2 Frogs in the West
 * 2 Penkuttikal
 * 3 (2012 Tamil film)
 * skipped: 3 Gante 30 Dina 30 Second
 * skipped: 8 ½ $
 * 8 Heads in a Duffel Bag
 * 8 Thottakkal
 * 9 (2009 animated film)
 * The 9th Company
 * 10 Cloverfield Lane
 * 10 Things I Hate About You
 * skipped: List of directorial debuts
 * 3 A.M. (2001 film)
 * 3 Bahadur
 * 3 Days to Go
 * The 3 L'il Pigs
 * 3 Monkeys
 * 3 Strikes (film)
 * 3 Wheeling
 * The 4th Tenor
 * 5 Idiots
 * 6 Teens
 * 7 chili in 7 giorni
 * 7 Naatkal
 * 7th Day (film)
 * 10 Years (2011 film)
 * AFAICS, there were no false positives and no false negatives, and all edits look accurate.
 * how does it look to you? -- Brown HairedGirl  (talk) • (contribs) 18:36, 2 April 2020 (UTC)


 * As expected, looks good. I do assume that the categories will be created as necessary (though as stated elsewhere here that is not part of this task). Primefac (talk) 20:57, 2 April 2020 (UTC)
 * The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.

Thanks again to TheSandDoctor and Primefac for their help. -- Brown HairedGirl  (talk) • (contribs) 02:59, 4 April 2020 (UTC)
 * Job done, in 5499 edits: see this contribs list and its earlier pages.