Wikipedia:Bots/Requests for approval/ZackBot 11


 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was

ZackBot 11
Operator:

Time filed: 20:19, Sunday, October 21, 2018 (UTC)

Function overview: Clean out

Automatic, Supervised, or Manual: Automatic

Programming language(s): Ruby

Source code available: User:ZackBot/Albums

Links to relevant discussions (where appropriate): N/A

Edit period(s): one time run

Estimated number of pages affected: ~90,000

Namespace(s): Mainspace

Exclusion compliant (Yes/No): yes

Function details: I have been having a pretty good run doing this as a semi-automated process: I copy and paste the source code into a script, which converts the page, and then I manually preview it and click save. That is taking too much time, so I want to do a fully automated run.

Bottom line: this parses the existing template into the necessary format and then substitutes the infobox code in. The way the template has been written, it can be substituted for proper formatting. The issue with just doing a straight substitution is that the regular expressions in the template do not cover all cases. My code covers a much higher percentage (about 99% in my testing) and, more importantly, when it hits one of the remaining 1% of cases, it skips the page and doesn't make the edit rather than introducing errors.
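The parse-or-skip behaviour described above can be illustrated with a minimal Ruby sketch. This is not the code at User:ZackBot/Albums; the template name `Infobox album`, the placeholder parameter names in `DEPRECATED_PARAMS`, and the helper `clean_infobox` are all assumptions made for illustration. The point is only the shape of the logic: rewrite the page when the infobox parses unambiguously, and return `nil` (make no edit) otherwise.

```ruby
# Hypothetical sketch only; not the bot's actual source.
TEMPLATE_NAME = 'Infobox album'                         # assumed template name
DEPRECATED_PARAMS = %w[old_param_a old_param_b].freeze  # placeholder names

# Returns the rewritten wikitext, or nil when no edit should be made.
def clean_infobox(wikitext)
  # Only match a call whose body contains no nested braces; anything
  # more complicated simply fails to match and the page is skipped.
  pattern = /\{\{\s*#{Regexp.escape(TEMPLATE_NAME)}\s*\|([^{}]*)\}\}/i
  matches = wikitext.scan(pattern)
  return nil unless matches.length == 1

  params = matches.first.first.split('|').map do |field|
    name, value = field.split('=', 2)
    # A field without "name=value" shape (e.g. a piped wikilink that was
    # split in two) is treated as unparseable: skip the whole page.
    return nil if value.nil?
    [name.strip, value.strip]
  end

  kept = params.reject { |name, _| DEPRECATED_PARAMS.include?(name.downcase) }
  return nil if kept.length == params.length # nothing deprecated to remove

  # Re-emit the remaining parameters one per line with aligned "=" signs.
  width = kept.map { |name, _| name.length }.max
  lines = kept.map { |name, value| "| #{name.ljust(width)} = #{value}" }
  replacement = "{{#{TEMPLATE_NAME}\n#{lines.join("\n")}\n}}"
  wikitext.sub(pattern) { replacement }
end
```

Under these assumptions, a caller would save the page only when `clean_infobox` returns a string and would leave any page that yields `nil` for manual handling.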

Discussion
Can you be a bit more specific about the kinds of transformations you're doing? If I understand the task correctly from your edit history, the bot would completely reformat the infobox—is this something we're okay with on 90,000 pages? —  Earwig   talk 22:29, 21 October 2018 (UTC)
 * thanks for the message! So the real focus here is on removing the deprecated parameters. An added bonus of the way the subst template has been set up is that it reformats the source code to be nicely tabbed and spaced. On 99% of the pages, there will be no noticeable change on the front end. There are some pages that currently are not properly using the next album/previous album parameters. Those will see minor changes to conform with the template's documentation. -- Zack mann  (Talk to me/What I been doing) 22:46, 21 October 2018 (UTC)
 * Does your code avoid or fix errors? – Jonesey95 (talk) 04:23, 22 October 2018 (UTC)
 * both. As described above it either fixes them, or if it cannot, it simply skips the page leaving it to be done manually. -- Zack mann  (Talk to me/What I been doing) 04:45, 22 October 2018 (UTC)
 * just to expand on that... The template currently has an insanely complicated substitution method in it that was masterfully written, but really is crazy complicated (to be clear, Jc86035, you did a great job! That isn't a dig at you.). The program that I wrote is able to use more advanced parsing techniques than are available in WikiMarkup. The code I'm using involves multiple different regular expressions, so the issues that are present in that group are almost entirely resolved. In the very limited number of cases where they cannot be resolved, the page is just skipped. Of the ~15,000 pages I've done manually, with the exception of some of the first pages when I was still debugging the process, none have had errors introduced. Let me know if you have any more questions. -- Zack mann  (Talk to me/What I been doing) 17:55, 22 October 2018 (UTC)


 * My bot, DeprecatedFixerBot, already does this, see Bots/Requests for approval/DeprecatedFixerBot 3. I was just planning to finish that this week. -- The SandDoctor Talk 15:32, 23 October 2018 (UTC)
 * Though I do skip the string errors category pages. -- The SandDoctor Talk 15:34, 23 October 2018 (UTC)
 * The only reason it isn't down to 30k pages right now is that at some point into the run on my server yesterday it ran into an error (new C++ code, old Python... something I plan to resolve tonight, aka git pull the missing file). It just takes a bit of time to parse/edit, so I tend to set it up on the server and grind away 50k pages at a time as compared to running locally on my laptop. -- The SandDoctor Talk 15:46, 23 October 2018 (UTC)
 * Maybe Zackmann08 can focus on the Module:String errors, since there are about 5,000 of them. – Jonesey95 (talk) 16:01, 23 October 2018 (UTC)
 * happy to work with you to capture the pages you don't resolve. Up to you. Certainly don't want to compete. :-) -- Zack mann  (Talk to me/What I been doing) 18:10, 23 October 2018 (UTC)
 * Yaarige Saluthe Sambala is an example of a page where I was able to handle a case your bot skipped, so I think both bots could be useful. -- Zack mann  (Talk to me/What I been doing) 20:50, 23 October 2018 (UTC)
 * How about I make sure that my bot is done running by the weekend, while you focus on Module:String errors? -- The SandDoctor Talk 21:15, 23 October 2018 (UTC)

Sure thing! Could I get approval for a trial run and then I will unleash the beast this weekend once you are done? -- Zack mann  (Talk to me/What I been doing) 21:33, 23 October 2018 (UTC)
 * - Let's see how this works. SQL Query me!  09:08, 20 November 2018 (UTC)
 * many thanks! here are the results. -- Zack mann  (Talk to me/What I been doing) 17:57, 20 November 2018 (UTC)
 * any update on this? Let me know if you have any feedback. :-) -- Zack mann  (Talk to me/What I been doing) 01:52, 23 November 2018 (UTC)
 * trial is complete, waiting for next steps. Anything else I can do? -- Zack mann  (Talk to me/What I been doing) 06:09, 26 November 2018 (UTC)
 * any chance you can take a look at this? -- Zack mann  (Talk to me/What I been doing) 16:00, 6 December 2018 (UTC)

I gotta say I find this incredibly disappointing. This is now the third BRFA that I have withdrawn and simply done manually. I understand there is no rush, but it has been nearly a month with no follow-up from any members of WP:BAG... I am trying to get things done to improve the wiki, and it seems like the BAG members are not doing their job here. -- Zack mann  (Talk to me/What I been doing) 19:11, 14 December 2018 (UTC)
 * The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.