Wikipedia:Bots/Requests for approval/DeprecatedFixerBot 3


 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was

DeprecatedFixerBot 3
Operator:

Time filed: 22:35, Thursday, March 22, 2018 (UTC)

Automatic, Supervised, or Manual: automatic

Programming language(s): Python

Source code available: https://github.com/TheSandDoctor/Music-infoboxes-deprecated-param-fixer

Function overview: The bot goes through Category:Music infoboxes with deprecated parameters looking for Infobox Album, Extra chronology, Extra album cover, or Extra track listing (and, of course, any of their redirects/synonyms). If any are found, the bot prepends "subst:" to the template name to trigger the substitution trick (as noted/recommended in all templates linked above) and resolve deprecated parameters.
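A minimal sketch of the "subst:" step described above, not the bot's actual code. The function name `add_subst` is hypothetical, a regular expression stands in for whatever wikitext parsing the bot really uses, and the caller is assumed to supply the list of template names (including any redirects).

```python
import re


def add_subst(wikitext, template_names):
    """Prefix matching template calls with 'subst:' (illustrative only)."""
    names = "|".join(re.escape(name) for name in template_names)
    # Match '{{Name' only when the name is followed by '|' or '}}',
    # so 'Extra album cover' is not mistaken for a longer template name.
    pattern = re.compile(r"\{\{\s*(" + names + r")(?=\s*[|}])", re.IGNORECASE)
    # Keep the template name exactly as the page wrote it.
    return pattern.sub(lambda m: "{{subst:" + m.group(1), wikitext)


page = "{{Infobox album\n| name = Example\n}}"
print(add_subst(page, ["Infobox album", "Extra chronology"]))
```

A call that already starts with "subst:" is left alone, because the template name no longer follows the opening braces directly.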

Links to relevant discussions (where appropriate): N/A. Template:Infobox album, Template:Extra chronology, Template:Extra album cover, Template:Extra track listing

Edit period(s): A series of shorter runs until resolved

Estimated number of pages affected: 149,557 (approx)

Exclusion compliant (Yes/No): Yes

Already has a bot flag (Yes/No): Yes

Namespace(s): Mainspace only. Wherever present (mostly mainspace)

Function details: The bot first generates an internal list of the page names within Category:Music infoboxes with Module:String errors. It then goes through Category:Music infoboxes with deprecated parameters. If the title of the next page to edit is in the errors category, the bot skips it and does not edit that page. If the page title is not in the list, the bot looks for Infobox Album, Extra chronology, Extra album cover, or Extra track listing (and, of course, any of their redirects/synonyms). If any are found, the bot checks whether they contain released and that it holds a date (i.e. not "Unreleased"). If so, the bot proceeds directly to substituting the template(s) and checks its edit (see below).
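The decision flow above could be sketched roughly as follows. The names `has_release_date` and `plan_action` are hypothetical, and treating "contains a digit" as "contains a date" is a simplifying assumption; the bot's real date check may be stricter.

```python
def has_release_date(released_value):
    """Return True if the 'released' value looks like it holds a date."""
    if released_value is None:
        return False
    text = released_value.strip()
    if not text or text.lower() == "unreleased":
        return False
    # Assumption: any digit is taken as evidence of a date.
    return any(ch.isdigit() for ch in text)


def plan_action(title, error_titles, released_value):
    """Decide what to do with one page from the deprecated-parameters category."""
    if title in error_titles:
        return "skip"        # page is already in the Module:String errors category
    if has_release_date(released_value):
        return "subst"       # substitute directly, then check the edit
    return "find_year"       # look for a date in the other album fields first


error_titles = {"Example album A"}
print(plan_action("Example album A", error_titles, "May 5, 2001"))
print(plan_action("Example album B", error_titles, "May 5, 2001"))
print(plan_action("Example album C", error_titles, "Unreleased"))
```

Building the error list once, as a set, keeps the per-page membership test cheap across a run of this size.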

If the released parameter is not found, the bot checks this album, prev album, and next album for dates. If a valid date format is found, it appends the appropriate year parameter and moves the found date into it (the exception, of course, being WikiLinks, where the date is copied rather than moved). Once the incompatibilities with the substitution trick have been worked out, the bot prepends "subst:" to the template name to trigger the substitution trick (as noted/recommended in all templates linked above) and resolve deprecated parameters.
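As a rough illustration of the move-versus-copy rule, the sketch below pulls a four-digit year out of a free-text album field. The function name `extract_year` and the year pattern are assumptions; the formats the bot actually accepts as a "valid date" are not spelled out here.

```python
import re

# Assumption: a "date" is a four-digit year from 1800-2099, optionally in parentheses.
YEAR_RE = re.compile(r"\(?\b(1[89]\d{2}|20\d{2})\b\)?")


def extract_year(value):
    """Return (new_field_value, year) for a free-text album field.

    The year is moved out of plain text, but only copied when the
    field contains a wikilink, so the link itself is never altered.
    """
    match = YEAR_RE.search(value)
    if match is None:
        return value, None
    year = match.group(1)
    if "[[" in value:
        return value, year  # wikilink present: copy, do not move
    cleaned = (value[:match.start()] + value[match.end():]).strip()
    cleaned = re.sub(r"\s{2,}", " ", cleaned)
    return cleaned, year


print(extract_year("Some Album (1999)"))
print(extract_year("[[Some Album]] (1999)"))
```

The returned year would then go into whichever year parameter corresponds to the field it came from.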

As a last resort the bot also checks its edit after making it. The bot will revert itself if the following two criteria are met:
 * 1) the page contains the Module:String error text after the bot's edit (the bot does not use regular expressions for this, so it only matches the relevant part of the string, which is the part that matters, and the match is left open-ended), and
 * 2) the bot was the last user to edit the page
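The two criteria can be expressed as a single check, sketched below. The function name `should_revert` and the `error_marker` argument are hypothetical; the exact text the bot matches is not reproduced here, so the example uses a placeholder. A plain substring test is used, in keeping with the note that no regular expressions are involved.

```python
def should_revert(page_text, last_editor, bot_username, error_marker):
    """Revert only if the error text is present and the bot made the last edit."""
    error_present = error_marker in page_text   # plain substring match, no regex
    bot_was_last = last_editor == bot_username
    return error_present and bot_was_last


# Placeholder marker, not the real error text.
marker = "EXAMPLE-ERROR-TEXT"
print(should_revert("intro EXAMPLE-ERROR-TEXT: details", "ExampleBot", "ExampleBot", marker))
print(should_revert("intro EXAMPLE-ERROR-TEXT: details", "SomeoneElse", "ExampleBot", marker))
```

Requiring the bot to be the last editor avoids undoing a human edit made between the bot's save and its check.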

Discussion

 * -- The SandDoctor Talk 23:16, 29 March 2018 (UTC)
 * Please run a trial and post back a summary of your results and a link to the diffs. — xaosflux  Talk 13:28, 12 April 2018 (UTC)
 * Diffs (most recent 50, bot will not run again until this BRFA is complete) -- The SandDoctor Talk 16:44, 12 April 2018 (UTC)
 * Pages in Category:Music infoboxes with Module:String errors need manual attention. Edits like Special:Diff/836087082, Special:Diff/836087087, Special:Diff/836087114, Special:Diff/836087144, Special:Diff/836087156, Special:Diff/836087190, Special:Diff/836087214, Special:Diff/836087927, Special:Diff/836088186, Special:Diff/836088188, Special:Diff/836088194, Special:Diff/836088888, and Special:Diff/836088935 cause errors to appear in articles. Special:Diff/836088194 and Special:Diff/836087156 also missed the infobox. — JJMC89 (T·C) 02:29, 13 April 2018 (UTC)
 * I would like to request a new trial (not limited by time) so that I can test improvements when I have made them (busy with finals now for next two weeks). My guess about the error is that it is due to the other albums not being WikiLinks, but that is just a guess at the moment that I will have to investigate further when I have the chance. Thank you for bringing those up, I could add a check to ensure that the page being edited is not the same as one in that category. As for missing the infobox, I missed those two. Not sure what would have caused the issue, but I will look into it ASAP. -- The SandDoctor Talk 03:02, 13 April 2018 (UTC)
 * I don't like to leave these in trial "indefinitely"; if you need more than 2 months to do this, we can put this whole request on ice until you are ready. — xaosflux  Talk 13:29, 13 April 2018 (UTC)
 * Thank you. Instead of editing the page, I had the bot in "dry-run mode", where it writes its changes (what it would send to the server to save) to a text file. I took the relevant infobox output for Annette (album), and this time it found the infobox album (sandbox diff). I checked Art Pepper with Duke Jordan in Copenhagen 1981's infobox in the sandbox as well, and it now works fine by the looks of things. The bot is also behaving properly on Anita O'Day & the Three Sounds now. I will live edit the three shortly. (For this task's script) the bot now checks what is present in Category:Music infoboxes with Module:String errors before editing pages. -- The SandDoctor Talk 04:01, 14 April 2018 (UTC)
 * I know why it worked perfectly on those pages. During revert JJMC fixed the error. *mental facepalm*. Will dry run some more pages and get back to you. -- The SandDoctor Talk 04:07, 14 April 2018 (UTC)

Doing the subst trick on its own will not work with these Infoboxes. Because fields like,   and    have been free text since day one, there are all kinds of weird and wonderful user formats, and errors that are present and that trip the subst command up, resulting in string errors. I've done a lot of work with AWB on the music infoboxes and the range of formats is ridiculous, e.g. I kept creating a new rule to handle each new formatting style I found for the  fields. I eventually had to stop because there were so many of them, it just became quicker to manually edit them. I strongly urge you to either do a separate bot run to pre-parse the  fields or make the bot re-check a page after an edit and revert itself if a string error has been introduced. The unacceptable situation is for a bot run that is only doing a subst edit, because that will result in Category:Music infoboxes with Module:String errors being swamped with a wave of extra articles that would be left for manual clean up, and that isn't an acceptable solution. - X201 (talk) 07:47, 18 April 2018 (UTC)
 * I do not have time to work on this for the next week or so (finals wrapping up). I agree that swamping Module:String errors is not an acceptable solution, that is why I am working on a solution that doesn't use the subst trick for extra chronology. I will keep this page updated when possible, but I wouldn't expect any updates until next week some time when finals are over and I don't have any more studying to do and can work on this more consistently. -- The SandDoctor Talk 20:38, 18 April 2018 (UTC)
 * I have written and tested a revert function so the bot now has the ability to revert its latest edit and have also implemented the rough framework/outline that should allow the bot to review its edit after making it. The plan is to link the two and revert itself if:
 * A: spots C in the page, and
 * B: was the last editor of the page
 * That could be a useful backup should the other preventative actions fail. Again, I will have more time to work on this next week and shall keep everyone updated. -- The SandDoctor Talk 03:08, 20 April 2018 (UTC)
 * Looks like a good plan. Good luck with the finals. - X201 (talk) 07:22, 20 April 2018 (UTC)

Cleanup/revert functionality has now been tested and incorporated into the bot. Description has been updated to reflect the bot procedural differences (still does the same task, just updated to reflect that it does it slightly more cautiously now) -- The SandDoctor Talk 05:24, 24 April 2018 (UTC)
 * -- The SandDoctor Talk 21:37, 1 May 2018 (UTC)
 * Reviewing last run. — xaosflux  Talk 15:13, 10 May 2018 (UTC)


 * with the following (or any slower as desired by operator) ramp up schedule:
 * 1500 edits, 1 day hold
 * 1500 edits, 1 day hold
 * 5000 edits, 3 day hold
 * Open editing. — xaosflux  Talk 15:20, 10 May 2018 (UTC)
 * The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.