Wikipedia:Bots/Requests for approval/Hazard-Bot 34


 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Approved.

Hazard-Bot 34
Operator:

Time filed: 03:27, Monday, December 28, 2015 (UTC)

Automatic, Supervised, or Manual: Automatic

Programming language(s): Python

Source code available: GitHub

Function overview: Updates the list of common mistakes for WikiProject Fix common mistakes

Links to relevant discussions (where appropriate): Bot requests, Bot requests/Archive 63

Edit period(s): Perhaps monthly

Estimated number of pages affected: 24, plus possibly the log table to make 25

Exclusion compliant (Yes/No): Yes

Already has a bot flag (Yes/No): Yes

Function details: At least for now, I'm planning to manually trigger this task whenever I'm aware of an updated dump (and it's available on Tool Labs). The script will go through all articles, searching for the common mistakes checked by the WikiProject (currently those listed at WikiProject Fix common mistakes, though it's flexible enough to change). This task is only about updating the lists, not fixing the "mistakes" (if they are indeed mistakes, that is). To reduce false positives, I'll be searching for the mistakes with both a leading and a trailing space in the text. Also, I'm making available a page (possibly User:Hazard-Bot/Common mistakes blacklist, though I'm open to alternatives) to list pages to exclude from the lists (there definitely will always be false positives, so this will be a means of avoiding the same set of recurring pages on the lists on every run). Hopefully this is straightforward enough. Now to notify the WikiProject of their late Christmas present. Hazard SJ 03:27, 28 December 2015 (UTC)
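The space-padded matching and the page exclusion list described above could be sketched roughly as follows. This is only an illustration, not the bot's actual source: the function name `find_mistakes` and the parameter `excluded_titles` are hypothetical, and a real run would read article text from a dump rather than from strings.

```python
def find_mistakes(title, text, mistakes, excluded_titles):
    """Return the mistakes found in `text`, or nothing if the page is excluded.

    Hypothetical helper: `mistakes` is an iterable of phrases such as
    "the the", and `excluded_titles` is a set of page titles to skip.
    """
    if title in excluded_titles:
        return []
    # Pad each phrase with a leading and a trailing space so that it only
    # matches as whole words, which cuts down on false positives.
    return [phrase for phrase in mistakes if f" {phrase} " in text]


hits = find_mistakes("Example", "This is the the end.", ["the the", "an a"], set())
```

Padding with spaces means "bathe the theme" is not reported for "the the", at the cost of missing occurrences next to punctuation or at the start of a line.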

Discussion
Pretty straightforward, give it a test, report back results. — xaosflux  Talk 04:08, 28 December 2015 (UTC)

Thank you for my Christmas present. I'm the poor sap who currently does this manually, so I wholeheartedly approve of this. I also add a leading and a trailing space when creating the list, so Hazard is doing the same things I currently do. I use "blacklists" on CheckWikipedia. However, I haven't used a blacklist for Fix Common Mistakes, but this is a good idea. I would put the blacklist under the project's subpage. Thank you again. Bgwhite (talk) 06:43, 28 December 2015 (UTC)


 * you're welcome. I also realize I should have used the name "whitelist" as opposed to "blacklist". Anyways, how does WikiProject Fix common mistakes/Whitelist or WikiProject Fix common mistakes/Whitelisted pages sound? Let me know what you want. Also, I see that  got a quick start on WikiProject Fix common mistakes/a a from the first batch of lists. How were they? Hazard SJ 14:50, 28 December 2015 (UTC)


 * I should probably automate this too Hazard SJ 15:03, 28 December 2015 (UTC)
 * I think the simple ones (both words the same) should be okay. I started working on an a and I've noticed some regular phrases that should probably be excluded: an a cappella, an a priori, an a la carte, an a fortiori, and an a posteriori. (I only saw the last two a couple times.) I remember seeing some pages with math/formulas/code that would always show up in the dump but not necessarily have an error, but I didn't take note of them. I'll try to keep track of new ones that I run into. — JJMC89 (T·C) 15:38, 28 December 2015 (UTC)


 * After a little talk with The Earwig, I've made some changes. The set of mistakes to be checked can now be configured from User:Hazard-Bot/FIX/Scan configuration. Each level 2 heading identifies the mistake, then unordered lists within the sections can identify exceptions. I've filled in the mistakes, and added some of the "an a" exceptions (feel free to update them). As for the previous "blacklist", I've corrected the name to "whitelist", and it's now at User:Hazard-Bot/FIX/Whitelisted pages. This would probably come in handy for perhaps pages with quotes that contain errors, or whatever else the case may be, since specific phrases can be directly included within the configuration page. Again, moving those pages to subpages of the WikiProject is perfectly fine (possibly WikiProject Fix common mistakes/Scan configuration and WikiProject Fix common mistakes/Whitelisted pages?), and we'll probably also want to add some protection to minimize tampering. Since I just did a scan and don't have the next dump as yet, I'll hold off on the next run for a bit. (P.S.  I tried using   instead of the spaces, but that also included , creating multiple false positives, and I was unable to figure out how to exclude that single character.) Hazard SJ 08:16, 29 December 2015 (UTC)
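A configuration page laid out as described (level 2 headings naming the mistakes, unordered list items naming the exceptions) could be read along these lines. This is a minimal sketch under assumptions: the function name `parse_scan_configuration` is hypothetical, the sample wikitext is made up, and only top-level list items are handled.

```python
import re

# "== an a ==" is a level 2 heading; "=== x ===" (level 3) must not match.
HEADING = re.compile(r"^==\s*([^=].*?)\s*==\s*$")
# "* an a cappella" is a top-level unordered list item; nested "**" is skipped.
ITEM = re.compile(r"^\*\s*([^*].*?)\s*$")


def parse_scan_configuration(wikitext):
    """Map each mistake (level 2 heading) to its list of exception phrases."""
    config = {}
    current = None
    for line in wikitext.splitlines():
        heading = HEADING.match(line)
        if heading:
            current = heading.group(1)
            config[current] = []
            continue
        item = ITEM.match(line)
        # List items before the first heading belong to no mistake; ignore them.
        if item and current is not None:
            config[current].append(item.group(1))
    return config


sample = "== an a ==\n* an a cappella\n* an a priori\n\n== the the ==\n"
config = parse_scan_configuration(sample)
```

Keeping the configuration on a wiki page in this shape lets project members add mistakes or exceptions without touching the bot's code.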


 * The next enwiki dump is currently being generated, so I'll hopefully be able to run that within the next few days. Hazard SJ 06:46, 22 January 2016 (UTC)
 * but we had so much fun! No? (Kidding.)
 * that is AWESOME. I'm typically only working on here once a month or so (Great Userbox War etc., don't ask) so I just saw this. But I repeat - AWESOME! Anything that can improve WP:FIX, I'm all for it. Thank you so much! Let me know how it works out! Sct72 (talk) 00:38, 23 January 2016 (UTC)
 * The next batch is out (January 2016 dump)! Hazard SJ 17:04, 3 February 2016 (UTC)
 * (Gene Rayburn) Hazard is soooooo slow. (crowd) How slow is he?  (Gene Rayburn)  A beat Hazard in a 100m "dash".
 * The February dump started today. Should be ready in a couple of days.    To be fair, January's was late in starting up, but that won't stop me from giving you a hard time  :)  Bgwhite (talk) 21:30, 3 February 2016 (UTC)
 * And the February dump still hasn't been completed :) Hazard SJ 07:35, 13 February 2016 (UTC)
 * Are we good to go here? — Earwig   talk  22:15, 3 February 2016 (UTC)
 * Possibly, I haven't encountered any problems so far, and the edits are to a limited set of pages (which can be controlled by WikiProject Fix common mistakes/Scan configuration). It would be nice to have that page, as well as WikiProject Fix common mistakes/Whitelisted pages, furnished with a silver lock (semi-protection). Additionally, a new set of suggestions came by, which I addressed below, so hopefully we're good there as well. Hazard SJ 07:35, 13 February 2016 (UTC)

Looking good. If possible, it may be good to not match inside comments, some tags (<score ></score> [may have parameters inside the opening tag], <math ></math>, <source ></source>, <pre ></pre>), and file names ( , image, etc.). — JJMC89 (T·C) 07:46, 5 February 2016 (UTC)
 * I didn't get image, but cb3970f should have covered the other things you requested. Hazard SJ 07:35, 13 February 2016 (UTC)
 * Everything looks good to me. Thanks. — JJMC89 (T·C) 08:14, 13 February 2016 (UTC)
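One way to avoid matching inside comments and the tags listed in the request above is to blank those regions out before searching. The sketch below is not the code from the commit mentioned in the reply; the names `SKIPPED_TAGS` and `strip_unscanned` are hypothetical, file names are not handled, and regular expressions are only an approximation of real wikitext parsing.

```python
import re

# Tags whose contents should not be scanned (taken from the request above).
SKIPPED_TAGS = ("score", "math", "source", "pre")

COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)
# The opening tag may carry parameters, e.g. <score lang="lilypond">.
TAG_BLOCK = re.compile(
    r"<(%s)\b[^>]*>.*?</\1\s*>" % "|".join(SKIPPED_TAGS),
    re.DOTALL | re.IGNORECASE,
)


def strip_unscanned(wikitext):
    """Replace comments and skipped tag blocks with a newline.

    A newline (rather than an empty string) is used so that removing a
    region cannot join the words on either side into a new match.
    """
    text = COMMENT.sub("\n", wikitext)
    return TAG_BLOCK.sub("\n", text)


cleaned = strip_unscanned("a <!-- the the --> b <math>x the the y</math> c")
```

The cleaned text can then be searched with the usual space-padded phrases; anything inside the removed regions no longer produces a hit.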
 * —  Earwig   talk 03:40, 28 February 2016 (UTC)
 * The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.