Wikipedia:Bots/Requests for approval/BU RoBOT 32


 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Approved.

BU RoBOT 32
Operator:

Time filed: 04:04, Saturday, January 28, 2017 (UTC)

Automatic, Supervised, or Manual: Automatic

Programming language(s): AWB

Source code available: AWB

Function overview: Mark non-free use images as having a rationale if they have a completed rationale template.

Links to relevant discussions (where appropriate): Wikipedia_talk:Files_for_discussion

Edit period(s): One-time run initially, then run as needed.

Estimated number of pages affected: Difficult to say, but many. Possibly as high as 100,000.

Exclusion compliant (Yes/No): Yes

Already has a bot flag (Yes/No): Yes

Function details: Runs through files transcluding non-free use rationale templates. Checks if they have non-empty values for certain parameters that must be specified for the rationale to be "complete". Then checks to see if a related non-free use tag is present. If it is, sets the "image has rationale" parameter to the value "yes" and adds the "auto" parameter with the value "yes" to populate a tracking category of images with non-free use rationales assessed by the bot. Images used more than once or which are used on articles other than one listed/linked somewhere on the file description page will be skipped.
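The checks described above can be sketched in Python. This is only an illustration, not the AWB configuration the bot actually uses. The rule table, the helper names `template_params` and `mark_reviewed`, and the choice of required fields are assumptions (the thread mentions both "author"/"use" and "Article"/"Use" for the logo template). The parser is deliberately naive: it ignores nested templates and piped wikilinks.

```python
import re

# Hypothetical rule table: rationale template -> (required parameters, related tag).
# Only the logo pairing is named in the discussion; the required fields are illustrative.
RATIONALE_RULES = {
    "Non-free use rationale logo": (("Article", "Use"), "Non-free logo"),
}


def template_params(wikitext, name):
    """Return the named parameters of the first {{name ...}} call, or None if absent.

    Simplification: assumes no nested templates or piped links inside the call.
    """
    pattern = r"\{\{\s*" + re.escape(name) + r"\s*(\|[^{}]*)?\}\}"
    match = re.search(pattern, wikitext, re.IGNORECASE)
    if match is None:
        return None
    params = {}
    for part in (match.group(1) or "").split("|")[1:]:
        key, sep, value = part.partition("=")
        if sep:
            params[key.strip()] = value.strip()
    return params


def mark_reviewed(wikitext):
    """Return the updated wikitext, or None when the page should be left alone."""
    for rationale, (required, tag) in RATIONALE_RULES.items():
        params = template_params(wikitext, rationale)
        if params is None or not all(params.get(p) for p in required):
            continue  # rationale template missing or incomplete
        tag_params = template_params(wikitext, tag)
        if tag_params is None or "image has rationale" in tag_params:
            continue  # no related tag, or the tag is already marked
        tag_pattern = re.compile(
            r"(\{\{\s*" + re.escape(tag) + r"\s*(?:\|[^{}]*?)?)\s*\}\}",
            re.IGNORECASE,
        )
        # Append the two parameters named in the function details.
        return tag_pattern.sub(
            r"\1|image has rationale=yes|auto=yes}}", wikitext, count=1
        )
    return None
```

A page whose rationale has an empty required field returns `None`, which corresponds to the bot skipping the page and leaving it in the tracking category for human review.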

The goal here is to clear out the tracking category Category:Non-free images for NFUR review so that it's useful again. With 180k+ images in the category, it's not feasible for our editors of the file namespace to review them all to determine if the non-free use rationale templates are appropriate in a timely fashion. This bot will cut down that category to just images with non-template rationales, incomplete rationales, or no rationale at all. These are the ones that need the most urgent review.

Discussion

 * What if we had the bot add a template, "This was automatically reviewed by BU RoBOT" (or similar)? That way we could go by transclusions to find the automated reviews, similar to commons:User:FlickreviewR/reviewed-pass — MusikAnimal  talk  04:51, 1 February 2017 (UTC)
 * As described above, I had intended to set an auto parameter to yes in the automatically reviewed template, which would populate a tracking category. A secondary template seems unnecessary to me. This is in line with how my bot marks automatically reviewed classes for WikiProject templates, etc. ~ Rob 13 Talk 05:22, 1 February 2017 (UTC)
 * Sorry, I somehow overlooked that. Locating the bot reviews was my concern, so no need for a template. Could you elaborate on what the "certain parameters" and related non-free tags are? Also, why are we skipping images used more than once or on articles other than the one listed? Would we rather have human review for those? Finally, how does this differ from Bots/Requests for approval/Legobot 22, which supposedly is still running? Pardon my ignorance, I'm not too familiar with this process — MusikAnimal  talk  05:36, 1 February 2017 (UTC)
 * Wasn't even aware of that Legobot task until earlier today, but it hasn't run since at least 2014. See this discussion, where 1989 pointed out that Legobot hasn't made edits to the file namespace since May 2014. As for certain parameters, it depends on the non-free tag. For instance, Non-free use rationale logo requires at least the "author" and "use" parameters to be filled out to be valid. If one of those isn't filled in, the bot shouldn't mark it as having a rationale, since the rationale is incomplete. In the case of a logo, the related non-free tag would be Non-free logo. That varies by the type of image. I would stick to images with special non-free use rationale templates that are mostly filled in by default because the more "broad" templates (Non-free use rationale 2) require more review. We're skipping images with more than one article because they would require a separate rationale for each individual article, which would be difficult for the bot to check. ~ Rob 13 Talk 05:44, 1 February 2017 (UTC)
 * Would there be a tracking category for files that the bot finds no NFU tag in? Or would the bot merely ignore it and move on? Iazyges   Consermonor   Opus meum  17:20, 1 February 2017 (UTC)
 * Ignore and move on. Because non-templated rationales are perfectly appropriate, there's no way to tag "problematic" ones, per se. My hope is that clearing this tracking category of all the crap will leave behind those that need human review, which can then be done more expeditiously. ~ Rob 13 Talk 21:35, 1 February 2017 (UTC)
 * I think it would be best if this could be set up as a regularly run task instead of relying on manual runs by people using AWB. Legoktm (talk) 20:34, 1 February 2017 (UTC)
 * Bot review is not as good as human review, and the hope is that once the bot clears away a lot of the crap, humans can start taking over reviewing the newly-incoming images. I'm hoping this task won't be needed forever, but rather just as a temporary fix for a tracking category that has gone a bit mental. Any human looks at a 180k+ file category and walks away, but if we can get it down to something more reasonable, I expect that our editors of the file namespace would work on it and eliminate the long-term need for the bot. ~ Rob 13 Talk 21:35, 1 February 2017 (UTC)
 * Ideally yes, humans would do the job of bots. But realistically we had gotten the category down to everything that the bot couldn't do (a few thousand IIRC) and now it's just grown back, indicating that humans are not really interested in working on this problem. Legoktm (talk) 02:24, 2 February 2017 (UTC)
 * I think we have more editors interested in the file namespace at this moment than in the past, though, due to some recent recruitment drives. Worst case, we can see if it grows back, at which point another run with AWB might make sense. I don't see a reason to encourage human editors to completely ignore this by having an ongoing bot, though. I've seen at least one editor assess a good many of these in a single run, so there are some people working on these. ~ Rob 13 Talk 03:56, 2 February 2017 (UTC)
 * I agree this would probably be better suited as a regular automated task, but a one-time AWB run is a good start. Let's give it a go — MusikAnimal  talk  02:26, 7 February 2017 (UTC)
 * Before I actually do the trial, I'd like to expand the scope of this task. I said above that I wouldn't handle Non-free use rationale 2 and other broad rationale templates, but I'd like to roll that back to include such templates when all appropriate fields are filled in. After reviewing Legoktm's approval (which included this) and the broader community consensus to filter through these in a thorough manner, I think this is both supported by consensus and past practice. I also think it's the only way to actually accomplish what we want to, which is to get eyes on those files with no non-free use rationales ASAP. Any objections to that? ~ Rob 13 Talk 13:48, 10 February 2017 (UTC)
 * As long as all "required" fields are filled in for each respective template, I think this is fine — MusikAnimal  talk  22:48, 10 February 2017 (UTC)
 * Contribs. Did the trial using Non-free use rationale logo/Non-free logo. Everything would work similarly for other templates. The only thing that isn't straightforward is the skip conditions, but those aren't too difficult either. ~ Rob 13 Talk 10:54, 11 February 2017 (UTC)
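The skip conditions raised in this thread (files used on more than one article, or on an article not listed on the file description page) might be sketched as follows. The function name `should_skip` and its inputs are hypothetical. A real run would take the usage list from the wiki's file-usage data. Treating unused files as skipped is an assumption, and the "listed somewhere" test is a loose substring match.

```python
def normalise(text):
    """Treat underscores as spaces and collapse runs of whitespace."""
    return " ".join(text.replace("_", " ").split())


def should_skip(using_articles, wikitext):
    """Return True when the file should be left for human review.

    using_articles: titles of the articles that display the file.
    wikitext: the raw text of the file description page.
    """
    # Files used on several articles need one rationale per article,
    # which is hard to verify automatically. Unused files are skipped
    # too (an assumption, not stated in the discussion).
    if len(using_articles) != 1:
        return True
    # The single using article must be named somewhere on the file page.
    return normalise(using_articles[0]) not in normalise(wikitext)
```

Because the match is a plain substring test, a short article title could match unrelated text on the page; a stricter variant would compare only against the rationale's article field.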

Please add a link or two to the bot's edit summary so it points to an explanation of the task it's doing. "Reviewing" is a bit vague.


 * - The NFUR doesn't have author
 * - This one is just weird, but I guess technically correct

Reviewed all other edits and they look fine.

Could the bot skip pages that have extremely short values, like "—" or "n/a" or "yes", in many of their fields? That pattern suggests someone added the template because the guideline said so, but didn't bother filling it out properly. I'd say anything super-short likely needs human review. — HELL KNOWZ  ▎TALK 19:19, 11 February 2017 (UTC)
 * Thank you for reviewing. The first edit isn't an error per se, but it is weird, mostly because the page is weird. For some reason, the page had two NFURs, both for the same article. One was fully valid, the other was not (due to the missing field). I've removed the extraneous template. Note that the bot is skipping all files that are in use on multiple articles entirely (as of the last run of the scan that's putting together a list, anyway), so this type of thing won't pop up as an issue in those cases. This was just an oddity due to the double template for one article. For the second edit, indeed, it is weird, and I'll fix it. The bot did its job, though. For Non-free use rationale logo, the n/a check is unnecessary because the scan putting together the list of pages to process ensures that the one page each image is in use on is wikilinked from the file page, which ensures sufficient information in the only field that needs to be filled in manually ("Article"). The only other field ("Use") usually has short text, because it's a simple switch. I'm already checking to ensure it's populated with one of the appropriate switch values. When I expand this to something like Non-free use rationale 2, I will absolutely add checks to ensure fields that should contain substantial information have more than a certain number of characters in them. Good idea! ~ Rob 13 Talk 21:36, 11 February 2017 (UTC)
 * How open-ended do you expect the expansion of the task to be? I would say that including more templates and combinations it recognizes, adding additional skip conditions or attempting to detect issues (while not doing anything about them) are valid expansions. Tagging more than the "has rationale" or filling out additional information would require further BRFAing, but probably speedy. Detecting multi-page use would probably need BRFA, because it's likely full of gotchas. — HELL KNOWZ  ▎TALK 21:59, 11 February 2017 (UTC)
 * I would only expand in the ways I've detailed above (more templates, always with similar skip conditions to what's been discussed). I would also consider additional skip conditions above and beyond what has been discussed here to be fair game, as that only results in edits not being made, rather than expanding which edits are being made. Anything outside that (including everything you listed as requiring a new BRFA) sounds like an entirely new task to me, and I would file a new BRFA accordingly. I do not have any plans for other such tasks, though. I doubt this can be expanded any further without causing concern about false positives. In particular, the multi-page use sounds like a whole lot of WP:CONTEXTBOT to me. (As an exercise in semantics, I wouldn't call adding more templates an "expansion". I've explicitly asked for open approval here related to tagging all non-free file templates with having a rationale, provided that a completed NFUR is present on the page in the form of a template with certain parameters filled in. I'll do that and nothing more. I never actually expand a task beyond what I asked for approval for without at least consulting a BAG member.) ~ Rob 13 Talk 22:28, 11 February 2017 (UTC)
 * Sounds good. I'll leave this to MusikAnimal to (further review and) close, since they originally trialed it. My only note is to use a better-worded summary, as I mentioned above. —  HELL KNOWZ  ▎TALK 23:27, 11 February 2017 (UTC)
 * Is linking to this BRFA sufficient or do you have something more specific you're looking for? It's difficult to expand much on the wording without going over the character limit now that I'm wikilinking to this discussion, and "review" is the word typically used for this sort of activity. I could go with "determined presence of NFUR" possibly, although that isn't really a complete descriptor of what the bot does. ~ Rob 13 Talk 00:42, 12 February 2017 (UTC)
 * I suggest a link to the bot's (sub)page, where the task is explained and you can add any additional comments in the future. Linking to the BRFA is usually okay too, but I don't personally like it because a BRFA is closed after trial, so you can't clarify or expand anything unless the task is a one-off run. Actual wording is up to you, really, and you would know the FUR-related wording better anyway. I was just concerned that, from an outsider's perspective (non-regulars to image maintenance), I would have no idea what the bot "reviewed". — HELL KNOWZ  ▎TALK 01:23, 12 February 2017 (UTC)
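The two field checks discussed above (a switch field limited to known values, and free-text fields that must be more than a placeholder) could look roughly like this. The placeholder set, the 20-character threshold, and the function names are all hypothetical; the thread only says "more than a certain number of characters". The allowed switch values are passed in by the caller rather than assumed here.

```python
# Hypothetical placeholder values that suggest a field was not really filled in.
PLACEHOLDERS = {"", "-", "\u2014", "?", "n/a", "na", "none", "yes", "no"}

# Hypothetical minimum length for a field that should carry real information.
MIN_LENGTH = 20


def is_substantial(value, min_length=MIN_LENGTH):
    """Return True when a free-text field looks like it was genuinely filled in."""
    text = " ".join(value.split())
    return text.casefold() not in PLACEHOLDERS and len(text) >= min_length


def is_valid_switch(value, allowed):
    """Return True when a switch-style field holds one of the allowed values."""
    return value.strip().casefold() in {a.casefold() for a in allowed}
```

Both checks are skip conditions only: a failing field means the bot makes no edit, so tightening the threshold cannot introduce new edits.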

Looks good! I'm going to go ahead and approve as I know you'll come up with a fitting edit summary. I think the current one is actually fine if you add some links, maybe make "non-free use rationale" link to WP:NFUR, and "Task 32" link to an explanatory bot subpage as Hellknowz suggested — MusikAnimal  talk  02:11, 13 February 2017 (UTC)
 * The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.