Wikipedia:Bots/Requests for approval/FastilyBot 1


 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Approved.

FastilyBot 1
Operator:

Time filed: 04:03, Sunday, November 15, 2015 (UTC)

Automatic, Supervised, or Manual: Automatic supervised

Programming language(s): Java

Source code available:

Function overview:
 * Function 1:
 * 1) Select files tagged with  where the file is already (as in matching sha-1 hashes) on Commons.
 * 2) Edit each file to replace  with  (so as to request admin review).
 * Function 2:
 * 1) Select files tagged with  where the file is NOT already on Commons (SKIP files where the local file does not match the Commons file linked in the template OR files where the local file and Commons file share the same title but do not match)
 * 2) Edit each file to replace  with  (so as to request human review - human should untag the file if not suitable for moving)

Note: this bot task does not actually find files to tag for transfer to Commons. It only looks at files which have already been tagged with or
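
As a rough illustration of Function 1, the two steps (compare SHA-1 hashes, then swap one tag for another) might be sketched as below. This is not the bot's actual source. The class and method names are hypothetical, and the template names are passed in as parameters because the names used by the bot are not shown in the overview above. Fetching file contents from enwp and Commons is left out entirely.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; none of these names come from the bot's real code.
class Function1Sketch {

    // Hex-encoded SHA-1 digest of a file's bytes.
    static String sha1Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            StringBuilder sb = new StringBuilder();
            for (byte b : md.digest(data)) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    // Step 1: the local file counts as "already on Commons" only if the hashes match.
    static boolean sameFile(byte[] localBytes, byte[] commonsBytes) {
        return sha1Hex(localBytes).equals(sha1Hex(commonsBytes));
    }

    // Step 2: replace a directly used template call such as {{old}} or {{old|x=y}}
    // with the given replacement text. If the template is not used directly
    // (for example, it is only transcluded by another template), the wikitext
    // is returned unchanged. Matching is case-insensitive for simplicity.
    static String replaceTag(String wikitext, String oldTemplate, String replacement) {
        Pattern p = Pattern.compile(
                "\\{\\{\\s*" + Pattern.quote(oldTemplate) + "\\s*(\\|[^{}]*)?\\}\\}",
                Pattern.CASE_INSENSITIVE);
        return p.matcher(wikitext).replaceAll(Matcher.quoteReplacement(replacement));
    }
}
```

A caller would check `sameFile(...)` first and call `replaceTag(...)` only on a match. When the returned text equals the input, nothing was replaced and the page can simply be skipped.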

Links to relevant discussions (where appropriate):

Edit period(s): Weekly

Estimated number of pages affected: Probably <1000 for the first run, but likely no more than 50 for each subsequent run

Exclusion compliant (Yes/No): Sure? Though to be honest, I don't see any benefit in doing so

Already has a bot flag (Yes/No): No

Function details: Basically as described in the Function Overview section. It's a simple housekeeping task and shouldn't be controversial. I'll also make use of the  parameters in each of the listed templates so users know a bot made the edit. Thanks for your consideration. - F ASTILY 04:03, 15 November 2015 (UTC)

Discussion

 * 1) Please add {{subst:ncd}} instead of Now Commons so that the template is properly dated.
 * 2) What does task 2 mean? Are you checking that the file on Commons is identical, or only that a file exists on Commons?
 * 3) Have you asked users who process most of the 'Now Commons' tags, such as User:Diannaa and User:Magog the Ogre, about what they think of a bot which adds 'Now Commons' tags to thousands of files with potentially incorrect essential information on Commons? Maybe it's better to tag the files in batches, since a lot of files may need to be cleaned up? --Stefan2 (talk) 16:56, 15 November 2015 (UTC)
 * {{subst:ncd}} does not have a bot parameter, so I will be manually filling parameters to look like
 * Identical of course. If a file is non-identical, then the bot will edit the file to remove the, and add  (with all parameters,   included, filled in) in its place.
 * This is not part of the bot's task. Please take a moment to review the Function Overview section.  - F ASTILY  22:16, 15 November 2015 (UTC)


 * Hi, Fastily; it's good to see you. Please create your bot's userpage with bot or similar when you get a chance (and the talk, too). The first task seems fine, but I wonder if there is an exception to the second. When moving to Commons, is it possible that the user will upload a different version that is fundamentally the same image (higher quality, perhaps)? If the Commons file does not exist, then the task is fine, but I figure these other cases may require human review and would be best logged or tagged somehow. —  Earwig   talk 07:28, 16 November 2015 (UTC)
 * Hi Earwig! Good to see you too :) That is a very good point, I'll add a rule to Function 2, handling instances where the local file is tagged  but does not match the Commons file; I can have the bot log the incident to a subpage in its userspace, and then skip the file.  - F ASTILY  07:59, 16 November 2015 (UTC)
 * A bot should not remove NowCommons if a file with the name exists on Commons as {{subst:ncd}} can be used if the file on Wikipedia is a lower resolution copy of the file on Commons. --Stefan2 (talk) 15:46, 16 November 2015 (UTC)
 * Yes, that is what I just said :) - F ASTILY 22:10, 16 November 2015 (UTC)


 * What happens if mtc isn't used directly but transcluded by a template, for example ? --Stefan2 (talk) 15:46, 16 November 2015 (UTC)
 * Nothing, obviously. The bot is only intended to take action in clear-cut cases where it can edit without breaking anything. - F ASTILY  22:10, 16 November 2015 (UTC)


 * (when you have the code written, of course). Would like to see a roughly even split of each sub-task. —  Earwig   talk 01:30, 18 November 2015 (UTC)
 * Thanks Earwig! I'll try to have this done by the end of next week - F ASTILY  21:21, 18 November 2015 (UTC)


 * Comment After some thinking, I don't think that it is appropriate to have a bot for task 2. If a file has Now Commons and the file doesn't exist on Commons, then it may mean that the file has been deleted on Commons. If the file has been deleted there, it is usually not appropriate to copy the file over a second time as it would just be deleted again. A user needs to check these tags manually to determine how the deletion on Commons affects the file's status on Wikipedia. For example, if the file on Commons was tagged with, it might be necessary to translate that template to a corresponding Wikipedia template, such as db-f9 or non-free fair use. There are currently only 136 files with Now Commons and most if not all of the files exist on Commons, so it shouldn't be a lot of work for a user to manually remove the tag when a file has been deleted on Commons.
 * In task 1, what does the bot do if the file has both mtc and NowCommons? Users who use Commons Helper to transfer files to Commons will typically add {{subst:ncd}} to a new section on the page while preserving the mtc template, so many files contain both templates. There is no need to add multiple NowCommons templates to the same file. --Stefan2 (talk) 17:27, 18 November 2015 (UTC)
 * For the first paragraph: That is precisely why I proposed the task. The bot identifies obvious cases where the  tag is inappropriate, and converts it to, so that a human can review the file and take action (e.g. untag local file if it is not appropriate for Commons) accordingly.  Have you taken a moment to review the Function Overview?
 * I do see the point being made here, though. Now Commons is definitely incorrect in the case Stefan2 outlines above, but perhaps Move to Commons is not the best template and we can have the bot leave a more specific message. I actually just found Deleted on Commons, which looks very appropriate. —  Earwig   talk 21:38, 18 November 2015 (UTC)
 * So in theory, the task is fine, just so long as the bot is going to double check to make sure there's no deletion-log entry for the page on commons. If Fastily wants to actually add  (DoC), then so be it, but the bot should skip over anything that's already been deleted on Commons, especially if the page here already has the DoC template. -- slakr  \ talk / 22:12, 18 November 2015 (UTC)
 * Right, although I wonder why a file would be tagged with NowCommons when no transfer attempt was actually made, other than user error (which I imagine is rare for this task—do we have examples of it happening?). There is also Incomplete move to Commons, but that seems to have a slightly different use case. —  Earwig   talk 22:19, 18 November 2015 (UTC)
 * @Earwig: You're correct that there are few occurrences of obviously misapplied tags.  I only proposed this task because it'll be easy to write; Subtask 1 shares a lot of code with Subtask 2.
 * @Slakr: I am not an admin on either Commons or enwp, so I would not be able to verify if a file deleted on Commons is identical to en.wp's copy. I definitely wouldn't want to apply  erroneously - F ASTILY  00:52, 19 November 2015 (UTC)
 * For the second paragraph: Simple. is removed and no additional  is added. - F ASTILY  21:20, 18 November 2015 (UTC)

Any news? —  Earwig   talk 00:29, 2 January 2016 (UTC)
 * Hi, apologies for the delay, I've been implementing new jwiki features for the bot. At the rate I'm going, I should have this ready within a week.  - F ASTILY  10:49, 6 January 2016 (UTC)

for Subtask 1. I am opting out of implementing Subtask 2; a crude scripted check shows the number of affected files to be fewer than 10, which doesn't warrant a bot task imo. - F ASTILY 11:02, 19 January 2016 (UTC)
 * In Special:Diff/700582618, the bot didn't replace an mtc template when adding a NowCommons template. That file's mtc template is transcluded by PD-US-1923-abroad. There is a potential risk that PD-US-1923-abroad may co-exist with keep local, in which case NowCommons shouldn't be added. Are you checking this? A normal mtc template, which isn't transcluded by something like PD-US-1923-abroad, should normally not co-exist with keep local. --Stefan2 (talk) 12:57, 19 January 2016 (UTC)
 * Yes, the bot worked as expected, flagging a file which is present on both enwp and Commons for admin review. I think this is fine, because a) the tag clearly states the file was flagged by a bot, and b) admins that delete properly transferred files are careful to check for copyright issues.  The scenario you're describing is out of scope for this task, but it is something that could be handled via bot; I don't mind doing it, so I'll write up a separate task. - F ASTILY  22:47, 19 January 2016 (UTC)
 * The task as described before the trial was run was to replace mtc with nowCommons if the file is present on Commons, but do nothing if an mtc tag can't be removed. If the task has changed to add nowCommons irrespective of whether there is an mtc tag which can be removed, then the bot needs to check whether keep local or do not move to Commons is used on the file information page and skip such files as they are unlikely to meet WP:F8. If the bot always checks whether an mtc tag can be removed, then this check should be unnecessary as mtc is unlikely to co-exist on the same file information page as one of those tags.
 * Some of the files needed extensive cleanup on Commons, for example c:Special:Diff/185163235 and c:Special:Diff/185163203, where the source, author and copyright tag were wrong. Even if a source was stated, the copyright tag was not always correct, see c:Special:Diff/185162465. The files are marked with a note ("Be careful. This file is tagged by a bot (FastilyBot). Be sure to check the file at Commons before deleting this file.") but do not otherwise seem to differ from other files with nowCommons. For example, the files appear in the same categories. --Stefan2 (talk) 16:01, 20 January 2016 (UTC)
 * The point of this bot is to check if a file exists on both Commons and enwp, and flag it for admin attention. Admins are expected to review each file before deleting the local and fix problems as necessary, which is why it is against the rules to use automated tools like Twinkle to batch-delete under F8. I have decided against adding the filter you're describing, because then files which are inappropriately transferred would never be brought to our attention; I'll add it back in if there is consensus to do so.  Also, please see Bots/Requests for approval/FastilyBot 2, which addresses the removal of  from blatantly ineligible files.  - F ASTILY  01:54, 21 January 2016 (UTC)
 * If you do not add the check I described, then admins processing F8 deletions will have to engage in edit warring with your bot since it will tag files which are both hosted locally and on Commons but do not qualify for deletion on either project. I don't think that it is a good idea to have a bot which promotes edit warring.
 * For example, File:1855-Melville Island.jpg is hosted locally and also exists as c:File:Melville Island Sketch (1855).jpg, but since the file has a "keep local" template, the file doesn't qualify for deletion per WP:F8, and it shouldn't be deleted on Commons either. Files like this are not to be tagged with NowCommons. --Stefan2 (talk) 22:17, 21 January 2016 (UTC)
 * I agree with you, but that situation as described would be impossible, because I'm generating the list of files to process via transclusions of . - F ASTILY  06:48, 22 January 2016 (UTC)
 * I think I'm ready for a BAG member to review the request :) - F ASTILY 02:53, 27 January 2016 (UTC)
 * A simple anti-edit warring measure would be to ensure that the bot never edits the same page more than once (keep track of page IDs, page titles, check for username in page history, whatever). I'm not sure how often it would come up, but I can't imagine the bot would ever need to; such a situation likely means a human did something weird and the bot should stay away. That aside, I think we're mostly good to go... Stefan2, any other thoughts? —  Earwig   talk 07:10, 27 January 2016 (UTC)
 * @Earwig: The bot is automatic supervised, so I'll be reviewing its logs and edits after each run. If there was edit warring, I'm sure I'd see it during my review - F ASTILY  23:55, 28 January 2016 (UTC)
 * @Earwig or any other BAG member: If there aren't any other objections, could this please be approved? I'd love to get this task started soon :) - F ASTILY  05:47, 31 January 2016 (UTC)
 * —  Earwig   talk 06:28, 31 January 2016 (UTC)
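
The anti-edit-warring measure suggested in the thread above (never edit the same page more than once, for example by keeping track of page IDs) could look roughly like the following. This is a minimal sketch and not something the operator committed to implementing. The class name is hypothetical, and the set lives only in memory, so a real weekly bot would presumably need to persist it between runs or check the page history instead.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical guard: remembers which page IDs have already been edited.
class EditOnceGuard {
    private final Set<Long> editedPageIds = new HashSet<>();

    // Returns true the first time a page ID is claimed and false on every
    // later attempt, so the caller can skip pages it has already touched.
    boolean tryClaim(long pageId) {
        return editedPageIds.add(pageId);
    }
}
```

The bot would call `tryClaim(pageId)` just before saving an edit and skip the page when it returns false, on the reasoning given above that a second edit likely means a human did something unusual.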
 * The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.