Wikipedia:Bots/Requests for approval/Community Tech bot 5


 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was

Community Tech bot 5
Operator: and the Wikimedia Community Tech team

Time filed: 06:19, Monday, May 14, 2018 (UTC)

Automatic, Supervised, or Manual: automatic

Programming language(s): Python/Pywikibot

Source code available: https://github.com/MaxSem/CommonsNotifier (might be moved elsewhere later)

Function overview: notifying article authors about problems with the Commons images used in their articles.

Links to relevant discussions (where appropriate): Wishlist survey #10, enwiki discussion

Edit period(s): continuous. Will start with running every 15 minutes, then adjust if needed.

Estimated number of pages affected: several thousand per month

Namespace(s): Talk:

Exclusion compliant (Yes/No): yes, standard Pywikibot code

Adminbot (Yes/No): no

Function details:

Whenever a Commons image used on this wiki in mainspace is nominated for deletion:
 * Wait some time in case the nomination was vandalism/mistake. We will start with 15 minutes' wait for speedy deletion nominations and 1 hour for discussion-based deletions but may change the delays later if needed.
 * Post short messages to affected articles' talk pages, linking to the deletion discussion where applicable. The messages can be seen at T188151.
 * The bot will not post to more than 10 pages per file. Initially, these will be selected randomly, but later the 10 most watched pages will be notified.
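
The steps above can be sketched in Python as follows. This is only a minimal illustration, not the bot's actual code: the delays and the 10-page cap come from the function details, while the names `is_due`, `pick_targets`, `SPEEDY_DELAY`, `DISCUSSION_DELAY` and `MAX_PAGES_PER_FILE` are hypothetical.

```python
import random
from datetime import datetime, timedelta, timezone

# Values taken from the function details; all names here are hypothetical.
SPEEDY_DELAY = timedelta(minutes=15)      # wait for speedy deletion nominations
DISCUSSION_DELAY = timedelta(hours=1)     # wait for discussion-based deletions
MAX_PAGES_PER_FILE = 10                   # cap on notified talk pages per file


def is_due(nominated_at, is_speedy, now):
    """Return True once the grace period for a nomination has elapsed."""
    delay = SPEEDY_DELAY if is_speedy else DISCUSSION_DELAY
    return now - nominated_at >= delay


def pick_targets(articles, rng=random):
    """Pick at most MAX_PAGES_PER_FILE articles, at random if there are more."""
    articles = list(articles)
    if len(articles) <= MAX_PAGES_PER_FILE:
        return articles
    return rng.sample(articles, MAX_PAGES_PER_FILE)
```

For a file used on 1235 articles, `pick_targets` would return 10 distinct titles; for a file used on two articles it would return both unchanged.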

Discussion

 * I'm assuming the reason the initial notifications will be random is down to technical difficulties in getting the most-watched pages? I'm not entirely sure that a random scatter is the best backup approach however...  Richard 0612  10:19, 14 May 2018 (UTC)


 * Correct, getting the list of the top 10 most followed pages proved to be technically prohibitive. (It's possible, but caused a significant performance drag on the bot and we think we can find a better alternative.) We're starting small and will iterate based on feedback if the bot is too quiet. We're documenting all ideas of ways to notify if a file is used on more than 10 pages at T190313. — Trevor Bolliger, WMF Product Manager (t)  21:35, 14 May 2018 (UTC)
 * I would opt for the 10 most active pages (by number of edits in some sensible timeframe - say the past month) rather than a random 10, but I'll follow/contribute to the discussion on Phabricator.  Richard 0612  21:44, 14 May 2018 (UTC)


 * Hi, wanted to walk through your process so for example:


 * 1) If I nominated commons:File:Stylised_atom_with_three_Bohr_model_orbits_and_stylised_nucleus.svg for some sort of deletion at commons
 * 2) The bot will pull this list of 1235 articles
 * 3) The bot will randomly pick 10 of those, and leave a message at the associated talk page
 * Correct? — xaosflux  Talk 14:29, 14 May 2018 (UTC)
 * Yes. MaxSem (WMF) (talk) 21:18, 14 May 2018 (UTC)


 * If the talk page is a redirect, will the bot follow it? — xaosflux  Talk 14:29, 14 May 2018 (UTC)
 * Redirects are skipped (as in no page gets edited) because it's suspicious for an article with actual content and images to have its talk page redirected. MaxSem (WMF) (talk) 21:18, 14 May 2018 (UTC)


 * If the talk page is protected, will a different target be selected, assuming there are more than 10 possible targets? — xaosflux  Talk 14:29, 14 May 2018 (UTC)
 * Yes. MaxSem (WMF) (talk) 21:18, 14 May 2018 (UTC)
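
A rough sketch of the two answers above (redirected talk pages are skipped, protected ones are replaced by another candidate) might look like the following. The `TalkPage` class and `select_talk_pages` function are hypothetical stand-ins, not the bot's code. In real Pywikibot the checks would presumably use `Page.isRedirectPage()` and `Page.protection()`. Whether a skipped redirect is replaced by another candidate is an assumption here; the discussion only confirms this for protected pages.

```python
import random
from dataclasses import dataclass

MAX_PAGES_PER_FILE = 10  # cap from the function details


@dataclass
class TalkPage:
    """Hypothetical stand-in for a talk page and the two properties checked."""
    title: str
    is_redirect: bool = False
    is_protected: bool = False


def select_talk_pages(candidates, rng=random):
    """Walk the candidates in random order, skipping unusable talk pages."""
    pool = list(candidates)
    rng.shuffle(pool)
    chosen = []
    for page in pool:
        # Redirected talk pages are never edited; protected ones cannot be.
        # Assumption: in both cases the next candidate is tried instead.
        if page.is_redirect or page.is_protected:
            continue
        chosen.append(page)
        if len(chosen) == MAX_PAGES_PER_FILE:
            break
    return chosen
```

With this approach a file used on more than 10 pages still reaches up to 10 editable talk pages even when some candidates are unusable.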


 * If an image is in use in an article because it is in a template (e.g. in the example above), can you notify the template talk page? (Perhaps expand this to include Template as well as Main?) —  xaosflux  Talk 14:33, 14 May 2018 (UTC)
 * Figuring out which template is responsible is tricky, so right now there's no such functionality. MaxSem (WMF) (talk) 21:18, 14 May 2018 (UTC)
 * how is this "tricky"? — xaosflux  Talk 21:32, 14 May 2018 (UTC)
 * I believe Max means tricky to determine from the page level, not the template level. I provided a similar response to Richard above, but in short we're starting with a quiet bot and will build functionality if the 'pick 10 pages at random' is ineffective. We're documenting all ideas of ways to notify if a file is used on more than 10 pages at T190313. Posting on template talk pages is on the list. — Trevor Bolliger, WMF Product Manager (t)  21:35, 14 May 2018 (UTC)


 * As the supposed goal of this is to let people know there is something going on, will you be asserting the bot attribute on these edits? (where it may be filtered from watchlists, but will also avoid flooding recent changes). — xaosflux  Talk 14:35, 14 May 2018 (UTC)
 * Good question! Because this is a notification (as opposed to routine maintenance or other bot activities) its edits should not be marked as bot. Does this match what you would expect from this type of bot? — Trevor Bolliger, WMF Product Manager (t)  21:35, 14 May 2018 (UTC)
 * Yes, I think avoiding the bot flag will be necessary for it to get the attention of page watchers - only concern would be flooding of the Recent Changes Feed. What is the highest expected edit rate that would be in recentchanges at a time? —  xaosflux  Talk 21:38, 14 May 2018 (UTC)
 * Perhaps some options could be incorporated using tags and/or 'minor' flags to allow people to hide this? — xaosflux  Talk 21:38, 14 May 2018 (UTC)
 * Oh, great idea! We're not 100% sure about the exact rate at which Recent Changes will be affected; it could depend on time zones and the habits of active Commoners. We calculated some high-level week-long stats for how many pages would be edited for all Wikimedia wikis, but we did not drill down into just English Wikipedia. It was suggested that we first enable the bot with the `bot` flag and disable it later once we get a better idea of its frequency. Do you think this would be helpful or is it unnecessary? — Trevor Bolliger, WMF Product Manager (t)  00:06, 15 May 2018 (UTC)


 * approving for an initial trial. During trial use the bot parameter while we judge the impact it could have on RCP. As possible, please include the tag  . —  xaosflux  Talk 01:49, 15 May 2018 (UTC)
 * Will do, thank you! The ticket for making the bot trial compliant is T194778. — Trevor Bolliger, WMF Product Manager (t)  16:24, 15 May 2018 (UTC)
 * Note, an initial small sample run is available here. — xaosflux  Talk 01:50, 23 May 2018 (UTC)


 * OperatorAssistanceNeeded Please provide a report/update on your trial. — xaosflux  Talk 19:18, 12 June 2018 (UTC)
 * Hey, the bot was off for a significant part of the time since the approval. There were some problems, eventually resolved. The bot is stopped now. Max Semenik (talk) 21:03, 12 June 2018 (UTC)


 * Edits are here. — xaosflux  Talk 21:07, 12 June 2018 (UTC)


 * In general the edits look OK - was there any editor feedback or Commons engagement stats? — xaosflux  Talk 21:10, 12 June 2018 (UTC)
 * Other than reporting the bug when the bot got crazy, no other input. We don't have any stats. Max Semenik (talk) 01:37, 13 June 2018 (UTC)
 * Actually in going through the edits, I'm not very happy about these: Talk:Mexican_peso - was this part of the "got crazy" detection? — xaosflux  Talk 02:17, 13 June 2018 (UTC)
 * Yes, that (permalink) was part of the "got crazy" but it's been resolved now. We're now batching the talk page notifications into lists (like such) to avoid this in the future. Also of note: the bot is set to run every 15 minutes. This can be adjusted if needed, but we feel it's timely for Speedy Deletion nominations. — Trevor Bolliger, WMF Product Manager (t)  14:45, 13 June 2018 (UTC)
 * Not quite so, and  - the files here are all different so this is a lack of batching which we fixed in T195629, as can be seen e.g. here. The part that users have noticed was repeated posting about the same image which we also fixed from two different aspects (now the bot leaves machine-readable comments about each file and avoids keeping the database connection open for too long to prevent timeouts). Max Semenik (talk) 05:29, 15 June 2018 (UTC)
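
The two fixes described above (batching several files into one message, and leaving a machine-readable comment per file so the same image is not announced twice) could be sketched roughly as follows. The marker format, the message wording and the names `MARKER`, `notified_files` and `build_batch` are all hypothetical; the real format is whatever was implemented in T195629 and is not shown in this discussion.

```python
import re

# Hypothetical hidden-comment format; the bot's real marker is not shown here.
MARKER = "<!-- commons-notifier: {} -->"
MARKER_RE = re.compile(r"<!-- commons-notifier: (.*?) -->")


def notified_files(talk_text):
    """Return the set of file names already announced on a talk page."""
    return set(MARKER_RE.findall(talk_text))


def build_batch(talk_text, nominated_files):
    """Build one list-style message for files not yet announced, or None."""
    seen = notified_files(talk_text)
    # dict.fromkeys removes duplicates while keeping the original order.
    new = [f for f in dict.fromkeys(nominated_files) if f not in seen]
    if not new:
        return None
    # Illustrative wording only, not the bot's actual message text.
    lines = ["The following Commons files used on this page "
             "have been nominated for deletion:"]
    for name in new:
        lines.append("* [[:File:{0}]] {1}".format(name, MARKER.format(name)))
    return "\n".join(lines)
```

Running `build_batch` again on a talk page that already contains the first message would only list files that have no marker yet, and would return `None` if there are none.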


 * OK, well I suppose we should get ready for production. We haven't determined any solid use for production tagging yet unless you have a need/desire for one, so please reconfigure for no tags on the edits. For this task, keep using "minor", and not using "bot" flags. Once this is ready, do a short (<100 edit) test and report back here for final approval. —  xaosflux  Talk 02:13, 13 June 2018 (UTC)
 * — xaosflux  Talk 02:13, 13 June 2018 (UTC)
 * trial complete. Max Semenik (talk) 08:15, 16 June 2018 (UTC)
 * they look good. As far as future editor feedback, will User talk:Community Tech bot be monitored or do editors need to go to meta? —  xaosflux  Talk 13:53, 16 June 2018 (UTC)
 * about ready to close this out, just want to confirm how communication is going to take place. There is a local talk page with a banner to go to meta, but it also appears to be getting used here.  I'm good pretty much either way, just want to make sure it is clear for editors and that where any messages go is monitored.  Thanks, —  xaosflux  Talk 01:40, 18 June 2018 (UTC)
 * We are happy to receive feedback at the user talk page here on English Wikipedia, the user talk page on Meta, or the project's talk page on Meta. User talk:Community Tech bot here on ENWP is most preferred. — Trevor Bolliger, WMF Product Manager (t)  16:21, 18 June 2018 (UTC)


 * — xaosflux  Talk 16:36, 18 June 2018 (UTC)


 * The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.