Wikipedia:Bots/Requests for approval/FrescoBot 13


 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was

FrescoBot 13
Operator:

Time filed: 22:19, Sunday, February 26, 2017 (UTC)

Automatic, Supervised, or Manual: Automatic

Programming language(s): Python

Source code available: standard pywikipedia

Function overview: The bot fixes a wide range of common gallery problems, such as syntax errors, duplicates and nonexistent images.

Links to relevant discussions (where appropriate): Help:Gallery tag

Edit period(s): monthly

Estimated number of pages affected: 2000? 9000?

Exclusion compliant (Yes/No): Yes

Already has a bot flag (Yes/No): Yes

Function details: This script has been running on Commons since September 2016, so it is already well tested. As usual, I try to fix many different problems in a single edit.
 * Syntax issues
 * images inserted in the gallery with standard markup using square brackets (eg., )
 * nonexistent parameters like thumb, left, right, upright, size, etc.  -->   (example)
 * empty and completely redundant gallery tags like  -->   (example)
 * multiple pipe characters after the filename
 * missing pipe character after the filename
 * captions as image parameters and therefore not visible
 * categories mistakenly inserted in dummy galleries
 * HTML tags problems
 * tags not closed
 * pointless br tags at the end of the line
 * center tags without any content
 * center tags around galleries in packed mode
 * closing tags without the open tag
 * other less common problems with tags
 * Dummy caption: it will blank common dummy captions (eg. Add caption here)
 * Duplicate images in the same gallery with the same description
 * Nonexistent images
 * lines without any valid image name
 * before removing any suspect invalid image the bot will check for the existence of the file
 * if the file apparently does not exist then it will try to fix some common mistakes like a missing/extra | at the end/beginning of the filename
 * Invisible characters
 * invisible LTR (U+200E) and RTL (U+200F) marks in the filename area (completely useless)
 * other invisible control characters (from U+2027 to U+202F) in the filename area (completely useless)
 * unexpanded special strings
 * unexpanded user signatures (3 or more tildes within the galleries)
 * unexpanded magic words
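
A few of the fixes listed above can be sketched in Python. This is only an illustrative sketch, not the bot's actual code: the names `clean_gallery_line` and `clean_gallery`, the regular expressions and the dummy-caption list are all assumptions made for the example. It covers stripping invisible marks from the filename area, dropping parameters that galleries do not support, collapsing repeated pipes, blanking a dummy caption and removing duplicate entries.

```python
import re

# Assumed patterns for illustration only; the real bot's regexes are not published here.
# Invisible marks named in the task description: U+200E, U+200F, U+2027..U+202F.
INVISIBLE = re.compile("[\u200e\u200f\u2027-\u202f]")
# Parameters that are valid for inline images but not inside <gallery>.
INVALID_PARAM = re.compile(r"^(thumb|left|right|upright(=[\d.]+)?|\d+px)$", re.IGNORECASE)
# Hypothetical list of placeholder captions to blank.
DUMMY_CAPTIONS = {"add caption here"}


def clean_gallery_line(line):
    """Clean one 'filename|caption' line taken from inside a gallery."""
    filename, _, rest = line.partition("|")
    # Invisible characters are only stripped from the filename area.
    filename = INVISIBLE.sub("", filename).strip()
    # Drop empty segments (repeated pipes) and unsupported parameters,
    # then rejoin so that pipes inside wikilinks survive.
    kept = [
        part
        for part in rest.split("|")
        if part.strip() and not INVALID_PARAM.match(part.strip())
    ]
    caption = "|".join(kept)
    if caption.strip().lower() in DUMMY_CAPTIONS:
        caption = ""
    return f"{filename}|{caption}" if caption else filename


def clean_gallery(lines):
    """Clean every line and drop entries whose file and caption repeat."""
    seen = set()
    result = []
    for line in lines:
        cleaned = clean_gallery_line(line)
        if not cleaned or cleaned in seen:
            continue
        seen.add(cleaned)
        result.append(cleaned)
    return result
```

The sketch deliberately leaves out anything that needs a lookup on the wiki, such as checking whether a file exists before removing its line.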

Discussion
Is there some sort of guideline/policy or consensus that invalid syntax entries should be removed? This feels like WP:CONTEXTBOT. I can see it as desired on Commons, where there are so many galleries and bad syntax is just removed. But we have way more editors and way fewer[cn] galleries. Links to files that were actually deleted are normally removed (or should be) by a file de-linking bot; any others are likely user errors. I could see this as uncontroversial if you only removed links to files previously deleted. — HELL KNOWZ  ▎TALK 12:00, 27 February 2017 (UTC)


 * Well, IMHO the mere existence of CHECKWIKI shows a broad consensus for removing technical mistakes from Wikipedia. If needed, I could name many good reasons to prefer correct, clean page source over a mess full of mistakes and rubbish. :) Here on en.wiki there are fewer galleries; nevertheless, there are 100868 pages with at least one gallery. I just ran the script in read-only mode on the first 500 pages with galleries, and 9% of them contain at least one of the above problems. We definitely have a job for a bot. Most of the "nonexistent images" are just rubbish or old captions left behind in the page by the delinking bot during image removal (example). I could skip checking the existence of apparently valid image names, but removing a (rare) misspelled image name actually helps a lot in fixing the problem: everybody watching the page will be notified that a bot removed nonexistent images from the gallery. -- Basilicofresco  (msg) 20:18, 27 February 2017 (UTC)

CHECKWIKI has proved contentious in the past, the recent Magioladitis case highlights this. How do you feel WP:COSMETICBOT fits in with these proposals? Some of these changes would obviously affect the visual appearance of the page, some will not. Will the bot only perform a cosmetic change when other, substantial changes are present? TheMagikCow (talk) 20:52, 27 February 2017 (UTC)


 * You misunderstood me. I only asked about removing entries, i.e. the entire line: Fille:Cat.jpg|Kitty -> nothing. This is an example of a user error and a classic WP:CONTEXTBOT example. The bot cannot always determine why this line is invalid, when a human easily might. Per WP:BOTREQUIRE, you should show that each subtask "carefully adheres to relevant policies and guidelines", including that it is okay to automatically remove entries that don't have a file, regardless if they can be fixed by a human. Just because there are a lot of subtasks you listed and majority are good to go, does not exclude each one from having to have consensus/guideline/policy. It is up to the community to decide by agreement or reasonable WP:SILENCE that "removing a (rare) misspelled image name actually helps a lot in fixing the problem". The rest of the issues are fine as they are purely syntactic errors (e.g. invalid tag) or unambiguously wrong content errors (e.g., placeholder text). — HELL KNOWZ  ▎TALK 21:01, 27 February 2017 (UTC)


 * Cosmetic changes (moving/removing spaces, etc.) will only be performed when more substantial changes are present. Nevertheless, I would like to stress that, for example, closing a tag will not affect the visual appearance of the page, but it is far from being just a cosmetic change. It will improve the consistency of the markup and therefore the accessibility. @Hellknowz Now I understand what you mean. Well, your example is already properly fixed by the bot as a syntax problem (eg. ). During the early tests on Commons I added several regexes designed to fix the largest possible number of reasonable misspellings. This greatly reduced the risk of removing an image because of an easy mistake, and I did not receive any complaints of this kind. In any case, what do you suggest? Should I open a discussion at the village pump and ask for comments about this specific point? -- Basilicofresco  (msg) 01:39, 28 February 2017 (UTC)
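
To illustrate the kind of misspelling repair discussed in this thread (the `Fille:Cat.jpg|Kitty` example), here is a minimal sketch. It is an assumption-laden stand-in: the bot is described as using regexes, whereas this example swaps in fuzzy matching from Python's standard `difflib` module, and the function name `fix_prefix`, the prefix table and the 0.8 cutoff are all hypothetical.

```python
import difflib

# Hypothetical table of namespace prefixes accepted in gallery lines.
KNOWN_PREFIXES = {"file": "File", "image": "Image"}


def fix_prefix(line, cutoff=0.8):
    """Repair a misspelled namespace prefix such as 'Fille:' at the start of a line."""
    prefix, sep, rest = line.partition(":")
    if not sep:
        # No colon at all, so there is no prefix to repair.
        return line
    match = difflib.get_close_matches(
        prefix.strip().lower(), list(KNOWN_PREFIXES), n=1, cutoff=cutoff
    )
    if not match:
        # Not close to any known prefix: leave the line for a human to judge.
        return line
    return f"{KNOWN_PREFIXES[match[0]]}:{rest}"
```

Lines whose prefix is not close to a known one are returned unchanged, which mirrors the concern raised above that a bot cannot always tell why a line is invalid.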


 * Would you want to file this as 2 BRFAs (make another for unknown line removal) to avoid the delay of seeking consensus? I can quickly trial and very likely approve all the other subtasks, as they are--as I mentioned--purely syntactic errors or unambiguously wrong content errors that you have already been running on Commons and the bot does not need to make any WP:CONTEXTBOT decisions. The line removal task can then be advertised separately and it would be clear what it does. No complaints on Commons is a good precedent, but it's not something that can establish consensus for BOTPOL/BRFA for what is essentially content removal when the bot cannot fix it. — HELL KNOWZ  ▎TALK 14:00, 28 February 2017 (UTC)


 * Now that the minor changes have been addressed, I support the recommendation made by Hellknowz. I support all the tasks, bar line removal until there is consensus. TheMagikCow (talk) 21:19, 28 February 2017 (UTC)

Basilicofresco, let's open a discussion where we will get consensus to do these edits too. I support you. -- Magioladitis (talk) 21:44, 28 February 2017 (UTC)

I totally support this task. -- Magioladitis (talk) 12:26, 28 February 2017 (UTC)

Is it correct that this task only affects content inside galleries, and not content elsewhere in articles? — Carl (CBM · talk) 12:26, 17 March 2017 (UTC)
 * Function overview specifies "in galleries", so that is what I assumed. — HELL KNOWZ  ▎TALK 12:35, 17 March 2017 (UTC)
 * I think it never hurts to have confirmation from the bot operator, when things are not quite clear. — Carl (CBM · talk) 12:41, 17 March 2017 (UTC)
 * Of course it affects only the image galleries. I took care to properly parse even misused gallery tags in order to avoid problems. -- Basilicofresco  (msg) 07:02, 21 May 2017 (UTC)
 * I opened a discussion at Wikipedia talk:Image use policy. If you feel it is missing something, do not hesitate to express your opinion. Thanks. -- Basilicofresco  (msg) 12:11, 21 May 2017 (UTC)
 * Could you either move the discussion to Village pump (proposals) or post a notice there of the ongoing discussion? Unlike Commons where images dominate, very few people are invested in the file namespace on enwiki or watch WP:Image use policy. You'll get more community input at the village pump than such a policy talk page. ~ Rob 13 Talk 04:16, 3 June 2017 (UTC)
 * Yep, good idea. Done! -- Basilicofresco  (msg) 19:52, 4 June 2017 (UTC)


 * BotTrial The village pump notice resulted in more comments with unanimous support for this task. Seems good to trial. Please be especially careful your bot only makes other changes when the main task is also being completed. ~ Rob 13 Talk 05:18, 13 June 2017 (UTC)
 * Are there results from trialing? — xaosflux  Talk 12:39, 7 July 2017 (UTC)

Note This task partially covers CHECKWIKI error 85 and CHECKWIKI error 29. I totally support it. -- Magioladitis (talk) 23:09, 7 July 2017 (UTC)


 * BotExpired No response from operator - who has not made an edit in almost a month either. May be reactivated by editor at a later time if desired. — xaosflux  Talk 02:51, 15 July 2017 (UTC)
 * Re-opened per request. Headbomb {t · c · p · b} 18:35, 17 August 2017 (UTC)

Is the bot trial complete? -- Magioladitis (talk) 19:36, 17 August 2017 (UTC)
 * Completed! I'm sorry for the long delay, and thanks for your patience. In June and July I was working overseas, far from home, and did not have enough free time to properly retest the script on a new wiki and follow up on potential issues, so I waited. The script seems ready to me. -- Basilicofresco  (msg) 05:53, 18 August 2017 (UTC)

Magioladitis (talk) 08:02, 18 August 2017 (UTC)
 * Please summarize your trial results. — xaosflux  Talk 21:41, 20 August 2017 (UTC)


 * I did not notice that the BRFA was still pending. As I said on 18 August 2017, the trial result looked good to me. "Trial completed" and no complaints or questions sounded like a green light to me, so... I started the bot. I'm sorry, please accept my apologies for the misunderstanding. Well, at least I can summarize the result of this extended trial: no real issues emerged. There were just 4 clarification enquiries over 10 days:
 * User talk:Basilicofresco --> No problems, the user just misread the diff.
 * User talk:Basilicofresco --> No problems, the user just misread the diff.
 * User talk:Basilicofresco --> The bot removed an invisible chunk of text hidden in the gallery. I explained my opinion and the user did not reply further. The edit looks good and reasonable to me.
 * User talk:Basilicofresco --> The bot fixed only part of a problem involving a closing ref tag alone on a new line (which is not valid). The user fixed the rest. Maybe I could add a specific fix to detect this mistake, but it really seems very rare.
 * Do you have any additional questions or requests? PS: Please note that I was not notified about the OperatorAssistanceNeeded tag. -- Basilicofresco  (msg) 21:06, 29 August 2017 (UTC)
 * -- Basilicofresco  (msg) 10:58, 30 August 2017 (UTC)
 * I have no issues with the first three inquiries on your talk page, but I do have a problem with the last one. I understand the concept of GIGO: naturally, garbage data in will result in meaningless output data. However, in this case your bot is designed to take garbage data and make it meaningful. So I would like to see broken refs get fixed within galleries, even if it's rare. — CYBERPOWER  ( Chat ) 08:00, 3 September 2017 (UTC)
 * No problem, I inserted a specific regex in order to fix this rare mistake. -- Basilicofresco  (msg) 04:40, 4 September 2017 (UTC)
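
The regex mentioned in the reply above is not shown in this discussion. As a rough sketch of what such a fix might look like, the following joins a closing ref tag that sits alone on its own line back onto the previous gallery line. The pattern and the name `join_stray_closing_ref` are assumptions for illustration only.

```python
import re

# Assumed pattern: a newline followed by a line that contains only a closing ref tag.
STRAY_CLOSING_REF = re.compile(r"\n[ \t]*(</ref\s*>)[ \t]*(?=\n|$)", re.IGNORECASE)


def join_stray_closing_ref(gallery_text):
    """Move a closing ref tag that is alone on a line to the end of the previous line."""
    return STRAY_CLOSING_REF.sub(r"\1", gallery_text)
```

Inside a gallery each line is treated as one entry, so a reference that spans two lines ends up split between entries; joining the lines restores a single well-formed entry.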


 * — CYBERPOWER  ( Chat ) 07:32, 4 September 2017 (UTC)
 * The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.