Wikipedia:Bots/Requests for approval/Yobot 48


 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Approved.

Yobot 48
Operator:

Time filed: 09:37, Friday, February 3, 2017 (UTC)

Automatic, Supervised, or Manual: Automatic

Programming language(s): AWB

Source code available:

Function overview: Fix external link with double prefix

Links to relevant discussions (where appropriate):

Edit period(s): Daily

Estimated number of pages affected: 5 pages per day

Exclusion compliant (Yes/No): Yes

Already has a bot flag (Yes/No): Yes

Function details: Example. Changes:
 * <tt> http://http://foo.com </tt> → <tt> http://foo.com </tt>
 * <tt> https://https://foo.com </tt> → <tt> https://foo.com </tt>
 * <tt> http://https://foo.com </tt> → <tt> https://foo.com </tt>
 * <tt> https://http://foo.com </tt> → <tt> http://foo.com </tt>
 * <tt> http:// http://foo.com </tt> → <tt> http://foo.com </tt>
 * <tt> https:// https://foo.com </tt> → <tt> https://foo.com </tt>
 * <tt> http:// https://foo.com </tt> → <tt> https://foo.com </tt>
 * <tt> https:// http://foo.com </tt> → <tt> http://foo.com </tt>
 * <tt> ftp://ftp://foo.com </tt> → <tt> ftp://foo.com </tt>
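The nine cases above can be sketched as a single regular-expression replacement. This is only an illustration in Python: the actual AWB find-and-replace rules used by the bot are not shown in this BRFA, and the names <tt>DOUBLE_PREFIX</tt> and <tt>fix_double_prefix</tt> are hypothetical. The sketch keeps the second prefix, allows at most one space between the two http/https prefixes, and handles <tt>ftp://ftp://</tt> only without a space, matching the list.

```python
import re

# Hypothetical pattern covering only the nine listed cases:
#   - http/https followed by http/https, with an optional single space
#   - ftp://ftp:// with no space
# The second prefix is captured so that it can be kept.
DOUBLE_PREFIX = re.compile(r"https?:// ?(https?://)|ftp://(ftp://)")


def fix_double_prefix(text: str) -> str:
    """Collapse a doubled URL prefix, keeping the second prefix."""
    return DOUBLE_PREFIX.sub(lambda m: m.group(1) or m.group(2), text)


if __name__ == "__main__":
    for broken in ("http://https://foo.com", "https:// http://foo.com"):
        print(broken, "->", fix_double_prefix(broken))
```

Because the pattern is applied to raw text, it does not look at where on the page the match occurs.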

Discussion

 * I don't see any issues with this task. Will general fixes be enabled? ~ Rob 13 Talk 11:08, 3 February 2017 (UTC)


 * Prior to beginning, please make sure the bot's userpage is up to date with which tasks are currently approved. Due to the prior confusion caused by this being a very busy bot, please list any running tasks with links to the associated BRFA for the specific task.  In edit summaries, include the task number and a link either to your task list on your user page or directly to this BRFA.  This task should not also run edits unrelated to the specific function (such as AWB genfixes), but it may be expanded to also repair double https links as well as http if you want. —  xaosflux  Talk 18:32, 3 February 2017 (UTC)


 * Have you abandoned this task? —  xaosflux  Talk 05:48, 10 February 2017 (UTC)


Permanent link. This month the list had only 11 pages. 2 more found by daily scan.

No general fixes performed. -- Magioladitis (talk) 07:00, 10 February 2017 (UTC)


 * Please retrial; the edit summaries are broken, see above. — xaosflux  Talk 02:34, 11 February 2017 (UTC)
 * Also please clean up User:Yobot as above. — xaosflux  Talk 02:35, 11 February 2017 (UTC)

Check again. To avoid creating hundreds of redirects, I'll create only those whose tasks are most likely to actually get approval. I created the appropriate redirect now, so there is no red link. I use redirects to save some bytes in the edit summary. -- Magioladitis (talk) 09:25, 11 February 2017 (UTC)

I updated the list of tasks. -- Magioladitis (talk) 09:30, 11 February 2017 (UTC)

Interesting note: The edit that introduced the duplicated http was made 12 days earlier and nobody fixed it. This shows how important this task is. -- Magioladitis (talk) 09:39, 11 February 2017 (UTC)

Permanent link. This time I used a list created by my own regex. -- Magioladitis (talk) 10:11, 11 February 2017 (UTC)


 * Trial looks fine to me. I oppose making this task open-ended with any new fix casually related being added at a later date with no oversight. That's how we got here in the first place. I'd be fine if the stuff described below were coded and added now, with another trial to make sure they work properly. ~ Rob 13 Talk 10:18, 11 February 2017 (UTC)
 * technical tweaks are allowed to all bots. -- Magioladitis (talk) 10:24, 11 February 2017 (UTC)
 * Expanding a task is not a technical tweak. Technical tweaks are allowed. Entirely new fixes that are related but distinct are not. ~ Rob 13 Talk 10:25, 11 February 2017 (UTC)
 * Fixing dupe prefixes is not an entirely new task. -- Magioladitis (talk) 10:34, 11 February 2017 (UTC)
 * In this task description you said "I will do A and I won't do B." If you then do B, that's a new task. Technical tweaks are fixing a bug, moving to a different programming language with the same function, tightening up the code so it runs faster, etc. ~ Rob 13 Talk 10:36, 11 February 2017 (UTC)
 * Covering encoding that is not covered now is technical. -- Magioladitis (talk) 10:55, 11 February 2017 (UTC)
 * It depends entirely on what approval you receive. Here, you've asked for approval to do specific encoding changes and explicitly said you won't handle these types of fixes. You don't get to randomly rescind promises made during the BRFA while the bot is running. As a side note, do you think turning on general fixes is "technical"? I would want an answer to that before this is approved, because I'm very worried about this attitude that you're able to change major functionality of this task after approval. That is not how approval is intended to work. ~ Rob 13 Talk 10:58, 11 February 2017 (UTC)
 * No general fixes at this stage. Still I wonder about why you are always worried. -- Magioladitis (talk) 11:06, 11 February 2017 (UTC)
 * What does "at this stage" mean? Would you seek additional approval before adding them? ~ Rob 13 Talk 11:07, 11 February 2017 (UTC)
 * It means task. The second question is some kind of bad faith? -- Magioladitis (talk)
 * No. It's clarification in light of your statements here. It shouldn't be this hard to elicit a response between "Yes, I would consider turning on genfixes without additional approval to be appropriate under the bot policy." and "No, I would not consider turning on genfixes without additional approval to be appropriate under the bot policy." ~ Rob 13 Talk 15:40, 11 February 2017 (UTC)

Don't try to wikilawyer me again. The examples below are not related to the task as I explicitly wrote. The example below won't be done by the bot at this stage (i.e. if they don't get additional approval). I wonder where this bad faith comes from. -- Magioladitis (talk) 13:24, 11 February 2017 (UTC)
 * Past Magioladitis: "technicsl (sic) tweaks are allowed to all bots." That was in response to me explicitly stating that you should do exactly what you just wrote, so why exactly did you write back and argue that the examples below were technical tweaks? The mind boggles. ~ Rob 13 Talk 13:28, 11 February 2017 (UTC)

To allow fixing a double prefix with a different encoding is technical. To catch things like the YouTube example is not. I need to find a dictionary for that, I guess :) Maybe the one that contains the word "cosmetic", which is, I guess, a neologism. -- Magioladitis (talk) 13:31, 11 February 2017 (UTC)

See my examples! One of them is not even about prefixes. It's about InternetArchiveBot. How did you come to the conclusion that I would like to include this in my bot task? Same for general fixes. I did not even mention them. -- Magioladitis (talk) 13:32, 11 February 2017 (UTC)

Nor is the other example about a double prefix. It's about a broken prefix. These are self-notes that I usually email to people. But you asked for more transparency. I also left messages at WP:CHECKWIKI. -- Magioladitis (talk) 13:34, 11 February 2017 (UTC)

The only one which is technical, I asked for feedback here:. -- Magioladitis (talk) 13:35, 11 February 2017 (UTC)


 * Ok, let's start over, because I genuinely have no idea what you're saying or arguing at the moment. Magioladitis, you are getting approval for what you wrote in the function details. Nothing more, nothing less. If you want approval for more, change it now so we can run a trial on it. I don't object to adding more to the task, if you would like to, but do it now. If you do not add to the function details now, then this bot is being approved for those functions and nothing more. Running additional fixes, even if related, would be an unauthorized bot. Technical changes mean changing how the bot runs, not what it fixes. ~ Rob 13 Talk 13:36, 11 February 2017 (UTC)

Exactly. Detecting the double prefix is a technical issue. Whether the URL is encoded one way or another is not a change of bot behaviour. Is this clear now? -- Magioladitis (talk) 13:38, 11 February 2017 (UTC)

Example of a double prefix with space. Found 5 of them in a total of 5+ million Wikipedia pages. Still it is easy to include them in the list of fixes. -- Magioladitis (talk) 13:42, 11 February 2017 (UTC)

I think you have to calm down and read what I wrote about the task. This may be more productive than waiting in the corner that I make some typo and then report it. -- Magioladitis (talk) 15:11, 11 February 2017 (UTC)


 * I've read what you wrote many times over, but you then talked about expansions and asserted that some of them were part of this task. I'll drop this and assume you'll run your bots within the bounds of the function details and your initial promises made here (pre all this "expansion" talk), though you never have before. ~ Rob 13 Talk 15:34, 11 February 2017 (UTC)

the non-technical expansion is unrelated to the task. -- Magioladitis (talk) 15:49, 11 February 2017 (UTC)


 * I appreciate you making that clear. This leaves the question of why you responded with the statement that technical tweaks are permissible when I said the same thing earlier, but I guess that will remain a mystery to me. ~ Rob 13 Talk 15:53, 11 February 2017 (UTC)

Because you keep asking the same question in various places. -- Magioladitis (talk) 15:57, 11 February 2017 (UTC)


 * That's indicative of a need for clarity on all BRFAs, not just one. That is demanded of all bot operators, including myself. I comment on many BRFAs asking for more details. When you file 25 BRFAs in three days, all with few details, you can expect a lot of questions. That's not a function of the BRFAs being yours. It's a function of the contents of each BRFA and the quantity. ~ Rob 13 Talk 16:08, 11 February 2017 (UTC)
 * Sure. That's why I also started leaving notes for further reference. Sorry if this was confusing for you. Next time just please contact me instead of reporting to ArbCom just waiting for a mistake to prove yourself right. -- Magioladitis (talk) 16:11, 11 February 2017 (UTC)

Potential expansions in the future
These are some random notes related to the task but not exactly part of the task:

The bot does not fix these since this is not a double prefix. I wonder if another bot could check for these. Nor fixes these. I wonder if this is worth for CHECKWIKI to add a check. I fixed both manually. -- Magioladitis (talk) 09:51, 11 February 2017 (UTC)

Another one that potentially could be fixed by bots but right now I find those manually:. -- Magioladitis (talk) 09:54, 11 February 2017 (UTC)

I usually find these by inspiration or by randomly checking hundreds of pages every day. Then, I email some people. Since it seems the community wants to get more involved, I try to report these on the appropriate pages. I think that not every minor modification needs additional approval. -- Magioladitis (talk) 09:56, 11 February 2017 (UTC)

There are many links that perhaps are OK or the result of InternetArchiveBot being broken back in November 2016. <tt>deadurl=bot: unknown</tt> does not seem right to me. These need manual attention. I don't know how many there are. -- Magioladitis (talk) 09:59, 11 February 2017 (UTC)

No need for a bot to fix this:. I wonder if CHECKWIKI could still detect it. Only 1 instance. -- Magioladitis (talk) 13:47, 11 February 2017 (UTC)

No need for a bot to fix this:. I wonder if CHECKWIKI could still detect it. Only 1 instance. I recall that SmackBot was fixing those at some point. Not sure. -- Magioladitis (talk) 13:50, 11 February 2017 (UTC)

Common mistakes could potentially trick Spam filter:. -- Magioladitis (talk) 14:01, 11 February 2017 (UTC)

Trial review
Why is <tt>http</tt> preferred over <tt>https</tt>? I would say it should be the other way around. The bot doesn't actually verify links, so without further information, we should rather use the (hopefully) secure version.

Will this fix any other protocols, like <tt>ftp:</tt>?

Please define (explicitly) what an "external link" is from bot's perspective and add this to function details. This is a template parameter. This isn't even the start of an "external link". These are citation url parameters. This is plain text and not an actual link. None of these are the example given in the function details and the bot appears to simply find-replace regardless of the actual usage.

The task would be approved for only the details listed. Bad encoding detection is not a "technical issue". Such fixes need to be listed in the function details or requested as an expansion at WT:BRFA later. You have not demonstrated diligence to avoid expanding the tasks beyond the intended approval. At this point, you haven't demonstrated diligence to even provide full function details or expand them on request. — HELL KNOWZ  ▎TALK 14:36, 12 February 2017 (UTC)

You can approve the task for only these if you like. I'll make an additional BRFA after this one is approved, then. Let's go step by step. -- Magioladitis (talk) 17:06, 12 February 2017 (UTC)

I can make a different BRFA for <tt>ftp:</tt>. <tt>http</tt> is preferred over <tt>https</tt> because the former always works and there is a bot converting to https if necessary. -- Magioladitis (talk) 17:09, 12 February 2017 (UTC)


 * I'd recommend preferring whichever version is second in the malformed link. These typically arise because someone copy-pasted a full URL into a field where they already typed in "http:" or "https:", which would mean the second version is the one they used to actually access the site. I struggle to think of any case which would cause things to work the other way around. ~ Rob 13 Talk 23:13, 12 February 2017 (UTC)

OK. Here you are. I still don't understand why some people think the discussion here is more centralised than the discussion on CHECKWIKI's page or AWB's bug page. -- Magioladitis (talk) 23:19, 12 February 2017 (UTC)


 * I'm planning on approving this for the 9 exact use cases listed at the top - as to where they appear on the page, I don't see any examples of legitimate uses that need to be maintained - please comment below if I've missed something. — xaosflux  Talk 05:27, 13 February 2017 (UTC)
 * I don't think there are legitimate uses (theoretically, you could come up with something like google.com?search=http://https://, but it's unlikely). But I also expect the botop to actually confirm this and state in the function details that the bot would not attempt to determine where the "link" is located, on the assumption that it is always wrong anyway. Otherwise, this approval can be construed to only apply to mw:Help:Links. It sets a bad precedent. — HELL KNOWZ  ▎TALK 10:57, 13 February 2017 (UTC)
 * I will fix every instance of a double prefix, in every single place. A database scan says that right now there are no instances of double prefixes. The bot is exclusion compliant, though. -- Magioladitis (talk) 11:02, 13 February 2017 (UTC)
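To make the theoretical edge case above concrete, here is a minimal Python sketch that reports each doubled http/https prefix together with a flag saying whether it sits inside a query string. Nothing in this BRFA indicates that the bot performs such a check; the names <tt>DOUBLE_HTTP</tt>, <tt>TOKEN_BREAK</tt>, <tt>in_query_string</tt> and <tt>classify</tt>, and the heuristic itself (a <tt>?</tt> earlier in the same whitespace- or pipe-delimited token), are assumptions made purely for illustration.

```python
import re

# Hypothetical pattern: a doubled http/https prefix, optional single space.
DOUBLE_HTTP = re.compile(r"https?:// ?https?://")
# Assumed token delimiters: whitespace, pipes, and square/angle brackets.
TOKEN_BREAK = re.compile(r"[\s|\[\]<>]")


def in_query_string(text: str, start: int) -> bool:
    """True if a '?' appears between the start of the enclosing token and `start`."""
    token_start = 0
    for m in TOKEN_BREAK.finditer(text, 0, start):
        token_start = m.end()
    return "?" in text[token_start:start]


def classify(text: str) -> list[tuple[str, bool]]:
    """Return (matched text, is_inside_query_string) for each doubled prefix."""
    return [
        (m.group(0), in_query_string(text, m.start()))
        for m in DOUBLE_HTTP.finditer(text)
    ]


if __name__ == "__main__":
    sample = "{{cite web |url=http://http://foo.com}} and google.com?search=http://https://"
    print(classify(sample))
```

A position-agnostic find-and-replace would rewrite both matches in the sample; a check like this would only matter if a legitimate use were ever found.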

approving these is just fine. Thanks. -- Magioladitis (talk) 09:26, 13 February 2017 (UTC)


 * Task approved. To be clear, the only thing being approved here is fixing the 9 use cases listed above - these can be anywhere on a page.  Should some edge case come up that editors complain about, the task should stop until such a dispute is settled. —  xaosflux  Talk 01:52, 14 February 2017 (UTC)


 * The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.