Wikipedia:Bots/Requests for approval/KiranBOT 5


 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at Bots/Noticeboard. The result of the discussion was

KiranBOT 5
Operator:

Time filed: 16:51, Monday, January 9, 2023 (UTC)

Automatic, Supervised, or Manual: supervised

Programming language(s): AWB

Source code available: AWB's custom module using regex, will upload in my userspace soon

Function overview: mass removal of references/links (expired/hijacked domains)

Links to relevant discussions (where appropriate): special:permalink/1132589552 at WP:COIN

Edit period(s): mostly one-time runs per request (removing spammy links)

Estimated number of pages affected: around 1000 for current request

Exclusion compliant (Yes/No): No

Already has a bot flag (Yes/No): Yes

Function details: currently, pakrail.com redirects to an online casino website. It has been used in around 1170 railway-related articles. I created a regex that finds instances of pakrail.com and removes the reference.

I made around 50 edits through my alt account using that regex. Currently it removes the link if it is inside a referencing template.

There is no scope for mistakes, so I would like approval to save the edits automatically.

Currently it does not remove plain links from the "External links" section. (eg: ) I will remove these links using some other method in AWB, and I will perfect the method soon.
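For illustration, the kind of regex-based removal described above can be sketched in Python. The actual AWB custom module was not published in this BRFA, so the patterns and the helper name below are assumptions, not the operator's real code:

```python
import re

# Hypothetical sketch of the removal described above; the operator's
# actual AWB custom-module regex was not published in this BRFA.
DOMAIN = r"pakrail\.com"

# <ref>...</ref> pairs whose contents mention the domain (named or unnamed
# refs; [^>/] in the opening tag avoids matching self-closing <ref ... />).
REF_PATTERN = re.compile(
    r"<ref[^>/]*>[^<]*?" + DOMAIN + r"[^<]*?</ref>",
    re.IGNORECASE,
)

# Bare bracketed external links, e.g. [http://pakrail.com/x Some title]
EXTLINK_PATTERN = re.compile(
    r"\[https?://(?:www\.)?" + DOMAIN + r"[^\]\s]*(?:\s[^\]]*)?\]",
    re.IGNORECASE,
)

def strip_domain(wikitext: str) -> str:
    """Remove references and plain external links pointing at the domain."""
    wikitext = REF_PATTERN.sub("", wikitext)
    wikitext = EXTLINK_PATTERN.sub("", wikitext)
    return wikitext
```

As the later discussion points out, outright removal like this is usually the wrong approach for usurped domains; this sketch only mirrors what the request itself describes.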

PS: previous BRFAs were filed under the bot's old username, UsernamekiranBOT. —usernamekiran (talk) 16:54, 9 January 2023 (UTC) PPS: pakrail.com was never the official website. —usernamekiran (talk) 17:13, 9 January 2023 (UTC)

Discussion
Is there some reason you don't just let GreenC's bot (see Link rot/URL change requests) do this? * Pppery * it has begun... 16:57, 9 January 2023 (UTC)
 * Honestly speaking, it did not occur to me at the time, and that makes me feel stupid now. But now that I have the code ready, I would prefer to go with my own AWB editing. —usernamekiran (talk) 17:06, 9 January 2023 (UTC)
 * This is what we call a "JUDI" site (see WP:JUDI). There are processes already set up to deal with these; we have processed hundreds of hijacked JUDI domains. You don't want to remove all the references or links. They can be flipped to usurped in some cases, tagged with in others, etc. It's a complex process. See WP:USURPURL. Code is already in place to handle it. -  Green  C  18:18, 9 January 2023 (UTC)


 * example diffs: removal of Webarchive/wayback machine link, removal of bare ref tag, removal of cite web template. —usernamekiran (talk) 17:06, 9 January 2023 (UTC)
 * The archive URLs should not be deleted. See WP:USURPURL for how to deal with usurped domains. You want to maintain the citation as much as possible, by replacing the bad usurped URL with a good archived version. -- Green  C  18:27, 9 January 2023 (UTC)
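The approach GreenC describes, keeping the citation and marking the original URL as usurped rather than deleting it, could be sketched as follows. The `|url-status=` and `|archive-url=` parameter names are real {{cite web}} conventions, but the helper itself is a hypothetical illustration, not the code of any existing bot:

```python
import re

def mark_usurped(citation: str, domain: str = "pakrail.com") -> str:
    """If a {{cite web}} pointing at `domain` already carries an
    |archive-url=, flip |url-status= to 'usurped' instead of deleting
    the citation.  Hypothetical sketch, not GreenC's actual bot code."""
    if domain not in citation or "archive-url" not in citation:
        return citation
    if "url-status" in citation:
        # Replace whatever value |url-status= currently holds.
        return re.sub(r"\|\s*url-status\s*=\s*\w+",
                      "|url-status=usurped", citation)
    # No url-status parameter yet: add one before the closing braces.
    return citation.rstrip()[:-2].rstrip() + " |url-status=usurped}}"
```

This preserves the reader-facing citation while routing the link through the archived copy, which is the behaviour WP:USURPURL recommends.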
 * Fortunately, I had already stopped after making exactly 150 edits. But the reliability of the current source is also disputed, so I think removing that particular source would be okay. —usernamekiran (talk) 20:09, 9 January 2023 (UTC)
 * I don't see a dispute discussion in the BRFA. -- Green  C  00:29, 10 January 2023 (UTC)

In Special:Diff/1132588299 you left behind an orphaned ref. It worked out in the end, after AnomieBOT rescued it you just took care of that copy too, but it would have been better to not leave the orphan in the first place. Anomie⚔ 04:38, 10 January 2023 (UTC)
 * Yes, I updated the regex earlier so now it removes all kinds of links that I could think of/came across. Before that update, it couldn't remove plain external links, like I mentioned above in the original request. Now it does that as well. —usernamekiran (talk) 06:04, 10 January 2023 (UTC)
 * That's nice, but has nothing to do with what I said. Anomie⚔ 12:14, 10 January 2023 (UTC)
 * I apologise for the confusion. I meant that it now removes plain external links, and by the last statement, "Now it does that as well", I was referring to defined references, like the first diff you provided, where a fragment was left behind. Now it handles that format as well. —usernamekiran (talk) 12:38, 10 January 2023 (UTC)

Bot trial complete. Well, sort of. It was run using my alt; I made around 1100 edits semi-automatically, and all of these edits were okay. The only unexpected one was the one pointed out above by Anomie (I somehow missed it while making the edits), but it has now been taken care of. —usernamekiran (talk) 15:46, 10 January 2023 (UTC)
 * all the ~1100 edits. —usernamekiran (talk) 06:25, 11 January 2023 (UTC)
 * Great and all that you ran tests on your other accounts, but you can't say "trial complete" if it never went to trial. Primefac (talk) 11:48, 11 January 2023 (UTC)

BAG assistance needed I have already finished this particular task. But would it be possible to get clearance for non-controversial, non-cosmetic, non-judgement-call (non-CONTEXTBOT) one-off find-and-replace tasks? I don't come across such tasks often, but in case I do, it would be convenient to have the "auto save" option on AWB. I will test my regex thoroughly in my sandbox before every task. —usernamekiran (talk) 05:16, 25 January 2023 (UTC)
 * The question of why GreenC's bot is insufficient for this sort of task was never answered. Since this particular request has finished I am inclined to deny the request, but in the interest of there potentially being a compelling argument for having a second bot I will hold off for now. Primefac (talk) 11:19, 31 January 2023 (UTC)
 * My code only finds a particular full domain, and various links on that domain (abc.com/123, abc.com/345) inside various Wikipedia templates/formats (ref, cite, and others), whereas GreenC's code is far more versatile. In cases where only removal or find-and-replace is required, my code can be used. Other than that, it doesn't seem very useful, at least for now (in case I come across something, I might develop the code further). For now, I don't think this will be anything like GreenC's bot. However, I am still interested in getting approval for (non-controversial) one-off find-and-replace tasks, as I said in my previous comment. —usernamekiran (talk) 10:10, 1 February 2023 (UTC)
 * I added pakrail.com to WP:JUDI (Special:Diff/1127581454/1136850856), which is a queue for usurped domains; it gets done in batches.  --  Green  C  13:45, 1 February 2023 (UTC)

I am uncomfortable giving carte blanche approval for these types of bots when a) there are bots that already do "the thing" and b) there is no proven track record of similar, successful runs. While I am not necessarily holding everyone to the same standard, PrimeBOT 30 was my tenth task request to do "the thing", at which point I asked for (and received) open-ended task-running. This is your first URLREQ-style task request, and it was already completed by the time the discussion even came back around to getting to trial. This is not to say that the type of task you are asking to do cannot be done by your bot, just that this request, for this set of operating conditions, is not something that I want to approve (in other words, there is no prejudice against similar tasks being filed in the future). Primefac (talk) 07:49, 4 February 2023 (UTC)
 * The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at Bots/Noticeboard.