Wikipedia:Bots/Requests for approval/K.Kapil77 Bot


 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at Bots/Noticeboard. The result of the discussion was

K.Kapil77 Bot
Operator:

Time filed: 08:54, Wednesday, April 28, 2021 (UTC)

Function overview: This manually assisted bot corrects minor spelling mistakes on low- to mid-popularity pages, based on high-confidence heuristics.

Automatic, Supervised, or Manual: Manual

Programming language(s): Python

Source code available: Will be made available after approval and testing

Links to relevant discussions (where appropriate):

Edit period(s): daily

Estimated number of pages affected: 10-15 per day

Namespace(s): Articles

Exclusion compliant (Yes/No): Yes

Function details: We have generated a dataset of minor to larger spelling mistakes from a Wikipedia dump, based on heuristics such as edit distance, popularity, and frequency of use. We manually verify about 10-15 such changes every day on our own portal at https://www.iitg.ac.in/cseweb/WikiFeedback. The bot will apply those verified changes from our SQL server to Wikipedia.
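The operator did not publish the heuristics, so the following is only a minimal sketch of how an edit-distance and frequency heuristic of the kind described could flag typo candidates. The function names, the thresholds (`max_distance`, `min_ratio`) and the sample word counts are all illustrative assumptions, not the bot's actual method.

```python
from collections import Counter


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def typo_candidates(counts, max_distance=2, min_ratio=50):
    """Pair each rare word with a far more frequent word a few edits away.

    Thresholds are hypothetical; a real dataset would need tuning and,
    as the discussion stresses, human review of every pair.
    """
    candidates = []
    for rare, rare_n in counts.items():
        for common, common_n in counts.items():
            if (common_n >= min_ratio * rare_n
                    and edit_distance(rare, common) <= max_distance):
                candidates.append((rare, common))
    return candidates


# Made-up word frequencies standing in for counts taken from a dump.
counts = Counter({"received": 1000, "recieved": 4,
                  "government": 800, "goverment": 3})
print(typo_candidates(counts))
```

The pairwise loop is quadratic in vocabulary size, so it is only suitable for illustration. Candidates produced this way would still go through manual verification before any edit is made.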

Discussion
Hi there. Interesting project. For the purposes of approval, I'm afraid it would not be responsible for us to approve a closed-source bot that also requires registration to use, and could violate WP:CONTEXTBOT unless all the edits are supervised. But it appears you might be affiliated with one of the Indian Institutes of Technology (judging by the domain of your site) so I'm not going to decline this BRFA yet, as your project may be promising, but the function will need tweaking to meet our standards for bots. If we did approve a bot like this we need to have a high level of confidence that it isn't going to make problematic changes (see WP:CONTEXTBOT). You do mention that you'll manually verify 10-15 a day. The bot could be coded as a tool using OAuth to make the edits on a user account directly (see WP:BOTMULTIOP). A bot account isn't usually needed to make a tool like this. All the edits would have to be manually verified by a human before going live on an article to make sure that they're okay. ProcrastinatingReader (talk) 14:05, 28 April 2021 (UTC)
 * Hello ProcrastinatingReader, yes, I am a research student at one of the Indian Institutes of Technology. Sorry, I guess it wasn't clear from my previous writing that we have a high-confidence dataset which will also be voted upon by many users from our research group on the portal I mentioned above. Based on user voting (expected 10-15 votes per day per user), our heuristics, and recent revisions on similar pages on Wikipedia, we would want our bot to make changes, keep tracking whether a human user reverts the changes made, and notify us as well as our voters about those revisions. I couldn't describe the complete heuristics for generating the dataset since it is a live project. Also, since all changes are being verified manually, I believe the chance of problematic changes is quite low. If the non-availability of the source code is a deal-breaker, I can make it available only after permission from our group. Please do let me know what can be done to get this bot live. K.Kapil77 (talk) 18:29, 28 April 2021 (UTC)
 * Source code is not required by default to be published, but it's encouraged, and for some tasks it can be preferable to have others look over it.
 * You can code your algorithm to generate diffs for a change however you like of course (eg using user voting or analysis of revisions). The key for the purposes of this bot is that you need a human to look over the change and make sure it's okay before it actually goes live on an article. If you're looking to move into unsupervised territory, this could maybe be re-evaluated if there's a track record of generated changes being accurate, but in the first instance I don't think a bot could be approved for unsupervised editing in this manner.
 * As for the 'checking over' part, this is referred to as semi-automated editing. You can use OAuth (see OAuth/For Developers) to directly make the edits on the human editor's Wikipedia account, so it wouldn't necessarily need to go via a bot account and this method wouldn't really require any sort of approval. ProcrastinatingReader (talk) 18:45, 28 April 2021 (UTC)
 * They have already said all edits are manually vetted, so I don't think CONTEXTBOT is applicable. There exist a lot of bots whose edits are manually supervised. Making it an OAuth tool is added complication which is useful only when the creators want to crowdsource vetting of the edits (here they indicate they'll do the vetting themselves). So all in all I don't see any issues here. Having a quick trial will likely help further review. – SD0001  (talk) 12:42, 29 April 2021 (UTC)
 * Hmm. I guess if these conditions (all edits manually verified by the same editor) will not change in the future it won't actively be a problem. Do you want to take it through trial? ProcrastinatingReader (talk) 12:52, 29 April 2021 (UTC)
 * Please don't mark edits as minor for the trial (so that they show up in watchlists and RecentChanges). Let us know if you run into any issues. – SD0001  (talk) 13:07, 29 April 2021 (UTC)
 * Hello! Please consider my edits from User:K.Kapil77 here: Special:Contributions/K.Kapil77. Most of the edits were made on 28 and 29 April 2021. No edit has been reverted yet, which serves as my proof of work. I have also made an edit to my sandbox from the bot account to show that the bot code works: User:K.Kapil77_Bot/sandbox (I'm using pywikibot). K.Kapil77 (talk) 06:58, 25 May 2021 (UTC)
 * It's not at all clear which are the edits the bot would make and which are your personal edits, which is why bot trials should be done via the bot account. I see a lot of errors like these:
 * not a typo; See WP:ENGVAR; article has Use British English up at the top.
 * was originally not wrong at all; the edit made it wrong
 * doesn't look like a spelling fix
 * Also, the piping removals in some of these edits don't look appropriate for a bot to do. It seems the original pipings were editorial decisions to make the display text different from the article name, which may or may not be necessary – but that's not for a bot to decide. –  SD0001  (talk) 13:21, 25 May 2021 (UTC)
 * All edits by our bot are monitored and are reflected on the page only after manual confirmation (please read the bot description above). Each edit is effectively a spelling error identified by the bot and approved by a human. The types of edits our bot makes are: spelling correction, proper reference, and correcting improper reference. These edits are based on sound heuristics and do not introduce a mistake, even where they do not completely resolve one. The piping removal you mentioned may simply be because the original entity page title is a better display name than the piped one; this was also manually verified. I don't see any issue with the bot's performance as such. K.Kapil77 Bot (talk) 15:38, 25 May 2021 (UTC)
 * Please only make unambiguous spell corrections. Changing Varkari to Warkari isn't that, and even falls afoul of WP:DONOTFIXIT since it's not a misspelling – Asian names/words can have different acceptable romanisations. Regarding the 2nd edit ("Proper Reference") it changed  to   on an article  that's on Bombay so it's obvious which IIT we're talking about.  added a link where one didn't exist before and wasn't necessary (see WP:OVERLINK). We can't let a bot-flagged account make edits like this. –  SD0001  (talk) 13:11, 26 May 2021 (UTC)
 * Alright! I'll stop making these types of changes and push only spelling corrections such as typos. I'll make the new edits via the bot account and update here by tomorrow. K.Kapil77 Bot (talk) 13:54, 26 May 2021 (UTC)
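The restriction agreed in this thread (plain typos only, no proper nouns, no spelling-variety changes) could be approximated by a pre-filter along the following lines. This is a rough sketch under stated assumptions: `is_unambiguous_typo_fix` and the tiny `ENGVAR_PAIRS` table are hypothetical, and a capitalisation check is only a crude stand-in for proper-noun detection. It would narrow the candidates shown to a human reviewer, not replace the review.

```python
# Hypothetical, deliberately tiny table of accepted spelling variants
# (see WP:ENGVAR); a real list would be far longer.
ENGVAR_PAIRS = {
    frozenset(pair)
    for pair in [("colour", "color"),
                 ("organise", "organize"),
                 ("centre", "center")]
}


def is_unambiguous_typo_fix(original: str, replacement: str) -> bool:
    """Return True only for changes that look like plain lowercase typo fixes."""
    # Capitalised words may be proper nouns or romanised names
    # (e.g. Varkari/Warkari), so leave them to human editors.
    if original[:1].isupper() or replacement[:1].isupper():
        return False
    # Do not "fix" one accepted national spelling into another.
    if frozenset((original.lower(), replacement.lower())) in ENGVAR_PAIRS:
        return False
    # Dates, ranges, piped links and multi-word spans are out of scope.
    if not (original.isalpha() and replacement.isalpha()):
        return False
    return True


print(is_unambiguous_typo_fix("recieved", "received"))
print(is_unambiguous_typo_fix("Varkari", "Warkari"))
```

A filter like this is conservative by design: it rejects anything it cannot classify, and a word at the start of a sentence will be skipped even when it is a genuine typo.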


 * Hi, please check my latest edits at Special:Contributions/K.Kapil77 Bot and let me know whether these kinds of edits are allowed for my bot. If so, please approve the bot. K.Kapil77 Bot (talk) 10:20, 8 June 2021 (UTC)
 * No. changes a MOS:DATERANGE-compliant date to a non-compliant form. I don't have the know-how to tell whether,  are net-positives; they do look like, but surely a bot going around making such edits is going to generate controversy when errors arise. These are more appropriately done via a human account. I don't see any edits here that are fixing spelling mistakes in words that aren't proper nouns. –  SD0001  (talk) 15:14, 10 June 2021 (UTC)

per above. – SD0001  (talk) 15:28, 10 June 2021 (UTC)
 * The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at Bots/Noticeboard.