Wikipedia:Bots/Requests for approval/RaBOTnik

RaBOTnik

 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Request Expired.

Operator:

Time filed: 21:56, Thursday September 12, 2013 (UTC)

Automatic, Supervised, or Manual: Automatic

Programming language(s): PHP

Source code available:

Function overview: Its main purpose is to place stress marks on Russian names in the leads of articles about Russian people or places. For example, it will replace  with. It will maintain stats on stress usage and top names that need to be addressed.

Links to relevant discussions (where appropriate):

Edit period(s): It will do an initial run through all articles that call lang-ru. Then it will run every once in a while to improve new articles or if new words are added to its database.

Estimated number of pages affected: 37,000

Exclusion compliant (Yes/No): Yes

Already has a bot flag (Yes/No):

Function details: We will go through an offline database dump and generate a list of all the names that appear in calls to lang-ru. This will allow us to identify the most common names. Once the bot is live, it will go through all articles on enwiki that call the lang-ru template and add stress marks where they belong. If it can't find a name in the database, it might look for it on ruwiki (most ruwiki articles with uncommon names in the title show the stress marks at the first appearance of the title in the lead). The bot will also keep stats on the most common names whose stress it doesn't know, so we can consider adding them to the database. We will schedule further runs after adding a significant number of names to the database, or when enough time has passed that there are many new articles.
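The steps in the function details can be sketched roughly as follows. This is a minimal illustration in Python, not the bot's actual PHP code, which is not shown on this page. The function names, the regular expressions, and the shape of the stress database (a mapping from unstressed to stressed spellings) are all assumptions made for the example, and the template matching is deliberately simplified (no nested templates or named parameters).

```python
import re
from collections import Counter

ACUTE = "\u0301"  # combining acute accent, the usual way to mark Russian stress

# Hypothetical, simplified matcher for the first parameter of {{lang-ru|...}}.
LANG_RU = re.compile(r"(\{\{\s*lang-ru\s*\|)([^{}|]*)", re.IGNORECASE)
# A run of Cyrillic letters, possibly already carrying stress marks.
WORD = re.compile("[А-Яа-яЁё\u0301]+")


def names_in_lang_ru(wikitext):
    """Return every Cyrillic word found inside lang-ru calls."""
    words = []
    for match in LANG_RU.finditer(wikitext):
        words.extend(WORD.findall(match.group(2)))
    return words


def count_names(pages):
    """Count words (with any stress marks stripped) across page texts."""
    counts = Counter()
    for text in pages:
        for word in names_in_lang_ru(text):
            counts[word.replace(ACUTE, "")] += 1
    return counts


def add_stress(wikitext, stress_db):
    """Stress known words inside lang-ru calls.

    Returns the new text and the list of words missing from stress_db,
    which could feed the "names we don't know yet" statistics.
    """
    unknown = []

    def fix_word(m):
        word = m.group(0)
        if ACUTE in word:          # already stressed: leave untouched
            return word
        if word in stress_db:
            return stress_db[word]
        unknown.append(word)
        return word

    def fix_call(m):
        return m.group(1) + WORD.sub(fix_word, m.group(2))

    return LANG_RU.sub(fix_call, wikitext), unknown
```

Text outside lang-ru calls is never touched, and words that already carry a stress mark are left alone, so running the sketch twice on the same page makes no further changes.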

Discussion
What offline dump does the bot go through -- English, Russian, all of them? Does it only use template usage or anywhere else in text? And if it only sees stresses used once, does it still consider that a reliable hit? Is there perhaps a Russian dictionary or some place you can look these up more reliably? Also, when you say "We will ..", who is "we", is this a case of WP:BOTMULTIOP? P.S. Nice pun, I'm kind of willing to overlook WP:BOTACC "The account's name should identify the operator or bot function." — HELL KNOWZ  ▎TALK 22:12, 12 September 2013 (UTC)


 * Wow, that was a very quick answer, thank you, I appreciate it. I'll try to answer your questions:


 * What offline dump does the bot go through -- English, Russian, all of them? The bot won't go through the offline dump. That is a "task zero" that we will do. We will go through the offline dump of enwiki, and make a list of all the words that appear inside of calls to lang-ru. From this we will build the initial list of common names and their stress information. And that is the list that the bot will start with.Azylber (talk) 22:39, 12 September 2013 (UTC)


 * Does it only use template usage or anywhere else in text? Initially we will only process text inside calls to lang-ru. I also think that a future task to add to the bot would be to search for articles that contain (Russian: xxxx) in the lead, and standardise them into calls to lang-ru. Azylber (talk) 22:41, 12 September 2013 (UTC)


 * And if it only sees stresses used once, does it still consider that a reliable hit? Is there perhaps a Russian dictionary or some place you can look these up more reliably? We can't use a dictionary because we're not dealing with words, we're dealing with names. Fortunately, however, and answering the first bit of your question, Russian names are fairly standard and they're always stressed in the same place, so it is reliable. If the bot ever comes across a word which is stressed differently in different articles, it will add it to a list of things for a human to look at. Azylber (talk) 22:51, 12 September 2013 (UTC)
 * "Dictionary" was probably the wrong word here, I rather meant "some place"(s) where lists of names are collected. — HELL KNOWZ  ▎TALK 08:57, 13 September 2013 (UTC)
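The human-review rule described in the answer above (a word stressed differently in different articles goes on a list for a person to check) could look something like this. It is only a sketch under assumptions: the function name and the two returned structures are hypothetical, and the input is taken to be a flat list of words already collected from lang-ru calls.

```python
from collections import defaultdict

ACUTE = "\u0301"  # combining acute accent used as the stress mark


def build_stress_db(observed_words):
    """Split observed stressed spellings into a usable database and a review list.

    A word seen with exactly one stressed spelling goes into stress_db.
    A word seen with several different spellings goes into needs_review
    so that a human can decide which one is right.
    """
    variants = defaultdict(set)
    for word in observed_words:
        if ACUTE in word:  # unstressed occurrences carry no information
            variants[word.replace(ACUTE, "")].add(word)

    stress_db = {}
    needs_review = {}
    for plain, forms in variants.items():
        if len(forms) == 1:
            stress_db[plain] = next(iter(forms))
        else:
            needs_review[plain] = sorted(forms)
    return stress_db, needs_review
```

Note that this sketch accepts a spelling that was seen only once; whether a single occurrence should count as reliable is exactly the question raised in the discussion, so a stricter version might require a minimum number of sightings.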


 * Also, when you say "We will ..", who is "we", is this a case of WP:BOTMULTIOP? Wikipedia is a collaborative project. We all work together to document human knowledge. In this spirit, several editors have been kind enough to offer their help. For example, []. So it is not a case of BOTMULTIOP, as I will be the only operator, but others will help to maintain the database. That's also the reason that I don't want to put my name in the bot, I feel it would be unfair, because even though it was my idea, I have been and I will continue to be helped by other editors. Besides, the name is brilliant! Azylber (talk) 22:51, 12 September 2013 (UTC)

Since I don't see any place in our guidelines that address stress marks specifically, it might be best to drop a note at Wikipedia talk:WikiProject Russia and/or Manual of Style/Russia-related articles.

Please use a descriptive edit summary for edits and make or redirect bot's talk page to yours. It would be helpful if you could specify in a log, where exactly you found the stress mark usage for the edits and ideally make a note in the edit summary. — HELL KNOWZ  ▎TALK 08:57, 13 September 2013 (UTC)


 * Thank you very much for trusting us. It will take a few days, as we haven't written the code yet. We'll do the trial and show you the results. Azylber (talk) 04:41, 14 September 2013 (UTC)
 * Update on this:
 * task zero: I've already compiled the content of all present calls to lang-ru
 * bot: I've already got basic bot code that can connect to MediaWiki. Now it's just a case of implementing the functionality of task 1.
 * Azylber (talk) 00:00, 19 September 2013 (UTC)

Any updates in over a month? Hasteur (talk) 14:44, 25 October 2013 (UTC)
 * Hi Hasteur, thanks for getting in touch. We've been working on human verification of all the names that the bot is going to replace. You can see what we've been up to here and here. We want to make sure that there are no errors in the data that we're feeding to the bot. The lists we're working on are quite long, so it'll take a while. If you have any other concerns, please let me know. Cheers, Azylber (talk) 14:57, 25 October 2013 (UTC)
 * The most recent updates to those pages were 26 September 2013 and 18 November 2013. Do you still intend to pursue this BRfA? Josh Parris 05:02, 6 December 2013 (UTC)
 * Hi Josh, yes I do. Unfortunately the stuff I'm waiting for doesn't depend on me. Perhaps what I need to do is find a couple more Russian people to help with Question2. I'll ask Ezhiki and Ymblanter if they can suggest someone. By the way, there have been recent changes to Question2; Ymblanter added a few more a few days ago. I will chase. Azylber (talk) 12:52, 6 December 2013 (UTC)
 * Update: now I've got all the information that I need. I will work on this during the Christmas/New Year break. Azylber (talk) 03:17, 19 December 2013 (UTC)
 * That's excellent news. Josh Parris 03:20, 19 December 2013 (UTC)


 * Hi, I've finished compiling the list of words, and I've almost finished writing the bot. Now I'm ready to start the real tests on the BOT's sandbox, so it's all good news!
 * I need help with a little thing. I've put the bot on a shared hosting account (hostmonster) and apparently the IP of the server is blocked, because someone at some point in the past tried to abuse Wikipedia from an account with that same hosting company.
 * So would anyone here please be able to place an IP block exemption flag on RaBOTnik?
 * Thanks! Azylber (talk) 20:24, 25 December 2013 (UTC)
 * Update: the bot's got the IP flag now, and now it works! Will do the trial soon. Azylber (talk) 20:56, 25 December 2013 (UTC)


 * Hi all! I've got very good news to share with you: I've allowed RaBOTnik to edit 50 articles at random and I've checked all 50 diffs and they're all flawless! You can check them here. Any comments? Azylber (talk) 19:23, 26 December 2013 (UTC)
 * Update: It's been more than 2 days since the 50 edits in article space, and so far:
 * several of the articles have already been edited by other people
 * nobody has reverted any edits done by the bot
 * nobody has undone any changes done by the bot
 * nobody has complained about the bot
 * In addition to this, myself and other people have gone through all 50 edits and they're all fine.
 * Therefore, could the bot please be approved?
 * Thanks, Azylber (talk) 19:46, 28 December 2013 (UTC)
 * Update: 4 and a half days, still no problems whatsoever :)
 * Any feedback? Azylber (talk) 06:18, 31 December 2013 (UTC)

That is a lot of test pages you have created in your userspace. The policy calls for a reasonable amount for testing, not over a hundred of them. Surely, you only need one or two pages to verify that the bot edits it correctly. The rest are just different texts you could most certainly test offline. Besides failing to attribute where you got the text from, you are using non-free images, filling up categories, etc.

In any case, there still isn't any guideline for stress marks or any discussion of doing this automatically (as far as I know), so have you posted somewhere about mass adding stress marks as I asked before? — HELL KNOWZ  ▎TALK 12:30, 31 December 2013 (UTC)


 * Hi Hellknowz, and thank you very much for your feedback.
 * In reply to your points:
 * That is a lot of test pages you have created in your userspace. The policy calls for a reasonable amount for testing, not over a hundred of them.
 * It's important to do a lot of testing, because the most dangerous problems are always the ones that you're not aware that you're not aware of. Extensive testing is the key to success.


 * Surely, you only need one or two pages to verify that the bot edits it correctly. The rest are just different texts you could most certainly test offline. Besides failing to attribute where you got the text from, you are using non-free images, filling up categories, etc.
 * I did a lot of testing offline, and then I also did a lot of testing online. The more testing, the better. You never know what could go wrong. I think it would have been very irresponsible to have the bot do those 50 edits in article space before doing extensive sandbox testing. And because of all the testing we did, the behaviour of the bot in article space is flawless.
 * Also, all those test pages can now be deleted, as we don't need them anymore. End of the "problem".


 * In any case, there still isn't any guideline for stress marks or any discussion of doing this automatically (as far as I know), so have you posted somewhere about mass adding stress marks as I asked before?
 * This has been discussed extensively. For example here


 * Cheers, Azylber (talk) 16:06, 31 December 2013 (UTC)


 * I'm well aware what testing means and you cannot convince us your code needed over 100 pages and over 200 online edits to know that it is able to correctly find/replace text, no matter how complicated your code is. I don't think I recall a single bot that needed this many. This isn't even about the number, this is about you showing competence and understanding of WP:BOTPOL and WP:BOTAPPROVAL and what is expected of bot operators. With your reply, it sounds like you will not hesitate to carry out such testing again.


 * Perhaps I have a different understanding of "discussed extensively", but you linked 1 discussion on a local project originally asking to remove stress marks. It has no mention of systematically adding them, changing articles en masse, doing this by bot, changing only individual words instead of the full title/phrase, or adding stress mark usage into a guideline. I did ask you to publicize this BRFA as soon as you answered my technical issues that might come up in such a discussion, but it doesn't look like you have (and there has been plenty of time). I hate to be blunt at this point after all your work, but per WP:BOTREQUIRE please show us how the bot "performs only tasks for which there is consensus" and "carefully adheres to relevant policies and guidelines". — HELL KNOWZ  ▎TALK 16:44, 31 December 2013 (UTC)
 * What would be the best way to proceed? --Ymblanter (talk) 17:21, 5 January 2014 (UTC)
 * I would like to know that as well. Besides, I am somewhat (OK, quite a bit) confused by what the size of a testing sample (which I wholeheartedly agree is commensurate with the complexity of the task, although perhaps not implemented in the most efficient manner) has to do with approving or not approving a bot job which no longer needs those pages nor will need to re-create them at any point in the future? And as far as consensus goes, the link provided is to an extensive discussion on a WikiProject within the scope of which all articles affected by this bot fall. No solid reason for not using the stress marks was provided in that thread; in fact, some of the opposers (including, to some extent, yours truly) were convinced that showing stress marks is more beneficial than not showing/removing them. If you prefer that a wider audience looks at the practice, I understand, but in that case could you please clarify where else you would like to see this discussion take place? Thanks.—Ëzhiki (Igels Hérissonovich Ïzhakoff-Amursky) • (yo?); January 6, 2014 ; 18:10 (UTC)
 * Testing sample "size" of this many new pages was completely unnecessary. 2-3 pages would have been enough for almost any bot to read/write over multiple times, not over a hundred (just look at how many other bot and even editor edits those pages now have because they look like articles). You can fit 100s of cases on a single page. They weren't even different cases, just slight variations of the same. I mentioned this but the operator hasn't done anything to delete them or even said that they will fix the issues. I understand they may not have realized this or not have experience programming/testing on wiki, but I want them to realize this and not try to convince BAG otherwise.
 * I asked for additional discussion way back at the start. We have had plenty of bot problems with tasks that don't have any guidelines and aren't opposed, but don't have explicit consensus either. Not opposing the task is not the same as supporting the task to be done by bot en masse. This is why I am asking for some sort of discussion or at least an unopposed note that a bot will add stress marks to thousands of articles. The linked discussion is a local project with 5 people against removing stress marks, a couple of people for and a couple of people borderline. I agree arguments to remove stress marks aren't very strong and arguments to keep them are much stronger. But that's not a proposal to add stress marks wiki-wide, it's a failed proposal to remove them. If I approve it and someone later comes and asks "show me consensus to do this task by bot" I can't. I would have been happy to approve it if the operator had at least left a note somewhere like village pump, but besides saying "discussed extensively", I don't see anything else (besides your talk pages). —  HELL KNOWZ  ▎TALK 20:25, 6 January 2014 (UTC)
 * Hi Hellknowz, thanks for your clarifications. I appreciate you're genuinely concerned, and as a reader of Wikipedia I want to thank you for taking so much care of it. If it wasn't for people like you, Wikipedia would be a complete mess.
 * I also understand your point about the test pages, and I will request their deletion. Several of the people who've worked in the making of this bot are admins, so I can ask them for help with deleting the test pages.
 * As regards approving the bot, could I please ask you to take into account that the bot edited 50 proper articles in article space (including several high-profile ones), and it's already been 11 days, and so far:
 * most of the articles have already been edited by dozens of editors after the bot edited them
 * nobody has reverted any edits done by the bot
 * nobody has undone any changes done by the bot
 * nobody has complained about any of the edits made by the bot
 * I honestly believe that this is the strongest evidence we could ever wish for. Could you please approve the bot?
 * Thanks, Azylber (talk) 23:33, 6 January 2014 (UTC)
 * "I honestly believe that this is the strongest evidence we could ever wish for." It is a good indicator of it, but no -- 50 pages is not an indication of site-wide consensus. We've approved bots with no opposition to trial edits, even with hundreds of pages. Then suddenly they end up at ANI. I'm exaggerating, of course. But you have had more than 4 months since I mentioned more discussion, why are you so reluctant to advertise this, rather than convincing me it's not needed? Trial edits are in part just something you can show to people and say -- this is how it will perform. Completing trial doesn't mean we have to approve right away. (Maybe another BAG member is willing to approve at current state, but I'm afraid I'm not.) — HELL KNOWZ  ▎TALK 12:09, 7 January 2014 (UTC)
 * I'm sorry Hellknowz, but you're confusing me now. Why do you want this task discussed at the Village Pump? The task in question has already been discussed at WP:Russia. Why would we want to take it to the Village Pump? Are you serious?
 * And in fact you assessed the consensus at WP:Russia yourself. You yourself told us that you'd seen a voting majority + stronger arguments. So then what is the problem?
 * I'm trying my best to see your role in a good light, but I don't understand your motivations. Could you please look at it again and reconsider. Thanks. Azylber (talk) 17:11, 7 January 2014 (UTC)
 * I am really wondering why something we ask of every bot is such an issue here? We either need an uncontroversial or minor task (doesn't apply to 30k+ pages), a policy/guideline (in this case, there isn't one) or a properly advertised and formulated proposal/discussion (for which I asked very early on). The one at WP:Russia was not about adding stress marks site-wide by bot, it was a local discussion about removing them. I already said I can summarize there was no consensus to remove them at the project level. But I cannot conclude there was consensus to add them site-wide by bot. I've explained this several times, I don't know how else to put it. Not being black doesn't necessarily mean it's white. — HELL KNOWZ  ▎TALK 18:28, 7 January 2014 (UTC)
 * I am afraid we are not moving, so let me formulate it differently. If we open a new discussion at WP:Russia specifically pointing out that this is about adding the stress in the articles, wait two weeks and close, would this be good enough? An RfC template can be added if this is essential. (Actually, I think nobody cares). I have never seen such requests at village pumps, but, well, if going to a village pump would be a necessary condition for bot approval, I can do it.--Ymblanter (talk) 17:31, 7 January 2014 (UTC)
 * That would be fine. Also, lots of bots and tasks have been advertised at VP when there wasn't an existing consensus/discussion/guideline. I could even do it myself, I just don't understand why the bot operator hasn't and won't do this? I'm just being told to approve it, which makes me very uneasy. I am, after all, quoting directly from WP:BOTPOL and haven't asked anything we haven't asked dozens of times before. — HELL KNOWZ  ▎TALK 18:28, 7 January 2014 (UTC)


 * This task clearly requires a wider discussion. I'm inclined to agree with HellKnowz.— cyberpower Online Happy 2014 22:39, 7 January 2014 (UTC)
 * Comment Discussion was opened one day after Cyberpower's comment and can be found there. 46.107.88.236 (talk) 16:52, 24 January 2014 (UTC)
 * So it looks like Wikipedia_talk:RUSSIA passed a couple weeks ago. I'm assuming the operator still wishes to continue with this request? Pinging ( & : is this in line with our concerns? -- slakr \ talk / 06:56, 2 April 2014 (UTC)


 * Did the RfC or any meantime discussion lead to any changes in the bot's behavior or is everything still as was in the trial? — HELL KNOWZ  ▎TALK 10:27, 2 April 2014 (UTC)


 * Can you please respond to this BRFA? If we don't hear anything back I will close this BRFA as "Expired" 7 days from today. We can't let random BRFAs dangle out here in perpetuity. Hasteur (talk) 12:52, 16 May 2014 (UTC)
 * Azylber hasn't been online since January. An RfC, which was requested earlier on this page, has in the meanwhile run its course and closed as a "consensus in favor" (of running the bot in its proposed form with no further modifications). With that in mind, is it possible to close this request as "approved", even as the operator is missing? My hope is that he'll eventually return. Thanks.—Ëzhiki (Igels Hérissonovich Ïzhakoff-Amursky) • (yo?); May 16, 2014 ; 13:08 (UTC)
 * It's common practice that BRFAs aren't approved without positive contact and agreement from the nominated bot operator. Closing this down as the operator in question hasn't responded. Azylber can re-file once they come back and make progress, but it's my understanding that this request is dead in the water
 * The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.