Wikipedia:Bots/Requests for approval/Yobot 19


 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Approved.

Yobot 19
Operator: Magioladitis

Automatic or Manually assisted: Automatic, supervised for most of the edits

Programming language(s): AWB + KingbotK plugin

Source code available: Yes

Function overview:
 * Add listas in talk pages of biography articles without listas parameter

Links to relevant discussions (where appropriate):

Edit period(s): One-off to clean the backlog, and regularly in the future.

Estimated number of pages affected: 61k

Exclusion compliant (Y/N): Y

Already has a bot flag (Y/N): Y

Function details: Yobot, using a custom AWB module, will run on Category:Biography articles without listas parameter to add listas to all pages; at the same time it will make the various talk page fixes performed by WP:AWB, as described in BRFA 17.

Yobot already works with living, class, priority and the work-groups. This completes Yobot's work on WikiProject Biography.

JimCubb, who has worked with listas for a long time, agrees with the idea of a bot doing the job in a similar way to how we add DEFAULTSORT in article pages, and in fact he was the one to suggest it. Check Template_talk:WikiProjectBannerShell.

Details of how AWB's function works can be found in GENFIXES. I'll be using the function Tools.MakeHumanCatKey. -- Magioladitis (talk) 18:40, 8 January 2011 (UTC)
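For readers unfamiliar with what a generated human sort key looks like, here is a minimal sketch of the generic "Surname, Given names" rule. It is not the code of Tools.MakeHumanCatKey, which handles many more special cases; the function names `strip_diacritics` and `naive_sort_key` are hypothetical and exist only for this illustration.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Drop combining marks, so that 'Çelebi' becomes 'Celebi'."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def naive_sort_key(title: str) -> str:
    """Apply the generic rule: treat the last word of the title as the surname.

    Assumption for this sketch only: no special cases are handled at all.
    """
    # Remove a trailing disambiguator such as ' (singer)'.
    name = title.split(" (", 1)[0].strip()
    words = strip_diacritics(name).split()
    if len(words) < 2:
        return " ".join(words)
    return f"{words[-1]}, {' '.join(words[:-1])}"
```

Under this naive rule, "Süleyman Çelebi" yields "Celebi, Suleyman", "Mao Zedong" yields "Zedong, Mao", and "Ludwig Mies van der Rohe" yields "Rohe, Ludwig Mies van der". Any rule of this kind needs exceptions or human review for names that do not follow the "GivenName Surname" pattern.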

Discussion
Here is a little more information. Almost two years ago, in March 2009, Matt developed ListasBot which, in its final incarnation, looked for a sort value in all the usual places on a biography (DEFAULTSORT, lifetime, BD and pipes on category tags) and placed the best one as the value for listas in the Biog banner. It worked very well. The population of Category:Biography articles without listas parameter went from over 377,000 to under 100,000. At one point there were fewer than 40,000 pages in the category. The category now has just over 61,000 pages in it.

I was on an unintended but needed semi-break for a few months in 2010. When I returned to the task in late November I was appalled at how many articles had valid sort values in DEFAULTSORT but nothing in listas. I have contacted Matt to ask him to run ListasBot, with no response. (ListasBot has not run since May 2010.)

Mag's bot has been known to make errors. However, unlike many bot owners and most editors, Mag acknowledges his errors and corrects them.

I hope this request is approved, and approved quickly, so that my favorite problem category will consist only of the really difficult cases and number fewer than 1,000 pages at most. I hope to live to see the day that Category:Biography articles without listas parameter has the warning tag for admins not to delete it if it is empty.

JimCubb (talk) 00:23, 10 January 2011 (UTC)
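The lookup order described above (DEFAULTSORT, then lifetime/BD, then pipes on category tags) can be sketched roughly as follows. This is not ListasBot's code; the function name `find_existing_sort_key` is hypothetical, and the sketch assumes the sort key is the third unnamed parameter of lifetime/BD and that category pipes are trusted only when they all agree.

```python
import re
from typing import Optional

DEFAULTSORT_RE = re.compile(
    r"\{\{\s*DEFAULTSORT\s*[:|]\s*([^{}]+?)\s*\}\}", re.IGNORECASE
)
LIFETIME_RE = re.compile(r"\{\{\s*(?:lifetime|BD)\s*\|([^{}]*)\}\}", re.IGNORECASE)
CATEGORY_PIPE_RE = re.compile(
    r"\[\[\s*Category\s*:[^\]|]+\|([^\]]+)\]\]", re.IGNORECASE
)


def find_existing_sort_key(wikitext: str) -> Optional[str]:
    """Return a sort key already present in the article text, or None."""
    match = DEFAULTSORT_RE.search(wikitext)
    if match:
        return match.group(1).strip()

    match = LIFETIME_RE.search(wikitext)
    if match:
        params = [p.strip() for p in match.group(1).split("|")]
        # Assumed layout: {{lifetime|birth year|death year|sort key}}
        if len(params) >= 3 and params[2]:
            return params[2]

    # Use category pipes only if every piped key is identical.
    keys = {key.strip() for key in CATEGORY_PIPE_RE.findall(wikitext)}
    keys -= {"", "*"}
    if len(keys) == 1:
        return keys.pop()
    return None
```

Returning None when nothing is found, or when the category pipes disagree, leaves the page for a human rather than guessing a value.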


 * Comment I've added or fixed over 45,000 listas parameters, and probably nearly as many DEFAULTSORTs. Doing so many of these, I've definitely considered getting a bot account, but I've always rejected the idea because I would not be comfortable with the inevitable errors. I feel that it's much better to have no sort key than an incorrect one. M AN d ARAX  •  XAЯA b ИA M  21:48, 10 January 2011 (UTC)
 * I'll fix all musical groups first to avoid false positives. Moreover, JimCubb and I will both supervise the edits. In any case, I encourage you to use my code in manual mode to reduce the backlog. -- Magioladitis (talk) 22:38, 10 January 2011 (UTC)
 * Moreover, we already trust Rjw's bot to add persondata to hundreds of thousands of articles, and that task includes human cat key generation. We also trust many other bots to add defaultsort. The logic behind HumanCatKey has improved a lot in the last year. Finally, the task was already done by another bot in the past! -- Magioladitis (talk) 00:09, 11 January 2011 (UTC)

I hope that the bot, if approved, will search first for a sort value on the article page before attempting to devise one. Unless the logic has improved drastically since the last time I saw it tried, the logic completely fails on non-Western names. Some values require a bit of outside research to determine. JimCubb (talk) 21:46, 11 January 2011 (UTC)
 * If you know of any bad examples, please report them. Using the same method Rjw's bot added 200,000 persondata in article space. -- Magioladitis (talk) 23:43, 11 January 2011 (UTC)
 * After just a quick look through some Chinese names, I came across four pages for which Yobot added incorrect sort keys (,, , and ), as well as others by other AWB users. See WP:SUR. Pages with "irregular" sort keys include those for Chinese, Icelandic, and Arabic names, as well as many other special cases. (I know that AWB handles many special cases, including some Arabic names.) Some difficult ones are those involving a compound surname such as Ludwig Mies van der Rohe; the only way to determine the correct listas (in this case, Mies Van Der Rohe, Ludwig) is to check the article's DEFAULTSORT and hope that a knowledgeable person had added the correct one, or in some cases to see how the name is used in the article. I would expect Category:Biography articles without listas parameter to include a higher percentage of pages requiring special attention than in the general article population, because people often omit the sort key when they're unsure of the correct one in difficult cases.  M AN d ARAX  •  XAЯA b ИA M  02:54, 13 January 2011 (UTC)
 * I can obtain the DEFAULTSORT then. I'll check the special cases too. I think we can fix those that contain appropriate templates in their articles. -- Magioladitis (talk) 11:48, 13 January 2011 (UTC)
 * Many defaultsorts are messed up anyway. I'll keep gathering information to improve the logic. What about working in small batches and having people check the result? -- Magioladitis (talk) 23:44, 13 January 2011 (UTC)
 * Another proposal would be that I do it manually using this code. -- Magioladitis (talk) 02:23, 24 January 2011 (UTC)
 * Discussion on Icelandic names showed that listas should follow the generic rule. -- Magioladitis (talk) 01:29, 2 February 2011 (UTC)

BAG assistance needed Magioladitis (talk) 01:29, 2 February 2011 (UTC)

Çelebi
I just caught the DEFAULTSORT error on Süleyman Çelebi that was put there by Yobot. The page name was treated as if it were GivenName Surname and the sort value of "Celebi, Suleyman" was inserted. The last sentence of the lead says that Celebi is an honorific.

I really think that AWB and bots should not be used to concoct sort values. Once the logic has progressed to the point that the tools can read the article, let me know and I will rethink my position. JimCubb (talk) 21:49, 4 February 2011 (UTC)
 * That's why I asked that we do it manually/semi-manually with your help. In order to find and record cases that we can add to AWB's code. -- Magioladitis (talk) 23:38, 4 February 2011 (UTC)
 * Btw, I fixed this one after it had been there for 3 years. So I think we are giving too much thought to very special cases which don't receive a lot of complaints. I am willing to help in improving the rules, but for example I didn't find any rule for this case. -- Magioladitis (talk) 12:57, 5 February 2011 (UTC)
 * I am lost. Check Asaf Halet Çelebi too. Çelebi is even mentioned in the article. -- Magioladitis (talk) 13:02, 5 February 2011 (UTC)

Two things: In many instances where there is no sort value, it is because the contributors to the article were uncertain as to the correct value. As I have said before, it often takes some research to determine the correct value. It could be that somewhere between the 15th century and the 20th century the honorific became a surname. That also may have happened with "Lord" and "Laird". I would greatly prefer the owners / operators of the various bots who have attempted or wish to attempt this problem to abandon their bots in this instance and grind through a few, a few dozen, even a few hundred of these articles manually. Go to Category:Biography articles without listas parameter and pick any page from the very detailed ToC. JimCubb (talk) 03:49, 7 February 2011 (UTC)
 * The one I fixed was introduced manually, not by a bot, some years ago and nobody fixed it. Let's see now. All people with the word "Çelebi" in their name can be found in Çelebi (disambiguation). A wikisearch reveals that there aren't any others. They all have listas set, so we are done with this case. Recall that I won't be changing already set values. I can also add a rule to skip if this string appears. -- Magioladitis (talk) 08:39, 7 February 2011 (UTC)

There is a large class of such names. In the same area of the world, Pasha was used in exactly the same way; at the other end of Europe, Thorgerd Egilsdottir is being DEFAULTSORTed under E. (This last error seems to have been imposed by a human editor before a bot got around to it; but would Yobot have known better?) Septentrionalis PMAnderson 19:49, 10 February 2011 (UTC)
 * The latter is correct. Consensus in the wikiproject Iceland says to treat patronymics exactly as surnames. -- Magioladitis (talk) 20:14, 10 February 2011 (UTC)
 * That may be reasonable for living Icelanders, who exist in a world of surnames; it is not the usage of reliable sources on the tenth century. Observe that Leif Ericson is (quite properly) sorted under L. Septentrionalis PMAnderson 22:35, 10 February 2011 (UTC)
 * Wikipedia_talk:WikiProject_Iceland? -- Magioladitis (talk) 22:53, 10 February 2011 (UTC)
 * Discussing only a living subject, born in 1958. Thorgerd (who has the patronymic for disambiguation only) was born a millennium earlier. Again, consider Leif Ericson, which is DEFAULTSORTed under L - or his father - whose patronymic (although known) is not the common usage; should we sort him under Th?   Septentrionalis PMAnderson 19:42, 11 February 2011 (UTC)
 * Why don't you leave a message in the discussion there? -- Magioladitis (talk) 20:11, 11 February 2011 (UTC)

Another offer: I'll exclude all pages transcluding Surname clarification templates. -- Magioladitis (talk) 16:34, 12 February 2011 (UTC)
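The two skip rules proposed in this thread (skip when a known honorific string appears in the name, and skip pages that transclude a surname clarification template) could look roughly like the sketch below. This is only an illustration, not Yobot's or AWB's code: `should_skip` is a hypothetical name, and both lists are example values chosen for this sketch rather than an actual configuration.

```python
import re

# Illustrative values only (assumptions for this sketch).
SURNAME_TEMPLATES = ("Chinese name", "Japanese name", "Korean name", "Icelandic name")
SKIP_WORDS = ("Çelebi", "Pasha")


def _template_pattern(name: str) -> "re.Pattern[str]":
    """Match '{{Name', tolerating spaces or underscores between words."""
    parts = [re.escape(part) for part in name.split()]
    return re.compile(r"\{\{\s*" + r"[ _]+".join(parts) + r"\s*[|}]", re.IGNORECASE)


TEMPLATE_PATTERNS = [_template_pattern(name) for name in SURNAME_TEMPLATES]


def should_skip(title: str, article_wikitext: str) -> bool:
    """Return True if the page should be left for manual handling."""
    title_words = {word.casefold() for word in title.split()}
    if any(word.casefold() in title_words for word in SKIP_WORDS):
        return True
    return any(p.search(article_wikitext) for p in TEMPLATE_PATTERNS)
```

A page that is skipped simply stays in the category for a human editor, which matches the point made above that no sort key is preferable to an incorrect one.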

BAG assistance needed
 * Hi Mag. I see there is some discussion about some odd surnames that may not work with the bot's logic. That's why I'm going to give you an extra long trial period in the hopes that other weird exceptions can emerge before the bot goes into production.  Tim  1357  talk  23:11, 2 March 2011 (UTC)
 * Tim 1357  talk  23:11, 2 March 2011 (UTC)

Sorry for coming in late for the party. Wondering if a combined manual and automatic route would be best. Many names have no established rule and no way will be mistake free, but I was thinking in general terms. For example, is there a way to grab Japanese bios with no listas? I've been running around 50% of Japanese people living before 1900, plus there have been a few sumo wrestlers, which also use the old style of naming. If the minions can knock them out manually, there would be a lot fewer mistakes. Running the other way would work too... have Yobot run on names of people born in the US, England, Mexico, etc. and then have the minions work on what is left manually. I'm just thinking of a way where you don't have to add all the permutations to the bot and also decrease the number of mistakes. Bgwhite (talk) 20:36, 28 March 2011 (UTC)


 * Bgwhite's suggestion seems to me to be a good one, especially the last part. There are still going to be an almost unmanageable number of pages that will be difficult for anyone to do.  As an example, a number of months ago I ran across a page for a person whose name appeared to be Arabic.  Fortunately, several generations were named in the article and I was able to determine that the person had taken his grandfather's given name, the patronym for his father, and turned it into a surname for himself and his son.  There is no way that an automated tool could have gotten that one right.


 * Thorgerd Egilsdottir's sort value was set more than two years ago and could have been according to the consensus at that time. However, as Septentrionalis has pointed out, that consensus does not apply to people in the Middle Ages so I fixed it.


 * How did the long trial work out? JimCubb (talk) 23:05, 1 April 2011 (UTC)
 * I'll do it in 10-12 hours. Can you check after this please? -- Magioladitis (talk) 23:09, 1 April 2011 (UTC)
 * FYI, I've been going thru the letter "A" and picking out "weird" names. There are about 1,400 "weird" free names. I'm about 1/2 way thru A.  I know I left some latin and probably missed some more.  Don't know if you want to run your trial on these or not.  Bgwhite (talk) 23:24, 1 April 2011 (UTC)
 * For the straightforward method I need a list of 200 "safe" names. For your approach we just need to add the key "as is" in the listas. I am good both ways as soon as someone provides me a list. I am a bit too busy with other projects to create the list right now. Please help! -- Magioladitis (talk) 23:43, 1 April 2011 (UTC)
 * I'm confused (as always) on what you need. Could you spell it out a little better?  What format of the list?  You need every one to have a listas parameter of "as is"? I'll be online for another two hours, then again in about 14 hours for about 4 hours, so I can get it to you tomorrow. Bgwhite (talk) 06:19, 2 April 2011 (UTC)
 * Sorry for the bad wording :) Let's see. I need a list of pages (just post them in a sandbox or my talk page) that don't have listas and for which the listas parameter should be the key name without any reversal. For example: Mao Zedong, who has listas parameter "Mao, Zedong". OR alternatively a list of pages in which the "surname, name" rule applies. -- Magioladitis (talk) 10:35, 2 April 2011 (UTC)
 * User:Bgwhite/Sandbox  Bgwhite (talk) 20:24, 2 April 2011 (UTC)
 * Have you done the test run? Wondering how it turned out. Bgwhite (talk) 19:06, 22 April 2011 (UTC)
 * OperatorAssistanceNeeded I know you're super busy with other stuff, but just poking my head in to check up.  MBisanz  talk 20:42, 22 April 2011 (UTC)
 * -- Magioladitis (talk) 11:37, 23 April 2011 (UTC)

. This is the permanent link to the edits. And some examples:, , ,. It also fixes special letters (this task was previously approved):. Please ignore the edit summaries. I couldn't find my settings file on my hard disk today, so I loaded only the custom module. The real result will be better. -- Magioladitis (talk) 11:54, 23 April 2011 (UTC)
 * Not a complaint, I understand why AWB does this with Arabic names... It won't sort many Arabic names via surname, given name.  Examples:    defaultsort in article is also not set.    defaultsort in article is set correctly.  I had gone thru and manually done the "weird" cases such as Al-, Abdul, el-, etc.  Guess we need to manually do all before the bot runs.  Thank you for running the test. Bgwhite (talk) 21:21, 23 April 2011 (UTC)
 * We have a list of common Arabic names. I think this logic can be improved. Thanks for the very nice comments and effort. You are probably saving a failed BRFA. -- Magioladitis (talk) 21:28, 23 April 2011 (UTC)

BG19bot (Bots/Requests for approval/BG19bot) was approved for the same task. I am not planning on doing this task; maybe it could be a good idea to approve it for reasons of completeness? -- Magioladitis (talk) 14:17, 5 May 2011 (UTC)


 * If you don't plan on running this task, why do you need it to be approved? Headbomb {talk / contribs / physics / books} 14:44, 5 May 2011 (UTC)
 * I can add this job to be done in addition to other jobs in talk pages after some months. I.e. I can transform the code into a plugin-like custom module which updates WikiProject Biography banners. The ultimate target is to replace today's KingbotK plugin of AWB with new code written in C#. -- Magioladitis (talk) 14:53, 5 May 2011 (UTC)

The task is done by BG19bot, so someone just has to decide whether this will be a task of Yobot too or not. I am in cooperation with Bgwhite in order to improve the code. -- Magioladitis (talk) 07:40, 3 June 2011 (UTC)
 * Using the same code and settings as Bgwhite's bot.  MBisanz  talk 18:54, 3 June 2011 (UTC)
 * The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.