Wikipedia:Bots/Requests for approval/RjwilmsiBot 4


 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Approved.

RjwilmsiBot 4
Operator:

Automatic or Manually assisted: Automatic

Programming language(s): AWB (C#)

Source code available: AWB

Function overview: Add/complete Persondata based on existing article infobox/birth & death template data.

Links to relevant discussions (where appropriate): WP:PERSONDATA

Edit period(s): On release of database dump

Estimated number of pages affected: up to 100,000

Exclusion compliant (Y/N): Y

Already has a bot flag (Y/N): Y

Function details: The Persondata AWB function (which I wrote). I propose to run the persondata function together with the data sorter, so that the persondata template is placed correctly. The function already has logic to add persondata only to biography articles. If needed, I can configure a data-addition threshold for new persondata templates, e.g. require the name and at least one date or location to be filled in. Alternatively, I can always add the persondata template even if only zero or one fields are filled in, on the basis that this would encourage users to fill in the rest.
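
The threshold options described above can be sketched as follows. This is only an illustrative Python sketch, not the AWB (C#) implementation: the `Persondata` class, its field names, and the helpers `filled_count`, `meets_min_fields` and `has_name_plus_one` are all hypothetical, and the five fields are simply the ones counted in the field completion stats (name, date and place of birth, date and place of death).

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class Persondata:
    """Hypothetical container for the five fields the bot may fill in."""
    name: Optional[str] = None
    date_of_birth: Optional[str] = None
    place_of_birth: Optional[str] = None
    date_of_death: Optional[str] = None
    place_of_death: Optional[str] = None


def filled_count(pd: Persondata) -> int:
    """Count fields that hold a non-empty, non-whitespace value."""
    return sum(1 for f in fields(pd) if (getattr(pd, f.name) or "").strip())


def meets_min_fields(pd: Persondata, min_fields: int = 1) -> bool:
    """Simple threshold: add the template only if enough fields are filled."""
    return filled_count(pd) >= min_fields


def has_name_plus_one(pd: Persondata) -> bool:
    """Stricter option: require the name plus at least one date or location."""
    return bool((pd.name or "").strip()) and filled_count(pd) >= 2


if __name__ == "__main__":
    candidate = Persondata(name="Example Person", place_of_birth="near Bremen")
    print(meets_min_fields(candidate), has_name_plus_one(candidate))
```

With `min_fields=1` a name-only template would still be added; raising the threshold, or using the name-plus-one rule, would skip such pages.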

Discussion
This should definitely not be adding blank templates; it should only perform edits if there are at least one or two fields to fill in. Since you'll already be editing a large number of pages, will you also be running other non-controversial AWB genfixes? Also, will the template be added at the appropriate place in the article (just above defaultsort and categories seems to be the standard)? - EdoDodo  talk 01:17, 29 August 2010 (UTC)
 * I will set the field completion threshold to whatever the community feel is right. I can run other genfixes, but I'm not clear which ones the BAG feel are approved for bot use. Yes, the persondata template will be correctly placed (as I said, "with data sorter to place the persondata template correctly"). Rjwilmsi  06:57, 29 August 2010 (UTC)
 * Ah yes, I'd missed that, sorry. Feel free to run the most uncontroversial genfixes; perhaps you could run the FixPeopleCategories genfix alongside it, which will add birth and death categories, and the living people category if appropriate. Please add the template only if at least one field can be filled in (we can discuss that after we've seen how it goes in the trial). - EdoDodo  talk 12:52, 29 August 2010 (UTC)
 * . Rjwilmsi  10:26, 30 August 2010 (UTC)

I noticed that the bot also includes words before the name of the place, for example 'on Sõrve Peninsula' or 'near Bremen', and I'm not sure if that's acceptable per Persondata. If the whole point of it is to make data processing easier, then those words should probably be left out to make the fields more standardized and easier to process. Just a thought; I may be wrong, as I hadn't worked with Persondata before. - EdoDodo  talk 12:55, 30 August 2010 (UTC)
 * I'm not sure if there is a right answer: Near Bremen and Bremen aren't the same. I'd rather stick with what is given in the infobox. Rjwilmsi  16:19, 30 August 2010 (UTC)

Other general fixes can be done at the same time. Yobot and SmackBot have taken similar approvals and this will help Rjw's work on AWB too.-- Magioladitis (talk) 09:32, 1 September 2010 (UTC)

Field completion stats
I ran some stats on the March 2010 database dump to count the number of fields that would be completed (up to 5: name, date of birth & death, place of birth & death). I extrapolated from the first 2% of the database dump, on the assumption that articles are distributed through the dump such that this sample is representative of the whole. The extrapolated total is 98,000 articles to which persondata would be added, with the following field completion levels:

Mostly the '1 field' completion is the name. Rjwilmsi 23:07, 1 September 2010 (UTC)
 * Trial edits look good. Please add the template only if you can fill in at least one field. - EdoDodo  talk 20:30, 5 September 2010 (UTC)
 * The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.