Wikipedia:Bots/Requests for approval/LaraBot


 * The following discussion is an archived debate. Please do not modify it. Subsequent comments should be made in a new section. The result of the discussion was Approved.

LaraBot
Operator: MZMcBride

Automatic or Manually Assisted: Automatic

Programming Language(s): Python (wikitools)

Function Overview: Warning editors who create unreferenced biographies of living people.

Edit period(s): Daily

Already has a bot flag (Y/N): No

Function Details: The script queries the Toolserver's copy of enwiki_p.recentchanges and finds all new pages in Category:Living people that were created on a specific day (usually one day prior to the current date). It goes through each page looking for ==References==, ==Further reading==, ==Bibliography==, <ref, or http://. If it doesn't find any evidence of references, it substitutes Unreferenced BLP warning on the creator's talk page. The list is output to Database reports/Recently-created unreferenced biographies of living people for tracking / review.
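The reference check described above can be sketched roughly as follows. This is an illustration only, not the bot's actual source: the function name `has_reference_evidence` and the regular expression are assumptions, and the sketch tolerates whitespace and case differences in headings, which the original script may or may not do.

```python
import re

# Section headings treated as evidence of sourcing (list taken from the
# function details above). Allowing spaces inside the "==" markers and
# ignoring case are assumptions made for this sketch.
HEADING_RE = re.compile(
    r"^==\s*(References|Further reading|Bibliography)\s*==\s*$",
    re.IGNORECASE | re.MULTILINE,
)


def has_reference_evidence(wikitext):
    """Return True if the page text shows any sign of references."""
    if HEADING_RE.search(wikitext):
        return True
    lowered = wikitext.lower()
    # "<ref" also matches "<references />", which counts as evidence too.
    return "<ref" in lowered or "http://" in lowered
```

A page for which this returns False would be a candidate for the talk-page warning and the tracking report.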

Example biographies from June 3.
 * Italian_Mafia_DJ
 * Ron_Singleton
 * Violeta_Isfel
 * Liam_Bowman
 * Frederick_Ponsonby,_4th_Baron_Ponsonby_of_Shulbrede
 * Mostafa_Azizi
 * Jason_Taylor_(guitarist)
 * Sally_Greengross,_Baroness_Greengross
 * Yogi_khari
 * Danny_Sanderson
 * Javad_Razzaghi
 * Jordan_Schroeder
 * Bill_Belk

I may need to include a check for ==External links==. Thoughts on this would be appreciated.

Discussion

 * There is a good chance that people who create unreferenced BLP articles are too inexperienced to add categories like Category:Living people in the first place. The categories are usually added by other Wikipedians later. Just a thought! --  Tinu  Cherian  - 08:45, 4 June 2009 (UTC)
 * Yes, that's one of the advantages of getting the pages a day later. (Hopefully others will have tagged the pages by then.) I realize I'll still miss some biographies, but without manually reviewing every new page, there's no reliable way to detect whether it's a biography of a living person or not without the category. --MZMcBride (talk) 08:57, 4 June 2009 (UTC)
 * OK, I agree. I guess the presence of "External links" should also be checked, as references are often added in an external links section. --  Tinu  Cherian  - 09:02, 4 June 2009 (UTC)
 * A few of these have infoboxes with links to NFL.com, which appears to be a valid reference. — Snigbrook 12:17, 4 June 2009 (UTC)
 * And IMDB, through a template. If you can weed those out, I think this would be a great task. – Quadell (talk) 14:34, 4 June 2009 (UTC)
 * Yes, I saw those. I've been debating in my mind whether or not those count as references. I suppose there's agreement that they do? It means that I'll just pull all external links when I get the list and I'll exclude pages that contain non-"en.wikipedia.org" links (to avoid "expand this" in stub links, etc. which ruin any queries for pages without any external links). Does that sound reasonable? --MZMcBride (talk) 16:10, 4 June 2009 (UTC)
 * I think they could count as references, and I think your solution is reasonable. – Quadell (talk) 17:36, 4 June 2009 (UTC)
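The "exclude pages that contain non-en.wikipedia.org links" idea discussed above could look something like this. It is a minimal sketch under stated assumptions: the function name `has_true_external_link` is hypothetical, and it assumes the caller has already collected the page's external URLs (for example from the database) as a list of strings.

```python
from urllib.parse import urlsplit


def has_true_external_link(urls, own_host="en.wikipedia.org"):
    """Return True if any URL points somewhere other than own_host.

    Links back to the wiki itself (such as the "expand this" edit links
    in stub templates) are ignored, so only outside sites such as
    NFL.com or IMDb count.
    """
    for url in urls:
        host = urlsplit(url).hostname
        if host and host.lower() != own_host:
            return True
    return False
```

A page with at least one such link would be left off the unreferenced list, even if it has no references section.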

June 4 results:

The script now checks for ==External links== and does a check for "true" external links. Only 3 results out of 112 new BLPs for 2009-06-04. --MZMcBride (talk) 02:13, 5 June 2009 (UTC)

Okay, take it away MZ! – Quadell (talk) 02:20, 5 June 2009 (UTC)
 * Wheeeeeeee. :-) --MZMcBride (talk) 02:36, 5 June 2009 (UTC)


 * Should you be using uw-unsourced1? – Quadell (talk) 03:01, 5 June 2009 (UTC)


 * Maybe. I sort of hate the user warning templates. They're not very friendly, they include images and far too much text, etc. So unless someone cares, I'd prefer to use my own variant. Reading through uw-unsourced1 a few times, it sounds very terse and belittling.... --MZMcBride (talk) 03:22, 5 June 2009 (UTC)


 * It's good to be friendly, and it's fine to use your own message. But I'd recommend an amalgamation of the two, something like "Hi! It seems you recently created an unreferenced biography of a living person: article name. Our verifiability policy requires that all content be cited to a reliable source. Please add references as soon as possible. Thanks!" That at least tells them why, and gives them links to the policies if they want to read them. – Quadell (talk) 12:50, 5 June 2009 (UTC)


 * So fix it. :P --MZMcBride (talk) 19:43, 5 June 2009 (UTC)

Hit a Unicode error this evening that prevented the bot from posting the list. I added a nag_users parameter and made a few minor adjustments. Hopefully that fixed the issue. --MZMcBride (talk) 04:44, 7 June 2009 (UTC)
 * Any particular code issue with the Bot posting the msg more than once for the same article like this ? --  Tinu  Cherian  - 06:38, 8 June 2009 (UTC)


 * (edit conflict) More Unicode errors (similar ones to the previous ones). It's a shell issue with Unicode, not a problem with my script. It runs fine from the command line, but cron + shell + Unicode → death. Anyway, I prevented the death from occurring and implemented some better checks to ensure I don't hit a user page multiple times if I run the script manually when it died previously midway through. (And I prevented updating the report page more than once for the same set of data.) Everything seems to be working well now. We'll see what happens when it runs again in about 20 hours. --MZMcBride (talk) 06:41, 8 June 2009 (UTC)
 * Sounds good ! --  Tinu  Cherian  - 06:53, 8 June 2009 (UTC)
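One way to avoid warning the same user twice about the same article after a partial run is to filter the day's candidates against a record of warnings already posted. The discussion does not say how the bot's own checks work, so the following is only a sketch: the function name `pending_warnings` and the idea of keeping (user, title) pairs are assumptions.

```python
def pending_warnings(candidates, already_warned):
    """Return (user, title) pairs that still need a warning.

    candidates:     iterable of (user, title) pairs found for the day.
    already_warned: iterable of pairs recorded by an earlier run,
                    possibly one that died midway through.
    Duplicates within candidates are also dropped, keeping first-seen order.
    """
    seen = set(already_warned)
    pending = []
    for user, title in candidates:
        key = (user, title)
        if key in seen:
            continue
        seen.add(key)
        pending.append(key)
    return pending
```

Re-running the script manually after a crash would then post only the warnings that the failed run never reached.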

This seems to be working as expected now. I had one (theoretical) complaint about templating regulars, but everything else seems to be in perfect working order. --MZMcBride (talk) 21:32, 11 June 2009 (UTC)

Thanks for doing this! – Quadell (talk) 00:37, 12 June 2009 (UTC)


 * The above discussion is preserved as an archive of the debate. Please do not modify it. Subsequent comments should be made in a new section.