Wikipedia:Bots/Requests for approval/LivingBot 23


 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Approved.

LivingBot 23
Operator:

Time filed: 12:43, Tuesday September 10, 2013 (UTC)

Automatic, Supervised, or Manual: Automatic

Programming language(s): PHP, Peachy framework

Source code available: Yes, though it's not going to win any style awards!

Function overview: Add |commonscat=XXX entries to existing listed building entries. Example

Links to relevant discussions (where appropriate): Use of the commonscat parameter is well established

Edit period(s): One time run, with the potential for smaller incremental runs at a later date

Estimated number of pages affected: ~350 initially

Exclusion compliant (Yes/No): Yes (though unlikely to be an issue)

Already has a bot flag (Yes/No): Yes

Function details: The |commonscat parameter of the English listed building lists provides very useful functionality, not least because Commons often has many images of buildings that have only skeleton entries here on the English Wikipedia. Specifically, it serves three purposes:


 * It allows for the automatic presentation of a "more images" link for readers.
 * It allows new uploads as part of Wiki Loves Monuments to be automatically and correctly categorised on Commons -- a huge time saver (this could also be applied retrospectively)
 * It allows for editors to more easily select an image to use where one was not already present.

However, so far we have relied on human editors adding these categories, despite the fact that many can be easily recovered programmatically. The code linked to above provides three main recovery mechanisms:


 * If an image is present, look at its categories
 * If the building is a church, try some automatic permutations of its name
 * Try searching the allpages list, namespace=14
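
For illustration, the second mechanism (automatic permutations of a church's name) might look like the sketch below. It is written in Python rather than the bot's PHP, and the function name, the arguments, and the specific naming patterns are all assumptions made for this example; none of them are taken from the bot's source.

```python
import re


def church_name_candidates(saint, place):
    """Return plausible Commons category titles for a church.

    Hypothetical helper: the patterns below are illustrative guesses at
    common naming styles, not the bot's actual list. Titles are returned
    without a namespace prefix.
    """
    # Normalise "St Mary", "St. Mary" and "Saint Mary" to the bare name "Mary".
    name = re.sub(r"^(St\.?|Saint)\s+", "", saint.strip())
    # "Thomas" takes a bare apostrophe here; other names take "'s".
    possessive = name + ("'" if name.endswith("s") else "'s")

    candidates = []
    for prefix in ("St", "St.", "Saint"):
        candidates.append(f"{prefix} {possessive} Church, {place}")
        candidates.append(f"Church of {prefix} {name}, {place}")
    return candidates
```

Each candidate would then have to be checked for existence on Commons and pass the later checks before being used; a candidate that merely looks plausible is not enough.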

Although strict checks are applied at each stage to minimise the false positive rate, a further two sanity checks are applied late on:


 * Any existing image must have the proposed category
 * The parent categories of the proposed category must reference the correct county (or equivalent) to avoid cases where (e.g. village) X is confused with village Y where they have the same name
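
A minimal sketch of these two checks, in Python and under stated assumptions: the function and parameter names are hypothetical, the category data is assumed to have been fetched already, and "reference the correct county" is approximated by a case-insensitive substring match, which may be looser or stricter than what the bot really does.

```python
def passes_sanity_checks(proposed, image_categories, parent_categories, county):
    """Apply the two late sanity checks to a proposed Commons category.

    proposed          -- title of the proposed category
    image_categories  -- set of categories on the entry's existing image,
                         or None if the entry has no image
    parent_categories -- titles of the proposed category's parent categories
    county            -- county (or equivalent) the list page covers
    """
    # Check 1: any existing image must already sit in the proposed category.
    if image_categories is not None and proposed not in image_categories:
        return False

    # Check 2: at least one parent category must mention the county, which
    # guards against same-named villages in different counties.
    county_lower = county.lower()
    return any(county_lower in parent.lower() for parent in parent_categories)
```

An entry without an image skips the first check and relies on the county check alone.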

This ensures a low (zero?) false positive error rate, while saving human editors a great deal of time and allowing them to focus on less mundane tasks.
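
The edit itself, adding |commonscat= to a single list entry, could be sketched as follows. This is an assumption-laden Python illustration, not the bot's code: the function name is hypothetical, the input is assumed to be the wikitext of exactly one template call, and the layout of the added line is a guess.

```python
import re


def add_commonscat(row_wikitext, category):
    """Add a commonscat parameter to one template call unless one is set."""
    # Leave the row alone if commonscat already has a non-empty value.
    if re.search(r"\|\s*commonscat\s*=\s*[^\s|}]", row_wikitext):
        return row_wikitext

    # Drop an empty "| commonscat =" so the parameter is not duplicated.
    row_wikitext = re.sub(r"\|\s*commonscat\s*=\s*(?=[|}])", "", row_wikitext)

    # Insert the new parameter just before the final closing braces.
    closing = row_wikitext.rfind("}}")
    if closing == -1:
        return row_wikitext  # not a template call; do nothing
    body = row_wikitext[:closing].rstrip()
    tail = row_wikitext[closing + 2:]
    return body + "\n| commonscat = " + category + "\n}}" + tail
```

Running the function a second time on its own output returns the text unchanged, which fits the trial observation that the bot did not re-edit pages it had already handled.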

Discussion
In the process of testing my code, I made some manual edits with identical output. These form a technical trial of sorts, I suppose -- you can see the code doing what it's supposed to, at least. - Jarry1250 [Vacation needed] 12:43, 10 September 2013 (UTC)


 * Trialing due to possibility of false positives due to the way the links are determined. — HELL KNOWZ  ▎TALK 14:28, 10 September 2013 (UTC)


 * Trial done (ish). The bot successfully skipped a Welsh page, didn't re-edit some others, and generally seems to have done well (barring a couple of false starts). It's also logged a series of anomalies for human investigation. Just checking for false positives now... - Jarry1250 [Vacation needed] 18:11, 10 September 2013 (UTC)


 * I've done a manual check on all of the Cambridgeshire page, the second-largest of its edits, and it seems to be completely accurate here. Andrew Gray (talk) 18:37, 10 September 2013 (UTC)
 * Only Cornwall's penchant for naming things "the church of St. Erth, St. Erth" seems to have shown up a bug, which I've now fixed. Extended trial, anyone? It's seeming increasingly likely the eventual edit count will exceed 500, so it'll probably be worth it. - Jarry1250 [Vacation needed] 18:46, 10 September 2013 (UTC)

— HELL KNOWZ  ▎TALK 19:08, 10 September 2013 (UTC)
 * No further problems. Once approved, we'll continue to check the cases where an error has a >0% chance of being made (all of which are logged), though it would seem that that percentage is very small indeed. - Jarry1250 [Vacation needed] 18:22, 11 September 2013 (UTC)

No issues that I can see. Low edit rate; uncontroversial task; supervision on any problems; trusted bot op. — HELL KNOWZ  ▎TALK 19:19, 11 September 2013 (UTC)


 * The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.