Wikipedia:Bots/Requests for approval/KMLbot


 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Approved.

KMLbot
Operator: Evad37

Time filed: 05:33, Saturday, September 3, 2016 (UTC)

Automatic, Supervised, or Manual: Automatic

Programming language(s): SPARQL + PetScan + AWB

Source code available: Yes, see function details below

Function overview: Adds a KML template to articles which have KML files available through Wikidata

Links to relevant discussions (where appropriate):

Edit period(s): Around once a week

Estimated number of pages affected: ~450 for initial run, probably much less in subsequent runs (depends on KML creation rate on other wikis)

Exclusion compliant (Yes/No): Yes, per below

Already has a bot flag (Yes/No): No

Function details:
 * (1) Get a list of Wikidata items which have both a KML file and an article on English Wikipedia.
 * Done manually with the following SPARQL query to Wikidata Query Service:
     SELECT ?article WHERE {
       ?article schema:about ?item ;
                schema:isPartOf <https://en.wikipedia.org/> .
       ?item wdt:P3096 ?kml .
       SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
     }
 * Output saved as CSV and opened with MS Excel. Titles are extracted from the URLs using a spreadsheet formula. These titles are percent-encoded, so they are then decoded using a web-based URL decoder.
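The title extraction and decoding in step (1) could also be scripted rather than done in a spreadsheet and a web decoder. The following is only a minimal sketch, not what the operator actually used: it assumes the CSV has a column named `article` (matching the `?article` variable in the query) and that each URL has the form `https://en.wikipedia.org/wiki/Title`. The function name `titles_from_csv` is hypothetical.

```python
import csv
import io
from urllib.parse import unquote, urlsplit

WIKI_PREFIX = "/wiki/"  # assumed path prefix of the article URLs


def titles_from_csv(csv_text, column="article"):
    """Return decoded page titles from a CSV column of article URLs."""
    titles = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        path = urlsplit(row[column]).path  # e.g. /wiki/Z%C3%BCrich
        if not path.startswith(WIKI_PREFIX):
            continue  # skip anything that is not an article URL
        encoded = path[len(WIKI_PREFIX):]
        # Percent-decode as UTF-8 and turn underscores into spaces.
        titles.append(unquote(encoded).replace("_", " "))
    return titles
```

Stripping the `/wiki/` prefix, rather than splitting on the last slash, keeps titles that themselves contain a slash intact.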


 * (2) Filter out articles which already have a KML template, or are a disambiguation page, or have been excluded through an opt-out template (yet to be created; it would add pages to the hidden category "Pages which should not use KML from Wikidata").
 * Done manually through a PetScan query like the following (pasting the decoded list of articles into the "Manual list" box):
 * https://petscan.wmflabs.org/?language=en&project=wikipedia&depth=1&categories=Attached%20KML%20tracking%20categories%0D%0AAll%20Disambiguation%20pages%0D%0APages%20which%20should%20not%20use%20KML%20from%20Wikidata&combination=union&ns%5B0%5D=1&manual_list_wiki=enwiki&source_combination=manual%20NOT%20categories&interface_language=en&active_tab=tab_other_sources
 * Output as "Wiki" and save as a UTF-8 text file (for input to AWB)
 * (3) Use AWB to add the KML template to the end of each article (prior to DEFAULTSORT, interwikis, categories and stub templates).
 * Specifically, use the following options only:
 * Options tab: (none)
 * More tab: Append text (the KML template) with "Sort meta data after" checked
 * Disambig tab: (none)
 * Skip tab: Skip if page is in use, or page is redirect, or page doesn't exist
 * Bots tab: Auto-save with delay 20 seconds, nudge if stuck, skip if first nudge doesn't help
 * Start tab: Summary
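The logic of steps (2) and (3) can be illustrated with a short sketch. This is not how PetScan or AWB work internally; it only mirrors the "manual list NOT categories" filter and the "append before trailing metadata" placement described above. The function names, the regular expression, and the `{{Example template}}` placeholder in the usage note are all assumptions made for illustration, and old-style interwiki links are not handled.

```python
import re

# Lines treated as trailing metadata: DEFAULTSORT, categories, stub templates.
_METADATA_LINE = re.compile(
    r"^\s*(?:\{\{DEFAULTSORT:[^}]*\}\}"
    r"|\[\[Category:[^\]]*\]\]"
    r"|\{\{[^{}|]*stub\}\})\s*$",
    re.IGNORECASE,
)


def filter_titles(candidates, excluded):
    """Keep candidate titles that appear in none of the exclusion categories."""
    excluded = set(excluded)
    return [title for title in candidates if title not in excluded]


def append_before_metadata(wikitext, template_text):
    """Insert template_text after the article body, before trailing metadata."""
    lines = wikitext.rstrip("\n").split("\n")
    cut = len(lines)
    # Walk backwards over blank lines and metadata lines at the end of the page.
    while cut > 0 and (
        not lines[cut - 1].strip() or _METADATA_LINE.match(lines[cut - 1])
    ):
        cut -= 1
    return "\n".join(lines[:cut] + [template_text] + lines[cut:]) + "\n"
```

For example, `append_before_metadata(page, "{{Example template}}")` places the placeholder directly after the last line of body text, leaving any DEFAULTSORT, category and stub lines below it.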

Notes:
 * Applying here per Special:Diff/737483526
 * The page at User:KMLbot would have info like what I started drafting at User:Evad37/KML, but adjusted to match whatever approval is given for the bot.
 * The bot would not only be exclusion-compliant through the standard bot-exclusion templates, but would also be exclusion-compliant through a (yet to be created) opt-out template populating a hidden tracking category (also yet to be created). This allows excluded pages to be filtered out with the PetScan query, would enable tracking of such articles through the category, and would encourage editors to provide a reason why the KML shouldn't be used (so that problems could possibly be fixed for all wikis, rather than just ignored at the English Wikipedia). - Evad37 [talk] 05:33, 3 September 2016 (UTC)

Discussion
Unless you turn the option off, AWB is exclusion compliant via the standard bot-exclusion templates. - JJMC89 (T·C) 05:55, 3 September 2016 (UTC)
 * Oh, okay. Didn't realise that, but I don't think it makes a big difference to the proposal. - Evad37 [talk] 06:25, 3 September 2016 (UTC) Adjusted above - Evad37 [talk] 06:27, 3 September 2016 (UTC)

—  Earwig   talk 17:16, 3 September 2016 (UTC)
 * Notes on the trial edits:
 * Edits 1-5: edit summary malformed
 * Edits 6-10: disambiguation pages included, because of a typo in the PetScan query (capital D)
 * Edits 11-14: tried to fix the PetScan query, but something went wrong: none of these had a KML file in Wikidata. Reverted these edits.
 * Edits 15-20: Started over (Wikidata query, decode titles, PetScan query, save as UTF-8 text file). Generally okay, but AWB was inconsistent in adding new lines: some have a single space (as intended), some have 2 (which makes a gap in the rendered page), some have none (??). This may be due to extra lines being rearranged by "Sort meta data after".
 * Edits 21-25: Try option "use 0 newlines" - all appears to be okay now;
 * Edits 25-50: Went ahead with remainder of trial edits. Spot-checked about 15 of these while the bot was editing and saw no problems - Evad37 [talk] 00:44, 6 September 2016 (UTC)
 * Some further notes: I left the KML files on the disambig pages in place. While I don't think disambig pages are good candidates for automated KML addition, as what is ambiguous in one language might not be in another, in these cases the KML files did match the disambig listings. I reverted the bot's edit to Field of Mars, a set index article. With similar reasoning to disambigs, I'll exclude set index articles in future runs (by putting Category:All set index articles in the PetScan query). All other edits have functioned as expected, and as of now (6 days later), none have been reverted, and no-one's posted anything at the bot's talk page. - Evad37 [talk] 04:14, 12 September 2016 (UTC)
 * - Evad37 [talk] 07:18, 16 September 2016 (UTC)
 * - Evad37 [talk] 00:36, 26 September 2016 (UTC)
 * Hmm. Do you think it's placing the template in the best location? This and a lot of others looks fairly unbalanced. Also, what's going on with the unreferenced tag here? —  Earwig   talk 01:03, 26 September 2016 (UTC)
 * The movement of unreferenced tags might be an AWB bug; I've reported it at WT:AWB. (I think AWB might be mistaking the tag for a stub tag, and thus re-sorting it to the end of the article.)
 * With regards to the location, the absolute ideal location could vary quite a lot based on what else is on the page. It should go somewhere under the last heading, and come before the categories and stub templates, but whether it would be better above or below the navboxes really depends on what exactly is above the navboxes. If it's just a simple bulleted list, then it can float to the right of the list. But if there's already one or more floating-box templates, or a multi-column list (e.g. reflist), or if it's a short article with a long infobox, placement above the navboxes would cause excessive whitespace. (For example, these pages which had the KML box manually added below the navboxes some years ago: Karrinyup Road, West Coast Highway, Perth.) But then again, if there's just one floating box template and 6 or so items in a bulleted list, then it may be better to place it immediately after the existing box, to float in the space available on the right of the list. I don't think a bot can be much good at making these cosmetic decisions – I'd rather the bot place the template in an acceptable position (if not 100% optimal) that gives the links to readers without causing big whitespace issues, and allow human editors to make aesthetic choices when further editing of the article occurs. - Evad37 [talk] 02:24, 26 September 2016 (UTC)
 * Got an answer from WT:AWB – if I enable genfixes, then the template redirect will be bypassed and replaced with its target template, but left where it is rather than being moved to the end of the article. This seems to work; I tested it by getting AWB to generate a diff (but not saving) for Nishiyatsushiro District, Yamanashi:
 * [[File:KMLbot diff Nishiyatsushiro District Yamanashi.gif]]
 * (the metadata sorting doesn't seem to work in userspace, otherwise I would have made a sandbox edit to show you) - Evad37 [talk] 04:17, 29 September 2016 (UTC)

Okay. Sorry, I haven't been around much lately, so this is going slowly. The other thing I notice is that most of the tagged pages are towns (where the KML file shows the town boundary) not routes like the help pages describe their use to be. Do you think we have consensus to add these? —  Earwig   talk 23:10, 8 October 2016 (UTC)
 * I think there is already an informal consensus to use KML files for polygons (areas) as well as lines (routes), as evidenced by the existing uses of KML for articles like Beaver Island State Park, Central Park, Des Moines, Iowa, and List of postcode areas in the United Kingdom. I think it is more that the documentation is out of date, and thus also the help page (as it was based on the template doc). - Evad37 [talk] 03:13, 9 October 2016 (UTC)
 * I updated the template documentation and the help page to mention polygon features - Evad37 [talk] 03:24, 9 October 2016 (UTC)
 * Hm, are you sure about Central Park? I see it removed by an editor before the bot added it. (Ping epicgenius for input.) —  Earwig   talk 07:41, 9 October 2016 (UTC)
 * It seems epicgenius created the KML file in October 2013; removed the article's coordinates and replaced them with KML in title-only display mode; later, in March 2014, changed the KML display to be inline at the bottom of the article; and in the diff you linked (Nov 2014) removed the KML and replaced it with coords. Maybe epicgenius mistook the coordinates and KML templates as being mutually exclusive? Which they're not, as the latest version of the article shows. - Evad37 [talk] 03:57, 10 October 2016 (UTC)
 * In response to the above, yes, I thought that if you added coords for landforms like Central Park, you couldn't add a KML unless it was inline-only. I also thought if you added KMLs in the title, you could only add inline coords. I don't remember why I removed it, but it had to be because of a petty formatting issue within the KML itself (i.e. the areas were all shaded in, making it impossible to see the map of Central Park itself). epicgenius (talk) 11:16, 10 October 2016 (UTC)

Not seeing much in the way of objection, so I think we can go ahead with this. The number of pages affected is small enough. —  Earwig   talk 08:29, 11 October 2016 (UTC)
 * The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.