Wikipedia:Bots/Requests for approval/DschwenBot


 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Withdrawn by operator.

DschwenBot
Operator:

Time filed: 22:38, Friday February 24, 2012 (UTC)

Automatic, Supervised, or Manual: automatic (with initial supervision)

Programming language(s): python (pywikipedia framework)

Source code available: will be made available

Function overview: parses TIGER Line US Census shape files and extracts individual US county outlines. The outlines are converted to KML and attached to the respective county articles using the Attached KML template (note that the actual KML data does not appear in the article text). This is a new geocoding method that was developed during the last few weeks. The outlines will be displayed on the WikiMiniAtlas and can also be viewed using Google/Bing Maps

Links to relevant discussions (where appropriate): Wikipedia_talk:WikiProject_Geographical_coordinates

Edit period(s): one time (per US state)

Estimated number of pages affected: 3143 (the total number of US counties)

Exclusion compliant (Y/N): Y

Already has a bot flag (Y/N): N

Function details: Bot will create a new page Talk:Article/KML (unless it already exists), bot will generate the correct KML outline data and upload it on said page, bot will place Attached KML in the article text at the bottom before the first category link.
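As a rough illustration of the "generate the correct KML outline data" step, here is a minimal sketch. It assumes the county outline has already been read from the shapefile into a list of (longitude, latitude) pairs; the shapefile parsing itself is not shown. The function name `outline_to_kml` and the single-polygon layout are assumptions made for this example, not the bot's actual code.

```python
from xml.sax.saxutils import escape


def outline_to_kml(name, ring):
    """Return a KML document holding one polygon.

    `ring` is a list of (lon, lat) pairs. Hypothetical helper for illustration.
    """
    if len(ring) < 3:
        raise ValueError("an outline needs at least three points")
    if ring[0] != ring[-1]:
        ring = list(ring) + [ring[0]]  # a KML LinearRing must be closed
    # KML coordinates are "lon,lat" tuples separated by whitespace.
    coords = " ".join(f"{lon:.6f},{lat:.6f}" for lon, lat in ring)
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<kml xmlns="http://www.opengis.net/kml/2.2">\n'
        "<Document>\n"
        f"<Placemark><name>{escape(name)}</name>\n"
        "<Polygon><outerBoundaryIs><LinearRing>\n"
        f"<coordinates>{coords}</coordinates>\n"
        "</LinearRing></outerBoundaryIs></Polygon>\n"
        "</Placemark>\n"
        "</Document>\n"
        "</kml>\n"
    )
```

A real county may consist of several rings (islands, enclaves), which this sketch does not handle.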

Discussion

 * Example
 * Talk:Santa Fe County, New Mexico/KML
 * Santa Fe County, New Mexico
 * Santa Fe County, New Mexico (pre KML)
 * diff to add KML

Josh Parris 09:23, 25 February 2012 (UTC)
 * 1) In the example, I don't like the positioning of the KML link; I think it should be in the External Links section.
 * 2) The KML infobox is terrible, far too editor-centric (but this is not a bot issue in of itself).
 * 3) Given the KML is associated with the article, why is it a subpage of the talk rather than the article?


 * Thanks for the feedback, Josh. The examples were added fully manually, I mentioned that I intend to put the template before the cats, that would be a technically simple solution. Of course detecting the External links heading is not hard either, but is it really an external link (yes, I guess, if you think about it as links to google maps/bing maps)? The template design is not really my business and not a bot issue. I know there has been some discussion to revise it. There are no subpages in the article space (but yes, that would be the preferable location to me, too) so we had to resort to moving it into talk space. It won't be likely to interfere with article discussions and won't pollute the article namespace. The basic idea here is to keep momentum going for a useful new geocoding device that emerged from an initially heated but now very productive discussion process. All involved people are fully aware that the technical implementation could be better, but we also realize that if we wait for the necessary technical changes in the MediaWiki software (i.e. uploading of xml data on commons) we will have to wait for a looong time and this thing will just die. This is about putting the idea out there, giving it exposure and demonstrating its usefulness. --Dschwen 14:33, 25 February 2012 (UTC)


 * If it makes anybody more comfortable, I do have a record developing, maintaining, and running bots on commons: DschwenBot, VICbot, and QICbot. All those bots are custom developments and have a fairly high edit volume, for fairly complex edits. --Dschwen 20:23, 25 February 2012 (UTC)


 * Ok, I'm pretty much ready for a limited test run on a handful of pages, if that is ok. I programmed the bot to search for an external links section, and if it does not find one, it inserts it either after "Further reading" or "References", whatever is found first. I will skip articles that either already contain an Attached KML template, or that already have an existing KML data page. Articles are just skipped and stored for manual processing if any of the conditions are not met. --Dschwen 20:51, 25 February 2012 (UTC)


 * A map of a county isn't really further reading, and it barely passes as a reference - is it referred to in the article? I suggest, in order:
 * External links
 * Further reading
 * References
 * But I'm one guy. We really need more people eyeballing this.
 * Have you considered adding it to Infobox U.S. County? That seems a more natural fit. If you were to go down that route, I'd suggest creating all the talk/kml pages and then modifying Infobox U.S. County to hard-reference the KML non-optionally.  An entry like:
 * County boundaries on Google or Bing
 * might fit in nicely.
 * Please alert Wikipedia talk:WikiProject U.S. counties to this BRfA, but it looks like it's a pretty quiet wikiproject. I don't foresee any objections, but equally I'm not hopeful on others chiming in. Also raise it at Wikipedia talk:WikiProject United States in the hope that will attract more participation. Josh Parris 21:35, 25 February 2012 (UTC)
 * Ok, will do. And yes, we could add it to infobox county (but I'll leave that to the county guys). Right now the bot is creating a new external links section at the locations I pointed out (it does not just put it into further reading!). --Dschwen 21:40, 25 February 2012 (UTC)
 * I think you should drop a note on USRoads as well. They were discussing doing something similar with Roads, Highways, Interstates, etc. I agree with Josh too, the WPUS and WPUSCounties projects suddenly got very quiet over the past few weeks. --Kumioko (talk) 04:11, 26 February 2012 (UTC)
 * Yes, I know, that is where the whole thing originated. The discussion moved to the GeoProject page and some road people are still very involved. --Dschwen 05:32, 26 February 2012 (UTC)

A technical comment. That stuff really should be in template space, not in talk namespace. Have something like transclude Template:Attached KML/ARTICLENAME. Headbomb {talk / contribs / physics / books} 02:46, 28 February 2012 (UTC)
 * I like that suggestion. I brought it up here. --Dschwen 03:35, 28 February 2012 (UTC)
 * I like the general concept too, except I worry such an arrangement could make it hard to identify attempts to hijack a page's content by injecting what, on the surface, looks to be legitimate transclusion. My first thought is to create a new namespace, perhaps KML.  —EncMstr (talk) 04:00, 28 February 2012 (UTC)
 * It won't pass, there's no need for a new namespace for something that is no different than a template. Headbomb {talk / contribs / physics / books} 04:01, 28 February 2012 (UTC)
 * Colour me stupid, but isn't there a need to pass KML to external websites to render, rather than transclude it into the article page? Josh Parris 04:16, 28 February 2012 (UTC)


 * Quite right. There is no need for transclusion of KML data.  The template  generates external links.  WikiMiniAtlas (click on a coord globe to activate) is a pretty clever applet which retrieves the data from the /KML subpage of the article's talk page.  (See Mojave Desert for examples.)  The point is that nothing should transclude KML data.  So being in template space is not quite correct.  —EncMstr (talk) 04:37, 28 February 2012 (UTC)
 * Well transcluded or not, it's certainly not an article, not a discussion, and the template namespace is the best to handle that sort of thing. Headbomb {talk / contribs / physics / books} 14:22, 28 February 2012 (UTC)
 * Agree, and two other people on the GeoProject also think this is the way to go. Template subpages can be understood as auxiliary material supporting the main template. Template documentation pages are subpages of templates as well, and they are not meant to be included. It is easy to shoot down such a suggestion because it is not the perfect solution, but it certainly is better than putting it on a subpage of article talk. --Dschwen 15:09, 28 February 2012 (UTC)
 * This occurred to me over night: KML would be most suitable in 'File:' space.  They are a lot like an .SVG file, and possibly future development will automatically handle (display, annotate) a KML in the same way a .PNG, .GIF, or .JPEG is handled now.  —EncMstr (talk) 17:05, 28 February 2012 (UTC)
 * This is not going to happen, at least not in our lifetime. Please read 26059. --Dschwen 17:28, 28 February 2012 (UTC)
 * I don't see where not going to happen follows from the conversation there. A stripped-down KML format missing title and heading tags is evidently supported by MediaWiki.  What about the mapping services?  —EncMstr (talk) 17:48, 28 February 2012 (UTC)
 * Either way, I'm perfectly fine with uploading things in the template namespace, at least as a temporary solution, and whenever MW supports KML/KMZ files, the transition to File namespace can be done at that moment. The template should be placed in the External link sections, before any links and above navigational templates. When the template and bots are updated to take this into account, we can move into trial phase. Headbomb {talk / contribs / physics / books} 18:32, 28 February 2012 (UTC)
 * The bot is ready (placement is already performed as you describe). --Dschwen 18:46, 28 February 2012 (UTC)
 * What's the rush for a temporary solution? Let's talk this out and get to the final solution and avoid rework. Josh Parris 02:14, 29 February 2012 (UTC)
 * Unless I'm mistaken, the KML at Talk:Santa Fe County, New Mexico/KML won't trip the <head* filter. I think it should go to ns:File.  But under what name?  Can any further use for KML files be foreseen other than a single use per article? Josh Parris 02:10, 29 February 2012 (UTC)

Trial
Alright, then let's go with a small trial just to make sure things aren't completely broken. Then we'll go for a larger trial to make sure things work fine. Headbomb {talk / contribs / physics / books} 21:39, 28 February 2012 (UTC)
 * Thanks! I have modified the (now renamed) Attached KML template to also accept data at the new location Template:Attached_KML/PAGENAME. The bot will be uploading there and existing KML files can be moved without disruption that way. --Dschwen 22:46, 28 February 2012 (UTC)
 * Hmm, I need advice: Where should the template be placed in Winn Parish, Louisiana for example? --Dschwen 00:42, 29 February 2012 (UTC)
 * Well, per my earlier advice I'd suggest Further reading See also, but also note I'm not crazy about the use of a template. Josh Parris 02:12, 29 February 2012 (UTC)
 * I'm not convinced that it's time for a first trial. I believe there's a number of aspects of this proposal that aren't nailed down, certainly not to my satisfaction. Josh Parris 02:18, 29 February 2012 (UTC)

Ugh, great. Just ran a small test. Looks good to me. The bot is set to only touch articles where it can unambiguously place the Attached_KML template. The generated KML displays fine in the WikiMiniAtlas and on the two proprietary mapping services. I should add the article title into the KML file so it shows up when viewing it in Google Maps. Just please do not suffocate a great idea. --Dschwen 04:41, 29 February 2012 (UTC)


 * To Dschwen - You should upload the markup before putting the template on the article. This way the article cache will be up-to-date after the bot edits the page, and if something goes wrong (Wikipedia in read-only-mode), the article will not have bogus links
 * To Josh - What exactly would need to be nailed down before proceeding, IYO?
 * Headbomb {talk / contribs / physics / books} 16:44, 29 February 2012 (UTC)

Ok, I reversed the edit order (thanks for the suggestion, it makes sense). Thereby I introduced a bug which associated the wrong coordinate data with some counties (St. Charles Parish, Louisiana was given the data of a county in North Dakota). This is fixed now in the code and I re-processed the broken counties. Sorry, but I guess that put me a little above the 10 edits! I also updated the KML to include a link back to the Wikipedia article. Further suggestions? --Dschwen 18:08, 29 February 2012 (UTC)
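The edit order suggested above (data page first, article second) could look roughly like the following sketch. `save_page` is a hypothetical callable standing in for whatever the bot framework uses to write a page; it is assumed to raise an exception on failure. The subpage title follows the Template:Attached KML/PAGENAME location mentioned in the trial section.

```python
def attach_kml(article_title, article_text, kml_text, save_page):
    """Save the KML subpage first, then the article that links to it.

    `save_page(title, text)` is a hypothetical callable that raises on failure.
    """
    kml_title = "Template:Attached KML/" + article_title
    # Step 1: upload the data. If this raises (e.g. the wiki is read-only),
    # the article is never touched, so it cannot point at a missing KML page.
    save_page(kml_title, kml_text)
    # Step 2: only now save the article text that carries the template.
    save_page(article_title, article_text)
    return kml_title
```

Passing the title and its KML text together into one call also makes it harder for the two to drift apart, which is the kind of mismatch described in the comment above.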

Further discussion
I haven't examined the trial. I feel this is a useful task, I have no intention of smothering it; I want it done right the first time.

What I consider the unresolved issues: Josh Parris 21:20, 29 February 2012 (UTC)
 * 1) Destination of the KML (I'm hoping for some discussion of the File: namespace)
 * 2) Technique for linking the article to the KML (I'm hoping some interested editors will chime in regarding a separate template vs an additional field in the County infobox)


 * I do not see the advantages of File: namespace, in File you cannot edit the data. KML is a simple XML file. Quick edits are possible and diffs make sense. KML was proposed as an allowed filetype over 14 months ago, a first patch was provided over 12 months ago but reverted only shortly after. Since then there is the security issue with IE6. Sure, you can craft a KML that won't trigger the IE6 security check, but I can tell you with certainty that KML upload won't be enabled in that state. It would just be confusing to the users if some files are "randomly" rejected. Bottom line still is that File: does not provide significant (or any) advantages over Template: space. --Dschwen 21:54, 29 February 2012 (UTC)


 * The clearest advantage is handling by wikicommons. By referring to KML in the File: namespace, the data can live either on any project language wiki, or on commons and it will be conveniently resolved.  I am pretty sure that is not true of all other namespaces.  KML data is a natural fit for commons.
 * The point about it not being editable is also true of images. For most modification, they have to be downloaded, edited with client PC software, and reuploaded to commons.  Diffing revisions, if it were allowed, would not be meaningful for jpeg, gif, and png media, though it would be handy with svg, and now kml.  Since this technique is breaking new wikiground, it could well drive the needed supporting technology to fruition.  In fact, it is a compelling case for mediawiki development to handle KML in a suitable manner.  —EncMstr (talk) 23:00, 29 February 2012 (UTC)


 * Since the KML data is never included, but only a bunch of links is generated, a template on any language wikipedia may refer to the KML data on the english wikipedia and vice versa. Mediawiki development is really nice, but a bit of wishful thinking right now. It is not a reason to block a solution that is good enough right now. When a better solution has been developed some time in the future it will be trivial to whip up a bot that moves all KML data to commons. But we have something that is useful for the reader right now. We want them to get rich geodata to illustrate wikipedia articles now. For the reader it makes no difference where that data is hosted. They are the customers. Pandering to order and structure loving editors is nice and important to keep the project running smoothly, but here we are talking about a fairly structured solution vs. a perfect solution (and the perfectness is highly debatable!). Please by all means push for a solution you think is better, but do not stop work now, because there is an easy and obvious transition path from what we have now, to what we may get sometime. --Dschwen 23:21, 29 February 2012 (UTC)


 * There are two good points here; Files can't be diff'd, but they can be shared by Commons. I'd lean towards the diff being more important for detecting vandalism. Josh Parris 04:47, 1 March 2012 (UTC)
 * - it can't be shared by Commons at this time. --Rschen7754 04:51, 1 March 2012 (UTC)


 * After jumping through some hoops, I got http://commons.wikimedia.org/wiki/File:NYstateroute308.kml set up. But you are correct:  there is no way to link to it, not from enwiki, and not even from commons.  See item 4 at commons:User:EncMstr.
 * You are also correct that putting the data anywhere it can be used is a very good idea. Even if it has to (and can be) moved later.  Semantically, template space is better than talk subpages, so I rescind my suggestion of File: space.  —EncMstr (talk) 06:17, 1 March 2012 (UTC)
 * It seems resolved: template space is the least worst place to put it. I presume that's been tested? Josh Parris 06:42, 1 March 2012 (UTC)
 * It's live. See Special:PrefixIndex/Template:Attached KML/ for subpages in use (except that /doc is, of course, the documentation subpage).
 * Having agreed that Template subpages are currently better than File pages and uploads, should the subpage name include ".kml" at the end? This might help downloaders assign an appropriate file extension (or mime type). It might also make KML pages easier to identify by a simple rule independent of the template (similar to the way .css and .js pages are currently treated specially in certain namespaces by virtue of the pagename). At present, the subpages are all named identically to the mainspace articles. But Attached KML could easily be modified to append ".kml" to the name of the template subpage used in the external links. We could change this now before there are very many subpages to move.
 * In other words, should we move each existing "Template:Article KML/articletitle" to "Template:Article KML/articletitle.kml"?
 * — Richardguk (talk) 16:15, 1 March 2012 (UTC)
 * Can you please discuss details like this at the template page? It does not seem appropriate for this venue, and won't get any attention by the right people who are involved with the template. --Dschwen 16:12, 5 March 2012 (UTC)
 * Thanks for the feedback. I was surprised how little interest there has been in previous threads on the template talkpage, compared with the move to template space discussed here, but I agree that Template Talk is the logical place.
 * Subpage renaming suggestion moved to Template talk:Attached KML.
 * — Richardguk (talk) 03:09, 6 March 2012 (UTC)

Infobox or template?
Now the hard one - aesthetics. I prefer a modification to the infobox (which has the upside of the article not needing modifying at all, making a much simpler, quicker bot), but other involved editors may have a preference to make the links in a separate template, placed at some point in the article. Josh Parris 06:42, 1 March 2012 (UTC)


 * Perhaps I don't understand the question, but shouldn't *both* be implemented? The infobox to do the majority of existing articles, and a template for articles lacking a supported infobox, or for providing alternate data, or layers, or what-have-you which has not been thought of yet. —EncMstr (talk) 07:16, 1 March 2012 (UTC)
 * Are there any counties that don't have the county infobox? Why?
 * Alternate data? Layers?  Please explain these possible future needs. Josh Parris 10:56, 1 March 2012 (UTC)
 * Such discussion would be better on, say WT:GEO, or a more specific project sub-page, with pointers posted on affected project, MoS and template pages. Andy Mabbett ( Pigsonthewing ); Talk to Andy; Andy's edits 12:13, 1 March 2012 (UTC)


 * I agree there is a need for such a discussion before we can proceed with more extensive testing. Specifically, the discussion should deal with these questions:
 * Would the best place for these links to be in an infobox (presumably through a kml parameter), or in the external links section, or somewhere else?
 * Assuming the preferred place for these links is the infobox:
 * What happens if a county does not have an infobox?
 * Should the bot place it in the External Links section (or failing that, the See Also section) as a temporary solution?
 * What if those sections don't exist? Can the bot create them?
 * Should the bot wait for editors to place an infobox, so things are done right the first time around instead of requiring cleanup in the future?
 * Can the bot create the infobox if it's missing?
 * Assuming the preferred place for these links is the External Links section:
 * What if the External Links section doesn't exist? Should the bot use the See Also section if it's there? Or can it create an External Links section? What if there is no See Also section? Headbomb {talk / contribs / physics / books} 14:50, 1 March 2012 (UTC)


 * Headbomb {talk / contribs / physics / books} 14:45, 1 March 2012 (UTC)
 * Currently the bot places it at the top of the external links section. If it finds neither external links nor further reading, but does find a reflist template, it creates a new external links section right after the reflist (please read closely what I wrote, it is complicated!!!). The reason is the suggested section order in the MOS. I currently have no solution to find the end of a further reading section (it can be a dangling end with navigational templates and categories under the same heading). I could scan the section for a *-list and assume the end of the *-list is the end of the section. --Dschwen 15:41, 1 March 2012 (UTC)
 * If the box continues to float right, I suggest putting it at the start of these kinds of sections.
 * I see you're pushing along discussions relevant to this BRfA elsewhere; I can't see anywhere any consideration of links inside the infobox instead of, or in addition to, the separate template. Please make an effort to form a consensus for the template-box over inline infobox links; I don't feel the community has given it due consideration. Josh Parris 00:32, 11 March 2012 (UTC)
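The placement rules described in this thread can be sketched as below. This is only an approximation under stated assumptions: headings are matched with simple regular expressions, the template call is taken to be a bare `{{Attached KML}}`, and the function name `place_template` is invented for the example. Returning `None` stands for "skip and store for manual processing".

```python
import re

TEMPLATE = "{{Attached KML}}"  # assumed form of the template call
FLAGS = re.IGNORECASE | re.MULTILINE


def place_template(text):
    """Return the article text with the template added, or None to skip."""
    if re.search(r"\{\{\s*Attached[ _]KML", text, re.IGNORECASE):
        return None  # article is already tagged
    heading = re.search(r"^==\s*External links\s*==[ \t]*\n", text, FLAGS)
    if heading:
        # Top of the existing External links section, before any links.
        pos = heading.end()
        return text[:pos] + TEMPLATE + "\n" + text[pos:]
    if re.search(r"^==\s*Further reading\s*==", text, FLAGS):
        # The end of this section is hard to detect reliably; leave it
        # for manual handling in this sketch.
        return None
    reflist = re.search(r"\{\{\s*reflist[^{}]*\}\}[ \t]*\n?", text, re.IGNORECASE)
    if reflist:
        # No External links section yet: create one right after the reflist.
        pos = reflist.end()
        return text[:pos] + "\n==External links==\n" + TEMPLATE + "\n" + text[pos:]
    return None  # no unambiguous placement found
```

Whether a Further reading section should be skipped or scanned for the end of its bullet list is left open in the discussion; the sketch takes the conservative option.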

Hold the phone!
A new development may make this bot request superfluous. The data from the WIWOSM project is already being displayed in the WikiMiniAtlas and it will only be a matter of time until the tagging of the OSM database gets to U.S. counties. Then they will appear on the map automatically. --Dschwen 23:05, 15 March 2012 (UTC)
 * So, do you want to continue pursuing this BRFA? Josh Parris 23:43, 22 March 2012 (UTC)
 * No, forget it. Check out the WMA in Fiji for example. I have a better source for geometric overlay data now. --Dschwen 23:48, 22 March 2012 (UTC)

— HELL KNOWZ  ▎TALK 12:20, 24 March 2012 (UTC)


 * The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.