Wikipedia:Wikipedia Signpost/2013-05-27/WikiProject report

This week, we plotted out the demarcations of WikiProject Geographical Coordinates, which aims to create a single standard of handling coordinates in Wikipedia articles. We talked to Jim.henderson, The Anome, Dschwen, Andy Mabbett, and Backspace.


 * What motivated you to join WikiProject Geographical Coordinates? Do you tend to spend more of your time geotagging articles or working behind the scenes on the templates and applications that provide and use geodata?


 * Jim.henderson: I've loved maps for over half a century. Road maps, contour maps, rainfall maps, whatever. I always like to know where I'm at and which way I'm looking. This meshes with my interest in astronomy. I tag many pictures and some articles, and know nothing of writing apps or templates.
 * The Anome: I thought that it might make Wikipedia more useful, and it might be fun to do. The expectation that coordinates are an essential feature of a well-written article also encourages better fact-checking by editors. I've put in a lot of work on coordinate discovery, data fusion, and bot-based maintenance of coordinates using my bot account.
 * Dschwen: Geographical coordinates and the interactive services linked to them, such as my project the WikiMiniAtlas, are a great example of what sets an online encyclopedia apart from the print medium. It adds a layer of crosslinking between articles through proximity on a map. This is great for researching areas for a trip for example.
 * Andy Mabbett: I first became involved as I wanted to add the geo microformat to the way we display coordinates. That led to the creation of Coord, and I was the liaison between the Wikipedia community and Google in getting it recognised for the Wikipedia layer in its maps. I quickly realised the usefulness of our coordinate data for a multitude of similar and other purposes. I've been involved in adding coordinate features to a number of templates and forging links between our articles and the corresponding entities on OpenStreetMap. I've also been involved in developing methods for displaying lists of coordinates in articles about linear features, such as those in Netherton Tunnel Branch Canal; see LINEAR.


 * How important is the geographical coordinate system used by Wikipedia? What use does this data have outside Wikipedia? Is the project anywhere near its goal of establishing a system akin to the ISBN system used in the publishing industry?

 * Jim.henderson: I have no idea what this achievement would look like or how to get there. Probably readers get some use in opening an online map to understand the environment of an article, but I've accumulated so many maps in my head that my main use of Wikicoords is to check other Wikicoords.
 * The Anome: It's already useful for finding articles for things near your location, which is fun, and for finding articles for things that are near another article. However, Wikipedia only has a very simplistic idea of geographical data, essentially just mapping the article to a point, with no idea of precision, scope, or provenance.
 * I think the really interesting things are yet to come: cross-referencing Wikipedia's articles, OpenStreetMap's geographical features and semantic relationships (such as those exposed by their "Nominatim" tool) with Wikidata and other semantic data repositories opens the possibility of making a fantastic geodata resource cross-linked with information that cannot be reduced to geographical terms.
 * However, I don't think it's Wikipedia's job to provide globally unique identifiers, nor do I think that one single system can accommodate something as blurry as geographical data, which necessarily involves human-created distinctions everywhere. In any case, many people have had a go at this already, with their own private identifiers within their own databases. I'd rather that we participated in some geographical equivalent of VIAF, for geographical entities instead of people. VIAF apparently intends to cover geographical features, but I haven't found anything useful for our purposes: if I'm missing something here, I'd appreciate hearing about it! This is something that I hope that the Wikidata project may be able to help us with.
 * Dschwen: We already have crosslinking with OpenStreetMap objects. In our interactive maps, OSM objects corresponding to the current article are highlighted. We know Google uses the geotagging information from Wikipedia on its maps, and mobile applications, where location-based services make a lot of sense, are just beginning to use the data, for example for proximity searches. Wikipedia does have the tools for more complex geographical data, such as the Attached KML template, which is in use in hundreds of articles and allows editors to embed complex geodata that can be visualized as map overlays.
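
The proximity searches that The Anome and Dschwen mention can be illustrated with a small sketch. It is not how Wikipedia or any mobile application actually implements the feature; it only assumes that each article carries a single latitude/longitude point and uses the haversine great-circle formula on a spherical Earth. The function names, the sample article list and its approximate coordinates are all hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a spherical-Earth approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearby(articles, lat, lon, radius_km):
    """Return (distance_km, title) pairs within radius_km, nearest first."""
    hits = []
    for title, (a_lat, a_lon) in articles.items():
        d = haversine_km(lat, lon, a_lat, a_lon)
        if d <= radius_km:
            hits.append((d, title))
    return sorted(hits)


# Hypothetical sample data: approximate coordinates, for illustration only.
ARTICLES = {
    "Empire State Building": (40.7484, -73.9857),
    "Statue of Liberty": (40.6892, -74.0445),
    "Eiffel Tower": (48.8584, 2.2945),
}

if __name__ == "__main__":
    for dist, title in nearby(ARTICLES, 40.7484, -73.9857, radius_km=10):
        print(f"{title}: {dist:.1f} km")
```

A linear scan like this is only reasonable for a handful of points; a real service would presumably use a spatial index, which is outside the scope of this sketch.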


 * Are Wikipedia's geographical coordinates easily accessible to the public? How difficult is it to add coordinates to an article? Do the coordinates in articles ever need to be corrected?


 * Jim.henderson: It's a pain in the butt for anyone not a hardcore coordfreak like me to figure how to make additions or corrections. Offline, graphical programs (I use MS Pro Photo Tools 2) work easily and precisely but online you have to paste or otherwise deal with numbers. Which I do, most every day, because many Wikicoords put a building or other object on the wrong side of a street, and a few that belong in New York or Pennsylvania show up in Kyrgyzstan or Sinkiang. Or Patagonia.


 * The Anome: Yes, it's not uncommon -- I habitually click geocoordinate links when I visit articles for other reasons, just to see whether their coordinates make sense.


 * Dschwen: Adding coordinates involves adding a template, which probably still is too complicated for the average user. However, the preponderance of geographic coordinates on Wikipedia demonstrates that we have a sufficient number of advanced users who are up to the task. For the end user, the coordinates seem to be hard to discover, and users who do notice them in the article often do not realize that they are more than decoration and come with a lot of functionality, such as interactive maps and the geohack page with links to dozens of external mapping resources.


 * Backspace: It's not difficult to add (or to correct) geocoordinates at all. I have done perhaps thousands of them. The difficulty, of course, is in finding accurate sources. When I input a coordinate, I always check the linked-to map to see where it actually goes, because some sources, even "official" ones, are sometimes just flat-out wrong. If I cannot find another source, I will then ignore the bad coordinates and post nothing.
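
Jim.henderson's examples of New York or Pennsylvania places turning up in Kyrgyzstan, Sinkiang or Patagonia are what one would expect if the sign (hemisphere) of a longitude or latitude were dropped; that explanation is an assumption here, not something the interviewees state. Under that assumption, a sanity check could be sketched as follows. The function names and the rough bounding box for New York State are hypothetical and approximate.

```python
def in_box(lat, lon, box):
    """True if (lat, lon) lies inside box = (south, west, north, east), in decimal degrees.

    Assumes the box does not cross the antimeridian.
    """
    south, west, north, east = box
    return south <= lat <= north and west <= lon <= east


def suggest_sign_fix(lat, lon, box):
    """Return the first of the four sign variants of (lat, lon) that falls inside box.

    The unchanged coordinate is tried first, so a correct coordinate is returned as-is.
    Returns None if no variant fits, meaning a sign error alone does not explain it.
    """
    candidates = [(lat, lon), (lat, -lon), (-lat, lon), (-lat, -lon)]
    for cand_lat, cand_lon in candidates:
        if in_box(cand_lat, cand_lon, box):
            return (cand_lat, cand_lon)
    return None


# Rough, hypothetical bounding box for New York State: (south, west, north, east).
NEW_YORK_BOX = (40.4, -79.8, 45.1, -71.8)

if __name__ == "__main__":
    # Longitude entered as east instead of west: lands in Central Asia.
    print(suggest_sign_fix(40.75, 73.99, NEW_YORK_BOX))
    # Latitude entered as south instead of north: lands in southern South America.
    print(suggest_sign_fix(-40.75, -73.99, NEW_YORK_BOX))
```

A check like this can only flag candidates; as Backspace says, the corrected point still needs to be verified on a map.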


 * Do different databases and mapping services provide conflicting geographic information? Should some sources be trusted more than others? How does the project determine where something is really located?


 * Jim.henderson: Everybody's databases are full of errors for parks, monuments and prominent buildings. Google landmarks, Bing Streetside View, NRHP, HMDb, Wikicommons, they all get many things wrong. Sometimes most of those agree and the one that disagrees is right or not as far wrong. Roads and active train stations have fewer errors. My bible is Google Earth's satellite photos, supplemented by historicaerials.com for vanished features and Google Streetview which has good precision in cases where the other sources can correct its baseline errors. Sometimes for local objects I hop on my bike and pedal out there. Sometimes you have to be a noodge.
 * The Anome: Jim is right: everybody's databases are full of errors, some more than others, and none of them can be trusted all the time on their own without cross-checking with others. Wikipedia isn't perfect, either, but the more eyes we have on the problem, the more likely errors are to be found and fixed, for everybody's benefit.
 * Dschwen: Associating point coordinates with area-like objects (cities, countries, lakes, etc.) or line-like objects (motorways, rivers, train lines) always has a degree of arbitrariness to it. We have a bunch of conventions on what points to tag, but there is also a lot of controversy with other on-wiki projects which oppose point-like tagging of non-point-like features. The collaboration with OpenStreetMap should help. As for reliability of sources, I hope people will go out in the field with their GPS where they can.
 * Andy Mabbett: Of course sources vary. We have the advantage, for many features, that they can be checked against reliable maps - though this is less easy for historic sites. There is, however, a difference between giving the exact pinpoint location of, say, a small building, and a point that enables a more nebulous concept like the area of a battle or the region traversed by a long-distance road, to be found in a map.


 * Where do you foresee the future of geographic data heading? What new features would enhance the usefulness of this data? Is the system capable of being adapted to other types of data?


 * Jim.henderson: I'm too much the geomonomaniac to see into any of that. Umm, come to think of it, better integration among Wikipedia Mobile, the photographic and geographic aspects of Commons, and Google Maps for Android, would make all of these more useful, especially to me and my quest to illustrate the many unillustrated or badly illustrated articles.
 * The Anome: As I said above, the future is data fusion between many disparate sources, and geographical data is just one more way to tie things together. I suspect that the long-term future of the Wikipedia geocoding project will be as part of Wikidata -- managing that transition will be a big undertaking that at the moment we haven't even begun to address.
 * Dschwen: Adding map resources for historical map data will be immensely useful. For example through reprojection and overlaying of old maps onto contemporary maps (http://mapwarper.net), explicit generation of KML data with a temporal axis (http://mapstory.org) or through encoding of time data into OpenStreetMap (check out the start_date and end_date tags on buildings for example). To have a map that reflects the status of the world at the time period relevant to the current article, or even to have maps animated with a time slider, displaying changing political borders, glacier melting, drying lakes (Aral Sea) would add tremendous educational value.
 * Andy Mabbett: I see two important, and related, improvements in the near future: the hosting of coordinate data in Wikidata; and greater integration with OpenStreetMap. The latter will see the use of OSM maps in articles; and closer ties between Wikipedia articles and items mentioned in articles (or their Wikidata equivalents) and entities in OSM. For example, a building, lake, road or other feature about which we have an article has a unique identifier in Wikidata and a unique identifier in OSM. We have a wonderful opportunity to work with the OSM community (and, indeed, others) to tie those identifiers together; to make clear that they are about the same thing; and to make sure that similar objects are not conflated.


 * What are the project's most urgent needs? How can a new member help today?

 * Jim.henderson: Newbies can fix coordinate errors the same as other errors. Click a place you know, and if it's wrong, squawk in Talk. When oldtimers like me don't respond, study the numbers in the locator template and adjust them a few times until the map starts to come out better instead of worse. If this looks like fun, join the rest of us who are similarly afflicted.
 * The Anome: I agree with Jim above about fixing things. There are also still around 160,000 articles marked as candidates for geocoding which are still missing coordinates, most of which have been categorized into per-country categories: see Category:Articles missing geocoordinate data by country. Tracking down coordinates for them can be quite entertaining, particularly as a coffee-break distraction from another task. The good news is that we've been doing really quite well at managing the backlog over the last few years, particularly when considered as a proportion of Wikipedia's ever increasing size: see here for a record of past performance.
 * However, it's still a vast backlog. We could really do with an outreach program to recruit editors to help, particularly those with local knowledge, and to get better interwiki coordination on coordinates-gathering. Although I understand the rationale for not cluttering up articles with maintenance categories by default, making it easier for casual editors to un-hide/un-fold "hidden categories" on pages (which first of all requires them to know that such things exist) would be a really useful feature for this, and many other maintenance projects. Finally, it would be really good if we could get access to more machine-readable public domain geographical data -- anything which can be done to get database owners to free their data would do a great service not only for Wikipedia, but for the global community.
 * Dschwen: Actually I think our biggest challenges are social ones. Despite being around for quite a long time, there is still a lack of acceptance towards geographical data coding in some sub-projects. Point data gets removed from articles because the coded objects are line-like (which destroys contributed information). Geodata is perceived as database data rather than encyclopedic data, and the weird strings of numbers are seen as just confusing the reader. We will have to work more on conveying the benefits of geodata by adding and improving the user interfaces to add, edit and visualize the data. WikiMiniAtlas is a start, but to broaden the use and improve scalability, the geodata processing and map rendering will have to move from the underpowered toolserver onto Wikimedia Foundation infrastructure.
 * Andy Mabbett: Editors can do three things: learn how to add coordinates to articles, learn to edit Wikidata, and learn to update OpenStreetMap. Once they've done one or more of them, they can contribute to the discussions of how we make and apply the improvements I refer to above.

Next week, we'll invade Europe with a well-known WikiProject. Until then, operate within the boundaries of our archive.