Wikipedia:Wikipedia Signpost/2013-02-11/WikiProject report

This week, we got the details on WikiProject Infoboxes. Started in January 2007, the project seeks to make infoboxes look consistent across Wikipedia's articles and provide tools for editors to create new infobox templates when the need arises. The project's efforts are greeted by enthusiastic data miners employing the microformats included in many infoboxes, while criticism is flung by some WikiProjects where the use of infoboxes has created controversy. The work done by WikiProject Infoboxes impacts projects covering nearly every topic, from bioboxes for people to taxoboxes for species to detailed route diagrams for transportation. We interviewed Andy Mabbett (Pigsonthewing), Chris Cunningham (Thumperward), kosboot, Sameboat, Van (Vanisaac), and Daniel Mietchen.


 * Why are infoboxes beneficial for Wikipedia articles? How can infoboxes be used outside of Wikipedia? What purpose does WikiProject Infoboxes play in improving the infoboxes used throughout the encyclopedia?
 * Andy Mabbett: The benefits of infoboxes include:
 * A quick and convenient summary of the key facts about a subject in a consistent format and layout
 * Emission of machine readable metadata
 * Infoboxes about people, places, buildings, organisations, products, species and dated events (battles, sports fixtures, record releases, etc.) and more emit microformats; see microformats
 * Data is made available to third party tools such as DBpedia and Freebase
 * Forthcoming integration with Wikidata
 * Chris Cunningham: As to what the WikiProject does, its aim is to make infoboxes simpler to create and maintain and to give them a simple and consistent appearance which is as accessible as possible for those accessing the data through means other than a graphical Web browser (to that end, there's an overlap with WP:WPACCESS).
 * Andy Mabbett: Also see WikiProject Accessibility/Infobox accessibility.


 * The use of infoboxes has not been formally standardized across Wikipedia, inspiring multiple competing essays and becoming a subject of controversy for some WikiProjects. Why have infoboxes stirred such fierce debate? What can be done to soothe concerns about infoboxes?
 * Andy Mabbett: Essays decrying infoboxes represent the views of a small but vocal minority of editors. Wikipedia has well over a million infoboxes; that demonstrates wide community support for them.
 * Van: More to the point, essays on unhelpful infoboxes focus on the bad implementations, which only goes to exemplify how helpful the vast majority of infoboxes can truly be. When an editor tries to simplify overly complex information, eliminates nuance, and uses an infobox to avoid actually writing the article, you end up with unhelpful, poor articles - the same outcome as if the editor only focuses on images. Ignoring the actual writing will always result in a bad article, whether you spent your effort on filling out an infobox or not.
 * Chris Cunningham: In my opinion the degree to which infoboxes are supposedly controversial is significantly overestimated. With regard to the point that infoboxes are often developed to the detriment of the article itself, that is simply because editing an infobox is often the lowest barrier to entry when editing an article. I'm heavily involved in a WikiProject which deals with tens or hundreds of thousands of BLPs, and we rely extremely heavily on casual editors to keep them up-to-date; a great deal of that work is done through simple infobox updates. Without a simple, consistent entry point to articles like that, we simply wouldn't get those edits, in my opinion.
 * kosboot: I participate in two projects, WP:OPERA and WP:CM, where a majority of active participants are vociferously anti-infobox. I think the major problem on Wikipedia regarding infoboxes is that there is not a clear explanation of their purpose. Their purpose is not (or should not be) aesthetic; their fundamental purpose is setting the groundwork for making Wikipedia a repository of structured data in preparation for the Semantic Web. In that regard, they are as important as any markup on a Wikipedia page; in other words, they should be mandatory. I don't know how this can be emphasized enough other than a reminder on every page that what an editor does is for the future of the web.


 * There is no shortage of infoboxes available to editors working on articles. Why are there so many different types? Are there any infobox templates that could be considered "typical" or "standard"? To what templates should a contributor look for inspiration when building a new infobox template?
 * Andy Mabbett: Because anyone can edit Wikipedia! Work continues to merge overly-similar infoboxes, and delete those which are redundant. We need to better educate editors that infoboxes should not be forked just because a minor change is required. The best infoboxes are based on the Infobox framework and do not unnecessarily override its default style.
 * Chris Cunningham: One of my pet projects has been the introduction of a "module" system for infoboxes which allows smaller "sub-infoboxes" to be plugged into common bases such as infobox person. This allows all common biographical detail (such as birthplace, family information, education and so on) to be kept in one place, with simple additional pieces added in for career information and such as required. And on a simpler level, a great many of our infoboxes (such as infoboxes on different types of buildings, or on towns or railway stations in different countries) are pretty much redundant to one another. A huge amount of work in consolidating these templates has already been done, but we've still got a long way to go.


 * Many articles dealing with transportation include a route diagram template in their infobox. How did this template and its graphical style come into existence? Has there been any collaboration across languages? What can be done to improve this template and route maps in general on Wikipedia?
 * Sameboat: The route diagram template (RDT) project was started by German Wikipedians in 2006 and brought to the English Wikipedia in 2007. Technically speaking, the projects in the two languages continue to develop the template code independently. However, since all the icons used to compose the maps are shared on Wikimedia Commons, Wikipedians from different projects collaborate to create new icons. Speaking objectively, the current English Wikipedia RDT is more advanced than the German one because of its icon overlaying function. This means that if someone transwikis a map that uses the overlaying function from the English Wikipedia to the German Wikipedia, either the map has to be redone or new icons have to be created, and the chance of those icons being reused in other maps could be very small. The other problem with RDT that has long concerned me is accessibility for visually impaired readers. Although there was an experiment to implement the alt attribute for each icon individually, it is extremely impractical to write alt text for the more than 12,000 RDT icons that a search returns. Another problem related to visual accessibility is that the colors used to distinguish heavy rail (red), metro/light rail (blue) and unused/under-construction lines (lighter colors) could be very confusing to color-blind readers. I don't have any color vision deficiency myself, so this is where I would like comments from color-blind readers on RDT maps that mix icons of more than one shade (for example, East London Line original RDT).


 * What difficulties arise from biographical infoboxes? Are some fields in an infobox more necessary than others? What impact has the proliferation of infoboxes about people in different professions had on the consistency of Wikipedia articles?
 * Van: Biographical infoboxes suffer from the same concerns as biographies in general: unsourced/poorly sourced materials and WP:BLP violations will always be a problem. Many infobox fields are completely inappropriate for some subjects, as different people's lives, work, and relationships vary greatly in complexity. The proliferation of personal infoboxes is equivalent to the proliferation of other kinds of infoboxes: as a WikiProject, one of our goals is to recombine infoboxes forked for spurious reasons and eliminate the redundancies; but different fields of biography may imply vastly different pieces of relevant information, meaning a different infobox for a Nobel Peace Prize winner and an Academy Award-winning director.
 * Chris Cunningham: In general, Wikipedia articles become more consistent with one another over time, and infoboxes on BLPs are no exception to this. I've noticed a staggering improvement in this area over the last five years or so. With regard to specific details being more or less important in particular cases, views differ on whether this is a matter of style to be enforced socially or a matter of policy to be enforced technically. Particular infoboxes have moved in both directions over time.


 * How well do geographic infoboxes fulfill their purpose? How do editors determine what information should be readily available to readers at a glance and what information can be left in the article's paragraphs of text?
 * Van: Geographic infoboxes are some of the most effective and generally well-executed infoboxes in the entirety of Wikipedia. The project of implementing microformats means that large amounts of data can be machine extracted, and will allow for encyclopedic details across a large set of geographic entities to be collated, searched, and compared. For users, geographic infoboxes enable quick finding of salient details, so that anyone can accumulate a data set for their personal use.
 * Chris Cunningham: I still feel we have a great deal of work to do in reducing the massive amount of redundancy in per-country infoboxes. However, this leads to another problem in that our base templates in this area are extremely complicated due to the huge number of features and edge cases to be catered for. I'm not sure to what degree we're going to be able to tackle this in the near future, as geographical articles are probably the most common ones to be created en masse through database extraction and as such the work required is staggering.


 * Are some scientific articles better suited for the use of infoboxes? What benefits and limitations do taxonomy infoboxes (taxoboxes) bestow upon articles about different species? Likewise, what impact do chemical infoboxes (chemboxes) have on articles covering chemical compounds?
 * Andy Mabbett: Many scientific articles are well-suited to infoboxes. The Taxobox emits a 'species' microformat.
 * Chris Cunningham: Taxobox is the ancestor of the infobox system, and as such it's got a great deal of inertia behind it. I previously worked on bringing it more into line with what we typically expect from modern infoboxes, but that work wasn't completed. In a way it's somewhat the opposite of chembox, which provides a dazzling array of information and is frequently (indeed, perhaps mostly) the primary focus of the articles that contain it. The degree to which a given article suits an infobox depends primarily on how much comparative information there is on it: chemical properties can be compared across a huge range of articles, for instance, and all animals have taxonomy data. Conversely, articles on things like engineering practices are poor fits for infoboxes.


 * How do you see infoboxes evolving in the future? What new features still need to be developed? How can a new contributor help today?
 * Andy Mabbett: More non-standard infoboxes should be migrated to the Infobox framework. Parameter names need to be standardised (for instance, different infoboxes use URL, website, homepage to mean the same thing). Similarly, the community needs to agree sets of standard parameters, for example for biographies, so that we don't have, say, spouse for actors, but not musicians.
 * Daniel Mietchen: I am looking forward to templates - and thus infoboxes - becoming more integrated across languages and perhaps projects. Other than that, I would like to see more multimedia elements in taxoboxes, when it makes sense. For instance, many animals produce sound, but taxoboxes currently do not have a field for sound files (or videos, for that matter).
 * Chris Cunningham: In terms of growing the project, a visual editor for simple updates to infobox data fields would be one of the biggest boons we could hope for as regards casual contributions. I dearly hope this is accounted for in the present plans for visual editing. As for how editors can help with infoboxes today, there are still plenty of redundant templates that could be consolidated; that has been the major focus in this area for years, and we're not going to be done any time soon.
 * Andy Mabbett: ...not to mention the many articles lacking infoboxes, to which one can be added by any editor!
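
The point about standardising parameter names (URL, website and homepage all meaning the same thing) can be illustrated with a minimal sketch. This is only an illustration under stated assumptions: the alias table and the function name `normalise_params` are hypothetical, and nothing here reflects an agreed Wikipedia standard or how the Infobox framework actually handles parameters.

```python
# Hypothetical alias table: the grouping below is an assumption made for
# illustration, not an agreed set of standard infobox parameters.
ALIASES = {
    "url": "website",
    "homepage": "website",
    "website": "website",
}


def normalise_params(params):
    """Return a copy of params with known aliases mapped to one canonical name.

    Parameter names are compared case-insensitively; unknown names are
    kept, lowercased and stripped of surrounding whitespace.
    """
    result = {}
    for name, value in params.items():
        key = name.strip().lower()
        canonical = ALIASES.get(key, key)
        # If two aliases collide, keep the first non-empty value seen.
        if canonical not in result or not result[canonical]:
            result[canonical] = value
    return result


if __name__ == "__main__":
    # Two infoboxes that use different names for the same piece of data.
    actor_box = {"name": "Example Actor", "URL": "http://example.org"}
    band_box = {"name": "Example Band", "homepage": "http://example.com"}
    print(normalise_params(actor_box))
    print(normalise_params(band_box))
```

After normalisation, both example infoboxes expose the link under the single name `website`, which is the kind of consistency that would make the data easier for both editors and third-party tools to work with.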

Next week, we'll visit one of the transportation projects included in the route diagram above. Until then, draw squiggly lines in the archive.