User:RexxS/Infobox factors

In anticipation of a request to amend WP:INFOBOXUSE, I'm trying to brainstorm all of the factors that ought to be considered when approaching the decision on an article having an infobox, and if so, what it should contain.

Background
At present the only guidance that exists on whether an article should have an infobox is:
 * "The use of infoboxes is neither required nor prohibited for any article. Whether to include an infobox, which infobox to include, and which parts of the infobox to use, is determined through discussion and consensus among the editors at each individual article."
ArbCom repeated the above guidance as a finding-of-fact at WP:ARBINFOBOX.

ArbCom also issued a remedy at WP:ARBINFOBOX:
 * "All editors are reminded to maintain decorum and civility when engaged in discussions about infoboxes, and to avoid turning discussions about a single article's infobox into a discussion about infoboxes in general."
In December 2013, in response to my request that the "decorum and civility" remedy be enforceable, the reply was: "reminders, just like admonishments, cannot be enforced directly; it's, of course, possible to ask for an amendment to the original case so that either an editor can be placed under a remedy which *would* then be enforceable or discretionary sanctions are authorised, but rebus sic stantibus hypothetical violations of the "editors reminded" remedy cannot lead to restrictions under our delegated authority, though it's certainly possible for an individual admin or the community to exercise their power to restrict users editing disruptively."

In February 2014, of clarifying what they understood by "discussion and consensus among the editors at each individual article".

As is obvious, all of that has proven to be insufficient to deal with the problems that arise when the issue of an infobox is raised in certain areas of the encyclopedia. The result of the guidance and the ArbCom injunction is that similar debates erupt in multiple articles with many of the same disputants engaging in behaviour and rhetoric that flagrantly breaches ArbCom's requests.

Stewardship
There exist at least two groups of editors who (a) are prolific content contributors with a strong enough interest in certain topics to take many of them to Featured Article status; and (b) believe that articles within their purview are generally unsuitable for an infobox. Because they do their best to steward the articles that they have invested so much time in, they may come to resent other editors regularly asking for those articles to have an infobox. This is where an asymmetry appears between the two positions: in these cases, the anti-infoboxers have invested a lot of time and effort in the articles, whereas the pro-infoboxers are now quite often editors who are unaware of the previous conflicts and are coming to the article in question for the first time.

It is a burden to explain multiple times the reasoning for a decision to exclude or remove an infobox, but failure to do so by reference to Wikipedia policies or guidelines runs the risk of straying beyond stewardship into ownership. It must be exasperating to lay out reasoning time after time, and it is not surprising that it sometimes leads to incivility and frustration.

The question therefore arises, "To what extent should an article's principal authors have a greater say in the decision to have an infobox than any other editor?" If the answer is "completely", then we run counter to WP:5P3. If the answer is "not at all", then how do we avoid or at least reduce the burden of stewardship in having to defend the same decision multiple times across multiple articles when our guidance insists on article-by-article discussion?

One solution might be a "compact": when the question of an article's infobox has been discussed amicably, then the issue may not be raised again within an agreed period, perhaps 12 months.

Another solution might be to authorise any uninvolved admin, during an infobox discussion, to strike any comment that they judge to be uncivil and ban the commentator from any further discussion on that talk page until the discussion is closed. Of course there would be disagreements about "uninvolved" and appeals, but perhaps common sense might prevail and those admins with an axe to grind might be persuaded to stay clear of such enforcement actions. Maybe we need a panel of "neutral" admins? or volunteer bureaucrats, or functionaries? Who knows what might work?

Third-party re-use
"Imagine a world in which every single human being can freely share in the sum of all knowledge. That's our commitment." - Vision

Our content is distributed by many means: directly on desktop and mobile; through mirrors and forks; translated into other languages; delivered on CD and Kiwix/memory stick; complete database dumps; scraped by automated programs and re-packaged for specialist audiences and uses. Some folks just read it. Nevertheless, if we're going to realise our vision, we have to embrace all the other delivery methods besides the obvious one.

For better or worse, plain English text is not the most convenient, nor most efficient way of fitting into other delivery methods. The question then arises "To what extent should we compromise on writing the encyclopedia in plain English in order to accommodate third-party re-users?"

Some will say "as much as possible - make it easy for third-parties"; others will say "not at all - let the third-parties sort it out themselves"; many will probably conclude that somewhere in-between is most appropriate. Without some consensus on the value of third-party re-use, the value of an infobox to any article will not be agreed upon by editors who find themselves at different ends of that scale.

Microformats
Infoboxes incorporate microformats in many fields. These provide a standardised means of associating common items of data such as full name, date-of-birth, coordinates, region, etc. with their respective values in the article, in such a way that they can be read automatically by third-party tools such as Yahoo! Query Language.

That microformats provide an advantage to some third-parties is indisputable, but the perceived worth of that to Wikipedia editors varies enormously.

Structured data
Almost all of the information in an infobox is stored in a table with two columns. This is populated by labels and values, which conveniently correlate with attribute–value pairs, a fundamental data structure. This allows third-parties to build tools which can read these pairs from an infobox and collect the data, even from fields that have no microformat. This is sometimes referred to as "scraping".
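
As a rough illustration of the attribute–value idea, here is a minimal Python sketch of how a third-party tool might read label/value pairs out of a two-column table. The sample HTML, the person and the class and function names (`InfoboxPairs`, `scrape_infobox`) are all invented for this example; real infobox markup is more varied, so this is a sketch under those assumptions rather than a working scraper for Wikipedia pages.

```python
from html.parser import HTMLParser


class InfoboxPairs(HTMLParser):
    """Collect label/value pairs from rows shaped like <tr><th>label</th><td>value</td></tr>."""

    def __init__(self):
        super().__init__()
        self.pairs = {}
        self._cell = None  # "th" or "td" while the parser is inside a cell
        self._label = []
        self._value = []

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            # A new row starts: forget the previous row's text.
            self._label, self._value = [], []
        elif tag in ("th", "td"):
            self._cell = tag

    def handle_data(self, data):
        if self._cell == "th":
            self._label.append(data)
        elif self._cell == "td":
            self._value.append(data)

    def handle_endtag(self, tag):
        if tag in ("th", "td"):
            self._cell = None
        elif tag == "tr":
            label = "".join(self._label).strip()
            value = "".join(self._value).strip()
            # Rows without both a label and a value (e.g. a title row) are skipped.
            if label and value:
                self.pairs[label] = value


def scrape_infobox(html):
    """Return a dict mapping each label to its value (hypothetical helper)."""
    parser = InfoboxPairs()
    parser.feed(html)
    return parser.pairs


# Invented sample markup; not taken from any real article.
SAMPLE = """
<table class="infobox">
<tr><th colspan="2">Example Person</th></tr>
<tr><th>Born</th><td>1 January 1900</td></tr>
<tr><th>Occupation</th><td><a href="#">Example occupation</a></td></tr>
</table>
"""

pairs = scrape_infobox(SAMPLE)
print(pairs)
```

The point is only that a predictable label/value layout can be turned into a dictionary with very little code, whereas the same facts embedded in prose cannot.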

There are unanticipated applications of the structured data in Wikipedia's infoboxes. Since 2008, Google has been using Wikipedia articles to train its natural language reading tools: Google Tech Talks, 11 November 2008 (51 mins). For example, by comparing the certainty of a date-of-birth found in an infobox with the text surrounding the same date found in the article text, the reading tool can infer that the article text is describing a date-of-birth. That allows the tool to learn how that description can be written in natural language, and eventually can extract likely dates-of-birth from other text with increasing confidence of accuracy. According to Google, Wikipedia is the biggest source of this kind of structured data combined with natural text on the internet.
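
The principle described above can be shown with a toy Python sketch: take a value that the infobox states with certainty, find it in the prose, and record the words that precede it. This is not Google's method, which the talk describes only in outline; the function name `contexts_for_value`, the field name and the sample sentence are all assumptions made up for illustration.

```python
import re


def contexts_for_value(value, text, window=3):
    """Return the `window` words that precede each occurrence of `value` in `text`."""
    contexts = []
    for match in re.finditer(re.escape(value), text):
        words_before = text[:match.start()].split()
        contexts.append(" ".join(words_before[-window:]))
    return contexts


# Invented data: an infobox field and a sentence from the "article".
infobox = {"birth_date": "1 January 1900"}
article = "Example Person was born on 1 January 1900 in Exampletown."

phrases = contexts_for_value(infobox["birth_date"], article)
print(phrases)  # ['was born on']
```

Collected over many articles, phrases such as "was born on" become evidence that a date following them in unstructured text is likely to be a date of birth.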

Similarly to microformats, the perceived worth of providing structured data via an infobox varies strongly among editors.

Translation
The English Wikipedia is more than 50% larger than the next largest language Wikipedia and over ten times larger than the 17th largest out of the 294 languages for which official Wikipedias have been created. English is also still the lingua franca of the internet, so it is not surprising that other language Wikipedias frequently turn to the English Wikipedia as a source for articles to expand their Wikipedia. The infobox will often be easier to translate because of the simplified language and structure. Additionally, infoboxes can draw their content from Wikidata. A well designed infobox on the English Wikipedia may be imported into another language and may even be able to populate itself from Wikidata in the target language automatically. An example would be cy:Telesgop Pegwn y De which uses the imported cy:Nodyn:Infobox telescope template to automatically fill in its contents. It makes a good start for the article translation if the key facts are there from the start.

The advantage of an infobox as a starting point for translation is anecdotal, but plausible. It is difficult, therefore, to attach an importance to it when making the decision on having an infobox in an article. It is likely that there will again be considerable variability in the weight that different editors will assign to it.

Alternate values
Most common infoboxes come with a large selection of available parameters. Although each of them will be useful in some article, it is unlikely that all of them will be appropriate in any one article. That means that the choice of which fields to include is often a complex one. Sometimes a particular fact is not known, such as Beethoven's date of birth. Although his family celebrated his birthday on 16 December, the only extant documentary evidence is of his baptism on 17 December 1770. It is therefore inappropriate to include date of birth, but we can use baptised in this sort of circumstance. On the other hand, if we know both the date of birth and date of baptism for a subject, most editors would probably conclude that the latter is not a 'key fact' for the subject when it closely follows the date of birth, so it is very rare that both fields would be included.

Missing nuances
Sometimes a relevant fact is not clear-cut. This happens quite often in literature and generally in the humanities. For example, the genre of a novel may be neither one thing nor another: in Night, the genre has been variously described as "novel", "autobiographical novel", "autobiography" and "memoir". This is a question of some delicacy as it deals with the subject of the Holocaust, and opinions vary as to the extent that it is a memoir. In cases where it requires multiple sentences to adequately explain a particular item, it cannot be neatly fitted into an infobox field without losing all of the nuances. It is therefore common to leave the parameter out, rather than have something that is misleading there (Night is not well described by any one of "novel", "autobiographical novel", "autobiography" or "memoir").

Irrelevance
Some parameters will rarely be 'key facts' for their subjects, so would do little more than clutter an infobox and dilute the importance of the other facts found there. One obvious example is height in infobox person - for most people, their height has no bearing on their importance or notability. There will be exceptions, of course, as it is likely to be a key fact for a 7 ft tall basketball player. One person's trivia may be another's interesting fact, and it may require a Request for Comment to settle whether certain items are relevant or not.

Thin end of wedge
The issue of inappropriate fields is not immediately relevant to the decision of whether or not to include an infobox; it is principally a matter of the selection of what fields to include. Nevertheless, if a proposed infobox is likely to contain only contentious items, many editors would conclude that the article is better off without it. In addition, there is an argument ("thin-end-of-the-wedge") that once an infobox exists in an article, editors unaware of the background and editorial decisions previously taken - so-called "drive-by editors" - will regularly try to add inappropriate fields to the infobox and consume the time of editors stewarding the article in reversions and discussions. Some might say that it's the price we pay for "anybody can edit", but hopefully we can provide better solutions, for example by using an infobox constructed using Module:WikidataIB that allows an editor at the article level to make a positive decision to disallow certain fields by the suppressfields parameter.

Aesthetics
To many editors, it is important how an article looks. An infobox can become quite large and because its placement is generally expected to be at the top right of an article, there is little room to manoeuvre when trying to ensure that the article retains a pleasant aesthetic.

Editors looking at aesthetic issues need to remember that there is a wide variety of monitor widths, aspect ratios, and portrait/landscape orientations when it comes to display. For example, an identically sized box may take up an unacceptable width on a 768x1024 portrait-oriented tablet or be almost unnoticeable on a 3840x2160 pixel sized 4K monitor. Mobile rendering may also be considered, as the infobox on a phone browser now appears immediately after the lead and takes the entire width of the screen, enhancing the effect of repeated information.
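
The difference between the two displays mentioned above can be put into numbers with a short Python sketch. It assumes a fixed infobox width of 240px (the approximate default figure used elsewhere in this essay) and ignores browser chrome, sidebars and device pixel scaling, so the percentages are indicative only; the function name `width_share` is made up for this example.

```python
INFOBOX_WIDTH_PX = 240  # assumed fixed width, for illustration only


def width_share(viewport_width_px, box_width_px=INFOBOX_WIDTH_PX):
    """Return the fraction of the viewport width occupied by the box."""
    return box_width_px / viewport_width_px


tablet_share = width_share(768)     # 768x1024 tablet in portrait orientation
monitor_share = width_share(3840)   # 3840x2160 4K monitor

print(f"tablet: {tablet_share:.2%}, 4K monitor: {monitor_share:.2%}")
# tablet: 31.25%, 4K monitor: 6.25%
```

On these assumptions the same box occupies almost a third of the tablet's width but about a sixteenth of the 4K monitor's.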

Balance of the article
Newly created articles are sometimes quite short, or merely stubs. It will often seem incongruous if a long infobox is added, dwarfing the rest of the article. This will be the case particularly if the infobox contains much information not otherwise found in the article itself. However, as a temporary stepping-stone to fleshing out the article with prose, such an infobox may prove useful, especially as more infoboxes become available that are auto-populated from Wikidata.

Images
Some editors make a point of selecting, sizing and placing images within an article to create a pleasing aesthetic. This is particularly common in architectural articles. Although the effect is at its best only within a limited range of resolutions, it can be a cause of distress to the editor concerned if an infobox is added to the article, thereby disturbing the placement of images that they had put a lot of work into.

Lead image
It is conventional to place the lead image inside an infobox. However, as an infobox is generally about 240px by default, it limits the width of that image unless we accept a correspondingly wider infobox, which is considered unaesthetic. This restriction is often reasonable for portrait images, but does not work well with landscape images which generally need more than 300px width to adequately display the subject. An example would be Sydney Opera House, which even oversized at 270px wide doesn't make a good job of showing the architecture. There exists a possible solution, which places the lead image above the infobox, allowing it to be as wide as needed without affecting the infobox width. There's an example for Sydney Opera House at User:RexxS/sandbox. That can be coded as a switch that could be activated in appropriate articles without disturbing current articles.

Core editors
Although Wikipedia rejects the concept of ownership, there is no doubt that many of our best articles are principally the product of a single editor, or of a small group working in collaboration. Sometimes the principal editors are in favour of an infobox in the article; sometimes the principal editors are opposed to an infobox. There is a quite understandable feeling of pride in work that has turned out well, and we would be foolish to ignore that as a factor in making editing an enjoyable experience, and as a valuable help to editor retention. It would make sense to try to accommodate the feelings of the core editors of an article, if only because it's a collegial thing to do.

Effect of GA/FA
Featured articles have undergone a quite rigorous process of review against a well defined set of criteria. These are considered to be among the best articles that Wikipedia has to offer. That does not mean that an FA cannot be improved; almost all are capable of further improvement, but the decision to have an infobox or not will have been considered, possibly at length, prior to or during the FA review. That should carry weight in making a decision to add or remove an infobox from an FA, and is almost certainly an indication that the talk archives should be checked for a prior consensus. In any case, it is considerate to open a discussion on the talk page and seek a consensus before making any large change to an FA. WP:FAOWN discusses this issue.

Good articles have undergone a lightweight review process by a single other editor, so have not benefited from the same scrutiny as FAs. Although it cannot be assumed that a GA has necessarily considered the question of an infobox, GAs quite often have a principal author, and the considerations in the prior section are likely to apply.

Effect of Wikiprojects
Wikiprojects are a way that editors with a common interest collaborate. Many topic areas have an associated Wikiproject, which can be a good source of opinion, advice and assistance for editors wanting to work on a particular subject. Wikiprojects define assessment schemes for articles within their scope.

There is no defined policy regarding consistency between articles for most aspects, but Wikiprojects can help by encouraging editors to make use of consensuses established by the Wikiproject in many aspects of editing not covered by policies and guidelines. Although those consensuses are often very helpful, they cannot be considered binding on any editor. ArbCom has made findings that reinforce WP:CONLOCAL, but has opined that where no specific consensus for an article exists, a local consensus at a broader level should be taken into consideration.

When considering adding or removing infoboxes, it may be helpful to understand when a Wikiproject has developed guidance on the use of an infobox in articles within its scope. Some Wikiprojects recommend a particular infobox; others reject the use of an infobox. So knowing the attitude toward infoboxes of a group of editors particularly concerned with an article's topic will be helpful, although not prescriptive.

Differing audiences
The extent to which an infobox may add value to an article will to some extent depend on who is reading the article. It could be argued that the principal intended audience will be reasonably well-educated, native English-speaking, literate and have a good attention span. For such an audience, it is difficult to see the value of an infobox, since the assumption is that they will read the article and learn whatever it was they were looking for in the process. There is little doubt that given such an audience, a top quality article will provide a deeper understanding of its subject whether it has an infobox or not.

The contrary argument is that our audience is actually a disparate group, impossible to categorise. This leads to editors producing redundant information within the article, which has a different presentation in order to cater for the different needs and abilities of different readers. One obvious, mandated, example is the lead of a well developed article. The lead is intended to provide a summary and overview of the article, so that much salient information can be found without the need to go through the entire article. In medical articles, there is a requirement to keep the lead simple and as free of jargon as possible. That is not only desirable to accommodate lay readers who just want some simple medical information, but also serves the purpose of making the lead simpler to translate in many different languages - a vital service for third-world countries, particularly whenever a disease outbreak occurs.

In the same way as for the lead, there is a corresponding argument that key facts can generally be collected and displayed "at-a-glance" for those who want a small piece of information as conveniently as possible, and that is one of the functions of an infobox. It could be said that a well-developed article is a 20-minute treatment of the subject, while the lead is a 2-minute overview and the infobox is a 20-second condensation. Not all editors favour repeated redundancy in this way.

The homework assignment
There is surely a place for Wikipedia to give people a very brief introduction to topics. A Google search for almost any subject will provide the Wikipedia article (where it exists) at or near the top of the results. A school pupil may want to pick a topic from a suggested list and having a very brief summary of that topic is useful to help them decide. Similarly, a person writing an essay or discussing a topic with a friend may need to find just one fact, such as the place where a historical figure died or their age at death. Wikipedia can provide that sort of information immediately if we have good summaries available. Some editors may argue that the student would benefit from reading the entire article - and that is almost certainly true - but it's not Wikipedia's place to dictate how students are to learn. It is also preferable to give the visitor the information they were looking for and hope that they may decide to read more, rather than force them to search through an entire article just to extract one fact.

In many cases the lead may be sufficient to meet the needs of such students and others, but that is not always the case, whereas an infobox is predictable in its placement and clear in its presentation, so that if the information sought is in the infobox, the reader will see that immediately and unequivocally.

Reading impairments
Not all of our audience will have the same skills in reading as our most accomplished editors and so there is again a need to simplify and organise the information in an article. Although the information presented in an infobox is normally to be found elsewhere in the article, often in the lead, a dyslexic reader may find difficulty in scanning through larger blocks of text to pick out a piece of information - an ability many of us take for granted. For someone who has a reading impairment of any kind, the infobox with its predictable placement and internal layout can very often be the difference between finding what they were looking for with ease and a frustrating slog through multiple sections in an attempt to locate the information they were seeking.

Foreign visitors
The English Wikipedia is the largest of more than 300 active Wikipedias; the next largest Wikipedias are considerably smaller and the eighth-largest has less than a quarter of the number of articles. The total number of edits on the English Wikipedia is about the same as the total of all edits on all other Wikipedias. In addition, English has become the lingua franca of the internet. This has led to the English Wikipedia becoming the default reference site on many topics for numerous visitors who do not have English as a first language. The infobox, located in a predictable position, with clear connections between each field label and its value, is the most accessible repository of key information on a topic for anyone whose English skills are considerably less than those of a native speaker.