Wikipedia:Wikipedia Signpost/2018-07-31/Recent research

Diverse image usage across languages

 * Reviewed by Morten Warncke-Wang

A paper presented at the recent ICWSM conference studies image usage across 25 of the larger Wikipedia language editions. Whereas the diversity of Wikipedia's article text across language editions has received much attention from researchers, studies of image and other media usage are rare. The paper has two main research questions:


 * What is the diversity of visual knowledge across language editions of Wikipedia?
 * How does the diversity in visual knowledge compare to the diversity of textual encyclopedia knowledge?

The authors chose to study 25 specific language editions to enable direct comparisons with prior work on textual diversity. Their methodology only examines image usage in articles that are not redirects, not disambiguation pages, and were not created by a bot (thereby excluding, for example, the Swedish Wikipedia's extensive set of bot-created pages), resulting in a dataset of more than 23 million articles. Furthermore, they develop a method to filter out images that are frequently included through templates, such as navigation boxes or stub templates, as these often do not reflect a deliberate choice of how to illustrate a given article. Previous studies of textual diversity similarly removed template-based links.
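The filtering step described above can be sketched in a few lines. This is an illustrative re-creation, not the authors' pipeline; the record fields (`is_redirect`, `is_disambiguation`, `created_by_bot`) and the sample data are hypothetical.

```python
def filter_articles(articles):
    """Keep only articles that are not redirects, not disambiguation
    pages, and were not created by a bot (per the paper's criteria)."""
    return [
        a for a in articles
        if not a["is_redirect"]
        and not a["is_disambiguation"]
        and not a["created_by_bot"]
    ]

# Toy records with invented field names, for illustration only.
sample = [
    {"title": "Stockholm", "is_redirect": False, "is_disambiguation": False, "created_by_bot": False},
    {"title": "Mercury",   "is_redirect": False, "is_disambiguation": True,  "created_by_bot": False},
    {"title": "Sommen",    "is_redirect": False, "is_disambiguation": False, "created_by_bot": True},
]
print([a["title"] for a in filter_articles(sample)])  # → ['Stockholm']
```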

To enable exploration of image use across languages, the authors developed the WikiImg Drive tool, which visualizes image usage for any concept found in multiple Wikipedia editions. The tool provides information on how many editions have an article about a given concept and how many images are found in those articles, and shows a chord diagram visualizing image usage across the languages (an example diagram is shown below). Users can then get further information about which images are used, and which specific editions use a given image.



The paper studied image diversity in two ways: across language editions, and within the coverage of a concept. When it comes to image diversity across language editions, they find that 67.4% of all images appear in only a single edition. The proportion of images used in multiple editions is relatively small and decreases quickly: only 14.1% of images appear in two editions, and in total only 142 images (0.0014%) were used in all 25 language editions. The majority of these "global images" are either portraits or other images showing a specific person.
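The distribution behind these figures (what share of images appears in exactly k editions) can be computed with a simple counting pass. A minimal sketch on toy data, not the paper's dataset:

```python
from collections import Counter

# Hypothetical mapping from image name to the set of editions using it.
image_editions = {
    "Flag_of_France.svg": {"fr", "en", "de"},
    "Paris_skyline.jpg":  {"fr"},
    "Eiffel_tower.jpg":   {"fr", "en"},
    "Local_church.jpg":   {"de"},
}

# How many images are used by exactly k editions?
counts = Counter(len(editions) for editions in image_editions.values())
total = len(image_editions)
for k in sorted(counts):
    print(f"{counts[k] / total:.0%} of images appear in {k} edition(s)")
```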

For image diversity within concepts, they made pairwise comparisons between editions and calculated the average overlap, which ranges from 22.2% (German and Indonesian) to 75.6% (Hungarian and Romanian). This means that, on average, more than 24% of the images used in an article in one language will be unique relative to the article about the same concept in another language, and that for some language pairs this rises to almost 80%.
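An overlap figure of this kind can be sketched as the share of one article's images that also appear in the other language's article. This is an illustrative metric under that assumption, not necessarily the authors' exact formula, and the image sets are invented:

```python
def image_overlap(images_a, images_b):
    """Percentage of images in article A that also appear in article B."""
    if not images_a:
        return 0.0
    return 100 * len(images_a & images_b) / len(images_a)

# Toy example: a concept's German and Indonesian articles.
de_images = {"map.png", "portrait.jpg", "coat_of_arms.svg"}
id_images = {"map.png"}
print(f"{image_overlap(de_images, id_images):.1f}% of the German images are shared")
```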

Lastly, when comparing visual diversity to textual diversity, they find that the overall degree of diversity is roughly similar. At the far ends of the diversity scale, there are some clear differences: 67.4% of images are only used in a single language edition, whereas 73.5% of concepts have an article in only one language edition. This reverses on the opposite end of the scale: 0.13% of concepts are "global concepts" (found in all 25 editions), whereas only 0.0014% of the images are "global images".

The paper discusses what may drive the larger diversity in image usage. Cultural diversity can lead to significant differences in how one might want to illustrate a concept. There are also examples of language editions using variations of the same image, meaning that further image analysis would be necessary in order to identify those. Lastly, they discuss whether images having English names on Commons might enable cross-language usage. Previous research from 2012 (where this reviewer was one of the authors) found that English is the primary language from which translations were made, and we have also covered research in May 2016 that found English to still be the lingua franca of Wikipedia.

Furthermore, the paper discusses how the diversity in image usage can affect algorithms and AI trained on Wikipedia data, cautioning that using images from a single edition will likely result in a biased view. The paper points out that gathering images from all language editions is relatively straightforward and should therefore be the preferred approach.

Filling knowledge gaps: PDFs as "boundary objects" between experts and Wikipedia

 * Reviewed by Morten Warncke-Wang

Which strategies succeed in filling knowledge gaps in peer-produced content? In a recent paper titled "Beyond notification: Filling gaps in peer production projects", Ford et al. studied several approaches aiming to improve coverage in articles relevant to teachers in South African primary schools. A committee of scholars and researchers in ICT and education—especially primary school education—along with teachers, parents, and Wikipedia experts, identified 183 articles in the English Wikipedia that are relevant to the South African national primary school curriculum. Five strategies for soliciting improvements to these 183 articles were then tested and evaluated with regard to whether articles were improved, and whether the strategy was helpful in bringing new contributors to Wikipedia:


 * 1) Editing competitions were relatively successful. Articles with few online sources or requiring specialized knowledge were unlikely to be improved, and the competitions did not result in newcomers editing.
 * 2) Edit-a-thons were not successful, as no articles were improved during those events.
 * 3) Notifications proved largely unsuccessful. These were sent to WikiProjects, and regardless of whether there appeared to be activity within the project or in articles related to it, the notifications tended to be ignored.
 * 4) Reaching out to academics to write content that can later be moved on to Wikipedia as articles was unsuccessful. One academic responded negatively, noting that they did not recognize Wikipedia as a legitimate academic enterprise.
 * 5) Expert reviews resulted in a low overall number of improved articles, but had the most sustained engagement and the highest quality results. The experts would receive a PDF of the Wikipedia article and review it. The team would then copy the comments onto the article's talk page, and use OTRS to verify that the comments were appropriately licensed by the expert. Wikipedia contributors could then respond by incorporating changes into the article or discussing the review on the talk page.

In the paper, Ford et al. discuss how the PDFs can be seen as a form of "boundary objects" that allow for a negotiation between the workflow and epistemological paradigms of the experts and Wikipedia, and that this negotiation is necessary in order to facilitate collaboration. They also argue that expanding the collaboration between experts and Wikipedia contributors is an important strategy to close the knowledge gaps in the encyclopedia.

Contributor experience and article quality

 * Reviewed by Morten Warncke-Wang

A short paper recently published in the Journal of Medical Internet Research studies the "Effects of Contributor Experience on the Quality of Health-Related Wikipedia Articles". Using a dataset of 18,805 articles from the Health and Fitness portal on the English Wikipedia, the paper compares those articles that were at some point tagged with a template indicating a quality flaw (also called "cleanup templates") to those that never contained such a template. The goal is to understand to what extent contributor experience, in the form of average number of edits made or number of articles edited by contributors to these articles, correlates with the presence of these cleanup templates. Only the number of articles edited was found to have a significant relationship, and contributors to non-tagged articles had a higher average number of articles edited. The authors discuss these findings in relation to ensuring that articles about medical topics on Wikipedia are of high quality.
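A comparison of this kind (average number of articles edited by contributors to tagged vs. never-tagged articles) can be sketched with a permutation test. The numbers are invented, and the permutation test is a stand-in for significance testing, not necessarily the statistical method the paper used:

```python
import random
from statistics import mean

# Toy per-article averages of "number of articles edited" by contributors.
tagged     = [12, 8, 15, 10, 9, 11]    # articles that carried a cleanup template
not_tagged = [20, 25, 18, 22, 30, 24]  # articles that never did

observed = mean(not_tagged) - mean(tagged)

# Permutation test: shuffle group labels and see how often the difference
# is at least as large as the observed one.
pooled = tagged + not_tagged
random.seed(0)
n_extreme = 0
for _ in range(10_000):
    random.shuffle(pooled)
    diff = mean(pooled[len(tagged):]) - mean(pooled[:len(tagged)])
    if diff >= observed:
        n_extreme += 1
print(f"difference = {observed:.1f}, permutation p ≈ {n_extreme / 10_000:.4f}")
```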

The paper's limitation section discusses the operationalization of article quality used, citing early work on predicting article quality and suggesting that the methodology could be improved by incorporating multiple quality factors. This resonated with this reviewer, who has both done extensive research in this area and reviewed other work for this newsletter. There are two papers that appear particularly relevant in this case: we covered one using a deep learning approach last year, and a second paper used the ORES API to measure the development of article quality across time enabling a demonstration of the Keilana Effect.

Teahouse

 * Reviewed by Kudpung


 * "Evaluating the Impact of the Wikipedia Teahouse on Newcomer Retention"
 * Aaron Halfaker is a WMF employee and edits Wikipedia as EpochFail. Jonathan Morgan is a WMF employee and, while editing as Jtmorgan, is a Teahouse Host.

From the abstract: "[F]ew interventions employed to increase newcomer retention over the long term by improving aspects of the onboarding experience have demonstrated success. This study presents an evaluation of the impact of one such intervention, the Wikipedia Teahouse, on new editor survival. In a controlled experiment, we find that new editors invited to the Teahouse are retained at a higher rate than editors who do not receive an invite. The effect is observed for both low- and high-activity newcomers, and for both short- and long-term survival."

Can there ever be a solution to the dwindling number of new and active users? In their paper, Halfaker and Morgan explain that "[s]o far, neither purely social or purely technical efforts have been shown to be effective at providing effective socialization at a scale that leads to a substantial increase in the number of new editors who go on to become Wikipedians." They examine the Teahouse which they describe as "one of the most potentially impactful retention mechanisms that has been attempted". Their research demonstrates that "new editors who are invited to the Teahouse are significantly more likely to continue contributing after three weeks, two months, and six months than a similar cohort who were not invited."

Comparing it with the Wikipedia Adventure new editor training system developed in 2013, which, although often used, did not show any long-term impact, the Teahouse (created in 2012) "combined social and technical components to provide comprehensive socialization on a large scale, and was designed to promote long-term retention." Their study evaluates "the effect of invitation to the Teahouse, rather than participation in the Teahouse." The 24-hour and 5-edit thresholds applied before issuing a Teahouse invitation, while helping to avoid vandalism-only accounts, may result in "many good-faith newcomers being denied the opportunity for positive socialization", they say.
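The study's retention comparison (invited vs. not-invited newcomers at three weeks, two months, and six months) can be sketched as below. All numbers and the survival definition ("last edit on or after the horizon") are toy assumptions for illustration, not the study's data or exact operationalization:

```python
def retention_rate(cohort, horizon_days):
    """Share of a cohort still active at the given horizon."""
    survived = sum(1 for last_edit_day in cohort if last_edit_day >= horizon_days)
    return survived / len(cohort)

# Invented day-of-last-edit values for two small cohorts.
invited = [1, 25, 70, 200, 5, 90]
control = [1, 2, 30, 3, 65, 4]

for horizon in (21, 60, 180):   # three weeks, two months, six months
    print(f"day {horizon}: invited {retention_rate(invited, horizon):.2f}, "
          f"control {retention_rate(control, horizon):.2f}")
```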

"ORES", they explain, "provides powerful predictive models that can accurately discriminate between damaging and non-damaging edits, and between malicious edits and edits that were made in good faith, even if they introduce errors or fail to comply with policies." The authors consider that "most new editors" receive an "overwhelmingly negative and alienating experience [...] when they first join Wikipedia," but they do not appear to draw on any data for this assumption.

Research presentations at Wikimania 2018

 * Summarized by Morten Warncke-Wang and Tilman Bayer

One of the presentations at the recent Wikimania 2018 conference was on the "State of Wikimedia Research 2017–2018". An almost yearly occurrence since 2009, this presentation gives a quick look into the overarching themes in research published about Wikimedia projects over the previous year. This year's presentation (slides) is now available on YouTube, and covers five main themes: images & media, talk pages, multilingual comparisons, non-participation (who is not contributing?), and Wikipedia as a source of data. The first of these highlights the "Tower_of_Babel.jpg" paper also covered above.

A keynote titled "Creating Knowledge Equity and Spatial Justice on Wikipedia" (summary, slides) by Martin Dittus, a data scientist at the Oxford Internet Institute, featured various results about the geographical distribution of geotagged Wikipedia articles and IP edits, partly from earlier research by Dittus' colleague Mark Graham and others (cf. earlier coverage). Other presentations included:
 * "The State of Research in Knowledge Gaps" (video), showcasing various research results and technology projects by the Wikimedia Foundation's research team and its academic collaborators;
 * "Research on gender gap in Wikipedia: What do we know so far?" (video, slides); and
 * A lightning talk (video, slides, poster) about the "Wikipedia Cultural Diversity Observatory" (WCDO) project, which likewise uses geolocation coordinates to associate articles with cultural contexts, but also draws on other data such as article categories and Wikidata properties to overcome some of the shortcomings of the geotagging data.

Other recent publications
 * Compiled by Kudpung and Tilman Bayer

Other recent publications that could not be covered in time for this issue include the items listed below. Contributions are always welcome for reviewing or summarizing newly published research.

Vandalism

 * "Vandalism on Collaborative Web Communities: An Exploration of Editorial Behaviour in Wikipedia"

The study found that "most vandalisms [on Wikipedia] were reverted within five minutes" on average and that "the majority of the articles targeted [with vandalism] are related to Politics (29.4%), followed by Culture (26.4%), Music (23.5%), Animals (11.7%) and History (8.8%)."

Simple English

 * "Evaluating lexical coverage in the Simple English Wikipedia articles: a corpus-driven study"

From the paper: "[Simple English Wikipedia] articles require surprisingly large vocabularies to comprehend, comparable to that required to read standard Wikipedia articles."

"Open algorithmic systems: lessons on opening the black box from Wikipedia"
From the abstract: "This paper reports from a multi-year ethnographic study of automated software agents in Wikipedia, where bots play key roles in moderation and gatekeeping. Automated software agents are playing increasingly important roles in how networked publics are governed and gatekept, with internet researchers increasingly focusing on the politics of algorithms. [...] In most platforms, algorithmic systems are developed in-house, where there are few measures for public accountability or auditing, much less the ability for publics to shape the design or operation of such systems. However, Wikipedia's model presents a compelling alternative, where members of the editing community heavily participate in the design and development of such algorithmic systems."

"High-quality standards in information presentation are not globally shared" across Wikipedia languages

 * "Cultural diversity of quality of information on Wikipedias"

From the abstract: "This article explores the relationship between linguistic culture and the preferred standards of presenting information based on article representation in major Wikipedias. Using primary research analysis of the number of images, references, internal links, external links, words, and characters, as well as their proportions in Good and Featured articles on the eight largest Wikipedias, we discover a high diversity of approaches and format preferences, correlating with culture. We demonstrate that high-quality standards in information presentation are not globally shared and that in many aspects, the language culture's influence determines what is perceived to be proper, desirable, and exemplary for encyclopedic entries."
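The per-article counts the study compares (images, references, internal links, external links, words) can be roughly extracted from raw wikitext. The regexes below are deliberate simplifications for illustration, not the authors' extraction pipeline, and the sample wikitext is invented:

```python
import re

def article_metrics(wikitext):
    """Rough presentation metrics from a raw wikitext string."""
    return {
        "images":         len(re.findall(r"\[\[(?:File|Image):", wikitext)),
        "references":     len(re.findall(r"<ref[ >]", wikitext)),
        "internal_links": len(re.findall(r"\[\[(?!File:|Image:)", wikitext)),
        "external_links": len(re.findall(r"(?<!\[)\[http", wikitext)),
        "words":          len(wikitext.split()),
    }

text = "[[File:Map.png]] The [[city]] lies on a [[river]].<ref>Source</ref> [http://example.org site]"
print(article_metrics(text))
```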

Revert behavior differs between "political" and "unpolitical" articles

 * "Case study in political user behavior on Wikipedia"

From the paper:

How German Wikipedians coin words that describe unwanted editing behaviors as diseases

 * "Combinatorics of the suffix -itis on talk pages of Wikipedia: A word formation pattern for the discursive regulation in the collaborative knowledge production" (in German)

From the English abstract: "The study reveals that -itis is a highly productive suffix in meta(-linguistic) discourses of the online-encyclopaedia: Wikipedia authors using word formation products with the suffix -itis (e.g. Newstickeritis or WhatsAppitis) try to standardise the collaborative knowledge production with the help of these linguistic innovations. The corpus analysis delivers evidence for the fact that certain linguistic innovations and special types of word formation characterise the community of Wikipedia authors and their discourse traditions."
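A corpus scan for such -itis coinages is straightforward to sketch. The example sentences below are invented, and the pattern is a toy version of what a corpus analysis like the study's might use:

```python
import re
from collections import Counter

# Invented talk-page sentences containing -itis coinages.
corpus = [
    "Diese Newstickeritis hilft dem Artikel nicht.",
    "Die Linkitis in diesem Abschnitt sollte reduziert werden.",
    "Schon wieder Newstickeritis auf der Hauptseite.",
]

# Collect every word ending in the suffix "-itis" and count occurrences.
coinages = Counter(
    match.group(0)
    for sentence in corpus
    for match in re.finditer(r"\b\w+itis\b", sentence)
)
print(coinages.most_common())
```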

"Linking ImageNet WordNet Synsets with Wikidata"
From the abstract: "The linkage of ImageNet WordNet synsets to Wikidata items will leverage deep learning algorithm with access to a rich multilingual knowledge graph. [...] I show an example on how the linkage can be used in a deep learning setting with real-time image classification and labeling in a non-English language and discuss what opportunities lies ahead."

"Capturing the influence of geopolitical ties from Wikipedia with reduced Google matrix"
From the abstract: "[We] show that meaningful results on the influence of country ties can be extracted from the hyperlinked structure of Wikipedia. We leverage a novel stochastic matrix representation of Markov chains of complex directed networks called the reduced Google matrix theory. [...] We apply this analysis to two chosen sets of countries (i.e. the set of 27 European Union countries and a set of 40 top worldwide countries). We [...] can exhibit easily very meaningful information on geopolitics from five different Wikipedia editions (English, Arabic, Russian, French and German)." (See also earlier by some of the same authors: "Multi-cultural Wikipedia mining of geopolitics interactions leveraging reduced Google matrix analysis".)
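For context, a standard Google matrix for a small link network can be built as G = alpha * S + (1 - alpha) / N, where S is the column-stochastic link matrix. This sketches only the basic construction on a toy network; the paper's reduced Google matrix is a further refinement not shown here:

```python
def google_matrix(links, n, alpha=0.85):
    """Build the Google matrix of a directed network.

    links: dict mapping node index -> list of nodes it links to.
    Columns of the result sum to 1 (column-stochastic)."""
    G = [[(1 - alpha) / n] * n for _ in range(n)]
    for j in range(n):
        targets = links.get(j, [])
        if targets:                       # distribute j's weight over its links
            for i in targets:
                G[i][j] += alpha / len(targets)
        else:                             # dangling node: spread uniformly
            for i in range(n):
                G[i][j] += alpha / n
    return G

# Toy 3-node network: 0 -> 1, 1 -> 0 and 2, 2 -> 0
G = google_matrix({0: [1], 1: [0, 2], 2: [0]}, 3)
print([round(sum(G[i][j] for i in range(3)), 6) for j in range(3)])  # → [1.0, 1.0, 1.0]
```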

AI assessment of article quality using deep learning

 * "A Hybrid Model for Quality Assessment of Wikipedia Articles"

From the abstract: "We explore the task [of document quality assessment] in the context of a Wikipedia article assessment task, and propose a hybrid approach combining deep learning with features proposed in the literature. Our method achieves 6.5% higher accuracy than the state of the art in predicting the quality classes of English Wikipedia articles over a novel dataset of around 60k Wikipedia articles." (See also earlier coverage in August 2017 of related research by a different team: "Improved article quality predictions with deep learning".)

"Social capital" of editors has a "significant impact" on article quality

 * "Using big data and network analysis to understand Wikipedia article quality"

From the abstract: "The research reported in this paper focuses on the question of why Wikipedia articles are different in quality. [...] We focus on three major types of social capital with respect to teams of contributors working on Wikipedia articles: internal bonding, external bridging and functional diversity. Through a social network analysis of these articles based on a dataset extracted from its edit history, our research finds that all three types of social capital have a significant impact on their quality. In addition, we found that internal bonding interacts positively with external bridging resulting in a multiplier effect on article quality."
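One common way to quantify a notion like "internal bonding" in a social network analysis is the density of the co-editing network among an article's contributors. The measure and the data below are illustrative assumptions, not the authors' operationalization:

```python
def density(nodes, edges):
    """Network density: realized edges over possible undirected edges."""
    possible = len(nodes) * (len(nodes) - 1) / 2
    return len(edges) / possible if possible else 0.0

# Toy contributor team and invented co-editing ties between its members.
team = {"A", "B", "C", "D"}
co_edits = {("A", "B"), ("A", "C"), ("B", "C")}
print(f"internal bonding (density) = {density(team, co_edits):.2f}")  # → 0.50
```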