Wikipedia:Wikipedia Signpost/2016-10-14/Recent research



"Wikipedia Dispute Index" detects high-conflict countries
The 2011 study Content Disputes in Wikipedia Reflect Geopolitical Instability was referenced in a recent article on Ozy. The study (not previously covered in the Signpost) considers whether Wikipedia's metadata may be used to glean insights into global phenomena. (Various online predictors have been associated with real-world events; for instance, Google searches can be used to monitor the spread of infectious diseases.) The authors attempted to test whether Wikipedia content disputes can be used to understand real-life conflicts. They analyzed all pages carrying the "NPOV dispute" tag that link to articles about a given country, though they note that only about a quarter of the countries (138 of 497) had a sufficient number of conflicts to allow further analysis. (This reviewer wonders why the authors chose the "what links here" tool rather than the more precise category or WikiProject template groupings of articles; a cursory look at the 100+ articles linked to Poland, for example, suggests that only ~20% are clearly related to that country.)

They then created a "Wikipedia Dispute Index" (downloadable image of the index heat map), which measures whether a country has more or fewer than average disputes linking to it. The authors note that their index roughly matches the "1996–2008 World Bank Policy Research Aggregate Governance Indicators" and the "Economist Intelligence Unit 2009 Political Instability Index" (downloadable image of the correlation plots between those indexes; not bad, given the underlying problem of using "what links here" as a dataset). The results indicate that "the most disputed are parts of the middle east followed by other regions such as Kosovo, Bosnia & Herzegovina and North Korea ..., countries in North America and Western Europe are the least disputed, with most other countries occupying a middle range." With regard to the type of conflicts, they observe that "the biggest contributors to the indicator tend to be disputes over current or historical events or individuals that vary according to different political views."
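The idea of an index that says whether a country has "more or fewer than average" disputes can be illustrated with a minimal Python sketch. This is not the authors' formula, which this review does not reproduce; it simply assumes the index is the signed difference between a country's count of disputed linking pages and the mean count across countries. The function name `dispute_index` and the placeholder countries and counts are hypothetical.

```python
from statistics import mean


def dispute_index(disputed_counts):
    """Return each country's disputed-page count minus the cross-country mean.

    Assumption (not taken from the study): a positive value means more
    disputes than average, a negative value means fewer.
    """
    if not disputed_counts:
        return {}
    average = mean(disputed_counts.values())
    return {
        country: count - average
        for country, count in disputed_counts.items()
    }


# Hypothetical counts of NPOV-tagged pages linking to each country article.
counts = {"Country A": 30, "Country B": 10, "Country C": 5}
index = dispute_index(counts)

# List countries from most to least disputed under this toy definition.
for country, value in sorted(index.items(), key=lambda item: item[1], reverse=True):
    print(f"{country}: {value:+.1f}")
```

A real index would probably also need to normalize for how many pages link to each country at all, since heavily linked countries accumulate more tagged pages regardless of how contested they are.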

Though the authors present no convincing arguments about why exactly their index would be more or less useful than the existing ones, they write that it can be seen as a supplementary tool validating other indexes, and conclude that Wikipedia's data and metadata can be used to generate other useful indexes and metrics, something that this reviewer certainly agrees with.

Wikipedians may find the following page created for this project useful: http://www.disputeindex.org/ displays the (gray and white) heatmap and lists the Wikipedia articles that are being analyzed, a nice visual gadget for our internal cleanup purposes. (It will likely be available only for the next few years, until it inevitably goes down as it stops being maintained; perhaps someone could contact the authors about moving it to the Toolserver/Labs?) PK

Emergent Role Behaviours in Wikipedia – The "How" and "Why"
The roles that contributors play in Wikipedia (e.g. "copyeditor" or "vandal fighter") are informal and fluid, in contrast to other areas where roles are assigned and static. These types of roles are referred to as “emergent roles” in the literature, and a paper titled "On the "How" and "Why" of Emergent Role Behaviors in Wikipedia" at the 2017 CSCW conference looks at the extent to which contributors move between roles and, when they do, why.

This paper builds upon work by some of the same authors at the 2015 CSCW conference, in which they studied functional roles, which are defined by access levels in the system. In the upcoming paper, they use a similar approach and dataset in order to quantify roles and whether contributors take on multiple roles. Using a perspective of roles and articles, the authors identify four classes of contributors:


 * 1) Role-Article samplers: contributors who enact a particular role in a single article
 * 2) Role embracers: contributors who enact a particular role but across multiple articles
 * 3) Article embracers: contributors who enact multiple roles in a single article
 * 4) Role-Article polymaths: contributors who enact multiple roles across multiple articles
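The four classes above amount to a two-by-two grid: one role versus several, one article versus several. The following Python sketch makes that grid explicit. It is only an illustration of the taxonomy as summarized here, not the paper's method for detecting roles; the function name `classify_contributors`, the tuple layout, and the sample data are hypothetical.

```python
from collections import defaultdict


def classify_contributors(edits):
    """Assign each contributor to one of the four classes.

    `edits` is an iterable of (contributor, role, article) tuples, an
    assumed input format in which each edit has already been labeled
    with the role it enacts.
    """
    roles = defaultdict(set)
    articles = defaultdict(set)
    for contributor, role, article in edits:
        roles[contributor].add(role)
        articles[contributor].add(article)

    labels = {}
    for contributor in roles:
        multi_role = len(roles[contributor]) > 1
        multi_article = len(articles[contributor]) > 1
        if multi_role and multi_article:
            labels[contributor] = "Role-Article polymath"
        elif multi_role:
            labels[contributor] = "Article embracer"
        elif multi_article:
            labels[contributor] = "Role embracer"
        else:
            labels[contributor] = "Role-Article sampler"
    return labels


# Hypothetical labeled edits.
edits = [
    ("Ann", "copyeditor", "Article 1"),
    ("Bob", "copyeditor", "Article 1"),
    ("Bob", "copyeditor", "Article 2"),
    ("Cy", "copyeditor", "Article 1"),
    ("Cy", "vandal fighter", "Article 1"),
    ("Dee", "copyeditor", "Article 1"),
    ("Dee", "vandal fighter", "Article 2"),
]
classes = classify_contributors(edits)
```

With this sample input, Ann is a sampler, Bob a role embracer, Cy an article embracer, and Dee a polymath.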

When it comes to longevity, the Role-Article polymaths (7.4% of the contributor pool) are those who continue to stay active in the system for the longest time, with 4% of them being active for at least seven years. Role embracers also sustain participation over multiple years, and will often be focused on the second article they encounter.

To learn more about how contributor motivation affects role behaviour, the authors surveyed a stratified sample of contributors and received 175 valid responses. The survey aimed at understanding contributor motivations across four dimensions: fun, forming friendships, gaining reputation, and peer approval. The results reveal striking differences in motivation between the classes. For instance, Role-Article samplers are low across all four dimensions, while Article embracers are the opposite, high across all four. Using Role-Article samplers as a baseline, transitions to the other classes are motivated as follows:


 * 1) Role embracers: friendship
 * 2) Article embracers: reputation, peer approval
 * 3) Role-Article polymaths: fun and reputation

The paper then discusses these findings, proposing that each of the four behaviours plays a distinct role in how content is created in Wikipedia. For instance, the fact that some motivations are associated with role-transitioning behaviour while other motivations lead to transitioning between articles means that other contributors can respond differently to those who display each type of behaviour in order to foster continued participation. MWW

Conferences and events
See the research events page on Meta-wiki for upcoming conferences and events, including submission deadlines.

Other recent publications
A list of other recent publications that could not be covered in time for this issue—contributions are always welcome for reviewing or summarizing newly published research.
 * "The Impact and Evolution of Group Diversity in Online Open Collaboration" From the abstract: "we examine 648 WikiProjects to understand (1) how tenure disparity and interest variety affect group productivity and member withdrawal and (2) how the two types of diversity evolve over time. Our results show a curvilinear effect of tenure disparity, which increases productivity and decreases member withdrawal, up to a point. Beyond that point, productivity slightly decreases, and members are more likely to withdraw."
 * "Helping Wikipedia versus Helping a WikiProject: Subgroup Dynamics, Member Contribution and Turnover in Online Production Communities" From the abstract: "we analyze data from 648 WikiProjects and the archived behaviors of 14,464 member editors ... Our results reveal two critical trade-offs in managing online production communities. First, a number of factors that increase member contribution such as tenure dissimilarity and past contribution also increase one’s likelihood of leaving the community, perhaps due to conflict or feelings of “mission accomplished” or “burnout”. Second, individual membership in multiple projects has mixed and largely negative effects. It decreases the amount of work editors contribute to both the individual projects and Wikipedia as a whole. It reduces one’s likelihood of leaving individual project yet increases the likelihood of leaving Wikipedia as a whole."
 * "Transforming Wikipedia into an Ontology-based Information Retrieval Search Engine for Local Experts using a Third-Party Taxonomy" From the abstract: "Using a third-party taxonomy, independent from Wikipedia's category hierarchy, we index information connected to our local experts, present in their activity reports, and we re-index Wikipedia content using the same taxonomy. ... A Wikipedia gadget (or plugin) activated by the interested user, accesses the endpoint as each Wikipedia page is accessed. An additional tab on the Wikipedia page [developed using ResourceLoader] allows the user to open up a list of teams of local experts associated with the subject matter in the Wikipedia page."
 * "'An Encyclopedia, Not an Experiment in Democracy': Wikipedia Biographies, Authorship, and the Wikipedia Subject" Abstract: "Wikipedia biography is a culturally significant, yet overlooked form of digital life narrative. Through an examination of Wikipedia’s policies and discussion forums, and a number of its most popular and controversial biographies, this essay explores the politics of biographical practice and representation on the site."
 * "Networked knowledge : approaches to analyzing dynamic networks of knowledge in wikis for mass collaboration" From the abstract: "[This] work builds on a theoretical consideration of collaborative learning and knowledge building stemming from the interdisciplinary learning sciences and research on computer-supported collaborative learning (CSCL) in particular. ... A complex systems perspective is used to explain knowledge as an emergent phenomenon ... Based on these conceptualizations, the present dissertation empirically examines large real-life data sets from the online communities Wikipedia and Wikiversity. Knowledge is captured as a network of interconnected articles in different knowledge domains."
 * "Wikipedia and conceptions of knowledge in encyclopaedism" From the "Methodology" section: "Wikipedia [is] broken down genealogically, historically and structurally. [Then,] we study more specific epistemological aspects and problems within Wikipedia. The workings of Wikipedia will be confronted with different epistemological and hermeneutic interpretations of what knowledge is, how it works and how it is organised. ... Before concluding, the previous findings will be used to judge the effect of Wikipedia upon its cultural surroundings."