Wikipedia:Wikipedia Signpost/2020-08-02/Recent research

Receiving thanks increases retention of new editors, but not the time contributed to Wikipedia
The Citizens and Technology (CAT) Lab at Cornell University recently published two preprints studying the effects of Wikipedia's inbuilt "thanks button" on participation. These experiments were conducted in collaboration with community members from the Arabic, German, Persian and Polish language Wikipedias.

In the first study, all editors who had made a "thankable edit" (defined e.g. as receiving a "good-faith" and "non-damaging" rating from the automated ORES classifier) in the last 90 days were eligible to be randomly assigned into a test and a control group. Edits from the test group were then presented to experienced volunteers (334 overall) who decided whether to send a thanks for them. The researchers measured the effect on three preregistered main outcomes:
 * No significant changes were found in the daily labor hours contributed (comparing the six weeks before and after receiving the thank). The authors write that "this finding may result from an effect that is inconsequentially small, from the high variance in labor hours between participants, or from measurement error in the method for inferring time contributed" (which was based on grouping edits into "sessions" as proposed by Halfaker and Geiger, cf. our earlier coverage).
 * The retention of editors increased by 2% on average (an editor was defined as retained if they made "at least one edit to Wikipedia in a five week period starting at the beginning of the second week" after receiving the thank).
 * As the biggest effect, the study found that "receiving an expression of gratitude caused contributors to send 1.6 times more thanks" (according to the preprint; in an accompanying blog post, the researchers give a slightly different figure of a 43% increase).
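The first two outcome measures above can be made concrete with a minimal sketch. This is not the researchers' code. The function names, the reading of "the beginning of the second week" as seven days after the thanks, and the one-hour inactivity cutoff for grouping edits into sessions are all assumptions made for illustration.

```python
from datetime import datetime, timedelta


def is_retained(thanked_at, edit_times, offset_days=7, window_days=35):
    """Return True if any edit falls inside the retention window.

    Assumption: the five-week window opens 7 days after the thanks
    (the start of the second week) and is half-open: [start, end).
    """
    window_start = thanked_at + timedelta(days=offset_days)
    window_end = window_start + timedelta(days=window_days)
    return any(window_start <= t < window_end for t in edit_times)


def session_hours(edit_times, max_gap=timedelta(hours=1)):
    """Estimate hours worked by grouping edits into sessions.

    Assumption: consecutive edits at most `max_gap` apart belong to the
    same session. A session's length is the time from its first to its
    last edit, so a single-edit session counts as zero (a simplification).
    """
    times = sorted(edit_times)
    if not times:
        return 0.0
    total = timedelta(0)
    session_start = previous = times[0]
    for t in times[1:]:
        if t - previous > max_gap:
            # Gap too long: close the current session and open a new one.
            total += previous - session_start
            session_start = t
        previous = t
    total += previous - session_start  # close the final session
    return total.total_seconds() / 3600


if __name__ == "__main__":
    thanked = datetime(2020, 1, 1)
    print(is_retained(thanked, [datetime(2020, 1, 10)]))
    day = datetime(2020, 1, 10)
    edits = [day.replace(hour=h, minute=m)
             for h, m in [(10, 0), (10, 30), (11, 0), (14, 0), (14, 15)]]
    print(session_hours(edits))
```

An edit made only in the first week after the thanks does not count toward retention under this reading, which is why the window start is offset rather than anchored at the thanks itself.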

These results are somewhat in contrast with a study by Goel, Anderson and Zia (conducted in collaboration with the Wikimedia Foundation's research team and presented at last year's WWW conference) which found "that receiving a thank has a strong positive effect on short-term editor activity", but had to base that conclusion on correlations, as it was merely an observational study rather than a randomized controlled experiment.

CAT's second study, titled "Expressing Gratitude and Feeling Emotionally Drained on Wikipedia", randomly assigned experienced editors from the German, Polish and Persian Wikipedias (recruited via banners to participate in the study) to either "review [edits by other contributors] and send personal expressions of gratitude to four Wikipedia contributors", or, in the control group, "to carry out common, routine activities on Wikipedia". Before and after this, they were asked to fill out a survey asking whether they "feel positive about" their Wikipedia contributions, and whether they found contributing to be "emotionally draining". Contrary to the researchers' preregistered hypothesis, the experiment "failed to find an effect from expressing gratitude on differences in socially supportive activity (p=0.74) [measured in article edits, article talk page edits, and thanks sent] or positive feelings about one’s contributions to Wikipedia (p=0.065)".

In general, the researchers found that (as summarized in an accompanying blog post) "People who spend more time mentoring and people who do more to monitor Wikipedia for vandalism report feeling more emotionally drained than others. Yet people who do more monitoring also feel more positive about their contributions".

The CAT team emphasizes that both studies were conducted in collaboration with community members from the involved Wikipedias, not unlike an earlier study that found a positive effect of barnstar-like awards on the German Wikipedia. In contrast, a quite similar project where researchers from Carnegie Mellon University had planned to study "How role-specific rewards influence Wikipedia editors’ contribution" was withdrawn in early 2019 after being met with resistance from editors on the English Wikipedia.

See also this Facebook discussion with two of the CAT researchers.

Briefly

 * See the page of the monthly Wikimedia Research Showcase for videos and slides of past presentations.

Other recent publications
Other recent publications that could not be covered in time for this issue include the items listed below. Contributions, whether reviewing or summarizing newly published research, are always welcome.

"Openness, inclusion and self-affirmation: Indigenous knowledge in open knowledge projects"
From the abstract: "This paper is based on an action research project [...] conducted in 2016-2017 in partnership with the Atikamekw Nehirowisiw Nation and Wikimedia Canada. Built into the educational curriculum of a secondary school on the Manawan reserve, the project led to the launch of a Wikipedia encyclopaedia in the Atikamekw Nehirowisiw language. We discuss the results of the project by examining the challenges and opportunities raised in the collaborative process of creating Wikimedia content in the Atikamekw Nehirowisiw language ..." See also by one of the authors: "Ethics and responsibilities of open access. Lessons learned from the Wikipedia project of the Atikamekw First Nation" (abstract and 15min video).

Librarians "Reverting Hegemonic Ideology" by editing Wikipedia
From the abstract:  "... Using both Antonio Gramsci and LIS theorist Michael Harris as starting points, this paper argues that Wikipedia is predicated on a philosophy of pluralism that serves as a transmitter of hegemonic ideology, thereby upholding the oppressive status quo. To counter this issue, the paper encourages librarians to embrace 'critical editing'—an approach to Wikipedia editing built around an awareness of power, a penchant for critical literacy, a focus on desocialization, and an emphasis on self-education. The paper concludes with an example of critical editing praxis (dubbed the "Library Repository-to-Wikipedia" method) that research librarians and information professionals can replicate to counteract aspects of Wikipedia that inherently support the status quo and thus, hegemonic ideology."

"From MDMA to Lady Gaga: Expertise and Contribution Behavior of Editing Communities on Wikipedia"
From the abstract: "we present a methodology for gaining a better understanding of the contribution behavior, interests and expertise of communities of Wikipedia users. Starting from a list of core articles and their main editors, we identify which other articles (outside of the initial list) they contributed to ‘significantly’. The ordering is based on (empirical) Bayesian estimates of the contribution probabilities for each of the articles. [... We] use the editors that contributed to the articles on designer drugs as a case study. We find that the users in this community contribute significantly to articles on pharmaceuticals, popular party drugs, chemistry, mental illnesses, diseases, medicine and cell biology."

"Does Copyright Affect Reuse? Evidence from Google Books and Wikipedia"
From the abstract:  "I use the digitization of in-copyright and out-of-copyright issues of Baseball Digest magazine by Google Books to measure the impact of copyright on knowledge reuse in Wikipedia. I exploit a feature of the 1909 Copyright Act whereby material published before 1964 has lapsed into the public domain, allowing for the causal estimation of the impact of copyright across this sharp cutoff. I find that, while digitization encourages knowledge reuse, copyright restrictions reduce citations to copyrighted issues of Baseball Digest by up to 135% and affect readership by reducing traffic to affected pages by 20%. These impacts are highly uneven: copyright hurts the reuse of images rather than text and affects Wikipedia pages for less-popular players greater than more-popular ones."

See also university press release: "Wikipedia Readers Get Shortchanged by Copyrighted Material" and our coverage of an early presentation of the results (five years before the publication of this paper): "How Wikipedia articles benefit from the availability of public domain resources"

"Knowledge Graphs on the Web -- an Overview"
From the abstract:  "While companies such as Google, Microsoft, and Facebook have their own, non-public knowledge graphs, there is also a larger body of publicly available knowledge graphs, such as DBpedia or Wikidata. In this chapter, we provide an overview and comparison of those publicly available knowledge graphs, and give insights into their contents, size, coverage, and overlap."

How Wikipedia illustrates coup d’états in Soviet Russia and Egypt
From the abstract:  "we analyze variation in image usage across Wikipedia language editions to understand if, like text, visual narratives reflect distinct perspectives in articles about culturally-tethered events. We focus on articles about coup d’états as an example of highly contextual sociopolitical events likely to display such variation. [...] we use an iterative inductive coding process to arrive at a 46-item typology for categorizing the content of images relating to contested sociopolitical events, and a typology of network motifs that characterizes structural patterns of image use. We apply these typologies in a large-scale quantitative analysis that establishes clusters of image themes, two detailed qualitative case studies comparing Wikipedia articles on coup d’états in Soviet Russia and Egypt, and four quantitative analyses clustering image themes by language usage at the article level. [...] We find substantial cultural variation in both content and network structure."

"Wikipedia's Network Bias on Controversial Topics"
From the abstract and paper: "Due to the high influence that links' disposition has on users' navigation sessions, one needs to verify that, given a controversial topic, the hyperlinks' network does not expose users to only one side of the subject. [...] In this work, we define the static structural bias, which indicates if the strength of connections between pages of contrasting inclinations is the same, and the dynamic structural bias, which quantifies the network's level bias that users face over the course of their navigation sessions [based on clickstream data]. Our measurements of structural bias on several controversial topics demonstrate its existence, revealing that users have low likelihood of reaching pages of opposing inclination from where they start, and that they navigate Wikipedia showing a behaviour much more biased than the expected from the baselines. [...] For instance, we consider the topic abortion. The anti-abortion movement looks more represented than the abortion-rights one. It means that a user who randomly picks a Wikipedia’s page, has double the probability of reading an article related to anti-abortion than abortion rights."