Wikipedia:Wikipedia Signpost/2012-02-27/Recent research

A monthly overview of recent academic research about Wikipedia and other Wikimedia projects, edited jointly with the Wikimedia Research Committee and republished as the Wikimedia Research Newsletter.

Wikipedia research at CSCW 2012
The 15th annual ACM conference on computer-supported cooperative work (CSCW 2012) featured two sessions about Wikipedia studies. The first was titled "Scaling our Everest" (in amusing contrast to an earlier metaphor for the role of Wikipedia in that field of research: "the fruit fly of social software") and covered four papers. A second session likewise comprised four papers and notes. Below are some of the highlights from these two sessions.

Gender gap connected to conflict aversion and lower confidence among women
Since January 2011, Wikipedia's "Gender gap" has received much attention from Wikimedians, researchers and the media – triggered by a New York Times article that cited the estimate that only 12.64% of Wikipedia contributors are female. That figure came from the 2010 UNU-MERIT study, which was based on the first global, general survey of Wikipedia users, conducted in 2008 with 176,192 respondents using a methodology that had raised some questions (e.g. about sample bias and selection bias), but other studies found similarly low ratios. A new paper titled "Conflict, Confidence, or Criticism: An Empirical Examination of the Gender Gap in Wikipedia" has now delved further into the data of the UNU-MERIT study, examining the responses to questions such as "Why don't you contribute to Wikipedia?" and "Why did you stop contributing to Wikipedia?", finding strong support for the following three hypotheses:
 * "H1: Female Wikipedia editors are less likely to contribute to Wikipedia due to the high level of conflict involved in the editing, debating, and defending process." ("Controlling for other factors females were 26% more likely to select 'I got into conflicts with other Wikipedia contributors' as a reason for no longer contributing. The coefficients for being afraid of being 'criticized' [31% higher probability to be selected by female users as a reason against becoming more active in Wikipedia], 'yelled at', and 'getting into trouble' are all significant".)
 * "H2: Female Wikipedia editors are less likely to contribute to Wikipedia due to gender differences in confidence in expertise to contribute and lower confidence in the value of their contribution."
 * "H3: Female contributors are less likely to contribute to Wikipedia because they prefer to share and collaborate rather than delete and change other's work."
A fourth hypothesis likewise tested a conjecture that has been brought up several times in discussions about Wikipedia's gender gap:
 * "H4: Female contributors are less likely to contribute to Wikipedia because they have less discretionary time available to spend contributing".
However, the paper's authors argued that this conjecture was not borne out by the data, instead finding that "men are 19% more likely to select 'I didn't have time to go on' as a reason for no longer contributing."

Making sense of NPOV
A paper titled "From Individual Minds to Social Structures: The Structuring of an Online Community as a Collective–Sensemaking Process" looks at how Wikipedia editors talked about the Neutral point of view (NPOV) policy in the period of July 2005 to January 29, 2006, using Karl Weick's model of sensemaking and Anthony Giddens' theory of structuration for its theoretical approach. The paper's focus was on "how individual sensemaking efforts turn into interacts"; in other words, trying to understand how editors came to understand the NPOV policy through examining their posts. Editors' posts were differentiated into three types of questions (asking clarificatory questions, asking about behavior and the rules, and using questions as rhetorical devices) and answers (offering interpretation, explanation to others, and explanation to oneself).

Public Policy Initiative motivated students to become Wikipedians
In a paper titled "Classroom Wikipedia participation effects on future intentions to contribute" (presentation slides), five Michigan-based researchers looked at a sample of over 400 students who were involved in a pilot of the WMF education initiative (87% of whom were native speakers of English), and asked how likely the student-editors were to become real editors after the end of their class projects, and what the relevant factors in such conversions are. They found that the student retention ratio is higher than the average editor retention ratio (while only 0.0002% of editors who make one edit become regulars, about 4% of students have made edits after their course ended). About 75% of the students preferred the Wikipedia assignment to a regular one, and major reasons for their enjoyment included the level of engagement in class, an appreciation of global visibility of the article, and the exposure to social media.

In related news, Erik Olin Wright, the president of the American Sociological Association (ASA), who last year announced the organization's "Wikipedia Initiative", posted an overview of a graduate seminar he conducted with a Wikipedia component. The students had to review a book and use their newly gained knowledge to expand a relevant article on Wikipedia. In his assessment, Wright called the activity a "great success" and encouraged others to engage in similar activities.

High-tempo contributions: Who edits breaking news articles?
A team based at Northwestern University studied how topics of a specific nature find matching contributors in Wikipedia, or more precisely: "how editors with particular skills self-organize around articles requiring different forms of collaboration". The study focused on the case of co-authorship in the context of breaking news articles. The authors note that such articles pose an interesting paradox: articles that undergo a high-tempo editing cycle involving multiple contributors at once typically manifest quality issues, as the increased cost of interaction inhibits quality improvement work. Yet in the unique case of breaking news articles, quality tends to remain very high despite multiple contributors attempting to make simultaneous edits with incomplete information or poor coordination. The study uses revision data describing 58,500 contributions from 14,292 editors to 249 English Wikipedia articles about commercial airline disasters and represents them as a bipartite network of article nodes and editor nodes. A statistical model (p*/ERGM) is applied to estimate the likelihood of the creation of a link between a pair of nodes as a function of specific network properties or node attributes. The analysis focuses both on attributes of each set of nodes (e.g. whether an article is "breaking news", or the number of editor contributions) and on properties of article-editor pairs, as illustrated in the figure. Some of the main results of the study were:
 * Breaking news articles are more likely to attract editors.
 * Breaking news articles are not more likely to get experts to work together: experienced editors work together on high tempo collaborations significantly less often than would be expected by chance.
 * Experienced editors are unlikely to collaborate together on breaking news articles.
 * Experienced editors tend to contribute to similar types of articles more than dissimilar types of articles (suggesting the existence of classes of experienced editors who mostly focus on breaking news topics).
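
The modelling idea behind the results above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' model: it uses the general ERGM property that the conditional log-odds of a tie equal the weighted sum of its change statistics. The statistic names (`edges`, `article_is_breaking_news`), the coefficient values and the sample records are all made up for the example.

```python
import math

def build_bipartite_edges(contributions):
    """Collapse (editor, article) revision records into a set of unique ties."""
    return {(editor, article) for editor, article in contributions}

def tie_probability(theta, change_stats):
    """Conditional probability of one editor-article tie, given the rest of the network.

    In an ERGM, the conditional log-odds of a tie equal the dot product of the
    coefficients and the change statistics (how much each network statistic
    would increase if the tie were added).
    """
    log_odds = sum(theta[name] * change_stats.get(name, 0.0) for name in theta)
    return 1.0 / (1.0 + math.exp(-log_odds))

# Hypothetical coefficients, chosen only for illustration (not the paper's estimates).
theta = {"edges": -2.0, "article_is_breaking_news": 1.0}

# Hypothetical revision records: repeated edits collapse into a single tie.
contributions = [("editor_a", "crash_1"), ("editor_a", "crash_1"), ("editor_b", "crash_2")]
edges = build_bipartite_edges(contributions)

p_breaking = tie_probability(theta, {"edges": 1.0, "article_is_breaking_news": 1.0})
p_other = tie_probability(theta, {"edges": 1.0, "article_is_breaking_news": 0.0})
print(len(edges), round(p_breaking, 3), round(p_other, 3))
```

With a positive coefficient on the breaking-news attribute, a tie to a breaking news article comes out as more likely than a tie to another article, which is the shape of the first finding listed above.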

How different kinds of leadership messages increase or decrease participation
Three social computing researchers from Carnegie Mellon University measured the "Effectiveness of Shared Leadership" on the English Wikipedia – a model where leadership is not restricted to a few community members in a specialized role, but rather distributed among many. In an earlier paper (reviewed in a previous report), they had found evidence for shared leadership from an analysis of four million user talk page messages from a January 2008 dump of the English Wikipedia, classifying them (using machine learning) into four kinds of behavior indicating different kinds of "leadership": "transactional leadership" (positive feedback), "aversive leadership" (negative feedback), "directive leadership" (providing instructions) and "person-focused leadership" (indicated by "greeting words and smiley emoticons"). Based on this data, the present paper examines whether these four forms of messages increase or decrease the edit frequency of the user who receives them, also taking into account whether the message comes from an administrator or a non-administrator. Their first conclusion is that messages sent by both kinds of editors "significantly influenced other members’ motivation", and secondly, they found that "transactional leaders and person-focused leaders were effective in motivating others, whereas aversive leaders' transactional and person-based leadership had the strongest effects, suggesting that interfaces and mechanisms that make it easier for editors to connect with, reward, and express their appreciation for each other may have the greatest benefits." (The sample predates the introduction of the "WikiLove" software extension which has exactly this goal.) Addressing a common objection by active Wikipedians in defense of warning messages, they acknowledge that "[p]eople may argue that reducing the activity of harmful editors is a positive impact of aversive leadership. However, considering the fact that there is much work to be accomplished in Wikipedia and the recent downward trend of active editors, pure aversive leadership should be avoided." The paper did not attempt to measure the quality of the work of the message recipients.

The researchers had to use a technique called propensity score matching to address the difficulty that true experimentation – for instance, separating users into control groups – was not possible in this purely observational approach. However, they separately examined the case of Betacommandbot, who had sent "more than half of the messages categorized as aversive leadership" in the sample, warning users who had uploaded a non-free image without a valid fair use rationale. Because these messages had been sent to editors regardless of whether their contributions were in violation of policy at the time they were made, "the Betacommandbot warning was a natural experiment, like a change in speeding laws, that was not induced by recipients’ behavior". The effect of this warning was to decrease the recipients' edits by more than 10%.
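
For readers unfamiliar with propensity score matching, the following Python sketch shows only the matching step. It is not the paper's procedure: it assumes that each editor's propensity score (the estimated probability of receiving a message, given their observed characteristics) has already been computed, for example with a logistic regression. The `Editor` fields, the caliper value and all sample numbers are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Editor:
    name: str
    got_message: bool   # "treatment": received a given kind of message
    propensity: float   # assumed pre-estimated P(got_message | covariates)
    edits_after: int    # outcome: edits made after the message period

def match_and_estimate(editors, caliper=0.05):
    """Nearest-neighbour matching (with replacement) on the propensity score.

    Returns the matched (treated, control) pairs and the mean outcome
    difference over those pairs, or None if nothing could be matched.
    """
    treated = [e for e in editors if e.got_message]
    controls = [e for e in editors if not e.got_message]
    pairs = []
    for t in treated:
        best = min(controls, key=lambda c: abs(c.propensity - t.propensity))
        # Discard treated editors that have no sufficiently similar control.
        if abs(best.propensity - t.propensity) <= caliper:
            pairs.append((t, best))
    if not pairs:
        return pairs, None
    effect = sum(t.edits_after - c.edits_after for t, c in pairs) / len(pairs)
    return pairs, effect

# Entirely made-up sample data for illustration.
sample = [
    Editor("A", True, 0.80, 12),
    Editor("B", True, 0.30, 40),
    Editor("C", True, 0.55, 20),
    Editor("D", False, 0.78, 15),
    Editor("E", False, 0.33, 44),
    Editor("F", False, 0.10, 90),
]
pairs, effect = match_and_estimate(sample)
print([(t.name, c.name) for t, c in pairs], effect)
```

Comparing each message recipient only with a non-recipient who was about equally likely to receive the message is what lets an observational study approximate the control group it could not create.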

Other CSCW 2012 contributions

 * Which edits get reverted?: In "Learning from History: Predicting Reverted Work at the Word Level in Wikipedia" researchers examined a sample of 150 articles from the English Wikipedia with over 1,000 revisions each. Every edit was classified according to whether it was a revert or not, and examined for these features: "the number of times each word is added or removed as two separate features, leading to feature spaces that are on average three to ten thousand words in size ... comment length, the anonymity of the editor, and his or her edit count and time registered on Wikipedia." The researchers then tried to construct a separate classifier for each article predicting whether a given edit would be reverted, with random decision tree forests turning out to be most accurate, such that "the model .. obtained high accuracy [even] when vandalistic edits and bots were filtered out". Even when only taking the added words into account (ignoring user-based data and removed words), it was "still obtaining reasonable results". For the article genetic engineering, added words that made a revert likely were those "that violated policy or article conventions, had spelling errors, or had Wiki syntax errors", whereas use of terms specific to the article's subject made reverts less likely. As a possible application of their model, they speculate that it "could inform [new] editors when their edit is likely to be reverted, enabling them to reflect on and revise their contribution to increase its perceived value".
 * WikiProjects' "Collaborations of the Week" help increase participation: The authors of the paper on shared leadership reviewed above also presented a paper in the "Social Network Analysis" session, titled "Organizing without Formal Organization: Group Identification, Goal Setting and Social Modeling in Directing Online Production", finding evidence for the effectiveness of "Collaboration of the Week (COTW)"-type article improvement drives on WikiProjects. (The Signpost's "WikiProject report" series is cited at one point in the paper.)
 * Should a new wiki be "seeded" to invite participation?: Apart from research specifically about Wikipedia, the conference featured many other results that are potentially of interest to Wikimedians and Wikipedia researchers. For example, a paper titled "Bootstrapping wikis: Developing critical mass in a fledgling community by seeding content" reported on an experiment with 96 students who were asked to spend 20 minutes on contributing to a new MediaWiki-based course wiki, and "found that users tend to contribute more content, and more unstructured content, when they are given a blank slate. This suggests that bootstrapping is not always a positive. However, users tend to contribute content roughly similar to any seeded content. Bootstrapping can be used to direct user effort toward contributing specific types of content".
 * Two other papers presented at CSCW 2012 focused on the editing behavior of new Wikipedians and on collaboration in breaking news articles.
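
The word-level features described in the revert-prediction item above ("the number of times each word is added or removed as two separate features") can be illustrated with a short Python sketch. It is a simplification under stated assumptions, not the researchers' pipeline: tokenization is a naive lowercase whitespace split, and the `added:`/`removed:` feature names are invented for the example. The resulting dictionaries could then be passed to a classifier such as a random forest, which is not shown here.

```python
from collections import Counter

def word_diff_features(old_text, new_text):
    """Count how often each word was added or removed between two revisions.

    Added and removed counts are kept as separate features.
    """
    old_words = Counter(old_text.lower().split())
    new_words = Counter(new_text.lower().split())
    added = new_words - old_words      # Counter subtraction keeps positive counts only
    removed = old_words - new_words
    features = {f"added:{word}": count for word, count in added.items()}
    features.update({f"removed:{word}": count for word, count in removed.items()})
    return features

print(word_diff_features("the cat sat", "the cat sat sat down"))
```

Because every distinct word becomes its own feature, the feature space grows with the article's vocabulary, which matches the paper's remark that these spaces run to thousands of words.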

Wikipedia discourse on Europe analyzed
A master's thesis by Dušan Miletić, Europe According to English Wikipedia: Open-sourcing the Discourse on Europe, looks at the nature of the discourse on Europe in the English Wikipedia, employing Foucauldian discourse analysis, which focuses on analyzing the power in relationships as expressed through language. The thesis notes that "changes to the statements defining what Europe is, which hold the cardinal role in the discourse, had much more significance than others." In other words, the editors who succeeded in changing the definition of Europe were subsequently able to have their points of view better represented in the remainder of the article. Another finding suggests that the definition of European culture was much more difficult to arrive at, and spawned many more revisions throughout the article, than the discussion of the geography of Europe. The thesis also discusses the blurry boundary between Europe and the European Union. It concludes that the borders of European culture are not the same as the borders of geographical Europe, and hence that the difficult task of defining Europe – and revising the Wikipedia article – is bound to continue.

The significance of the first edit
A paper titled "Enrolled Since the Beginning: Assessing Wikipedia Contributors' Behavior by Their First Contribution" by researchers at Telecom Bretagne looks at an editor's first contribution as an indicator of their future level of involvement in the project. After having discovered Wikipedia, the sooner one makes their first edit, the higher the likelihood they will continue editing. Reasons for the first edit matter, as those who just want to see how a wiki works are less likely to keep editing than those who want to share (improve) something specific, content-wise. Making a minor edit is much less likely to result in a highly active editor; those who will become very active are often those whose very first edit required a large investment of time. As the authors note, "it seems that those who will become the core editors of the community have a clearly defined purpose since the beginning of their participation and don’t waste their time with minor improvements on existing articles". Finally, the authors find that people who have a real-life contact show them how to edit Wikipedia are much more likely to become regular Wikipedia contributors than people who learn how to edit by themselves.

Given enough eyeballs, do articles become neutral?
Building on their previously reviewed research, Greenstein and Zhu ask "will enough eyeballs eliminate or decrease the amount of bias when information is controversial, subjective, and unverifiable?" Their research calls this into question, by taking a statistical approach to measuring bias in Wikipedia articles about US political topics, which uses Linus’ Law ("Given enough eyeballs, all bugs are shallow") as a null hypothesis.

They rely on a slant index previously developed for studying news media bias, which specifies certain code words as indicating Republican or Democratic bias. Within their sample of 28,382 articles relating to American politics, they find that the category and vintage of an article are most predictive of bias. "Topics of articles with the most Democrat words are civil rights, gun control, and homeland security. Those with the most Republican words are abortion, foreign policy, trade, tax reform, and taxation. ... [T]he slant and bias are most pronounced for articles born in 2002 and 2003". While they do not find a neutral point of view within each article or topic, across articles, Wikipedia balances Democratic and Republican points of view.
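
The general idea of a code-word-based slant measure can be sketched in Python as follows. This is a crude count-based stand-in, not the index the authors use, which is more elaborate than a raw count. The two phrase lists are illustrative placeholders rather than the study's actual code words, and the normalization to a score between -1 and +1 is an assumption made for the example.

```python
def slant_score(text, republican_phrases, democratic_phrases):
    """Return a score in [-1, 1]: positive leans Republican, negative Democratic.

    Counts case-insensitive occurrences of each code phrase and normalizes the
    difference by the total number of code phrases found.
    """
    lowered = text.lower()
    rep = sum(lowered.count(phrase) for phrase in republican_phrases)
    dem = sum(lowered.count(phrase) for phrase in democratic_phrases)
    total = rep + dem
    if total == 0:
        return 0.0  # no code phrases found: treated as unslanted
    return (rep - dem) / total

# Placeholder phrase lists, for illustration only.
REPUBLICAN = ["death tax"]
DEMOCRATIC = ["estate tax"]

example = "The death tax debate. Estate tax reform, estate tax."
print(slant_score(example, REPUBLICAN, DEMOCRATIC))
```

A score like this is computed per article, so opposite slants in different articles can cancel out in an average. That is how a corpus can look balanced overall even when individual articles are not neutral.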

Yet answering "Why did Wikipedia become less biased over time?" is more challenging. They classify explanatory variables into three groups: attention and editing; dispersion of contributions; and article features. The narrow interpretation of Linus' Law would make attention and editing the only relevant feature (not supported by their data), while a broader interpretation would also take dispersion into account (weak support from their data). While both the number of revisions and the number of editor usernames are statistically significant, they work in opposite directions. Pageviews, while also statistically significant, are unavailable before February 2007. They also suggest questions for further work, including improvements to their revision sampling (they "divide [each article's] revisions into ten revisions of equal length") and overall sampling method (which uses the same techniques as their earlier work).

Navigating conceptual maps of Wikipedia language editions
A paper from this year’s Conference on Human Factors in Computing Systems (CHI 2012) entitled "Omnipedia: Bridging the Wikipedia Language Gap" presents the features of Omnipedia, a system that enables readers to analyse up to 25 language editions of Wikipedia simultaneously. The study also includes a review of the challenges that the architects faced in building the Omnipedia system, as well as the results of initial user testing. According to the authors, language barriers produce a silo effect across the encyclopedias, preventing users from being able to access content unique to different language editions. Omnipedia, they write, reduces the silo effect by enabling users to navigate different concepts (over 7.5 million of them) from up to 25 language editions of Wikipedia, highlighting similarities and differences in an interactive visualization that shows which concepts different editions mention and how each of those topics is discussed.

The authors provide the example of the English Wikipedia article on conspiracy theory, showing how it discusses many topics – from “Moon landing” to “Kennedy assassination”. Other language editions contain articles on the same concept, including Verschwörungstheorie in the German Wikipedia and teoria conspirativa in the Spanish Wikipedia. Omnipedia consolidates these articles into a single "multilingual article" on conspiracy theories, showing which topics are discussed in only one language edition and which are discussed in multiple language editions.
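
The single-edition versus multi-edition distinction in a "multilingual article" can be sketched in Python. This is not Omnipedia's algorithm: it assumes the hard part, aligning topics across languages into shared concepts, has already been done. The function name and the per-edition topic lists are invented for illustration and are not real data about those articles.

```python
from collections import Counter

def topic_coverage(topics_by_edition):
    """Split topics into those mentioned by one edition and those shared by several."""
    counts = Counter()
    for topics in topics_by_edition.values():
        counts.update(set(topics))  # count each topic at most once per edition
    single = sorted(topic for topic, n in counts.items() if n == 1)
    shared = sorted(topic for topic, n in counts.items() if n > 1)
    return single, shared

# Made-up, already-aligned topic lists (not real data about these articles).
multilingual_article = {
    "en": ["Moon landing", "Kennedy assassination"],
    "de": ["Moon landing"],
    "es": ["placeholder topic"],
}
single, shared = topic_coverage(multilingual_article)
print(single, shared)
```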

The paper concludes with the results of user testing, showing how the volume of single-language topics was "a revelation to the majority of users", but also how users who targeted concepts they expected to reveal differences in perspective (for example "Climate scepticism" or the "War on Terror") actually found fewer differences than anticipated. The authors conclude by highlighting their contributions to this area of study, including a system that for the first time allows simultaneous access to large numbers of Wikipedia language editions – powered by several new algorithms that they assert “preserve diversity while solving large-scale data processing issues” – and a demonstration of the value of Omnipedia to user analysis of concepts explored in different language editions.

Briefly

 * Taxonomy extraction in Wikipedia. Mike Chen recently defended an MSc thesis at Ohio University focused on extracting a taxonomy from Wikipedia articles. A similar problem is discussed in a paper by a team based at Technische Universität Darmstadt which presents a solution tackling two of the main challenges for building a robust category system for Wikipedia: multilingualism and the sparse connectivity of the semantic network (i.e. the fact that users do not identify resources on the same topic with identical tags). A third paper by researchers at Telecom Bretagne develops a method to extract a tree from the Wikipedia category graph and tests its classification precision against a corpus from Wikinews.
 * Dynamics of Wikipedia conflicts. A team of researchers from Hungary studied the dynamics of controversies in Wikipedia and analyzed their temporal characteristics. They find a correspondence between conflict and burstiness of activity patterns and identify patterns that match cases eventually leading to consensus as opposed to articles where a compromise is far from achievable.
 * Do social norms influence participation in Wikipedia? A paper titled "Factors influencing intention to upload content on Wikipedia in South Korea: The effects of social norms and individual differences" reported on the results of a survey, analyzing responses by 343 South Korean students ("uploading" meaning any form of contributing to Wikipedia). Among the findings was that "users of Wikipedia presented higher perceived injunctive norm and greater self-efficacy [roughly: confidence in their ability] toward the intention to upload content on Wikipedia than non-users. These findings can be understood as follows: uploading content on Wikipedia is a socially desirable act given that it contributes to knowledge sharing, and thus, for the people who already use Wikipedia, they might feel that they are urged by their social groups to upload content on the site as a way of participating in the making of collective intelligence."
 * Link disambiguation and article recommendations. An MSc thesis in computer science defended by Alan B Skaggs from the University of Maryland proposes a statistical topic model to suggest new link targets for ambiguous links in Wikipedia articles. Three Stanford University students in computer science proposed a recommendation engine for Wikipedia articles using only a small set of articles liked by a population of users as training data.