Wikipedia:Wikipedia Signpost/2023-03-09/Recent research

"Wikipedia's Intentional Distortion of the Holocaust"

 * Reviewed by Nathan TeBlunthuis

English-language Wikipedia, so influential in shaping collective memory in today's world, has been presenting systematically misleading information about Nazi Germany’s genocide of the European Jews, by "whitewash[ing] the role of Polish society in the Holocaust and bolster[ing] stereotypes about Jews." Showing this is the important contribution of "Wikipedia's Intentional Distortion of the History of the Holocaust," a scholarly essay by Jan Grabowski and Shira Klein published in The Journal of Holocaust Research. In the past few weeks, this publication has already sparked a response including media coverage and a new arbitration case. This review's purpose is to summarize the essay and its contributions and to reflect on its merits and significance, and it will not engage the widespread debates in this area more than necessary (see also coverage in this and the previous issue of The Signpost).

Grabowski and Klein's central claim is twofold. First, Wikipedia articles often support a narrative of Holocaust distortion (not denial) with four elements: (1) overstating the suffering of Poles in comparison to Jews during World War II, (2) understating Polish antisemitism and Nazi collaboration while overemphasizing the rescue of Jews by Poles, (3) insinuating that Jews "bear responsibility for their own persecution" because of their communism and/or greed, and (4) exaggerating the role of Jewish-Nazi collaboration. The result misrepresents the Polish nation's role in the Holocaust and contradicts mainstream historiography, as Grabowski and Klein show by citing prior scholarship.

Grabowski and Klein provide very strong support for this first claim, that Wikipedia bolsters each form of distortion. They offer myriad examples where articles ranging from Stawiski, Warsaw Concentration Camp, Naliboki massacre, History of the Jews in Poland, Collaboration with the Axis Powers, to Rescue of Jews by Poles during the Holocaust, and Polish Righteous among the Nations have supported the distortion narrative by including claims backed by dubious sources or overemphasizing facts aligned with the distortion narrative while ignoring or underemphasizing facts that do not support it. Many of the errors Grabowski and Klein identify, and their role in the narrative, are not obvious to non-experts, and so an important contribution of this scholarship is to make the pattern of distortion clear.

Wikipedia's distorted coverage is harmful, Grabowski and Klein persuasively argue, because "Wikipedia plays a critical role in informing the public about the Holocaust in Poland." It is important that Wikipedia not reproduce this distortion, because misremembering the Holocaust can increase the risk of future antisemitic violence and genocide. Many Poles believe elements of the distortion narrative, which Poland's current government has taken legal and administrative steps (e.g., creating monuments for apocryphal Poles who rescued Jews) to popularize. To be clear, critiques of distortion do not blame Poles for the Holocaust; no one disputes that Nazi Germany is at fault. Still, Grabowski and Klein cite evidence that Polish antisemitism was common during, before, and after WWII, and that Poles (without direct Nazi coercion) committed atrocities against Jews during the war as well as afterward when Jews returned to Poland and attempted to reclaim their stolen property. Although they are not entirely clear about why distortion is popular, this juxtaposition suggests that it relieves a sense of national guilt.

The second part of Grabowski and Klein’s thesis is that a small group of committed Wikipedians "with a Polish nationalist bent" have persistently and successfully defended both the distortion narrative's claims and sources advancing it. The essay argues that these editors are substantially responsible for the observed distortion pattern, citing article diffs, excerpts from on-wiki discussions, and edit counts. It also relies on interviews with some of the editors that it describes as "distortionists", their opponents, and involved Wikipedia administrators.

Grabowski and Klein persuasively argue that these editors heavily worked on Wikipedia articles that (typically in versions from early 2022) included the four types of distortion, and in doing so often cited unreliable sources that contradict historical scholarship. These editors surface again and again throughout the topic area and its controversies, defending the validity of dubious authors as sources while attacking "well-known experts on Holocaust history" who contradict them. In a striking quantitative description of the distortionist editors' outsized influence, Grabowski and Klein argue that Wikipedia cites two authors they view as distortionist (Richard C. Lukas and M. J. Chodakiewicz) much more than the mainstream experts (Doris Bergen, Samuel Kassow, Zvi Gitelman, Debórah Dwork, Nechama Tec) even though the former have far fewer academic citations than the latter according to Google Scholar.

Two of the editors criticized as distortionists, Piotrus and Volunteer Marek, have defended themselves in terms of the essay's omissions and possible errors, only some of which are actual errors. One notable inaccuracy is that the method for counting citations using Google Scholar is imprecise and today surfaces many more citations to Richard C. Lukas than Grabowski and Klein reported. Yet, even this inaccuracy does not change the broader conclusion that Wikipedia relies too heavily on Lukas' work (also, Klein has uploaded a table with updated numbers (.csv) which continue to support the original conclusion). The title of his most-cited work, The Forgotten Holocaust, refers to the suffering of Poles under Nazi occupation. The Nazis indeed had a murderous colonial policy to "Germanize" Poland, but this is distinct from the Holocaust, which refers to the genocide of European Jews. Lukas' title thus insinuates a false equivalence between Polish and Jewish suffering. Arguably, Wikipedia should not reference this at all, at least not without blinding clarity about how it contradicts mainstream sources.

From these editors' defensive responses, it is clear Grabowski and Klein have interpreted their actions unsympathetically to the extent that they overlooked their many valuable contributions to Wikipedia, some of which involved removing distortion. This omission is mostly understandable. A thorough account of these editors' Wikipedia careers (spanning more than 18 and 17 years, respectively) would have distracted from identifying and accounting for the Holocaust distortion on Wikipedia. In this reviewer's view, even if we take these defenses on board, Grabowski and Klein's possible errors are small relative to their abundant evidence that this group, comprising around a dozen or so editors, helped secure a foothold for the Holocaust distortion in Wikipedia articles.

That said, we should recognize how this case surfaces some of Wikipedia's more fundamental problems. At its core, this was a conflict about which Holocaust narratives belong on Wikipedia exemplified by questions such as: "Should Wikipedia include elements of Polish heroism?" and "How should facts about Poles rescuing Jews from the Holocaust be sourced, emphasized or positioned relative to facts about Polish atrocities or complicity in the Holocaust?" These questions are broad, complex, and require subject-matter knowledge and historiographic consideration to answer.

In their essay's final and most thought-provoking section, Grabowski and Klein describe how Wikipedia administrators and arbitration committee (ArbCom) members responded to the conflict. They are sharply critical of ArbCom members who "don't do the homework it takes to recognize distortion" and "wish to avoid fights in this area." It is standard practice on Wikipedia for administrators to avoid questions like those above by bracketing them as content disputes (which community members are normally supposed to resolve on their own) rather than misconduct (which administrators are normally empowered to address). This practice means that transforming a broad conflict about a content area into a series of narrow misconduct cases can be an effective strategy for winning (or at least dragging out) the conflict about content. Many times, administrators dismissed reports about the distortionists for being about content not conduct. On three occasions reports resulted in arbitration cases and even sanctions such as topic bans on distortionists and a discretionary "reliable-source consensus" requirement (WP:APLRS) intended to empower administrators to intervene against controversial sources. Efforts to enforce such sanctions, however, were themselves dismissed as content disputes and the topic bans were ultimately reversed (once ahead of schedule).

Emerging from this administrivia is a picture of Wikipedia's highest institutions straining under the complexity of this case. Strikingly, steps taken to simplify administrators' tasks shift the burden of proof onto the parties of a conflict. Short word-limits in case statements were too constraining for defenders of historical accuracy to be able to explain to non-experts the problems with distortion in the articles (indeed, it takes Grabowski and Klein most of 50 pages), but provided enough space for distortionists to deflect the accusations. Thus advantaged, the authors argue, distortionists skilled in wikilawyering effectively steered the content-dispute-averse administrators away from the fundamental conflict over historical narratives and toward the particular conduct of individual editors, which is easier for the ArbCom to address.

As noted above, Grabowski and Klein may have made errors, yet these barely undermine their central argument. An audience of Wikipedia scholars is more likely to feel underwhelmed by the essay's sparse engagement with the existing Wikipedia research literature beyond the amount needed to demonstrate Wikipedia's influence and importance to collective memory. Better positioning this case study within Wikipedia scholarship could have shed new light on Wikipedia's fundamental limitations. Past scholarship has discussed systematic flaws in Wikipedia's dispute resolution processes (cf. our review: "Critique of Wikipedia's dispute resolution procedures") and the damage when disagreements about article content turn into conflicts about bureaucratic process and individual conduct. In the Gamergate Controversy, for example, the ArbCom's decision to punish editors who were defending against a coordinated anti-feminist brigade similarly reveals how Wikipedia administrators' myopic focus on civil conduct and procedural fairness can distract from a fundamental conflict about content—and even become an effective tool for disingenuous actors. Yet other research finds that Wikipedia can be remarkably resilient to partisan misinformation because conflicting partisans hold each other to the same policies (cf. our review: "Politically diverse editors and article quality"). We might ask: What (if anything) was special about this Holocaust case such that it reveals Wikipedia’s limitations so starkly? Or: How (if at all) should Wikipedia's institutions for dealing with content disputes evolve? This case presents an important opportunity to consider such questions. Grabowski and Klein, content to draw attention to this case and document it in great detail, have left this to future work.

"Let's Work Together! Wikipedia Language Communities' Attempts to Represent Events Worldwide"

 * Reviewed by Piotr Konieczny

The paper addresses the issue of systemic bias, and focuses on English, Chinese, Arabic and Spanish Wikipedias. The authors study the production of seven years of news on these projects (from the "In the news" (ITN) section on the Main Page and its equivalents), and conclude that while there is an indication of self-focus bias, there is also strong evidence of a global representation of events. Self-focus, here, refers to focusing on one's home region or culture, and past studies found that about a quarter of most Wikipedias are about "self-focused topics".

The authors ended up with a dataset totaling 6730 articles (2064 in English, 1379 in Arabic, 1527 in Chinese and 1760 in Spanish), which correspond to 2064 events: 172 in Arabic-speaking countries, 115 in Chinese-speaking areas, 114 in Spanish-speaking regions, 445 in the US, 472 in other English-speaking countries and 746 in [other] areas. The events were also coded by topic, which resulted in 192 events classified as Science & Nature, 714 as Notable Person, 337 as Sports, 299 as Politics, 231 as Man-made Incidents, and 291 as Other categories. To compare Wikipedia's coverage to global media coverage, the authors also associated their dataset with that of the GDELT Project.
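As a quick sanity check on the figures quoted above, the following Python sketch sums the reported counts and computes each category's share of the total. The numbers are copied from this review; the variable names and the `share` helper are illustrative assumptions and are not taken from the paper.

```python
# Counts as quoted in this review (not re-derived from the paper's data).
articles_by_language = {
    "English": 2064, "Arabic": 1379, "Chinese": 1527, "Spanish": 1760,
}
events_by_region = {
    "Arabic-speaking countries": 172,
    "Chinese-speaking areas": 115,
    "Spanish-speaking regions": 114,
    "US": 445,
    "Other English-speaking countries": 472,
    "Other areas": 746,
}
events_by_topic = {
    "Science & Nature": 192, "Notable Person": 714, "Sports": 337,
    "Politics": 299, "Man-made Incidents": 231, "Other": 291,
}


def share(counts):
    """Hypothetical helper: each category's percentage of the total."""
    total = sum(counts.values())
    return {name: 100 * value / total for name, value in counts.items()}


total_articles = sum(articles_by_language.values())
total_events_by_region = sum(events_by_region.values())
total_events_by_topic = sum(events_by_topic.values())

print("articles:", total_articles)
print("events (by region):", total_events_by_region)
print("events (by topic):", total_events_by_topic)
for region, pct in share(events_by_region).items():
    print(f"{region}: {pct:.1f}% of events")
```

The three totals come out to 6730, 2064 and 2064, so the per-language, per-region and per-topic breakdowns quoted above are internally consistent.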

Some specific findings suggest that English Wikipedia suffers from a slight under-representation of events in Arabic-speaking countries. The Arabic Wikipedia, on the other hand, does not show much self-focus bias; instead it over-represents events that happen in English-speaking countries (but not the United States). The Chinese and Spanish Wikipedias, the authors argue, have a stronger self-focus bias than the Arabic and English projects, although even so, over 90% of events covered by the news sections of these projects are about items not related to these countries. The authors also find, perhaps unsurprisingly, that larger Wikipedias react to breaking news faster and update their news sections more promptly.

Briefly

 * See the page of the monthly Wikimedia Research Showcase for videos and slides of past presentations.

Other recent publications
Other recent publications that could not be covered in time for this issue include the items listed below. Contributions, whether reviewing or summarizing newly published research, are always welcome.
 * Compiled by Tilman Bayer

"Digital divides in the social construction of history: Editor representation in Wikipedia articles on African independence processes"
From the abstract: "The present study examines how [Wikipedia's] editor geography is reflected in the editing of articles (participation, impact and success) about the independence of former French colonies in Africa. The analysis is based on 354 Wikipedia articles; by geolocating 75% of the editors (N = 23,408), we show that the majority of edits are made by users located in France. This imbalance is also reflected in the overall share of text they contribute over time. However, when looking at the individual user level, we find that editors from France are only slightly more successful in maintaining their contributions visible to the reader, than editors from African successor states."

"A Wikipedia Narration of the GameStop Short Squeeze"
From the abstract:  "This paper examines the usefulness of Wikipedia pageviews as indicator of the performance of stock prices. We examine the GameStop (GME) case, which drew the investors’ and scholars’ attention in 2021 due to the short squeeze, and its skyrocketing price increase since 2021. [...] The results show strong statistical evidence that increased number of Wikipedia pageviews for COVID-19, which represents the fear of the pandemic, has a negative impact on the GME performance. Moreover, the findings show that the increased interest in information regarding the short squeeze, as expressed by the increased number of pageviews of the relative Wikipedia page, is positively linked with the GME price. The econometric analysis shows that the interest indicator of GME has a positive coefficient, but it is not confirmed at significant statistical level."