Wikipedia talk:Wikipedia Signpost/2012-08-27/Recent research

Low-hanging fruit
I totally buy the low-hanging fruit hypothesis. Maybe the authors of that paper could have done more to clarify their definitions, but they still convince me. Signpost misses the point when it argues that it is easier to write about a relatively obscure area than to expand an existing article in a well-known area. Yes. That is true. But in the early days:
 * 1) It wasn't so much about expanding an article; it was about creating a new one.
 * 2) There were fewer rules and much less bureaucracy to deal with when either creating or expanding.
Some of the rules and bureaucracy are required, to protect what we have already got and enable such a large group of editors to work together. Maybe that is a missing piece of the puzzle that the Stanford authors could have mentioned, but it does actually fit in nicely with the concept of low-hanging fruit. When you have fewer editors and fewer, lower-quality articles you don't need so much bureaucracy, making it easier to pick that fruit.

Yaris678 (talk) 11:44, 29 August 2012 (UTC)
 * While you may be right, this is not what the authors said, and our (mine...) critique was more of what they actually said :) --Piotr Konieczny aka Prokonsul Piotrus| reply here 16:50, 30 August 2012 (UTC)

Hi! I am one of the authors of the paper. Thanks for the responses Piotr and Yaris.

Piotr, I agree that we did not define our terms very precisely - I think this is in part an artifact of this paper being coupled with a verbal presentation, but I will make efforts to define our terms more precisely. I also agree that we left some areas under-explored; the quarter ends unfortunately quickly. Our initial explorations included attempts at quantifying "deletionism" and its resulting influence, but we did not find anything promising, and were not able to come back to the topic in time. I am, however, opposed to your claim that the Missing Articles list offers support against the low-hanging fruit argument with respect to ability to create articles. Indeed, I would almost consider that list as supporting the argument that expert or esoteric knowledge is required! I should like to quantify both people's ability and desire to approach these subjects to see what comes out. Ultimately, I do concede that our three supporting arguments don't conclusively prove our core hypothesis - personally I think they only lend weight to the view that, amongst our initial three hypotheses, the low-hanging fruit idea is the most likely. Nevertheless, I hope it can act as a starting point should anyone explore further.

Yaris, we tried to quantify the effect of rules and bureaucracy in the following manner :
 * 1) Create features for editors (such as # posts added, # posts reverted/deleted, # topics touched, etc.)
 * 2) Cluster editors
 * 3) Assign labels to the clusters manually (the hope being there would be a cluster of "novices", of "reverters", etc.)
 * 4) Observe how these clusters change over time
 * 5) Ideally, observe the reverters cluster grow

For whatever reason (feature selection, algorithm, who knows) we couldn't get good, consistent clusters. This led us to do the statistical analysis we did.
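For readers who want to try something similar, the first three steps above can be sketched roughly as follows. This is only a minimal illustration, not the code used for the paper: the editor names, the feature values, the choice of plain k-means, and the rule standing in for manual labelling (`label_clusters`) are all assumptions made up for the example. It clusters one time slice; repeating it per time period would give the cluster sizes to compare for steps 4 and 5.

```python
import math
import random

# Hypothetical per-editor features (made-up values):
# (edits_added, edits_reverted, topics_touched)
EDITORS = {
    "editor_a": (5, 0, 2),
    "editor_b": (7, 1, 3),
    "editor_c": (4, 0, 1),
    "editor_d": (40, 120, 60),
    "editor_e": (35, 150, 55),
    "editor_f": (50, 110, 70),
}


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's k-means; returns (centroids, cluster index per point)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: mean of each cluster's members (keep old centroid if empty).
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, labels


def label_clusters(centroids):
    """Stand-in for manual labelling (an assumption): the cluster whose
    centroid has the highest share of reverted edits is called 'reverters',
    every other cluster 'novices'."""
    def revert_share(centroid):
        added, reverted, _topics = centroid
        total = added + reverted
        return reverted / total if total else 0.0

    top = max(range(len(centroids)), key=lambda i: revert_share(centroids[i]))
    return ["reverters" if i == top else "novices" for i in range(len(centroids))]


def cluster_sizes(editors, k=2):
    """Cluster the editors and count how many fall under each label."""
    points = [tuple(float(x) for x in feats) for feats in editors.values()]
    centroids, labels = kmeans(points, k)
    tags = label_clusters(centroids)
    sizes = {}
    for lab in labels:
        sizes[tags[lab]] = sizes.get(tags[lab], 0) + 1
    return sizes


if __name__ == "__main__":
    print(cluster_sizes(EDITORS))
```

On real data the features would probably need scaling before clustering, since raw counts on very different ranges can dominate the distance; that may be one of the reasons clusters come out inconsistent.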

I would like to see someone better quantify the effect of bureaucracy. I think it is there, even though I consider it to play a smaller role than low-hanging fruit. I interviewed five editors, and while four of them did not think the culture was too terrible, one of them spoke vehemently against it, saying he never edited a talk page because it was filled with bickering and flame wars.

Thanks again for the feedback. If anyone should care to contact me (or one of the other authors) I will come visit this page again or you can contact me by email by clicking on my user name below and then clicking on "Email this user" in the "toolbox" menu on the left.

AustinGibbons (talk) 22:07, 30 August 2012 (UTC)

One more thing - I strongly encourage anyone looking at Wikipedia trends to do so across many languages! We observed many more interesting patterns than just what we presented, particularly with respect to those pesky Wikipedia bots!

AustinGibbons (talk) 22:08, 30 August 2012 (UTC)

Short comments

 * China - one reason that there is less comment from Chinese contributors than we might like, is that the Great Firewall of China has made it difficult or impossible for them. It is of course a leap of faith to assume that increased Chinese contributions would necessarily lead to coverage more in line with the Chinese government's point of view, and an even bigger leap of faith to assume that this would be a good thing. Rich Farmbrough, 15:29, 29 August 2012 (UTC).

NPOV History

 * The limits of amateur NPOV history - I agree that the general history articles are dominated by the great man history (or "Dead White Men") understanding of world history. That's in part a generational/cultural POV issue, because of the tradition of "great man history" as the familiar approach used in school history textbooks for so many decades (nay, centuries), so Wikipedia editors won't reflect the shift towards more inclusive perspectives in historiography and historians' research methods for a few years yet. And, we all know the systemic bias plaguing Wikipedia has a chance to morph into something different and more diverse... Visual editor, where are you? lol ... OTOH, if WP became more diverse and had more expertise from academic historians, would it begin to lean more "politically correct" (at least in the view of readers outside the ivory tower)? I'm of two minds about that. I'm mostly curious though, to see what happens with some of these articles in WP once people start to realize the... canonical(?)... influence WP seems to potentially develop in certain areas. OttawaAC (talk) 01:54, 30 August 2012 (UTC)
 * Was it really a study of just one article? How are we supposed to conclude anything based on that? (Not that I disagree with the conclusion: though I might suggest that what gives Great Man theories their advantage is not just historical inertia but also that they make for the most interesting reading.) - Jarry1250 [Deliberation needed] 09:49, 30 August 2012 (UTC)
 * We can conclude things, although generalization is an issue. I'd hesitate to make claims about Wikipedia based on one article, unless it were really representative, which I am not sure we can argue was the case here. --Piotr Konieczny aka Prokonsul Piotrus| reply here 16:50, 30 August 2012 (UTC)
 * While we shouldn't generalize from analysis of the one article, it does line up with my experience; a lot of articles get cobbled together bit by bit, without anyone who really knows the relevant literature organizing the overall structure and balance of coverage. That said, the paper builds on some other research that points in a similar direction using a broader and less nuanced approach (which I haven't looked at). I think this is sort of a deep-dive followup to earlier work by Luyt and others.--Ragesoss (talk) 00:18, 1 September 2012 (UTC)
 * I am glad to see other people are interested in this matter. User:Wadewitz and I are currently working on an article on how to use statistical analysis of FAs and GAs that grapple with history to look at this issue more broadly (looking at types of sources, when they are published, etc). We are also talking about a qualitative analysis of a subset of articles to describe how the various review processes and other policies influence the source bias. Sadads (talk) 16:55, 3 September 2012 (UTC)

China #2

 * "[T]he way Wikipedia frames the event is much closer to that of The New York Times than the sources preferred by the Chinese government"- well yeah, given that the Chinese government was full of shit and everyone knew it and still knows it, I'm not sure that's a bad thing. Their point of view should be included, to be sure, but reporting the known distortions and outright lying from the Chinese government as if they're factual along with the NYT and other sources we know are factual doesn't make any sense.  The Blade of the Northern Lights  ( 話して下さい ) 00:29, 2 September 2012 (UTC)

Doctors
I can hear it now: "I stayed at a Holiday Inn Express last night and I read about your surgery on Wikipedia. Now where am I supposed to start cutting?"--ukexpat (talk) 17:56, 30 August 2012 (UTC)

Predicting quality flaws
FlawFinder sounds very interesting. I can think of at least two ways it could be used: Yaris678 (talk) 21:02, 2 September 2012 (UTC)
 * 1) A WP:STiki-like system that suggests articles to look at and tag or improve.
 * 2) A system to look at the history of an article and how the probabilities of faults changed over time. This may give people a clue as to when an article got messed up.

Late comment, but whatever: I cited the low-hanging fruit hypothesis in this op ed ages ago. Res Mar 21:32, 9 February 2013 (UTC)