User:Colonel Chaos/study

In the last several days (as of 1 May 2007), I conducted a small survey of vandalism to featured articles on Wikipedia. This page discusses my methods and results. The shocking result of my study: it takes the Wikipedia community, on average, 10 hours to remove serious vandalism, beyond the childish sort, from Featured Articles.

Method
For my project, I studied only featured articles. Featured articles should be the very best Wikipedia has to offer, as they have been extensively checked and vetted. At the outset, I assumed that the individuals who went to the work of creating featured articles would also watch them heavily, and that most featured articles would be on a number of watchlists. For these reasons, I assumed that featured articles would display the ideal response time to vandalism.

Of course, we all know that vandalism involving profanity and childish insults is caught quickly, thanks to the efforts of our hundreds of RC patrollers and the multitude of software tools built to spot easy vandalism. They are quite adept at finding and reverting "Your mom..." or "PENIS!" So, for this study, I engaged in slightly more complex vandalism in three categories:
 * Grave Factual Inaccuracy: For these articles, I changed or inserted material that any average reader or editor of Wikipedia would immediately know to be untrue. For example, in the hydrochloric acid article, I wrote that Martin Sheen discovered the acid by mixing potatoes with salt, and that Martin Sheen also invented Agent Orange for dissolving gold. Finally, I inserted the sentence: "Most of Sheen's research received funding from the United States military due to their interest in new weapons technology". If a reader or editor didn't immediately notice that this information was false (and wholly inconsistent with the remainder of the article), clicking on any of the wikilinks I inserted would have revealed the truth.
 * Complete Nonsense: In this category, I inserted a passage of completely irrelevant prose into an article. For example, I inserted the opening of This Side of Paradise into the middle of the article Island Fox.
 * Factual Inaccuracy: In this category, I changed articles more subtly, so that a reader would need knowledge of the topic to spot the incorrect information. For example, in the lead section of the article on Norman Borlaug, I changed "Between 1965 and 1970, wheat yields nearly doubled in Pakistan and India" to "Between 1968 and 1975, wheat yields nearly tripled in Pakistan and India".
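To illustrate why the childish sort of vandalism is caught so fast while edits like those above are not, here is a minimal sketch of the kind of pattern matching RC tools rely on. The pattern and examples are my own illustration, not taken from any actual patrol tool:

```python
import re

# A crude pattern of the kind anti-vandal tools rely on: shouting,
# profanity, schoolyard phrases. (Illustrative only; not copied from
# any real RC patrol tool.)
CRUDE_VANDALISM = re.compile(r"\byour mom\b|\bpenis\b|[A-Z]{4,}!",
                             re.IGNORECASE)

print(bool(CRUDE_VANDALISM.search("PENIS!")))
# True: caught instantly by pattern matching
print(bool(CRUDE_VANDALISM.search(
    "Martin Sheen discovered hydrochloric acid.")))
# False: a plausible-sounding lie sails straight through
```

A filter like this explains the asymmetry in my results: the simpler the vandalism, the easier it is to match mechanically, while a false fact looks like any other sentence.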

Complete Nonsense
In the complete nonsense category, I modified a total of five articles. The average response time was 691.8 minutes, or 11.5 hours!

All in all, the response in this category was shockingly slow, especially for Island Fox. Only two of the articles, Technetium and Fin Whale, were reverted in what I would deem an acceptable time frame for vandalism this easy to spot. Of course, the average revert time above is skewed by the extreme values, so if we take the average of the middle three, we end up with 113 minutes, or nearly two hours, which is better, but still terrible.
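For readers who want to check the arithmetic, the trimmed average above is just the mean of the three middle values after the two extremes are dropped. A small sketch; the five sample times below are hypothetical values chosen only so that the two averages match the ones I report, not the actual per-article revert times:

```python
def trimmed_mean(times, trim=1):
    """Mean of the values left after dropping the `trim` smallest
    and `trim` largest entries."""
    core = sorted(times)[trim:len(times) - trim]
    return sum(core) / len(core)

# Hypothetical revert times in minutes, chosen so the plain mean is
# 691.8 and the trimmed mean is 113 (the real per-article times are
# not reproduced here).
sample = [5, 40, 110, 189, 3115]
print(sum(sample) / len(sample))  # 691.8, skewed by the extremes
print(trimmed_mean(sample))       # 113.0, middle three only
```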

Grave Factual Inaccuracy
I concentrated the most on this category, modifying a total of nine articles. The response time here was still quite bad: on average, it took 555.7 minutes, or 9.25 hours, to revert modifications to articles in this category. Unlike the previous category, no single value skewed the data; there were two values of nearly 14 hours and one of over 31 hours.

Once again, we are faced with an appalling response time. Every article in this category had obviously been vandalized, yet many went uncorrected for several hours. I shudder a little when I think about people reading these articles in the elapsed time. What kind of impression would a casual reader form of an encyclopedia that reported that Bill Cosby led the Second Crusade along with Harry Potter and Gregory Peck?

Factual Inaccuracy
Interestingly, this category provided the single glimmer of hope in my study, with an average revert time of 57.4 minutes for five articles. Of course, prior to this survey I would have considered a revert time of nearly an hour atrocious, but you take what you can get. Unfortunately, the quick reverts in this category weren't the work of a group of vigilant users. Instead, Morven, a user with checkuser permissions, connected the five articles because I had made all of my edits up to that point through the same proxy. I therefore consider this category essentially invalid. Nonetheless, here is the data. Even with Morven's influence, it is a glimmer of hope when it comes to a possible organized vandal attack.

Conclusions
For the categories of grave factual inaccuracy and complete nonsense, the average overall response time was about 10 hours. I would categorize the edits in these categories as serious vandalism. Personally, I think a revert time of 10 minutes would be more appropriate for featured articles than 10 hours. In other words, our anti-vandal measures have failed: while they are highly successful at catching childish vandalism, our vandal fighters don't even notice serious content vandalism that is equally, if not more, damaging.

Also, because this study involved changing facts, it reveals that the general community response to a bad fact in a featured article is far too slow.

What do I suggest?

 * Discussion on this topic - people need to talk about it
 * Put more articles on your watchlist! I've already watched another half-dozen featured articles.
 * Stable versions- so people don't have to deal with bad facts
 * Divert some of the extensive resources we devote to vandal-fighting to fact-checking. It may be less glamorous, but fact-checking is essential to catch bad facts inserted not just as vandalism but also in good faith.
 * Create some sort of bot to monitor edits by new editors. Ideally this would cover accounts with a low edit count, but recently created accounts could also be monitored. I think such a bot is technically feasible and a good use of resources; just look at the numerous variations on detecting "PENIS!" in RSS feeds or monitoring recently reverted users.
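The bot suggested in the last point could work by filtering a recent-changes feed on account age and edit count. A minimal sketch of that filtering step follows; the field names, thresholds, and sample records are all made up for illustration, and a real bot would build these records from the recent-changes feed for featured articles:

```python
from datetime import datetime, timedelta

def is_worth_review(change, now, min_edits=50, min_age_days=7):
    """Flag an edit for human review when the editing account is
    either very new or has made few edits. Thresholds are arbitrary
    illustrative defaults."""
    too_few_edits = change["editcount"] < min_edits
    too_new = now - change["registered"] < timedelta(days=min_age_days)
    return too_few_edits or too_new

# Made-up sample records standing in for a recent-changes feed.
now = datetime(2007, 5, 1)
changes = [
    {"user": "VeteranEditor", "editcount": 4200,
     "registered": datetime(2004, 3, 2)},
    {"user": "FreshAccount", "editcount": 3,
     "registered": datetime(2007, 4, 29)},
]
flagged = [c["user"] for c in changes if is_worth_review(c, now)]
print(flagged)  # ['FreshAccount']
```

Such a bot wouldn't revert anything itself; it would simply queue suspect edits on high-value articles for a human to check, which is exactly the step my results show is missing.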

Answers to Questions
I have answered some questions I anticipate receiving below.
 * 1. Isn't this a massive violation of WP:POINT?
 * In a word, yes. But someone needed to look at how long it actually takes to revert vandalism and this seemed like the best way.
 * 2. Who are you?
 * I am a frequent editor with over 2500 mainspace edits. I am not an administrator.  For obvious reasons, I have concealed my identity for this survey.
 * 3. Why did you use registered accounts?
 * For several reasons, primarily so that I could operate behind the same IP without having my edits linked by other editors (except one with checkuser). Additionally, people screen anonymous edits more thoroughly, so I felt being registered would give better data.
 * 4. What did you expect to be the results of your study?
 * I would've guessed about an hour to revert minor factual inaccuracies, 15 minutes for grave factual inaccuracies, and 5-10 minutes for nonsense. Yes, I was wrong.
 * 5. Why shouldn't we ban you?
 * I'm not a threat. Go ahead and block the accounts that I used to do my modifications, but I am not a vandal and I don't plan to use this account to do anything wrong.  This account is just for sparking discussion.
 * 6. Why did you choose Featured Articles?
 * Because they are supposed to be Wikipedia's best. I think we all know that vandalism to stubs can go unnoticed for a long time.  But Featured Articles, simply put, matter a lot.