Wikipedia talk:Counter-Vandalism Unit/Vandalism studies/Study 3

Consider this a starting point for Study 3. What do we want to accomplish with this study?

Study ideas
I think the easiest thing to do (so that we can all get back into the swing of Studying Vandalism) would be a recreation of the Obama study, using Mitt Romney. It will allow us an apples-to-apples comparison of what has changed with regard to vandalism (as opposed to, say, edit warring) over the last 4 years. I also suggest we re-look at the numbers from the Obama Study. The Obama Study asked the question "Is semi-protection of hot-button articles a good thing?" or "Are IP editors bad?". I think a more important question to ask is "Is Vandalism (particularly IP Vandalism) as serious a threat to the Encyclopedia as it had been?". I would hypothesize "No, it is not; in fact disruptive editing is a bigger problem". So let's use our Scientific method and develop an experiment to gain data that we can compare to those produced by the Obama Study. Unless anyone has any better ideas. Achowat (talk) 16:03, 3 August 2012 (UTC)
 * Per Meph below I think we need to finish the Obama study before we do the Romney study, but I think that's what study 3 will be. Dan653 (talk) 02:35, 7 August 2012 (UTC)
 * Ok, so this is going to be hard to read, but per below, I agree that we just look at the shortcomings of the Obama study and apply it to a study of the Romney page. How can information (such as active page watchers, read rate, etc) be gleaned? Achowat (talk) 12:40, 7 August 2012 (UTC)
 * Achowat, some of the tools for monitoring these already exist: Wikipedia article traffic statistics, Page view statistics for Wikimedia projects, Equazcion's Active Watchers tool, and MZMcBride's earlier Watcher. Mephistophelian (talk) 20:19, 7 August 2012 (UTC).
 * Acho, what are you referring to by "read rate"?  Theopolisme TALK 20:25, 7 August 2012 (UTC)
 * How often a page is read. For instance, it's possible that some obscure Azerbaijani Olympic Water-polo player's page is in a constant state of Vandalism, but if no one ever reads it, it's doing little damage to the encyclopedia. Achowat (talk) 20:43, 7 August 2012 (UTC)

Incorporate feedback
Small-scale, comparative studies are certainly practical in terms of resources, and this is arguably the principal advantage. Perhaps the immediate disadvantage to this comparative study between the Obama and Romney articles is that the research surrounding the former isn't complete, specifically in lacking any conclusions whatsoever. There are several interesting reflections on the scope of this earlier research:

"Perhaps it is a little late to bring this up, but the scope of the study seems to be too narrow. Rather than just analysing if ANON edits helped or hurt the article, the study should have included the damage that IP vandalism caused Wikipedia in the form of editor and admin wasted time and frustration. Otherwise, a true cost and benefit analysis of ANON edits cannot be done. If a quality Wikipedian's time is wasted it means Wikipedia will have missed out on some of his beneficial edits."

"To calculate the impact of the vandalism on editors and readers the statistics 'number of editors with the Obama article on their watch list' and the 'read rate of the Obama article' are needed. Are these statistics available?"

If the tentative hypothesis for the Romney study is: 'Is vandalism, particularly IP vandalism, as serious a threat to the encyclopedia as it had been?', the risk in changing the premise is that this proposal misses the opportunity to incorporate the earlier feedback and address shortcomings. Meph talk 19:03, 4 August 2012 (UTC).


 * Why not pick a topic related to Science? HARSH (talk)  14:42, 26 August 2012 (UTC)


 * Actually, why not pick a topic related to Science? Anti-Quasar (talk)  16:21, 26 August 2012 (UTC)
 * I think science could be interesting...  Theo polisme   :) 16:33, 26 August 2012 (UTC)
 * A topic related to a political person requires extensive knowledge (especially for me since I am from India and not much bothered about what's going on in American politics) and there always are some conflicts in opinions whereas science is factual. HARSH (talk)  18:24, 26 August 2012 (UTC)
 * Good point. Dan653 (talk) 02:12, 27 August 2012 (UTC)

Let's start this conversation back up: I think that a science-related topic would make a lot of sense. I suppose a first question is, what should that topic be? Something like this, or more in this direction (obviously not that article, since it's semi'd -- but... the idea.)? In order for us to gain data, we need to find something that gets a fair enough bit of attention -- ideas?  Theo polisme  21:31, 29 September 2012 (UTC)

The front page is always getting a lot of attention and vandalism but I suspect Admins are all over it. Perhaps we could review histories of several school IPs and make some assumptions on the feedback we collect? Meva - CHCSPrefect - GIMME A POTATO CHIP! C: (talk) 13:38, 22 October 2012 (UTC)

Resurrection of the idea
It's been 5 months since the last post here. Study ideas aren't getting anywhere. I figure I should try to resurrect this topic, since research on vandalism is vitally needed. I would, however, like to state a few fundamentals about what this research should entail.

1) It must be of practical use to the encyclopedia. This is goal-directed research, not a project to learn about vandals for learning's sake.

2) It must offer new insights. Some of the study ideas raised here and elsewhere, essentially, propose a study to see if vandalism adversely affects the encyclopedia.  Of course it does; nothing useful could be gained from such a study.

3) It must be feasible to study. If we want to do an experiment, we would likely need approval from a number of parties.  If we wish to simply review data that's already out there, it must be an amount that a small team of part-time researchers could handle.

With that in mind, let's start looking for new ideas. To avoid having another round of thinking that doesn't lead anywhere, people should probably provide an explanation of methods (how the study could be done) and what its impact could be. Here are a few of my ideas.

1) Graphs of trends in the amount of vandalism. Wikipedia offers a lot of easily accessible data, making this fairly easy.  The amount of vandalism could be modeled by the number of ClueBot NG edits over a period of time or the number of edits made through Igloo, Huggle, and Stiki (not sure if that info is available).  I would prefer this to a look at a single article, since a much larger sample size maximizes accuracy.  We also don't have to go through each individual edit and check for vandalism; we could simply count the number of ClueBot NG edits (its false positive rate is very small, and although that rate has decreased over time, the change will probably not have a significant impact on the data). This info would be a critically important base for future studies to look at how changes in the past affected the amount of vandalism.

2) An analysis to see if watchlisting causes a significant drop in vandalism display times (the amount of time a vandal edit is up). This would require an administrator's aid and a way for the researchers to securely access the administrator-only list of unwatched articles (kept hidden for understandable reasons).  The data, when publicly presented, would omit the names of all articles involved.  We could immediately discount all edits made through anti-vandalism client software (Igloo, Huggle, Stiki), as those using them do not only see edits made to articles they are watching.  Of those remaining, I fear we would have to go through and count them meticulously.  This could, however, offer great insights.  If a significant effect is revealed, it would justify a vocal campaign to get people to watchlist articles they have expertise on; if more articles are watchlisted, vandalism would be stamped out more quickly. However, because it needs active administrator support and a review of countless individual edits, this option is probably unfeasible unless someone can amend it.

3) An examination of how media attention to Wikipedia itself or certain topics affects vandalism. This may require research option #1 first so that we have a large set of data to look through.  We could then look at prominent mentions of Wikipedia (especially Wikipedia articles) in the media to see just how much additional vandalism they cause.  For example, XKCD's coverage of the "Star Trek Into Darkness" edit war on January 29th caused a huge spike in vandalism in the days following.  If we examine media reports on Wikipedia as a whole, we should probably use the #1 data set.  If we want to look at individual articles, it would pay to examine the histories of individual edits to those articles.  Tedious, but potentially fruitful.  If we find, for instance, that media mentions of Wikipedia articles cause a massive jump in vandalism during the few days immediately after, we could send the data to "Requests for Article Protection" and ask for a new rationale for semi-protection: Anticipated vandalism. This would prevent some vandalism from happening at all, although media discussions of Wikipedia articles are rare.

Of these three, I'd suggest #1, since we can do the most with it in the future. Hopefully others will present their own ideas/advice. Hopefully we can get a study done in the next 2.5 months. Marechal Ney (talk) 03:49, 20 March 2013 (UTC)
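
For idea #1, the counting step could look something like the sketch below. It is only an illustration under stated assumptions: it takes a list of revert timestamps (for example, copied from ClueBot NG's contribution history) and tallies them per calendar month, which is the series one would then graph. The function name `edits_per_month`, the sample timestamps, and the assumed timestamp format (`YYYY-MM-DDTHH:MM:SSZ`) are all hypothetical and not part of any existing tool.

```python
from collections import Counter
from datetime import datetime


def edits_per_month(timestamps):
    """Count edits per calendar month.

    Assumes each timestamp is a UTC string like "2013-01-29T14:05:00Z".
    Returns a dict mapping "YYYY-MM" to the number of edits in that month,
    ordered chronologically.
    """
    counts = Counter()
    for ts in timestamps:
        moment = datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ")
        counts[moment.strftime("%Y-%m")] += 1
    return dict(sorted(counts.items()))


# Hypothetical sample data, invented purely to show the shape of the input.
sample_timestamps = [
    "2013-01-29T14:05:00Z",
    "2013-01-30T09:12:44Z",
    "2013-01-30T21:40:03Z",
    "2013-02-02T03:15:10Z",
]

monthly_counts = edits_per_month(sample_timestamps)
print(monthly_counts)
```

Counting by month rather than by day keeps the series short enough for a small team to inspect by hand; a finer bucket would only need a different `strftime` pattern.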


 * I honestly think that the ideas of studies should be dropped; they are hardly ever successful. Thєíríshwαrdєn  - írísh αnd prσud  18:37, 31 March 2013 (UTC)

Shared IP template effective in slowing down IP vandals.
From my experience of fighting vandalism, I have found that if the shared IP template appears on an IP user's talk page, the user slows down and realises that he is not so anonymous. I believe that we need to conduct research to test the effectiveness of this approach. Friy Man talk 08:23, 26 January 2017 (UTC)

Discussion closed
Can we say that the "vandalism studies" area is dead? From AnUnnamedUser (open talk page) 02:29, 28 October 2019 (UTC)
 * Since literally nobody has been active here for the past 4 years, I think we should call this a dead area. --Pink Saffron (talk) 21:01, 1 October 2021 (UTC)