Wikipedia talk:Counter-Vandalism Unit/Vandalism studies/Archive 2

Answers

 * How effective are bots in curtailing vandalism?
 * Much less effective than real users, but they can operate for longer periods of time.
 * Are editors any more likely to continue or desist vandalizing if warned by a bot instead of a person?
 * Slightly less likely, but the difference is not significant; for the most part, warnings achieve absolutely nothing.
 * How long does vandalism typically remain visible?
 * The typical time is, of course, inversely related to the obviousness of the vandalism and the visibility of the article. In most cases, no more than a few minutes.
 * Who is responsible for vandalism? What are the demographics of the vandal population?
 * Mostly anonymous users, mostly in the United States, probably mostly children. More cannot really be determined with any degree of accuracy.
 * What proportion of vandals are on dynamic IP addresses, and hence very hard to block?
 * The number is small, though not completely insignificant, and is certainly a lot less than it used to be. (Some dynamic ISPs now send X-Forwarded-For headers, allowing MediaWiki to record these users' real IP address rather than a dynamic one).
 * Who is responsible for reverting vandalism?
 * A reasonably-sized group of regular contributors (and a bot) are responsible for dealing with most vandalism. Other regular contributors who do not devote their time to fixing vandalism but deal with it as they come across it, and the occasional new or anonymous user make up the rest.
 * How much time do editors waste cleaning up vandalism?
 * Approximately 5% of all edits, though each of these edits takes a few seconds at most.
 * What effects does semi-protection have on the level of vandalism of protected articles?
 * It dramatically reduces, but does not completely eliminate, vandalism to the semi-protected article.
 * Do vandals just choose another article to edit instead? How can we test this?
 * Possibly. There is no way to reliably test this short of standing behind vandals and watching what they do.
 * What level of vandalism is considered acceptable before semi-protection or some other measure is needed? How should the 'level of vandalism' be measured? (See Wikipedia talk:Protection policy)
 * It will vary greatly depending on which administrator you ask. Generally, persistent vandalism to a page by multiple, unrelated anonymous and/or registered users at the rate of several incidents every few hours will merit temporary semi-protection of that page.
 * Are IP edits ever responsible for improving a featured article while on the Main Page?
 * Very rarely. It has happened, though.
 * What motivates people to vandalize articles (See The_Motivation_of_a_Vandal)? How can we minimize the satisfaction they get from doing it?
 * Boredom. Integrate an anti-vandalism bot into MediaWiki itself so that vandal edits do not even save.
 * Why do certain articles attract more vandalism than others?
 * Because the subject is more popular with Wikipedia's typical demographic. The level of vandalism can be roughly correlated with the popularity of the article, once the effect of semi-protection is ignored.
 * What types of vandalism are there? What message are they trying to get across? Why do vandals not fully realise that their actions are futile?
 * Unless a specific message is inserted into the edit, they are not trying to get a message across. They are simply bored. Most of the time, they do realize their actions are futile, but they are bored and can't think of anything better to do.
 * What strategies can we employ to catch vandalism quickly?
 * Connect to the recent changes IRC feed, apply an algorithm to pick out likely vandalism, monitor the output and revert where necessary. I'm not going to go into details on the algorithm because this method is far more effective if everyone uses their own. (Otherwise, everyone is going after the same edits).
 * How can we catch most of it at recent changes?
 * The page itself isn't very useful, but can probably be made to work in an approximation of the above way using JavaScript, for those who do not have access to IRC.
 * How can we establish a situation where almost every article has someone responsible for maintaining it? Is this even a good idea? (See WP:OWN)
 * It would be better to assign people to time periods, rather than articles, and have them watch recent changes in that period. However, this being a voluntary project, any attempt to assign people to things will always be less effective than simply allowing them to choose to do what they want to do. Being less harsh on people who spend most of their time dealing with vandalism would be a better strategy (right now, their contributions are frequently dismissed as "worthless", they are hassled over every mistake they make, and they are barred from seeking adminship).
 * What impact does vandalism have on the reputation of Wikipedia?
 * A moderate negative impact, but one not so strong as the perceived general unreliability and inaccuracy of the project.
 * What sort of financial gains can be made from using Wikipedia to advertise - are spammers just wasting their time, or can it actually be profitable? Are our anti-spam measures adequate?
 * Addition of external links to Wikipedia articles used to greatly increase search engine rankings, increasing site traffic. While it no longer does, it does lead to a smaller, more direct increase in site traffic; from there, financial gains arise, most obviously through advertising.
 * How good are editors at reverting vandalism? That is, is it reverted properly, or is it often dealt with poorly, e.g. removing a whole paragraph that the vandal has simply altered in meaning. Also, how often are vandals properly warned on their talk page after committing an offense?
 * The word "revert", when used on Wikipedia, actually means 'to restore a page to a previous version'. Provided the correct previous version is chosen, all reverts remove the vandalism correctly. Removing a whole paragraph that a vandal has altered in meaning would have to be done manually and not by reverting, unless a recent version existed without that paragraph. The most common error is to revert vandalism by one user, but neglect to revert further vandalism by another user immediately preceding it. This error is most often made by the anti-vandalism bot, reducing its usefulness as it is necessary to follow after it checking its edits. (Occasionally reverting a revert back to the vandalised version is another curious and very annoying 'feature' of the bot). "Properly warned" is rather an odd concept, especially since warnings are virtually useless. Contributors who do not warn vandals are being no less useful than those who are (indeed, you could argue they are saving server load and avoiding provocation) and do not deserve to be hassled for it.
 * What is the overall contribution from schools and universities like?
 * Anonymous contributions from high schools (or equivalent) are usually mostly vandalism. Long-term anonymous-only blocking of the relevant IP addresses effectively deals with this while allowing established users at those institutions who wish to contribute to continue doing so. While vandalism can also originate from universities, it is rare for a long-term block to be necessary.
 * What happens to vandalism levels when edits won't show up in the current version of the article - a trial of something like stable versions, where the vandal cannot vandalize the actual article people see, or something functionally similar, is needed. Perhaps a small section (e.g. all articles in a certain category) could be tested out.
 * They would be reduced, though only slightly. A more effective solution would be to not save the vandalism at all.
 * How does the rate of vandalism vary throughout the day?
 * It correlates almost precisely with the overall level of site traffic. The 'vandalism information' template is imprecise, irregularly updated and on the whole useless, and I'm not entirely sure why people even use it, when they would get a far better indication of the level of vandalism by looking at the traffic graphs.
 * Angela suggests there would still be problems with vandalism if anonymous editing was blocked. How can we test this hypothesis? Certain categories could be experimentally altered to block anonymous editors, but then vandals could just choose an article that wasn't protected. We would have to block all IP editing, which would certainly be controversial, even just to gather a small sample of data. The blocks would also have to allow newly registered users to edit, otherwise there wouldn't be time to create an account and then wait 4 days. Perhaps we could use a comparative method by doing the experiments on another wiki instead?
 * Of course there would still be problems; we get vandalism from registered users now, and the level could only increase if anonymous editing was prevented. Semi-protection is essentially a small-scale test of this, so it has already been tested. Why certain categories? Why not just continue what we are currently doing, and selectively disable anonymous editing to problematic articles by semi-protecting them?
 * Quantitatively, how are levels of vandalism affected (both in terms of percentage of edits and number of edits) when external attention is drawn to an article (e.g. Slashdot or The Colbert Report)? Do levels of vandalism return to normal (e.g. in elephant) in all cases? How quickly?
 * The level of vandalism to elephant is still far higher than it was before attention was drawn to it. However, such incidents affect only individual articles, or a small handful of them, and can usually be dealt with simply by semi-protecting for a few days; occasionally more long-term protection is needed.
 * How well does Flagged revisions work in practice?
 * Reasonably well on very small wikis, but would be absolutely and totally useless here. You asked above "how much time is wasted fixing vandalism"; multiply that by a factor of about 15 to get the amount of time that would be wasted on the even more useless task of flagging revisions.
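The "algorithm to pick out likely vandalism" mentioned above is deliberately left undescribed by the author, but the general approach can be sketched. The following is a hypothetical illustration only, assuming an edit arrives as a dict of fields (the field names, heuristics, weights, and threshold are all invented for this example, not any actual tool's algorithm):

```python
# Hypothetical sketch of scoring recent-changes edits for likely vandalism.
# All heuristics and weights here are illustrative assumptions.
import re

# Toy word list; a real tool would use a much larger, tuned list.
BAD_WORDS = re.compile(r"\b(poop|lol+)\b", re.IGNORECASE)

def vandalism_score(edit):
    """Return a rough non-negative score; higher means more likely vandalism.

    `edit` is assumed to have keys: 'anonymous' (bool), 'comment' (str,
    the edit summary), 'size_change' (int, bytes added or removed), and
    'added_text' (str, text inserted by the edit).
    """
    score = 0
    if edit["anonymous"]:
        score += 1                      # anonymous edits vandalize more often
    if not edit["comment"]:
        score += 1                      # no edit summary given
    if edit["size_change"] < -500:
        score += 2                      # large removal: possible page blanking
    if BAD_WORDS.search(edit["added_text"]):
        score += 3                      # obvious junk text inserted
    if edit["added_text"] and edit["added_text"].isupper():
        score += 2                      # ALL-CAPS insertion
    return score

def likely_vandalism(edit, threshold=3):
    """Flag an edit for human review when its score meets the threshold."""
    return vandalism_score(edit) >= threshold
```

For example, an anonymous edit with no summary that removes 4,000 bytes would score 4 and be flagged, while a registered user's small edit with a summary would score 0. In practice each patroller tuning their own heuristics, as the comment suggests, avoids everyone racing to revert the same edits.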

All your questions have been answered. You may now go and do something useful. — Preceding unsigned comment added by Gurch (talk • contribs) fake timestamp for MiszaBot to archive : 21:11, 3 August 2012 (UTC)

List all vandals
Why don't you just give us a big list of every single vandal ever and let us add the details. It will give bored wikipedians something to do.--Deathlaser (talk) 17:16, 16 April 2012 (UTC)
 * That sounds like a poor idea, per WP:DENY. Achowat (talk) 17:35, 16 April 2012 (UTC)
 * Strong oppose - in AGF. For example, would you want your name on a list if you got off on the wrong foot? -- Cheers, Riley Huntley  talk  No talkback needed; I'll temporarily watch here.  19:31, 3 August 2012 (UTC)
 * Strong Oppose There is too much backlog on Wikipedia already. If you're bored, start going through it. Dan653 (talk) 21:11, 3 August 2012 (UTC)

Inactivity
Hello, this sub-section of the CVU seems to be dead. All that happens is that people add their usernames to the list, and it's just making people waste time adding their names to an unhelpful list. One vandalism study happened years ago, two were planned years ago but were never concluded, and one has been inactive forever even though it's never been formally closed. If I get no objections for a month, we can begin procedures to declare inactivity. UnnamedUser (talk) 03:16, 16 December 2019 (UTC)

Agree
Although this is an important topic for Wikipedia this structure clearly has not been working and we should retire it. Alex Jackl (talk) 14:30, 16 December 2019 (UTC)

Closure
I have added a status tag because a month has passed. – UnnamedUser (talk; contribs) 19:44, 17 January 2020 (UTC)

Datasets
I added some links to three relevant datasets because I think they are just as useful as studies. Does anyone know of any others? Sam at Megaputer (talk) 15:13, 4 October 2020 (UTC)