Wikipedia:Checked edits brainstorming

A solution to this problem has been implemented. See the talk page for more info.

Motivation
Wikipedia receives about 8 edits a minute - this figure has increased rapidly and continues to increase. A key way of checking all these edits for vandalism or other unwanted changes is "Recent Changes watching". More edits also mean more contributors, so in theory vandalism should still be kept in check. However, the "Recent Changes watching" method does not scale. Contributors do not know which edits have already been checked by other editors, and so have to check them themselves. This duplication means that the available manpower is not as productive in improving overall WP quality as it could be.

The idea of a "checked edit" attempts to overcome this.

Implementation
(I am guessing a bit here)

Currently for each revision of an article we store
 * The article text as of that revision (possibly in diff form from the last edit, probably compressed)
 * The editor
 * The time of edit

To this we add a field


 * List of checking editors

which is used to store a list of editors who've looked at [ a diff between this edit and the last OR the article since this edit ] and are happy with it.
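As a rough sketch of the record described above (the class and field names are hypothetical, and this is not how the software actually stores revisions), the extra field could look like this in Python:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass
class Revision:
    """One stored revision of an article (all names here are hypothetical)."""
    text: str
    editor: str
    timestamp: datetime
    # The proposed new field: editors who explicitly marked this revision
    # as checked. Merely viewing a diff adds nothing to this list.
    checked_by: List[str] = field(default_factory=list)

    def mark_checked(self, checker: str) -> None:
        """Record an explicit "I checked this" action, at most once per editor."""
        if checker not in self.checked_by:
            self.checked_by.append(checker)

    @property
    def is_checked(self) -> bool:
        """True once at least one editor has vouched for the revision."""
        return bool(self.checked_by)


rev = Revision("Some article text", "Alice", datetime(2004, 7, 5, tzinfo=timezone.utc))
rev.mark_checked("Bob")
rev.mark_checked("Bob")  # a repeated click is ignored
```

Keeping the names (rather than only a count) is what lets a reader ask a checker why they approved an edit.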

Indicating that you've checked an edit:


 * Add a button to the diff screen, history screen AND/OR article page.


 * A button and individual notes of which editors have looked seems unnecessary. If I've looked, it's not obvious vandalism or I'll have reverted it. On the list of editors, I object on privacy grounds. With respect, knowing what I've looked at and left unchanged is not your business. Just count the "long term editors" (three months? six?) and perhaps the "administrators" who've looked at the edit and that'll be sufficient and entirely automatic to indicate vandalism review. If you need a note, make a minor edit to the article and put the note in the edit comment. Jamesday 18:56, 6 Jul 2004 (UTC)


 * I agree that making the list of editors automatically would be a privacy violation. However, as I understand the proposal, the list would not be a list of everyone who has viewed the diff, but rather a list of everyone who has viewed the article and, say, pressed a button saying "good edit". The problem with not showing the name of the person who approved the edit is that it is too easy to abuse with sock puppets. Also, it might take more than one view to determine whether an edit is good (for example, this might be vandalism for all I know), so some kind of action besides viewing should be required. Showing the name allows you to go to the person and ask why they approved the edit if there is any question. Jrincayc 00:52, 7 Jul 2004 (UTC)


 * You've just described why it's problematic: you have to record having been interested in order to say that it wasn't vandalism, so the process as proposed is abusive of privacy. The proposal to use some threshold for those who would count (I assume we trust that administrators, say, aren't sock puppets) seems sufficient and less problematic. Jamesday 12:38, 12 Jul 2004 (UTC)


 * Recording that the edit is checked is voluntary, so if you care about your privacy, then you shouldn't check the edit. Since presumably the point of checking the edit is to revert or copy edit it if it needs it, both of which will record who did that, I don't think preserving privacy is particularly important for a checked edits system.
 * Secondly, administrators are not necessarily the most qualified. I have quite a few economics articles on my watchlist. One of the longest-standing mistakes in the Supply and demand article was made by a longstanding Wikipedian who probably is or could have become an administrator. On the other hand, there is someone who was denied administratorship whom I trust more to review economics articles because of her greater knowledge of economics. In other words, I am far more qualified to judge if a user's checkmark is meaningful than some computer algorithm. So are most humans. If I don't have the name, then I can't judge the mark's validity. Jrincayc 13:36, 12 Jul 2004 (UTC)


 * I'm thinking, many times anonymous contributors (and even logged in ones) forget to give edit summaries. Perhaps you could, in addition to marking pages checked, also supplement edit summaries. Ambush Commander 17:42, Jan 2, 2005 (UTC)

Content hash?
Perhaps we could add a content hash field to the old and cur tables, and index by that. Comments would be stored in a separate table, and linked to cur and old by the content hash, so whenever an article is reverted, old comments about a revision would be immediately accessible. However this makes it difficult to check or comment on a revert or page blanking. Instead, reverting and page blanking could be marked as such on RC -- if a revision has the same content hash as a previous revision, it is highlighted. The kind of highlighting to use would have to be cached in RC, rather than searching the index for every row. It would also have to be cached in the old and cur tables, see below.

The UI for checking, adding comments and viewing comments could be at the bottom of the diff page. A short summary could be displayed at the top of the diff, say a list of names.

The comments are internally associated with a revision rather than a diff. The easiest and most efficient way to handle this would be to display the comments associated with the more recent of the two revisions in the diff.

For a revert, or replacement of a page by a common text, for example a blank page, the diff page would make it clear that the comments do not refer to the diff. For example, the comments could be greyed out, and the summary at the top could be replaced by some text such as "revert" or "page blanking".

When I say a "comment", that's shorthand for a DB row which could contain all sorts of information, not just the comment itself. It could contain the username, timestamp, comment, a cached user trust metric, and some sort of standard numeric good/bad/ugly type rating, as discussed below. -- Tim Starling 06:19, Jul 6, 2004 (UTC)
 * I think associating the comment with the hash isn't so great of an idea. However, prepending a tag to the comment (like, [REVERT-3]  ) would be very cool. --ssd 05:05, 7 Jul 2004 (UTC)


 * I don't think he meant the hash was for the comment. It is for the actual page's content. Then, if the hash matched any comments associated with that hash before, that would mean that this particular revision of the page had come up again, and any comments previously associated with the revision would also be displayed. Sounds good to me. And your idea sounds cool too. That would help us pick off reverts. Ambush Commander 17:44, Jan 2, 2005 (UTC)
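To make the content-hash idea concrete, here is a minimal Python sketch. It assumes one store per page, SHA-1 as the hash, and an in-memory dictionary standing in for the separate comments table; all of these are illustrative choices, not anything decided in the discussion above.

```python
import hashlib
from collections import defaultdict


def content_hash(text: str) -> str:
    """Hash of the full page text (SHA-1 is an arbitrary choice here)."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


class PageReviewStore:
    """Hypothetical per-page store linking review comments to content hashes."""

    def __init__(self):
        self._seen = set()                   # hashes of earlier revisions
        self._comments = defaultdict(list)   # hash -> list of comment rows

    def record_revision(self, text: str) -> bool:
        """Register a new revision; return True if it restores earlier content."""
        h = content_hash(text)
        is_revert = h in self._seen
        self._seen.add(h)
        return is_revert

    def add_comment(self, text: str, user: str, comment: str, rating: int) -> None:
        """Attach a comment row to whatever revisions carry this exact text."""
        row = {"user": user, "comment": comment, "rating": rating}
        self._comments[content_hash(text)].append(row)

    def comments_for(self, text: str) -> list:
        """All comments ever attached to revisions with this exact text."""
        return list(self._comments.get(content_hash(text), []))


store = PageReviewStore()
store.record_revision("good text")
store.add_comment("good text", "Alice", "fact-checked", 1)
store.record_revision("vandalised text")
reverted = store.record_revision("good text")   # same content comes back
old_comments = store.comments_for("good text")  # Alice's comment reappears
```

Because comments are keyed by content rather than by revision number, a revert automatically inherits the earlier reviews, and the returned flag is the piece that would be cached for highlighting on RC.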

Near-term

 * Add extra toggles to Recent Changes. Some possible RC styles:
 * Current style
 * List of recent changes with a list of checkers. (effectively vandalproof)
 * List of unchecked changes. Vandals could reduce the usefulness of this by marking their edits as checked using a sockpuppet, but it would still be useful for those looking for something easy to work on.

Longer-term

 * Used for implementation of "safe versions" of contentious articles. Only checked revisions are shown to visitors who are not logged in.
 * Integrate with a user trust metric
 * Show only changes not checked by someone trusted by me
 * ooh I like the sound of that one! Erich 04:58, 7 Jul 2004 (UTC)


 * Automated scoring of user trustworthiness. For each user a running tally of the proportion of their changes that are changed rather than ticked as "checked" could be maintained. This score could have various uses: highlighting "dangerous users" in RC, as evidence to the Med/Arb committees, awarding prizes as "most uncontroversial user"


 * I tend to distrust automatic methods of trustworthiness since they may encourage vandals to game the system. I would trust the self-chosen list of checkers much more. I suppose if the only thing the automated list is used for is just a list that users can see, then it might be okay. Jrincayc 16:40, 5 Jul 2004 (UTC)


 * Safe versions requires review of anything before it's accepted for display, a very unwiki concept. In effect, it defaults to rejecting changes. Not showing new articles to those not logged in seems counterproductive - they aren't going to be linked externally so they aren't going to be found by a significant audience and it prevents an anonymous contributor from expanding or correcting their own edit. Defaulting to hiding new things makes a failure to keep up with RC a far more serious problem because new things just vanish instead of being improved by subsequent viewers. Better a list of edits which haven't been viewed by "many" "old hands" yet. That would scale a lot more readily. A brief delay before showing things to anons might be of use - say 10 minutes to allow RC watchers to check for obvious vandalism. Jamesday 18:56, 6 Jul 2004 (UTC)


 * I'm remembering when I first started. A delay would be confusing at best. I agree that waiting for approval is not wikilike and could be quite discouraging. We want to *encourage* new people to make more edits, not make them more cautious or disgruntled. Elf | Talk 20:57, 6 Jul 2004 (UTC)

Here's a silly way to do it...
 * By default, pages are not delayed at all. (status quo)
 * If a "trusted" wikipedian marks an article as controversial, it becomes delayed.
 * If an anonymous user edits a controversial article, they edit the most current version, not the one they saw. After they save, if they edited more recently than the controversial mark, they will see the current version even though they are anonymous. (But other anonymous IPs would not see it.)
 * Eventually, after checking, a trusted wikipedian would unmark it.


 * Does that sound slightly useful? --ssd 05:14, 7 Jul 2004 (UTC)
 * Interesting. I'm not keen on requiring more mandatory work to approve or disapprove but I suppose that trusting in the case of most articles and requiring a tag to say delay does have merit, since it is the current situation and seems to work pretty well. I'm trying to think of ways which only require action if there's really a problem. Jamesday 12:54, 12 Jul 2004 (UTC)

Server strain?
Everything that is currently cached could continue to be cached, which is important. This would add write load on the db, because we would be moving closer to recording page views rather than just edits. The added read load would probably be smaller, unless RC is already a bottleneck.


 * Ceasing to record views after a threshold value for number or time since edit might help to reduce server load. There seems little point in updating counters once 10 old hands have reviewed an edit - it's either not vandalism, being taken care of or isn't obvious enough for the RC patrol to sort out by that point. Jamesday 18:56, 6 Jul 2004 (UTC)


 * What kind of threshold then would we propose? I have no idea how the editscape works, so I'd like to recommend 2 hours, but it's a shot in the dark. Ambush Commander 18:04, Jan 2, 2005 (UTC)


 * There's been some chat in IRC about having a set of values for various things, like how many admins have looked, how many users have looked, is it a new account (in the last n days), is it using a known proxy server, is the IP address on a spam blacklist, does it contain or add external links, does it contain swear words. Then having user controls for a personal risk score, a bit like the way Slashdot lets you assign different personal weights to things. Implemented in JavaScript so everyone can get the same base page and still see their personal scores, without upsetting caching. Jamesday 10:12, 6 Jan 2005 (UTC)
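The personal risk score mentioned above is essentially a weighted sum of per-edit signals. The sketch below shows only the arithmetic, in Python rather than the client-side JavaScript suggested; the signal names and default weights are made up for illustration.

```python
# Hypothetical default weights; a negative weight means the signal lowers risk.
DEFAULT_WEIGHTS = {
    "new_account": 2.0,
    "known_proxy": 3.0,
    "blacklisted_ip": 5.0,
    "adds_external_links": 1.0,
    "swear_words": 4.0,
    "admin_views": -1.0,  # applied per admin who has looked at the edit
    "user_views": -0.5,   # applied per ordinary user who has looked
}


def risk_score(signals, personal_weights=None):
    """Weighted sum of an edit's signals.

    `signals` maps a signal name to a number (booleans count as 1 or 0).
    `personal_weights` overrides the defaults for one reader, so every
    reader can get the same base page but a different score.
    """
    weights = {**DEFAULT_WEIGHTS, **(personal_weights or {})}
    return sum(weights.get(name, 0.0) * float(value)
               for name, value in signals.items())


edit = {"new_account": True, "adds_external_links": True, "admin_views": 2}
default_score = risk_score(edit)                                 # 2.0 + 1.0 - 2.0
relaxed_score = risk_score(edit, {"adds_external_links": 0.0})   # 2.0 + 0.0 - 2.0
```

Since the per-edit signals are the same for everyone and only the weights differ, the signals could be served in the cached page and the sum computed in the reader's browser.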

More Choices to check
Besides just checking if the edit is acceptable, it might be better to allow a short list of possibilities:


 * Accept - The edit is a useful contribution and needs no further work.
 * Neutral - The edit is not obviously vandalism, but the editor can't check its correctness at the present time. For example, in the minimum wage article, I recently saw a change to the minimum wage amount for Canada. I have no idea if it is right or not, so I would not mark it accept, but I might want to add a text comment even if I cannot accept it at the present time.
 * Needs work - The edit is an improvement, but needs spelling fixes, grammar fixes, etc.
 * Reject - The edit should be reverted. If this flag was there, it would help with problems with not knowing if there is a general consensus to revert an edit.


 * [Pcb21 replying to Jrincayc] I agree that it would be useful to separate out the first two cases - the "known good" and "probably good" cases. I was envisaging however not having the last two choices. In the latter two cases it would be better for the person on RC patrol to correct the errors, or revert if necessary... If they don't have the time then the edit remains unchecked and someone else comes along.


 * The Reject could be used as a recommendation by the editor in a revert war. This also could be used to show that there is a general consensus to revert a controversial edit. I like having all the possibilities since then I can mark an edit that needs work as such, and remember to fix it when I have more time. Jrincayc 17:19, 5 Jul 2004 (UTC)


 * If we use the content hash system I described above, negative ratings could be useful. Blank pages or the non-consensus side of an edit war could be given a negative score, and whenever someone reverts to them, the revert could be highlighted on RC. -- Tim Starling 03:53, Jul 7, 2004 (UTC)


 * It would be useful to have this list include user-defined codes. This would allow individuals and projects to build up their own classification. If this were a 16 bit field, then the first 256 could be shared across wikipedia, but users could define their own meanings for the others. For a user defined code you would need to join (userid, flag_code) to your look up. In the project I'm interested in, it would be useful to have specific flags for quality of writing, content and detail (inadequate, basic encyclopaedia entry, medical student level, postgraduate level, detailed review of current status etc etc) Erich 06:52, 6 Jul 2004 (UTC)
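One way to picture the 16 bit code field is sketched below in Python. The numbering of the shared codes and the `FlagRegistry` class are assumptions for illustration; only the split at 256 and the (userid, flag_code) lookup key come from the comment above.

```python
SHARED_LIMIT = 256   # codes 0..255 mean the same thing for everyone
MAX_CODE = 0xFFFF    # largest value that fits a 16 bit field

# Hypothetical numbering for the shared ratings discussed in this section.
SHARED_CODES = {0: "accept", 1: "neutral", 2: "needs work", 3: "reject"}


class FlagRegistry:
    """Resolves a flag code to its meaning for a given user."""

    def __init__(self):
        self._user_codes = {}  # (userid, flag_code) -> meaning

    def define(self, userid: int, flag_code: int, meaning: str) -> None:
        """Let a user (or project account) give a private code a meaning."""
        if not SHARED_LIMIT <= flag_code <= MAX_CODE:
            raise ValueError("user-defined codes must be in 256..65535")
        self._user_codes[(userid, flag_code)] = meaning

    def lookup(self, userid: int, flag_code: int):
        """Shared codes ignore the user; other codes need the (userid, code) pair."""
        if not 0 <= flag_code <= MAX_CODE:
            raise ValueError("flag code does not fit in 16 bits")
        if flag_code < SHARED_LIMIT:
            return SHARED_CODES.get(flag_code)
        return self._user_codes.get((userid, flag_code))


registry = FlagRegistry()
registry.define(42, 300, "medical student level")
meaning_for_owner = registry.lookup(42, 300)   # the owner's own meaning
meaning_for_other = registry.lookup(7, 300)    # None: user 7 never defined 300
```

The same code number can therefore mean different things to different users, which is why the lookup has to join on the user as well as the code.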


 * Also, it's just occurred to me that part of the QA is to review the existing pages. We therefore need to be able to comment on (a) edits (b) pages. For example, you may wish to say (not necessarily at the same time):
 * "This was a great edit but this page needs a lot more work" or
 * "That edit was OK, but this page is a fantastic review of current scientific literature on this topic."
 * These are different dimensions and the logical rules and implications for interpretation would be different. Erich 06:52, 6 Jul 2004 (UTC)


 * Perhaps what we need is a method for polling readers on the quality of an article. A multiple-choice box asking them to rate the article. -- Tim Starling 07:04, Jul 6, 2004 (UTC)


 * If you want to say that a page needs a lot more work, why not make a small edit and say that in your edit comment? That way you make it better and tell others what you think. And we get the sort of incremental improvement the wiki way is supposed to encourage. It's pretty routine for me to do this already and it works. Jamesday 18:56, 6 Jul 2004 (UTC)

Sounds like two separate issues here: 1) quality of the article 2) quality of the edit. A bad edit could be fixed or reverted. A good edit should not reflect negatively on the user just because the article remained bad afterwards. Score users for trustedness; score articles to find both the cream of the crop (for skimming into wikireaders) and the dregs that need work. Currently, we already do this by marking as Vfd, stub, cleanup, which is not very fine-grained or easy to search. Having an article rank might really help that and be a more permanent solution. --ssd 05:24, 7 Jul 2004 (UTC)

Summary String
A summary string would be helpful. This could specify the way that the edit was checked and otherwise give comments (for example: fact-checked, useful new section ...). If there are more possibilities than just accept (such as reject), then this becomes a near necessity, since the reason for rejection may not be obvious (and will seem nasty to a newcomer if they see their edit is rejected with no reason given).


 * The only problem I see with this is where we are going to find room to display it... I think we have to have the facility for multiple people to check the same edit, else it can be gamed. Pcb21| Pete 16:50, 5 Jul 2004 (UTC)


 * The diff page would have enough room to display this. I think that the history page should just have a summary something like: (3+L,2?S), which means 3+ -> 3 accept, 2? -> 2 neutral, +L -> at least one of the accepts was on your list of approved checkers, and ?S -> you (self) chose neutral. Clicking on the summary would bring up the edit and the complete list of who has checked it and their summary reason. I think that even a list of who has checked it would be too long in the history page or the Recent Changes page. Jrincayc 17:27, 5 Jul 2004 (UTC)
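The compact history-page summary described above, such as (3+L,2?S), could be generated along these lines. This is only a sketch in Python: the function name and input shape are hypothetical, and it covers just the accept and neutral ratings because no symbols were proposed for the others.

```python
def history_summary(checks, viewer, approved_checkers):
    """Build a summary such as "(3+L,2?S)".

    checks            -- list of (username, rating) pairs, rating being
                         "accept" or "neutral"
    viewer            -- the user looking at the history page
    approved_checkers -- the viewer's own list of trusted checkers
    """
    parts = []
    for rating, symbol in (("accept", "+"), ("neutral", "?")):
        users = [user for user, r in checks if r == rating]
        if not users:
            continue
        part = f"{len(users)}{symbol}"
        if any(user in approved_checkers for user in users):
            part += "L"   # at least one of these is on the viewer's list
        if viewer in users:
            part += "S"   # the viewer (self) gave this rating
        parts.append(part)
    return "(" + ",".join(parts) + ")"


checks = [
    ("Ann", "accept"), ("Bob", "accept"), ("Cat", "accept"),
    ("Me", "neutral"), ("Dan", "neutral"),
]
summary = history_summary(checks, viewer="Me", approved_checkers={"Bob"})
```

With the sample data the result is the string from the example, "(3+L,2?S)". Note that the string depends on who is viewing, so it would have to be built per reader rather than cached once for the page.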


 * The summary string would be very useful. If your edit gets slammed in peer review you'd like to read all the comments. You may potentially wish to view all comments on a user's edits, or all comments that a user has made. But the comments needn't be visible in a list of recent changes, you just want the mean (median?) rating Erich 06:45, 6 Jul 2004 (UTC)


 * Why do you believe that a comment in the talk page with associated edit comment is insufficient or unsuitable? Isn't that where discussion about the merits is supposed to be conducted? Adding another talk venue seems redundant. If an edit is rejected, the edit comment undoing it seems pretty obvious and must, per our conventions, give the reason for the change and/or refer to the talk page for more discussion. There's no need to duplicate that existing practice and functionality. Jamesday 18:56, 6 Jul 2004 (UTC)
 * (1) it can be extremely difficult to associate a talk page edit with the actual edit, especially if another 15 edits have occurred since then and you weren't tracking that conversation (2) it is time consuming to gauge the response of others to a particular editor. Point 2 makes it easier to tell if everybody else thinks this editor is a bit off beam or whether everyone else thinks he is right on the money and you should cut him some slack... you can do that already, of course, it just takes a lot longer than pulling up a list of the last 30 reviews of their edits. Erich 05:07, 7 Jul 2004 (UTC)


 * Keep it positive, this is meant to be a brainstorming page. It was described as such in all the incoming links. -- Tim Starling 01:04, Jul 7, 2004 (UTC)


 * It wasn't described that way on my way here. Here's the positive rewording, though: I really like the idea of being able to associate comments and discussions with an edit. Better still if they appear on the watchlists of those interested in the article. Perhaps we could have a place to discuss the contents of an article and edit. Could we arrange to present them all at once for easy review of the history of the discussion? Have to avoid spreading them all over the place so they are hard to follow, though - would be very inconvenient to have dozens or hundreds of different talk pages so you can't find all of the discussion about an article easily. Maybe a talk page of some sort to go with the article? Jamesday 13:56, 12 Jul 2004 (UTC)


 * Jamesday, that is a fair question. The big advantage is that the comment is directly linked to an edit. So, if I click on the diff for the edit, I get to see all the short comments about the edit on that page. This saves me from opening up both talk and the diff page at the same time, and having to determine which comments in talk refer to which edits. This is a nontrivial savings in time. Next, talk tends not to get used for this presently except for the exceptional edits. If it was easier to comment on a single edit, more single edits would get commented on. Lastly, and I consider this a bug in Wikimedia, I don't think there is any way to link to the most recent diff (at least not in a permanent way). This means that I usually end up copying the old version and the new version, and then pasting both into the talk page along with my comment. This takes on the order of 5 to 10 minutes, whereas I think that if I could just type in a comment from the diff page, it would take closer to 2 minutes. This means I can comment on two to five times as many edits in the same amount of time. I agree that it is somewhat redundant, but I think it would be helpful. Jrincayc 01:22, 7 Jul 2004 (UTC)


 * Yes, that appears of some interest when you're considering one edit. What about when you want to review all of the discussions about the article or a series of edits, though? Putting the discussion in one place seems much more convenient for that. When I want to write a talk page comment with quoting, what I tend to do is open another browser window for that, edit the old version, copy and paste from there, then copy and paste from the revised version. Works nicely for me. Still, I can see that it might be convenient to also have a post comment box on the edit screen, which would post in the talk page. That's a couple of clicks saved and that sounds good. It's also something which could generate an automatic link to the comment. Jamesday 13:56, 12 Jul 2004 (UTC)

There have been a handful of times when an edit's comments were meaningless to me, and I would have very much liked to add to or change the comment left. For example, change something silly to 'revert' or 'unblank'. This goes along with the automatic tagging/hashing described above, which sounds like a very good idea. If a revert was tagged in comment automatically, it would help. It could be done either via hash, or by adding an extra field to the form (or something more secure) when you edit an old revision. Likewise, I wonder if there should be more than one kind of minor edit (e.g., 'spelling/punct/grammar', 'category addition', etc...), but that's a lot less important than major changes like reverts. --ssd 05:34, 7 Jul 2004 (UTC)

User trustworthiness
(I'm gonna add a section, hope you don't mind. Feel free to add or correct my ramblings below. --ssd 05:49, 7 Jul 2004 (UTC))

If users are going to tag articles as checked, we have to have a metric for measuring how much we trust them. This is hinted at above; here are some ways we can measure this user trust.
 * Assigned rank. Ranks that exist: anonymous, logged in, sysop, admin, developer
 * Calculated trust based on work done: number of days since user created, number of edits, number of new articles, etc.
 * Trust metric: number of edits reverted by others; negative or positive scoring of an edit (see above), number of edits not reverted or scored highly, etc.
 * Trust lists: users in my personal trust list; users in trust lists of users I trust; distrusted users
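The trust-list option in the list above can be sketched as a one-step lookup through other people's lists. The Python below is only an illustration: the function name, the returned labels and the rule that distrust wins over everything else are all assumptions.

```python
def trust_level(me, other, trust_lists, distrust_lists):
    """Classify `other` from the point of view of `me`.

    trust_lists    -- maps each user to the set of users they trust
    distrust_lists -- maps each user to the set of users they distrust
    """
    # Assumption: my own explicit distrust overrides any other signal.
    if other in distrust_lists.get(me, set()):
        return "distrusted"
    mine = trust_lists.get(me, set())
    if other in mine:
        return "trusted"
    # One step of indirection: someone I trust lists this user as trusted.
    if any(other in trust_lists.get(friend, set()) for friend in mine):
        return "trusted by someone I trust"
    return "unknown"


trust_lists = {"me": {"alice"}, "alice": {"bob", "mallory"}}
distrust_lists = {"me": {"mallory"}}

alice_level = trust_level("me", "alice", trust_lists, distrust_lists)
bob_level = trust_level("me", "bob", trust_lists, distrust_lists)
mallory_level = trust_level("me", "mallory", trust_lists, distrust_lists)
```

A filter such as "show only changes not checked by someone trusted by me" would then keep an edit whenever none of its checkers comes back as "trusted" (or, more leniently, as trusted at either level).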

I think it's important for new users to be able to patrol RC. Not only that, but it should be rewarding and fun for them. I know I patrolled RC shortly after I arrived, and it was great to be able to help out the community in that way. So whatever measure we use, we must allow untrusted users to review edits. It should be clear to them that their review work is benefiting the community, not being chucked in the "untrusted" bin, to be retrieved only in the unlikely event that someone clicks the "show untrusted reviews" link. And it should be possible for a user to win trust by patrolling RC and fixing any vandalism that they see.

Displaying the names of the reviewers serves several purposes. It allows users to assess the trustworthiness of a review without relying on an algorithm. And it allows the reviewers to build a reputation, which will eventually lead to real trust by the community.

That said, there is still a need for a compact representation of the quality of an edit, to be displayed on RC or in other similar lists. Perhaps a simple count of thumbs-up votes versus thumbs-down votes would suffice. Automatic sock-puppet detection could help make this more robust. Another option would be to allow users to pick some small symbol -- say their initials or a small transparent icon -- which would identify them. The marks of reviewers could be listed on RC, perhaps with a different background colour for negative ratings as opposed to positive ratings.

If we went for an icon-based representation, we could make some modification to the icon to represent a user who is new or untrusted. This could be done using one of the trust metrics mentioned by ssd. Regardless of how new a person is, we should always display their review, although it may be appropriate to have a blacklist for users who are known to abuse the system.

Remember that the reason I think negative ratings are necessary is because a rating is attached to the text of an article. If a user reverts to a "bad" version, the previously assigned ratings are immediately displayed. Of course the user may make some trivial change to avoid their edit being marked as a revert. It's important that we discourage this by not assigning any meaning to reverts beyond displaying them as such on RC. In particular, this measure should never be used to limit a user to some number of reverts per day.

-- Tim Starling 07:07, Jul 8, 2004 (UTC)


 * Well... If you find a vandalised article, and fix it, your edit in fixing it would presumably be rated highly, no? So you'd get brownie-points for the resultant actions from patrolling, but not from the patrolling itself. I dunno - should people be trusted (or trusted more) just for looking at a few hundred edits and marking them in the mob-approved way (that is, consistent with the majority of others')?
 * James F. (talk) 08:10, 8 Jul 2004 (UTC)


 * That's correct, if you find a vandalised article and fix it, that will presumably count for something. But you can't find vandalised articles and fix them if you're not allowed to use the edit checking system. Real human trust and actual reputation are more important than computerised measures of the above. I'm not suggesting using edit checking as part of a trust metric algorithm, I'm saying that displaying names leads to good social dynamics -- trust, reputation and competition. -- Tim Starling 03:43, Jul 9, 2004 (UTC)
 * I hope to never see a fuzzy user trust metric used to limit what a user can edit. It should only be used to determine how much scrutiny that user's edits need.  Some other system such as rank or page protection or page permissions or some form of banning would be better to limit user activities.  Any calculated rank can be subverted. --ssd 06:30, 9 Jul 2004 (UTC)


 * I don't think it matters who does it. What matters to me is how many people have looked. I may trust or not trust some people but by the time there have been five views of a change I'm sure enough that someone has checked it who will revert trouble. It's much easier to track and update this - a simple counter will suffice. I don't care much about good or bad edit - just that enough eyes have looked that if it was bad, it's been fixed. Jamesday 11:24, 14 Aug 2004 (UTC)