Wikipedia talk:Version 1.0 Editorial Team/Wiki Sort


Current Tasks

 * 1) Hammer out the specifics of the proposed rating system
 * 2) Request software modifications
 * 3) Hope and pray it gets implemented
 * 4) Start rating
 * 5) Decide on a verification method
 * 6) Wikipedia 1.0!


 * "Current" as of when? Who is the author? As I write this comment, the next comment section is from 2005: that suggests to me that "current" is not particularly current in 2014. hunterhogan (talk) 07:56, 4 January 2014 (UTC)

A Message
I would just like to make it known that I am deeply devoted to this project. It has a LOT of potential, and I am willing to see it through to the end. However, that does not mean I intend to hijack it! Rather, we need to do this together. Let's git 'er done! the1physicist 03:51, 24 October 2005 (UTC)

Guidelines
We will need a systematic way of developing the rating system. Post everything that is agreed upon to the article page. Discrete areas of discussion will focus our efforts, so any areas of disagreement should be started in (or moved to) their own section on this talk page.

Rating Scale
I know this is going to come up, so I thought it best to pre-emptively create this section. What range of numbers will this rating system cover? Will it be 0 to 3, 1 to 10, or something in between? Personally, I strongly favor a scale of 1 to 10. Among other things, this would make it much easier on the software. More importantly, I think people are used to rating things on a scale of 1 to 10, which jibes with the goal of making rating as easy as possible. the1physicist 03:13, 21 October 2005 (UTC)

Online vs. CD/DVD vs. Paper
This will also come up, so we might as well address it now. A disk version of Wikipedia will naturally contain less information than the online version, and a paper version will contain even less. The wonderful thing about a rating system is that, by varying the selection criteria, it automagically allows for both. For example, a disk version could include all articles with a rating of, say, 7 and up, whereas a paper version could limit itself to articles rated 9 or 10. In contrast to our sister projects, a rating system will allow for any conceivable distribution method, without pulling from the editor pool. the1physicist 03:13, 21 October 2005 (UTC)
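A minimal sketch of the threshold idea above, in JavaScript. The article list, the field names, and the `selectForRelease` helper are all hypothetical; only the cut-offs (7 and up for disk, 9 or 10 for paper) come from the comment.

```javascript
// Hypothetical sample data: article titles with an average rating on a 1-10 scale.
const articles = [
  { title: "Physics", rating: 9.4 },
  { title: "Chemistry", rating: 7.8 },
  { title: "Some stub", rating: 3.1 },
];

// Return the titles whose rating meets a release's minimum threshold.
function selectForRelease(articleList, minRating) {
  return articleList
    .filter((article) => article.rating >= minRating)
    .map((article) => article.title);
}

const diskEdition = selectForRelease(articles, 7);  // rating 7 and up
const paperEdition = selectForRelease(articles, 9); // rating 9 or 10
console.log(diskEdition, paperEdition);
```

The same ratings feed every edition; only the threshold changes, which is why no separate editor pool would be needed per distribution method.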

Categorized Rating
While reading other folks' past suggestions, I saw that a few had the idea of additional rating categories, so I have added this to the project page. I am definitely in favor of any and all additional categories (suggestions, please!), but feel we should limit them to the top 5 or 10. I also think the scope and comprehensiveness ratings are most important, so they should be separate from the others. the1physicist 03:13, 21 October 2005 (UTC)

Rating Accuracy
One possible issue is that people may (unintentionally) rate articles unreliably. This will primarily be a problem with casual viewers who may not be well versed enough in Wikipedia. I personally don't think this will be a problem (due to averaging, etc.), but for some articles it very well could be. At the very least, an article with wildly varying ratings suggests an issue with the article itself. As User:Stirling_Newberry put it, "even noise is data". the1physicist 20:21, 22 October 2005 (UTC)
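As a rough illustration of "averaging" and of spotting "wildly varying ratings", the sketch below computes the mean and standard deviation of an article's ratings. The `ratingSummary` and `isContested` helpers and the spread threshold of 2.5 are assumptions made up for this example, not part of any proposal on the project page.

```javascript
// Mean and (population) standard deviation of a list of numeric ratings.
function ratingSummary(ratings) {
  if (ratings.length === 0) {
    return { mean: NaN, stdDev: NaN };
  }
  const mean = ratings.reduce((sum, r) => sum + r, 0) / ratings.length;
  const variance =
    ratings.reduce((sum, r) => sum + (r - mean) ** 2, 0) / ratings.length;
  return { mean, stdDev: Math.sqrt(variance) };
}

// Flag an article whose ratings disagree strongly.
// maxStdDev = 2.5 is an arbitrary, hypothetical cut-off.
function isContested(ratings, maxStdDev = 2.5) {
  return ratingSummary(ratings).stdDev > maxStdDev;
}

console.log(ratingSummary([7, 8, 8, 7])); // raters broadly agree
console.log(isContested([1, 10, 1, 10])); // raters split into two camps
```

A flagged article would not be excluded automatically; the spread is just a signal that the article may deserve a closer look.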

Scope Rating
This is particularly true of the scope rating, where only an expert in that topic will truly know how well an article fulfills its scope. The solution to this problem would be to weight experts' scope ratings more heavily (or possibly have an admin override the rating), which of course requires a way to identify experts. the1physicist 20:21, 22 October 2005 (UTC)

Expert Rating
An editor could select areas where he or she is knowledgeable. Then other editors could rate that editor's level of expertise after reviewing edit histories, etc. It may even be possible for the software to determine the expert rating. The software could compare an article's rating before and after an edit (assuming an improvement) and, using its category, determine that the editor is knowledgeable in that subject. the1physicist 20:21, 22 October 2005 (UTC)

Wikiversity
Alternatively, if the (proposed) Wikiversity ever decides to include testing (which I believe they should), those scores could be used to determine an expert rating. the1physicist 20:21, 22 October 2005 (UTC)

Last Minute Fork vs. Continuous Validation
Vandalism is going to be an issue. We have two options. The first would be to fork the 'pedia at the last minute and have editors do a final check for vandalism/copyvio. The other option would be to verify each version of an article as free from vandalism, and have the software choose the latest vandalism-free version for release. This option would require much more effort than a last minute fork. More importantly, a continuous validation scheme would probably undergo final validation anyway, so we might as well do it all at once. the1physicist 02:10, 24 October 2005 (UTC)


 * My vote is for continuous validation. Academic studies show that the average act of vandalism is fixed within $$ x \pm \sigma $$ minutes.  We can mine the database to get up-to-date and detailed values for x as a function of the number of users who list the article on their watchlist, the number of those users who are admins, etc.  Once we set a statistical confidence level (such as 95%), we can choose a length of time $$ x + k \sigma $$, tailored to the watchlist status of the article in question, and simply use the newest version which is older than that time interval.  In our example, $$k=2$$.  The figure from 2004 is $$x = 2.8 $$, but that's the aggregate, and I'm sure it's a bit smaller now, especially for well-watched articles.  An extremely simple version could be based on the old aggregate, though.--Joel 15:40, 4 November 2005 (UTC)


 * Have you seen the latest version on the project page? Note this sentence: "Perhaps the best solution would be to verify revisions of articles, fork at the last minute, and simply present to editors articles whose most recent revision isn't already verified."  I think it combines the best of both worlds.  Let me know what you think. the1physicist 16:31, 4 November 2005 (UTC)
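The time-based rule suggested above (use the newest version older than $$ x + k \sigma $$ minutes) can be sketched roughly as follows. The revision objects, the `pickStableRevision` helper, and the value used for sigma are hypothetical; the thread gives only x = 2.8 and k = 2.

```javascript
// Pick the newest revision that is at least (x + k * sigma) minutes old.
// Timestamps are in minutes on an arbitrary clock (an assumption for this sketch).
function pickStableRevision(revisions, nowMinutes, meanFixMinutes, sigmaMinutes, k = 2) {
  const cutoff = nowMinutes - (meanFixMinutes + k * sigmaMinutes);
  const oldEnough = revisions.filter((rev) => rev.timestamp <= cutoff);
  if (oldEnough.length === 0) {
    return null; // no revision is old enough yet
  }
  return oldEnough.reduce((newest, rev) =>
    rev.timestamp > newest.timestamp ? rev : newest
  );
}

// Hypothetical history: three revisions, with "now" at minute 100.
const history = [
  { id: 1, timestamp: 0 },
  { id: 2, timestamp: 90 },
  { id: 3, timestamp: 99 },
];

// x = 2.8 is the 2004 aggregate quoted above; sigma = 1.0 is a made-up placeholder.
const stable = pickStableRevision(history, 100, 2.8, 1.0, 2);
console.log(stable); // revision 3 is only 1 minute old, so revision 2 is chosen
```

In a per-article version, `meanFixMinutes` and `sigmaMinutes` would be looked up from the article's watchlist statistics rather than passed as constants.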

General Discussion
On the specifics of the rating system: I think that users should be prompted to rate articles, and that a system should be devised which uses previous ratings and the size of the diff file to choose which article is brought to the users' attention. I'm quite fond of the randomization trick that Slashdot uses to prevent the rating of things one already cares about; I think we should steal it. Perhaps the developers involved would even donate their services to the Wikimedia team; failing this, their software is open source. See Coase's Penguin (the paper itself, not the Wikipedia article) if you aren't familiar with the finer points of their rating system; a quick treatment (which omits my favorite part of it) can be found here.

On the implementation of software modifications: see Slashdot comment. Also: the Unification Church ("Moonies") is working on sorting out the best of Wikipedia, as well. Perhaps they'd help with financing the software development. Of course, this won't affect whether or not it gets accepted into MediaWiki...

On the last-minute fork: If the article rating applies to the version itself (with, perhaps, a spillover if the edits are small), there is no need to fork or skim. Only the articles that have been automatically rated would be included.--Joel 06:43, 20 October 2005 (UTC)


 * On second thought, ratings of the scope of the topic should be kept through all versions, but ratings on quality and thoroughness within that scope should apply to particular versions.--Joel 21:43, 20 October 2005 (UTC)

As the semi-official project leader, I thought it would be wise to read all of the previous discussion concerning WikiSort and the 1.0 project in general. Between your links and User:Walkerma's, this may take a while. I'll post sometime soon with a fresh batch of ideas. the1physicist 22:01, 20 October 2005 (UTC)

Some random thoughts: If the article rating is to be used to rate the editors it would be logical to only allow logged in users to do the rating. Articles should also be rated on structure and readability. Users taking a stub and adding significant content would change a rating significantly, and this would not be vandalism. Users adding links, correcting grammar, etc. are just as valuable as content adders, and additions that improve navigation (e.g. disambiguation) are possibly more significant. User rating should take account (somehow, but I don't know how) of their participation in edit wars or vandalism. Joe1011010 19:03, 25 October 2005 (UTC)


 * Why exactly should we limit ratings to logged in users? Are you worried about rating vandalism?  Perhaps I should have included all the anti-vandalism ideas on the project page after all.  True, it would be much more accurate if we could rate editors' edits directly (rather than a before and after snapshot of the article), but that would be VERY tedious, which goes against the idea of making this as easy as possible.  More importantly, it wouldn't be _that_ much more accurate.  Alternatively, we could allow users to rate other editors directly (perhaps on categorized criteria).  Anyhow, check the project page in a little bit, and post back if it doesn't answer your concerns. the1physicist 02:01, 26 October 2005 (UTC)

Hi,

When I started reading your "Wiki Sort" page and stumbled on the phrase "Categorized ratings" and saw your examples on comprehensiveness rating and importance rating, I thought to myself - what a wonderful idea! But then I started to read more about your implementation suggestion, and got a bit disappointed. You are suggesting massive software changes to add the new concept of "rating", and new ways to work with these ratings. I'm thinking to myself - why not do something much simpler, and have actual category-based ratings?

What I mean is, imagine we create Wikipedia categories named Comprehensiveness-1 through Comprehensiveness-10, and Importance-1 through Importance-10, and so on, and let people add these categories to articles. Disputes in these ratings will be handled just like any other dispute (including perhaps a discussion in the talk page), they are protected from vandals just like any other change (history, etc.), and most importantly, this requires absolutely no change to the software. When building Wikipedia 1.0, these categories, as well as existing categories like "stub", "NPOV", and so on, will be used to decide which articles should be included.

What do you think?

Nyh 11:57, 8 November 2005 (UTC)


 * I think that's a great idea. I was thinking an external system like de.lirio.us, but your way sounds even better.  Go ahead and set up the categories yourself.  However, I would suggest using the pipe trick:  will produce a synoptic "category" page ranked by rating and then sorted alphabetically.  I'd also suggest creating a template for the talk page that allows users to vote, and invites people to check that the average/consensus of votes is what the page is listed under.--Joel 17:22, 8 November 2005 (UTC)


 * That would be a very good idea, except it wouldn't allow for all the nifty data-mining techniques discussed on the project page. More importantly, I recently talked with User:Magnus_Manske, and according to him, most of the software changes have already been implemented.  If I understood him correctly, the only thing we're waiting on is the appropriate data mining algorithms (some of which I have outlined). the1physicist 17:51, 8 November 2005 (UTC)

Observations off the top of my head
Hi, I had been thinking about this topic, having seen some of the editorial team assessment stuff somewhere. Also I was reading comment following a (fairly) recent Guardian article about Wiki, which was critical in some respects, and a discussion I was having with a librarian. Wiki pages need rating for the benefit of people using them, to get an idea of how good we think a particular article is. The fact that some pages are acknowledged to be virtually worthless is a real liability in a source to be recommended to someone.

I didn't see how you plan to display this information, but it needs to be available on the article page so someone can quickly see it. Perhaps a tab at the top of the page could bring up a 'rate' page with more statistics and tick boxes to make a rating? A history of past ratings (somewhere, I suppose not necessarily visible, though if it has a dedicated page this could be quite extensive). A brief read suggests you are branching out into editor rating. What will you do with the ratings? Is Kate's tool out of date?

But this does need to work automatically with minimum special intervention by teams of assessors. Sandpiper 00:50, 26 November 2005 (UTC)


 * My plan would be to display a disclaimer at the top of each page (perhaps below the "From Wikipedia, etc.") that would say something like "This page has been rated X by users of Wikipedia. This means that it may have Y or Z issues." or something like that. As far as the rating goes, the powers that be have determined a tab at the top of the page would be best.  You can see the demo here (click the validate tab).  Keep in mind that those are just test categories. You may also want to check out the metawiki. Thanks for commenting! the1physicist 01:58, 26 November 2005 (UTC)


 * That was quick work. Good of you to set that up overnight: just the sort of thing I had in mind.

Another mathematical approach
What is needed to provide safety in verification is a heuristic to determine how reliable an article's rating is. I agree with the time-based method suggested above, but I also want to suggest an approach based on character counting. Under this system, whenever an edit is made, the server notes the number of characters added + number removed. With this as change:

new unknown score = net rating - [old article length / (change + old article length) * (net rating - old unknown score)] (This would need special cases to prevent division by 0.)

Here, "unknown score" refers to a rating value meaning that the server is agnostic as to the value of the new content. This would roughly approximate it; someone changing punctuation wouldn't register much on the scores, while someone venturing a new paragraph or an outright page blank would cause the consensus to be automatically decimated. The algorithm would guard against last-minute vandalism getting into printed copies, as a large change would edge a page out of a B-rating. Just a suggestion. Alksub 01:14, 9 April 2006 (UTC)
 * I'm having a hard time understanding exactly what that does, so you'll have to break it down for me. It looks like it assigns an accuracy rating based on how different a revision is from the previous one.  Is that correct? the1physicist 14:28, 11 April 2006 (UTC)
 * Yes, an accuracy rating of the page's rating. Alksub 01:48, 22 April 2006 (UTC)
 * Actually, the formula I gave above is just goofy. But you get the general idea. Alksub 14:03, 8 July 2006 (UTC)
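For readers trying to follow the exchange, here is the formula exactly as originally posted, transcribed into JavaScript with the division-by-zero guard the post mentions. The function and parameter names are made up for this sketch, and since the formula's own author later called it "goofy", this is an illustration of what was written rather than a recommended method.

```javascript
// new unknown score =
//   net rating - [old length / (change + old length) * (net rating - old unknown score)]
// "change" is the number of characters added plus the number removed.
function newUnknownScore(netRating, oldUnknownScore, oldLength, change) {
  const denominator = change + oldLength;
  if (denominator === 0) {
    // Empty article and no change: nothing to weigh, so keep the old score.
    return oldUnknownScore;
  }
  const retainedFraction = oldLength / denominator;
  return netRating - retainedFraction * (netRating - oldUnknownScore);
}

console.log(newUnknownScore(8, 2, 1000, 0));    // no change: score stays at 2
console.log(newUnknownScore(8, 2, 1000, 1000)); // change as large as the article: 5
```

As transcribed, a zero-character edit leaves the unknown score untouched, and larger edits move it further from its old value toward the net rating.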

Status
Is this project still active? It could be good to put an update at Version 1.0 Editorial Team. Thanks. Maurreen 19:06, 30 April 2006 (UTC)
 * Yes, this project is still alive. I've personally not been able to devote a lot of time, as detailed on my user page.  Unfortunately, we're being held up getting everything polished off in the software.  Discussion is also fragmented between here, the article validation pages on meta, and various other places.  Probably the biggest obstacle is that not a lot of people know about this project yet. What we really need is someone to spam the village pump, etc. (if that's appropriate?) to (A) fine-tune the ideas and, more importantly, (B) light a fire under the devs to finish the software. the1physicist 01:38, 1 May 2006 (UTC)
 * We're planning to publicise WP1.0 a lot in the coming months, maybe we can include this project in some of that. Walkerma 02:16, 1 May 2006 (UTC)
 * Has anyone here gone through to CD burning? Redlink cleanup, category links that work, reduction of large pictures? Wizzy… ☎ 18:36, 20 June 2006 (UTC)
 * Not that I know of, but someone could have slipped that past me. the1physicist 18:46, 20 June 2006 (UTC)

Ranking Articles?
This is going to be nearly impossible unless we're ranking a specific export of the database. An article can be rated 10 (1000 votes) and then some it10t can come along, make "valuable minor" changes, and blow the current rank. If this has already been solved, I'm on the project.

-NickSentowski 18:09, 6 July 2006 (UTC)

Software suggestion
I think it would be easier to use a bot + a script. The script would create a "rate" tab on each article and post the rating to a user subpage. The bot would look at the ratings on the user subpage and get the final rating somehow. This way, we don't have to wait for the developers to make changes to the software because the bot can be run by one user and the scripts can be posted on the scripts page. We could make the script more well known by posting it on the village pump. The bot doesn't need to be well known because only a single user has to run it. Eyu100 23:19, 24 August 2006 (UTC)

Script (untested):

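// Shared between the functions below.
var quality, importance, pagename;

// Append the rating to the ratings page and save it as a minor edit.
// Assumes the ratings page's edit form is reachable from this document.
function write_to_ratings_page() {
  document.getElementById("ca-edit").click();
  document.editform.wpTextbox1.value += "(" + quality + " and " + importance + " and " + pagename + ")";
  document.editform.wpSummary.value = "Added rating";
  document.editform.wpMinoredit.checked = true;
  document.editform.submit();
}

function get_rating() {
  quality = prompt("Please enter a quality rating from 1 to 6, 1 is stub, 6 is FA", "");
  importance = prompt("Please enter an importance rating from 1 to 4, 1 is low, 4 is top", "");
}

function rate() {
  get_rating();
  // Caveat: following the tab loads a new page, so the remaining steps
  // would have to resume there; this is part of why the script is untested.
  document.getElementById("ca-cumbersome_tab").click();
  write_to_ratings_page();
}

// addOnloadHook, addTab and getPname are assumed to be provided by the site or other user scripts.
addOnloadHook(function () {
  pagename = getPname();
  addTab("javascript:rate()", "rate", "ca-rate", "Rate this page", "");
  addTab("http://en.wikipedia.org/wiki/User:Eyu100/Bot_area", "DO NOT CLICK THIS", "ca-cumbersome_tab", "PLEASE _DON'T_ CLICK THIS", "");
});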
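The bot half of this suggestion ("get the final rating somehow") is left open. One possible sketch, assuming the subpage text consists of back-to-back entries in the "(quality and importance and pagename)" format written by the script, is to average the votes per page. The `averageRatings` function and the sample text are hypothetical, and fetching the subpage is out of scope here.

```javascript
// Parse "(quality and importance and pagename)" entries and average them per page.
// The lookahead lets page names such as "Mercury (planet)" contain parentheses.
function averageRatings(subpageText) {
  const entryPattern = /\((\d+) and (\d+) and (.*?)\)(?=\s*\(|\s*$)/g;
  const totals = new Map();
  for (const [, quality, importance, page] of subpageText.matchAll(entryPattern)) {
    const t = totals.get(page) || { quality: 0, importance: 0, votes: 0 };
    t.quality += Number(quality);
    t.importance += Number(importance);
    t.votes += 1;
    totals.set(page, t);
  }
  const result = {};
  for (const [page, t] of totals) {
    result[page] = {
      quality: t.quality / t.votes,
      importance: t.importance / t.votes,
      votes: t.votes,
    };
  }
  return result;
}

// Hypothetical subpage contents: two votes on one page, one on another.
const sampleText =
  "(5 and 3 and Physics)(3 and 3 and Physics)(6 and 4 and Mercury (planet))";
const finalRatings = averageRatings(sampleText);
console.log(finalRatings);
```

A plain average is only one choice; the bot could equally use a median or weight votes by rater, as discussed elsewhere on this page.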

Obsolete
Version 1.0 Editorial Team states that the Wiki Sort "project has been rendered obsolete by WikiProject-based assessments (see above)."

Problems: hunterhogan (talk) 08:13, 4 January 2014 (UTC)
 * 1) The linked page directs the reader to "see above", but as of this writing it is not clear what the author is directing the reader to see. As best as I can tell, the author is referencing Work via WikiProjects.
 * 2) As best as I can tell, this page, Wiki Sort, contains information that is still relevant but not reproduced in other "active" projects. If this project is obsolete, editors should migrate the non-obsolete knowledge to non-obsolete projects.
 * 3) The Wiki Sort page does not include a warning that it is obsolete. Compare to Article_assessment.