Wikipedia:Wikipedia Signpost/2008-06-23/Dispatches

Two different grading systems: "importance" and "quality"
Most users will have seen the talk page banners that indicate what stage an article has reached in the writing process: A-Class, B-Class, Start-Class, or even Stub-Class. They may also have noticed that many articles are graded according to their importance: from Low-importance to Top-importance. These rankings may seem cryptic to new or occasional editors, and even seasoned editors may not have given much thought to the role of these templates in Wikipedia's quality control process. Moreover, there is often confusion about the relationship between this assessment scale and the processes that determine good articles (GA) and featured articles (FA).

Importance scheme
Wikipedia's importance scheme aims to determine the importance attached to an article's topic by its related WikiProject(s) – from those that are "extremely important, even crucial", to those that are "not particularly notable or significant". Thus, the same topic may be more important to one project than to another, and so can receive more than one assessment on the importance scale. Powderfinger, for instance, has been rated "top-importance" (priority) by the Powderfinger WikiProject, "high-importance" by WikiProject Australia, and "mid-importance" by WikiProject Alternative music.

Quality assessment
The encyclopedia's quality assessment scheme is more complex, because it has to address many facets of article quality, such as completeness, layout and language. Following a June 2008 poll that added a new "class", WikiProjects will begin using five levels for quality assessment:
 * Stub – a basic description in a paragraph or two;
 * Start – an article that is developing, but is quite incomplete and lacks reliable sources;
 * C – an article that is moderately complete, but lacks sources or contains cleanup tags;
 * B – an article that is mostly complete, without POV or other major cleanup issues, but which requires further work to reach Good Article standards;
 * A – an article that is organized well and is essentially complete, but needs style issues addressed before submission as a featured article candidate.

Critically, such "importance" and "quality" are not necessarily correlated: one article might be of "low importance" and "A Class" (see Clea Rose example); another might be a "top-importance" stub (see Judiciary of Australia example).
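The independence of the two scales can be sketched in a few lines of Python. This is only an illustration: the class names, the numeric values (which encode nothing more than the ordering of the levels) and the variable names are assumptions made for the example, not part of any Wikipedia tool. The ratings themselves are the ones quoted in this article.

```python
from enum import IntEnum


class Quality(IntEnum):
    """Project-level quality classes, lowest to highest (values are illustrative)."""
    STUB = 1
    START = 2
    C = 3
    B = 4
    A = 5


class Importance(IntEnum):
    """Importance (priority) levels, lowest to highest (values are illustrative)."""
    LOW = 1
    MID = 2
    HIGH = 3
    TOP = 4


# Importance is recorded per WikiProject, so one article can carry several ratings.
powderfinger_importance = {
    "Powderfinger WikiProject": Importance.TOP,
    "WikiProject Australia": Importance.HIGH,
    "WikiProject Alternative music": Importance.MID,
}

# The two scales are independent: each pair is (quality, importance).
examples = {
    "Clea Rose": (Quality.A, Importance.LOW),
    "Judiciary of Australia": (Quality.STUB, Importance.TOP),
}

for title, (quality, importance) in examples.items():
    print(f"{title}: {quality.name.title()}-Class, {importance.name.title()}-importance")
```

Because the levels are ordered, comparisons such as `Quality.STUB < Quality.A` work directly, while nothing ties an article's quality value to its importance value.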

At press time, the new C-Class still needs to be fully enabled in the WP1.0 bot and elsewhere. This new classification has effectively raised the standards of quality required to attain B-Class. The scale also includes classes such as FA-Class and GA-Class, which are not WikiProject-based, as well as descriptive classes such as "Portal-Class"; for a complete list, see below.

Developing the scale
The original purpose of the assessment processes was twofold: to facilitate the production of an offline release, and to assist WikiProjects in organizing their articles, by categorizing the quality of articles as simply, accurately and comprehensively as possible. A test CD (Version 0.5) was released by the Version 1.0 Editorial Team in 2007, and a larger DVD release (Version 0.7) is planned for the third quarter of 2008. The gargantuan task of sifting through 2.4 million articles (as of June 2008) would be impossible with just a handful of team members. To solve this problem, a standardized baseline had to be developed so the task could be distributed among the editors who comprise Wikipedia's base.

Instead of developing a brand-new scale, the Version 1.0 Editorial Team adopted existing guidelines, and modified them for greater scalability. The assessment scheme in use across the community was originally developed at the Chemicals WikiProject as a method of tracking the completeness of the articles in their Worklist (a set of around 400 articles on which the project decided to focus its effort). By late 2005, the scheme was proposed as part of the article selection process at the 1.0 project. The Work via WikiProjects sub-project was started with the aim of having projects provide subject-expert assessments, which the 1.0 team could then put together to produce a broad selection of articles from the encyclopedia. The initial method was to request manually written lists of the top articles from each project; this generated around 3,000 assessments and provided some suitable articles, but was very labor-intensive. In April 2006, there were about 1.1 million articles in Wikipedia, so continuing with the older method would have proved ineffective. At about this time, a new category-based, bot-assisted system was introduced; this gave projects valuable tools for their work (lists, a log and a statistics table) and provided the 1.0 group with a much more comprehensive list of articles. Tagging an article (via the talk page) is straightforward, and so the scheme rapidly grew to encompass 30,000 articles by August 2006, and to around 1.3 million articles in June 2008. The following table shows the aggregate of all the assessments by more than 1,300 participating WikiProjects and task forces throughout Wikipedia:

Although the assessment scheme is only approximate, it allows users to broadly gauge article quality, and WikiProjects to keep track of their articles. When combined with the importance assessment scheme (which is not universally used), projects can see which of their key articles need the most work. The Wikipedia 1.0 project is now able to integrate the information from all of the WikiProjects and make selections of articles for offline release.
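As a rough sketch of how a project might combine the two schemes, the Python below tallies assessments into a quality-by-importance table and lists the key articles that need the most work. The records, the function names and the cut-offs (Start-Class or lower, High-importance or higher) are hypothetical choices made for illustration; this is not the WP1.0 bot's code.

```python
from collections import Counter

QUALITY_ORDER = ["Stub", "Start", "C", "B", "A"]   # lowest to highest
IMPORTANCE_ORDER = ["Top", "High", "Mid", "Low"]   # highest to lowest

# Hypothetical (title, quality, importance) records for one project.
records = [
    ("Example article 1", "Stub", "Top"),
    ("Example article 2", "Start", "High"),
    ("Example article 3", "B", "Low"),
    ("Example article 4", "A", "Low"),
    ("Example article 5", "Stub", "Low"),
]


def tally(records):
    """Count articles in each (quality, importance) cell."""
    return Counter((quality, importance) for _, quality, importance in records)


def key_articles_needing_work(records, max_quality="Start", min_importance="High"):
    """Return important but underdeveloped articles, most important first."""
    quality_cut = QUALITY_ORDER.index(max_quality)
    importance_cut = IMPORTANCE_ORDER.index(min_importance)
    hits = [
        r for r in records
        if QUALITY_ORDER.index(r[1]) <= quality_cut
        and IMPORTANCE_ORDER.index(r[2]) <= importance_cut
    ]
    return sorted(
        hits,
        key=lambda r: (IMPORTANCE_ORDER.index(r[2]), QUALITY_ORDER.index(r[1])),
    )


counts = tally(records)
print("Quality".ljust(8) + "".join(level.rjust(6) for level in IMPORTANCE_ORDER))
for quality in QUALITY_ORDER:
    row = "".join(str(counts[(quality, level)]).rjust(6) for level in IMPORTANCE_ORDER)
    print(quality.ljust(8) + row)

for title, quality, importance in key_articles_needing_work(records):
    print(f"{title}: {quality}-Class, {importance}-importance")
```

A project that does not use importance ratings could still use the tally with a single placeholder importance level, since the quality counts do not depend on it.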


 * Note: The chart is generated from WikiProject templates, and represents the scheme used until June 2008. Some WikiProjects include featured lists in their featured article tally, so the number of featured articles in the chart is overstated. Conversely, because some articles have no WikiProject templates, or their templates have not been updated to include GA, the number of good articles in the chart is understated.

Criticisms and changes
Although the scheme is generally working, there is a steady trickle of criticisms and suggestions. The scheme is designed mainly for WikiProjects to assess article content and completeness, but GA and FA levels are included as "cross-references" to Wikipedia-wide quality assessment processes. This has been a regular source of confusion, since GA and FA status are not awarded by WikiProjects.

The Version 1.0 Editorial Team recently reevaluated the number of levels for project-based quality assessments. Until now there have been four (Stub, Start, B and A), but a recent poll indicated support for expanding this to five. To be useful across the community, the system must be simple and straightforward, so that all editors in all projects can use a common system for assessing articles. A greater number of assessment levels may yield a finer analysis of quality, but this is meaningless if the assessments cannot be performed to this level of detail. A majority of those polled believe that a fifth level (C-Class) will give a more refined scheme without seriously compromising reliability. The C-Class level will be introduced in the coming weeks.

The 1.0 team is testing a bot for automatic selection of articles. This involves evaluating the importance of an article using four parameters: a manual assessment by the project, the number of page hits, the number of foreign language "interwiki" links, and the number of links into the article. These factors are weighed along with the quality assessment to produce a selection of the most important "decent" articles for release. Initial test results look promising, but require an improved balance between WikiProjects. This new method should allow the 1.0 team to easily make regular general releases, and individual WikiProjects should be able to produce their own offline releases on paper, CD or DVD.
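The selection idea described above can be illustrated with a short Python sketch. Only the four importance parameters and the use of the quality assessment come from the description; everything else is an assumption made for the example, including the point values, the equal weights, the logarithmic scaling of the counts, the Start-Class floor for a "decent" article, and all names and sample figures. The bot's actual formula is not given here.

```python
import math
from dataclasses import dataclass

# Hypothetical point values; the real weighting is not described in this article.
MANUAL_POINTS = {"Top": 4, "High": 3, "Mid": 2, "Low": 1}
QUALITY_POINTS = {"Stub": 1, "Start": 2, "C": 3, "B": 4, "A": 5}


@dataclass
class Candidate:
    title: str
    manual_importance: str   # the project's manual assessment
    page_hits: int           # number of page hits
    interwiki_links: int     # number of foreign-language "interwiki" links
    incoming_links: int      # number of links into the article
    quality: str             # the project's quality class


def importance_score(c, weights=(1.0, 1.0, 1.0, 1.0)):
    """Combine the four importance parameters (log-scaling is an assumption)."""
    w_manual, w_hits, w_interwiki, w_links = weights
    return (
        w_manual * MANUAL_POINTS[c.manual_importance]
        + w_hits * math.log10(1 + c.page_hits)
        + w_interwiki * math.log10(1 + c.interwiki_links)
        + w_links * math.log10(1 + c.incoming_links)
    )


def select(candidates, limit, min_quality="Start"):
    """Drop articles below the quality floor, then rank by importance plus quality."""
    floor = QUALITY_POINTS[min_quality]
    decent = [c for c in candidates if QUALITY_POINTS[c.quality] >= floor]
    ranked = sorted(
        decent,
        key=lambda c: importance_score(c) + QUALITY_POINTS[c.quality],
        reverse=True,
    )
    return ranked[:limit]


# Hypothetical candidates with made-up figures.
candidates = [
    Candidate("Example A", "Top", 100_000, 50, 2_000, "B"),
    Candidate("Example B", "Low", 100, 2, 10, "A"),
    Candidate("Example C", "Top", 500_000, 80, 5_000, "Stub"),
]

for c in select(candidates, limit=2):
    print(c.title)
```

In this sketch "Example C" is excluded despite its high importance because it falls below the quality floor, which mirrors the goal of selecting important articles that are also "decent". Adjusting the `weights` tuple per project is one possible way to address the balance between WikiProjects, though that too is an assumption.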