Wikipedia:Assessing articles

This essay discusses the criteria and purpose of article assessments, recorded in talk page templates like WikiProject Venezuela. Assessments are useful if done right, but are often done wrong. Many articles are given lower quality or importance ratings than they merit based on the criteria. A common mistake is to assess short articles as stub or start class even when there is nothing more to be said about the subject, and longer articles as B (or higher) class even when there is much more to be said.

An unjustified "stub class" assessment with no explanation may cause a potentially productive newbie to give up. However, an author may be blind to defects that a reviewer sees at once. Reviewers are therefore encouraged to give notes on the article talk page that state what they feel needs improvement, preferably relating the notes to the project's assessment criteria, and authors should feel free to ask reviewers for more detailed feedback on what needs improvement.

There may be more leverage in bringing many articles up to C class, where they meet the needs of most casual readers, than in bringing a few up to the very demanding standards of FA class.

By and for project members
Assessments are for project members, not for casual readers. Most Wikipedia readers never see ratings, but some may click on the talk tab by accident and see the article they were reading has a C rating. That seems like a rather mediocre grade for an article that gave them all they wanted to know. They shrug and move on. They will not click on the quality scale to find out what C class means.

Assessments should ideally be done only by project members, or at least should be reviewed by project members. An article on a species of butterfly may cover all that distinguishes it from others in its genus. The article is well written, well sourced and complete, as anyone who knows about butterflies can see. But the article is just two paragraphs long. A general editor, busily working through a list of new articles, may give it a Stub rating because it is so short. B would be more appropriate. An article on a 19th-century physicist may give an excellent and well-illustrated overview of their life, but skim over the work for which they are known. The same busy editor may assign it a B rating because it is so long and thorough, but if most readers are more interested in the work than the person, it may be a Start.

To assess an article properly the reviewer should understand where the article fits in the spectrum of importance for the project, what information should be included in this type of article and what casual readers would be looking for. The reviewer must understand what could make a butterfly or physicist very important to the project, as opposed to a mundane butterfly or physicist. They must also understand the standard information to be recorded about butterflies and physicists, and know something about the more important subjects, so the presence or absence of the information tells them how complete the article is. The length of the article is irrelevant.

Quality ratings: an awkward compromise
Quality ratings try to give a combined assessment of three quite different aspects of any article:
 * 1) Prose: Is the article well organized, easy to read and easy to understand, avoiding needless jargon, with no spelling or grammar errors?
 * 2) Technical style: Does the article cite reliable sources to support what it says? Does it have appropriate formatting, wikilinks, categories, etc.?
 * 3) Coverage: Does the article give detailed and in-depth coverage of all significant aspects of the subject?

The three are independent. A very readable article may be a hoax about a non-existent subject. A perfect article in terms of technical style may be poorly written and have major gaps. A professor may write the definitive article on a subject, but their English is very poor and they see no need to add citations to their own work.

The table below summarizes the criteria given at WikiProject assessment. GA and FA are similar to A: fairly complete and well-written.
{| class="wikitable"
! scope="col" | Class
! scope="col" | Coverage
! scope="col" | Prose & style
|-
! style="background: transparent;" scope="row" | Stub
| Very little meaningful content
| May be incomprehensible
|-
! style="background: transparent;" scope="row" | Start
| Some meaningful content, but most readers will need more
| May need improvements to organisation, grammar, spelling, writing style, jargon use and citations
|-
! style="background: transparent;" scope="row" | C
| Still major gaps, but useful to a casual reader
| May have problems with clarity, balance, flow, bias or original research
|-
! style="background: transparent;" scope="row" | B
| Mostly complete, may not satisfy a serious student or researcher
| Reasonably well-written
|-
! style="background: transparent;" scope="row" | A
| Essentially complete, very useful to readers
| Well-written, clear, well referenced
|}

Coverage is the main criterion in assessing Stub, Start and C class articles, but truly awful prose or severe technical issues can drag a rating down to Stub. With B and GA/A/FA, where coverage is mostly complete, quality of prose and technical style are more important. One approach is to take the lowest rating of the three aspects. If an article's prose, style or coverage is Stub level, the article is Stub level. If it is not a Stub, but prose, style or coverage is Start level, the article is Start level. And so on. However, this may work poorly with Start class, which could describe a well-written article that just needs a bit more information to become C class, or a rambling, confused and unsourced essay that should be rewritten from scratch.
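The "lowest rating of the three aspects" approach can be sketched in a few lines of Python. This is only an illustration: the function name `overall_class` and the list `CLASSES` are hypothetical, and the ordering simply follows the rows of the table (Stub lowest, A highest).

```python
# Classes ordered from lowest to highest, following the rows of the table.
# This list and the function below are illustrative, not part of any tool.
CLASSES = ["Stub", "Start", "C", "B", "A"]


def overall_class(prose: str, style: str, coverage: str) -> str:
    """Return the lowest of the three per-aspect ratings."""
    return min((prose, style, coverage), key=CLASSES.index)


# Reasonable prose and style, but coverage is only Start level.
print(overall_class("B", "C", "Start"))
```

The sketch makes the weakness of the approach visible: a well-written article rated Start for coverage and an unsourced, rambling essay rated Start for prose both come out as plain "Start".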

According to the policy What Wikipedia is not, "Information should not be included in this encyclopedia solely because it is true or useful. A Wikipedia article should not be a complete exposition of all possible details, but a summary of accepted knowledge regarding its subject." However, the guideline WikiProject assessment says of the FA grade, "it neglects no major facts or details ... a definitive source for encyclopedic information." Google's list of synonyms of "encyclopedic" includes "comprehensive, complete, thorough, thoroughgoing, full, exhaustive, in-depth, wide-ranging, broad-ranging, broad-based, all-inclusive, all-embracing, all-encompassing ...".

There is room for debate over what constitutes "complete" coverage, but three assertions seem uncontroversial:
 * 1) A Start class article does not meet the needs of the typical casual reader, our primary audience, but a C class article does, even though it may be quite incomplete. (A "casual reader" is curious enough to have clicked on a link to the article or searched for the title. They may not be the average Joe, but they are not an expert on the subject area.)
 * 2) If a short article gives all that has been published about the subject, it must be considered complete even if many questions remain unanswered. "Complete" measures how close the article comes to what is possible rather than how close it comes to the ideal. Thus Beornred of Mercia, which was rated Stub as of December 2017, perhaps should be rated A. There is no more to be said.
 * 3) A long article may still be incomplete if it omits significant available information, and so falls short of what is possible, even if it meets the needs of almost all readers.

Importance ratings: a variety of definitions
The importance scale, also called the priority scale, is specific to a project. An article may be highly important to one project, less important to another. There is no "official" scale, and projects are encouraged to define their own, specialized scales. Different projects may consider different factors to evaluate importance. A project's importance scale typically answers the question, "How important is it to Wikipedia's coverage of this project's subject area that there should be an article for this topic?" Importance is often assigned incorrectly. Thus an article on a minor but notable artist, river or movie may be rated Low importance since the subject is not particularly significant, but should be rated Mid importance since deleting the article would leave a gap in Wikipedia's coverage of the project's subject area.

Some projects refer to the scale documented at Importance scheme, while others refer to the definitions in the Version 1.0 Editorial Team/Release Version Criteria. Other projects have customized scales which may consider factors such as notability of article topics, relationship to a "main" article for the project, centrality to understanding the project's subject area, reader interest and expectations and so on. The WikiProject Video game Importance scale takes the interesting approach of breaking the project scope into sub-areas such as video games and series, in-game elements, companies, hardware and so on, and giving different Top/High/Mid importance criteria for each sub-area. WikiProject Visual arts does not assess importance at all due to the difficulty in comparing such things as 19th century English history paintings, traditional Chinese porcelain and pre-Columbian architecture.

The proposed default scale documented at Importance scheme is based on the subject's notability within the field of knowledge covered by the project (an estimate of how many sources discuss the subject in some depth) combined with an estimate of whether there is worldwide interest compared to purely local interest:

Importance ratings are used by the Version 1.0 Editorial Team to decide which articles to include in an offline edition of Wikipedia. The editorial team's article selection bot also looks at factors such as the number of page hits, links from other pages, and a score of how broad the project is. The definitions in the Version 1.0 Editorial Team/Release Version Criteria are based on how central the subject is to the field, which may roughly correlate with notability, but ignore geographical distribution of interest in the subject:

These two scales are somewhat inconsistent, and a given project may have its own scale. The common factor is that an article is assigned importance based on an informed view of how important the article is to the project's subject area, and may be used to prioritise work by project members. Ideally importance is assigned or reviewed by a project member. It should not be assigned based on a vague idea of how important the subject is in the wider scheme of things, or how important it is to readers. In the second quarter of 2017 Darth Vader consistently got more pageviews than United States and World War II combined. This factoid should not affect the importance ratings of these three articles.

Sample article
An example of a mid-importance article that meets the criteria for C class:

Slatsnovgrad is a hill village in the Tslatzyn province of Ruritania.

Slatsnovgrad is in the northeast of Tslatzyn province at coordinates 65.898°N, 72.147°W, at an elevation of 1673 m above sea level.[1] It may be reached by a two hour drive over a rough dirt road from Tslatzyn City to the south.[2] As of 2015 the population was 340, of whom 54% were female and 44.5% were under the age of 18. The economy is based on raising goats for milk, wool and meat.[3] There is a shop in the village that sells ammunition, gasoline, bread, olives and kefir (fermented goat milk).[2]

References

•••

In this example there are no major problems with the prose and technical style, and most readers will not need more. Although there are still major gaps in information, the article tells the typical reader looking up the village on their phone all they want to know about Slatsnovgrad. It gives a C level of coverage for a minor village. An infobox with a picture and a better map would be nice, but these are not required. The article is mid-importance because the citations indicate that the subject has achieved notability, at least locally. It fills in minor details and may be of interest to readers other than social scientists who specialize in villages. Unfortunately, many reviewers would glance at the article, see just one paragraph on a small village, and give it Stub and Low importance ratings.

The example meets the needs of most readers, but there may be more to be said:

There is a ruined stone fort in the north of the village that was the birthplace and power base of Borg the Greedy (d. 1154), regent of Ruritania from 1143 to 1154 during the Second Turkish War.[4] ... The Soviet-era Slatsnovgrad Dam supplies the village with water and hydroelectric power.[5] ... The locally fermented kefir is said to have aphrodisiac properties.[6] ...

The additional details may be of interest to readers, but are not needed to achieve C class, since the casual reader would not know the details were missing. With this additional information, the article may have reached B class, or even A class if no sources give further information on the village. The serious student or researcher may be dissatisfied, but the article is a fairly complete treatment of the subject. There is no more to be said without indulging in original research. This leads to a paradox: the more information is available, the harder it is to get above C class.

To illustrate, suppose the village of Slatsnovgrad was founded shortly after the Second Turkish War to house peasants subject to Slatsnovka Abbey. The monks kept detailed records from the foundation of the village up to the revolution of 1923. The Ruritanian People's Republic published the complete records in 13 volumes between 1935 and 1939. Every birth, marriage and death is recorded, as is monthly weather, livestock numbers, crop yields, prices, building, road and irrigation works and detailed accounts of plagues, wars and rebellions. Several major academic books have been based almost entirely on the Annals of Slatsnovka, discussing what it reveals of different aspects of central European culture, history and economy. The Wikipedia article can never be more than a superficial overview of this huge trove of information. It cannot be considered "mostly complete" or "essentially complete", so must remain C class for ever.

Article life cycle
WikiProject assessment describes a smooth progression as an article moves step by step from Stub to Featured Article. The reality is different. The normal life cycle is "create as Stub or Start, then stagnate". The life cycle of a few select articles is "create as C, stagnate, upgrade to GA or FA, stagnate." (Controversial articles have a complex life cycle which is unrelated to quality assessments so not discussed here.)


 * Most articles on uncontroversial subjects are created with a series of edits, sometimes spread over several days.
 * Soon after a bot has put the new article onto project lists, a new article watcher rates it if the creator has not yet done so. Stub and Start are much the most common quality ratings, and "Low" the most common importance rating. The ratings are often incorrect and often premature.
 * The creator may continue to improve the article after the initial rapid assessment, but it is rarely re-assessed.
 * The article now enters a stagnant phase where various editors tweak spelling, punctuation, categories, links and so on, but add little real content. Editors working on related articles may add a sentence or two of more substantial content, but will usually leave the assessment unchanged.
 * A Stub may be nominated for deletion, prompting a rescue job and an upgrade to C class. This is not what the deletion process is for, but it happens.
 * An editor may take on the challenge of moving a C or B class article up to GA or FA status. There is a flurry of activity as editors add substantial content and make many copy edits, followed by approval of the upgrade. The article then becomes stagnant again.
 * Few articles are upgraded to A status, probably because of the lack of a recognition mechanism.

Ratings are used by bottom-feeding and top-feeding editors.
 * Bottom-feeding editors work through sets of Stub articles making the same enhancements to all of them, such as adding an infobox and basic data from a standard source. Their reward is knowing that they have added useful information to a lot of articles without doing any heavy-duty research. They rarely change the assessment.
 * Top-feeding editors browse among the B or C class articles, bringing them up to GA status, or try to bring GA articles to FA status. Their reward is bragging rights, and perhaps publication of "their" article on the front page.

One may question the value of developing an article to GA or FA status in an attempt to satisfy the serious student or researcher. Would any serious student or researcher use Wikipedia? Perhaps getting more articles up to C class, meeting the needs of most readers, gives greater payback. But most articles stay as Stub class for ever, or move to the Start class garbage can. A Start class article "... is quite incomplete ... might or might not cite adequate reliable sources ... is weak in many areas. Quality of the prose may be distinctly unencyclopedic, and MoS compliance non-existent ... needs substantial improvement in content and organisation. Also improve the grammar, spelling, writing style and improve the jargon use." No sane editor would want to fix up a mess like that.

Statistical analysis
Statistics for the English Wikipedia derived from Wikipedia:Version 1.0 Editorial Team/Statistics as of 2017-05-17 follow. Where an article has been rated for quality and/or importance by more than one project, the highest quality and importance ratings are used. Thus an article counts as high importance if it is high importance for WikiProject Furry even if it is low importance for WikiProject Anime and manga.
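The "highest rating wins" rule used for these statistics can be sketched as follows. This is a minimal illustration, not the code used to produce the statistics: the names `QUALITY`, `IMPORTANCE` and `combine_ratings` are hypothetical, and the quality ordering (with GA placed between B and A) is an assumption.

```python
# Assumed orderings, lowest to highest. Illustrative names only.
QUALITY = ["Stub", "Start", "C", "B", "GA", "A", "FA"]
IMPORTANCE = ["Low", "Mid", "High", "Top"]


def combine_ratings(ratings):
    """Combine per-project (quality, importance) pairs for one article.

    Quality and importance are maximised independently, so the two
    values in the result may come from different projects.
    """
    best_quality = max((q for q, _ in ratings), key=QUALITY.index)
    best_importance = max((i for _, i in ratings), key=IMPORTANCE.index)
    return best_quality, best_importance


# One project rates the article Start/High, another rates it C/Low.
print(combine_ratings([("Start", "High"), ("C", "Low")]))
```

Because the two maxima are taken separately, an article can be counted with a quality rating from one project and an importance rating from another.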

Counts of articles as of 2017-05-17 by quality rating and by importance:

Chart 1 shows the distribution of articles with different quality ratings across the various levels of importance. Most articles are considered low importance, or have not had their importance assessed, and almost all articles that have had their quality assessed are rated Stub or Start, meaning they do not meet the needs of most users. This may not be a serious issue. A Stub class article for an obscure subject may slightly annoy the rare reader who is searching for information on the subject, but otherwise does no harm. If most searches find articles that are C-class or above, Wikipedia is working well.

Chart 2 zooms in to show the distribution of top-, high- and mid-importance articles. Average quality is better for top- and high-importance articles than for mid-importance articles since project members are more likely to focus on improving the more important articles. Of the 51,011 top-importance articles there are only 4,243 Stub articles and 17,346 Start articles. Stub and Start articles still account for most mid-importance subjects, and by definition do not meet the needs of most users. This may not be a serious concern if, as is often the case, importance ratings are unrelated to levels of reader interest.

Chart 3 shows the distribution by importance of articles with quality GA and above. FA articles are most likely to be for top- or high-importance topics, while GA articles include more mid- and low-importance topics. There are relatively few A-class articles, perhaps due to the lack of reward for taking an article to this level. If an editor is going to make the effort to bring an article up to A, they may as well take it all the way to FA.

Educationalists have found students retain most interest in a subject when they score about 70% on tests. If they score much higher, they think the subject is boring. If they score much lower, they think it is too hard, and may give up. Well-designed academic tests aim for a median score around 70%. If we take C-class or above as a success, only 10% of editors succeed. We are desperately short of new editors. Possibly the criteria are too rigorous or the scoring is too harsh.