User:Nikkimaria/How to spotcheck

Spotchecking sources is the process of checking an article's content for adherence to reliable sources (as required by WP:V) and avoidance of copyright violations, plagiarism and close paraphrasing. These checks are frequently done during an article's featured article candidacy, but may be done at any point. This is a brief how-to guide to completing spotchecks.

Verifiability
The first step is to ensure that all content in the article requiring a citation has one. A very rough rule of thumb is a minimum of one source per paragraph; however, most good-quality articles have a much greater citation density. In general, anything contentious or likely to be challenged, and any quotations, must be cited; common knowledge (such as the oft-used example, "The sky is blue") need not be cited. Additionally, the citations present must have sufficient information to allow the content to be verified. This means that book citations should have page numbers and web citations should not have dead links. In general, the citations should be as complete and accurate as possible, and at minimum must allow interested editors to locate the source used.
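As a rough illustration of the completeness check described above, the following Python sketch flags citations that lack the details needed to locate the source. The record format (a dictionary with `type`, `title`, `page`, `url` and `link_dead` keys) and the function name are hypothetical and are assumed only for this example; they do not correspond to any Wikipedia tool or citation template.

```python
def missing_details(citation):
    """Return a list of problems found in one citation record.

    The record format is a hypothetical dict assumed for illustration.
    """
    problems = []
    kind = citation.get("type")
    if not citation.get("title"):
        problems.append("citation has no title")
    # Book citations should point to a specific page.
    if kind == "book" and not citation.get("page"):
        problems.append("book citation has no page number")
    # Web citations need a URL that still works.
    if kind == "web":
        if not citation.get("url"):
            problems.append("web citation has no URL")
        elif citation.get("link_dead"):
            problems.append("web citation link is dead")
    return problems


example_citations = [
    {"type": "book", "title": "A History of Bridges", "page": "12"},
    {"type": "book", "title": "Rivers of the North"},
    {"type": "web", "title": "Bridge opens", "url": "http://example.org/a", "link_dead": True},
]

for record in example_citations:
    print(record["title"], "->", missing_details(record) or "looks complete")
```

A check of this kind only shows that a citation could be followed up; whether the source actually supports the text still has to be checked by reading it.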

Once these basic requirements for verifiability have been met, compare the article to the sources. Given that it is often not possible to check every single source, a variety of approaches can be taken to spotcheck:
 * Read the article and look for surprising or striking facts, or possible breaks in logic.
 * Pick one interesting paragraph or section and check it in its entirety.
 * Pick one source that is cited multiple times and check every citation to it.
 * Select citations to check at random.

No matter what method is chosen, enough citations should be checked to provide a reasonably representative sample of the entire article. A single checked sentence that is not supported by its citation may indicate wider problems in the article, but a single checked sentence that is supported should not be taken as proof that the article represents its sources accurately throughout.
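Two of the selection approaches above, checking every citation to a frequently cited source and selecting citations at random, can be sketched in Python. This is a minimal sketch only: the citation records, their keys (`ref`, `source`, `page`) and both function names are assumptions made for this example.

```python
import random
from collections import Counter


def pick_random_citations(citations, k, seed=None):
    """Return up to k citations chosen at random, without repeats."""
    rng = random.Random(seed)  # a fixed seed makes the selection repeatable
    return rng.sample(citations, min(k, len(citations)))


def citations_to_most_cited_source(citations):
    """Return every citation to the source that is cited most often."""
    if not citations:
        return []
    counts = Counter(c["source"] for c in citations)
    top_source, _ = counts.most_common(1)[0]
    return [c for c in citations if c["source"] == top_source]


# Hypothetical citation list for an article.
citations = [
    {"ref": 1, "source": "Smith 2001", "page": "12"},
    {"ref": 2, "source": "Jones 1998", "page": "40"},
    {"ref": 3, "source": "Smith 2001", "page": "15"},
    {"ref": 4, "source": "Lee 2010", "page": "7"},
    {"ref": 5, "source": "Smith 2001", "page": "88"},
]

print([c["ref"] for c in citations_to_most_cited_source(citations)])  # [1, 3, 5]
print([c["ref"] for c in pick_random_citations(citations, 3)])
```

Recording the seed used for a random selection lets another reviewer reproduce the same sample later.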

Copyvio/paraphrasing
A potential starting point for checking for copyvio and plagiarism is the use of automated tools, such as User:CorenSearchBot/manual or Earwig's tool. There are also a variety of non-Wikipedia-based tools available for this purpose, such as Plagiarism Checker or Viper. Automated tools can quickly locate blatant plagiarism or copyvio. However, there are two important caveats. First, they can sometimes produce false positives through comparison to a mirror site; sites reported as copies should be checked to verify that they do not copy material from Wikipedia or, if their provenance is unclear, that the site predates the Wikipedia article. Second, they can often produce false negatives, either when the article is plagiarized from a non-web source or when it is closely paraphrased but not directly copied. For these reasons, automated tools cannot be relied upon as the only check for close paraphrasing. Additionally, checking for close paraphrasing manually allows this check to be combined with that for verifiability.
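The false-negative caveat can be illustrated with a simple text-matching sketch in Python. This is not how any of the tools named above work internally; it is an assumed, deliberately naive approach (counting shared five-word sequences), and the function names and sample sentences are invented for the example.

```python
import re


def word_ngrams(text, n=5):
    """Return the set of n-word sequences in text, ignoring case and punctuation."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def shared_ngram_ratio(article_text, source_text, n=5):
    """Fraction of the article's n-word sequences that also occur in the source."""
    article = word_ngrams(article_text, n)
    if not article:
        return 0.0
    return len(article & word_ngrams(source_text, n)) / len(article)


source = "The bridge was completed in 1932 after four years of difficult construction work."
copied = "The bridge was completed in 1932 after four years of difficult construction work."
paraphrase = "The bridge was finished in 1932 following four years of hard construction work."

print(shared_ngram_ratio(copied, source))      # 1.0: direct copying is caught
print(shared_ngram_ratio(paraphrase, source))  # 0.0: close paraphrase is missed
```

The second sentence follows the source's structure almost word for word, yet exact matching reports no overlap at all, which is why a manual comparison against the source remains necessary.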

As with verifiability, there are a number of approaches to take in manually checking for close paraphrasing:
 * The reading method: read the article and look for phrases that do not fit with the surrounding prose. Indicators can include a sudden shift in tone, a sudden change in language use, or anything that makes a particular phrase or section seem distinct from the rest of the article. Be aware that shifts in language can also result from co-authored or heavily copy-edited articles.
 * The Google method: input phrases into a search engine, either randomly or targeting specific sections that seem suspicious. This method is slightly more useful than automated tools, but has similar drawbacks.
 * The availability method: check some or all sources that are available to you. This is useful for articles heavily based on offline or subscription sources.
 * The targeted method: check either an entire paragraph or section, or all content cited to a heavily used source.
 * The random method: randomly check a selection of sources or sections.

These methods can be combined for more targeted spotchecking. As with verifiability checks, enough content should be checked to give a representative sample of the article.