Wikipedia:Qualify evidence

Evidence should be put in context, or qualified in Wikipedia articles, especially for sources that should not be taken at face value. Some articles are tagged with qualify evidence, and this page helps describe what to do for them, and how to treat factual claims in general.

Things to check for scientific studies mentioned in articles:
 * Have other studies attempting to replicate the results succeeded, contradicted, or been inconclusive? If no replication has been attempted, that is important to mention in the article because it means the results are significantly less trustworthy.
 * What do secondary sources say about the claims? Is there scientific consensus? Are there multiple lines of evidence all pointing in the same direction? Are there competing theories or other possible explanations not mentioned by the study? Are these tested or untested? Are the findings generally disputed or dismissed? Sometimes primary sources contradict each other; see Identifying and using primary sources.
 * Correlation does not imply causation; articles should be careful not to imply results reporting a correlation have found the cause of a disease or other phenomenon.
 * Be especially careful with medical studies (more specific advice below).

In general:
 * Don't report scientific claims as facts unless there's a scientific consensus supported by secondary sources; otherwise, they need to be attributed to specific studies.
 * Predictions about future events (such as economic trends) – even those grounded in studies – are generally not encyclopedic, but some topics about the future (such as global warming and the ultimate fate of the universe) are appropriate, when sourcing is comprehensive and balanced. See WP:CRYSTAL for details.
 * Expert opinions unsupported by rigorous study should be avoided, or heavily qualified.
 * Watch out for sources representing only one perspective on a controversial question.

Editors may need to:
 * Add details from the cited sources that clearly and precisely help readers understand the limitations of the source material.
 * Add material from other sources that give context or which are more up to date.
 * Caution readers that studies that do not have scientific consensus might not be reliable. For example, link to levels of evidence, explain the strengths and weaknesses of a certain type of evidence, or quantify how likely something is to pan out (e.g. how many drug candidates become approved drugs, or how many psychology studies fail attempted replication).
 * Remove sources that turn out to be unreliable.
 * Become familiar with the scholarly methods of the specific field.

Expert opinions are often wrong
Wikipedia generally avoids trying to predict the future, but in some cases does discuss things that are have not actually happened. In medical levels of evidence, expert opinion is considered the least reliable, when compared with conclusions supported by epidemiology or experiment. In general, including fields like business and government, expert opinions are wrong more often than not. (See e.g. Wrong by David H. Freedman.)

When reporting expert opinions, Wikipedia editors should seek out multiple independent perspectives. When there is only one source, that should be carefully reported as unconfirmed and the source should be characterized. For example, is a report on Internet traffic patterns and likely future bottlenecks something published by a business which has a stake in building more telecommunications services, or was it an impartial government report? Is a prediction about the number of deaths that would be caused by a certain type of earthquake made by a group seeking donations for preparedness, a government agency prioritizing resources, or a business that earthquake-proofs buildings?

Leaping to conclusions
Correlation does not imply causation. A study that shows the presence of peanut butter correlates with the presence of jelly does not imply that smearing peanut butter on a slice of bread will cause jelly to appear.

When describing studies that prove or suggest correlations, Wikipedia should make clear that this does not prove causation (unless there are other studies or sources which support causation).

Single study syndrome
Wikipedia should make a special effort to avoid "single study syndrome" - reporting the results of a new study at face value as if it were a reliable, true, and complete picture of the subject. Keep in mind that:


 * Many or most scientific studies cannot be reproduced. Stats and examples:
 * Replication crisis
 * Reproducibility
 * Retraction
 * Any given single study could simply be mistaken due to random chance, even if conducted in a reliable manner.
 * Any given single study be flawed, for example due to poor methodology, conflict of interest, or fraud. (Consider the randomized controlled trial that accurately reported that parachutes did not reduce injury compared to an empty backpack for 23 volunteers who jumped from aircraft...which were parked on the ground.)
 * Advocates with a political agenda and promoters of fringe theories tend to cherry-pick facts and studies that support their own opinions, and this can show up in generally neutral, reliable sources

Systems like living bodies, the human mind, and societies are extremely complicated, and a single study might not take all of that complexity into account (often because the true nature of the system has yet to be discovered). For example, a sociology study conducted at noon might conclude that Californians frequently travel outside their homes, whereas one conducted at 3am might conclude they almost never leave home. Over time, secondary social science sources should become aware of these conflicting studies; if that's the state of knowledge at the moment, Wikipedia should report the conflict rather than decide the issue one way or another. With even more time and more studies, sociologists might resolve the conflict by deducing that Californians tend to leave home during the day and stay home at night because they are asleep. When a new consensus emerges in secondary and tertiary sources, Wikipedia should report that. It should also beware of primary sources that challenge the consensus - expert evaluation is needed as to whether they clearly indicate a new conflict vs. spurious or mistaken results.

Science
The scientific method demands reproducibility of results before concluding that a given claim is proven to a high degree of reliability. When citing primary sources like studies, Wikipedia should make clear where a given claim is in the social process of gathering evidence for and against. Science is a social enterprise; the elementary school "observation, hypothesis, experiment, conclusion" is only the beginning of the real process!

Medicine
Identifying reliable sources (medicine) has the full details, but essentially:
 * Avoid primary sources that are unsupported by secondary sources
 * Avoid new research that has not been scientifically reviewed (even if published in a peer-reviewed journal)
 * Summarize scientific consensus, giving due weight to dissenting views
 * Describe the level of evidence of a given source
 * Avoid over-emphasizing single studies, particularly in vitro or animal studies
 * Use up-to-date evidence

Remember that even clinical trials in humans have phases. Only about 18% of new drug candidates pass Phase II and only about 50% of those that make it to Phase III are approved.

Link to the article levels of evidence so readers can understand what it means that a study hasn't been reproduced, or that it used a randomized controlled trial vs. epidemiological study vs. animal experiment. (Good example: Postperfusion_syndrome)

Where medical claims are unsupported or where the cited source is primary (thus lacking context) or not authoritative on medical matters (e.g. popular press):
 * Medref - Whole sections or articles
 * Medical citation needed - Inline
 * Unreliable medical source - Inline

Examples of dealing with contradictory results or shaky evidence:
 * Bulletproof diet (cf. Bulletproof Coffee)

Psychology
The Reproducibility Project could reproduce only 36% of the published, peer-reviewed psychology experiments it tested.

Natural sciences

 * Identifying reliable sources (natural sciences)

History

 * Historical method