User:Uncle G/On sources and content

The underpinnings of Wikipedia's sourcing and content policies
Wikipedia has several basic sourcing and content policies. Of them, the Neutral Point of View, Verifiability, and No original research policies are not fundamental to the nature of an encyclopaedia. Indeed many well-known encyclopaedias have no such policies. They are, however, necessary consequences of the organization that develops, publishes, and maintains Wikipedia, and of the way that Wikipedia is written. What Wikipedia is not and its ancillaries are fundamental to the nature of an encyclopaedia.
 * Wikipedia takes no sides in debates. It has no opinion.
 * Readers must be able to check all content against sources outside of Wikipedia.
 * Everything in Wikipedia must have been through a process of fact checking, peer review, publication, and acceptance into the general corpus of human knowledge. This process occurs outside of Wikipedia.
 * The goal of the project is an encyclopaedia, not something else.

The rationales for our content policies
The underpinnings of our content policies lie in the non-profit nature of the Wikimedia Foundation, and in the nature of an encyclopaedia written using a wiki where anyone can create an account.

The Foundation and neutrality
The Wikimedia Foundation is a 501(c)(3) organization. As such, it is legally prohibited under United States law from performing certain political activities. (See 501(c)(3) for details.) The Foundation takes this further, and espouses adherence to the Neutral Point of View, which applies across the board in all fields, not just to politics. The NPOV is one of a handful of Foundation issues that applies without exception across all Foundation projects.

Thus one of Wikipedia's content policies is required to be that all encyclopaedia content be written from the neutral point of view. This policy can be found at Neutral point of view.

The Wikipedia model and sources and fact checking
Wikipedia's Verifiability and No Original Research content policies are consequences of the nature of how Wikipedia is written, and the implications that that has for what readers can trust.

Conventional, proprietary-content encyclopaedias are written by invited, sometimes paid, named expert authors selected by the publisher. Readers trust the encyclopaedias insofar as they trust the named expert authors. The trustworthiness and reliability of the encyclopaedia are based directly upon the scholarly reputations of its authors.

At various times, people have tried to create free content encyclopaedias using the same model (although almost never including paying the experts). They have all been relatively unsuccessful. Wikipedia employs a different model. Wikipedia is written using a wiki. Anyone can create an account, and people can edit even without creating accounts. There's no mechanism for vetting editors, for checking that editors are who they claim to be, or for checking that editors have the qualifications and skills that they claim to have. Similarly, Wikipedia also has no mechanism for fact checking. It has no procedures and mechanisms for testing, reproducing, and confirming primary research.

Therefore the trust model that Wikipedia provides to readers is not based on the reputations of its editors. Instead, Wikipedia's trust model is that readers can check for themselves all of the contents of articles, using the external sources cited in those articles. Readers can check that articles are right by seeing that they agree with the sources, and can trust that the sources are in turn right because what they contain has been through a process of fact checking, publication, peer review, and acknowledgement outside of Wikipedia. Readers are explicitly told this in the Risk disclaimer, which is linked from the bottom of every page on the project.

Wikipedia's content policies are designed to produce articles that follow that model. Thus everything in Wikipedia must be sourced (the Verifiability policy), and there are restrictions on what sources may be used based upon their provenances, i.e. whether what they contain has been fact checked, peer reviewed, published, and acknowledged as having become a part of the corpus of human knowledge (the No original research policy).

The mandate to be an encyclopaedia
The content policies grouped together under What Wikipedia is not, and its ancillaries such as Wikipedia is not a dictionary, are elaborations of the goal, set right from the start of the project, of being an encyclopaedia, as opposed to something else.

Discussion of some common canards
There are various misunderstandings of our content and sourcing policies that editors sometimes have.

There are no exceptions to everything
One common canard is that there exists content that "should not be sourced". This canard is based upon the false premise that it is somehow difficult or cumbersome to source content that is "common knowledge". The argument is that because it is somehow difficult or cumbersome to source such content, it is exempt from the requirement that everything must be sourced. This is untrue. There are no exceptions to everything.

The premise is false for several reasons.

The first problem with the premise is its assumption that there is such a thing as "common knowledge" at all. Wikipedia addresses a global readership. Many examples put forward of "common knowledge" are, in fact, not common knowledge to Wikipedia's entire readership, and the assumption that they are common knowledge is itself an instance of systemic bias. One example of "common knowledge" that has been presented in such arguments is the datum that "Paris is a city in France". But, in fact, knowledge of the names of cities in foreign countries is one thing that is famously not common. Such a datum is not something that all readers everywhere will know.

The second problem with the premise is the assumption that such data are difficult or cumbersome to source. In fact, they are easy to source. Taking again the example that "Paris is a city in France", one can consult Paris and see that there are several sources already cited in the article against which readers can check that fact.

An overall problem with the argument is that working on the basis that there is a class of content that is "too obvious to require sourcing" actually results in a bad encyclopaedia. "Common knowledge" that is "obvious" is often wrong, whereas requiring that everything be sourced leads to a correct, if to many people surprising, encyclopaedia. A classic example of this is "The sky is blue". As Dpbsmith so eloquently points out, writing an encyclopaedia based upon this "common knowledge" actually leads to a factually incorrect encyclopaedia, because, for one thing, the sky is not always blue: {{talk quote block|text=Your example is poorly chosen, first because the sky is not always blue and therefore this is a "fact" that is not really quite true, and secondly, because as is so often the case of things that are "common knowledge," it is very easily sourced. Rather than fussing about the "fact" tag, why not just say:
 * A field guide notes that "the blue sky is so commonplace that it is taken for granted". It is a deep, saturated blue after a rainstorm.

One can go on to add:
 * The poet Robert Service says "while the blue sky bends above/You've got nearly all that matters". Songwriter Irving Berlin wrote of "Blue Skies smiling at me," airmen fly into the wild blue yonder. But the sky is not always blue. In the Bible, Jesus says to the Pharisees "When it is evening, ye say, It will be fair weather: for the sky is red". At twilight, salmon reds, oranges, purples, white-yellows, and many shades of blue can be seen. And songwriter Oscar Hammerstein famously wrote of "when the sky is a bright canary yellow."

It took me less than ten minutes to turn up the Schaeffer and Minnaert sources and another fifteen to find the rest. If something is really a commonly known fact, it is just not that hard to source.}}

The requirement is only that the sources be cited somehow
Another common canard is to conflate the citing of sources with the linking of individual cited sources to individual parts of an article. From this error stem fallacious arguments about how requiring that every single sentence be sourced results in excessive referencing.

The verifiability requirement is merely that all sources be cited, and that no content be included for which there is no source. A citation is the raw information necessary for readers to uniquely identify, and to locate, a source. As long as that information is present in the article, it is possible for a reader to check the article contents. An article is verifiable if all of its content can be found, by readers, in one or more of the sources that it cites.

Citing sources is a style guide, not a policy. The cross-linking of specific source citations to specific parts of an article is a matter of house style, not of verifiability. An article that doesn't do this is not unsourced. It simply isn't spoon-feeding readers the exact source(s) to check for each individual part of the article. In general, it is an improvement to an article to link sections, or even paragraphs, of the article content to specific citations, using <ref> or Harvard referencing. It makes checking articles against the sources that they cite easier for both readers and editors. But precisely how such links are made, a matter that varies from article to article (because it depends on the exact content and sources of each article), is a matter of Wikipedia house style, not a matter of content policy.

Requiring that everything be sourced is not the same as requiring that the linkage between content and citations always be at the level of individual sentences.
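
To illustrate the distinction, here is a minimal wikitext sketch of the same sentence cited in two ways. The source details (author, title, publisher, year, page) are placeholders invented for illustration, not a real reference, and the section layout is only one common arrangement.

```wikitext
<!-- Style 1: general citation. The source is listed in the article,
     but is not tied to a specific sentence. -->
Paris is a city in France.

== References ==
* Example, A. (2001). ''An Example Atlas of Europe''. Example Press.

<!-- Style 2: the same (placeholder) citation linked to the sentence
     that it supports, using <ref>. -->
Paris is a city in France.<ref>Example, A. (2001). ''An Example Atlas of Europe''. Example Press. p. 12.</ref>

== References ==
<references />
```

Both versions cite the source, so both satisfy the requirement described above. The second simply makes it easier for a reader to find which source to check for that sentence.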

Being "wiki" does not mean being first
One common canard is to read discussions of the advantage that Wikipedia has over other encyclopaedias, of being rapidly updated when things change, and to infer from that the erroneous conclusion that Wikipedia is the first place to come to document new things. It is not.

One implication of our content policies is that Wikipedia must never be the first to publish anything new. Wikipedia is not the place to come to document the previously undocumented, to report new discoveries, to publish new theories, or to report news.

The project for reporting news is Wikinews. Everything in Wikipedia must have been through a process of fact checking, peer review, publication, and acceptance into the general corpus of human knowledge. In the case of news, that process is performed by journalists and editors. Current events belong in Wikipedia only after they have been reported. News is not part of the corpus of human knowledge before it is reported.

The places for publishing new theories and new discoveries are the appropriate scholarly journals. For new discoveries and theories, it is those journals that perform the process of fact checking, peer review, and publication, and from those journals that new things are accepted into the general corpus of human knowledge.

Tips for editors
The proper study of encyclopaedists
The proper study of encyclopaedists is finding, reading, evaluating, and using sources.

Always work from and cite sources
Always have sources to hand when creating and expanding articles. Don't write articles based upon your own personal hypotheses and inferences. Don't write articles based upon knowledge that you half-remember learning, but have no idea from where or from whom. Write articles based upon actual, concrete, sources, and ensure that the article cites those sources. If you half-remember something, go and hunt up a source that covers it first, then write.

Always cite the sources that you used, preferably in the very first edit. Tell readers and editors where your knowledge of a subject is coming from. Showing the sources that you used, and that you are indeed working from sources in the first place, neatly avoids many pitfalls that editors can find themselves falling into:
 * Articles that cite sources are rarely even nominated for deletion, let alone deleted.
 * Citing multiple, independent, non-trivial sources in articles on bands, people, groups, companies, web sites, and so forth avoids any questions of notability, since it becomes apparent directly from the references and further reading sections of the article that the relevant notability criteria are satisfied.
 * If you cite sources, the content that you submit will be stable and remain in the encyclopaedia. In contrast, content that has no sources is subject to removal at any point by any editor.

Even if you don't use some sources for actually constructing the article content, cite them in the article in a "further reading" section so that they are available to other editors. These can then be used to expand the article later on, moving the citations from the "further reading" section into the "references" section.

How to deal with unsourced content
There is no single, well-defined, step-by-step flowchart for dealing with unsourced content. How to deal with unsourced content varies from case to case. However, the following principles should be kept in mind:
 * Bad content damages Wikipedia. If some unsourced content is glaringly wrong, it should be removed from the article.  Optionally, one can bring it up for discussion on the article's talk page.
 * Libellous content damages Wikipedia. If an article contains unsourced, controversial content about a person, then that content must be removed from the article. It must not be moved to the talk page.
 * Wikipedia is a collaborative project. There is a whole range of cleanup templates available for use, and every article has its own talk page.
 * The goal is to improve Wikipedia. Removing material and then demanding that other editors find sources entails fuss. If you yourself hunt up and cite an additional source that provides a basis for some currently unsourced material, that improves Wikipedia with a lot less fuss.

Remember that readers don't trust you
One popular mistake occurs when an article contains factual errors that are nonetheless properly sourced: editors who personally know the truth correct the article so that it disagrees with what the sources say. Another popular mistake is to add content that an editor knows to be true, but also knows cannot be found in any source. These are in fact two sides of the same coin.

Articles must always reflect what the sources say, and no more. Wikipedia's trust model does not involve relying upon the sole word of Wikipedia editors. Always think of such situations from a reader's perspective. When faced with a group of fact-checked, published sources that say one thing, and a Wikipedia editor, about whom they can know nothing, who asserts something contradictory, readers trust the sources, not the Wikipedia editor.

Therefore, if something is sourced but wrong, the correct ways to tackle the issue are:
 * Contact the sources and have them publish a correction. Don't correct the encyclopaedia; correct the source.
 * Find, cite, and use another, better, source that provides better information or that demonstrates why all of the other sources are wrong. Counter sources with more sources.
 * Publish your original research, providing new, never-before-published information that contradicts what has heretofore been published, in the appropriate venue outside of Wikipedia, such as a scholarly journal or a book. Create a source yourself. However, note that you may then have a conflict of interest when citing it. Please see also Conflict of interest.

Evaluating sources
One of the things that both readers and editors must do is consider the source. There are very few hard and fast rules for what constitutes a good source, since it very much varies from case to case. However, there is a general set of questions that must be answered when editors evaluate a source:
 * Who wrote and who published it? : A named author, and a named publishing organization, have reputations to protect, and those reputations can be checked. Forms of publication where authentication is unreliable or impossible must be avoided.  Some examples: Usenet postings are trivially easy to forge in other people's names, as are electronic mail messages.  Web pages with unidentifiable or pseudonymous authors/publishers cannot be trusted.  And the same trust considerations, about lack of mechanisms to reliably identify editors and to check that they are who they claim to be, apply to other wikis just as much as they apply to Wikipedia.
 * Was fact the goal? : The author and publisher should have had the intention of publishing a factual work. Jokes, comedy, satire, and so forth are not factual sources about the world.  The goal of a television comedy programme, for example, is entertainment, not the documentation of human knowledge.
 * How and by whom was it fact checked? : How many other people checked the source before publication is important.  For example: Something that someone publishes directly on their own web site is less likely to have been fact checked by other people before publication than an article written by a journalist and published in a newspaper, not least because the journalist has to get the article past at least one other person in order to have it published: an editor.  Similarly, a few academic journals have what amounts to an "accept all comers" policy, where peer review is seen on the level of "letters to the editor" after publication; whereas other academic journals have rigorous peer review processes that occur prior to publication.
 * When was it published? : Human knowledge changes.  Contemporary scholarship is more likely to reflect current knowledge than older scholarship.  But, conversely, in many cases it is the older scholarship that is still the accepted authority.
 * How many other sources on the subject exist, and how many agree and disagree with it? : Evidence that something has been acknowledged by people other than the subject's own proponents/creators/authors/inventors, and thus become a part of the corpus of human knowledge, is that sources from those other people also exist.  Evidence that a source is correct is that other people have performed research in the same area, and published material that concurs.
 * How directly and in what level of detail does it address the subject at hand? : The depth of a source is important.

Sources in languages other than English
In the case where the original source material is not in the English language, there is a tension between accessibility and verifiability:


 * On the one hand, this is the English language Wikipedia. Readers may not be able to read source materials in other languages, and thus require translations of those source materials into English so that they can read them.
 * On the other hand, translations, whether performed by a Wikipedia editor directly when writing content or performed by a separate translator who has published a translation of the source material, are inherently subject to error. A translation may be inaccurate or misleading. Readers require the ability to verify what the original source material actually said in its original language. Furthermore, readers require the ability to verify that the source material was published by a credible source and peer reviewed, and thus require information about the original source.

Therefore, where the original source material is in a language other than English:


 * Cite the original source in the original language, so that readers and editors can evaluate the reliability and credibility of the original source, can determine whether the original source was peer reviewed, and can verify that the article content is supported by the source material.
 * If any published translations of that material into English are used, cite them, so that readers and editors can evaluate the reliability and credibility of the translator, can verify the accuracy of the translation (by cross-checking with other published translations, by translating themselves, and by examining the reputation of the translator), and can verify that the article content is in fact supported by the translation used.
 * Where sources are directly quoted, prefer published translations, where the reliability and credibility of the translator can be evaluated by readers, over performing one's own translations directly. (If a published translation is erroneous, then the error can be pointed out. Cite sources for such errata, of course.)
 * When directly quoting a non-English source using one's own translation into English, without reference to a published translation, always cite (or outright supply) the original quote in the original language, so that readers can check the translation made.
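
As a rough sketch of the last point, a citation that supplies both the original-language quote and the editor's own translation might look like the following. It assumes the Citation Style 1 `cite book` template and its `language`, `trans-title`, `quote`, and `trans-quote` parameters; check the template's documentation before relying on these names. The author, title, publisher, and quotation are placeholders invented for illustration, not a real work.

```wikitext
<!-- Hypothetical source: every value below is a placeholder. -->
<ref>{{cite book
 |last=Exemple |first=A.
 |title=Titre d'exemple
 |trans-title=Example title
 |language=fr
 |publisher=Éditions Exemple
 |year=1999
 |quote=Texte original de la citation.
 |trans-quote=Original text of the quotation.
}}</ref>
```

Keeping the original wording next to the translation lets readers who know the language check the translation for themselves.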