Wikipedia:WikiProject WikiFundi Content/Wikipedia:Plagiarism

Plagiarism is taking credit for someone else's writing as your own, including their language and ideas, without providing adequate credit. The University of Cambridge defines plagiarism as: "submitting as one's own work, irrespective of intent to deceive, that which derives in part or in its entirety from the work of others without due acknowledgement."

Wikipedia has three core content policies, of which two make it easy to plagiarize inadvertently. No original research prohibits us from adding our own ideas to articles, and Verifiability requires that articles be based on reliable published sources. These policies mean that Wikipedians are highly vulnerable to accusations of plagiarism, because we must stick closely to sources, but not too closely. Because plagiarism can occur without an intention to deceive, concerns should focus on educating the editor and cleaning up the article.

Sources are annotated using inline citations, typically in the form of footnotes (see Citing sources). In addition to an inline citation, in-text attribution is usually required when quoting or closely paraphrasing source material (for example: "John Smith wrote that the building looked spectacular," or "According to Smith (2012) ..."). The Manual of Style requires in-text attribution when quoting a full sentence or more. Naming the author in the text allows the reader to see that the text relies heavily on someone else's ideas, without having to search in the footnote. You can avoid inadvertent plagiarism by remembering these rules of thumb:

Plagiarism and copyright infringement are not the same thing. Copyright infringement occurs when content is used in a way that violates a copyright holder's exclusive right. Giving credit does not mean the infringement has not occurred, so be careful not to quote so much of a non-free source that you violate the non-free content guideline. Similarly, even though there is no copyright issue, public-domain content is plagiarized if used without acknowledging the source. For advice on how to avoid violating copyright on Wikipedia, see Copyright violation. For how to deal with copying material from free sources, such as public-domain sources, see below.

Forms of plagiarism
Plagiarism is presenting someone else's work – including their language and ideas – as your own, whether intentionally or inadvertently. Because it can happen easily and by mistake, all editors are strongly advised to actively identify any potential issues in their writing. Plagiarism can take several forms.

Free and copyrighted sources

 * The above example is the most egregious form of plagiarism and the least likely to be accidental.


 * This can look as though the editor is trying to pass the text off as their own. It can happen by accident when inline citations are moved around during an edit, losing text-source integrity. It can also happen when editors rely on general references listed in a References section, without using inline citations.


 * Summarizing a source in your own words does not in itself mean you have not plagiarized, because you are still relying heavily on the work of another writer. Credit should be given in the form of an inline citation.

Copyrighted sources only

 * Here the editor is not trying to pass the work off as their own, but it is still regarded as plagiarism, because the source's words were used without in-text attribution. The more of the source's words that were copied, and the more distinctive the phrasing, the more serious the violation. Adding in-text attribution ("John Smith argues ...") always avoids accusations of plagiarism, though it does not invariably avoid copyright violations. See Respecting copyright below for more on using copyrighted sources. Be cautious when using in-text attribution, because it can lead to other problems. For example, "According to Professor Susan Jones, human-caused increases in atmospheric carbon dioxide have led to global warming" might be a violation of NPOV, because this is the consensus of many scientists, not only a claim by Jones. In such cases, plagiarism can be avoided by summarizing information in your own words or acknowledging explicitly that while the words are from Jones, the view is widespread.

Avoiding plagiarism

 * To avoid plagiarizing text copied from compatibly licensed copyleft publications and public-domain publications, see also the section below: Copying material from free sources.

You can avoid plagiarism by summarizing source material in your own words followed by an inline citation, or by quoting or closely paraphrasing the source, usually with in-text attribution (adding the author's name to the text) and an inline citation. The following examples are adapted from "What Constitutes Plagiarism?", Harvard Guide to Using Sources, Harvard University:

Respecting copyright
Regardless of plagiarism concerns, works under copyright that are not available under a compatible free license must comply with the copyright policy and the non-free content guideline. This means they cannot be extensively copied into Wikipedia articles. Limited amounts of text can be quoted or closely paraphrased from non-free sources if such text is clearly indicated in the article as being the words of someone else; this can be accomplished by providing an in-text attribution, and quotation marks or block quotations as appropriate, followed by an inline citation.

Translating
If the source is in a language other than English, the contributor may be under the mistaken belief that the act of translation is a sufficient revision to eliminate concerns of plagiarism. On the contrary, regardless of whether the work is free, the obligation remains to give credit to authors of foreign-language texts for their creative expression, information and ideas, and, if the work is non-free, direct translation is likely to be a copyright violation as well.

What is not plagiarism
Charles Lipson states that all plagiarism rules "follow from the same idea: acknowledge what you take from others. The only exception is when you rely on commonly known information." Plagiarism is less of a concern where the content lacks creativity and the facts and ideas being offered are common knowledge. Here are some examples where in-text attribution is generally not required, though you may still need to add an inline citation:
 * use of common expressions and idioms, including those that are common in sub-cultures such as academia;
 * phrases that are the simplest and most obvious way to present information; sentences such as "John Smith was born on 2 February 1900" lack sufficient creativity to require attribution.
 * simple, non-creative lists of information that are common knowledge. If the list is drawn from another source (i.e., it is not common knowledge), or if creativity has gone into producing a list by selecting which facts are included, or in which order they are listed, then reproducing the list without citing its source may constitute plagiarism.
 * mathematical and scientific formulae that are part of the most basic and general background knowledge of a field, such as E = mc² and F = ma (though even in these cases a citation may be best practice, for deeper reader understanding);
 * simple logical deductions.

Copyright violations
If you find duplicated text or media, consider first whether the primary problem is plagiarism or copyright infringement. If the source is not in the public domain or licensed compatibly with Wikipedia, or if you suspect that it is not, you should address it under the copyright policies.

How to find text plagiarism
Several signs can help detect plagiarism. Plagiarized text often shows a sudden change from an editor's usual style and tone and may appear more advanced in grammar and vocabulary. Plagiarized material may contain unexplained acronyms or technical jargon that has been described in an earlier part of the plagiarized document. Because plagiarized material was written for other purposes, it is often un-encyclopedic in tone. An editor who plagiarizes multiple sources will appear to frequently and abruptly change writing styles.

An easy way to test for plagiarism of online sources is to copy and paste passages into a search engine. Exact matches, or near matches, may be plagiarism. When running such tests, be aware that other websites reuse content from Wikipedia. A list of identified websites which do so is maintained at Mirrors and forks. It is usually possible to find the exact version in article history from which a mirror copy was made. Conversely, if the text in question was added in one large edit, and the text closely matches the external source, this is an indication of direct copying. When in doubt, double check search engine results with an experienced Wikipedian.

Another option is to utilize a plagiarism detector, such as those found at Category:Plagiarism detectors. Plagiarism detection systems, some of which are freely available online, exist primarily to help detect academic fraud. Wikipedia does not endorse, or recommend, any external services, so your own experience will be the guide.

It can also be useful to perform a direct comparison between cited sources and text within the article to see if text has been plagiarized, including too-close paraphrasing of the original. Here it should be borne in mind that an occasional sentence in an article that bears a recognizable similarity to a sentence in a cited source is not generally a cause for concern. Some facts and opinions can only be expressed in so many ways and still be the same fact or opinion. A plagiarism concern arises when there is evidence of systematic copying of the diction of one or more sources across multiple sentences or paragraphs. In addition, when dealing with non-free sources, be sure that any appropriated creative expressions are marked as quotations.
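The direct comparison described above can be roughly approximated in code. The following is only a minimal sketch, not a tool endorsed by this guideline: it flags article sentences that share a run of consecutive words with a source, and treats the overlap as worth a closer look only when several sentences are affected. The function names, the five-word run length, and the two-sentence threshold are all assumptions chosen for illustration.

```python
import re


def word_ngrams(text, n=5):
    """Return the set of lowercase word n-grams in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def overlapping_sentences(article, source, n=5):
    """Return article sentences sharing at least one word n-gram with the source."""
    source_grams = word_ngrams(source, n)
    return [s for s in split_sentences(article) if word_ngrams(s, n) & source_grams]


def looks_systematic(article, source, n=5, min_sentences=2):
    """Assumed threshold: a single similar sentence is not treated as a concern."""
    return len(overlapping_sentences(article, source, n)) >= min_sentences
```

A positive result is only a prompt for human review: common expressions and simple statements of fact can match without being plagiarism, and properly marked quotations will match by design.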

Addressing the involved editor
If you find an example of plagiarism where an editor has copied text, media, or figures into Wikipedia without proper attribution, contact the editor responsible, point them to this guideline, and ask them to add attribution. Attribution errors may be inadvertent; intentional plagiarism should not be presumed in the absence of strong evidence. Start with the assumption of good faith; contributors may not be familiar with the concept of plagiarism. It may be helpful to refer them to Verifiability, Citing sources, and/or Help:Citations quick reference. Editors who have difficulties or questions about this guidance can be referred to the Help Desk or media copyright questions.

As well as requesting repair of the example you found, you may wish to invite the editor to identify and repair any other instances of plagiarism they may have placed before becoming familiar with this guideline. If an editor persists in plagiarizing, report the editor to the administrators' noticeboard. Be sure to include diffs that show both the plagiarism and the warnings.

Repairing text plagiarism
It may not always be feasible to contact the contributor. For example, an IP editor who placed text three years ago and has not edited since is unlikely to be available to respond to your concerns. Whether or not you are able to contact the contributor, you can rewrite the copied material, provide attribution, or add a source yourself. Material that is plagiarized but which does not violate copyright does not need to be removed from Wikipedia if it can be repaired. Add appropriate source information to the article or file page, wherever possible. With text, you might move unsourced material to an article's talk page until sources can be found.

How to find media plagiarism
This can begin with a commonsense question: does it seem likely that the uploader is the original source? The person who scans an image from an 1825 textbook on herbs is unlikely to be the author, even if they have claimed PD-self. Sometimes doubts may be triggered by the professional quality of media, or by the exclusivity. If you suspect plagiarism, try to locate the original source through an online search engine such as Google Image Search. Other factors to consider include the editing history of the uploader and, with images, image metadata, such as Exif and XMP.

Frequently, a person who uploads and claims credit for another's image will leave the original image metadata, or a visible or invisible digital watermark, in place. If the author information conveyed by the metadata, or watermark, contradicts the author information on the image description page, this is a sign the image requires investigation. A user's original photographs can also be expected to have similar metadata, since most people own a small number of cameras; varied metadata is suspicious. Suspicions based on metadata should be checked with other editors experienced with images and other media.
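As an illustration of the two metadata checks just described, here is a minimal sketch. It assumes the metadata has already been extracted (for example from Exif fields such as Artist and Model) into plain dictionaries; the key names `file`, `embedded_author`, `claimed_author`, and `camera_model`, and the `max_cameras` threshold, are hypothetical and not part of any Wikipedia tool.

```python
def author_mismatches(uploads):
    """Return file names whose embedded author differs from the claimed author."""
    flagged = []
    for item in uploads:
        embedded = (item.get("embedded_author") or "").strip().lower()
        claimed = (item.get("claimed_author") or "").strip().lower()
        # Only flag when the metadata names an author and it is a different one.
        if embedded and embedded != claimed:
            flagged.append(item["file"])
    return flagged


def camera_models(uploads):
    """Return the set of distinct camera models found across the uploads."""
    return {item["camera_model"] for item in uploads if item.get("camera_model")}


def needs_review(uploads, max_cameras=3):
    """Flag an uploader's files if authors conflict or camera models vary widely.

    max_cameras is an assumed threshold, not a value taken from the guideline.
    """
    return bool(author_mismatches(uploads)) or len(camera_models(uploads)) > max_cameras
```

Any flag raised this way is only a reason to look closer; as noted above, suspicions based on metadata should be checked with editors experienced with images and other media.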

Source and licensing information
For images and other media, the correct source and licensing information must be supplied; otherwise the files run the risk of deletion. Never use PD-self, GFDL-self or self if the image is not yours. If the source requests a credit line, e.g. "NASA/JPL/MSSS", place one in the author field of the Information template.

Copying material from free sources
The guidance in this section must not be read in isolation. Inline citations to a source are still required as described in the Verifiability policy and added to an article as explained in the Citing sources guideline. Attribution as described in this section is in addition to those requirements.

Attribution templates
For public-domain sources, using citation-attribution, source-attribution, or a similar attribution template is acceptable to acknowledge the work of others and still allow subsequent modification. See the next section for more on using attribution templates with compatibly licensed sources; the proper template may vary by the license of the source.

Compatibly-licensed sources
If the external work is under a copyleft license that removes some restrictions on distributing copies and making modified versions of a work, it may be acceptable to include the text directly into a Wikipedia article, provided that the license is compatible with CC BY-SA and the terms of the license are met. (A partial table of license compatibility can be found at the Copyright FAQ.) Most compatible licenses require that author attribution be given, and even if the license does not, the material must be attributed to avoid plagiarism. Attribution for compatibly licensed text can be provided through the use of an appropriate attribution template, or similar annotation, which is usually placed in a "References section" near the bottom of the page (see the section Where to place attribution).

Templates for compatibly licensed sources include:
 * Dual: for content imported from a source that may be reused under both CC-By-SA 3.0 and GFDL
 * CCBYSASource: for content imported from a source compatible for reuse under CC-By-SA 3.0 but not GFDL
 * CC-notice: for content imported from a source compatible for reuse under CC-By-SA 3.0 but not GFDL

Care must be taken to check that what appears to be a compatible license is indeed compatible. Some websites allow text to be copied for educational or non-commercial use. Such text is not compatible with the Wikipedia licenses because the text must be free to be used and distributed commercially.

Public-domain sources
Whether copyright-expired or public domain for other reasons, material from public-domain sources is welcome on Wikipedia, but such material must be properly attributed. Public-domain attribution notices should not be removed from an article or simply replaced with inline citations unless it is verified that substantially all of the source's phrasing has been removed from the article. Of course, citable information should not be left without cites, although the most appropriate citations should be used.

A public domain source may be summarized and cited in the same manner as for copyrighted material, but the source's text can also be copied verbatim into a Wikipedia article. If text is copied or closely paraphrased from a free source, it must be cited and attributed through the use of an appropriate attribution template, or similar annotation, which is usually placed in a "References section" near the bottom of the page (see the section "Where to place attribution" for more details).

If the external work is in the public domain, but it contains an original idea or is a primary source, then it may be necessary to alter the wording of the text (for example, not including all the text from the original work, or quoting some sections, or specifically attributing to a specific source an opinion included in the text) to meet the Wikipedia content policies of neutral point of view and No original research (in particular the restrictions on the use of primary sources).

Avoiding plagiarism requires attribution, and this is best accomplished when a reader can easily compare the Wikipedia article to the source. Many public domain sources are online, and attribution can (and should) include a hyperlink. When there is no online source, the editor should consider creating an exact copy of the source at Wikisource. The editor should also consider this if the online source is not available on a stable site or is in a form (e.g., a photocopied book) that is not readily convertible into simple text. This may be appropriate even when the source appears to be at a stable site and in an acceptable form, because the Wikisource site is under the control of the Wikimedia Foundation and other sites are not.

Copying within Wikipedia
Wikipedia's content is dual-licensed under both the GFDL and CC BY-SA license models. Contributors continue to own copyright to their contributions, but they liberally license their contributions for reuse and modification. GFDL and CC BY-SA do require attribution. However, since Wikipedia's articles do not contain bylines, it is not necessary or appropriate to provide attribution on the article's face. Copying content (including text, images, and citations) from one Wikipedia article to another, or from one language Wikipedia to another, is not plagiarism as long as the licensing requirements for attribution are met (see the guideline for specifics) and attribution is provided via the edit summaries.

Where to place attribution
If a Wikipedia article is constructed through summarizing reliable sources, but there is a paragraph or a few sentences copied from compatibly licensed or public-domain text which is not placed within quotations, then putting an attribution template in a footnote at the end of the sentences or paragraph is sufficient. To aid with attribution at the end of a few sentences, consider using a general attribution template such as the citation-attribution template for public-domain sources or CC-notice for compatibly licensed sources, Free-content attribution which is designed around material with an externally posted license, or use a source-specific attribution template such as DNB. Directions for usage are provided on the template pages.

If a significant proportion of the text is copied or closely paraphrased from a compatibly-licensed or public domain source, attribution is generally provided through the use of an appropriate attribution template or similar annotation placed in a "References section" near the bottom of the page. In such cases consider adding the attribution statements at the end of the Reference section directly under a line consisting of "Attribution:" in bold:

See, for example, Western Allied invasion of Germany and the Battle of Camp Hill.

A practice preferred by some Wikipedia editors when copying material from public domain or compatibly-licensed sources is to paste the content in one edit and indicate the source of the material in the edit summary. If following this practice, immediately follow up with proper attribution in the article so that the new material cannot be mistaken for your own wording.

To provide proper attribution when copying verbatim from a public domain or compatibly-licensed source, you can either:
 * Put the whole text of the source (if small enough) in quotation marks or blockquotes, followed by an inline citation; or
 * For sections or whole articles, add a section-wide or article-wide attribution template; if the text taken does not form the entire article, specifically mention the section requiring attribution; or
 * In a way unambiguously indicating exactly what has been copied verbatim, provide an inline citation and/or add your own note in the reference section of the article.

For an example of the last, see the references section in planetary nomenclature, which uses a large amount of text from the Gazetteer of Planetary Nomenclature.

This practice has some advantages—for example, further changes such as modernizing language and correcting errors can be done in separate edits after the original insertion of text, allowing later editors the ability to make a clear comparison between the original source text and the current version in the article.

Tools
There are several tools available to help identify plagiarism on Wikipedia:
 * CopyPatrol - lists pages with suspected plagiarism for manual review
 * Earwig's Copyvio Detector - check any article for plagiarism
 * User:CorenSearchBot - automatically patrols newly created pages for plagiarism and tags them