Wikipedia:WikiProject Copyright Cleanup/How to clean copyright infringements

{| style="width: 100%; background-color: #FFFFF0; border: 2px solid #F4A460; padding: 10px; margin-top: -1px; margin-bottom: 8px; vertical-align: top;"
| colspan=3 style="vertical-align:top" |

Are you looking for WikiProject Copyright Cleanup-related work to do? This page provides some suggestions on where to find problems and how to handle them when you do.

What is a copyright concern?
Copyright concerns exist where text or media from previously published sources that are not verifiably free for use are placed on Wikipedia and the material is not handled according to non-free content guidelines (WP:NFC). Even if the previously published source does not carry a copyright notice, it is presumed to be copyrighted unless we have reason to know that it is free, for example because it is in the public domain due to age or for another reason. If a source is licensed compatibly for use but that license is not adhered to (for instance, if attribution is required but not given), this is also a copyright concern, even if the source is another Wikipedia article (see Copying within Wikipedia).

Non-free content that is not used appropriately under WP:NFC represents a copyright concern. For example, even if quoted and attributed, extensive use of quotations from copyrighted sources is prohibited. A non-free image is a copyright concern if used outside of article namespace or if it otherwise violates the guidelines for such use.

While copyright concerns are more obvious when duplication is verbatim, they may also exist when the original source has been insufficiently altered. An image that incorporates or draws on a previously published image may be a derivative work. Text that paraphrases a non-free source too closely may also infringe. (See Close paraphrasing.)

What to do if you find an untagged copyright concern
If you find a copyright concern that you cannot immediately clean by one of the methods below, please make sure that you do something. The steps may seem daunting, but generally aren't. Frequently, they will involve placing a tag on the article or file and notifying the contributor; sometimes, they also require pasting code on the appropriate investigation board. Most of the time, the template you place on the article will tell you what needs to be done.

The copyright problems board has instructions for how to handle text copyright problems, including which template to use in which circumstances. Similarly, the guide to image deletion will tell you which tags to use if you are concerned about copyright on an image or other media file.

Please try to talk with the contributor about your concerns and, in any case, be sure to notify the contributor of your actions in addressing a copyright problem. Most of the tags you use will generate a notice you can simply paste on the contributor's talk page. If the copyright infringement can be easily removed and you do not need to tag the article itself, you can advise the contributor by placing an appropriate notice, such as, on their page. (Only the first field is required.) This template will serve with either text or image infringement, although it is mildly presumptive of text. See the Resources page for some additional user talk templates.

Basic steps for finding and addressing copyright concerns

 * 1) To find untagged text copyright concerns, consider reviewing Special:NewPages and Template talk:Did you know. While copyright problems can be added to any article at any time, new articles or recently expanded ones may be more problematic. Some suggestions for recognizing copied text can be found at the signpost dispatch on plagiarism. If you find copied text and can't verify that the source is public domain or compatibly licensed, rewrite it or tag it, as described at CP. (If it is public domain or compatibly licensed, add attribution if this is missing. Plagiarism isn't always a copyright concern, but it's still a problem under guidelines that should be addressed, and proper attribution tags will help avoid mistaken copyright tags in the future. See Category:Attribution templates.)
 * 2) To find previously tagged issues, look in one of the categories or project pages listed in the box to the right; select and investigate an article or image tagged as a copyright concern. (An overview of these pages and what you might expect to find there is given in the next two sections below.)
 * 3) If the problem can be easily addressed, address it. For example, if an image lacks a non-free use rationale but you can supply one, do. If an article is copied from a compatibly licensed source but provides no attribution, attribute it. You may also consider requesting copyright permission. (See below for specific ideas for various forums of concern.) If copyright violating text can be easily removed or replaced with newly written language, you're welcome to rewrite it. If the article is blanked with , you should rewrite it in the temporary page linked from the article's face, being careful not to use text already in the article without attribution. Even if that text wasn't a copyright problem, it will become one if you copy it without giving due credit to the contributor who wrote it.
 * 4) Make sure the problem is listed at the proper forum, with all necessary information. For instance, if an article is at Suspected copyright violations and it should be tagged for speedy deletion as it meets criterion WP:CSD, tag it with Db-g12. If you have located licensing information that suggests an image at WP:PUF may be properly licensed (or not), note it.
 * 5) Make sure that others interested in the file or article are aware of the problem, where appropriate. For instance, if an article is listed at WP:CP but the article itself is not tagged copyvio or if an image is listed at WP:PUF but the image caption at an article is not tagged pufc, add those tags.
 * 6) Make sure that the contributor has been notified, usually with the notification template for the purpose. (There are additional suggestions for talking to contributors below.)
 * 7) If copyright concerns seem valid, check the contribution history of the contributor to ensure that there are not other copyright issues that need to be addressed. If the contributor has violated copyright after having been notified of policies, consider whether a block is appropriate. If you are not an administrator, you may find one in the list of project members at Category:WikiProject Copyright Cleanup participants or request assistance at Administrators' noticeboard/Incidents. If an editor's contributions are extensive, additional review may be warranted; consider requesting it at Contributor copyright investigations.

Category:Possible copyright violations (and subcategories)
A number of tags place articles in this category and its subcategories, including copypaste and Close paraphrase. Some of the articles here are already listed at Copyright problems for review. Most of these can be addressed by the general steps above, either by revising or listing at Copyright problems. When appropriate, please remove the tags after the problems have been addressed. For example, when an article that was closely paraphrasing another source has been revised, the Close paraphrase template is no longer necessary.

Sometimes, efforts to address copyright concerns in text may be met with resistance. If you have attempted to clean a copypaste article and been reverted, please do not engage in an edit war. Politely discuss your concerns with the editor who reverted you, who may agree with your change once they understand your rationale. If you feel that the material represents a copyright concern and another editor is resisting your cleanup, blank the text with the copyvio template according to directions at copyright problems. This should lock the matter until an uninvolved administrator can investigate. If the tag is inappropriately removed, seek administrator intervention directly or through WP:ANI. In all cases, remember to remain within behavioral guidelines.

Copyright problems
Articles are listed at Copyright problems (CP) if they duplicate other sources but are not deletable under speedy deletion criterion G12. Bots also list articles there that have been tagged for close paraphrasing or copy-paste. Sometimes articles tagged for the latter should have been blanked as the former. In those cases, you should use your best judgment to determine whether they can and should be cleaned up on the spot or blanked.

Articles tagged copyvio
These are listed for seven days prior to being closed by an investigating administrator. Meanwhile, those who wish to help can identify the edit where the material was added, ensure that the contributor has been notified, and supply other helpful information. For example, if there is reason to question which came first, Wikipedia or the external site, consider using internet archives to check the dating of the external site and note your findings at the listing. While engineered for administrators closing listings, Copyright problems/Advice for admins gives some suggestions that can be useful in investigating at any stage of the procedure.
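The "which came first" check described above can be partly scripted. The following is a minimal sketch, not an established project tool. It assumes the Internet Archive's Wayback Machine availability endpoint as the "internet archive", and the function names `availability_query` and `snapshot_predates` are hypothetical. It only builds the query and interprets an already-parsed JSON reply; fetching the reply is left to the reader.

```python
from datetime import datetime
from urllib.parse import urlencode


def availability_query(page_url, wiki_added):
    """Build a Wayback Machine availability query for the snapshot
    closest to the date the text was added to Wikipedia."""
    params = {"url": page_url, "timestamp": wiki_added.strftime("%Y%m%d")}
    return "https://archive.org/wayback/available?" + urlencode(params)


def snapshot_predates(response, wiki_added):
    """Return True if the closest snapshot in the parsed JSON reply is
    older than wiki_added, False if it is not, and None if the reply
    reports no snapshot at all."""
    closest = response.get("archived_snapshots", {}).get("closest")
    if not closest:
        return None
    # Snapshot timestamps are assumed to use the form YYYYMMDDhhmmss.
    captured = datetime.strptime(closest["timestamp"], "%Y%m%d%H%M%S")
    return captured < wiki_added
```

An older snapshot shows only that the page existed at that date, not that it already contained the disputed text, so the archived copy still needs to be read before noting findings at the listing.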

Perhaps the most useful contribution here, aside from the standard steps, is providing a clean alternative. If the subject of the article seems notable and the infringement goes all the way back to inception, consider starting a new article in the temporary page (Talk:ExamplePage/Temp) linked from the template, even if the infringement is limited to one section of the article. Make a note at the talk page indicating that you have done this in case the administrator neglects to check. Please be careful to give credit to the Wikipedia contributors to comply with CC-BY-SA and GFDL. Uncopyrightable elements, such as categories and external links, can be incorporated as they are. If you do use text provided by other contributors, you may wish to do so incrementally, noting the source in the edit summary (such as by stating "text contributed by User:Example on 12 January 2009"). The admin who closes the matter may merge this text into the existing article after selectively deleting the infringement or removing it. Please be sure that your revision is sufficiently different from the source that it is not itself an infringement. (See Close paraphrasing for some ideas.)

Articles tagged close paraphrasing
"''Note: Sometimes articles tagged for close paraphrasing are not a copyright concern because the source is free. In that case, it may be possible to remedy the issue simply by ensuring that proper attribution is given as set out at Plagiarism. Alternatively, the close paraphrasing tag can be altered to add, which will prevent its relisting at CP."

These are most easily addressed if the source is given and notes are made at the talk page of the article. Sometimes, research is required. Editors may note the suspected source in the edit summary when they tag the article, but it may be necessary to compare the article to its cited sources or to scan it against Google or another search engine to find the similarity.

If you are able to identify the source, you can simply read through the given source and compare the content in the article. If you see close paraphrasing, it can be immediately addressed. Rewrite the content, with an explanatory edit summary, remove the tag and note what you've done at the CP listing. If the content was recently added by a registered contributor, it is sometimes beneficial to give them an opportunity to rewrite the material themselves. Writing to avoid close paraphrasing can take practice, and it may be more helpful to let the contributor hone these skills with some friendly assistance than to clean it up without giving them an opportunity to learn how.

Even if you can't rewrite the article, other opportunities exist for helping out here. If the source is not identified, you can add the source to the "close paraphrasing" tag. If the paraphrasing is not obvious, you can open a section at the talk page detailing where you see issues or why you do not. If you can't find the suspected source at all, you may want to ask the tagger for an explanation. If you can verify issues, you might want to leave the contributor who added the content a friendly note explaining that the article has been tagged and why.
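When comparing an article against a suspected source, a rough overlap score can help decide where to look first. The sketch below is illustrative only and is not a tool this project prescribes. The helper names `shingles` and `shared_fraction` and the choice of five-word sequences are assumptions. It measures how many of the article's word sequences also appear verbatim in the source.

```python
import re


def shingles(text, n=5):
    """Return the set of n-word sequences in text, lowercased and with
    punctuation ignored."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def shared_fraction(article, source, n=5):
    """Fraction of the article's n-word sequences that also occur in the
    source. Returns 0.0 if the article has fewer than n words."""
    article_shingles = shingles(article, n)
    if not article_shingles:
        return 0.0
    shared = article_shingles & shingles(source, n)
    return len(shared) / len(article_shingles)
```

A high score points to verbatim or near-verbatim copying. A low score does not rule out close paraphrasing, because reworded sentences share few exact sequences, so reading the two texts side by side remains necessary.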

Articles tagged copy-paste
This tag is often misapplied. It can mean an article is a blatant infringement, or it can mean that a contributor suspects based on the language that content was copied but does not know from where. When properly applied, it is used to indicate that text redundant to an identified source has been detected, but the tagger is unsure if the content is copyrighted or whether the content was first published on Wikipedia.

If no source is identified, the first step of addressing these listings is, as with the close paraphrasing listings, identifying the source. Check edit summaries and talk page. If needed, compare to listed sources or scan the internet. If the tagger did not identify a source and you cannot find one, you can remove the copy-paste template from the article and instead place the following on the article's talk page:. This leaves a record of concerns but removes a tag which cannot be verified from the article's face. Please use a descriptive edit summary so that the tagger knows that their concerns have been evaluated, not simply dismissed.

Once a source is identified, the article may be addressed like any other copyright problem. If the copying is reversed, use backwardscopy on the talk page and remove the tag. Be sure to provide your evidence in the comment field of the template. If the copying does not seem to be reversed, clean the article if you can or tag it with db-G12 or Copyvio as appropriate. Even if copy-pasted tags have been listed for a full week, they should be relisted if the Copyvio tag is applied to permit proper notice to be given to the user. If they can be handled otherwise, there is no reason for additional delay and no reason that the listing need remain open for the full seven days.

Contributor copyright investigations (CCI)
Contributors are listed for evaluation here when it can be shown that they have violated copyright in multiple articles or images. At the top of the CCI page is a box that lists every open CCI. Any contributor who has no history of copyright problems (warnings or blocks) is welcome to help out with these. Each CCI subpage includes a set of instructions at its top. Basically, work at a CCI involves looking at the "diffs", articles or images listed to evaluate whether they constitute a copyright concern. The slight difference here is that there is a presumption of copying. While we don't want to remove content unnecessarily, we know that contributors who have been listed here have been demonstrated to have copied on multiple occasions. If we find duplicate content elsewhere, unless we can verify that it is a mirror, we presume that the other site had it first.

There is no need to focus on working on these in any particular order. Some CCIs are more complex than others; some may be in areas that are more of interest to you. Any progress here is a great benefit to the project, so feel free to start out at whichever listing seems best to you.

Category:Wikipedia files that may violate copyright
A number of tags place files in this category and its subcategories, including derivative and nld. Some of these tags will result in a file being automatically deleted if problems are not addressed in a certain period of time. General steps for addressing are as above. Some specific suggestions follow.


 * Wikipedia files that are derivative works: Some of the images here are derivative works the use of which has been approved. If fair use is not claimed and there is no assertion that the work depicted is free, consider nominating the image for deletion.
 * Wikipedia files with unknown copyright status or unknown source: Investigate to see if you can help determine copyright status. Some of these may be free; many may not be. If they are not free, determine if they can be used within non-free content guidelines and address accordingly. If not, consider a more appropriate tag if, for example, the file qualifies for a quicker deletion process, as WP:CSD.
 * Non-free files tagged as replaceable, orphaned or lacking non-free use rationales: Where possible, address these issues by locating free replacements, locating appropriate places to use the images or providing non-free use rationales.
 * Non-free files tagged as disputed: Evaluate. If you can correct the non-free use rationale, do so. Comments on the matter may be made at the file's talk page.

Files for discussion
Images are listed at Files for discussion if their copyright status or source is dubious. Any listings may be closed by any editor after seven days.

CCI
See above.

Special

 * Category:Articles containing links to copyright violations: Articles are listed here when they contain links to external sites that contain copyright infringement, a problem addressed both in copyright policy and at WP:ELNEVER. If the matter is not under debate and you agree that the link violates policy, consider removing it, with a note at the article's talk page. If the matter is legitimately under debate, consider helping to build consensus by voicing your opinion or by inviting other contributors in accordance with the dispute resolution policy. If there seems no question that the link is in violation but one or more editors persist in replacing it, consider seeking admin intervention directly or through WP:AIV or WP:Blacklist.
 * Category:Copyright examinations: Article talk pages are placed in this category by the addition of copyrightexamination. This tag is often left in place after a conversation goes stale. Consider removing old discussions by nullifying or removing the template. If the conversation is ongoing, you may wish to contribute.

Addressing contributors of problematic material
The imperative of WikiProject Copyright Cleanup, as it is for all Wikipedia editors, is to address infringing content; additionally, WikiProject Copyright Cleanup seeks to ensure that contributors who violate copyright understand the policy and know how to contribute constructively. It is important in this project, as in all of Wikipedia, to begin by assuming good faith. Contributors to Wikipedia come from many backgrounds and do not always understand the US copyright laws that Wikipedia complies with or the policies and guidelines we have developed to ensure we remain in compliance. Contributors should be civilly notified of these policies and guidelines with a goal of preventing future infringement without discouraging potentially good contributors. However, if an editor continues to contribute problematic material after notification, it will be necessary to block that contributor to prevent recurrence. You may need to seek assistance at the administrators' noticeboard.

If a contributor seems to have widespread copyright concerns, you may wish to consider requesting a contributor copyright investigation. See that page for instructions.