Wikipedia:WikiProject Images and Media/Great Copyvio Purge

{| style="border: 2px solid darkgreen; margin-top:-1px" width="100%" cellpadding=10 cellspacing=0 Fixing copyvios one image at a time! The Great Copyvio Purge was a month-long collaborative effort with the goal of deleting or fixing many of Wikipedia's images that either A) are possible copyright violations (lacking copyright information, source, permission, etc.), or B) fail the non-free content criteria. It took place throughout July 2009. This page now serves as a single "starting point" for finding these types of images through various categories and tools, with some additional tips on how to handle them.
 * style="background: lightgreen" |
 * style="background: lightgreen" |

How can you help?
Using either Twinkle or quickimgdelete is very helpful when working with these images, as both greatly simplify tagging images for deletion. Other tools which you may find useful are FurMe, which aids in adding well-formed fair-use rationales and other tags to images, and ifdthumbnails, which provides a useful thumbnail copy of images nominated for deletion at files for deletion and possibly unfree files. When trying to figure out an image's source, TinEye is a helpful resource; there is also a script to help use it. See the individual descriptions for information on installation and use.

Before doing any other work with image copyvio deletion, you should be familiar with a few of Wikipedia's policies, guidelines, and essays (reading the entirety of these is not usually necessary, of course, but a basic understanding of them is very important), notably: Criteria for speedy deletion, Public domain, Copyright violations, Requesting copyright permission, Image copyright tags, and Non-free content.

The next step is to start searching for problematic images. The "Finding problematic images" section below lists a large number of tools and categories which can help find them. Choose a category or tool and start looking through the images shown there to find problematic ones, either tagging them for deletion or fixing them as appropriate. Important: The goal of this project is not to delete as many images as possible. The goal is to find genuinely problematic images so that they can be deleted; images should be fixed if possible or listed at media copyright questions so that others can take a look.

Finding problematic images
These categories, special pages, and tools on the Toolserver offer many ways to find problematic images.
 * Category:Wikipedia files with unknown source. The images in this category (but not its subcategories, which are deletion backlogs for administrators) contain an information template with no source specified. Although in some cases a source may be specified elsewhere in the text, many of these images lack source information altogether and so can be tagged as lacking a source.
 * Category:Files lacking an author. The images in this category may contain a source and/or description, but not author information (exactly who created the image). Many of these may be acceptable as-is, but many also need an author added (which may be found on the source website, or is simply the username of the uploader if the image is self-made). These images may also be missing other essential information, such as evidence of permission, which may call for tagging them for deletion instead.
 * Category:Images lacking a description. The images in this category may contain a source and/or author, but not a description. Lack of a description can be problematic because an image is hard to use if what it shows can't be determined. Lack of a description is also a common sign that other important information is missing.
 * Category:Wikipedia non-free file size reduction requests contains files that need to have their size reduced in order to comply with WP:NFCC. Fixing these requires downloading the images to your computer, resizing them to a lower resolution using an image editing program, and then reuploading them to Wikipedia.
 * Category:Wikipedia files that transclude the Non-free media rationale template with no Purpose specified: self-explanatory; these images generally have poor or no fair use rationales.
 * Category:Images with watermarks. Many watermarked images which are tagged as public domain or freely licensed are probably copyvios; e.g., those with watermarks saying "Copyright 2009 Joe Smith" or showing an organization/company name.
 * Category:Wikipedia non-free files lacking article backlink and Category:Wikipedia non-free files with red backlink both contain non-free images which might not be in use in articles.
 * Category:Publicity photographs with missing fair-use rationale. The non-free promotional template has an obscure parameter that removes pages from this category; because that parameter is rarely set, most publicity images end up here. Many publicity images also have little or no fair-use rationale or can be replaced by free images.
 * Category:Wikipedia license migration needs review. Many images are added to this category by a bot when the image appears to be tagged as both GFDL and fair use or public domain. This may indicate a copyright problem.
 * Category:Wikipedia files with disputed copyright information. These images tend to just need a general review.
 * Category:PD tag needs updating contains images which are tagged with the deprecated PD template. Many of these are legitimately public domain images, but some are probably copyright violations.
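
For the size reduction requests mentioned in the list above, it can help to work out the target dimensions before opening an image editor. The following is a minimal sketch, not an official tool: the function name `reduced_size` is hypothetical, and the default budget of 100,000 pixels is an assumption used for illustration, so check WP:NFCC and related guidance for the figure that applies to a given file.

```python
import math


def reduced_size(width: int, height: int, max_pixels: int = 100_000) -> tuple[int, int]:
    """Return (width, height) scaled down to at most max_pixels.

    The aspect ratio is preserved (up to rounding down to whole pixels).
    max_pixels is an assumed budget, not a value taken from this page.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    if width * height <= max_pixels:
        return width, height  # already within the budget; leave unchanged
    # Scale both sides by the same factor so the area shrinks to the budget.
    scale = math.sqrt(max_pixels / (width * height))
    new_width = max(1, math.floor(width * scale))
    new_height = max(1, math.floor(height * scale))
    return new_width, new_height


if __name__ == "__main__":
    print(reduced_size(1600, 1200))
```

The resulting dimensions can then be entered into whatever image editing program you use for the actual resize.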


 * Special:UncategorizedFiles. These files are not categorized at all. This usually indicates that the image does not have a copyright tag, is on Commons but tagged here with something like FPCold, or is an image page with no content for an image on Commons. The first of these can be handled the same way as any other untagged image, the second should be kept as-is, and the last can be deleted under G6 or F2 only if its history doesn't contain any relevant information (e.g., it consists of vandalism, categories, etc., but not sources or license information).
 * Special:UnusedFiles. These files are not currently used anywhere on Wikipedia. Orphaned fair-use images can typically be deleted; while looking at this page, linkclassifier can highlight fair-use images to help you find them.


 * Short pages: File namespace. This page locates images with descriptions of 50 bytes or less. Short image descriptions can commonly indicate a lack of source and/or license information.
 * [//tools.wmflabs.org/dplbot/orphaned_images.php OrphanedImages]. This tool locates all orphaned images in a specific category. This is very helpful for finding orphaned non-free media files, which fail WP:NFCC (they must be used in at least one article). Just find a category that contains mainly non-free images and use this tool to search it for orphans. Although it takes a long time to load, [//tools.wmflabs.org/dplbot/orphaned_images.php?limit=50&allNS=&cat=All+non-free+media searching Category:All non-free media] will eventually produce a list of all such images. Before tagging these for deletion, check whether they should be restored to an article from which they were removed.
 * UntaggedImages. This tool helps to locate images which lack copyright tags. It can be filtered by time of upload and (optionally) by a specific user. Its most basic use is to show all such images from the past day. However, it can also be used to locate such images from any point in time, based on how long ago the images were uploaded. When at the "past day" URL, you can change the URL line of  to cover any span of time, in days. For example, using   will produce a list of all untagged images uploaded between 7 and 10 days ago. Note that longer time periods may slow the program down, so keeping searches to periods of three days or shorter is probably helpful.
 * High-use non-free images lists fair use images which are used in more than 5 different articles. This can sometimes be valid, but it is often an indicator of overused non-free content. This tool can update the lists.
 * Pages with excessive non-free images lists all pages which contain five or more fair use images. This can sometimes be valid, but it is often an indicator of overused non-free content. This tool can update the lists.
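
The OrphanedImages search link in the list above can also be built for other categories. The sketch below assumes the tool accepts the same `limit`, `allNS`, and `cat` query parameters that appear in that link, and that the address is reachable over `https`; the function name `orphaned_images_url` is hypothetical, and the tool itself may have moved or changed since this page was written.

```python
from urllib.parse import urlencode

# Base address taken from the link on this page; the https scheme is an assumption.
BASE_URL = "https://tools.wmflabs.org/dplbot/orphaned_images.php"


def orphaned_images_url(category: str, limit: int = 50) -> str:
    """Build a search URL for orphaned images in the given category.

    The parameter names mirror the example link; urlencode turns the
    spaces in the category name into '+' signs, as in that link.
    """
    query = urlencode({"limit": limit, "allNS": "", "cat": category})
    return f"{BASE_URL}?{query}"


if __name__ == "__main__":
    print(orphaned_images_url("All non-free media"))
```

Passing a smaller, mostly non-free category name instead of "All non-free media" should give a shorter list that loads faster, if the tool behaves as described above.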

Scanning categories
The bad old ones tool is intended for use by administrators working on image deletion backlogs, but it is also very useful for anyone looking at the descriptions of images in a particular category, as it shows the full text of the image description as well as a preview of the image, allowing you to quickly go through a category to find problematic images. For example, this search scans Category:Wikipedia files with unknown source, so that you can more easily locate images in that category which may have problems. Note: This tool sometimes does not detect an image's usage correctly, reporting some in-use images as unused. Therefore, be cautious when using this tool to find unused non-free media.
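
Because the tool can misreport usage, one way to double-check a file before tagging it as unused is to ask the MediaWiki API which pages use it (a query with `action=query` and `prop=fileusage`). The sketch below only interprets an already-fetched JSON response; the function name `article_uses` is hypothetical, the sample data is made up, and the response shape shown is an assumption based on the API's default JSON format. It also ignores result continuation, so very heavily used files would need extra handling.

```python
def article_uses(response: dict) -> list[str]:
    """Collect titles of main-namespace (ns 0) pages that use the queried file(s).

    Expects a parsed response from a prop=fileusage query; pages with no
    'fileusage' key are treated as having no recorded uses.
    """
    titles = []
    pages = response.get("query", {}).get("pages", {})
    for page in pages.values():
        for use in page.get("fileusage", []):
            if use.get("ns") == 0:  # 0 is the article namespace
                titles.append(use["title"])
    return titles


# Made-up sample response, for illustration only.
sample = {
    "query": {
        "pages": {
            "12345": {
                "title": "File:Example.jpg",
                "fileusage": [
                    {"pageid": 1, "ns": 0, "title": "Example article"},
                    {"pageid": 2, "ns": 2, "title": "User:Example/sandbox"},
                ],
            }
        }
    }
}

if __name__ == "__main__":
    print(article_uses(sample))
```

An empty result suggests the file is not used in any article, but the file page's own "File usage" section remains the simplest manual check.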

Firefox users: Linky is a very powerful Firefox add-on which allows you to, amongst other things, select up to 99 images in a category simultaneously and open them in tabs with just a few clicks. This can greatly speed up the tagging of images in certain categories, by allowing you to just go through the now-open tabs rather than needing to open up a new tab after looking at each image.

Administrators
Administrators can help out with actually deleting images when needed. The following categories may contain backlogs that need administrative review and/or deletion. In addition, files for deletion and possibly unfree files need administrators to maintain them.


 * }