Wikipedia:Non-free content criteria/legacy image proposal

Scope
This is a proposal to bring Wikipedia's store of legacy images into compliance with WP:NFCC#10, the requirement for image page information. For purposes of this proposal, "legacy" images are those uploaded in 2006 or before. "New" images are those uploaded starting in 2007, including all new images as they are uploaded.

At this point it is a very preliminary discussion draft, intended to lay out some issues. It is meant to evolve as we gain consensus for what to do.

It only relates to the source, copyright tag, and fair use data on image pages. It is not concerned with how we find and remove images that are inappropriate for Wikipedia because they fail criteria NFCC#1-9 even though they have the required image data. Nor is this proposal intended to widen or narrow the range of images considered appropriate for Wikipedia. It is only about data requirements. However, good data is helpful for a lot of things, so if we clean the images up it will be much easier to exclude bad non-free images.

Background
Currently, Wikipedia hosts approximately 350,000 "non-free" images, of which approximately 170,000 are believed to be out of compliance with NFCC#10(a) - image source, or NFCC#10(c) - use rationale. (Nearly all legacy images comply with 10(b) - they have a copyright tag.)

This can be divided into three issues, the first two of which are treated here:
 * 1 - what should the requirements and formats be for source statements and use rationales?
 * 2 - how should we clean up the old images that don't meet those requirements?
 * 3 - how do we deal with new non-compliant images as they are uploaded?

Goals

 * Get as close to 100% compliance as possible in advance of the March 23, 2008 date set by the Wikimedia Foundation (without regard to whether the date was properly set or whether we are going further than they require)
 * Save as many viable images as possible by tagging and sorting them in a usable way, developing tools, and encouraging participation for people to fix old images
 * Keep things as orderly and predictable as possible
 * Use approved bots, templates, tools, and procedures to minimize workload for all
 * Get buy-in and a clear process, with rules we can point to, in order to avoid any more dispute over the issue of image deletion and tagging
 * Once images are cleaned up, do not allow new noncompliant images in so this does not become an issue again
 * Make source statements and use rationales as machine readable as possible, even though the Foundation does not require that

Maintain status quo
While the 10(c) policy is under discussion:
 * Tagging by bot, and deletion of bot-tagged images, based on 10(a) and 10(c) are suspended for legacy images (uploaded before 1/1/07) until we decide on a plan.
 * Ad-hoc tagging of images by hand, and deletion of images that fail NFCC#1-9, are unaffected - if users see any problem images they should deal with them.
 * Bots that edit legacy images should coordinate and get approval here, so we are not working at cross purposes.
 * 10(a), (b), and (c) will stay unchanged while this is under discussion. Any plan to change these rules will be coordinated with this discussion.
 * All new images must continue to comply with NFCC#10. Existing and new bots are encouraged to enforce 100% compliance for images while we discuss the matter of old images, until and unless the 10(c) requirements change.
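As a rough illustration of the suspension rule above, a bot could check an image's upload date before tagging. This is only a sketch: the function names and the criterion strings are hypothetical, and the only fact taken from this proposal is the 1/1/07 cutoff.

```python
from datetime import date

# Cutoff taken from this proposal: legacy images were uploaded before 1/1/07.
LEGACY_CUTOFF = date(2007, 1, 1)

# Criteria for which bot tagging of legacy images is suspended (hypothetical labels).
SUSPENDED_CRITERIA = {"10(a)", "10(c)"}


def is_legacy(upload_date):
    """Return True if the image counts as a legacy image."""
    return upload_date < LEGACY_CUTOFF


def bot_tagging_suspended(upload_date, criterion):
    """Return True if a bot should hold off tagging this image under this criterion."""
    return is_legacy(upload_date) and criterion in SUSPENDED_CRITERIA


print(bot_tagging_suspended(date(2006, 5, 14), "10(c)"))  # legacy image, suspended
print(bot_tagging_suspended(date(2007, 3, 2), "10(c)"))   # new image, still enforced
```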

Discussion of 10(c)
We will attempt to agree within the next few weeks on what changes, if any, we will make to 10(c).

A starting suggestion (note: this is by no means the final proposal; it is what we should discuss first):
 * All images must have a source statement in a template rather than below a heading so that it is machine readable. The template can be something extremely simple, like:  or
 * We keep an approved list of copyright tags and do not allow new images to be uploaded with non-standard copyright tags.
 * All non-free images will continue to require a separate written use rationale for each use in an article, as before. We should reject the notion that some uses are too obvious to need rationales.
 * For most types of images, the use rationale requirement will not change (other than that we will put it into the template).
 * For common "obvious cases" we will set up a system for proposing and approving templated rationales (see below).
 * We will place the use rationales, whether templated or not, into a "use statement" template that contains the article name and the use rationale. The simplest, free-form use-statement would look like this:  .  In cases where one of the approved rationales is used it might look like this:.
 * Going forward, any image used in an article but without a corresponding use statement with all of the mandatory fields filled in may be speedily removed (this will be easy to tell because of the new format).
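To show why a fixed format would make missing use statements "easy to tell", here is a minimal sketch of the check a bot might run. It assumes the use statements have already been parsed from the image page; the `UseStatement` class and its field names are hypothetical, since the format is still under discussion.

```python
from dataclasses import dataclass


@dataclass
class UseStatement:
    # Hypothetical fields; this proposal only says a use statement holds
    # the article name and the use rationale.
    article: str    # "" if the article is not stated
    rationale: str  # "" if the rationale is missing


def uses_without_statement(articles_using_image, use_statements):
    """Return the articles whose use of the image lacks a complete use statement."""
    covered = {
        s.article
        for s in use_statements
        if s.article.strip() and s.rationale.strip()
    }
    return [a for a in articles_using_image if a not in covered]


statements = [
    UseStatement(article="Example Album", rationale="Identifies the album in its own article."),
    UseStatement(article="Example Artist", rationale=""),
]
print(uses_without_statement(["Example Album", "Example Artist", "Example Label"], statements))
# prints ['Example Artist', 'Example Label']
```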

The templated rationales will work as follows:
 * Only approved templates may be used. Unapproved templates will not count as rationales.
 * Any new template must first be proposed and follow an approval process.
 * Each template is approved for use with one or more specific copyright tags, and may only be used in connection with those tags.
 * Templates are adopted only when a large number of images share a nearly identical set of non-free use considerations. We have identified three so far: logos, album covers, and book covers.
 * Template approval may be restricted to certain specific situations. For example, the "album cover" template will initially be approved only for articles about the album in question, not to illustrate the artist's article.
 * For each template, we set up a master rationale as part of the template that explains why an image that correctly uses this template meets our exemption policy (i.e. NFCC). We will decide whether or not to transclude the rationale onto the image page or merely provide a link to the template rationale.
 * The fact that a template exists does not mean it's automatically appropriate to an image. The editor adding the image to the article has the responsibility to make sure that any template used is appropriate to the use.
 * There may be further restrictions on templates. For example, a template may be approved only for images that are in a specified infobox.
 * Some or all of the templates may be parameterized, meaning that the user must fill out certain information for each image or use. For example, with "book covers" there may be a field for edition information.  As part of the approval process we will decide which of the parameters are mandatory.
 * The existing non-free use rationale template may be used, and will be approved for all copyright tags (though we may deprecate it and remove the "article" and "source" fields, because these fields are handled elsewhere - see above).
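The rules above could be checked mechanically along the following lines. This is a sketch only: the registry contents, template names, and the choice of which parameters are mandatory are all hypothetical, since those would be decided in the approval process.

```python
# Hypothetical registry of approved rationale templates. Each entry lists the
# copyright tags the template is approved for and its mandatory parameters.
APPROVED_TEMPLATES = {
    "album cover rationale": {
        "copyright_tags": {"album cover"},
        "mandatory": {"article"},
    },
    "book cover rationale": {
        "copyright_tags": {"book cover"},
        # Treating "edition" as mandatory is an assumption for illustration.
        "mandatory": {"article", "edition"},
    },
}


def template_problems(template, copyright_tag, params):
    """Return reasons a templated rationale does not count; empty list if acceptable."""
    entry = APPROVED_TEMPLATES.get(template)
    if entry is None:
        return ["template is not approved"]
    problems = []
    if copyright_tag not in entry["copyright_tags"]:
        problems.append("template is not approved for this copyright tag")
    missing = sorted(p for p in entry["mandatory"] if not params.get(p, "").strip())
    if missing:
        problems.append("missing mandatory parameters: " + ", ".join(missing))
    return problems


print(template_problems("book cover rationale", "book cover", {"article": "Example Novel"}))
```

A check like this can only confirm the formal requirements. Whether a template is actually appropriate to a particular use remains the responsibility of the editor adding the image.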

Tagging and preparation of noncompliant images
All noncompliant legacy images should be tagged as soon as we agree on a plan. They should not be tagged for speedy deletion, but tagged so that we can identify and process them.

Source:
 * For all legacy images we look for the source and put it in the source template: .  Any images without an identifiable source get tagged as such, e.g.
 * Where the source is not stated but obvious from context we add that by bot. For example, the source of a company logo is the organization that uses the logo.  The source of any album cover is the original record company.  And so on.  If we decide that these need human review we will tag them as such.  For example:
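A bot pass over sources might be sketched as follows. The mapping of copyright tags to generic sources uses only the two examples given above (logos and album covers); the tag names, status labels, and function name are hypothetical.

```python
# Hypothetical mapping from copyright tag to a source that is obvious from context.
OBVIOUS_SOURCES = {
    "logo": "the organization that uses the logo",
    "album cover": "the original record company",
}


def resolve_source(stated_source, copyright_tag):
    """Return (source, status), where status is 'stated', 'inferred', or 'unknown'."""
    if stated_source and stated_source.strip():
        return stated_source.strip(), "stated"
    inferred = OBVIOUS_SOURCES.get(copyright_tag)
    if inferred is not None:
        # An inferred source may still need human review, if we decide so.
        return inferred, "inferred"
    return None, "unknown"


print(resolve_source("", "logo"))
print(resolve_source(None, "promotional photo"))
```

Keeping the status separate from the source text would let us later list all inferred sources for human review without re-running the bot.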

Use statements:
 * Encapsulate every existing use rationale within a use statement of the form
 * In cases where there is a rationale but no indication of which article, tag with.
 * Make sure from the "file links" that there is one use statement for every use, even if the user has not filled them in. A completely unexplained use will look like: <.
 * If the use statement doesn't say which article it applies to, but the image is used in only one article, use that information to update the use statement.
 * When done, the "unknown" article and "missing" rationale markers will serve both as tags that bots and categories can use to find noncompliant images, and as a starting point for users to fix the images.
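The use-statement steps above could be sketched like this, assuming the existing rationales have already been wrapped into simple records. The marker values "unknown" and "missing" follow the wording above, but the record layout and function name are hypothetical.

```python
UNKNOWN_ARTICLE = "unknown"    # marker for a rationale with no stated article
MISSING_RATIONALE = "missing"  # marker for a use with no rationale at all


def reconcile(file_links, statements):
    """Return a new list of use statements with one entry per use of the image.

    file_links: article titles that currently use the image.
    statements: dicts with 'article' and 'rationale' keys.
    """
    result = [dict(s) for s in statements]  # copy so the input is not modified

    # If the image is used in only one article, an unattributed rationale
    # must refer to that article.
    unattributed = [s for s in result if s["article"] == UNKNOWN_ARTICLE]
    if len(file_links) == 1 and len(unattributed) == 1:
        unattributed[0]["article"] = file_links[0]

    # Add a placeholder for every use that still has no statement.
    covered = {s["article"] for s in result}
    for article in file_links:
        if article not in covered:
            result.append({"article": article, "rationale": MISSING_RATIONALE})
    return result


print(reconcile(["Example Album"], [{"article": "unknown", "rationale": "Cover art."}]))
```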

Sorting noncompliant images. Develop a method using tagging, automatically adding links to lists of images, categories, a special function, or some other way, to break down all noncompliant images according to:
 * Article they appear in
 * Who uploaded them
 * Copyright tag used
 * Wikiproject the article belongs to
 * Any other category in the article that people want to watch, e.g. a user could request to see a list of all noncompliant images in articles in the "auto racing" or the "French history" category.
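One possible shape for the sorting step is a generic grouping function that works on any field of an image record. The record fields below (`name`, `articles`, `uploader`, `copyright_tag`) are hypothetical; real data would come from whatever tagging or category scheme we settle on.

```python
from collections import defaultdict


def group_by(images, key):
    """Group image names by the given field; list-valued fields count once per value."""
    groups = defaultdict(list)
    for image in images:
        values = image[key]
        if isinstance(values, str):
            values = [values]
        for value in values:
            groups[value].append(image["name"])
    return dict(groups)


# Hypothetical records for noncompliant images.
noncompliant = [
    {"name": "A.jpg", "articles": ["Example Album"], "uploader": "UserOne", "copyright_tag": "album cover"},
    {"name": "B.png", "articles": ["Example Corp", "Example Product"], "uploader": "UserTwo", "copyright_tag": "logo"},
    {"name": "C.png", "articles": ["Example Corp"], "uploader": "UserOne", "copyright_tag": "logo"},
]

print(group_by(noncompliant, "uploader"))
print(group_by(noncompliant, "articles"))
```

Breakdowns by Wikiproject or by article category could reuse the same function once each record carries those fields.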

Fixing noncompliant images

 * Give people a fixed amount of time and a deadline for fixing noncompliant legacy images.
 * Create a tool for quickly adding approved templated rationales to the use statements.
 * Promulgate a schedule, ahead of time, for which images will be tagged for deletion, and find a way to post notice of the deletion date on the image (e.g. January 1-7, we will delete all noncompliant promotional photos; January 8-15 we will delete all noncompliant logos; etc).
 * Decide on a deletion protocol.
 * Possibility 1: tag then speedy deletion, as we do now.
 * Possibility 2: have a bot remove and delink all noncompliant use statements (missing rationale or unknown article), then leave the orphan bot to delete any images that are thereby orphaned. We can avoid a new tag-and-delete procedure on the theory that people already had notice.
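To make the second possibility concrete, here is a minimal sketch of the delink-then-orphan logic. It only models the bookkeeping; the input layout and function name are hypothetical, and actual removal from articles and deletion would be done by approved bots.

```python
def delink_noncompliant(image_uses):
    """Drop noncompliant uses and report which images end up orphaned.

    image_uses: dict mapping image name to a list of (article, compliant) pairs,
    where compliant is False for a missing rationale or unknown article.
    Returns (remaining_uses, orphans).
    """
    remaining = {}
    orphans = []
    for image, uses in image_uses.items():
        kept = [article for article, compliant in uses if compliant]
        remaining[image] = kept
        if not kept:
            # No compliant use is left, so the orphan bot would pick this up.
            orphans.append(image)
    return remaining, orphans


uses = {
    "A.jpg": [("Example Album", True), ("Example Artist", False)],
    "B.png": [("Example Corp", False)],
}
print(delink_noncompliant(uses))
```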

Coordination issues

 * Once we decide on approved copyright tags and approved rationale templates, changes if any to 10(c), and a tagging/deletion/fixing protocol, we should update any policy and guideline pages accordingly, e.g. WP:FURG, WP:NFCC, WP:NONFREE, and WP:CSD.
 * We will plan a schedule to meet the March 23, 2008 Wikimedia Foundation deadline (without regard to recent concerns that the deadline may have been improperly stated or that some of what we are doing goes beyond what the Foundation requires).
 * After we handle the legacy images, we will see what tools we have that can help police and fix the new images, if that work isn't yet done.