Wikipedia:Copyright in lists

The United States copyright law which governs Wikipedia (see Copyrights) forbids Wikipedia contributors from copying information directly from other sources except in limited cases. We can copy content that is public domain or that is properly licensed for our use (with any necessary attribution). When we want to copy information from lists and compilations, we have to first figure out if they are protected by copyright.

Copyright in a list may exist in the content of the list or in the way that the content was selected and arranged. Copyright does not protect facts, but it does protect opinion. If a source is based on "value judgments", it may be protected by copyright, even if it looks very similar to fact. And even if the source is fact, copyright may still protect its selection and arrangement if these are creative.

When a source is listing value judgments or opinions, we will likely have to limit our use of it to comply with non-free content policy and guideline. If selection and arrangement are creative, we cannot reuse the selection and arrangement of our source; instead, we may have to add or remove elements and rearrange the content into a new work.

Background
Wikipedia is legally bound by the copyright laws of the United States. To comply with those laws, we may freely reproduce only works that are either not protected by copyright (whether that's because copyright has expired, or because the material was never eligible for copyright protection) or works released under a suitable free license. With lists and compilations, we sometimes face challenges in determining whether the material is eligible for copyright and, if so, how we might use it.

Copyright laws in the United States protect original works of authorship in any medium while leaving open to the public the ideas, procedures, processes, systems, methods of operation, concepts, principles and discoveries contained in such works. (See Idea-expression divide.) Discoveries (facts) are not copyrightable, but compilations often are. Copyright doesn't only govern fiction; an historical essay may be as much an original work of authorship as a purely speculative science fiction novel. The author of each has wide latitude in choosing what to say and how to say it. Likewise, a list or compilation may be extremely creative. We must determine the degree of creativity (and, hence, usability) on a case-by-case basis.

Since United States law does not recognize the sweat of the brow doctrine, the amount of labor that goes into producing lists and compilations is not our concern. Instead, we consider two factors:
 * 1) The copyrightability of the content being listed/compiled, and
 * 2) The copyrightability of the selection and arrangement of the content being listed/compiled.

The Wikimedia Foundation's associate counsel wrote in January 2011, "Unless you know the criteria involved in creating the list, it is impossible to even gauge the potential of a court finding that it warrants copyright protection. And unfortunately, even if you do know the criteria, it is very hard to predict what a court will say (especially because the courts vary in their opinions in different circuits on this matter) when there is a degree of creativity involved. You are really only safe if the list is purely formulaic."

Copyrightability of content
In considering whether a list or compilation is copyrightable, we have to look first at its nature. Are we talking about facts ("discovery") or opinion? As William F. Patry points out, the law protects "compilations of things expressed as a value judgment". In assessing the nature of the material, we first have to determine if content is subjective ("value judgment") or not.

Sometimes this will be obvious ("best love songs"; "U.S. Presidents, chronologically by term"), and sometimes it will require evaluating the criteria used by the original creator. One might think that a list of the market value of used cars would be objective. But in CCC Information Services v. Maclean Hunter Market Reports (1994), one such work was found to be copyrightable expression because it was based not on "data" but on the predictions of the authors using a "wide variety of informational sources and their professional judgment". The United States Court of Appeals for the Second Circuit explained that the values were "in the category of approximative statements of opinion". Had the book relied on historical statistics, the information in it (although perhaps not all elements; see below) would have been public domain.

Where criteria cannot be determined, we may not always be able to assess the inherent nature of content and comfortably determine that it is "safe" to freely reproduce.

Examples
The following examples provide some guidance on the kinds of content that would typically be considered "discovery" and, hence, public domain. (Please remember that, as discussed below, other copyrightable factors may still hamper free use of the source list or compilation. After making sure content clears these criteria, it should be checked further against the factors below.)

What copyrighted content means for Wikipedia
If the content in the source list is a "value judgment", or if there is plausible reason to believe it may be, we cannot treat it like public domain material. Rather, we must limit our reproduction of the material just as we do reproduction of other copyrighted content.

There should be no issue with including a cited reference in an article about a subject mentioned in the list. For instance, the Penny Arcade (webcomic) article can comfortably mention the inclusion of the comic and its creators on multiple "value judgment" lists. Discussing a few elements of a copyrighted list as part of a larger article on the subject of the list itself should also be comfortably within non-free content allowances, as currently practiced on Wikipedia. Our claim to fair use in such cases grows stronger the more critical commentary on the list itself the article includes. This helps to ensure that we are making transformative use of the material, rather than simply competing with the source. As with other copyrighted content, more extensive takings run a higher risk of infringement. There is no firm consensus on whether articles on such lists should include brief excerpts, but current guidance at non-free content, based on advice from associate counsel, indicates that "A complete or partial recreation of 'Top 100' or similar lists where the list has been selected in a creative manner" is "unacceptable use."


 * Examples: The 500 Greatest Albums of All Time; Time 100

Copyrightability of other factors
Even when the content is public domain, a compilation of that content may not be, if it features creativity in selection or arrangement. As the Supreme Court wrote in 1991: "The compilation author typically chooses which facts to include, in what order to place them, and how to arrange the collected data so that they may be used effectively by readers. These choices as to selection and arrangement, so long as they are made independently by the compiler and entail a minimal degree of creativity, are sufficiently original that Congress may protect such compilations through the copyright laws." Our challenge is determining where creativity exists.

Selection
In one of the landmark cases of compilation copyright in the U.S., Feist v. Rural (1991), the Supreme Court determined that a comprehensive phone directory listing was not copyrightable. But in Key v. Chinatown (1991), copyright was upheld on a business directory which eliminated from its yellow-page listings those entities which the publisher did not believe would remain in operation long enough to merit inclusion.

The United States Court of Appeals for the Second Circuit clarified in § 13 of its decision in Key that "Selection implies the exercise of judgment in choosing which facts from a given body of data to include in a compilation." Selection criteria don't have to be obvious to involve human judgment. A "select bibliography" may very well use copyrightable subjectivity in compilation, even if it doesn't state what factors guided the selection. Where selection of facts is subjective and sufficiently creative ("businesses which seem sustainable"), copyright protection applies; where it is not ("every household with a phone in the region"), it does not.

Examples of creativity in selection
The following examples provide some guidance on the kinds of content that would typically be considered uncreative in selection. (Please remember that other copyrightable factors may still hamper free use of the source list or compilation. After making sure content clears this criterion, it should be checked further against the others.)

Arrangement
Whether or not the content is copyrightable, the method of presentation may be. In Key v. Chinatown, § 16, the Second Circuit noted that for copyright protection to apply, arrangement ("the ordering or grouping of data into lists or categories") must be "original within the meaning of the copyright laws" ("go[ing] beyond the mere mechanical grouping of data as such, for example, the alphabetical, chronological, or sequential listings of data"). Just as the selection in Feist was found uncreative, so was the arrangement: an alphabetical listing of every telephone subscriber. In Key, however, originality of arrangement was also found, as the publisher of the yellow pages included categories designed to be of specific interest to her Chinese-American audience which were not copied from earlier directories.

Examples of creativity in arrangement
The following examples provide some guidance on the kinds of content that would typically be considered uncreative in arrangement. (Please remember that other copyrightable factors may still hamper free use of the source list or compilation. After making sure content clears this criterion, it should be checked further against the others.)

What copyrighted selection/arrangement mean for Wikipedia
Note that, in accordance with Feist, the facts (public domain material, in contrast to value judgments, as above) included in a compilation are not protected even if the selection and arrangement are; specifically, the Supreme Court wrote, "Notwithstanding a valid copyright, a subsequent compiler remains free to use the facts contained in another's publication to aid in preparing a competing work, so long as the competing work does not feature the same selection and arrangement." Where the arrangement is protected, the best approach is to use an entirely different arrangement, one which is either a "mere mechanical grouping of data" or creative in patently different ways. Where the selection is protected, we may need to vary the selection by drawing in additional information to form a compilation around a different criterion or to limit it substantially in a new and different way.

Other considerations
Beyond copyright, users should consider the licensing of the content they want to use. The Wikimedia Foundation's associate counsel recommended in March 2011 that the use of even uncopyrightable lists be considered with regard to licensing agreements that may "bind the user/reader from republishing the list/survey results without permission", noting that "Absent a license agreement, you may still run afoul of state unfair competition and/or misappropriation laws if you take a substantial portion of the list or survey results."

All editors on Wikipedia should remember that they are individually responsible for their edits here and, even if they are in compliance with site policy, may encounter legal liability if they violate the law. Wikipedia's policies are created by volunteers in a best effort to comply with United States law, but they do not constitute legal advice. They also may offer no guidance to editors outside the United States as to the legal situation in their jurisdiction. For example, editors who live in regions that have a different threshold of originality, including those that recognize "sweat of the brow", should take that into account before importing content from any compilation to Wikipedia.