Cataloging (library science)



In library and information science, cataloging (US) or cataloguing (UK) is the process of creating metadata representing information resources, such as books, sound recordings, and moving images. Cataloging provides information such as authors' names, titles, and subject terms that describe resources, typically through the creation of bibliographic records. The records serve as surrogates for the stored information resources. Since the 1970s these metadata have been in machine-readable form and are indexed by information retrieval tools, such as bibliographic databases or search engines. While the cataloging process typically results in the production of library catalogs, it also produces other types of discovery tools for documents and collections.

Bibliographic control provides the philosophical basis of cataloging, defining the rules that sufficiently describe information resources, and enable users to find and select the most appropriate resource. A cataloger is an individual responsible for the processes of description, subject analysis, classification, and authority control of library materials. Catalogers serve as the "foundation of all library service, as they are the ones who organize information in such a way as to make it easily accessible".

Cataloging different kinds of materials
Cataloging is carried out in different kinds of institutions (e.g., libraries, archives, and museums) and applied to different kinds of materials, such as books, pictures, and museum objects. The literature of library and information science is dominated by library cataloging, but it is important to consider other forms of cataloging. For example, special systems have been developed for cataloging museum objects, e.g., Nomenclature for Museum Cataloging. Some formats have also been developed partly in opposition to library cataloging formats, for example the Common Communication Format for bibliographical databases. On cataloging different kinds of cultural objects, see O'Keefe and Oldal (2017).

Six functions of bibliographic control
Ronald Hagler identified six functions of bibliographic control.
 * "Identifying the existence of all types of information resources as they are made available." The existence and identity of an information resource must be known before it can be found.
 * "Identifying the works contained within those information resources or as parts of them." Depending on the level of granularity required, multiple works may be contained in a single package, or one work may span multiple packages. For example, is a single photo considered an information resource? Or can a collection of photos be considered an information resource?
 * "Systematically pulling together these information resources into collections in libraries, archives, museums, and Internet communication files, and other such depositories." Essentially, acquiring these items into collections so that they can be of use to the user.
 * "Producing lists of these information resources prepared according to standard rules for citation." Examples of such retrieval aids include library catalogs, indexes, archival finding aids, etc.
 * "Providing name, title, subject, and other useful access to these information resources." Ideally, there should be many ways to find an item so there should be multiple access points. There must be enough metadata in the surrogate record so users can successfully find the information resource they are looking for. These access points should be consistent, which can be achieved through authority control.
 * "Providing the means of locating each information resource or a copy of it." In libraries, the online public access catalog (OPAC) can give the user location information (a call number for example) and indicate whether the item is available.

History of bibliographic control
While the organization of information has been going on since antiquity, bibliographic control as we know it today is a more recent invention. Ancient civilizations recorded lists of books on tablets, and libraries in the Middle Ages kept records of their holdings. With the invention of the printing press in the 15th century, multiple copies of a single book could be produced quickly. Johann Tritheim, a German librarian, was the first to create a bibliography in chronological order with an alphabetical author index. Conrad Gessner followed in his footsteps in the next century, publishing an author bibliography and subject index. He added to his bibliography an alphabetical list of authors with inverted names, which was a new practice. He also included references to variant spellings of authors' names, a precursor to authority control. Andrew Maunsell further revolutionized bibliographic control by suggesting that a book should be findable based on the author's last name, the subject of the book, and the translator. In the 17th century Sir Thomas Bodley was interested in a catalog arranged alphabetically by author's last name as well as subject entries. Sir Robert Cotton's library cataloged books by reference to busts of famous Romans. The books were organized by the name of the bust, e.g. N for Nero, followed by the letter assigned to the shelf, and then the Roman numeral of the title's number. For example, the shelfmark of the Lindisfarne Gospels reads Nero D IV. Cotton's cataloging method is still in use for his collection in the British Library. In 1697, Frederic Rostgaard called for a subject arrangement subdivided by both chronology and size (whereas in the past titles were arranged by size only), as well as an index of subjects and authors by last name, and for word order in titles to be preserved as on the title page.

After the French Revolution, France's government was the first to put out a national code containing instructions for cataloging library collections. At the British Museum Library Anthony Panizzi created his "Ninety-One Cataloging Rules" (1841), which essentially served as the basis for cataloging rules of the 19th and 20th centuries. Charles C. Jewett applied Panizzi's "91 Rules" at the Smithsonian Institution.

Descriptive cataloging
"Descriptive cataloging" is a well-established concept in the tradition of library cataloging in which a distinction is made between descriptive cataloging and subject cataloging, each applying a set of standards, different qualifications and often also different kinds of professionals. In the tradition of documentation and information science (e.g., by commercial bibliographical databases) the concept document representation (also as verb: document representing) have mostly been used to cover both "descriptive" and "subject" representation. Descriptive cataloging has been defined as "the part of cataloging concerned with describing the physical details of a book, such as the form and choice of entries and the title page transcription."

Subject cataloging
Subject cataloging may take the form of classification or (subject) indexing. It is the process of assigning terms that describe what a bibliographic item is about: catalogers perform subject analysis for items in their library, most commonly selecting terms from an authorized list of subject headings, otherwise known as a 'controlled vocabulary'. Classification involves the assignment of a given document to a class in a classification system (such as Dewey Decimal Classification or the Library of Congress Classification). Indexing is the assignment of characterizing labels to the documents represented in a record.

Classification typically uses a controlled vocabulary, while indexing may use a controlled vocabulary, free terms, or both.

History
Libraries have made use of catalogs in some form since ancient times. The very earliest evidence of categorization is from a c. 2500 BCE collection of clay tablets marked in cuneiform script from Nippur, an ancient Sumerian city in present-day Iraq, wherein two lists of works of Sumerian literature of various myths, hymns, and laments are listed. As one tablet had 62 titles, and the other 68, with 43 titles common between them, and 25 new titles in the latter, they are thought to comprise a catalog of the same collection at different periods of time.

The library of Ashurbanipal in ancient Nineveh is the first library known to have a classification system on clay tablets. They had cuneiform marks on each side of the tablet. The Library of Alexandria is reported to have had at least a partial catalog consisting of a listing by Callimachus of the Greek literature called "Pinakes". There were originally 825 fragments of Callimachus' "Pinakes", but only 25 of them have survived. The Chinese Imperial Library of the Han dynasty of the 3rd century A.D. had a catalog listing nearly 30,000 items, each item similar in extent of its content to a Western scroll. The first catalogs in the Islamic world, around the 11th century, were lists of books donated to libraries by persons in the community. These lists were ordered by donor, not by bibliographic information, but they provided a record of the library's inventory.

Many early and medieval libraries in Europe were associated with religious institutions and orders, including the Papal library in Rome. The first Vatican Library catalog is from the late 14th century. These catalogs generally used a topical arrangement that reflected the topical arrangement of the books themselves. The Vatican Library published 'rules for the catalog of printed books' in 1939. These rules were then translated into English and published in the United States in 1949. In medieval times, the library of the Sorbonne in Paris had accumulated more than one thousand books, and in 1290 its catalog pioneered the use of the alphabet as an organizing tool.

It was the growth in libraries after the invention of moveable-type printing and the widespread availability of paper that created the necessity for a catalog that organized the library's materials so that they could be found through the catalog rather than "by walking around." By the 17th century libraries became seen as collections of universal knowledge. Two 17th century authors, Gabriel Naudé, in France, and John Dury, in Scotland, both developed theories of systematic organization of libraries. The development of principles and rules that would guide the librarian in the creation of catalogs followed. The history of cataloging begins at this point.

In ancient times in the Orient, the title was used to identify the work. Since the Renaissance, the author has been the main source of identification.

Cataloging standards
Cataloging rules have been defined so that various library materials are cataloged consistently by the different members of a cataloging team and over time.

Anglo-American cataloging standards
The English-speaking libraries have shared cataloging standards since the early 1800s. The first such standard is attributed to Anthony Panizzi, the Keeper of the Printed Books of the British Museum Library. His 91 rules, published in 1841, formed the basis for cataloging standards for over 150 years.

Subsequent work in the 19th century was done by Charles Coffin Jewett, head of the Smithsonian library, which at the time was positioned to become the national library of the United States. Jewett used stereotype plates to produce the library's catalog in book form, and proposed the sharing of cataloging among libraries. His rules were published in 1853. A disagreement with the Smithsonian's secretary led to Jewett's dismissal, but soon afterward he accepted a position with the Boston Public Library, where he was tasked with purchasing books as well as arranging them. Jewett became director of the Boston Public Library in 1858; during this time the Index to the Catalogue of a Portion of the Public Library of the City of Boston Arranged in its Lower Hall was published. The index included new cataloging information alongside many of the Smithsonian cataloging rules that Jewett had created. His systems became a model for other libraries as he pushed for alphabetical card catalogs.

Jewett was followed by Charles Ammi Cutter, an American librarian whose Rules for a Dictionary Catalog were published in 1876. Cutter championed the concept of "ease of use" for library patrons.

In the 20th century, library cataloging was forced to address new formats for materials, including sound recordings, movies, and photographs. Seymour Lubetzky, once an employee of the Library of Congress and later a professor at UCLA, wrote a critique of the 1949 ALA rules for entry, Cataloging Rules and Principles: A Critique of the ALA Rules for Entry and a Proposed Design for the Revision. Lubetzky's writings revealed the weaknesses in the existing rules, and spoke to the need for preparing a set of standards for a more complete and succinct code. As changes in culture over time would necessitate an ever-increasing/changing list of rules, Lubetzky "helped remedy the situation by advocating the concept of cataloging according to 'basic principles,' in place of a rule for each case that might arise." He was tasked to do extensive studies of the current cataloging rules over the time period from 1946 to 1969. His analyses shaped the subsequent cataloging rules.

The published American and Anglo-American cataloging rules in the 20th century were:
 * Anglo-American rules:
 * American Library Association rules:
 * Library of Congress rules:
 * AACR:
 * AACR2:
 * AACR2-R:

The 21st century brought renewed thinking about library cataloging, in great part based on the increase in the number of digital formats, but also because of a new consciousness of the nature of the "Work" in the bibliographic context, often attributed to the principles developed by Lubetzky. This was also supported by the work of the International Federation of Library Associations and Institutions on the Functional Requirements for Bibliographic Records (FRBR), which emphasized the role of the work in the bibliographic context. FRBR created a tiered view of the bibliographic entity from Item, Manifestation, Expression, to Work. Item refers to the physical form of the book. Manifestation refers to the publication. Expression refers to a particular realization of the book, such as a translation from another language. Work refers to the content and ideas of the book. This view was incorporated into the cataloging rules subsequent to AACR2-R, known as Resource Description and Access (RDA).
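The four FRBR tiers can be pictured as nested containers. The following is a minimal sketch only: the class layout, attribute names, and sample values are hypothetical and are not taken from FRBR or RDA themselves.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    """One physical copy, e.g. a single volume on a shelf."""
    barcode: str
    location: str


@dataclass
class Manifestation:
    """One publication of an expression."""
    publisher: str
    year: int
    items: List[Item] = field(default_factory=list)


@dataclass
class Expression:
    """One realization of a work, e.g. a translation."""
    language: str
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    """The content and ideas, independent of any publication."""
    title: str
    creator: str
    expressions: List[Expression] = field(default_factory=list)


def count_items(work: Work) -> int:
    """Count every physical copy reachable from a work."""
    return sum(
        len(manifestation.items)
        for expression in work.expressions
        for manifestation in expression.manifestations
    )


def build_sample_work() -> Work:
    """Build a made-up work with two expressions and three copies."""
    original = Expression(
        language="ger",
        manifestations=[
            Manifestation("Example Verlag", 1990, [Item("0001", "Stacks")]),
        ],
    )
    translation = Expression(
        language="eng",
        manifestations=[
            Manifestation(
                "Example Press",
                1995,
                [Item("0002", "Stacks"), Item("0003", "Reading room")],
            ),
        ],
    )
    return Work("Example work", "Example Author", [original, translation])
```

Grouping records this way lets a catalog show one entry for the work and list its translations and editions beneath it, rather than presenting each publication as an unrelated record.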

England
The Bodleian Library at Oxford University developed its cataloging code in 1674. The code emphasized authorship, and books by the same author were listed together in the catalog.

The origins of modern library cataloging practice can be traced back to the 1830s and Anthony Panizzi's 91 rules. Panizzi's singular insight was that a large catalog needed consistency in its entries if it was to serve the user. The first major English-language cataloging code was the one developed by Sir Anthony Panizzi for the British Museum catalog. Panizzi's 91 rules were approved by the British Museum in 1839 and published in 1841. The British Museum rules were revised up until 1936. The library departments of the British Museum became part of the new British Library in 1973.

Germany and Prussia
The Prussian government set standard rules called Preußische Instruktionen (PI) (Prussian Instructions) for all of its libraries in 1899.

These rules were based on the earlier Breslauer Instructionen of the University Library at Breslau by Karl Franz Otto Dziatzko.

The Prussian Instructions were a standardized system of cataloging rules. Under these rules, titles are arranged grammatically rather than mechanically, and literature is entered under its title. The rules were adopted throughout Germany, Prussia and Austria.

After the adoption of the Paris Principles (PP) in 1961, Germany developed the Regeln für die alphabetische Katalogisierung (RAK) in 1976/1977.

The goal of the Paris Principles was to serve as a basis for international standardization in cataloging. Most of the cataloging codes that were developed worldwide since that time have followed the Paris Principles.

Cataloging codes
Cataloging codes prescribe which information about a bibliographic item is included in the entry and how this information is presented to the user. They may also help in sorting the entries when printing (parts of) the catalog.

Currently, most cataloging codes are similar to, or even based on, the International Standard Bibliographic Description (ISBD), a set of rules produced by the International Federation of Library Associations and Institutions (IFLA) to describe a wide range of library materials. These rules organize the bibliographic description of an item in the following eight areas: title and statement of responsibility (author or editor), edition, material specific details (for example, the scale of a map), publication and distribution, physical description (for example, number of pages), series, notes, and standard number (ISBN). The Bibliographic Framework (BIBFRAME) is "an initiative to evolve bibliographic description standards to a linked data model, in order to make bibliographic information more useful both within and outside the library community." The most commonly used cataloging code in the English-speaking world was the Anglo-American Cataloguing Rules, 2nd edition (AACR2). AACR2 provides rules for descriptive cataloging only and does not touch upon subject cataloging. AACR2 has been translated into many languages for use around the world. The German-speaking world uses the Regeln für die alphabetische Katalogisierung (RAK), also based on ISBD. The Library of Congress implemented the transition from AACR2 to RDA in March 2013.
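The fixed ordering of the eight ISBD areas can be sketched in code. This is an illustration rather than an implementation of ISBD: the field names, the `format_isbd` helper, and the sample values are hypothetical, and the single ". -- " separator between areas is a simplified stand-in for the standard's full prescribed punctuation.

```python
from typing import Dict, List

# The eight areas, in the order given in the text above.
ISBD_AREAS: List[str] = [
    "title_and_statement_of_responsibility",
    "edition",
    "material_specific_details",
    "publication_and_distribution",
    "physical_description",
    "series",
    "notes",
    "standard_number",
]

# Simplified assumption: one separator between consecutive areas.
AREA_SEPARATOR = ". -- "


def format_isbd(record: Dict[str, str]) -> str:
    """Join the non-empty areas of a record in ISBD order."""
    unknown = set(record) - set(ISBD_AREAS)
    if unknown:
        raise ValueError(f"unknown areas: {sorted(unknown)}")
    parts = []
    for area in ISBD_AREAS:
        value = record.get(area, "").strip()
        if value:
            # Drop a trailing full stop so the separator does not double it.
            parts.append(value.rstrip("."))
    return AREA_SEPARATOR.join(parts)
```

Whatever order the cataloger enters the areas in, the output always follows the prescribed sequence, which is what makes descriptions from different libraries comparable.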

In subject databases such as Chemical Abstracts, MEDLINE and PsycINFO, the Common Communication Format (CCF) is meant to serve as a baseline standard. Different standards prevail in archives and museums, such as CIDOC-CRM. Resource Description and Access (RDA) is a recent attempt to make a standard that crosses the domains of cultural heritage institutions.

Digital formats
Most libraries currently use the MARC standards, first piloted from January 1966 to June 1968, to encode and transport bibliographic data.
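A MARC record is built from fields identified by three-digit tags, each carrying indicators and coded subfields. The sketch below models that shape in memory for illustration; the `Field` class, the helper functions, and the sample values are hypothetical, and the code does not read or write the actual MARC exchange format. The tags shown (100 for a personal-name main entry, 245 for the title statement, 650 for a topical subject) are standard MARC 21 tags.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Field:
    tag: str                          # three-digit field tag
    indicators: str                   # two characters; "#" marks a blank
    subfields: List[Tuple[str, str]]  # (subfield code, value) pairs


def render_field(field: Field) -> str:
    """Render a field in a common human-readable display form."""
    body = " ".join(f"${code} {value}" for code, value in field.subfields)
    return f"{field.tag} {field.indicators} {body}"


def find_values(record: List[Field], tag: str, code: str) -> List[str]:
    """Collect the values of one subfield code across all fields with a tag."""
    return [
        value
        for field in record
        if field.tag == tag
        for subfield_code, value in field.subfields
        if subfield_code == code
    ]


# Made-up record for illustration only.
SAMPLE_RECORD: List[Field] = [
    Field("100", "1#", [("a", "Author, Example.")]),
    Field("245", "10", [("a", "Example title /"), ("c", "Example Author.")]),
    Field("650", "#0", [("a", "Cataloging.")]),
    Field("650", "#0", [("a", "Metadata.")]),
]
```

For example, `find_values(SAMPLE_RECORD, "650", "a")` gathers the subject headings, the kind of lookup an indexing tool performs when building a subject index.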

These standards have been criticized in recent years for being old, unique to the library community, and difficult to work with computationally. In 2011 the Library of Congress developed BIBFRAME, an RDF schema for expressing bibliographic data. BIBFRAME was revised and piloted by the Library of Congress in 2017 but is still not available to the public. It will first be made available to vendors to try out, after which a hybrid form of the system (MARC and BIBFRAME) will be used until the data can be fully translated.

Library digital collections often use simpler digital formats to store their metadata. XML-based schemata, particularly Dublin Core and MODS, are typical for bibliographic data about these collections.
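As an illustration of how lightweight such a record can be, the sketch below serializes a few Dublin Core elements with Python's standard library. The namespace URI is the one defined for the Dublin Core element set; the `record` wrapper element, the helper names, and the sample values are assumptions made for this example.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"


def build_dc_record(fields: Dict[str, List[str]]) -> ET.Element:
    """Build an XML element holding repeatable Dublin Core elements."""
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("record")  # hypothetical wrapper element
    for name, values in fields.items():
        for value in values:
            child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
            child.text = value
    return root


def to_xml(root: ET.Element) -> str:
    """Serialize the record to a Unicode XML string."""
    return ET.tostring(root, encoding="unicode")


sample = build_dc_record(
    {
        "title": ["Example title"],
        "creator": ["Example Author"],
        "subject": ["Cataloging", "Metadata"],
    }
)
print(to_xml(sample))
```

Every element is optional and repeatable in this sketch, which is part of what makes such schemata simpler to produce than a full MARC record.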

Transliteration
Library items that are written in a foreign script are, in some cases, transliterated to the script of the catalog. In the United States and some other countries, catalogers typically use the ALA-LC romanization tables for this work. Without transliteration, there would need to be separate catalogs for each script.
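Table-driven transliteration can be sketched as a character-by-character lookup. The table below is a small illustrative subset of Cyrillic letters with simple one-to-one Latin equivalents; it is an assumption made for this example and is not a copy of the ALA-LC tables, which cover far more characters and rely on diacritics and multi-letter mappings.

```python
from typing import Dict

# Hypothetical sample table: a few lowercase Cyrillic letters only.
SAMPLE_TABLE: Dict[str, str] = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d",
    "и": "i", "к": "k", "л": "l", "м": "m", "н": "n",
    "о": "o", "р": "r", "с": "s", "т": "t", "у": "u",
}


def romanize(text: str, table: Dict[str, str] = SAMPLE_TABLE) -> str:
    """Replace each mapped character; leave everything else unchanged."""
    result = []
    for char in text:
        lower = char.lower()
        if lower in table:
            latin = table[lower]
            # Preserve capitalization of the source character.
            result.append(latin.capitalize() if char != lower else latin)
        else:
            result.append(char)
    return "".join(result)
```

With this table, `romanize("книга")` yields `kniga`, so the title can be filed and searched alongside Latin-script entries in a single catalog.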

Ethical issues
Ferris maintains that catalogers, in using their judgment and specialized viewpoint, uphold the integrity of the catalog and also provide "added value" to the process of bibliographic control, resulting in added findability for a library's user community. This added value also has the power to harm, resulting in the denial of access to information. Mistakes and biases in cataloging records can "stigmatize groups of people with inaccurate or demeaning labels, and create the impression that certain points of view are more normal than others".

Social responsibility in cataloging is the "fair and equitable access to relevant, appropriate, accurate, and uncensored information in a timely manner and free of bias". In order to act ethically and in a socially responsible manner, catalogers should be aware of how their judgments benefit or harm findability. They should be careful to not misuse or misrepresent information through inaccurate or minimal-level cataloging and to not purposely or inadvertently censor information.

Bair states that it is the professional obligation of catalogers to supply thorough, accurate, high-quality surrogate records for databases and that catalogers also have an ethical obligation to "contribute to the fair and equitable access to information." Bair recommends that catalogers "actively participate in the development, reform, and fair application of cataloging rules, standards, and classifications, as well as information-storage and retrieval systems". As stated by Knowlton, access points "should be what a particular type of library patron would be most likely to search under -- regardless of the notion of universal bibliographic control."

A formal code of ethics for catalogers does not exist, and thus catalogers often follow library or departmental policy to resolve conflicts in cataloging. While the American Library Association created a "Code of Ethics", Ferris notes that it has been criticized for being too general to encompass the special skills that set catalogers apart from other library and information professionals. As stated by Tavani, a code of ethics for catalogers can "inspire, guide, educate, and discipline" (as cited in Bair, 2005, p. 22). Bair suggests that an effective code of ethics for catalogers should be aspirational and also "discuss specific conduct and actions in order to serve as a guide in actual situations". Bair has also laid out the beginnings for a formal code of cataloging ethics in "Toward a Code of Ethics for Cataloging."

Criticism
Sanford Berman, former Head Cataloger of the Hennepin County Library in Minnetonka, Minnesota, has been a leading critic of biased headings in the Library of Congress Subject Headings. Berman's 1971 publication Prejudices and Antipathies: A Tract on the LC Subject Heads Concerning People (P&A) sparked the movement to correct biased subject headings. In P&A, Berman listed 225 headings with proposed alterations, additions, or deletions and cross-references to "more accurately reflect the language used in addressing these topics, to rectify errors of bias, and to better guide librarians and readers to material of interest". Berman is well known for his "care packages," mailings containing clippings and other materials in support of changes to subject headings and against racism, sexism, homophobia, and governmental secrecy, among other areas of concern.

In "Three Decades Since Prejudices and Antipathies: A Study of Changes in the Library of Congress Subject Headings," Knowlton examines ways in which the Library of Congress Subject Headings (LCSH) has changed by compiling a table of changes described in P&A, followed by the current status of the headings in question. Knowlton states that his intent for this table is to "show how many of Berman's proposed changes have been implemented" and "which areas of bias are still prevalent in LCSH." Knowlton's findings show that of the 225 headings suggested for change by Berman, only 88 (39%) have been changed exactly or very closely to his suggestions (p. 127). Another 54 (24%) have been changed in ways that only partially resolve Berman's objections and that "may leave other objectionable wording intact or introduce a different shade of bias." A further 80 headings (36%) were not changed at all according to Berman's suggestions.

Queer theory and cataloging
Building on Berman's critique of cataloging practices, queer theorists in library and information science such as Emily Drabinski, Amber Billey and K.R. Roberto have written about the implications of creating stable categorizations for gender identities. Utilizing queer theory in conjunction with library classification and cataloging requires perspectives that can present both ethically and politically sound viewpoints that support marginalized persons such as women, people of color, or members of the LGBTQ+ community. This work has resulted in the modification of RDA Rule 9.7, governing how gender is represented in record creation. At the ALA Midwinter meeting in January 2016, the controlled vocabulary for gender in RDA was abolished, allowing catalogers and libraries to describe a person's gender in whatever terms best represent that person.

Cataloging terms

 * Main entry generally refers to the first author named on the item. Additional authors are added as "added entries." In cases where no clear author is named, the title of the work is considered the main entry.
 * Authority control is a process of using a single, specific term for a person, place, or title to maintain consistency between access points within a catalog. Effective authority control prevents a user from having to search for multiple variations of a title, author, or term.
 * Cooperative cataloging refers to an approach in which libraries collaborate in the creation of bibliographic and authority records, establishing cataloging practices and utilizing systems that facilitate the use of shared records.
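
The idea of authority control described in the list above, one authorized form standing in for every variant, can be sketched as a lookup from variant forms to a single heading. The `AuthorityFile` class, its normalization rule, and the sample names are hypothetical and are not drawn from any real authority file.

```python
from typing import Dict, List, Optional


class AuthorityFile:
    """Map variant forms of a name to one authorized heading."""

    def __init__(self) -> None:
        self._variant_to_heading: Dict[str, str] = {}

    @staticmethod
    def _normalize(name: str) -> str:
        # Ignore case and collapse runs of whitespace.
        return " ".join(name.casefold().split())

    def add_heading(self, authorized: str, variants: List[str]) -> None:
        """Register an authorized heading together with its variants."""
        for form in [authorized, *variants]:
            self._variant_to_heading[self._normalize(form)] = authorized

    def resolve(self, name: str) -> Optional[str]:
        """Return the authorized heading for a name, or None if unknown."""
        return self._variant_to_heading.get(self._normalize(name))


authority = AuthorityFile()
authority.add_heading("Example, Author", ["Example, A.", "A. Example"])
```

A search for `a. example` then resolves to the same heading as `Example, Author`, so a user does not need to try each spelling separately.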