Language documentation

Language documentation (also: documentary linguistics) is a subfield of linguistics which aims to describe the grammar and use of human languages. It aims to provide a comprehensive record of the linguistic practices characteristic of a given speech community, both for posterity and for language revitalization. This record can be public or private depending on the needs of the community and the purpose of the documentation. In practice, language documentation can range from solo linguistic anthropological fieldwork to the creation of vast online archives that contain dozens of different languages, such as FirstVoices or OLAC.

Language documentation provides a firmer foundation for linguistic analysis in that it creates a corpus of materials in the language. The materials in question can range from vocabulary lists and grammar rules to children's books and translated works. These materials can then support claims about the structure of the language and its usage. Documentation can thus be seen as a basic taxonomic task for linguistics: identifying the range of languages and their characteristics.

Methods
Typical steps involve recording, maintaining metadata, transcribing (often using the International Phonetic Alphabet and/or a "practical orthography" devised for that language), annotation and analysis, translation into a language of wider communication, archiving, and dissemination. Creating good records in the course of language description is critical. The materials can be archived, but not all archives are equally adept at handling language materials preserved in varying technological formats, and not all are equally accessible to potential users.
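The steps above can be pictured as fields on a single session record. The following Python sketch is illustrative only: the class names, field names, and the `missing_steps` check are assumptions made for this example, not a standard schema used by any particular archive.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Annotation:
    """One time-aligned segment of a recording (hypothetical layout)."""
    start_s: float
    end_s: float
    transcription: str  # IPA or a practical orthography
    translation: str    # rendering in a language of wider communication


@dataclass
class Session:
    """Metadata and annotations for one recording session (hypothetical)."""
    session_id: str
    language_code: str          # e.g. an ISO 639-3 code
    recorded_on: str            # ISO 8601 date string
    media_file: str
    access: str = "restricted"  # public or private, as the community decides
    annotations: List[Annotation] = field(default_factory=list)

    def missing_steps(self) -> List[str]:
        """List workflow steps not yet completed for this session."""
        missing = []
        if not self.annotations or any(
            not a.transcription.strip() for a in self.annotations
        ):
            missing.append("transcription")
        if not self.annotations or any(
            not a.translation.strip() for a in self.annotations
        ):
            missing.append("translation")
        return missing

    def to_json(self) -> str:
        """Serialize the record, keeping non-ASCII characters such as IPA."""
        return json.dumps(asdict(self), ensure_ascii=False, indent=2)
```

A record like this keeps the metadata and the transcript together with a pointer to the recording, so that a deposit can be checked for completeness before it is archived.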

Language documentation complements language description, which aims to describe a language's abstract system of structures and rules in the form of a grammar or dictionary. By practising good documentation in the form of recordings with transcripts, followed by collections of texts and a dictionary, a linguist grounds the analysis in verifiable data and can provide materials for use by speakers of the language. New technologies permit better recordings with better descriptions, which can be housed in digital archives such as AILLA, Pangloss, or PARADISEC. These resources can then be made available to the speakers. The first example of a grammar with a media corpus is Thieberger's grammar of South Efate (2006).

Language documentation has also given rise to new specialized publications, such as the free, online, peer-reviewed journal Language Documentation & Conservation and the SOAS working papers Language Documentation & Description.

Digital language archives
The digitization of archives is a critical component of language documentation and revitalization projects. Descriptive records of local languages that could be put to use in revitalization projects are often overlooked because of obsolete formats, incomplete hard-copy records, or systematic inaccessibility. Local archives in particular, which may hold vital records of an area's indigenous languages, are chronically underfunded and understaffed. Historic language records collected by non-linguists such as missionaries can be overlooked if the collection is not digitized. Physical archives are also more vulnerable to damage and information loss.

Teaching with documentation
Language documentation can be beneficial to individuals who would like to teach or learn an endangered language. If a language has limited documentation, this also limits how it can be used in a language revitalization context. Teaching with documentation and linguists' field notes can provide more context for those teaching the language and can add information they were not aware of. Documentation can be useful for understanding culture and heritage, as well as for learning the language. Important components of language teaching include listening, reading, speaking, writing, and cultural components. Documentation provides resources for developing these skills. For example, the Kaurna language was revitalized through written resources. These written documents were the only available resource and were used to reintroduce the language, partly through teaching, which included the creation of a teaching guide for Kaurna. Language documentation and teaching are closely related: if there are no fluent speakers of a language, documentation can be used as a teaching resource.

Types
Language description, as a task within linguistics, may be divided into separate areas of specialization:
 * Phonetics, the study of the sounds of human language
 * Phonology, the study of the sound system of a language
 * Morphology, the study of the internal structure of words
 * Syntax, the study of how words combine to form grammatical sentences
 * Semantics, the study of the meaning of words (lexical semantics), and how these combine to form the meanings of sentences
 * Historical linguistics, the study of languages whose historical relations are recognizable through similarities in vocabulary, word formation, and syntax
 * Pragmatics, the study of how language is used by its speakers
 * Stylistics, the study of style in languages
 * Paremiography, the collection of proverbs and sayings

Related research areas

 * Linguistic description
 * Orthography, the study of writing systems
 * Lexicography, the study and practice of making dictionaries
 * Phonology, the study of the sound system of a language
 * Etymology, the study of the origin and history of words
 * Anthropological linguistics

Organizations

 * DoBeS
 * First Peoples' Heritage, Language and Culture Council
 * LACITO and the Pangloss Collection
 * The Language Conservancy
 * PARADISEC Archive
 * The Endangered Languages Archive (ELAR)
 * Resource Network for Linguistic Diversity
 * SIL International
 * Western Institute for Endangered Language Documentation (WIELD)
 * World Oral Literature Project, Voices of Vanishing Worlds