Writing system

A writing system comprises a particular set of symbols, called a script, as well as the rules by which the script represents a particular language. Writing systems can generally be classified according to how symbols function according to these rules, with the most common types being alphabets, syllabaries, and logographies. Alphabets use symbols called letters that correspond to spoken phonemes. Abjads only have letters for consonants, while pure alphabets have letters for both consonants and vowels. Abugidas use characters that correspond to consonant–vowel pairs. Syllabaries use symbols called syllabograms to represent syllables or moras. Logographies use characters that represent semantic units, such as words or morphemes.

Alphabets typically use fewer than 100 symbols, while syllabaries and logographies may use hundreds or thousands respectively. Writing systems also include punctuation to aid interpretation and encode additional meaning, including that which is communicated verbally by qualities such as rhythm, tone, pitch, accent, inflection or intonation.

The earliest writing was invented during the late 4th millennium BC. Each independently invented writing system in human history evolved from a proto-writing system. These systems used a small number of ideograms, but were not fully capable of encoding spoken language and lacked the ability to express a broad range of ideas.

Background: relationship with language
According to most contemporary definitions, writing is a visual and tactile notation representing language. The symbols used in writing correspond systematically to functional units of either a spoken or signed language. This definition excludes a broader class of symbolic markings, such as drawings and maps. The relationship between writing and language more broadly has been the subject of philosophical analysis since at least Aristotle (384–322 BC). While the use of language is universal across human societies, writing is not: it emerged much more recently and has been independently invented in only a handful of locations throughout history. In the first several decades of modern linguistics as a scientific discipline, linguists often characterized writing as merely the technology used to record speech. Speech itself was treated as being of paramount importance, because its study was seen as having unique potential to further the understanding of human cognition.

A text is any instance of written material, including transcriptions of spoken material. The act of composing and recording a text may be referred to as writing, and the act of viewing and interpreting the text as reading. All writing systems require a set of defined base elements, individually termed signs or graphemes and collectively called a script. The orthography of the writing system is the set of rules and conventions understood and shared by a community, which assigns meaning to the ordering of and relationship between the graphemes. The orthography represents the constructions of at least one language, generally a spoken one. Once established, conventions of written language generally change more slowly than those of spoken language. As such, writing often preserves features after they cease to appear in speech.

The exact relationship between writing systems and languages can be complex. A single language (e.g. Hindustani) can be written using multiple writing systems, and a writing system can also represent multiple languages. For example, Chinese characters represent multiple spoken languages within China, and also were the early writing system for the Vietnamese language until Vietnamese switched to the Latin script.

General terminology
Orthography (lit. 'correct writing') refers to the structural method and rules of writing, and, particularly for alphabets, includes the concept of spelling.

Grapheme and phoneme
A grapheme is a specific base unit of a writing system. Graphemes are the minimally significant elements which taken together comprise the set of symbols from which texts may be constructed. The concept of the grapheme is similar to that of the phoneme used in the study of spoken languages. For example, in English orthography, graphemes include the uppercase and lowercase forms of the 26 letters of the Latin alphabet (with these graphemes corresponding to various phonemes), punctuation marks (mostly non-phonemic), and a handful of other symbols, such as numerals.

An individual grapheme may be represented in a wide variety of ways, where each variation is visually distinct in some regard, but all are interpreted as representing the "same" grapheme. These individual variations are known as allographs of a grapheme. For example, the lowercase letter a has different allographs when written in a cursive, block, or typed style. The choice of a particular allograph may be influenced by the medium used, the writing instrument used, the stylistic choice of the writer, the preceding and succeeding graphemes in the text, the time available for writing, the intended audience, and the largely unconscious features of an individual's handwriting.

Glyph, sign and character
The terms glyph, sign and character are sometimes used to refer to graphemes. Glyphs in linear writing systems are made up of lines or strokes. Linear writing is most common, but there are non-linear writing systems where glyphs consist of other types of marks, such as in cuneiform and Braille.

Complete and partial systems
Writing systems may be regarded as complete if they are able to represent all that may be expressed in the spoken language, while a partial writing system cannot represent the spoken language in its entirety.

Proto-writing
Writing systems were preceded by proto-writing systems consisting of ideograms and early mnemonic symbols. The best-known examples are:
 * A clay token system used for accounting purposes in Mesopotamia (c. 9000 BC)
 * Jiahu symbols (c. 6600 BC)
 * Vinča symbols (c. 5300 BC)
 * Proto-cuneiform (c. 3500 BC)
 * Indus script (c. 3500 BC)
 * Nsibidi script (before 500 AD)

Invention of writing
Writing has been invented independently multiple times in human history. The invention of the first writing systems is roughly contemporary with the beginning of the Bronze Age in the late 4th millennium BC. The archaic cuneiform script used to write Sumerian is generally considered to be the earliest true writing system, closely followed by the Egyptian hieroglyphs. Both evolved from proto-writing systems between 3400 and 3200 BC, with the earliest coherent texts dated c. 2600 BC. It is generally agreed that the two systems were invented independently from one another. Chinese characters emerged independently in the Yellow River valley c. 1200 BC. There is no evidence of contact between China and the literate peoples of the Near East, and the Mesopotamian and Chinese approaches for representing aspects of sound and meaning are distinct. The Mesoamerican writing systems, including Olmec and the Maya script, were also invented independently.

The first known consonantal alphabetic writing appeared before 2000 BC, and was used to write a Semitic language spoken in the Sinai Peninsula. Most of the world's alphabets either descend directly from this Proto-Sinaitic script, or were directly inspired by its design. Descendants include the Phoenician alphabet (c. 1050 BC) and, in turn, its descendant the Greek alphabet (c. 800 BC). The Latin alphabet, which descended from the Greek alphabet, is by far the most common script used by writing systems.

Classification by basic linguistic unit
Several approaches have been taken to classify writing systems, the most common and basic one being a broad division into three categories: logographic, syllabic, and alphabetic (or segmental). Logographies use characters that represent semantic units, such as words or morphemes. Syllabaries use symbols called syllabograms to represent syllables or moras. Alphabets use symbols called letters that correspond to spoken phonemes. Alphabets are divided into three types: abjads have letters only for consonants, pure alphabets have letters for both consonants and vowels, and abugidas use characters that correspond to consonant–vowel pairs. David Diringer proposed a classification of five types of writing systems: pictographic script, ideographic script, analytic transitional script, phonetic script, and alphabetic script.

Logographic systems
A logogram is a character that represents a morpheme within a language. Chinese characters are the only major logographic writing system still in use, having been used to write the varieties of Chinese, as well as Japanese, Korean, Vietnamese, and other languages of the Sinosphere. As each character represents a single unit of meaning, many different logograms are required in order to write all the words of a language. If the logograms do not adequately represent all meanings and words of a language, written language can be confusing or ambiguous to the reader. The vast array of logograms and the need to remember what they all mean are considered by many as major disadvantages of logographic systems compared to alphabetic systems.

Since the meaning is inherent to the symbol, the same logographic system could theoretically be used to write different spoken languages. In practice, the ability to communicate across languages works best in closely related languages, like the varieties of Chinese, and works only to a lesser extent for less closely related languages, as differences in syntax reduce the cross-linguistic portability of a given logographic system. For example, the Japanese writing system uses Chinese characters (known as kanji) extensively, with most having similar meanings as in Chinese. As a result, short and concise phrases written in Chinese such as those on signs and in newspaper headlines are often easy for a Japanese reader to comprehend. However, the grammatical differences between Japanese and Chinese are large enough that a long Chinese text is not readily understandable to a Japanese reader without knowledge of Chinese. Similarly, a Chinese reader can get a general idea of what a long Japanese kanji text means, but usually cannot understand the text fully.

While most languages do not use logographic writing systems, many systems include a few logograms. A good example of modern western logograms is the Arabic numerals: readers across many different languages understand what ⟨1⟩ means whether they read it as one, ehad, uno, or ichi. Other logograms include the ampersand ⟨&⟩, the at sign ⟨@⟩, the percent sign ⟨%⟩, and the many signs representing units of currency.

Logograms are sometimes conflated with ideograms, symbols which graphically represent abstract ideas. Most linguists reject this characterization of historically attested writing: Chinese characters, for example, are often semantic–phonetic compounds, symbols which include an element that represents the meaning and a phonetic complement element that represents the pronunciation. Some non-linguists distinguish between lexigraphy and ideography, where symbols in lexigraphies represent words and symbols in ideographies represent morphemes.

Syllabaries
A syllabary is a set of written symbols that represent syllables, which make up words. A symbol in a syllabary typically represents a consonant sound followed by a vowel sound, or just a vowel alone. Syllabaries are best suited to languages with relatively simple syllable structure, since a different symbol is needed for every syllable. Japanese, for example, contains about 100 syllables, which are represented by syllabic hiragana. By contrast, English features complex syllable structures with a relatively large inventory of vowels and complex consonant clusters, making for a total of 15,000–16,000 distinct syllables. Some syllabaries have larger inventories: the Yi script contains 756 different symbols.

In a true syllabary, there is no systematic graphic similarity between phonetically related syllabograms. That is, the characters for syllables such as ka, ke, and ko have no similarity to indicate their common "k" sound (voiceless velar plosive). Some more recently created writing systems such as the Cree syllabary are not true syllabaries, but instead use related symbols for phonetically similar syllables. Other true syllabaries include Linear B and the Cherokee script.

Alphabets
An alphabet is a small set of letters (basic written symbols), each of which roughly represents or represented historically a segmental phoneme of a spoken language. The word alphabet is derived from alpha and beta, the first two symbols of the Greek alphabet.

An abjad is an alphabet whose letters only represent the consonantal sounds of a language. Abjads were the first alphabets to develop historically; most have been used to write Semitic languages, and they originally derive from the Middle Bronze Age alphabets. The morphology of Semitic languages is such that the denotation of vowels is generally redundant. Optional markings for vowels may be used for some abjads, but are generally limited to applications like education. Many pure alphabets were derived from abjads through the addition of dedicated vowel letters, as with the derivation of the Greek alphabet from the Phoenician alphabet c. 800 BC. Abjad is the word for "alphabet" in Arabic and Malay: the term derives from the traditional order of the Arabic alphabet's letters (ʾalif, bāʾ, jīm, dāl), though the word may have earlier roots in Phoenician or Ugaritic.

An abugida is an alphabetic writing system whose basic signs denote consonants with an inherent vowel and where consistent modifications of the basic sign indicate other following vowels than the inherent one. In an abugida, there may be a sign for k with no vowel, but also one for ka (if a is the inherent vowel), and ke is written by modifying the ka sign in a consistent way with how la would be modified to get le. In many abugidas, modification consists of the addition of a vowel sign; other possibilities include rotation of the basic sign, or addition of diacritics.
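The consistent modification described above can be illustrated with Devanagari, a Brahmic script, as it is encoded in Unicode. The following is a minimal sketch, not a description of how any particular text system handles abugidas; the helper name `with_vowel` is hypothetical, and the choice of Devanagari is an assumption made for illustration.

```python
import unicodedata

KA = "\u0915"      # DEVANAGARI LETTER KA: consonant k with inherent vowel a
LA = "\u0932"      # DEVANAGARI LETTER LA: consonant l with inherent vowel a
SIGN_E = "\u0947"  # DEVANAGARI VOWEL SIGN E: overrides the inherent vowel
VIRAMA = "\u094D"  # DEVANAGARI SIGN VIRAMA: marks a consonant with no vowel


def with_vowel(base: str, vowel_sign: str) -> str:
    """Attach a dependent vowel sign to a consonant letter (hypothetical helper)."""
    return base + vowel_sign


ke = with_vowel(KA, SIGN_E)  # ka modified to ke
le = with_vowel(LA, SIGN_E)  # la modified to le in the same way
bare_k = KA + VIRAMA         # k with the inherent vowel suppressed

for label, text in [("ka", KA), ("ke", ke), ("le", le), ("k", bare_k)]:
    # Print each written form with the Unicode names of its code points.
    print(label, text, [unicodedata.name(ch) for ch in text])
```

The same vowel sign is appended to both base letters, which mirrors the point that ke relates to ka in the same way that le relates to la.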

While true syllabaries have one symbol per syllable and no systematic visual similarity, the graphic similarity in most abugidas stems from their origins as abjads, with the added symbols consisting of markings for different vowels attached to a pre-existing base symbol. The largest single group of abugidas is the Brahmic family of scripts, which includes nearly all the scripts used in India and Southeast Asia. The name abugida is derived from the first four characters of an order of the Ge'ez script used in some contexts. It was borrowed from Ethiopian languages as a linguistic term by Peter T. Daniels.

Featural systems
A featural script represents finer detail than an alphabet. Here symbols do not represent whole phonemes, but rather the elements (features) that make up the phonemes, such as voicing or its place of articulation. Theoretically, each feature could be written with a separate letter; and more systems could be featural, but the only prominent featural system is Korean hangul, where featural symbols are combined into alphabetic letters, and these letters are in turn joined into syllabic blocks, so that the system combines three levels of phonological representation.
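The joining of hangul letters into syllabic blocks can be sketched with the arithmetic that Unicode uses for precomposed Hangul syllables. This is only an illustration of the letter-to-block level: the featural level (strokes within a letter that signal voicing or place of articulation) is not represented in the encoding. The function name `compose_block` is hypothetical.

```python
import unicodedata

# Constants from the Unicode Hangul syllable composition arithmetic.
S_BASE = 0xAC00  # first precomposed syllable block
L_BASE = 0x1100  # first leading consonant (choseong)
V_BASE = 0x1161  # first vowel (jungseong)
T_BASE = 0x11A7  # one below the first trailing consonant (jongseong)
V_COUNT, T_COUNT = 21, 28


def compose_block(lead: str, vowel: str, tail: str = "") -> str:
    """Join conjoining hangul letters into one syllabic block (hypothetical helper)."""
    l_index = ord(lead) - L_BASE
    v_index = ord(vowel) - V_BASE
    t_index = ord(tail) - T_BASE if tail else 0  # 0 means no trailing consonant
    return chr(S_BASE + (l_index * V_COUNT + v_index) * T_COUNT + t_index)


# h + a + n joined into the single block for "han"
han = compose_block("\u1112", "\u1161", "\u11AB")
print(han)

# Canonical decomposition (NFD) recovers the three letters from the block.
letters = unicodedata.normalize("NFD", han)
print([unicodedata.name(ch) for ch in letters])
```

Normalizing to NFD reverses the composition, which shows that each block is fully determined by its component letters.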

Many scholars, e.g. John DeFrancis, reject this class or at least the labeling of hangul as such. The Korean script is a conscious script creation by literate experts, an instance of what Daniels calls a "sophisticated grammatogeny". Other such creations include stenographies and constructed scripts of hobbyists and fiction writers (such as Tengwar), many of which feature advanced graphic designs corresponding to phonological properties. The basic unit of writing in these systems can map to anything from phonemes to words. It has been shown that even the Latin script has sub-character features.

Classification by graphical properties
Perhaps the primary graphic distinction made in classifications is that of linearity. Linear writing systems are those in which the characters are composed of lines, such as the Latin alphabet and Chinese characters. Chinese characters are considered linear whether they are written with a ball-point pen or a calligraphic brush, or cast in bronze. Similarly, Egyptian hieroglyphs and Maya script were often painted in linear outline form, but in formal contexts they were carved in bas-relief. The earliest examples of writing are linear: while cuneiform was not linear, its Sumerian ancestors were. Non-linear systems are not composed of lines, no matter what instrument is used to write them. Cuneiform was likely the earliest non-linear writing. Its glyphs were formed by pressing the end of a reed stylus into moist clay, not by tracing lines in the clay with the stylus as had been done previously. The result was a radical transformation of the appearance of the script.

Braille is a non-linear adaptation of the Latin alphabet that completely abandoned the Latin forms. The letters are composed of raised bumps on the writing substrate, which can be leather, stiff paper, plastic or metal. There are also transient non-linear adaptations of the Latin alphabet, including Morse code, the manual alphabets of various sign languages, and semaphore, in which flags or bars are positioned at prescribed angles. However, if "writing" is defined as a potentially permanent means of recording information, then these systems do not qualify as writing at all, since the symbols disappear as soon as they are used. Instead, these transient systems serve as signals.
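The way Braille replaces Latin letter shapes with patterns of raised dots can be sketched using the Unicode Braille Patterns block, where each of the dots in a cell corresponds to one bit. The sketch below covers only the letters a to e, and the names `DOTS`, `cell`, and `to_braille` are hypothetical; it is an illustration of the dot-pattern idea rather than a complete Braille transcriber.

```python
BRAILLE_BASE = 0x2800  # Unicode code point of the empty Braille cell

# Raised dot numbers for a few letters (dots 1-3 run down the left column,
# dots 4-6 down the right column of the cell).
DOTS = {
    "a": (1,),
    "b": (1, 2),
    "c": (1, 4),
    "d": (1, 4, 5),
    "e": (1, 5),
}


def cell(dots: tuple) -> str:
    """Build the Braille pattern character with the given dots raised."""
    mask = 0
    for dot in dots:
        mask |= 1 << (dot - 1)  # dot n is stored in bit n-1
    return chr(BRAILLE_BASE + mask)


def to_braille(text: str) -> str:
    """Transcribe a string made only of the letters listed in DOTS."""
    return "".join(cell(DOTS[ch]) for ch in text)


print(to_braille("bead"))
```

None of the resulting cells resembles the Latin letter it stands for, which is the sense in which Braille abandons the Latin forms while keeping the alphabet.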

Directionality
Scripts are graphically characterized by the direction in which they are written. Egyptian hieroglyphs were written either left to right or right to left, with the animal and human glyphs turned to face the beginning of the line. The early alphabet could be written in multiple directions: horizontally from side to side, or vertically. Prior to standardization, alphabetical writing was done both left-to-right (LTR) and right-to-left (RTL). It was most commonly written boustrophedonically: starting in one (horizontal) direction, then turning at the end of the line and reversing direction.

The Greek alphabet and its successors settled on a left-to-right pattern, from the top to the bottom of the page. Other scripts, such as Arabic and Hebrew, came to be written right-to-left. Scripts that historically incorporate Chinese characters have traditionally been written, on the character-level, vertically (top-to-bottom), from the right to the left of the page. Nowadays they are frequently written left-to-right, top-to-bottom, due to Western influence, a growing need to accommodate terms in the Latin script, and technical limitations in popular electronic document formats. This layout also matches the order in which strokes are written within every character, which is predominantly from top to bottom and left to right.
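The horizontal direction associated with a script is recorded in Unicode as a bidirectional class on each character, which Python exposes through `unicodedata.bidirectional`. The sketch below is a rough illustration under that assumption; the function name `base_direction` and the sample characters are chosen for this example, and the bidirectional class says nothing about vertical layout.

```python
import unicodedata

SAMPLES = {
    "Latin A": "A",
    "Greek alpha": "\u03B1",
    "Hebrew alef": "\u05D0",
    "Arabic alef": "\u0627",
    "Chinese zhong": "\u4E2D",
}


def base_direction(ch: str) -> str:
    """Map a character's Unicode bidirectional class to a horizontal direction."""
    bidi_class = unicodedata.bidirectional(ch)
    if bidi_class in ("R", "AL"):  # right-to-left, including Arabic letters
        return "right-to-left"
    if bidi_class == "L":
        return "left-to-right"
    return "neutral or weak"       # digits, punctuation, spaces, etc.


for label, ch in SAMPLES.items():
    print(f"{label}: {base_direction(ch)}")
```

Chinese characters are classed as left-to-right here because the property describes only horizontal ordering; traditional vertical writing is handled separately by layout software.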

Several scripts used in the Philippines and Indonesia, such as Hanunó'o, are traditionally written with lines moving away from the writer, from bottom to top, but are read horizontally left to right; however, Kulitan, another Philippine script, is written top to bottom and right to left. Ogham is written bottom to top and read vertically, commonly on the corner of a stone. The ancient Libyco-Berber alphabet was also written from bottom to top.

Left-to-right writing has the advantage that, since most people are right-handed, the hand does not interfere with the just-written text—which might not yet have dried—since the hand is on the right side of the pen. Right-to-left writing, by contrast, may have been advantageous back when writing was done with hammer and chisel; the scribe would hold the hammer in their right hand and chisel in their left, and going right-to-left would mean the hammer was less likely to hit the left hand, as the right hand had more control.