History of the Arabic alphabet

The Arabic alphabet is thought to derive from the Nabataean variant of the Aramaic alphabet, which descended from the Phoenician alphabet. The Phoenician alphabet also gave rise to, among others, the Hebrew and Greek alphabets, the latter being in turn the basis of the Latin and Cyrillic alphabets.

Origins
The Arabic alphabet evolved either from the Nabataean, or (less widely believed) directly from the Syriac. The table below shows changes undergone by the shapes of the letters from the Aramaic original to the Nabataean and Syriac forms. The Arabic script shown is that of post-Classical and Modern Arabic—notably different from 6th century Arabic script. (Arabic is placed in the middle for clarity and not to mark a time order of evolution.)

It seems that the Nabataean alphabet became the Arabic alphabet thus:


 * In the 6th and 5th centuries BCE, northern Arab tribes emigrated and founded a kingdom centred around Petra, Jordan. These people (now named Nabataeans from the name of one of the tribes, Nabatu) spoke Nabataean Arabic, a dialect of Arabic.
 * In the 2nd or 1st century BCE, the first known records in the Nabataean alphabet were written in Aramaic (the language of communication and trade) but included some Arabic features: the Nabataeans did not write the language they spoke. They wrote in a form of the Aramaic alphabet, which continued to evolve and separated into two forms: one intended for inscriptions (known as "monumental Nabataean") and another, more cursive and hurriedly written and with joined letters, for writing on papyrus. This cursive form increasingly influenced the monumental form and gradually changed into the Arabic alphabet.
 * Laïla Nehmé has demonstrated a transition from the Nabataean Aramaic script to a recognisably Arabic form, which appears to have occurred between the third and fifth centuries CE and replaced the indigenous Arabic alphabet.

Pre-Islamic Arabic inscriptions
The earliest known text in the Arabic alphabet is the Zabad inscription, composed in 512. It is a trilingual dedication in Greek, Syriac and Arabic found at the village of Zabad in northwestern Syria. The version of the Arabic alphabet used includes only 21 letters, of which only 15 have distinct shapes, used to write 28 phonemes.

Many thousands of pre-Classical Arabic inscriptions are attested, in alphabets borrowed from Epigraphic South Arabian alphabets (however, Safaitic and Hismaic are not strictly Arabic, but Ancient North Arabian dialects, and written Nabataean is an Aramaic dialect):
 * Safaitic (over 13,000; almost all graffiti)
 * Hismaic in the southern parts of central Arabia
 * Preclassical Arabic inscriptions dating to the 1st century BCE from Qaryat Al-Faw
 * Nabataean inscriptions in Aramaic, written in the Nabataean alphabet
 * Pre-Islamic Arabic inscriptions in the Arabic alphabet are very few, with only 5 known for certain. They mostly use no dots, making them sometimes difficult to interpret, as many letters are the same shape as other letters (they are written with rasm only)

Below are descriptions of inscriptions found in the Arabic alphabet, and the inscriptions found in the Nabataean alphabet that show the beginnings of Arabic-like features. Cursive Nabataean writing changed into Arabic writing, likeliest between the dates of the an-Namāra inscription and the Jabal Ramm inscription. Most writing would have been on perishable materials, such as papyrus. As it was cursive, it was liable to change. The epigraphic record is extremely sparse, with only five certainly pre-Islamic Arabic inscriptions surviving, though some others may be pre-Islamic.

Phonemes / letters inventory
The Nabataean alphabet was designed to write 22 phonemes, but Arabic has 28 consonant phonemes; thus, when used to write the Arabic language, 6 of its letters must each represent two phonemes:
 * d also represented ð,
 * ħ also represented kh %,
 * ṭ also represented ẓ,
 * ʕ also represented gh %,
 * ṣ also represented ḍ %,
 * t also represented θ.

In the cases marked %, the choice was influenced by etymology, as common Semitic kh and gh became Hebrew ħ and ʕ respectively.

As cursive Nabataean writing evolved into Arabic writing, the writing became largely joined-up. Some of the letters became the same shape as other letters, producing more ambiguities, as in the table:



Here the Arabic letters are listed in the traditional Levantine order but are written in their current forms, for simplicity. The letters which are the same shape have coloured backgrounds. The second value of the letters that represent more than one phoneme is after a comma. In these tables, ǧ is j as in English "June". In the Arabic language, the g sound seems to have changed into j in fairly late pre-Islamic times, but this seems not to have happened in those tribes who invaded Egypt and settled there.

When a letter was at the end of a word, it often developed an end loop, and as a result most Arabic letters have two or more shapes.
 * b and n and t became the same.
 * y became the same as b and n and t except at the ends of words.
 * j and ħ became the same.
 * z and r became the same.
 * s and sh became the same.
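
The mergers listed above can be illustrated with a small sketch. It assumes modern Unicode letters stand in for the early letter shapes, and it folds in the pairs that already shared a single Nabataean letter. The class labels (`B`, `J` and so on) and the function name `rasm_skeleton` are hypothetical, chosen only for this illustration, and the treatment of word-final forms is simplified to the single rule given for y.

```python
# Illustrative shape classes; the labels are arbitrary, not a scholarly notation.
SHAPE_CLASS = {
    "ب": "B", "ت": "B", "ث": "B", "ن": "B",  # b, t, th, n
    "ج": "J", "ح": "J", "خ": "J",            # j, ħ, kh
    "د": "D", "ذ": "D",                      # d, dh
    "ر": "R", "ز": "R",                      # r, z
    "س": "S", "ش": "S",                      # s, sh
    "ص": "C", "ض": "C",                      # ṣ, ḍ
    "ط": "T", "ظ": "T",                      # ṭ, ẓ
    "ع": "E", "غ": "E",                      # ʕ, gh
}

YA = "ي"


def rasm_skeleton(word: str) -> str:
    """Replace each letter by its shape class; other letters pass through unchanged."""
    out = []
    last = len(word) - 1
    for i, ch in enumerate(word):
        if ch == YA:
            # y shares the b/t/n shape except at the end of a word.
            out.append("Y" if i == last else "B")
        else:
            out.append(SHAPE_CLASS.get(ch, ch))
    return "".join(out)


# Three different words collapse to the same undotted skeleton.
print(rasm_skeleton("بيت"))  # BBB  ("house")
print(rasm_skeleton("بنت"))  # BBB  ("girl")
print(rasm_skeleton("نبت"))  # BBB  ("plant")
```

A reader of undotted text therefore has to choose among such candidates from context alone.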

After all this, there were only 17 letters that were different in shape. One letter-shape represented 5 phonemes (b t th n and sometimes y), one represented 3 phonemes (j ħ kh), and 5 each represented 2 phonemes. Compare the Hebrew alphabet, as in the table:


Early Islamic changes
The Arabic alphabet is first attested in its classical form in the 7th century. See PERF 558 for the first surviving Islamic Arabic writing.

The Quran was transcribed in Kufic script at first, which was then developed along with the Meccan and Medinan scripts, according to Ibn an-Nadim in Al-Fihrist.

In the 7th century, probably in the early years of Islam while the Qur'an was being written down, scribes realized that working out from context which letter an ambiguous shape stood for was laborious and not always possible, so a proper remedy was required. Writings in the Nabataean and Syriac alphabets already showed sporadic use of dots to distinguish letters that had become identical. By analogy, a system of dots was added to the Arabic alphabet to provide enough distinct letters for Classical Arabic's 28 phonemes. Sometimes the resulting new letters were placed in alphabetical order after their undotted originals, and sometimes at the end. The first surviving document that definitely uses these dots is also the first surviving Arabic papyrus (PERF 558), dated April 643. The dots did not become obligatory until much later. Important texts like the Qur'an were frequently memorized; this practice, which survives today, probably arose partly to avoid the great ambiguity of the script and partly because books were scarce at a time when printing was unknown in the area and every copy of every book had to be written by hand.

The alphabet then had 28 letters, and so could be used to write the numbers 1 to 10, then 20 to 100, then 200 to 900, then 1000 (see Abjad numerals). In this numerical order, the new letters were put at the end of the alphabet, producing this order: alif (1), b (2), j (3), d (4), h (5), w (6), z (7), H (8), T (9), y (10), k (20), l (30), m (40), n (50), s (60), ayn (70), f (80), S (90), q (100), r (200), sh (300), t (400), th (500), kh (600), dh (700), D (800), Z (900), gh (1000).
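
The numbering scheme can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it uses the standard Abjad values (in which kh is 600 and dh is 700), the names `ABJAD_ORDER`, `ABJAD` and `abjad_value` are hypothetical, and any character outside the 28 letters is simply ignored.

```python
# The 28 letters in Abjad (numeric) order, written with modern Unicode letters.
ABJAD_ORDER = "ا ب ج د ه و ز ح ط ي ك ل م ن س ع ف ص ق ر ش ت ث خ ذ ض ظ غ".split()

# Units 1-9, tens 10-90, hundreds 100-900, then 1000.
VALUES = [m * 10**p for p in range(3) for m in range(1, 10)] + [1000]

# Pair each letter with its value by position in the order.
ABJAD = dict(zip(ABJAD_ORDER, VALUES))


def abjad_value(word: str) -> int:
    """Sum the values of the letters in a word (assumption: other characters count as 0)."""
    return sum(ABJAD.get(ch, 0) for ch in word)


print(ABJAD["ق"], ABJAD["غ"])  # 100 1000
print(abjad_value("الله"))      # 66  (1 + 30 + 30 + 5)
```

Because each value is assigned by position, sorting the letters by value reproduces the Abjad order.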

The lack of vowel signs in Arabic writing created more ambiguities: for example, in Classical Arabic ktb could be kataba = "he wrote", kutiba = "it was written" or kutub = "books". Later, vowel signs and hamzas were added, beginning some time in the last half of the 7th century, at about the same time as the first invention of Syriac and Hebrew vocalization. Initially, this was done using a system of red dots, said in traditional accounts to have been commissioned by Hajjaj ibn Yusuf, the Umayyad governor of Iraq: a dot above = a, a dot below = i, a dot on the line = u, and doubled dots giving tanwin. However, this was cumbersome and easily confused with the letter-distinguishing dots, so about 100 years later, the modern system was adopted. The system was finalized around 786 by al-Farahidi.
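
The ktb ambiguity can be reproduced with modern Unicode text, where the vowel signs are encoded as separate combining marks. The sketch below is an illustration under that assumption and says nothing about how early manuscripts were written; the helper name `strip_marks` is hypothetical.

```python
import unicodedata


def strip_marks(text: str) -> str:
    """Drop nonspacing marks (Unicode category Mn), which include the Arabic vowel signs."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Mn")


# Letters and vowel signs given by code point to avoid any rendering ambiguity.
KAF, TA, BA = "\u0643", "\u062A", "\u0628"
FATHA, DAMMA, KASRA = "\u064E", "\u064F", "\u0650"  # a, u, i

kataba = KAF + FATHA + TA + FATHA + BA + FATHA  # "he wrote"
kutiba = KAF + DAMMA + TA + KASRA + BA + FATHA  # "it was written"
kutub = KAF + DAMMA + TA + DAMMA + BA           # "books"

# Without vowel signs, all three readings share one consonantal spelling.
skeletons = {strip_marks(w) for w in (kataba, kutiba, kutub)}
print(len(skeletons))  # 1
```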

All administrative texts had previously been recorded by Persian scribes in Middle Persian using the Pahlavi script, and many of the initial orthographic alterations to the Arabic alphabet may have been proposed and implemented by these same scribes.

When new signs were added to the Arabic alphabet, they took the alphabetical order value of the letter which they were an alternative for: tā' marbūta (see also below) took the value of ordinary t, and not of h. In the same way, the many diacritics do not have any value: for example, a doubled consonant indicated by shadda does not count as a letter separate from the single one.

Some features of the Arabic alphabet arose because of differences between Qur'anic spelling and the form of Classical Arabic that was phonemically and orthographically standardized later. These include:
 * tā' marbūta: This arose because, in many dialects, the -at ending of feminine nouns (tā' marbūta) was lenited over time and was often pronounced as -ah and written as h. This pronunciation eventually became standard, and so to avoid altering Quranic spelling, the dots of t were written over the h.
 * y (alif maksura ى) used to spell ā at the ends of some words: This arose because ā resulting from a contraction in which a single y dropped out between vowels was, in some dialects, pronounced at the ends of words with the tongue further forward than for other ā vowels; as a result, in the Qur'an it was written as y.
 * ā not written as alif in some words: The Arabic spelling of Allāh was decided before the Arabs started using alif to spell ā. In other cases (for example the first ā in hāðā = "this"), it may be that some dialects pronounced those vowels short.
 * hamza: Originally alif was used to spell the glottal stop. But Meccans did not pronounce the glottal stop, replacing it with w, y or nothing, lengthening an adjacent vowel, or, intervocalically, dropping the glottal stop and contracting the vowels. Thus, Arabic grammarians invented the hamza diacritic sign and used it to mark the glottal stop.

Reorganization of the alphabet
Less than a century later, Arab grammarians reorganized the alphabet for teaching purposes, putting letters next to others of nearly the same shape. This produced a new order that differed from the numeric order, which became less important over time as it faced competition from the Indian numerals and sometimes from the Greek numerals.

The Arabic grammarians of North Africa changed the new letters, which explains the differences between the alphabets of the East and the Maghreb.

The old alphabetical order, as in the other alphabets shown here, is known as the Levantine or Abjadi order. If the letters are arranged by their numeric order, the Levantine order is restored:

(Note: here "numeric order" means the traditional values when these letters were used as numbers. See Arabic numerals, Greek numerals and Hebrew numerals for more details)

This order is by far the oldest. The first written records of the Arabic alphabet show why the order was changed.

Abbasid standardizations
Arabic script reached a climax in aesthetics and geographic spread under the Abbasid Caliphate. In this period, Ibn al-Bawwab and Ibn Muqla had the most influence on the standardization of Arabic script. They were associated with al-khatt al-mansūb (الخط المنسوب), or "proportioned script."

Adapting the Arabic alphabet for other languages
When the Arabic alphabet spread to countries which used other languages, extra letters had to be invented to spell non-Arabic sounds. Usually the alteration was three dots above, as in ژ, or below, as in چ and پ.
 * Urdu: retroflex sounds: as the corresponding dentals but with a small letter ط above. (This problem in adapting a Semitic alphabet to write Indian languages also arose long before this: see Brahmi)
 * This book shows an example of ch (Polish cz) being written as in an Arabic-Polish bilingual Quran for Muslim Tatars living in Poland.
 * There are broadly two standards for Pashto orthography, the Afghan orthography in Afghanistan and the Peshawar orthography in Pakistan where is represented by  instead of the Afghani.

Decline in use by non-Arabic states
Since the early 20th century, as the Ottoman Empire collapsed and European influence increased, many non-Arab Islamic areas began using the Cyrillic or Latin alphabet, and local adaptations of the Arabic alphabet were abandoned. In many cases, the writing of a language in Arabic script has become restricted to classical texts and traditional purposes (as in the Turkic states of Central Asia, or Hausa and others in West Africa), while in others, the Arabic alphabet is used alongside the Latin one (as with Jawi in Brunei).