Hindi–Urdu transliteration

Hindi–Urdu (Devanagari: हिन्दी-उर्दू, Nastaliq: ہندی-اردو), also known as Hindustani, is the lingua franca of modern-day Northern India and Pakistan (together classically known as Hindustan). Modern Standard Hindi is officially registered in India as a standard written in the Devanagari script, and Standard Urdu is officially registered in Pakistan as a standard written in an extended Perso-Arabic script.

Hindi–Urdu transliteration (or Hindustani transliteration) is essential for Hindustani speakers to read each other's text, and it is especially important because the language underlying the Hindi and Urdu registers is almost the same. Transliteration is theoretically possible because of the common Hindustani phonology underlying Hindi–Urdu. In the present day, Hindustani is seen as a unifying language, as initially proposed by Mahatma Gandhi to resolve the Hindi–Urdu controversy. ("Hindustani" is not to be confused with followers of Hinduism; 'Hindu' in Persian means 'Indo'.)

Technically, a direct one-to-one script mapping or rule-based lossless transliteration of Hindi–Urdu is not possible, mainly because Hindi is written in an abugida script and Urdu in an abjad script, and also because of other constraints, such as multiple similar Perso-Arabic characters mapping onto a single Devanagari character. However, dictionary-based mapping attempts have yielded very high accuracy, providing near-perfect transliterations. In literary domains, transliteration alone between Hindi and Urdu will not suffice, because formal Hindi leans towards Sanskrit vocabulary whereas formal Urdu leans towards Persian and Arabic vocabulary; such cases require a system combining transliteration and translation.
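The dictionary-based approach can be illustrated with a minimal Python sketch. It is not taken from any particular published system: the names `LEXICON`, `CHAR_FALLBACK` and `transliterate_word` are hypothetical, and both tables are tiny, assumed samples. The sketch looks a word up in a word-level lexicon first and only falls back to character mapping when the word is unknown.

```python
# Hypothetical word-level lexicon (Urdu spelling -> Hindi spelling).
# A real system would load a large curated resource; this is one sample entry.
LEXICON = {
    "\u06a9\u062a\u0627\u0628": "\u0915\u093f\u0924\u093e\u092c",  # kitab, "book"
}

# Assumed, partial character fallback table (Urdu letter -> Devanagari sign).
CHAR_FALLBACK = {
    "\u06a9": "\u0915",  # kaf  -> ka
    "\u062a": "\u0924",  # te   -> ta
    "\u0628": "\u092c",  # be   -> ba
    "\u0627": "\u093e",  # medial alif treated as the long-a vowel sign
}


def transliterate_word(word, lexicon=LEXICON):
    """Return the lexicon spelling if known, else map character by character."""
    if word in lexicon:
        return lexicon[word]
    # Unknown characters are passed through unchanged.
    return "".join(CHAR_FALLBACK.get(ch, ch) for ch in word)


def transliterate(text, lexicon=LEXICON):
    """Transliterate whitespace-separated words one at a time."""
    return " ".join(transliterate_word(w, lexicon) for w in text.split())
```

With the lexicon, the Urdu word for "book" comes out as किताब. With an empty lexicon, the character fallback yields कताब, because the abjad spelling does not write the short vowel i; this is the kind of loss that a word-level dictionary compensates for.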

In addition to Hindi-Urdu, there have been attempts to design Indo-Pakistani transliteration systems for digraphic languages like Sindhi (written in extended Perso-Arabic in Sindh of Pakistan and in Devanagari by Sindhis in partitioned India), Punjabi (written in Gurmukhi in East Punjab and Shahmukhi in West Punjab), Saraiki (written in extended-Shahmukhi script in Saraikistan and unofficially in Sindhi-Devanagari script in India) and Kashmiri (written in extended Perso-Arabic by Kashmiri Muslims and extended-Devanagari by Kashmiri Hindus).

Consonants
Hindustani has a rich set of consonants in its full alphabet, since it has a mixed vocabulary (rekhta) derived from Old Hindi (from Dehlavi), with loanwords from Persian (from Pahlavi) and Arabic. These three sources belong to three different language families: Indo-Aryan, Iranian and Semitic, respectively.

The following table provides an approximate one-to-one mapping for Hindi–Urdu consonants, intended especially for computational purposes (lossless script conversion). This direct script conversion will not yield correct spellings, but rather text that is readable for both sets of readers. Hindi–Urdu transliteration schemes can also be used for Punjabi, to convert between Gurmukhi (Eastern Punjabi) and Shahmukhi (Western Punjabi), since Shahmukhi is a superset of the Urdu alphabet (with 2 extra consonants) and the Gurmukhi script can be easily converted to the Devanagari script.
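The following Python sketch shows why such a character table gives readable text rather than correct spellings. The tables are assumed, partial samples (three Devanagari consonants only), and the choice of one canonical Urdu letter per Devanagari consonant is an illustrative assumption, not a published standard. Several Urdu letters collapse onto one Devanagari consonant, so converting back cannot recover the original letter.

```python
# Assumed, partial Urdu -> Devanagari table: several letters share one target.
URDU_TO_DEV = {
    "\u0633": "\u0938",  # seen   -> sa
    "\u0635": "\u0938",  # sad    -> sa
    "\u062b": "\u0938",  # se     -> sa
    "\u062a": "\u0924",  # te     -> ta
    "\u0637": "\u0924",  # toe    -> ta
    "\u06c1": "\u0939",  # gol he -> ha
    "\u062d": "\u0939",  # bari he -> ha
}

# Assumed canonical choice when going back: one Urdu letter per consonant.
DEV_TO_URDU = {
    "\u0938": "\u0633",  # sa -> seen
    "\u0924": "\u062a",  # ta -> te
    "\u0939": "\u06c1",  # ha -> gol he
}


def convert(text, table):
    """Map each character through the table; unknown characters pass through."""
    return "".join(table.get(ch, ch) for ch in text)


def round_trips(urdu_text):
    """True if Urdu -> Devanagari -> Urdu reproduces the original spelling."""
    devanagari = convert(urdu_text, URDU_TO_DEV)
    return convert(devanagari, DEV_TO_URDU) == urdu_text
```

In this sketch a letter that happens to be the canonical choice (seen) survives the round trip, while sad and se both come back as seen: the result is still pronounceable, but the original spelling is lost.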

Sanskrit consonants
The following consonants are mostly used in words that are directly borrowed or adapted from Sanskrit.

Implosive consonants
These consonants are found mostly in languages like Sindhi and Saraiki.

Sample text
The following is an excerpt from the Hindustani poem Tarānah-e-Hindi written by Muhammad Iqbal.