User:Ruud Koot/Gujarati script/How To: Use Unicode for creating Gujarati script

This is a subpage of the main article, Gujarati script. Here you can find additional details and resources on how to use Unicode to create Gujarati script.

Unicode Code-set for Gujarati Script
The Unicode range for Gujarati script is from U+0A80 to U+0AFF. The ISCII Code-page identifier for Gujarati script is 57010.
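The range above can be checked programmatically. The following is a minimal sketch in Python; the helper names `in_gujarati_block` and `assigned_code_points` are hypothetical, and the list of named code points depends on the Unicode version bundled with the Python build in use, so it may differ from the 4.0.0 table.

```python
import unicodedata

# Block bounds as stated in the text above.
GUJARATI_FIRST = 0x0A80
GUJARATI_LAST = 0x0AFF


def in_gujarati_block(ch):
    """Return True if the single character ch lies in U+0A80..U+0AFF."""
    return GUJARATI_FIRST <= ord(ch) <= GUJARATI_LAST


def assigned_code_points():
    """Return (code point, name) pairs for block entries that have a name
    in the Unicode data bundled with this Python build."""
    pairs = []
    for cp in range(GUJARATI_FIRST, GUJARATI_LAST + 1):
        name = unicodedata.name(chr(cp), "")
        if name:  # reserved/unused code points have no name
            pairs.append((f"U+{cp:04X}", name))
    return pairs


if __name__ == "__main__":
    # Print only ASCII (code point and name) so the output is console-safe.
    for code, name in assigned_code_points()[:5]:
        print(code, name)
```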

The table below shows the glyphs that are implemented in Unicode standard 4.0.0. Gray boxes indicate the code-points that are reserved/unused.


 * For further details regarding Unicode Code-points and standards, you may refer to Unicode Code-chart — Standard 4.1.

How To: Use Unicode for creating Gujarati script
Note: In the examples shown in the sections below, the "+" sign denotes the combination of key-strokes.

Half-form of consonants
Half-forms of consonants are used in the pre-base position. For consonants that do not have a distinct glyph for the half-form, a Halant (્) is used to create the half-form as follows: (Note the half-form of મ, which is used here in conjunction with ય.) Note: A half-form is not created for the base glyph even if the syllable ends with a Halant.
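As a rough illustration of the half-form of મ before ય mentioned above, the sequence can be assembled from character names in Python. This is a sketch only: the variable names and the `code_points` helper are made up for the example, and whether the half-form is actually drawn depends on the font and rendering engine, not on this code.

```python
import unicodedata

MA = unicodedata.lookup("GUJARATI LETTER MA")        # U+0AAE
YA = unicodedata.lookup("GUJARATI LETTER YA")        # U+0AAF
HALANT = unicodedata.lookup("GUJARATI SIGN VIRAMA")  # U+0ACD, the Halant


def code_points(text):
    """Format a string as a space-separated list of U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


# MA + Halant + YA: the Halant asks the renderer for the half-form of MA
# in front of the full-form YA.
half_ma_ya = MA + HALANT + YA

# MA + Halant with nothing after it: MA stays the base glyph and no
# half-form is requested (see the note above).
ma_with_trailing_halant = MA + HALANT

print(code_points(half_ma_ya))
print(code_points(ma_with_trailing_halant))
```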

Application of Upper-based form of Ra – (Reph)
Applying Ra with a Halant (the half-form of Ra, as seen above) before a full-form consonant produces a Reph on that consonant. This affects the pronunciation of Ra in conjunction with that consonant. A Reph can be created as follows: (Ra + Halant + થ = Reph effect on થ)

Application of Lower-based form of Ra – (Vattu)
Applying a consonant with a Halant (the half-form of the consonant) before a full-form Ra produces a Vattu on that consonant. This affects the pronunciation of Ra in conjunction with that consonant. A Vattu can be created as follows: (પ + Halant + Ra = Vattu effect on પ)
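Reph and Vattu use the same three kinds of code points and differ only in where Ra sits relative to the Halant. The toy classifier below makes that ordering explicit. The function name `ra_form` is hypothetical, and the rule is a deliberate simplification that only looks at three-code-point clusters; it is not how a shaping engine works.

```python
RA = "\u0ab0"      # GUJARATI LETTER RA
HALANT = "\u0acd"  # GUJARATI SIGN VIRAMA
THA = "\u0aa5"     # GUJARATI LETTER THA
PA = "\u0aaa"      # GUJARATI LETTER PA


def ra_form(cluster):
    """Classify a three-code-point cluster as 'reph', 'vattu' or 'other'.

    Simplified rule: Ra + Halant + X is treated as a Reph on X, and
    X + Halant + Ra as a Vattu on X. Anything else is 'other'.
    """
    if len(cluster) != 3 or cluster[1] != HALANT:
        return "other"
    if cluster[0] == RA:
        return "reph"
    if cluster[2] == RA:
        return "vattu"
    return "other"


print(ra_form(RA + HALANT + THA))  # Ra comes first
print(ra_form(PA + HALANT + RA))   # Ra comes last
```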

Vattu variants
Vattu variants (half and full) are formed when consonants with a vattu mark are combined. In some cases, a special glyph is required to represent the vattu when various consonants are combined. (Special glyph: ડ્ર. Notice the two lower-based marks, as compared to only one in the previous example.)

Above-based marks
All above-based marks and post-based matra are created as follows:

Below-based marks
The below-based marks and post-based matra are created as follows:
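In the encoded text, a mark or matra is stored immediately after the consonant it modifies, whether it is drawn above, below or after it. The sketch below attaches a few marks to ક. The grouping into `ABOVE`, `BELOW` and `POST` and the helper `attach` are assumptions made for illustration and cover only a handful of marks.

```python
import unicodedata

KA = unicodedata.lookup("GUJARATI LETTER KA")  # U+0A95

# Illustrative grouping by where the mark is drawn (not exhaustive).
ABOVE = ["GUJARATI VOWEL SIGN E", "GUJARATI VOWEL SIGN AI", "GUJARATI SIGN ANUSVARA"]
BELOW = ["GUJARATI VOWEL SIGN U", "GUJARATI VOWEL SIGN UU", "GUJARATI VOWEL SIGN VOCALIC R"]
POST = ["GUJARATI VOWEL SIGN AA", "GUJARATI VOWEL SIGN II"]


def attach(base, mark_name):
    """Return base followed by the mark with the given Unicode name."""
    mark = unicodedata.lookup(mark_name)
    # Every entry above is a combining mark (general category M*).
    if not unicodedata.category(mark).startswith("M"):
        raise ValueError(f"{mark_name} is not a combining mark")
    return base + mark


for group, names in (("above", ABOVE), ("below", BELOW), ("post", POST)):
    for name in names:
        syllable = attach(KA, name)
        # Print code points only, so the output is ASCII-safe.
        print(group, " ".join(f"U+{ord(c):04X}" for c in syllable), name)
```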

Characters શ્ર, ક્ષ and જ્ઞ
The following characters, which are part of the Gujarati alphabet but are not encoded as separate characters in the Unicode character set, can be generated as indicated below:
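Each of the three characters in the heading can be written as a consonant, a Halant and a second consonant. A minimal sketch follows; the dictionary name `SPECIAL_CONJUNCTS` and its romanised keys are labels invented for this example.

```python
HALANT = "\u0acd"  # GUJARATI SIGN VIRAMA

# Romanised keys are illustrative labels only.
SPECIAL_CONJUNCTS = {
    "shra": "\u0ab6" + HALANT + "\u0ab0",  # SHA + Halant + RA
    "ksha": "\u0a95" + HALANT + "\u0ab7",  # KA + Halant + SSA
    "gnya": "\u0a9c" + HALANT + "\u0a9e",  # JA + Halant + NYA
}

for label, sequence in SPECIAL_CONJUNCTS.items():
    # Each conjunct is three code points in storage, one shape on screen
    # when the font provides the ligature.
    print(label, " ".join(f"U+{ord(c):04X}" for c in sequence))
```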

Application of Nukta
Nukta affects the pronunciation of the (preceding) consonant to which it is applied. A Nukta form of a consonant can be created in Unicode as follows:
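Since the Nukta applies to the preceding consonant, it is placed directly after that consonant in the text. The sketch below shows this; the helper `with_nukta` is hypothetical, and its name-prefix check is a rough guard that accepts any Gujarati letter, including independent vowels.

```python
import unicodedata

NUKTA = unicodedata.lookup("GUJARATI SIGN NUKTA")  # U+0ABC


def with_nukta(letter):
    """Return the given Gujarati letter followed by the Nukta sign."""
    is_letter = len(letter) == 1 and unicodedata.name(letter, "").startswith(
        "GUJARATI LETTER"
    )
    if not is_letter:
        raise ValueError("expected a single Gujarati letter")
    return letter + NUKTA


ja_nukta = with_nukta("\u0a9c")  # JA followed by Nukta
print(" ".join(f"U+{ord(c):04X}" for c in ja_nukta))
```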

Substitutions for specific typography of the script
Substitution, in the context applicable here, means replacing a set or group of characters with a single resulting glyph. The following are the main substitutions required to address the complexity of the language and to generate the various character forms of the script:

Pre-base substitutions
The half-form conjunctions, one of the most common occurrences in the script, are created by pre-base substitutions. This substitution is also used to create the I-Matra (and its appropriately aligned shape) as shown below:

Post-base substitutions
Consonants of the Gujarati script do not have post-based forms. Primarily, post-based substitution is used to create visarga out of vowels, and is also applied for "I-Matra" substitutions as follows (which will precede any above-based substitution, if applied as well): (Compare the special shape જી, a result of post-based substitution, with the result of a similar combination using a character like લ, which generates: લ + ી = લી)
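The substitutions described above happen in the font at display time, while the stored text keeps the same simple order: consonant first, then the matra. The sketch below builds the જી and લી sequences from the comparison above, plus a short I-Matra for contrast. The variable names are chosen for this example only.

```python
JA = "\u0a9c"       # GUJARATI LETTER JA
LA = "\u0ab2"       # GUJARATI LETTER LA
SIGN_I = "\u0abf"   # GUJARATI VOWEL SIGN I (short), drawn before the consonant
SIGN_II = "\u0ac0"  # GUJARATI VOWEL SIGN II (long), drawn after the consonant

ja_ii = JA + SIGN_II  # rendered with a special shape by the font
la_ii = LA + SIGN_II  # rendered as the plain consonant plus matra
ja_i = JA + SIGN_I    # stored after JA even though it is drawn before it

for sequence in (ja_ii, la_ii, ja_i):
    # The consonant is always the first code point in storage.
    print(" ".join(f"U+{ord(c):04X}" for c in sequence))
```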

Above-base substitutions
Above-based substitution is mainly applied for Matra, Reph, vowel modifications and for stress and tone marks. Consider the following examples:

Below-base substitutions
Mainly used for below-based matra, the below-based substitution can produce a conjunction or change the whole shape of the glyph. This substitution is also used to produce special tone effects such as anudatta.

More details on Gujarati Unicode

 * For further details on Gujarati Unicode, you may refer to Unicode Std 4.0.0 - Chapter 9
 * TDIL: Ministry of Communication & Information Technology, India
 * If you are creating a web-page while the OS language is not Gujarati, save the file as UTF-8 Unicode HTML. The code-points may be lost otherwise.
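The advice on saving as UTF-8 can be sketched in Python as follows. The file name, the sample HTML and the helper names `save_utf8` and `read_back` are assumptions for illustration. The last line shows one way the code points can be lost: with `errors="replace"`, a non-Unicode encoding turns every Gujarati character into a question mark.

```python
import tempfile
from pathlib import Path

# The word "Gujarati" in Gujarati script: GA, U, JA, RA, AA, TA, II.
TEXT = "\u0a97\u0ac1\u0a9c\u0ab0\u0abe\u0aa4\u0ac0"

HTML = f"""<!DOCTYPE html>
<html lang="gu">
<head><meta charset="utf-8"><title>Gujarati sample</title></head>
<body><p>{TEXT}</p></body>
</html>
"""


def save_utf8(path, html=HTML):
    """Write the page as UTF-8, independent of the OS default encoding."""
    Path(path).write_text(html, encoding="utf-8")


def read_back(path):
    """Read the raw bytes and decode them as UTF-8."""
    return Path(path).read_bytes().decode("utf-8")


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "gujarati_sample.html"
    save_utf8(target)
    print(TEXT in read_back(target))

# Encoding to ASCII with replacement loses every Gujarati code point.
print(TEXT.encode("ascii", errors="replace"))
```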