Sesotho orthography

The orthography of the Sotho language is fairly recent and is based on the Latin script. Like most languages written in the Latin alphabet, it does not use all the letters, and it uses several digraphs and trigraphs to represent single sounds.

The orthographies used in Lesotho and South Africa differ, with the Lesotho variant using diacritics.

Although Sesotho is a tonal language, tone is never indicated in writing, as is the case for almost all other Bantu languages.

For an overview of the symbols used and the sounds they represent, see the phoneme tables at Sotho phonology.
 * Note that often when a section discusses formatives, affixes, or vowels it may be necessary to view the IPA to see the proper conjunctive word division and vowel qualities.

History
The original orthography was developed in the early 19th century by missionaries from the Paris Evangelical Missionary Society to aid in translating the Bible. The earliest orthographies were closer to French spelling, which is still seen in the writing of the approximants /w/ and /j/ in the modern Lesotho variant.

South African alphabet
Sesotho in South Africa uses the following alphabet:

Lesotho versus South African writing
One issue which complicates the written language is the two divergent orthographies used by the two countries with the largest number of first language speakers. The Lesotho orthography is older than the South African one and differs from it not only in the choice of letters and the marking of initial syllabic nasals, but also (to a much lesser extent) in written word division and the use of diacritics on vowels to distinguish some ambiguous spellings.

Additionally, in older texts the nasalized click was written nǵ in Lesotho (as a relic of a much older click series: ḱ, ḱh, and nǵ), but now the more universal digraph nq is used in both countries.

When the symbol "š" is unavailable electronically, people who write in Lesotho Sesotho often use ts' or t's to represent the aspirated alveolar affricate tš.
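The fallback described above can be sketched as a simple string substitution. This is only an illustration: the function name `ascii_fallback` is hypothetical, and the sketch assumes the ts' variant (rather than t's) is the one wanted.

```python
import unicodedata

def ascii_fallback(text: str) -> str:
    """Replace the letter pair t + s-caron with the ts' fallback spelling.

    Hypothetical helper: it only covers the ts' variant mentioned in the text.
    """
    # Normalize first so that "s" + combining caron is also recognized.
    text = unicodedata.normalize("NFC", text)
    return (
        text.replace("t\u0161", "ts'")   # lowercase t + s-caron
        .replace("T\u0161", "Ts'")       # capitalised form
        .replace("T\u0160", "TS'")       # all-caps form
    )

print(ascii_fallback("ho t\u0161ela"))
```

The reverse direction is not attempted here, since an apostrophe after ts could in principle also mark an omitted sound.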

In word-initial positions, a syllabic nasal followed by a syllable starting with the same nasal is written as an n or m in South Africa but as an apostrophe in Lesotho.

Note that, when not word-initial, Lesotho orthography uses an n or m just like South African orthography.
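The word-initial rule can be illustrated with a small sketch that rewrites a South African spelling into the Lesotho convention. The function name `lesotho_initial_nasal` is hypothetical, and the sketch assumes the rule applies to any word that begins with a doubled m or n; it implements only this one rule, not a full conversion between the two orthographies.

```python
import re

# A doubled nasal (mm or nn) at the start of a word, in either case.
# The lookbehind rejects matches preceded by a letter or an apostrophe.
_INITIAL_NASAL = re.compile(r"(?<![\w'])([MmNn])([mn])")

def lesotho_initial_nasal(text: str) -> str:
    """Write a word-initial syllabic nasal as an apostrophe (Lesotho style)."""
    def repl(match: re.Match) -> str:
        first, second = match.group(1), match.group(2)
        if first.lower() != second:
            return match.group(0)  # different nasals: leave untouched
        # Keep the capitalisation of the original first letter.
        kept = second.upper() if first.isupper() else second
        return "'" + kept
    return _INITIAL_NASAL.sub(repl, text)

print(lesotho_initial_nasal("Ha ke eso mmone"))  # word-initial: rewritten
print(lesotho_initial_nasal("monna"))            # not word-initial: unchanged
```

Because the pattern only matches at the start of a word, a doubled nasal inside a word is left as it is, in line with the note above.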

When consonants or vowels are omitted due to (diachronic or synchronic) contractions, Lesotho orthography uses apostrophes to indicate the missing sounds while the South African orthography generally does not.
 * Ha ke eso mmone / Ha ke e-s'o 'mone I haven't seen her
 * Ngwana ka / Ngoan'a ka My child

To distinguish the concords of class 1(a) from those of the second person singular, Lesotho orthography writes the second-person forms with u (representing phonetic o and w), even when there is no chance of ambiguity.
 * U motle You are beautiful
 * O motle He/she is beautiful


 * Le uena ke u elelitse I did advise you too
 * Le eena ke mo elelitse I did advise him/her too

In Lesotho, ò (for the two mid back vowels), ō (for the near-close back vowel), è (for the two mid front vowels), and ē (for the near-close front vowel) are sometimes used to avoid spelling ambiguities. This is never done in South African writing.


 * ho tšèla to pour / ho tšēla to cross
 * ho ròka to sing a praise poem / ho rōka to sew

These examples also have differing tone patterns.
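The effect of omitting these diacritics can be shown with a short sketch that strips only the grave accent and the macron while leaving the caron of š intact. The function name `strip_vowel_marks` is hypothetical, and the sketch assumes the text uses standard Unicode characters for the marked vowels.

```python
import unicodedata

# Combining grave accent and combining macron: the two vowel diacritics
# mentioned above. The caron on s-caron (U+030C) is deliberately not listed.
_VOWEL_MARKS = {"\u0300", "\u0304"}

def strip_vowel_marks(text: str) -> str:
    """Remove grave accents and macrons, keeping all other diacritics."""
    decomposed = unicodedata.normalize("NFD", text)
    kept = "".join(ch for ch in decomposed if ch not in _VOWEL_MARKS)
    return unicodedata.normalize("NFC", kept)

pour = strip_vowel_marks("ho t\u0161\u00e8la")   # "to pour", with e-grave
cross = strip_vowel_marks("ho t\u0161\u0113la")  # "to cross", with e-macron
print(pour == cross)  # the two verbs become homographs
```

Once the marks are removed, the two members of each pair are spelled identically, which is the situation in South African writing.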

Although the two orthographies tend to use similar written word divisions, they do differ on some points:
 * 1) More often than not, compounds that are written as one word in South African Sesotho are written with dashes in Lesotho Sesotho.
 * moetapele / moeta-pele leader
 * 2) The prosodic penultimate e- that is sometimes affixed to monosyllabic verbs is written with a dash in Lesotho.
 * eba! / e-ba! be!
 * 3) The "focus marker" -a- is inserted between the subject concord and the verb stem in different ways in the two orthographies. This is probably the most commonly encountered difference between the word divisions of the two orthographies.
 * Dikgomo di a fula / Likhomo lia fula The cows are grazing
 * 4) The class 2a prefix is usually simply attached to the class 1a noun in South Africa, but Lesotho orthography uses a dash.
 * ntate father ⇒ bontate / bo-ntate fathers/father-and-them

Very often South Africans with recent ancestors from Lesotho have surnames written in Lesotho orthography, preserving the old spellings.
 * Gloria Moshoeshoe, South African actor and talk show host
 * Aaron Mokoena, South African and European soccer player

Word division
Like all other Bantu languages, Sesotho is an agglutinative language spoken conjunctively; however, like many Bantu languages it is written disjunctively. The difference lies in the characteristically European word division used for writing the language, in contrast with some Bantu languages such as the South African Nguni languages.

This issue is investigated in more detail in The Sesotho word.

Roughly speaking, the following principles explain the current orthographical word division, although there are exceptions to these rough rules:
 * 1) Prefixes (except noun class prefixes) and infixes are written separately on their own, and the root and all following suffixes are written together. This is most obvious in the writing of the verb complex. One exception is the first-person singular objectival concord, and another is in the writing of the concords used with the qualificative parts of speech.
 * 2) With the exception of class 15, noun class prefixes are directly attached to the noun stem. These are an essential part of the lexicon, and not merely functional morphemes.
 * 3) Words which have been fossilised/lexicalised with historical prefixes are written as one word. This most frequently occurs with adverbs.

Punctuation
Modern Sesotho punctuation essentially mimics popular English usage. Full stops separate sentences, with the first letter of each sentence capitalized; commas indicate slight pauses; direct quotes are indicated with double quotation marks; proper nouns have their first letter capitalized (this was often not done in the old French-based orthographies); and so forth.

Direct quotations are introduced with a comma followed by the utterance in double quotes. The comma is used to indicate the pause which is mandatory in speech when introducing quotes, and indeed, in older orthographies the quotes were not used at all since the pause by itself is sufficient to introduce the next phrase as a quotation.
 * A re, "Ke lakatsa ho bua le wena." He said, "I wish to speak with you."

Proper nouns are indicated by capitalizing the first letter (usually the first letter of the noun prefix). Since preceding formatives such as possessive concords are written as separate words in the disjunctive orthography, the capitalized letter falls at the start of a written word. Contrast this with the conjunctively written Nguni languages, where it is the first letter of the stem that is capitalized, even in the middle of a written word.
 * Lentswe la Batho The Voice of the People (isiZulu iZwi labaNtu)

Limitations
Although it is a sufficient medium which has been used for almost 200 years to pen some of the most celebrated African literature (such as Thomas Mofolo's Chaka), the current Sesotho orthography does exhibit certain (phonological) deficiencies.

One problem is that, although the spoken language has at least seven contrasting vowel phonemes, these are written using only the five vowel letters of the standard Latin alphabet. The letter "e" represents the near-close front vowel /ɪ/ and the two mid front vowels /e/ and /ɛ/, and the letter "o" represents the near-close back vowel /ʊ/ and the two mid back vowels /o/ and /ɔ/. Not only does this result in numerous homographs, there is also some overlap between many distinct morphemes and formatives, as well as the final vowels of Sesotho verbs in various tenses and moods.

Another problem is the complete lack of tone marking even though Sesotho is a grammatical tone language. Not only does this also result in numerous homographs, it may also cause problems in situations where the only difference between grammatical constructions is the tones of a few key syllables in two otherwise similar sounding phrases. That this would be a rather difficult issue to tackle is revealed by the fact that very few of the large number of written Niger–Congo languages have any consistently used tone marking schemes, even though some of their tonal systems are much more complex than that of Sesotho.

The following not too unlikely example is illustrative of both these issues:
 * ke ye ke reke dijo, either [ _ _ _ ¯ ¯ _ ¯ ] I often buy food, or  [ ¯ _ ¯ ¯ ¯ _ ¯ ] so I may go and buy food

The first meaning is rendered if the phrase is composed of a Group III deficient verb (-ye, indicating habitual actions) followed by a verb in the perfect subjunctive mood. The second verb's mood is indicated by the low toned subjectival concord as well as the final vowel. The second meaning is rendered by basically using two normal verbs in the subjunctive mood (with high toned subjectival concords and final vowels) with the actions following each other.
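The two tone patterns in the example can be compared syllable by syllable with a short sketch. The syllable division and the transcription of the low and high marks as "L" and "H" are assumptions made for illustration, and the names `habitual`, `sequential`, and `differing_syllables` are hypothetical.

```python
# The phrase "ke ye ke reke dijo", split into seven syllables (assumed division).
syllables = ["ke", "ye", "ke", "re", "ke", "di", "jo"]

# Tone patterns copied from the example: low mark -> "L", high mark -> "H".
habitual = "LLLHHLH"    # "I often buy food"
sequential = "HLHHHLH"  # "so I may go and buy food"

def differing_syllables(sylls, tones_a, tones_b):
    """Return (index, syllable, tone_a, tone_b) wherever the two patterns differ."""
    return [
        (i, syl, a, b)
        for i, (syl, a, b) in enumerate(zip(sylls, tones_a, tones_b))
        if a != b
    ]

for index, syl, a, b in differing_syllables(syllables, habitual, sequential):
    print(index, syl, a, b)
```

In the patterns as transcribed, the only differences fall on the first and third syllables, both written ke, so the written form alone cannot separate the two readings.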