User:MichaelGasser/Naming conventions

Ethiopia/Eritrea Naming Conventions
I'd like to propose that those of us who edit pages that have to do with Ethiopia and Eritrea agree on an official policy for transliterating the languages that are written in the Ge'ez (Ethiopic) script, as has been done for some other languages, including Chinese, Japanese, and Korean, and as is currently under discussion for Arabic and Hebrew.

There are two places for transliteration in an encyclopedia: in articles on the languages themselves, and in non-linguistic articles, for writing words that come originally from the languages, especially the names of people and places and the titles of works. We wouldn't necessarily have to agree on the same conventions for both purposes. The discussion here is meant to deal with words originating in these languages that appear in non-linguistic articles. There is currently informal agreement on using a variant of the WL system described below for linguistic articles (see Amharic language, Tigrinya language, Soddo language).

Existing systems
There are at least four well-accepted sets of conventions for romanizing Amharic, and in some cases other Ethiopian Semitic languages, and a number of variations on these. These include:
 * the system associated with the linguist Wolf Leslau, who in his long career has written books or papers on every one of the Ethiopian Semitic languages ("WL" below), starting with his work on Tigrinya in the 1940s; this system has been adopted by many linguists since, though it is not used by all (for example, not by Lionel Bender and Hailu Fulass in their Amharic Verb Morphology or by Degif Petros Banksira in his Sound Mutations: the Morphophonology of Chaha)
 * the system adopted in 1997 (or before) by the US Library of Congress and the American Library Association for romanizing the names of authors and titles of books ("LOC/ALA" below), with separate tables for Amharic and Tigrinya
 * the system adopted by the Ethiopian Mapping Authority and by the United Nations Group of Experts on Geographical Names ("UNGEGN/EMA" below) in 1967
 * the system adopted by the United States Board on Geographic Names and the Permanent Committee on Geographical Names for British Official Use ("BGN/PCGN" below) in 1967, with separate tables for Amharic and Tigrinya; it is apparently in more common use in maps in Ethiopia than UNGEGN/EMA and is also used by the National Geographic Society for its Ethiopia maps, though not its Eritrea maps

Vowels
The vowels present an obvious problem because the seven of them need to be distributed among the five roman vowel letters. Here is how the WL, LOC/ALA, UNGEGN/EMA, and BGN/PCGN systems represent the vowels (in their traditional order). Leslau represents the first and fourth vowels of Ge'ez differently in his Concise Dictionary of Ge'ez; those symbols are shown in parentheses.

There is another minor difference: for some reason, the BGN/PCGN system uses ā to represent the vowel for the 1st order characters which have the /a/ vowel: አ ሐ ሀ ኀ ዐ: Ādis Ābeba.

Consonants
The consonants that differ in the systems are the following. (The last four columns are not relevant for Amharic but are for some other Ethiopian Semitic languages.)

The goal of LOC/ALA is to reproduce accurately what appears orthographically in a title or author name, so it does not indicate gemination (because gemination is not indicated in the orthography) but does distinguish consonant letters that have the same pronunciation (for example, ሀ and ኀ). UNGEGN/EMA and BGN/PCGN have optional ways of distinguishing these letters and, like LOC/ALA, do not indicate gemination.

Considerations
Here are some desirable properties for a transliteration scheme for non-linguistic articles, in no particular order.
 * The characters used should "suggest" the correct pronunciation to naive English-speaking readers, that is, those who know nothing about Ethiopian languages.
 * Diacritics should be minimized, and if they are omitted, their absence should not detract too much from readability. (This is what happens, for example, with Japanese, when the length sign is omitted: Tokyo in place of Tōkyō.)
 * More frequent phones should be represented by characters without diacritics.
 * The system should not deviate too much from familiar conventions that are already in place. For Ethiopian languages, there are already some informal, though not systematic, conventions for transliterating Amharic and Tigrinya names.
 * The system should not deviate too much from the conventions used in linguistic articles about the languages.
 * Ideally the system could be used also for transliteration in other languages using roman scripts (Spanish, French, Swahili, etc.).

Proposal
The BGN/PCGN system has several advantages.
 * It is (according to the UN) in use in Ethiopia, at least by the Mapping Authority.
 * The characters used for the vowels in most cases are similar to those already used by Ethiopians to transliterate their names: ከበደ 'Kebede', ጸሐይ 'Tsehay', ግርማ 'Girma'. (Note that you also see 'e' for the 6th form vowel, especially when it starts a word: እሸቱ 'Eshetu'.)
 * The characters used for both the vowels and the consonants probably suggest their correct pronunciations to naive English readers better than other alternatives do (this needs to be tested; if people are interested, I'll try an informal experiment).
 * The three most common vowels do not require diacritics.
 * With the diacritics missing, words would still be readable.

BGN/PCGN does not, however, handle the non-Amharic consonants found in other Ethiopian Semitic languages. For these, we could use x for the ኸ series, ‘ for the ዐ series, x' for the ቐ series, and perhaps ḥ for the ሐ series.

Note that BGN/PCGN does deviate considerably from the WL system that people have informally agreed to use for linguistic articles.

So here is the proposal. There are two levels of transliteration, one more precise, with diacritics, and one less precise, with no diacritics. Without diacritics some of the distinctions are lost (two distinctions within the vowels and the difference between h and ḥ in languages such as Tigrinya that have pharyngeals), but this is common in other transliteration schemes, for example, what is being proposed for Arabic. What I give here is the more precise scheme.
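The less precise level could in principle be derived mechanically from the more precise one. Here is a minimal sketch in Python of how that might work, assuming that the diacritics in question are ones Unicode can decompose into a base letter plus a combining mark (as with ā and ḥ). The function name strip_diacritics is mine, not part of any of the systems discussed here.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Drop combining marks so that, for example, ā becomes a and ḥ becomes h."""
    # NFD splits each precomposed letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Keep only characters that are not combining marks.
    kept = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Recompose whatever remains into the usual precomposed form.
    return unicodedata.normalize("NFC", kept)


print(strip_diacritics("Ādis Ābeba"))  # Adis Abeba
print(strip_diacritics("Tōkyō"))       # Tokyo
```

Apostrophes and the ‘ sign are not combining marks, so sequences such as k' pass through unchanged; only the distinctions carried by diacritics are lost.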

Further:
 * Gemination is not indicated.
 * For pairs such as ኮ/ኰ, ኩ/ኵ, ቆ/ቈ, ቁ/ቍ, only the alternative with o is used: ko, k'o, etc.
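
The second point amounts to treating the two members of each pair as the same character before transliterating. A rough sketch in Python, covering only the four pairs listed above and assuming that the first member of each pair is the one to keep; the table and function names are hypothetical, and a real table would need the other labialized characters as well.

```python
# Hypothetical folding table: labialized character -> plain counterpart.
# Only the four pairs named in the text are covered.
LABIALIZED_TO_PLAIN = str.maketrans({
    "ኰ": "ኮ",
    "ኵ": "ኩ",
    "ቈ": "ቆ",
    "ቍ": "ቁ",
})


def fold_labialized(text: str) -> str:
    """Replace each listed labialized character with its plain counterpart."""
    return text.translate(LABIALIZED_TO_PLAIN)
```

Characters outside the table are left as they are, so the folding can be applied to any Ge'ez text before the romanization step.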

Here are names of some familiar places and people as they would appear in the more precise version of the proposed transliteration (with diacritics) and as they appear in the modified WL system ("WL*") that we are using for articles on the languages.

As you can see, one drawback of the proposal is that it leaves us with two quite different ways of transliterating. I would argue against adopting WL for non-linguistic articles because it uses unusual characters (ə, ä) for very common sounds and because it deviates a lot from what people (other than linguists) are used to.

Related issues
There are several issues related to the choice of a transliteration scheme for names written in Ge'ez script.
 * When should (roman) Oromo orthography be used for names that also have Amharic spellings? Or should both regularly be used?
 * Should the original (Ge'ez) form of names always appear together with the transliteration (as is done for Chinese, Japanese, Korean, Arabic)?