Beja language

Beja (Bidhaawyeet or Tubdhaawi) is an Afroasiatic language of the Cushitic branch spoken on the western coast of the Red Sea by the Beja people. Its speakers inhabit parts of Egypt, Sudan and Eritrea. In 2022 there were 2,550,000 Beja speakers in Sudan, and 121,000 Beja speakers in Eritrea according to Ethnologue. As of 2023 there are an estimated 88,000 Beja speakers in Egypt. The total number of speakers in all three countries is 2,759,000.

Name
The name Beja, derived from بجا, is most common in English-language literature. Native speakers use the term Bidhaawyeet (indefinite) or Tubdhaawi (definite).

Classification
Beja is held by most linguists to be part of the Cushitic branch of the Afroasiatic family, constituting the only member of the Northern Cushitic subgroup. As such, Beja contains a number of linguistic innovations that are unique to it, as is also the situation with the other subgroups of Cushitic (e.g. idiosyncratic features in Agaw or Central Cushitic). The characteristics of Beja that differ from those of other Cushitic languages are likewise generally acknowledged as normal branch variation.

The relation of the Northern Cushitic branch of Cushitic to the other branches is unknown. Christopher Ehret proposes, based on the devoicing of Proto-Cushitic voiced velar fricatives, that Northern Cushitic is possibly more closely related to South Cushitic than to the other branches.

The identification of Beja as an independent branch of Cushitic dates to the work of Enrico Cerulli between 1925 and 1951. Due to Beja's linguistic innovations, Robert Hetzron argued that it constituted an independent branch of Afroasiatic. Hetzron's proposal was generally rejected by other linguists, and Cerulli's identification of Beja as the sole member of a North Cushitic branch remains standard today across otherwise divergent proposals for the internal relations of the Cushitic language family.

History


Christopher Ehret proposes the following sequence of sound changes between Proto-Cushitic and Beja:
 * 1) PC * → * (alveolar ejective affricate becomes palatal ejective stop)
 * 2) PC * → * (dental ejective stop becomes alveolar ejective affricate)
 * 3) *C' → C (ejectives become their non-ejective voiceless counterparts)
 * 4) [+lateral/+obstruent] → [+retroflex/+obstruent] (that is,  and  become  and, respectively)
 * 5) PC * → * (voiced alveolar affricate becomes voiceless)
 * 6) * → ; * →  (voiceless alveolar affricate becomes a fricative, voiceless palatal plosive becomes a postalveolar fricative)
 * 7) PC * → * (labialized voiced velar fricative becomes voiceless)
 * 8) * → * (velar fricatives become plosives)
 * 9) PC * →  /V_V (lateral fricative becomes alveolar tap between vowels)
 * 10) PC * →  /#_ (lateral fricative becomes lateral approximant word-initially)
 * 11) PC *z →  /V_ (a consonant of unknown value becomes palatal approximant after vowels)
 * 12) PC *z →  /#_ (the same consonant of unknown value becomes voiced alveolar stop word-initially)
 * 13) PC *, * →  (all nasals but  collapse into alveolar nasal)

Ehret's reconstructed Proto-Cushitic /z/ is not a voiced alveolar fricative, but a consonant of unknown value. Ehret proposes that it might be a voiced palatal plosive.

Some linguists and paleographers believe that they have uncovered evidence of an earlier stage of Beja, referred to in different publications as "Old Bedauye" or "Old Beja." Helmut Satzinger has identified the names found on several third century CE ostraca (potsherds) from the Eastern Desert as likely Blemmye, representing a form of Old Beja. He also identifies several epigraphic texts from the fifth and sixth centuries as representing a later form of the same language. Nubiologist Gerald Browne, Egyptologist Helmut Satzinger, and Cushiticist Klaus Wedekind believed that an ostracon discovered in a monastery in Saqqarah also represents the Old Beja language. Browne and Wedekind identified the text as a translation of Psalm 30.

Phonology
Nasals other than /m/ and /n/ are positional variants of /n/. The consonants /χ/ and /ɣ/ only appear in Arabic loanwords in some speakers' speech; in others', they are replaced by /k/ or /h/ and /g/. Some speakers replace /z/ in Arabic loanwords with /d/.

Beja has the five vowels /a/, /e/, /i/, /o/, and /u/. The vowels /e/ and /o/ only appear long, while /a/, /i/, and /u/ have long and short variants.

Beja has pitch accent.

Orthography
Both Roman and Arabic script have been used to write Beja. The Roman orthography below is that used by the Eritrean government and was used in a literacy program at Red Sea University in Port Sudan from 2010 to 2013. Three Arabic orthographies have seen limited use: the first below was that used by the now-defunct website Sakanab; the second was devised by Muhammad Adaroob Muhammad and used in his translation of E.M. Roper's Beja lexicon; the third was devised by Mahmud Ahmad Abu Bikr Ooriib, and was employed briefly at Red Sea University in 2019. No system of writing has gained wide support. The only system to have been employed in publications by more than one writer is the Latin script.

In the Roman orthography, the vowels are written with the letters corresponding to the IPA symbols (i.e., $⟨a, e, i, o, u⟩$). Long vowels are written with doubled signs. As /e/ and /o/ cannot be short vowels, they only appear as $⟨ee⟩$ and $⟨oo⟩$, respectively.

The single $⟨e⟩$ sign, however, does have a use: to distinguish between /ɖ/ and /dh/, $⟨dh⟩$ is used for the former and $⟨deh⟩$ for the latter. Similarly, $⟨keh⟩$ is /kh/, $⟨teh⟩$ is /th/, and $⟨seh⟩$ is /sh/. Single $⟨o⟩$ is not used.

In all Arabic orthographies, short vowels are written with the same diacritics used in Arabic: fatḥah for /a/ (ـَ), kasrah for /i/ (ـِ), ḍammah for /u/ (ـُ). 'Alif (ا) is used as the seat for these diacritics at the beginning of a word. Long /aː/ is written with 'alif (ا) preceded by fatḥah, or 'alif maddah (آ) when word-initial. Long /iː/ is written with yā' ي preceded by kasrah. Long /uː/ is written with wāw و preceded by ḍammah. The systems vary on the representation of long /eː/ and long /oː/. In the Usakana system, /eː/ is written with a modified Kurdish yā' ێ; in the system devised by Muhammad Adaroob Muhammad it is represented by yā' with a shaddah يّ; in the Red Sea University system, it is not distinguished from the yā' for /j/ or /iː/. In the Usakana system, /oː/ is written with a modified Kurdish wāw ۆ; in the system devised by Muhammad Adaroob Muhammad it is represented by wāw with a shaddah وّ; in the Red Sea University system, it is not distinguished from the wāw for /w/ or /uː/.

Pitch accent is not marked in any orthography. In Wedekind, Wedekind, and Musa (2006 and 2007), stressed syllables are indicated in boldface.

In addition to these two systems and several academic systems of transcribing Beja texts, it is possible that Beja was at least occasionally written in the Greek alphabet-based Coptic script during the Middle Ages.

Nouns, articles, and adjectives
Beja nouns and adjectives have two genders (masculine and feminine), two numbers (singular and plural), and two cases (nominative and oblique), and may be definite, indefinite, or in the construct state. Gender, case, and definiteness are not marked on the noun itself, but on clitics and affixes. Singular-plural pairs in Beja are unpredictable.

Plural forms
Plurals may be formed by:
 * the addition of a suffix -a to the singular stem: gaw 'house', gawaab 'houses' (the final -b is an indefinite suffix)
 * the shortening of the final syllable of the singular stem (or Ablaut in this syllable): kaam 'camel', kam 'camels'
 * shift of the accent from the ultimate to the penultimate syllable: hadhaab 'lion', hadhaab 'lions' (orthographically identical)
 * a combination of these.

A small number of nouns do not distinguish between singular and plural forms. Some nouns are always plural. A few nouns have suppletive plurals.

Case and definiteness
A noun may be prefixed by a clitic definite article, or have an indefinite suffix. Definite articles indicate gender, number, and case. The indefinite suffix marks gender only, and does not appear in the nominative case. For feminine common nouns, the indefinite suffix is -t; for masculine nouns and feminine proper nouns, -b. The indefinite suffixes only appear after vowels. The definite article is proclitic. It has the following forms with masculine monosyllabic nouns that do not begin with /h/ or /ʔ/ (note that an initial glottal stop is usually omitted in writing, and that all words that appear to be vowel-initial actually begin with a glottal stop):

The feminine definite articles begin with $⟨t⟩$ but are otherwise identical (tuu-, too-, taa-, tee-). With nouns longer than one syllable and with nouns that begin with /h/ or /ʔ/, reduced forms of the definite article are used which do not distinguish between cases, but maintain gender distinctions. In some dialects (e.g. that described by Wedekind, Wedekind, and Musa for Port Sudan) the reduced forms maintain number distinctions; in others (e.g. that described by Vanhove and Roper for Sinkat) they do not.

Possession
Possessive relationships are shown through a genitive suffix -ii (singular possessed) or -ee (plural possessed) which attaches to the possessing noun. If the possessing noun is feminine, the genitive marker will begin with t; if the possessed is feminine, the suffix will end with t. When the suffix does not end with the feminine marker t, it reduces to -(t)i, whether singular or plural (that is, the singular/plural distinction is only marked for feminine possessa). Because this suffix adds a syllable to the noun, full forms of articles cannot be used; thus, the article on the noun itself does not indicate case. However, agreeing adjectives will be marked for oblique case. No article or indefinite suffix may be applied to the possessed noun. The possessed noun follows the possessor. Examples:

(The noun tak 'man' has the suppletive plural (n)da 'men'; raaw 'friend' has the shortened plural raw 'friends'.)
 * utaki raaw 'the man's friend (m)'
 * utakiit raaw 'the man's friend (f)'
 * tutakatti raaw 'the woman's friend (m)'
 * tutakattiit raaw 'the woman's friend (f)'
 * indaayeet raw 'the men's friends (f)'
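
The suffix rules described above can be sketched as a small function. This is only an illustrative sketch: the function name, its parameters, and the segmentation of the example words into stem and suffix are assumptions made here, not an analysis taken from the grammars cited.

```python
def genitive_suffix(possessor_feminine: bool,
                    possessed_feminine: bool,
                    possessed_plural: bool) -> str:
    """Build the genitive suffix following the prose description above.

    Hypothetical helper: the name and the parameters are illustrative.
    """
    # -ii with a singular possessed noun, -ee with a plural one.
    vowel = "ee" if possessed_plural else "ii"
    # A feminine possessor makes the marker begin with t.
    onset = "t" if possessor_feminine else ""
    if possessed_feminine:
        # A feminine possessed noun makes the suffix end with t,
        # and the long vowel (with its number distinction) is kept.
        return onset + vowel + "t"
    # Without the final t the suffix reduces to -(t)i in both numbers.
    return onset + "i"


# Stems as they appear in the examples (reduced article already attached).
print("utak" + genitive_suffix(False, False, False))    # utaki
print("utak" + genitive_suffix(False, True, False))     # utakiit
print("tutakat" + genitive_suffix(True, False, False))  # tutakatti
print("tutakat" + genitive_suffix(True, True, False))   # tutakattiit
```

For the plural example indaayeet the function returns -eet; the y between the stem and the suffix is not covered by this sketch.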

Postpositions follow nouns in the genitive. Examples:


 * Whad'aayiida uutak eeya. 'The man came toward the chief/elder.' (-da: 'toward')
 * W'oor t'aritti geeb eefi 'The boy is with the girls.' (geeb: 'with')

Adjectives
Adjectives follow the nominal heads of noun phrases. They agree in gender, number, case, and definiteness, and carry case and definiteness markers of the same form as nouns.

Copula
Clauses may be composed of two noun phrases or a noun phrase and a predicative adjective followed by a copular clitic. The copula agrees in person, gender, and number with the copula complement (the second term), but the first- and third-person forms are identical. The copular subject will be in the nominative case, the copular complement in the oblique. Oblique -b becomes -w before -wa. Copular complements that end in a vowel will employ an epenthetic y between the final vowel and any vowel-initial copular clitic.

Examples:


 * Ani akraabu. "I am strong."
 * Baruuk akraawwa. "You are strong."
 * Baruuh hadhaabu. "He is a lion."
 * Tuun ay-girshaytu. "This is a five-piastre piece."
 * Hinin Imeeraaba. "We are Amirab."
 * Baraah imaka. "They are the donkeys."
 * Baraah igwharaaya. "They are the thieves."
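
The two sound adjustments mentioned above (oblique -b becoming -w before -wa, and the epenthetic y) can be illustrated with a short sketch. The clitic shapes -u, -wa, and -a are read off the examples, and the split of each word into complement and clitic is an assumption made for illustration; the full clitic paradigm is not modelled.

```python
VOWELS = set("aeiou")


def attach_copula(complement: str, clitic: str) -> str:
    """Join a copular complement and a copular clitic.

    Hypothetical helper covering only the two adjustments in the text.
    """
    # Oblique -b becomes -w before the clitic -wa.
    if clitic == "wa" and complement.endswith("b"):
        complement = complement[:-1] + "w"
    # A vowel-final complement takes an epenthetic y
    # before a vowel-initial clitic.
    if complement[-1] in VOWELS and clitic[0] in VOWELS:
        return complement + "y" + clitic
    return complement + clitic


print(attach_copula("akraab", "u"))    # akraabu
print(attach_copula("akraab", "wa"))   # akraawwa
print(attach_copula("igwharaa", "a"))  # igwharaaya
```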

Verbs
Beja verbs have two different types, first noted by Almkvist: "strong verbs," which conjugate with both prefixes and suffixes and have several principal parts; and "weak verbs," which conjugate with suffixes only and which have a fixed root. Verbs conjugate for a number of tense, aspect, modality, and polarity variations, which have been given different names by different linguists:

(Roper analyzes additional subjunctive forms where Wedekind, Wedekind, and Musa, and Vanhove see a conditional particle.)

Each of the above forms has a corresponding negative. (Vanhove refers to the imperative negative as the "prohibitive".) The past continuous and past share a past negative. Negative forms are not derived from corresponding positive forms, but are independent conjugations.

Every verb has a corresponding deverbal noun, which Wedekind, Wedekind, and Musa refer to as a "noun of action", Vanhove calls an "action noun", and Roper a "nomen actionis". Numerous serial verb constructions exist which connote different aspectual and potential meanings.

Imperative
The second person masculine singular positive imperative is the citation form of the verb. Weak verbs have a long final suffix -aa while strong verbs have a short final suffix -a. For both weak and strong verbs, the negative imperative is formed by an identical set of prefixes baa- (for masculine singular and common plural) and bii- (for feminine singular). Strong verbs use a negative imperative root which has a lengthened vowel.

Deverbal noun
Every Beja verb has a corresponding deverbal noun (Wedekind, Wedekind, and Musa: "noun of action"; Vanhove: "action noun"; Roper: "nomen actionis"). For weak verbs, the deverbal noun is formed by a suffix -ti attached to the imperative root (see above). For strong verbs, deverbal nouns are not entirely predictable.

Examples:
 * Weak verbs: diwaa "to sleep" → diwtiib "sleeping"; afooyaa "to forgive" → afootiib "forgiving"
 * Strong verbs: adhidha "to hobble" → adhuudh "hobbling"; nikwiyi "to be pregnant" → nakwiit "being pregnant"

There are patterns in strong verb deverbal nouns related to the structure of the citation form of the verb. However, these are not consistent.

Deverbal adjective
A further derived form is a suffix -aa attached to the citation root, and then followed by -b for masculine nouns and -t for feminine. Examples:

This form may be used as an adjective, but it is also employed in the construction of multiple conjugated negative forms. Wedekind, Wedekind, and Musa analyse this form as a participle. Martine Vanhove analyses it as a manner converb -a.

Past continuous/aorist
The past continuous stem for strong verbs is not derivable from any other verb stem. The negative of the past continuous is identical to that of the past: There is only one past tense negative form. For both weak and strong verbs, the past negative is formed through a deverbal participial or converbal form (see above) followed by the present negative of the irregular verb aka "to be".

Wedekind, Wedekind, and Musa describe the past continuous as being used for "habitual, repeated actions of the (more distant) past." It is the verb conjugation used for counterfactual conditionals, which leads to Roper's identifying this tense as the "conditional". It is also frequently used in narratives.

Past/perfective
The past or perfective stem for strong verbs is identical to the citation form (imperative) stem, with predictable phonetic modifications. The negative is identical to that of the past continuous/aorist (above).

Present/imperfective
The present or imperfective has two stems for positive strong verbs, while the negative strong stem is identical to that used for the imperative (and thus also for past/perfective verbs). Weak negative verbs add the prefix ka- to positive past/perfective forms.

Future
The strong future stem is described differently by Wedekind, Wedekind, and Musa and by Vanhove. Both agree that it is a fixed stem followed by a present/imperfective conjugated form of the verb diya "to say." Wedekind, Wedekind, and Musa's strong stem is similar to the past continuous/aorist stem (see above), and identical for all numbers, genders, and persons, except the first person plural, which has a prefixed n-. For Vanhove, there are distinct singular and plural stems which are identical to the past continuous/aorist first person singular and plural, respectively. Similarly, for weak verbs, Wedekind, Wedekind, and Musa have a future stem ending in -i with a first person plural -ni, followed by a present tense/imperfective conjugation of diya. Vanhove sees the -i as a singular future, and the -ni as a general plural. For negative verbs, the negative present/imperfective of diya is used as the conjugated auxiliary.

(NB: Wedekind, Wedekind, and Musa see verbs of the form CiCiC as having identical past continuous [aorist] and future stems. Some verbs of other forms have different stems, which would lead to a greater divergence between the forms described by them and those described by Vanhove.) E.M. Roper, describing the same dialect as Vanhove, identifies the stem employed as being identical to the past continuous/aorist (for him, "conditional"—see above), just as Vanhove does. However, he understands the form with n- as being used only with the first person plural, as Wedekind, Wedekind, and Musa do.

Intentional/desiderative
In addition to the future, Bidhaawyeet has a similar form expressing desire to undertake an act or intention to do so. The citation root takes a suffix -a for all persons, genders, and numbers, and is followed by a present tense/imperfective conjugated form of the verb diya "to say", as the future is.

Jussive, optative, potential
There is distinct disagreement between the major grammars of the past century on the modal conjugation or conjugations referred to as "jussive," "optative," and "potential."

Wedekind, Wedekind, and Musa describe a "jussive" with the following paradigm. For strong verbs, the first person is based on the past/perfective stem, and the other persons are based on the future stem; no negative jussive is given:

They give various examples of the jussive with translations into English, in order to give a sense of the meaning:
 * Araatatay! "Let me ask!"
 * Naan gw'ata? "What would you (m) like to drink?"
 * Hindeeh nihiriway! "Please let us look for it!" (Atmaan dialect)

Vanhove identifies a complex "potential" form composed of a nominalizing suffix -at followed by a present/imperfective reduced conjugation of the verb m'a 'come' (eeya in the non-reduced present/imperfective).

Vanhove describes the potential as expressing "epistemic modalities of inference or near-certainty." Examples below, with the potential verbs in bold:
 * "Deeyaraneek kaakan dabal had fiinataay," indi een. "'I am really exhausted, so I should rest a while,' he says."

Additionally, she recognizes an optative with positive and negative polarity. The positive optative is formed from a prefix baa- to the past continuous/aorist. The negative construction is more complex. In some dialects, the final -aay of most forms of the weak negative is a short -ay:

Vanhove gives no explanation for the use of the optative positive. The optative negative is used in conditional clauses with meanings of incapacity and necessity:
 * "Har'iisii bity'aheebaay," ani. "'Don't let it come from behind me!' I told myself."
 * Naat bitkatiim mhiin uumeek ingad. "The donkey stopped in a place where nothing can arrive."
 * Dhaabi biidiiyeeb hiisan. "I thought he would not be able to run."
 * Yaa iraanaay, ooyhaam thab'a! Baakwinhaay akaabuuyit... "Oh, man, hit the leopard! I don't need to shout at you and…"

Lexicon
Through lexicostatistical analysis, David Cohen (1988) observed that Beja shared a basic vocabulary of around 20% with the East Cushitic Afar and Somali languages and the Central Cushitic Agaw languages, which are among the Afroasiatic languages geographically closest to it. This was analogous to the percentage of common lexical terms that was calculated for certain other Cushitic languages, such as Afar and Oromo. Václav Blažek (1997) conducted a more comprehensive glottochronological examination of languages and data. He identified a markedly close ratio of 40% cognates between Beja and Proto-East Cushitic as well as a cognate percentage of approximately 20% between Beja and Central Cushitic, similar to that found by Cohen.

A fairly large portion of Beja vocabulary is borrowed from Arabic. In Eritrea and Sudan, some terms are instead Tigre loanwords. Andrzej Zaborski has noted close parallels between Beja and Egyptian vocabulary.

The only independent Beja dictionary yet printed is Leo Reinisch's 1895 Wörterbuch der Beḍauye-Sprache. An extensive vocabulary forms an appendix to E.M. Roper's 1928 Tu Beḍawiɛ: An Elementary Handbook for the use of Sudan Government Officials, and this has formed the basis for much recent comparative Cushitic work. Klaus and Charlotte Wedekind and Abuzeinab Musa's 2007 A Learner's Grammar of Beja (East Sudan) comes with a CD which contains a roughly 7,000-word lexicon, composed mostly of one-word glosses. Klaus Wedekind, Abuzeinab Muhammed, Feki Mahamed, and Mohamed Talib were working on a Beja-Arabic-English dictionary, but publication appears to have been stalled by Wedekind's death. Martine Vanhove announced a forthcoming Beja-Arabic-English-French dictionary in 2006. It has not yet been published. The Beja scholar Muhammed Adarob Ohaj produced a Beja-Arabic dictionary as his master's thesis in 1972. It has not yet been published.

Swadesh List
The following list is drawn from Wedekind, Wedekind, and Musa's 2007 grammar and Roper's 1928 handbook. Nouns are given in indefinite accusative forms (the citation form); unless marked otherwise, forms that end in $⟨t⟩$ are feminine and all others are masculine. Verbs are given in the singular masculine imperative.
 * 1) Beja handles negation through distinct negative polarity conjugation. There is no lexical "not."
 * 2) In some dialects liiliit means "pupil."
 * 3) Ragad refers to the foot and leg.
 * 4) This is a rare suppletive imperative. Other forms of the verb have no  and are constructed around a consonantal root.
 * 5) Sootaay covers the blue-green range.

Numbers
"Ten" has combining forms for the production of teens and products of ten. Numbers from 11 to 19 are formed by tamna- followed by the units. E.g., "fourteen" is tamna fadhig. Combining ones use the form -gwir; e.g., "eleven" is tamnagwir. "Twenty" is tagwuugw. "Twenty-one" is tagwgwagwir. "Thirty" is mhay tamun; "forty" is fadhig tamun; "fifty" is ay tamun; etc. "One hundred" is sheeb. For higher numbers, Beja-speakers use Arabic terms.

Ordinal numbers are formed by the addition of a suffix -a. "First" is awwal, borrowed from Arabic.
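
The cardinal patterns described above can be sketched as a lookup plus two composition rules. The function name is hypothetical, and the unit values (mhay 3, fadhig 4, ay 5) are inferred here from the quoted forms for thirty, forty, fifty, and fourteen; numbers whose parts are not attested in the text are deliberately rejected rather than guessed.

```python
# Units inferred from the quoted forms; this is an assumption, not a full list.
UNITS = {3: "mhay", 4: "fadhig", 5: "ay"}

# Forms quoted directly in the text.
FIXED = {11: "tamnagwir", 20: "tagwuugw", 21: "tagwgwagwir", 100: "sheeb"}


def beja_cardinal(n: int) -> str:
    """Return a Beja cardinal for the few values the text supports."""
    if n in FIXED:
        return FIXED[n]
    # Teens: tamna followed by the unit, e.g. 14 = tamna fadhig.
    if 12 <= n <= 19 and (n - 10) in UNITS:
        return "tamna " + UNITS[n - 10]
    # Multiples of ten: unit followed by tamun, e.g. 30 = mhay tamun.
    if n % 10 == 0 and (n // 10) in UNITS:
        return UNITS[n // 10] + " tamun"
    raise ValueError(f"no form for {n} can be built from the text alone")


print(beja_cardinal(14))  # tamna fadhig
print(beja_cardinal(40))  # fadhig tamun
```

Forms such as 13 or 15 come out by applying the teen rule to the inferred units and are not themselves quoted in the text.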

"Half" is tarab. Other fractions are borrowed from Arabic.

Literature
Beja has an extensive oral tradition, including multiple poetic genres. A well-known epic is the story of the hero Mhamuud Oofaash, portions of which have appeared in various publications by Klaus Wedekind. An edition appears in Mahmud Mohammed Ahmed's Oomraay, published in Asmara. In the 1960s and '70s, the Beja intellectual Muhammed Adarob Ohaj collected oral recordings of poetic and narrative material which are in the University of Khartoum Institute of African and Asian Studies Sound Archives. Didier Morin and Mohamed-Tahir Hamid Ahmed have used these, in addition to their own collections, for multiple academic publications in French on Beja poetics. Red Sea University and the NGO Uhaashoon worked with oral story-tellers to produce a collection of 41 short readers and a longer collection of three short stories in Beja between 2010 and 2013.