Nilo-Saharan languages

The Nilo-Saharan languages are a proposed family of around 210 African languages spoken by somewhere around 70 million speakers, mainly in the upper parts of the Chari and Nile rivers, including historic Nubia, north of where the two tributaries of the Nile meet. The languages extend through 17 nations in the northern half of Africa: from Algeria to Benin in the west; from Libya to the Democratic Republic of the Congo in the centre; and from Egypt to Tanzania in the east.

As indicated by its hyphenated name, Nilo-Saharan is a family of the African interior, including the greater Nile Basin and the Central Sahara Desert. Eight of its proposed constituent divisions (excluding Kunama, Kuliak, and Songhay) are found in the modern countries of Sudan and South Sudan, through which the Nile River flows.

In his book The Languages of Africa (1963), Joseph Greenberg named the group and argued it was a genetic family. It contained all the languages that were not included in the Niger–Congo, Afroasiatic or Khoisan families. Although some linguists have referred to the phylum as "Greenberg's wastebasket", into which he placed all the otherwise unaffiliated non-click languages of Africa, other specialists in the field have accepted it as a working hypothesis since Greenberg's classification. Linguists accept that it is a challenging proposal to demonstrate but contend that it looks more promising the more work is done.

Some of the constituent groups of Nilo-Saharan are estimated to predate the African Neolithic. For example, the unity of Eastern Sudanic is estimated to date to at least the 5th millennium BC. Nilo-Saharan genetic unity would thus be much older still and date to the late Upper Paleolithic. The earliest written language associated with the Nilo-Saharan family is Old Nubian, one of the oldest written African languages, attested from the 8th to the 15th century AD.

This larger classification system is not accepted by all linguists, however. Glottolog (2013), for example, a publication of the Max Planck Institute in Germany, does not recognise the unity of the Nilo-Saharan family or even of the Eastern Sudanic branch; Georgiy Starostin (2016) likewise does not accept a relationship between the branches of Nilo-Saharan, though he leaves open the possibility that some of them may prove to be related to each other once the necessary reconstructive work is done. According to Güldemann (2018), "the current state of research is not sufficient to prove the Nilo-Saharan hypothesis."

Characteristics
The constituent families of Nilo-Saharan are quite diverse. One characteristic feature is a tripartite singulative–collective–plurative number system, which Blench (2010) believes is a result of a noun-classifier system in the protolanguage. The distribution of the families may reflect ancient watercourses in a green Sahara during the African humid period before the 4.2-kiloyear event, when the desert was more habitable than it is today.

Major languages
Within the Nilo-Saharan family are a number of languages with at least a million speakers (most data from SIL's Ethnologue 16 (2009)). In roughly descending order:
 * Luo (Dholuo, 4.4 million). The Dholuo language of the Luo people of Kenya and Tanzania (Kenya's third-largest ethnicity, after the Bantu-speaking Agĩkũyũ and Luhya). The term "Luo" is also used for a wider group of languages which includes Dholuo.
 * Kanuri (4.0 million, all dialects; 4.7 million if Kanembu is included). The major ethnicity around Lake Chad.
 * Zarma (6 million). Spread along the Niger River in Niger and into Nigeria, in the southern region of the historic Songhai Empire.
 * Teso (1.9 million). Related to Karamojong, Turkana, Toposa and Nyangatom.
 * Nubian (1.7 million, all dialects). The language of Nubia, extending today from southern Egypt into northern Sudan. Many Nubians have also migrated northwards to Cairo since the building of the Aswan Dam.
 * Lugbara (1.7 million; 2.2 million if Aringa (Low Lugbara) is included). The major Central Sudanic language; spoken in Uganda and the Democratic Republic of the Congo.
 * Nandi–Markweta languages (Kalenjin, 1.6 million). Spoken in the Kenyan Rift Valley and in Kapchorwa, Uganda.
 * Lango (1.5 million). A Luo language, one of the major languages of Uganda.
 * Dinka (1.4 million). The major ethnicity of South Sudan.
 * Acholi (1.2 million). Another Luo language of Uganda.
 * Nuer (1.1 million in 2011, significantly more today). The language of the Nuer, another numerous people from South Sudan and Ethiopia.
 * Maasai (1.0 million). Spoken by the Maasai people of Kenya and Tanzania, one of the most well-known African peoples internationally.
 * Ngambay (1.0 million with Laka). Central Sudanic, the principal language of southern Chad.

Some other important Nilo-Saharan languages under 1 million speakers:
 * Fur (500,000 in 1983, significantly more today). The eponymous language of Darfur Province in western Sudan.
 * Tubu (350,000 to 400,000). One of the northernmost Nilo-Saharan languages, extending from Nigeria, Niger, and Chad into Libya. Most Tubu speakers live in northern Chad, close to the Tibesti Mountains. Tubu has two main varieties: the Daza language and the Teda language.

The total for all speakers of Nilo-Saharan languages according to Ethnologue 16 is 38–39 million people. However, the data spans a range from ca. 1980 to 2005, with a weighted median at ca. 1990. Given population growth rates, the figure in 2010 might be half again higher, or about 60 million.
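The arithmetic behind this estimate can be checked with a short sketch. It assumes that "half again higher" means a growth factor of 1.5 and that the factor applies over the 20 years from the weighted median (ca. 1990) to 2010. The function names and the derived annual rate are illustrative only and do not come from Ethnologue.

```python
# Rough check of the 2010 speaker estimate, under stated assumptions.

def project(base_millions: float, growth_factor: float) -> float:
    """Scale a base population (in millions) by an overall growth factor."""
    return base_millions * growth_factor


def implied_annual_rate(growth_factor: float, years: int) -> float:
    """Constant annual growth rate that compounds to growth_factor over years."""
    return growth_factor ** (1 / years) - 1


LOW, HIGH = 38.0, 39.0   # Ethnologue 16 total, in millions
FACTOR = 1.5             # assumption: "half again higher"
YEARS = 2010 - 1990      # assumption: weighted median year taken as 1990

low_2010 = project(LOW, FACTOR)
high_2010 = project(HIGH, FACTOR)
rate = implied_annual_rate(FACTOR, YEARS)

print(f"2010 estimate: {low_2010:.1f} to {high_2010:.1f} million")
print(f"implied annual growth: {rate:.2%}")
```

Under these assumptions the projection comes to 57 to 58.5 million, which the text rounds to about 60 million, and the implied growth rate is roughly 2% per year.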

History of the proposal
The Saharan family (which includes Kanuri, Kanembu, the Tebu languages, and Zaghawa) was recognized by Heinrich Barth in 1853, the Nilotic languages by Karl Richard Lepsius in 1880, the various constituent branches of Central Sudanic (but not the connection between them) by Friedrich Müller in 1889, and the Maban family by Maurice Gaudefroy-Demombynes in 1907. The first inklings of a wider family came in 1912, when Diedrich Westermann included three of the (still independent) Central Sudanic families within Nilotic in a proposal he called Niloto-Sudanic; this expanded Nilotic was in turn linked to Nubian, Kunama, and possibly Berta, essentially Greenberg's Macro-Sudanic (Chari–Nile) proposal of 1954.

In 1920 G. W. Murray fleshed out the Eastern Sudanic languages when he grouped Nilotic, Nubian, Nera, Gaam, and Kunama. Carlo Conti Rossini made similar proposals in 1926, and in 1935 Westermann added Murle. In 1940 A. N. Tucker published evidence linking five of the six branches of Central Sudanic alongside his more explicit proposal for East Sudanic. In 1950 Greenberg retained Eastern Sudanic and Central Sudanic as separate families, but in 1954 he accepted Westermann's conclusions of four decades earlier and linked them together as Macro-Sudanic (later Chari–Nile, from the Chari and Nile watersheds).

Greenberg's later contribution came in 1963, when he tied Chari–Nile to Songhai, Saharan, Maban, Fur, and Koman-Gumuz and coined the current name Nilo-Saharan for the resulting family. Lionel Bender noted that Chari–Nile was an artifact of the order of European contact with members of the family and did not reflect an exclusive relationship between these languages, and the group has been abandoned, with its constituents becoming primary branches of Nilo-Saharan—or, equivalently, Chari–Nile and Nilo-Saharan have merged, with the name Nilo-Saharan retained. When it was realized that the Kadu languages were not Niger–Congo, they were commonly assumed to therefore be Nilo-Saharan, but this remains somewhat controversial.

Progress has been made since Greenberg established the plausibility of the family. Koman and Gumuz remain poorly attested and are difficult to work with, while arguments continue over the inclusion of Songhai. Blench (2010) believes that the distribution of Nilo-Saharan reflects the waterways of the wet Sahara 12,000 years ago, and that the protolanguage had noun classifiers, which today are reflected in a diverse range of prefixes, suffixes, and number marking.

Internal relationships
Dimmendaal (2008) notes that Greenberg (1963) based his conclusion on strong evidence and that the proposal as a whole has become more convincing in the decades since. Mikkola (1999) reviewed Greenberg's evidence and found it convincing. Roger Blench notes morphological similarities in all putative branches, which leads him to believe that the family is likely to be valid.

Koman and Gumuz are poorly known and have been difficult to evaluate until recently. Songhay is markedly divergent, in part due to massive influence from the Mande languages. Also problematic are the Kuliak languages, which are spoken by hunter-gatherers and appear to retain a non-Nilo-Saharan core; Blench believes they might have been similar to Hadza or Dahalo and shifted incompletely to Nilo-Saharan.

Anbessa Tefera and Peter Unseth consider the poorly attested Shabo language to be Nilo-Saharan, though unclassified within the family due to lack of data; Dimmendaal and Blench, based on a more complete description, consider it to be a language isolate on current evidence. Proposals have sometimes been made to add Mande (usually included in Niger–Congo), largely because of its many noteworthy similarities with Songhay rather than with Nilo-Saharan as a whole. However, these similarities more likely reflect close contact between Songhay and Mande many thousands of years ago, in the early days of Nilo-Saharan, than a genetic link.

The extinct Meroitic language of ancient Kush has been accepted by linguists such as Rilly, Dimmendaal, and Blench as Nilo-Saharan, though others argue for an Afroasiatic affiliation. It is poorly attested.

There is little doubt that the constituent families of Nilo-Saharan—of which only Eastern Sudanic and Central Sudanic show much internal diversity—are valid groups. However, there have been several conflicting classifications in grouping them together. Each of the proposed higher-order groups has been rejected by other researchers: Greenberg's Chari–Nile by Bender and Blench, and Bender's Core Nilo-Saharan by Dimmendaal and Blench. What remains are eight (Dimmendaal) to twelve (Bender) constituent families of no consensus arrangement.

Greenberg 1963
Joseph Greenberg, in The Languages of Africa, set up the family with the following branches. The Chari–Nile core comprises the connections that had been suggested by previous researchers.

Gumuz was not recognized as distinct from neighbouring Koman; it was separated out (forming "Komuz") by Bender (1989).

Bender 1989, 1991
Lionel Bender came up with a classification which expanded upon and revised that of Greenberg. He considered Fur and Maban to constitute a Fur–Maban branch, added Kadu to Nilo-Saharan, removed Kuliak from Eastern Sudanic, removed Gumuz from Koman (but left it as a sister node), and chose to posit Kunama as an independent branch of the family. By 1991 he had added more detail to the tree, dividing Chari–Nile into nested clades, including a Core group in which Berta was considered divergent, and coordinating Fur–Maban as a sister clade to Chari–Nile.

Bender revised his model of Nilo-Saharan again in 1996, at which point he split Koman and Gumuz into completely separate branches of Core Nilo-Saharan.

Ehret 1989
Christopher Ehret came up with a novel classification of Nilo-Saharan as a preliminary part of his then-ongoing research into the macrofamily. His evidence for the classification was not fully published until much later (see Ehret 2001 below), and so it did not attain the same level of acclaim as competing proposals, namely those of Bender and Blench.

Bender 2000
By 2000 Bender had entirely abandoned the Chari–Nile and Komuz branches. He also added Kunama back to the "Satellite–Core" group and simplified the subdivisions therein. He retracted the inclusion of Shabo, stating that it could not yet be adequately classified but might prove to be Nilo-Saharan once sufficient research has been done. This tentative and somewhat conservative classification held as a sort of standard for the next decade.

Ehret 2001
Ehret's updated classification was published in his book A Historical–Comparative Reconstruction of Nilo-Saharan (2001). This model is notable in that it consists of two primary branches: Gumuz–Koman, and a Sudanic group containing the rest of the families (see Sudanic languages § Nilo-Saharan for more detail). Also, unusually, Songhay is well-nested within a core group and coordinate with Maban in a "Western Sahelian" clade, and Kadu is not included in Nilo-Saharan. Note that "Koman" in this classification is equivalent to Komuz, i.e. a family with Gumuz and Koman as primary branches, and Ehret renames the traditional Koman group as "Western Koman".

Blench 2006
Niger-Saharan, a language macrofamily linking the Niger-Congo and Nilo-Saharan phyla, was proposed by Blench (2006). It was not accepted by other linguists. Blench's (2006) internal classification of the Niger-Saharan macrophylum is as follows:


 * Proto-Niger-Saharan
   * Songhay, Saharan, Maba, Fur, Kuliak, Berta, Kunama, Komuz, Shabo
   * Kado-Sudanic
     * Kado (Kadugli-Krongo)
     * Niger-Sudanic
       * East Sudanic
       * Niger-Central Sudanic
         * Central Sudanic
         * Niger-Congo

According to Blench (2006), typological features common to both Niger-Congo and Nilo-Saharan include:
 * Phonology: ATR vowel harmony and the labial-velars /kp/ and /gb/
 * Noun-class affixes: e.g., ma- affix for mass nouns in Nilo-Saharan
 * Verbal extensions and plural verbs

Blench 2010
With a better understanding of Nilo-Saharan classifiers, and the affixes or number marking they have developed into in various branches, Blench believes that all of the families postulated as Nilo-Saharan belong together. He proposes the following tentative internal classification, with Songhai closest to Saharan, a relationship that had not previously been suggested:

? Mimi of Decorse

Blench 2015
By 2015, and again in 2017, Blench had refined the subclassification of this model, linking Maban with Fur, Kadu with Eastern Sudanic, and Kuliak with the node that contained them. He also added a tentative, extinct branch he names "Plateau" to explain a possible Nilo-Saharan substrate in the Malian Dogon and Bangime languages, for the following structure:

Blench (2021) concludes that Maban may be close to Eastern Sudanic.

Starostin (2016)
Georgiy Starostin (2016), using lexicostatistics based on Swadesh lists, is more inclusive than Glottolog, and in addition finds probable and possible links between the families that will require reconstruction of the proto-languages for confirmation. Starostin also does not consider Greenberg's Nilo-Saharan to be a valid, coherent clade.

In addition to the families listed in Glottolog (see the Glottolog section below), Starostin considers the following to be established:


 * Northern "K" Eastern Sudanic or "NNT" (Nubian, Nara, and Tama; see below for Nyima)
 * Southern "N" Eastern Sudanic (Surmic, Temein, Jebel, Daju, Nilotic), though their exact relationships to each other remain obscure
 * Central Sudanic (including Birri and Kresh–Aja, which may prove to be closest to each other)
 * Koman (including Gule)

A relationship of Nyima with Nubian, Nara, and Tama (NNT) is considered "highly likely" and close enough that proper comparative work should be able to demonstrate the connection if it's valid, though it would fall outside NNT proper (see Eastern Sudanic languages).

Other units that are "highly likely" to eventually prove to be valid families are:
 * East Sudanic as a whole
 * Central Sudanic – Kadu (Central Sudanic + Kadugli–Krongo)
 * Maba–Kunama (Maban + Kunama)
 * Komuz (Koman + Gumuz)

In summary, at this level of certainty, "Nilo-Saharan" constitutes ten distinct and separate language families: Eastern Sudanic, Central Sudanic – Kadu, Maba–Kunama, Komuz, Saharan, Songhai, Kuliak, Fur, Berta, and Shabo.

Possible further "deep" connections, which cannot be evaluated until the proper comparative work on the constituent branches has been completed, are:


 * Eastern Sudanic + Fur + Berta
 * Central Sudanic – Kadu + Maba–Kunama

There are faint suggestions that Eastern and Central Sudanic may be related (essentially the old Chari–Nile clade), though that possibility is "unexplorable under current conditions" and could be complicated if Niger–Congo were added to the comparison. Starostin finds no evidence that the Komuz, Kuliak, Saharan, Songhai, or Shabo languages are related to any of the other Nilo-Saharan languages. Mimi-D and Meroitic were not considered, though Starostin had previously proposed that Mimi-D was also an isolate despite its slight similarity to Central Sudanic.

In a follow-up study published in 2017, Starostin reiterated his previous points as well as explicitly accepting a genetic relationship between Macro-East Sudanic and Macro-Central Sudanic. Starostin names this proposal "Macro-Sudanic". The classification is as follows.


 * Macro-Sudanic
 * Macro-Sudanic macrofamily
 * Macro-Central Sudanic family
 * Central Sudanic family
 * Sara-Bongo-Bagirmi (West-Central Sudanic branch)
 * Kresh-Aja-Birri
 * East-Central Sudanic branch
 * Mangbutu-Efe
 * Mangbetu-Asoa
 * Lendu-Ngiti
 * Moru-Madi
 * Krongo-Kadugli (Kadu) group
 * Maba group
 * Macro-Eastern Sudanic family
 * Eastern Sudanic family
 * Northeast Sudanic family
 * Nubian group
 * Tama group
 * Nara language
 * Nyimang-Afitti group
 * Southeast Sudanic family
 * Surmic languages (Southern Surmic + Northern Surmic / Majang branches)
 * Nilotic languages (Western, Eastern, Southern branches)
 * Jebel group
 * Temein group
 * Daju group
 * Berta group
 * Fur-Amdang group
 * Kunama-Ilit group
 * Koman-Gumuz ("Komuz") family
 * Koman family
 * "Narrow Koman" group
 * Gule (Anej) language
 * Gumuz languages (group)
 * Saharan family
 * Western Saharan group (Kanuri-Kanembu + Teda-Dazaga)
 * Eastern Saharan group (Zaghawa + Berti)
 * Kuliak group
 * Songhay group
 * Shabo language (Mikeyir)

Starostin (2017) finds significant lexical similarities between Kadu and Central Sudanic, while some lexical similarities are also shared, to a lesser extent, by Central Sudanic with Fur-Amdang, Berta, and Eastern Sudanic.
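The basic idea of a lexicostatistic comparison over Swadesh-style lists can be illustrated with a minimal sketch. This is a generic shared-cognate percentage, not Starostin's actual procedure. The language names, concepts, and cognate-class labels below are hypothetical placeholders and do not represent real Nilo-Saharan data.

```python
# Generic lexicostatistics sketch with hypothetical placeholder data.
# For each concept, every language gets an arbitrary cognate-class label;
# two languages "share a cognate" for a concept when the labels are equal.
# None marks a concept for which no form is recorded.
from itertools import combinations

COGNATE_CLASSES = {
    "lang_A": {"water": 1, "fire": 1, "eye": 1, "two": 1, "name": None},
    "lang_B": {"water": 1, "fire": 2, "eye": 1, "two": 1, "name": 1},
    "lang_C": {"water": 2, "fire": 3, "eye": 1, "two": 2, "name": 2},
}


def shared_cognate_share(a: dict, b: dict) -> float:
    """Fraction of concepts attested in both lists whose classes match."""
    concepts = [c for c in a if b.get(c) is not None and a[c] is not None]
    if not concepts:
        raise ValueError("no concepts attested in both lists")
    matches = sum(1 for c in concepts if a[c] == b[c])
    return matches / len(concepts)


# Pairwise comparison of every language pair in the placeholder data.
scores = {
    (x, y): shared_cognate_share(COGNATE_CLASSES[x], COGNATE_CLASSES[y])
    for x, y in combinations(sorted(COGNATE_CLASSES), 2)
}

for (x, y), share in scores.items():
    print(f"{x} ~ {y}: {share:.0%} shared cognates")
```

Higher percentages are conventionally read as evidence of a closer relationship, but as the text notes, such results still need confirmation through reconstruction of the proto-languages.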

Dimmendaal 2016, 2019
Gerrit J. Dimmendaal suggests the following subclassification of Nilo-Saharan:

Dimmendaal et al. consider the evidence for the inclusion of Kadu and Songhay too weak to draw any conclusions at present, whereas there is some evidence that Koman and Gumuz belong together and may be Nilo-Saharan.

The large Northeastern division is based on several typological markers:
 * tolerance of complex syllable structure
 * higher amount of both inflectional and derivational morphology, including the presence of cases
 * verb-final (SOV or OSV) word order
 * coverb + light verb constructions
 * converbs

Blench 2023
By 2023, Blench had slightly revised the model for a deep primary split between Koman–Gumuz and the rest. Kunama and Berta are "provisionally" placed as the next to branch off, because they only partially share the features that unite the rest of the family. However, it is not clear if this is because they actually diverged early, or if they might have lost those features at a later date. For example, Berta shares plausible lexical cognates with the Eastern Jebel languages (East Sudanic) and its system of grammatical number "closely resembles" those of the East Sudanic languages; Kunama could be divergent "due to long-term interaction with Afroasiatic languages." Saharan–Songhay (especially Songhay) have seen substantial erosion of key characteristics, but this appears to be a secondary development and not evidence of early branching. "Core" Nilo-Saharan ("Central African" in Blench 2015) thus appears to be a typological rather than genetic grouping, though Maban is treated as a divergent branch of Eastern Sudanic; Kadu also seems to be quite close. The resulting structure is as follows:

Beyond the work of Colleen Ahland, Blench notes that the inclusion of Koman is buttressed by the work of Manuel Otero. The argument for Songhay is mostly lexical, especially the pronouns. Blench gives Greenberg credit for both East and Central Sudanic. Saharan and Songhay have some "striking" similarities in their lexicon, which Blench argues are genetic, though the absence of reliable proto-Saharan and proto-Songhay reconstructions makes evaluation difficult.

Glottolog 4.0 (2019)
In summarizing the literature to date, Hammarström et al. in Glottolog do not accept that the following families are demonstrably related with current research:


 * Berta
 * Central Sudanic (excluding Kresh–Aja; Birri is also questionable as Central Sudanic)
 * Daju (putatively East Sudanic)
 * Eastern Jebel (putatively East Sudanic)
 * Furan
 * Gule
 * Gumuz
 * Kadugli–Krongo
 * Koman (excluding Gule)
 * Kresh–Aja (putatively Central Sudanic)
 * Kuliak
 * Kunama
 * Maban (including Mimi-N)
 * Mimi-Gaudefroy (Mimi-D)
 * Nara (putatively East Sudanic)
 * Nilotic (putatively East Sudanic)
 * Nubian (putatively East Sudanic)
 * Nyimang (putatively East Sudanic)
 * Saharan
 * Songhai
 * Surmic (putatively East Sudanic)
 * Tama (putatively East Sudanic)
 * Temein (putatively East Sudanic)

External relations
Proposals for the external relationships of Nilo-Saharan typically center on Niger–Congo: Gregersen (1972) grouped the two together as Kongo–Saharan. However, Blench (2011) proposed that the similarities between Niger–Congo and Nilo-Saharan (specifically Atlantic–Congo and Central Sudanic) are due to contact, with the noun-class system of Niger–Congo developed from, or elaborated on the model of, the noun classifiers of Central Sudanic.

Phonology
Nilo-Saharan languages present great differences, being a highly diversified group. It has proven difficult to reconstruct many aspects of Proto-Nilo-Saharan. Two very different reconstructions of the proto-language have been proposed by Lionel Bender and Christopher Ehret.

Bender's reconstruction
The consonant system reconstructed by Bender for Proto-Nilo-Saharan is:

The phonemes correspond to coronal plosives, the phonetic details are difficult to specify, but clearly, they remain distinct from  and supported by many phonetic correspondences (another author, C. Ehret, reconstructs for the coronal area the sound  and  which perhaps are closer to the phonetic detail of, see infra)

Bender gave a list of about 350 cognates and discussed in depth the grouping and the phonological system proposed by Ch. Ehret. Blench (2000) compares both systems (Bender's and Ehret's) and prefers the former because it is more secure and is based on more reliable data. For example, Bender points out that there is a set of phonemes including implosives, ejectives and prenasalized consonants, but it seems that they can be reconstructed only for core groups (E, I, J, L) and the collateral group (C, D, F, G, H), not for Proto-Nilo-Saharan.

Ehret's reconstruction
Christopher Ehret used a less clear methodology and proposed a maximalist phonemic system. This system has been criticized by Bender and Blench, who state that the correspondences used by Ehret are not very clear and that many of the sounds he reconstructs may therefore be only allophonic variations.

Morphology
Dimmendaal (2016) cites the following morphological elements as stable across Nilo-Saharan:
 * Causative prefix: *ɪ- or *i-
 * Deverbal noun (abstract / participial / agent) prefix: *a-
 * Number suffixes: *-i, *-in, *-k
 * Reflexive marker: *rʊ
 * Personal pronouns: first person singular *qa, second person singular *yi
 * Logophoric pronoun: *(y)ɛ
 * Deictic markers: singular *n, plural *k
 * Postpositions: possessive *ne, locative *ta
 * Preposition: *kɪ
 * Negative verb: *kʊ

Comparative vocabulary
Sample basic vocabulary in different Nilo-Saharan branches:

Note: In table cells with slashes, the singular form is given before the slash, while the plural form follows the slash.

Population history
In the Sahel and East Africa, Nilo-Saharan speakers are associated with the ruling classes of powerful empires and sultanates that have dominated the region. These include the Gao Empire; the Songhai Empire, the largest contiguous of them, which dominated the Sahel, West Africa, the Sahara/Maghreb and Central Africa; the Kanem-Bornu Empire in Central Africa; the Sultanate of Damagaram; the Wadai Empire; the Sultanate of Baguirmi; the Sultanate of Darfur; the Sultanate of Sennar; the Zabarma Emirate; and the Shilluk Kingdom.

The pastoralist Tutsi and the Rutara people of the Great Lakes are also of Nilotic ancestry and have led the powerful Kingdom of Rwanda, the Kingdom of Burundi, the Kingdom of Bunyoro, the Kitara Empire, the Kingdom of Toro, the Kingdom of Buganda, the Kingdom of Karagwe, and the Kingdom of Rwenzururu. Although these kingdoms were established among the Bantu peoples, whose languages their rulers adopted, they have preserved the cattle pastoralism of the Nilotic peoples.