Bantu languages

The Bantu languages (Proto-Bantu: *bantʊ̀) are a language family of about 600 languages that are spoken by the Bantu peoples of Central, Southern, Eastern and Southeast Africa. They form the largest branch of the Southern Bantoid languages.

The total number of Bantu languages is estimated at between 440 and 680 distinct languages, depending on the definition of "language" versus "dialect". Many Bantu languages borrow words from each other, and some are mutually intelligible. Some of the languages are spoken by a very small number of people; for example, the Kabwa language was estimated in 2007 to be spoken by only 8,500 people but was assessed to be a distinct language.

The total number of Bantu speakers was estimated at around 350 million in 2015 (roughly 30% of the population of Africa, or 5% of the world population). Bantu languages are largely spoken southeast of Cameroon, and throughout Central, Southern, Eastern, and Southeast Africa. About one-sixth of Bantu speakers, and one-third of Bantu languages, are found in the Democratic Republic of the Congo.

The most widely spoken Bantu language by number of speakers is Swahili, with 16 million native speakers and 80 million L2 speakers (2015). Most native speakers of Swahili live in Tanzania, where it is a national language. As a second language, it is taught as a mandatory subject in many schools in East Africa and is a lingua franca of the East African Community.

Other major Bantu languages include Lingala with more than 20 million speakers (Congo, DRC), Zulu with 12 million speakers (South Africa), Xhosa with 8.2 million speakers (South Africa and Zimbabwe), and Shona with fewer than 10 million speakers (if Manyika and Ndau are included). The Sotho-Tswana languages (Sotho, Tswana and Pedi) have more than 15 million speakers (across Botswana, Lesotho, South Africa, and Zambia). Zimbabwe has Kalanga, Matebele, Nambiya, and Xhosa speakers. Ethnologue separates the largely mutually intelligible Kinyarwanda and Kirundi, which together have 20 million speakers.

Name
The similarity among dispersed Bantu languages had been observed as early as the 17th century. The term Bantu as a name for the group was not coined but "noticed" or "identified" (as Bâ-ntu) by Wilhelm Bleek, the first European to do so, in 1857 or 1858, and was popularized in his Comparative Grammar of 1862. He identified the term as the word for "people" in loosely reconstructed Proto-Bantu, from the plural noun class prefix *ba- categorizing "people", and the root *ntʊ̀- "some (entity), any" (e.g. Xhosa umntu "person", abantu "people"; Zulu umuntu "person", abantu "people").

There is no native term for the people who speak Bantu languages because they are not an ethnic group. People speaking Bantu languages refer to their languages by ethnic endonyms; prior to European contact, there was no indigenous concept of the larger ethnolinguistic phylum named by 19th-century European linguists. Bleek's identification was inspired by the anthropological observation of groups frequently self-identifying as "people" or "the true people" (as is the case, for example, with the term Khoikhoi, but this is a kare "praise address" and not an ethnic name).

The term narrow Bantu, excluding those languages classified as Bantoid by Guthrie (1948), was introduced in the 1960s.

The prefix ba- specifically refers to people. Endonymically, the term for cultural objects, including language, is formed with the ki- noun class (Nguni ísi-), as in KiSwahili (Swahili language and culture), IsiZulu (Zulu language and culture) and KiGanda (Ganda religion and culture).

In the 1980s, South African linguists suggested referring to these languages as KiNtu. The word kintu exists in some places, but it means "thing", with no relation to the concept of "language". In addition, delegates at the African Languages Association of Southern Africa conference in 1984 reported that, in some places, the term Kintu has a derogatory significance. This is because kintu refers to "things" and is used as a dehumanizing term for people who have lost their dignity.

In addition, Kintu is a figure in some mythologies.

In the 1990s, the term Kintu was still occasionally used by South African linguists. But in contemporary decolonial South African linguistics, the term Ntu languages is used.

Within the fierce debate among linguists about the word "Bantu", Seidensticker (2024) indicates that there has been a "profound conceptual trend in which a 'purely technical [term] without any non-linguistic connotations was transformed into a designation referring indiscriminately to language, culture, society, and race'".

Origin
The Bantu languages descend from a common Proto-Bantu language, which is believed to have been spoken in what is now Cameroon in Central Africa. An estimated 2,500–3,000 years ago (1000 BC to 500 BC), speakers of the Proto-Bantu language began a series of migrations eastward and southward, carrying agriculture with them. This Bantu expansion came to dominate Sub-Saharan Africa east of Cameroon, an area where Bantu peoples now constitute nearly the entire population. Some other sources estimate the Bantu Expansion started closer to 3000 BC.

The technical term Bantu, meaning "human beings" or simply "people", was first used by Wilhelm Bleek (1827–1875), as the concept is reflected in many of the languages of this group. A common characteristic of Bantu languages is that they use words such as muntu or mutu for "human being" or, in simple terms, "person". The plural prefix for human nouns starting with mu- (class 1) is ba- (class 2) in most languages, thus giving bantu for "people". Bleek, and later Carl Meinhof, pursued extensive studies comparing the grammatical structures of Bantu languages.

Classification


The most widely used classification is an alphanumeric coding system developed by Malcolm Guthrie in his 1948 classification of the Bantu languages. It is mainly geographic. The term "narrow Bantu" was coined by the Benue–Congo Working Group to distinguish Bantu as recognized by Guthrie, from the Bantoid languages not recognized as Bantu by Guthrie.

In recent times, the distinctiveness of Narrow Bantu as opposed to the other Southern Bantoid languages has been called into doubt, but the term is still widely used.

There is no true genealogical classification of the (Narrow) Bantu languages. Until recently, most attempted classifications only considered languages that happen to fall within traditional Narrow Bantu, but there seems to be a continuum with the related languages of Southern Bantoid.

At a broader level, the family is commonly split in two depending on the reflexes of proto-Bantu tone patterns: many Bantuists group together parts of zones A through D (the extent depending on the author) as Northwest Bantu or Forest Bantu, and the remainder as Central Bantu or Savanna Bantu. The two groups have been described as having mirror-image tone systems: where Northwest Bantu has a high tone in a cognate, Central Bantu languages generally have a low tone, and vice versa.

Northwest Bantu is more divergent internally than Central Bantu, and perhaps less conservative due to contact with non-Bantu Niger–Congo languages; Central Bantu is likely the innovative line cladistically. Northwest Bantu is not a coherent family, but even for Central Bantu the evidence is lexical, with little evidence that it is a historically valid group.

Another attempt at a detailed genetic classification to replace the Guthrie system is the 1999 "Tervuren" proposal of Bastin, Coupez, and Mann. However, it relies on lexicostatistics, which, because of its reliance on overall similarity rather than shared innovations, may predict spurious groups of conservative languages that are not closely related. Meanwhile, Ethnologue has added languages to the Guthrie classification which Guthrie overlooked, while removing the Mbam languages (much of zone A), and shifting some languages between groups (much of zones D and E to a new zone J, for example, and part of zone L to K, and part of M to F) in an apparent effort at a semi-genetic, or at least semi-areal, classification. This has been criticized for sowing confusion in one of the few unambiguous ways to distinguish Bantu languages. Nurse & Philippson (2006) evaluate many proposals for low-level groups of Bantu languages, but the result is not a complete portrayal of the family. Glottolog has incorporated many of these into their classification.

The languages that share Dahl's law may also form a valid group, Northeast Bantu. The infobox at right lists these together with various low-level groups that are fairly uncontroversial, though they continue to be revised. The development of a rigorous genealogical classification of many branches of Niger–Congo, not just Bantu, is hampered by insufficient data.

Computational phylogenetic classifications
Simplified phylogeny of northwestern branches of Bantu by Grollemund (2012):

Other computational phylogenetic analyses of Bantu include Currie et al. (2013), Grollemund et al. (2015), Rexova et al. (2006), Holden et al. (2016), and Whiteley et al. (2018).

Glottolog classification
Glottolog (2021) does not consider the older geographic classification by Guthrie relevant for its ongoing classification based on more recent linguistic studies, and divides Bantu into four main branches: Bantu A-B10-B20-B30, Central-Western Bantu, East Bantu and Mbam-Bube-Jarawan.

Language structure
Guthrie reconstructed both the phonemic inventory and the vocabulary of Proto-Bantu.

The most prominent grammatical characteristic of Bantu languages is the extensive use of affixes (see Sotho grammar and Ganda noun classes for detailed discussions of these affixes). Each noun belongs to a class, and each language may have several numbered classes, somewhat like grammatical gender in European languages. The class is indicated by a prefix that is part of the noun, as well as agreement markers on verb and qualificative roots connected with the noun. Plurality is indicated by a change of class, with a resulting change of prefix. All Bantu languages are agglutinative.

The verb has a number of prefixes, though in the western languages these are often treated as independent words. In Swahili, for example, Kitoto kidogo kimekisoma (for comparison, Kamwana kadoko karikuverenga in the Shona language) means 'The small child has read it [a book]'. Kitoto 'child' governs the adjective prefix ki- (representing the diminutive form of the word) and the verb subject prefix ki-. Then comes the perfect tense marker -me- and an object marker -ki- agreeing with implicit kitabu 'book' (from Arabic kitab). Pluralizing to 'children' gives Vitoto vidogo vimekisoma (Vana vadoko varikuverenga in Shona), and pluralizing to 'books' (vitabu) gives Vitoto vidogo vimevisoma.
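The way the Swahili example sentences are built from their parts can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: it models just the ki-/vi- class pair and the perfect marker -me- seen in the example, and the names CLASS_PREFIXES, agree, verb and sentence are hypothetical helpers, not part of any linguistic toolkit.

```python
# Illustrative sketch only: models the single ki-/vi- class pair from the example.
# CLASS_PREFIXES, agree, verb and sentence are hypothetical names.
CLASS_PREFIXES = {"singular": "ki", "plural": "vi"}


def agree(number: str, stem: str) -> str:
    """Attach the class prefix for the given number to a noun or adjective stem."""
    return CLASS_PREFIXES[number] + stem


def verb(subject_number: str, object_number: str, root: str, tense: str = "me") -> str:
    """Concatenate subject prefix + tense marker + object marker + verb root."""
    return CLASS_PREFIXES[subject_number] + tense + CLASS_PREFIXES[object_number] + root


def sentence(subject_number: str, object_number: str) -> str:
    """Build 'small child(ren) has/have read it/them' with full class agreement."""
    words = [
        agree(subject_number, "toto"),                # noun stem 'child'
        agree(subject_number, "dogo"),                # adjective stem 'small'
        verb(subject_number, object_number, "soma"),  # verb root 'read'
    ]
    return " ".join(words)


print(sentence("singular", "singular"))  # kitoto kidogo kimekisoma
print(sentence("plural", "singular"))    # vitoto vidogo vimekisoma
print(sentence("plural", "plural"))      # vitoto vidogo vimevisoma
```

Changing the number of the subject changes the prefix on the noun, the adjective and the verb together, while changing the number of the implicit object changes only the object marker inside the verb.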

Bantu words are typically made up of open syllables of the type CV (consonant-vowel) with most languages having syllables exclusively of this type. The Bushong language recorded by Vansina, however, has final consonants, while slurring of the final syllable (though written) is reported as common among the Tonga of Malawi. The morphological shape of Bantu words is typically CV, VCV, CVCV, VCVCV, etc.; that is, any combination of CV (with possibly a V- syllable at the start). In other words, a strong claim for this language family is that almost all words end in a vowel, precisely because closed syllables (CVC) are not permissible in most of the documented languages, as far as is understood.

This tendency to avoid consonant clusters in some positions is important when words are imported from English or other non-Bantu languages. An example from Chewa: the word "school", borrowed from English and then transformed to fit the sound patterns of this language, is sukulu. That is, sk- has been broken up by inserting an epenthetic -u-; -u has also been added at the end of the word. Another example is buledi for "bread". Similar effects are seen in loanwords in other, non-African CV languages such as Japanese. However, a clustering of sounds at the beginning of a syllable can be readily observed in such languages as Shona and the Makua languages.
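The epenthesis step described for sukulu can be sketched as follows. This is a simplified illustration, not a model of Chewa phonology: the input "skul" is an assumed rough segmental spelling of "school", the function name adapt_loanword is hypothetical, and a single fixed epenthetic vowel is used, so other adjustments visible in a form such as buledi are not modelled.

```python
# Simplified sketch: insert a vowel after every consonant that is not already
# followed by a vowel, so that every syllable ends up open (CV).
VOWELS = set("aeiou")


def adapt_loanword(segments: str, epenthetic: str = "u") -> str:
    """Break up consonant clusters and final consonants with an epenthetic vowel.

    `segments` is assumed to be a one-letter-per-sound spelling of the source word.
    """
    out = []
    for i, seg in enumerate(segments):
        out.append(seg)
        nxt = segments[i + 1] if i + 1 < len(segments) else ""
        # A consonant followed by another consonant, or by nothing, needs a vowel.
        if seg not in VOWELS and nxt not in VOWELS:
            out.append(epenthetic)
    return "".join(out)


print(adapt_loanword("skul"))  # sukulu
```

The cluster sk- receives a vowel after s, and the final l receives a vowel because nothing follows it, which yields the three open syllables su-ku-lu.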

With few exceptions, such as Kiswahili and Rutooro, Bantu languages are tonal and have two to four register tones.

Reduplication
Reduplication is a common morphological phenomenon in Bantu languages and is usually used to indicate frequency or intensity of the action signalled by the (unreduplicated) verb stem.


 * Example: in Swahili, piga means "strike", pigapiga means "strike repeatedly".

Well-known words and names that have reduplication include:
 * Bafana Bafana, a football team
 * Chipolopolo, a football team
 * Eric Djemba-Djemba, a footballer
 * Lomana LuaLua, a footballer

Repetition emphasizes the repeated word in the context that it is used. For instance, "Mwenda pole hajikwai," means "He who goes slowly doesn't trip," while, "Pole pole ndio mwendo," means "A slow but steady pace wins the race." The latter repeats "pole" to emphasize the consistency of slowness of the pace.

As another example, "Haraka haraka" would mean "hurrying just for the sake of hurrying" (reckless hurry), as in "Njoo! Haraka haraka" [come here! Hurry, hurry].

In contrast, there are some words in some of the languages in which reduplication has the opposite meaning. It usually denotes short durations, or lower intensity of the action, and also means a few repetitions or a little bit more.


 * Example 1: In Xitsonga and (Chi)Shona, famba means "walk" while famba-famba means "walk around".
 * Example 2: in isiZulu and SiSwati hamba means "go", hambahamba means "go a little bit, but not much".
 * Example 3: in both of the above languages shaya means "strike", shayashaya means "strike a few more times lightly, but not heavy strikes and not too many times".
 * Example 4: In Shona kwenya means "scratch", kwenyakwenya means "scratch excessively or a lot".
 * Example 5: In Luhya cheenda means "walk", cheendacheenda means "take a walk but not far off", as in buying time before something is ready or a situation or time is right.
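Formally, the reduplicated forms in the examples above are simply the stem repeated in full. A minimal sketch, assuming full-stem reduplication only (the function name reduplicate and the optional separator are illustrative choices); the change in meaning differs from language to language and is not modelled:

```python
def reduplicate(stem: str, separator: str = "") -> str:
    """Return the stem repeated once, optionally joined by a separator."""
    return stem + separator + stem


print(reduplicate("piga"))        # pigapiga
print(reduplicate("hamba"))       # hambahamba
print(reduplicate("famba", "-"))  # famba-famba
```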

Noun class
The following is a list of nominal classes in Bantu languages:

Syntax
Virtually all Bantu languages have subject–verb–object word order, with some exceptions such as the Nen language, which has subject–object–verb word order.

By country
Following is an incomplete list of the principal Bantu languages of each country. Included are those languages whose speakers constitute at least 1% of the population and that have at least 10% of the number of speakers of the largest Bantu language in the country.
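The two inclusion thresholds can be expressed as a simple filter. The sketch below is illustrative only: the function name principal_languages is hypothetical, and the country, language names and figures are invented placeholders rather than data from the lists that follow.

```python
def principal_languages(speakers: dict[str, int], population: int) -> list[str]:
    """Keep languages with >= 1% of the population and >= 10% of the largest language."""
    largest = max(speakers.values())
    return [
        name
        for name, count in speakers.items()
        # Integer arithmetic avoids floating-point rounding at the thresholds.
        if count * 100 >= population and count * 10 >= largest
    ]


# Hypothetical country with invented figures, for illustration only.
example = {
    "Language A": 5_000_000,
    "Language B": 600_000,
    "Language C": 400_000,  # over 1% of the population, but under 10% of Language A
    "Language D": 90_000,   # under 1% of the population
}
print(principal_languages(example, population=10_000_000))
# ['Language A', 'Language B']
```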

Most languages are referred to in English without the class prefix (Swahili, Tswana, Ndebele), but are sometimes seen with the (language-specific) prefix (Kiswahili, Setswana, Sindebele). In a few cases prefixes are used to distinguish languages with the same root in their name, such as Tshiluba and Kiluba (both Luba), Umbundu and Kimbundu (both Mbundu). The prefixless form typically does not occur in the language itself, but is the basis for other words based on the ethnicity. So, in the country of Botswana the people are the Batswana, one person is a Motswana, and the language is Setswana; and in Uganda, centred on the kingdom of Buganda, the dominant ethnicity are the Baganda (singular Muganda), whose language is Luganda.
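The relationship between a prefixless root and the prefixed names for people, person and language can be illustrated with a small lookup table. The table below contains only the Tswana and Ganda forms given in the text; the names NAME_PREFIXES and derive are hypothetical, and prefixes differ between languages, so nothing here generalizes beyond these two roots.

```python
# Prefixes taken from the examples in the text (Botswana and Uganda).
NAME_PREFIXES = {
    "tswana": {"people": "Ba", "person": "Mo", "language": "Se"},
    "ganda": {"people": "Ba", "person": "Mu", "language": "Lu", "kingdom": "Bu"},
}


def derive(root: str, role: str) -> str:
    """Attach the language-specific prefix for the given role to the root."""
    return NAME_PREFIXES[root][role] + root


print(derive("tswana", "people"))    # Batswana
print(derive("tswana", "person"))    # Motswana
print(derive("tswana", "language"))  # Setswana
print(derive("ganda", "language"))   # Luganda
```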

Lingua franca

 * Swahili (Kiswahili) (350,000; tens of millions as L2)

Angola

 * South Mbundu (Umbundu) (4 million)
 * Central North Mbundu (Kimbundu) (3 million)
 * North Bakongo (Kikongo) (576,800)
 * Ovambo (Ambo) (Oshiwambo) (500,000)
 * Luvale (Chiluvale) (500,000)
 * Chokwe (Chichokwe) (500,000)

Botswana

 * Tswana (Setswana) (1.6 million)
 * Kalanga (Ikalanga) (150,000)

Burundi

 * Swahili is a recognized national language


 * Kirundi (8.5 – 10.5 million)

Cameroon

 * Beti (1.7 million: 900,000 Bulu, 600,000 Ewondo, 120,000 Fang, 60,000 Eton, 30,000 Bebele)
 * Basaa (230,000)
 * Duala (350,000)
 * Manenguba languages (230,000)

Central African Republic

 * Mbati (60,000)
 * Aka (30,000)
 * Pande (8,870)
 * Ngando (5,000)
 * Ukhwejo
 * Kako
 * Mpiemo
 * Bodo
 * Kari

Democratic Republic of the Congo

 * Swahili is a recognized national language


 * Lingala (Ngala) (2 million; 7 million with L2 speakers)
 * Luba-Kasai (Tshiluba) (6.5 million)
 * Kituba (4.5 million), a Bantu creole
 * Kongo (Kikongo) (3.5 million)
 * Luba-Katanga (Kiluba) (1.5+ million)
 * Songe (Lusonge) (1+ million)
 * Nande (Orundandi) (1 million)
 * Tetela (Otetela) (800,000)
 * Yaka (Iyaka) (700,000+)
 * Shi (700,000)
 * Yombe (Kiyombe) (670,000)
 * Lele (Bashilele) (26,000)

Equatorial Guinea

 * Beti (Fang) (300,000)
 * Bube (40,000)

Eswatini

 * Swazi (Siswati) (1 million)

Gabon

 * Baka
 * Barama
 * Bekwel
 * Benga
 * Bubi
 * Bwisi
 * Duma
 * Fang (500,000)
 * Kendell
 * Kanin
 * Sake
 * Sangu
 * Seki
 * Sighu
 * Simba
 * Sira
 * Northern Teke
 * Western Teke
 * Tsaangi
 * Tsogo
 * Vili (3,600)
 * Vumbu
 * Wandji
 * Wumbvu
 * Yangho
 * Yasa

Kenya

 * Swahili and English are national languages


 * Gikuyu (8 million)
 * Luhya (6.8 million)
 * Kamba (4 million)
 * Meru (Kimeru) (2.7 million)
 * Gusii (2 million)
 * Mijikenda
 * Taita
 * Embu
 * Mbeere
 * Giriama

Lesotho

 * Sesotho (1.8 million)
 * Zulu (Isizulu) (300,000)

Malawi

 * Chewa (Nyanja) (Chichewa) (7 million)
 * Tumbuka (1 million)
 * Yao (1 million)

Mozambique

 * Swahili is a recognized national language


 * Makhuwa (4 million; 7.4 million all Makua)
 * Tsonga (Xitsonga) (3.1 million)
 * Shona (Ndau) (1.6 million)
 * Lomwe (1.5 million)
 * Sena (1.3 million)
 * Tswa (1.2 million)
 * Chuwabu (1.0 million)
 * Chopi (800,000)
 * Ronga (700,000)
 * Chewa (Nyanja) (Chichewa) (600,000)
 * Yao (Chiyao) (500,000)
 * Nyungwe (Cinyungwe/Nhungue) (400,000)
 * Tonga (400,000)
 * Makonde (400,000)
 * Nathembo (25,000)

Namibia

 * Ovambo (Ambo, Oshiwambo) (1,500,000)
 * Herero (200,000)

Nigeria

 * Jarawa (250,000)
 * Mbula-Bwazza (100,000)
 * Kulung (40,000)
 * Bile (38,000)
 * Lame (10,000)
 * Mama (2,000–3,000)
 * Shiki (1,200)
 * Gwa
 * Labir
 * Dulbu

Republic of the Congo

 * Kituba (1.2+ million) [a Bantu creole]
 * Kongo (Kikongo) (1.0 million)
 * Teke languages (500,000)
 * Yombe (350,000)
 * Suundi (120,000)
 * Mbosi (110,000)
 * Lingala (100,000; ? L2 speakers)

Rwanda

 * Swahili, Kinyarwanda, English, and French are official languages


 * Kinyarwanda (Kinyarwanda) (10 – 12 million)

Somalia

 * Swahili (Mwini dialect)
 * Chimwini
 * Mushungulu

South Africa
According to the South African National Census of 2011:
 * Zulu (Isizulu) (11,587,374)
 * Xhosa (Isixhosa) (8,154,258)
 * Sepedi (4,618,576)
 * Tswana (Setswana) (4,067,248)
 * Sotho (Sesotho) (3,849,563)
 * Tsonga (Xitsonga) (2,277,148)
 * Swazi (Siswati) (1,297,046)
 * Venda (Tshivenda) (1,209,388)
 * Southern Ndebele (Transvaal Ndebele) (1,090,223)
 * Total Nguni: 22,406,049 (61.98%)
 * Total Sotho-Tswana: 13,744,775 (38.02%)
 * Total official indigenous language speakers: 36,150,824 (69.83%)

Tanzania

 * Swahili is the national language


 * Sukuma (5.5 million)
 * Gogo (1.5 million)
 * Haya (Kihaya) (1.3 million)
 * Chaga (Kichaga) (1.2+ million: 600,000 Mochi, 300,000+ Machame, 300,000+ Vunjo)
 * Nyamwezi (1.0 million)
 * Makonde (1.0 million)
 * Ha (1.0 million)
 * Nyakyusa (800,000)
 * Hehe (800,000)
 * Luguru (700,000)
 * Bena (600,000)
 * Shambala (650,000)
 * Nyaturu (600,000)

Uganda

 * Swahili and English are official languages


 * Luganda (9,295,300)
 * Runyankore (4,436,000)
 * Lusoga (3,904,600)
 * Rukiga (3,129,000)
 * Masaba (Lumasaba) (2.7 million)
 * Runyoro (1,273,000)
 * Konjo (1,118,000)
 * Rutooro (1,111,000)
 * Lugwere (816,000)
 * Kinyarwanda (750,000)
 * Samia (684,000)
 * Ruuli (250,000)
 * Talinga Bwisi (133,000)
 * Gungu (110,000)
 * Amba (56,000)
 * Singa

Zambia

 * Aushi (Unknown)
 * Bemba (3.3 million)
 * Tonga (1.0 million)
 * Chewa (Nyanja) (Chichewa) (800,000)
 * Kaonde (240,000)
 * Lozi (Silozi) (600,000)
 * Lala-Bisa (600,000)
 * Nsenga (550,000)
 * Tumbuka (Chitumbuka) (500,000)
 * Lunda (450,000)
 * Nyiha (400,000+)
 * Mambwe-Lungu (400,000)

Zimbabwe

 * Shona languages (15 million incl. Karanga, Zezuru, Korekore, Ndau, Manyika)
 * Northern Ndebele (IsiNdebele) (estimated 2 million)
 * Tonga
 * Chewa/Nyanja (Chichewa/ChiNyanja)
 * Venda
 * Kalanga

Geographic areas
Map 1 shows Bantu languages in Africa and map 2 a magnification of the Benin, Nigeria and Cameroon area, as of July 2017.

Bantu words popularised in western cultures
A case has been made for the borrowing of many place-names and even misremembered rhymes, chiefly from one of the Luba varieties, in the USA.

Some words from various Bantu languages have been borrowed into western languages. These include:

 * Boma
 * Bomba
 * Bongos
 * Bwana
 * Candombe
 * Chimpanzee
 * Gumbo
 * Hakuna matata
 * Impala
 * Indaba
 * Jenga
 * Jumbo
 * Kalimba
 * Kwanzaa
 * Mamba
 * Mambo
 * Mbira
 * Marimba
 * Rumba
 * Safari
 * Samba
 * Simba
 * Ubuntu

Writing systems
Along with the Latin script and Arabic script orthographies, there are also some modern indigenous writing systems used for Bantu languages:
 * The Mwangwego alphabet is an abugida created in 1979 that is sometimes used to write the Chewa language and other languages of Malawi.
 * The Mandombe script is an abugida that is used to write the Bantu languages of the Democratic Republic of the Congo, mainly by the Kimbanguist movement.
 * The Isibheqe Sohlamvu or Ditema tsa Dinoko script is a featural syllabary used to write the Sintu or Southern Bantu languages.