Proto-human language

The proto-human language, also known as proto-sapiens or proto-world, is the hypothetical direct genetic predecessor of all human languages.

The concept is speculative and not amenable to analysis in historical linguistics. It presupposes a monogenetic origin of language, i.e. the derivation of all natural languages from a single origin, presumably at some time in the Paleolithic period. As the predecessor of all extant languages spoken by modern humans (Homo sapiens), proto-human language as hypothesised would not necessarily be ancestral to any hypothetical Neanderthal language.

Terminology
There is no generally accepted term for this concept. Most treatments of the subject do not include a name for the language under consideration (e.g. Bengtson and Ruhlen). The terms proto-world and proto-human are in occasional use. Merritt Ruhlen used the term proto-sapiens.

History of the idea
The first serious scientific attempt to establish the reality of monogenesis was that of Alfredo Trombetti, in his book L'unità d'origine del linguaggio, published in 1905. Trombetti estimated that the common ancestor of existing languages had been spoken between 100,000 and 200,000 years ago.

Monogenesis was dismissed by many linguists in the late 19th and early 20th centuries when the doctrine of the polygenesis of the human races and their languages was popularised.

The best-known supporter of monogenesis in America in the mid-20th century was Morris Swadesh. He pioneered two important methods for investigating deep relationships between languages, lexicostatistics and glottochronology.

In the second half of the 20th century, Joseph Greenberg produced a series of large-scale classifications of the world's languages. These were and are controversial but widely discussed. Although Greenberg did not produce an explicit argument for monogenesis, all of his classification work was geared toward this end. As he stated: "The ultimate goal is a comprehensive classification of what is very likely a single language family."

Notable American advocates of linguistic monogenesis include Merritt Ruhlen, John Bengtson, and Harold Fleming.

Date and location
The first concrete attempt to estimate the date of the hypothetical ancestor language was that of Alfredo Trombetti, who concluded it was spoken between 100,000 and 200,000 years ago, or close to the first emergence of Homo sapiens.

It is uncertain or disputed whether the earliest members of Homo sapiens had fully developed language. Some scholars link the emergence of language proper (out of a proto-linguistic stage that may have lasted considerably longer) to the development of behavioral modernity toward the end of the Middle Paleolithic or at the beginning of the Upper Paleolithic, roughly 50,000 years ago. Thus, in the opinion of Richard Klein, the ability to produce complex speech only developed some 50,000 years ago (with the appearance of modern humans or Cro-Magnon). Johanna Nichols (1998) argued that vocal languages must have begun diversifying in our species at least 100,000 years ago.

In 2011, an article in the journal Science proposed an African origin of modern human languages. It was suggested that human language predates the out-of-Africa migrations of 50,000 to 70,000 years ago and that language might have been the essential cultural and cognitive innovation that facilitated human colonization of the globe.

Perreault and Mathew (2012) estimated the time of the first emergence of human language from phonemic diversity. The estimate rests on the assumption that phonemic diversity evolves much more slowly than grammar or vocabulary, slowly increasing over time (but reduced among small founding populations). The largest phoneme inventories are found among African languages, while the smallest inventories are found in South America and Oceania, some of the last regions of the globe to be settled. The authors used data from the colonization of Southeast Asia to estimate the rate of increase in phonemic diversity. Applying this rate to African languages, they arrived at an estimated age of 150,000 to 350,000 years, compatible with the emergence and early dispersal of H. sapiens. The approach has been criticized as flawed.
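The extrapolation behind such an estimate can be sketched as simple arithmetic. The following is a minimal illustration that assumes a strictly linear accumulation of phonemes over time. The function names and every numeric input are hypothetical placeholders chosen for readability; they are not values or procedures taken from Perreault and Mathew (2012).

```python
def diversity_rate(phonemes_gained: float, years_elapsed: float) -> float:
    """Phonemes gained per year in a calibration region (linear assumption)."""
    if years_elapsed <= 0:
        raise ValueError("years_elapsed must be positive")
    return phonemes_gained / years_elapsed


def estimated_age(phoneme_diversity: float, rate_per_year: float) -> float:
    """Years needed to accumulate the observed diversity at a constant rate."""
    if rate_per_year <= 0:
        raise ValueError("rate_per_year must be positive")
    return phoneme_diversity / rate_per_year


# Placeholder inputs only: a calibration region assumed to have gained
# 2 phonemes in 10,000 years, and a target region with 40 phonemes
# of accumulated diversity.
rate = diversity_rate(phonemes_gained=2.0, years_elapsed=10_000)
age = estimated_age(phoneme_diversity=40.0, rate_per_year=rate)
print(f"rate: {rate:.6f} phonemes/year, estimated age: {age:,.0f} years")
```

Because the age is the ratio of two uncertain quantities, a small error in the calibration rate scales the whole result, which is one reason a wide range such as the one reported is to be expected from this kind of calculation.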

Claimed characteristics
Speculation on the "characteristics" of proto-world is limited to linguistic typology, i.e. the identification of universal features shared by all human languages, such as grammar (in the sense of "fixed or preferred sequences of linguistic elements") and recursion. Beyond this, nothing is known of it.

Christopher Ehret has hypothesized that proto-human had a very complex consonant system, including clicks.

A few linguists, such as Merritt Ruhlen, have suggested the application of mass comparison and internal reconstruction (cf. Babaev 2008). Several linguists have attempted to reconstruct the language, while many others reject this as fringe science.

Vocabulary
Ruhlen tentatively traces several words back to the ancestral language, based on the occurrence of similar sound-and-meaning forms in languages across the globe. Bengtson and Ruhlen identify 27 "global etymologies". The following table lists a selection of these forms:

Based on these correspondences, Ruhlen lists these roots for the ancestor language:


 * *ku = 'who'
 * *ma = 'what'
 * *akʷa = 'water'
 * *sum = 'hair'
 * *čuna = 'nose, smell'

Selected items from Bengtson and Ruhlen's (1994) list of 27 "global etymologies":


{| class="wikitable sortable"
! No. !! Root !! Gloss
|-
| 4 || *čun(g)a || 'nose; to smell'
|-
| 10 || *ku(n) || 'who?'
|-
| 26 || *tsuma || 'hair'
|-
| 27 || *ʔaq'wa || 'water'
|}

Syntax
There are competing theories about the basic word order of the hypothesized proto-human. These usually assume subject-initial ordering because it is the most common globally. Derek Bickerton proposes SVO (subject-verb-object) because this word order (like its mirror OVS) helps differentiate between the subject and object in the absence of evolved case markers by separating them with the verb.

By contrast, Talmy Givón hypothesizes that proto-human had SOV (subject-object-verb), based on the observation that many old languages (e.g., Sanskrit and Latin) had dominant SOV, but the proportion of SVO has increased over time. On such a basis, it is suggested that human languages are shifting globally from the original SOV to the modern SVO. Givón bases his theory on the empirical claim that word-order change mostly results in SVO and never in SOV.

Exploring Givón's idea in their 2011 paper, Murray Gell-Mann and Merritt Ruhlen stated that shifts to SOV are also attested. However, when these were excluded, the data did support Givón's claim. The authors justified the exclusion by pointing out that the shift to SOV is, without exception, a matter of borrowing the order from a neighboring language. Moreover, they argued that, since many languages have already changed to SVO, a new trend towards VSO and VOS ordering has arisen.

Harald Hammarström reanalysed the data. In contrast to such claims, he found that a shift to SOV is in every case the most common type, suggesting that there is, rather, an unchanged universal tendency towards SOV regardless of the way that languages change and that the relative increase of SVO is a historical effect of European colonialism.

Criticism
Many linguists reject the methods used to determine these forms. Critics raise several objections to the methods Ruhlen and Gell-Mann employed. The core objection is that the words being compared do not show common ancestry; the reasons for this vary. One is onomatopoeia: for example, the suggested root for smell listed above, *čuna, may simply be a result of many languages employing an onomatopoeic word that sounds like sniffing, snuffling, or smelling. Another is the taboo quality of certain words. Lyle Campbell points out that many established proto-languages do not contain an equivalent word for *putV 'vulva' because of how often such taboo words are replaced in the lexicon, and notes that it "strains credibility to imagine" that a proto-world form of such a word would survive in many languages.

Using the criteria that Bengtson and Ruhlen employ to find cognates to their proposed roots, Campbell found seven possible matches in Spanish to their root for woman, *kuna, including cónyuge 'wife, spouse', chica 'girl', and cana 'old woman' (adjective). He then showed that the forms Bengtson and Ruhlen would identify as reflexes of *kuna cannot possibly be related to a proto-world word for woman. Cónyuge, for example, comes from the Latin root meaning 'to join', so its origin had nothing to do with the word woman; chica is related to a Latin word meaning 'insignificant thing'; cana comes from the Latin word for white, and again shows a history unrelated to the word woman. Campbell asserts that these types of problems are endemic to the methods used by Ruhlen and others.

Some linguists question the very possibility of tracing language elements so far back into the past. Campbell notes that given the time elapsed since the origin of human language, every word from that time would have been replaced or changed beyond recognition in all languages today. Campbell harshly criticizes efforts to reconstruct a proto-human language, saying "the search for global etymologies is at best a hopeless waste of time, at worst an embarrassment to linguistics as a discipline, unfortunately confusing and misleading to those who might look to linguistics for understanding in this area."