Naʼvi language

The Naʼvi language (Naʼvi: Lìʼfya leNaʼvi) is a fictional constructed language originally made for the 2009 film Avatar. In the film franchise, the language is spoken by the Naʼvi, a race of sapient humanoids indigenous to the extraterrestrial moon Pandora. The language was created by Paul Frommer, a professor at the USC Marshall School of Business with a doctorate in linguistics. Naʼvi was designed to fit moviemaker James Cameron's conception of what the language should sound like in the film. It had to be realistically learnable by the fictional human characters of the film and pronounceable by the actors, but also not closely resemble any single human language.

When the film was released in 2009, Naʼvi had a growing vocabulary of about a thousand words, but understanding of its grammar was limited to the language's creator. Since then, Frommer has expanded the lexicon to more than 2600 words and has published the grammar, making Naʼvi a relatively complete, learnable and serviceable language.

Roots
The Naʼvi language has its origins in James Cameron's early work on Avatar. In 2005, while the film was still in scriptment form, Cameron felt it needed a complete, consistent language for the alien characters to speak. He had written approximately thirty words for this alien language but wanted a linguist to create the language in full. His production company, Lightstorm Entertainment, contacted the linguistics department at the University of Southern California seeking someone who would be interested in creating such a language. Edward Finegan, a professor of linguistics at USC, thought that the project would appeal to Paul Frommer, with whom he had co-authored a linguistics textbook, and so forwarded Lightstorm's inquiry on to him. Frommer and Cameron met to discuss the director's vision for the language and its use in the film; at the end of the meeting, Cameron shook Frommer's hand and said "Welcome aboard."

Based on Cameron's initial list of words, which had a "Polynesian flavor" according to Frommer, the linguist developed three different sets of meaningless words and phrases that conveyed a sense of what an alien language might sound like: one using contrasting tones, one using varying vowel lengths, and one using ejective consonants. Of the three, Cameron liked the sound of the ejectives most. His choice established the phonology that Frommer would use in developing the rest of the Naʼvi language – morphology, syntax, and an initial vocabulary – a task that took six months.

Development
The Naʼvi vocabulary was created by Frommer as needed for the script of the movie. By the time casting for Avatar began, the language was sufficiently developed that actors were required to read and pronounce Naʼvi dialogue during auditions. During shooting Frommer worked with the cast, helping them understand their Naʼvi dialogue and advising them on their Naʼvi pronunciation, stress, and intonation. Actors would often make mistakes in speaking Naʼvi. In some cases, those mistakes were plausibly explained as ones their human characters would make while learning the language in-universe; in other cases, the mistakes were incorporated into the language.

Frommer expanded the vocabulary further in May 2009 when he worked on the Avatar video game, which required Naʼvi words that had not been needed for the film script and thus had not yet been invented. Frommer also translated into Naʼvi four sets of song lyrics that had been written by Cameron in English, and he helped vocalists with their pronunciation during the recording of James Horner's Avatar score. At the time of the film's release on December 18, 2009, the Naʼvi vocabulary consisted of approximately 1000 words.

Work on the Naʼvi language has continued since the film's release. At the time of release, Frommer was working on a compendium that he planned to deliver to Fox. He hoped that the language would "have a life of its own," and thought it would be "wonderful" if the language developed a following. It has since done so, as the growing community of learners shows. The community's Lexical Expansion Project, together with Frommer, has expanded the lexicon by more than 50 percent.

Frommer also maintains a blog, Naʼviteri, where he regularly posts additions to the lexicon and clarifications on grammar. Naʼviteri has been the source of the vast majority of Naʼvi growth independent of Frommer's contract with 20th Century Fox.

Structure and usage
The Naʼvi language was developed under three significant constraints. First, Cameron wanted the language to sound alien but pleasant and appealing to audiences. Second, since the storyline included humans who have learned to speak the language, it had to be a language that humans could plausibly learn to speak. And finally, the actors would have to be able to pronounce their Naʼvi dialogue without unreasonable difficulty. The language in its final form contains several elements which are uncommon in human languages, such as verbal conjugation using infixes. All Naʼvi linguistic elements are found in human languages, but the combination is unique.

Phonology and orthography
Naʼvi lacks voiced plosives like [b d ɡ], but has the ejective consonants [pʼ tʼ kʼ], which are spelled px, tx, kx. It also has the syllabic consonants ll and rr. There are seven vowels, a ä e i ì o u. Although all the sounds were designed to be pronounceable by the human actors of the film, there are unusual consonant clusters, as in fngap "metal".

Naʼvi syllables may be as simple as a single vowel, or as complex as skxawng "moron" or fngap above (both CCVC).

The fictional language Naʼvi of Pandora is unwritten. However, the actual (studio) language is written in the Latin script for the actors of Avatar. Some words include: zìsìt "year", fpeio "ceremonial challenge", ’awve "first" (’aw "one"), muiä "fair", tireaioang "spirit animal", tskxe "rock", kllpxìltu "territory", uniltìrantokx "avatar" (dream-walk-body).

Vowels
There are seven monophthong vowels: a, ä, e, i, ì, o, u.

There are additionally four diphthongs: aw [aw], ew [ɛw], ay [aj], ey [ɛj], and two syllabic consonants: ll [l̩] and rr [r̩], which mostly behave as vowels.

Note that the e is open-mid while the o is close-mid, and that there is no *oy. The rr is strongly trilled, and the ll is "light" (plain), never "dark" (velarized).

These vowels may occur in sequences, as in the Polynesian languages, Swahili, and Japanese. Each vowel counts as a syllable, so that tsaleioae has six syllables and meoauniaea has eight.
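The syllable counts above can be checked mechanically. The following is a minimal sketch, assuming that every vowel letter is one syllable nucleus, that the glides w and y of diphthongs are not counted separately, and that each ll or rr in the spelling is a syllabic consonant. The function name count_syllables is hypothetical and is not part of any published Naʼvi tool.

```python
import unicodedata

# The seven vowel letters listed in the text.
VOWELS = set("aäeiìou")

def count_syllables(word: str) -> int:
    """Count syllable nuclei: each vowel letter, plus each syllabic ll or rr.

    Assumption: every 'll' or 'rr' in the spelling is a syllabic consonant.
    """
    # Normalize so that ä and ì are single code points, then lowercase.
    word = unicodedata.normalize("NFC", word).lower()
    count = 0
    i = 0
    while i < len(word):
        if word[i] in VOWELS:
            count += 1
            i += 1
        elif word[i:i + 2] in ("ll", "rr"):
            count += 1
            i += 2
        else:
            i += 1
    return count

if __name__ == "__main__":
    for w in ("tsaleioae", "meoauniaea", "skxawng", "kllpxìltu"):
        print(w, count_syllables(w))
```

Because the glides of aw, ew, ay, ey are consonant letters in the spelling, a diphthong contributes one nucleus without any special handling.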

Naʼvi does not have vowel length or tone, but it does have contrastive stress: túte "person", tuté "female person". Although stress may move with derivation, as here, it is not affected by inflection (case on nouns, tense on verbs, etc.). So, for example, the verb lu ("to be") has stress on its only vowel, the u, and no matter what else happens to it, the stress stays on that vowel: lolú "was" (l⟨ol⟩u), lolängú "was (ugh!)" (l⟨ol⟩⟨äng⟩u), etc.

Consonants
There are twenty consonants. There are two Latin transcriptions: one that more closely approaches the ideal of one letter per phoneme, with c and g for [ts] and [ŋ] (the values they have in much of Eastern Europe and Polynesia, respectively), and a modified transcription used for the actors, with the digraphs ts and ng used for those sounds. In both transcriptions, the ejective consonants are written with digraphs in x, a convention with no obvious external precedent, though it may echo the Esperanto convention of writing x as a stand-in for the circumflex.

The fricatives and the affricate, f v ts s z h, are restricted to the onset of a syllable; the others may occur at the beginning or at the end (though w y in final position are considered parts of diphthongs, as they only occur as ay ey aw ew and may be followed by another final consonant, as in skxawng "moron"). However, in addition to appearing before vowels, f ts s may form consonant clusters with any of the unrestricted consonants (the plosives, nasals, and liquids/glides) apart from ’, making for 39 clusters. Other sequences occur across syllable boundaries, such as ʼv in Naʼvi and kr in ikran "banshee".
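The figure of 39 clusters can be reproduced by enumeration. The sketch below assumes a particular inventory of thirteen unrestricted consonants (px tx kx p t k m n ng r l w y). This list is inferred from the spellings used in this article and is an assumption, since the consonant table itself is not reproduced here.

```python
from itertools import product

# The three consonants that can begin a cluster, per the text.
CLUSTER_INITIALS = ["f", "ts", "s"]

# Assumed inventory of unrestricted consonants, excluding the glottal stop.
UNRESTRICTED = ["px", "tx", "kx", "p", "t", "k",
                "m", "n", "ng", "r", "l", "w", "y"]

def onset_clusters() -> list[str]:
    """Return every two-consonant onset cluster allowed by the stated rule."""
    return [first + second
            for first, second in product(CLUSTER_INITIALS, UNRESTRICTED)]

if __name__ == "__main__":
    clusters = onset_clusters()
    print(len(clusters))
    print(" ".join(clusters))
```

Under this assumed inventory, the result includes the clusters attested in the article's examples, such as fng (fngap), skx (skxawng), tsm (tsmukan), fp (fpeio), and tskx (tskxe).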

The plosives p t k are tenuis, as in Spanish or French. In final position, they have no audible release, as in Indonesian and other languages of Southeast Asia, as well as in many dialects of English in words such as "bat". The r is flapped, as in Spanish and Indonesian; it sounds a bit like the tt or dd in the American pronunciation of the words latter / ladder.

Sound change
The plosives undergo lenition after certain prefixes and prepositions. The ejective consonants px tx kx become the corresponding plosives p t k; the plosives p t k become the fricatives f s h, and the affricate ts becomes s; and the glottal stop ’ disappears entirely. For example, the plural form of po "s/he" is ayfo "they", with the p weakening into an f after the prefix ay-.

Lenition carries meaning on its own when the plural prefix is omitted, which is optional. In the above example, ayfo can be shortened to fo. Similarly, the plural of tsmukan "brother" can be smukan (from aysmukan).
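The lenition rules and the optional short plural can be expressed as a small lookup. This is a minimal sketch under the rules stated in this section only. The function names lenite and plural are hypothetical, and the sketch ignores any exceptions that the full grammar may have.

```python
# Ordered so that digraphs are tested before their single-letter prefixes.
LENITION_RULES = [
    ("px", "p"), ("tx", "t"), ("kx", "k"),  # ejectives become plain plosives
    ("ts", "s"),                            # affricate becomes a fricative
    ("p", "f"), ("t", "s"), ("k", "h"),     # plosives become fricatives
    ("’", ""), ("ʼ", ""),                   # glottal stop disappears
]

def lenite(word: str) -> str:
    """Apply lenition to the initial consonant of a word, if it has one."""
    for source, target in LENITION_RULES:
        if word.startswith(source):
            return target + word[len(source):]
    return word

def plural(noun: str, short: bool = False) -> str:
    """Form the plural with the leniting prefix ay+.

    The short form (prefix dropped) is only produced when lenition
    actually changes the noun, since otherwise nothing marks the plural.
    """
    lenited = lenite(noun)
    if short and lenited != noun:
        return lenited
    return "ay" + lenited

if __name__ == "__main__":
    print(plural("po"), plural("po", short=True))
    print(plural("tsmukan"), plural("tsmukan", short=True))
```

Testing the digraphs first matters: if the rule for p ran before the rule for px, an ejective would be lenited as though it were a plain plosive.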

Grammar
Naʼvi has free word order. For example, the English "I see you" (a common greeting in Naʼvi) can be written as follows in Naʼvi:


 * Oel ngati kameie
 * Ngati oel kameie
 * Oel kameie ngati

As sentences become more complex, some words, like adjectives and negatives, will have to stay in a more or less fixed position in the sentence, depending on what the adjective or negative is describing.


 * "Today is a good day"
 * Fìtrr lu sìltsana trr
 * Sìltsana trr fìtrr lu

In this case, the adjective sìltsan(a) "good" must stay next to the noun trr "day", which limits the number of possible word orders; as long as the adjective directly precedes or follows its noun, the sentence is fine. With the attributive a attached before the adjective rather than after it, the adjective can follow the noun:


 * Fìtrr lu trr asìltsan

Nouns
Nouns in Naʼvi show greater number distinctions than those in most human languages do: besides singular and plural, they not only have special dual forms for two of an item (eyes, hands, lovers, etc.), which are common in human language (English has a remnant in "both"), but also trial forms for three of an item, which on Earth are only found with pronouns. Gender is only occasionally (and optionally) marked.

The plural prefix is ay+, and the dual is me+. Both trigger lenition (indicated by the "+" signs rather than the hyphens that usually mark prefix boundaries). In nouns which undergo lenition, the plural prefix may be dropped, so the plural of tokx "body" is either aysokx or just sokx.

Masculine and feminine nouns may be distinguished by suffix. There are no articles (words for "a" or "the").

Nouns are declined for case in a tripartite system, which is rare among human languages. In a tripartite system, there are distinct forms for the object of a clause (the ball in "he kicks the ball"); the agent of a transitive clause which has such an object (he in "he kicks the ball"); and the subject of an intransitive clause, which does not have an object (he in "he runs"). An object is marked with the accusative suffix -ti, and an agent with the ergative suffix -l, while an intransitive subject has no case suffix. The use of such case forms leaves the word order of Naʼvi largely free.

There are two other cases—genitive in -yä, dative in -ru—as well as a topic marker -ri. The latter is used to introduce the topic of the clause, and is somewhat equivalent to Japanese wa and the much less common English "as for". It preempts the case of the noun: that is, when a noun is made topical, usually at the beginning of the clause, it takes the -ri suffix rather than the case suffix one would expect from its grammatical role. For example, in,

Oe-ri ontu teya l⟨äng⟩u

I-TOP nose full be⟨PEJ⟩

"My nose is full (of his distasteful smell)", lit. "As for me, (my) nose is full"
since the topic is "I", the subject "nose" is associated with "me": that is, it's understood to be "my nose". "Nose" itself is unmarked for case, as it's the subject of the intransitive verb "to be". However, in most cases the genitive marker -yä is used for this purpose.
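The case and topic suffixes described above can be summarized as a lookup table. The sketch below is a simplification that assumes plain concatenation onto a vowel-final stem. The role names and the function mark are hypothetical labels chosen for illustration, and the sketch does not model whatever suffix variants the full grammar uses for other stems.

```python
# Suffixes as given in the text; an intransitive subject is unmarked.
CASE_SUFFIXES = {
    "intransitive_subject": "",
    "agent": "l",      # ergative
    "object": "ti",    # accusative
    "genitive": "yä",
    "dative": "ru",
    "topic": "ri",     # replaces the case suffix the noun would otherwise take
}

def mark(noun: str, role: str) -> str:
    """Attach the suffix for the given role by simple concatenation.

    Assumption: the stem ends in a vowel and is regular. Pronoun genitives
    such as ngeyä "your" and peyä "her/his" show that not every form is
    built this way.
    """
    return noun + CASE_SUFFIXES[role]

if __name__ == "__main__":
    # "I see you": agent + object + verb, in any order.
    print(mark("oe", "agent"), mark("nga", "object"), "kameie")
    # Topic-marked pronoun from the example sentence.
    print(mark("oe", "topic"), "ontu teya längu")
```

Since the role is carried by the suffix rather than by position, the marked words can be reordered without changing who does what to whom.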

Besides case, the role of a noun in a clause may be indicated with adpositions. Any adposition may occur either as a preposition before the noun, or as an enclitic after the noun, a greater degree of freedom than English allows. For example, "with you" may be either hu nga or ngahu. When used as enclitics, they are much like the numerous cases found in Hungarian and Finnish. When used as prepositions, more along the lines of what English does, certain of them trigger lenition. One of the leniting prepositions is mì "in", as in mì sokx "in the body". This may cause some ambiguity with short plurals: mì sokx could also be short for mì aysokx "in the bodies".

Naʼvi pronouns encode clusivity. That is, there are different words for "we" depending on whether the speaker is including his/her addressee or not. There are also special forms for "the two of us" (with or without the addressee), "the three of us", etc. They do not inflect for gender; although it is possible to distinguish "he" from "she", the distinction is optional.

The deferential forms of "I" and "you" are ohe and ngenga. Possessive forms include ngeyä "your" and peyä "her/his". "He" and "she" can optionally be differentiated as poan and poé.

The grammatical distinctions made by nouns are also made by pronouns.

Adjectives
Naʼvi adjectives are uninflected—that is, they do not agree with the noun they modify—and may occur either before or after the noun. They are marked by a syllable a, which is attached on the side closest to the noun. For example, "a long river" can be expressed either as,

ngim-a kilvan

long-ATTR river

"a long river"

or as,

kilvan a-ngim

river ATTR-long

"a long river"

The free word order holds for all attributives: Genitives (possessives) and relative clauses can also either precede or follow the noun they modify. The latter especially allows for great freedom of expression.

The attributive affix a- is only used when an adjective modifies a noun. Predicative adjectives instead take the "be" verb lu:

kilvan ngim lu

river long be

"The river is long"
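The placement of the attributive a can be sketched as two string templates. This is a minimal illustration that assumes the adjective neither begins nor ends in a, since the text does not say what happens in that case. The function names attributive and predicative are hypothetical.

```python
def attributive(adjective: str, noun: str, adjective_first: bool = True) -> str:
    """Join an adjective to a noun, with a on the side closest to the noun."""
    if adjective_first:
        # Adjective precedes the noun: a is suffixed (ngim-a kilvan).
        return f"{adjective}a {noun}"
    # Adjective follows the noun: a is prefixed (kilvan a-ngim).
    return f"{noun} a{adjective}"

def predicative(adjective: str, noun: str) -> str:
    """Predicative use takes the verb lu and no attributive a."""
    return f"{noun} {adjective} lu"

if __name__ == "__main__":
    print(attributive("ngim", "kilvan"))
    print(attributive("ngim", "kilvan", adjective_first=False))
    print(predicative("ngim", "kilvan"))
```

The same templates cover the earlier example with sìltsan "good" and trr "day", in both orders.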

Verbs
Verbs are conjugated for tense and aspect, but not for person. That is, they record distinctions like "I am, I was, I would", but not like "I am, we are, s/he is". Conjugation relies exclusively on infixes, which are like suffixes but go inside the verb. "To hunt", for example, is taron, but "hunted" is t⟨ol⟩aron, with the infix ⟨ol⟩.

There are two positions for infixes: after the onset (optional consonant(s)) of the penultimate syllable, and after the onset of the final syllable. Because many Naʼvi verbs have two syllables, these commonly occur on the first and last syllable. In monosyllabic words like lu "be", they both appear after the initial onset, keeping their relative order.

The first infix position is taken by infixes for tense, aspect, mood, or combinations thereof; also appearing in this position are participle, reflexive, and causative forms, the latter two of which may co-occur with a tense/aspect/mood infix by preceding it. Tenses are past, recent past, present (unmarked), future, and immediate future; aspects are perfective (completed or contained) and imperfective (ongoing or uncontained). The aspectual forms are not found in English but are somewhat like the distinction between 'having done' and 'was doing'.


 * taron [hunt] "hunts"
 * t⟨ìm⟩aron [hunt⟨REC⟩] "just hunted"
 * t⟨ay⟩aron [hunt⟨FUT⟩] "will hunt"
 * t⟨er⟩aron [hunt⟨IPFV⟩] "hunting"
 * t⟨ol⟩aron [hunt⟨PFV⟩] "hunted"
 * t⟨ì⟨r⟩m⟩aron [hunt⟨REC⟨IPFV⟩⟩] "was just hunting"

Tense and aspect need not be marked when they can be understood by context or elsewhere in the sentence.

The second infix position is taken by infixes for affect (speaker attitude, whether positive or negative) and for evidentiality (uncertainty or indirect knowledge). For example, in the greeting given under Grammar, Oel ngati kameie "I See you", the verb kame "to See" is inflected positively as kam⟨ei⟩e to indicate the pleasure the speaker has in meeting you. In the sentence Oeri ontu teya längu "My nose is full (of his smell)" from the section on nouns, however, the phrase teya lu "is full" is inflected pejoratively as teya l⟨äng⟩u to indicate the speaker's distaste at the experience. Examples with both infix positions filled:


 * t⟨ìrm⟩ar⟨ei⟩on [hunt⟨REC.IPFV⟩⟨LAUD⟩] "was just hunting": The speaker is happy about it, whether due to success or just the pleasure of the hunt
 * t⟨ay⟩ar⟨äng⟩on [hunt⟨FUT⟩⟨PEJ⟩] "will hunt": The speaker is anxious about or bored by it

Lexicon
The Naʼvi language currently has over 2,600 words. These include a few English loan words such as kunsìp "gunship". The complete dictionary, including the odd inflectional form, is available online at http://dict-navi.com. Additionally, the community of speakers is working with Dr. Frommer to further develop the language. Naʼvi is a very modular language and the total number of usable words far exceeds the 2,600 dictionary words. For example: rol "to sing" → tìrusol "the act of singing" or ngop "to create" → ngopyu "creator". Workarounds using existing words also abound in the Naʼvi corpus, such as eltu lefngap "metallic brain" for "computer" and palulukantsyìp "little thanator" for "cat".