Object–subject word order

In linguistic typology, object–subject (OS) word order, also called O-before-S or patient–agent word order, is a word order in which the object appears before the subject. OS is notable for its statistical rarity as a default or predominant word order among natural languages. Languages with predominant OS word order display properties that distinguish them from languages with subject–object (SO) word order.

The three OS word orders are VOS, OVS, and OSV. Collectively, these three orders account for only around 2.9% of the world's languages. SO word orders (SOV, SVO, VSO) are far more common, accounting for approximately 83.3% of the world's languages (the remaining 13.7% have free word order).

Despite their low relative frequency, languages that use OS order by default can be found across a wide variety of families, including Nilotic, Austronesian, Mayan, Oto-Manguean, Chumashan, Arawakan, Cariban, Tupi–Guarani, Jê, Nadahup, and Chonan.

Examples
Tzeltal (VOS)

la y-il te'tikil mut ta hamal te Ziak-e

PST 3.SG-see wild chicken in forest ART Ziak-ART

'Ziak saw a wild bird in the forest.'

Selk'nam (OVS) 

Kųlįųt matị-n nị-y Kịyųk

European kill-CERT.MASC PRES-MASC Keyuk

'Keyuk kills the white man.'

Xavante (OSV) 

Toptö wahi mate ti-tsa

Toptö snake it her-bite

'A snake bit Toptö.'

Ditransitive constructions
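According to Maria Polinsky (1995), the default order of the subject and the object relative to each other in a given language determines other features of that language's syntax. In particular, languages with default SO order construct ditransitive clauses differently than languages with OS order: as the examples in this section illustrate, SO languages place the causee, recipient, or benefactive before the patient, whereas OS languages place the patient first.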

The patient can also be referred to as the direct object (DO) and the causee/recipient/benefactive as the indirect object (IO).

To demonstrate this principle, Polinsky provides an example sentence from Malagasy, which has V–DO–IO–S order:

h-amp-anasa an' i Jaona an' i Jeanne i Paoly

FUT-CAUS-wash ACC ART John ACC ART Jeanne ART Paul

'Paul will be having Jeanne wash John.'

In the Malagasy original, John (the direct object) precedes Jeanne (the indirect object), whereas in the English translation Jeanne precedes John. English, unlike Malagasy, has S–V–IO–DO order.

Another example of this phenomenon, from Päri (DO–V–S–IO order):

diɛl-Ø áñúth'ì pònd'-ɛ rwʌth-Ø

goat-ABS showed boy-ERG chief-ABS

'The boy showed the chief the goat.'

Note that Polinsky's principle does not state anything about the order of the indirect object and the subject relative to each other, hence the difference between Malagasy (IO–S) and Päri (S–IO) in this regard.
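The pattern in these examples can be checked mechanically. The following is a minimal sketch, not Polinsky's own formalization: it assumes the principle can be read as "the direct object precedes the indirect object exactly when the (direct) object precedes the subject", and it uses only the three constituent orders quoted above. The function names `slots`, `precedes`, and `consistent_with_principle` are hypothetical, chosen for illustration.

```python
def slots(order):
    """Split a label such as 'V–DO–IO–S' into its constituent slots."""
    # Accept either an en dash or a plain hyphen as the separator.
    return order.replace("–", "-").split("-")


def precedes(order, first, second):
    """Return True if `first` comes before `second` in the order label."""
    parts = slots(order)
    return parts.index(first) < parts.index(second)


def consistent_with_principle(order):
    """Assumed reading of the principle: DO-before-IO iff DO-before-S."""
    return precedes(order, "DO", "S") == precedes(order, "DO", "IO")


# Constituent orders as given in the text above.
EXAMPLES = {
    "Malagasy": "V–DO–IO–S",
    "Päri": "DO–V–S–IO",
    "English": "S–V–IO–DO",
}

for language, order in EXAMPLES.items():
    print(language, order, consistent_with_principle(order))
```

Under this reading all three orders come out as consistent. The check deliberately ignores the position of the indirect object relative to the subject, which is why Malagasy (IO–S) and Päri (S–IO) both pass.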

Correlation with ergativity
Anna Siewierska (1996) suggests that ergative–absolutive alignment is overrepresented in OS languages relative to SO languages. To test this, she measured the occurrence of ergative alignment in two samples (SO vs. OS languages) across three categories: agreement, pronouns, and nouns (since some languages have different alignment systems in different categories). She then calculated the frequency of ergativity in each category relative to the sample.

Notably, full noun phrases in the OS sample (but not the SO sample) favor ergative alignment, with a majority (60%) of noun alignment in the OS sample being ergative. This is the only cell of the table with a higher than 50% frequency of ergativity. Even in the other categories, the OS sample consistently has a higher relative frequency of ergativity than the SO sample. However, Siewierska notes that her sample size of 12 OS languages is too small for a significance test.

Siewierska theorizes that, in these languages, ergativity may have arisen from the reanalysis of passive clauses as active transitive clauses. The relative order of the patient and the agent was preserved during this reanalysis, resulting in unmarked OS word order.

Object-initial word order
A notable subset of OS order is object-initial word order, in which the object appears first in the clause. This includes OVS and OSV, but not VOS (which is verb-initial, i.e. the verb appears first in the clause).
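The relationship between the OS, object-initial, and verb-initial categories can be sketched in a few lines of Python. This is only an illustration of the definitions given above; the helper name `classify` and the dictionary keys are hypothetical.

```python
from itertools import permutations


def classify(order):
    """Classify a three-letter word-order label such as 'VOS'."""
    order = order.upper()
    if sorted(order) != ["O", "S", "V"]:
        raise ValueError("expected a permutation of S, V and O")
    return {
        # OS order: the object appears anywhere before the subject.
        "object_before_subject": order.index("O") < order.index("S"),
        # Object-initial: the object is the first constituent.
        "object_initial": order[0] == "O",
        # Verb-initial: the verb is the first constituent.
        "verb_initial": order[0] == "V",
    }


ALL_ORDERS = ["".join(p) for p in permutations("SVO")]
OS_ORDERS = sorted(o for o in ALL_ORDERS if classify(o)["object_before_subject"])
OBJECT_INITIAL = sorted(o for o in ALL_ORDERS if classify(o)["object_initial"])

print(OS_ORDERS)       # the three OS orders
print(OBJECT_INITIAL)  # the object-initial subset
```

Enumerating all six permutations this way shows that every object-initial order is an OS order, while VOS is the only OS order that is verb-initial.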

In a 1979 study, Desmond C. Derbyshire and Geoffrey K. Pullum reported that predominant object-initial word order occurs only in the Amazonian language area. Amazonian languages with object-initial order include Hixkaryana, Urubu, Apurinã, Xavante, and Nadëb. Since Derbyshire and Pullum's study, however, languages with object-initial order have been found outside the Amazon. These include Mangarayi (Australia), Äiwoo (Melanesia), and Päri (Africa). The ancient Mesopotamian language Hurrian frequently uses object-initial order (OSV) in its attested writing, although it is debatable whether this can be considered the default order of the language.

Typologically, object-initial order can be analyzed as the presence of OS order within an OV structure, since the object comes before both the subject and the verb. OV structure is most commonly found in languages with SOV order, such as Japanese and Turkish. Features generally associated with OV structure include the use of postpositions rather than prepositions (i.e. constructions like "the house in" rather than "in the house" as in English), and placement of the genitive before the noun rather than after, among other features. As for object-initial languages, Edward L. Keenan III (1978) notes that features associated with OV order are present in Hixkaryana, for example.

Matthew Dryer (1997) proposes that in the case of ergative–absolutive languages that display this pattern (such as Mangarayi and Päri), the first constituent in the order is technically not the object but the absolutive, since the conventional notions of "subject" and "object" (best suited to a nominative–accusative paradigm) do not exactly apply. Thus, it may be more appropriate to describe these particular languages as absolutive-initial rather than object-initial.

Statistics
The scarcity of OS as a default word order has been observed since at least 1963, when Joseph Greenberg proposed the tendency of subjects to precede objects as his first universal. Other linguists of the 20th century, such as Theo Vennemann in 1973, even stated that true OS languages were not attested at all.

In 2013, Dryer surveyed 1377 languages to determine which word orders are more frequently predominant than others. His findings are summarized in the table below:

On the basis of the SO / OS dichotomy, this table can be divided into two halves: the three SO orders (SOV, SVO, VSO) constitute the more common half of the table, while the three OS orders (VOS, OVS, OSV) constitute the less common half. The two object-initial orders (OVS and OSV) are the rarest of all. Even the least common SO order, VSO, is still significantly more common (6.9%) than all three of the OS orders combined (2.9%).

Hammarström (2016) surveyed the word orders of 5252 languages in two ways: counting the languages directly, and stratifying them by language families. Both of these methods yielded a ranking of the word orders identical to Dryer's ranking, albeit with different percentages:

The data shows a preference for SO order over OS order among the languages of the world. Linguists have proposed several explanations for this phenomenon.

Keenan: Relevance Principle
Keenan (1978) postulates a Relevance Principle that motivates placing the subject before the object.


 * The Relevance Principle: The reference of the subject (phrase) determines in part, the relevance of what is said, regardless of what it is, to the addressee.

In a typical sentence, the subject is the same as the topic, i.e. the thing that is being talked about. Thus, in a language that puts the subject first, a listener can immediately determine whether or not the speaker's utterance is relevant to them personally (or relevant to what has already been said). Conversely, a language that postpones the subject will require the listener to process a larger portion of the utterance in order to determine how relevant it is.

Take the following example sentence:


 * John left the meeting early.

If John is a political candidate the listener is supporting, this sentence is much more relevant to the listener than if John is just a man who is setting up the tables. Thus, the relevance of "left the meeting early" to the listener is dependent upon the relevance of "John".

Derbyshire and Pullum: survivorship bias
In their 1979 study, Derbyshire and Pullum suggest that the scarcity of OS languages (and specifically object-initial languages) relative to SO languages may simply be the result of survivorship bias rather than any underlying structural motivation. They argue that the global prevalence of SO order, and SVO in particular, has been amplified by the colonial expansion of the English, French, Spanish, Portuguese, and Dutch empires (whose languages are all SVO) and the resultant mass language extinction events on the continents they colonized. Thus, an unknown number of OS languages may have been driven to extinction by colonizing SO languages.

Tily et al.: cognitive bias
Tily et al. (2011) performed an experiment with a sample of 285 native English speakers. In the experiment, the participants were taught simple phrases in constructed languages of all six possible word orders, then performed trials in these languages to demonstrate how successfully they had acquired the syntax. The researchers calculated how many of the trials were performed correctly for each word order:

The researchers note that these results may be biased by the fact that all the participants are native speakers of English, which has SVO word order. This would explain why the two orders with the poorest performances (OSV and VOS) are those in which none of the three constituents are in the same position as in SVO. However, this English bias does not explain the high accuracy score of SOV in particular, and VSO to a lesser extent, relative to all the OS orders. The researchers hypothesize that there may be a universal cognitive bias in favor of placing agents before patients, but note that this hypothesis has yet to be tested with participants whose native language is not English.