Center embedding

In linguistics, center embedding is the process of embedding a phrase in the middle of another phrase of the same type. This often leads to parsing difficulty that is hard to explain on grammatical grounds alone. The most frequently used example involves embedding a relative clause inside another one, as in:


 * A man that a woman loves $$\Rightarrow$$
 * A man that a woman that a child knows loves $$\Rightarrow$$
 * A man that a woman that a child that a bird saw knows loves $$\Rightarrow$$
 * A man that a woman that a child that a bird that I heard saw knows loves

In theories of natural language parsing, the difficulty with multiple center embedding is thought to arise from the limits of human short-term memory. To process multiple center embeddings, a reader or listener has to hold several subjects in memory until each one can be connected to its predicate.
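The memory load described here can be illustrated with a toy stack. The sketch below is only an illustration and not a model proposed by any of the cited authors. The function name `pair_subjects_with_predicates` and the tags "N" (noun), "V" (verb) and "X" (other) are hypothetical. It assumes that each verb belongs to the most recently stored subject that is still waiting for a predicate.

```python
def pair_subjects_with_predicates(tokens):
    """Pair each verb with the most recent unmatched noun.

    tokens is a list of (word, tag) pairs; tag "N" marks a noun,
    "V" marks a verb, and any other tag is ignored.
    Returns (pairs, max_depth, waiting), where max_depth is the largest
    number of nouns held at once and waiting lists nouns left unmatched.
    """
    stack = []       # nouns still waiting for their predicate
    pairs = []       # (noun, verb) pairs in the order they are resolved
    max_depth = 0
    for word, tag in tokens:
        if tag == "N":
            stack.append(word)
            max_depth = max(max_depth, len(stack))
        elif tag == "V":
            if not stack:
                raise ValueError(f"verb {word!r} has no subject")
            pairs.append((stack.pop(), word))
    return pairs, max_depth, stack


# "A man that a woman that a child knows loves"
tokens = [("man", "N"), ("that", "X"), ("woman", "N"), ("that", "X"),
          ("child", "N"), ("knows", "V"), ("loves", "V")]
pairs, max_depth, waiting = pair_subjects_with_predicates(tokens)
print(pairs, max_depth, waiting)
```

In this sketch every additional level of embedding adds one more noun that must be held before any verb arrives, which is the kind of load the memory-based account appeals to.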

A theoretically notable point is that sentences with multiple center embedding are grammatical but unacceptable: they follow the rules of the grammar, yet speakers reject them as too hard to process. Such examples are behind Noam Chomsky's comment that, "Languages are not 'designed for parsability' … we may say that languages, as such, are not usable."

Some researchers, such as Peter Reich, proposed that although single center embedding is acceptable (as in "the man that boy kicked is a friend of mine"), double center embedding is not. The linguist Anne De Roeck and colleagues provided a counterexample: "Isn't it true that example-sentences that people that you know produce are more likely to be accepted?" (De Roeck et al., 1982).

The linguist Fred Karlsson provided empirical evidence in 2007 that the maximal degree of multiple center-embedding of clauses is exactly 3 in written language. He provided thirteen genuine examples of this type from various Indo-European languages (Danish, English, German, Latin, Swedish). No real examples of degree 4 have been recorded. In spoken language, multiple center-embeddings even of degree 2 are so rare as to be practically non-existent.

Center embedding is the focus of a science fiction novel, Ian Watson's The Embedding, and plays a part in Ted Chiang's Story of Your Life.

Background
Embedding on its own refers to all types of clauses occurring as subordinate parts of a superordinate clause. There are three types of sub-clauses: complement, relative, and adverbial. Subordinators or relative pronouns indicate which type of sub-clause is being used. In center embedding (abbreviated "C" or "c"), words of the superordinate clause appear both to the left and to the right of the sub-clause. Multiple center embedding of the same type of clause is called self-embedding.

English has a fixed number of words and grammatical rules, yet it allows an unlimited number of sentences. This is possible because of the rules of recursion and iteration. Recursion is what produces center embedding: one sentence is embedded within another. Linguists say that center embedding could go on forever and still be grammatically correct. The reader, however, would become confused trying to keep track of who did what and when, because working memory cannot store and track all the information. Given enough time, paper, and a pencil, the reader could work through the sentence until it made sense.
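The recursive rule can be sketched as a short function that rebuilds the "a man that a woman loves" series from the introduction. This is a minimal illustration, not a grammar of English. The name `relative_np` and its list arguments are hypothetical, and the sketch assumes that each noun after the first is the subject of a relative clause modifying the noun before it.

```python
def relative_np(nouns, verbs):
    """Build a center-embedded noun phrase by recursion.

    nouns[0] is the head noun. Each later noun is the subject of a
    relative clause on the noun before it, and verbs[i] is the verb of
    the clause attached to nouns[i]. Requires len(verbs) == len(nouns) - 1.
    """
    if not nouns or len(verbs) != len(nouns) - 1:
        raise ValueError("need at least one noun and exactly one verb fewer")
    if len(nouns) == 1:
        return nouns[0]  # base case: a bare noun phrase, nothing embedded
    # Recursive case: the relative clause's subject is itself a noun
    # phrase that may contain a further relative clause.
    inner = relative_np(nouns[1:], verbs[1:])
    return f"{nouns[0]} that {inner} {verbs[0]}"


nouns = ["a man", "a woman", "a child", "a bird"]
verbs = ["loves", "knows", "saw"]
for depth in range(1, len(nouns) + 1):
    print(relative_np(nouns[:depth], verbs[:depth - 1]))
```

Nothing in the function limits how long the lists may be, which mirrors the observation that the grammar itself places no bound on the depth of embedding.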

Japanese
Japanese allows a singly nested clause, but an additional nesting makes a sentence unprocessable. Example from, section 13.4.

兄が 妹を いじめた

older.brother-NOM younger.sister-ACC bullied

"My older brother bullied my younger sister."

ベビーシッターは 兄が 妹を いじめた と 言った

babysitter-TOP older.brother-NOM younger.sister-ACC bullied that said

"The babysitter said that my older brother bullied my younger sister."

The following sentence is unprocessable:

おばさんは ベビーシッターが 兄が 妹を いじめた と 言った と 思っている

aunt-TOP babysitter-NOM older.brother-NOM younger.sister-ACC bullied that said that thinks

"My aunt thinks that the babysitter said that my older brother bullied my younger sister."

Effective and ineffective embedding
Embedding is effective when two clauses share a common category, because the shared element lets one clause be folded into the other to expand the sentence. It is not effective when optional categories are used to create extensive embedding in a single sentence.

Example of effective embedding
"My brother opened the window. The maid had closed it." The common category is the window, so the two sentences can be combined into "My brother opened the window the maid had closed."

Example of ineffective embedding

 * My brother opened the window the maid the janitor Uncle Bill had hired had married had closed.

There is no common category for this sentence, so it should be broken up into several sentences to make sense to the reader:


 * My brother opened the window the maid had closed. She was the one who had married the janitor Uncle Bill had hired.

A center-embedded sentence is difficult to comprehend when a relative clause is embedded in another relative clause. Comprehension becomes easier when the types of clause are different, that is, when a complement clause is embedded in a relative clause or a relative clause is embedded in a complement clause. For example: "The man who heard that the dog had been killed on the radio ran away."

Whether a sentence is center-embedded or edge-embedded can be seen from where the brackets are located in the sentence.


 * 1) [Joe believes [Mary thinks [John is handsome.]]]
 * 2) The cat [that the dog [that the man hit] chased] meowed.

In sentence (1), all of the closing brackets fall at the right edge, so this sentence is right-embedded. In sentence (2), the brackets fall inside the sentence, with words of the enclosing clause on both sides of each embedded clause, so this sentence is center-embedded.
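The bracket test can be sketched in code. The example below is an illustration only. The function names `parse_brackets` and `classify_embeddings` and the labels "center", "right", "left" and "none" are hypothetical. It assumes that a bracketed clause counts as center-embedded when its enclosing clause has words on both sides of it, and as edge-embedded when words appear on one side only.

```python
def parse_brackets(text):
    """Turn a bracketed sentence into nested lists of words."""
    root = []
    stack = [root]
    for tok in text.replace("[", " [ ").replace("]", " ] ").split():
        if tok == "[":
            child = []
            stack[-1].append(child)
            stack.append(child)
        elif tok == "]":
            if len(stack) == 1:
                raise ValueError("unbalanced ']'")
            stack.pop()
        else:
            stack[-1].append(tok)
    if len(stack) != 1:
        raise ValueError("unbalanced '['")
    return root


def classify_embeddings(node):
    """Label each bracketed clause, outermost first.

    "center": the enclosing clause has words on both sides
    "right" : words on the left only (clause sits at the right edge)
    "left"  : words on the right only
    "none"  : the brackets span the whole enclosing clause
    """
    labels = []
    for i, item in enumerate(node):
        if isinstance(item, list):
            left = any(isinstance(x, str) for x in node[:i])
            right = any(isinstance(x, str) for x in node[i + 1:])
            if left and right:
                labels.append("center")
            elif left:
                labels.append("right")
            elif right:
                labels.append("left")
            else:
                labels.append("none")
            labels.extend(classify_embeddings(item))
    return labels


s1 = "[Joe believes [Mary thinks [John is handsome.]]]"
s2 = "The cat [that the dog [that the man hit] chased] meowed."
print(classify_embeddings(parse_brackets(s1)))  # ['none', 'right', 'right']
print(classify_embeddings(parse_brackets(s2)))  # ['center', 'center']
```

The outermost brackets of sentence (1) enclose the whole sentence, so they are labelled "none". The two clauses inside them are both right-embedded, while both bracketed clauses in sentence (2) are center-embedded.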