Immediate constituent analysis

In linguistics, immediate constituent analysis or IC analysis is a method of sentence analysis that was proposed by Wilhelm Wundt and named by Leonard Bloomfield. The approach developed into a full-fledged strategy for analyzing sentence structure in the distributionalist works of Zellig Harris and Charles F. Hockett, and in the glossematics of Knud Togeby. The practice is now widespread. Most tree structures employed to represent the syntactic structure of sentences are products of some form of IC-analysis. The process and result of IC-analysis can, however, vary greatly depending on whether one chooses the constituency relation of phrase structure grammars (= constituency grammars) or the dependency relation of dependency grammars as the underlying principle that organizes constituents into hierarchical structures.

IC-analysis in phrase structure grammars
Given a phrase structure grammar (= constituency grammar), IC-analysis divides up a sentence into major parts or immediate constituents, and these constituents are in turn divided into further immediate constituents. The process continues until irreducible constituents are reached, i.e., until each constituent consists of only a word or a meaningful part of a word. The end result of IC-analysis is often presented in a visual diagrammatic form that reveals the hierarchical immediate constituent structure of the sentence at hand. These diagrams are usually trees. For example:


[Figure: E-ICA-01.jpg, constituency tree of the example sentence]

This tree illustrates the manner in which the entire sentence is divided first into the two immediate constituents "this tree" and "illustrates IC-analysis according to the constituency relation". These two constituents are then divided further: the first into the immediate constituents "this" and "tree", and the second into "illustrates IC-analysis" and "according to the constituency relation"; and so on.

An important aspect of IC-analysis in phrase structure grammars is that each individual word is a constituent by definition. The process of IC-analysis always ends when the smallest constituents are reached, which are often words (although the analysis can also be extended into words to capture their internal structure). The process is different in dependency grammars, where many individual words do not end up as constituents.
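The top-down division described above can be sketched in a few lines of Python. This is a minimal illustration, not a parser: the tree is written by hand as nested tuples, the names `TREE`, `words` and `ic_analysis` are hypothetical, and only the first two levels of bracketing come from the description in the text. The deeper bracketing is an assumption made for the sake of the example.

```python
# A constituency tree as nested tuples; a bare string is a single word.
# The first two levels follow the description in the text. The deeper
# bracketing is an assumption made for illustration only.
TREE = (
    ("This", "tree"),
    (
        ("illustrates", "IC-analysis"),
        ("according", ("to", ("the", ("constituency", "relation")))),
    ),
)


def words(node):
    """Return the words covered by a node, in left-to-right order."""
    if isinstance(node, str):
        return [node]
    return [w for child in node for w in words(child)]


def ic_analysis(node, depth=0):
    """Yield (depth, text) for every constituent, dividing top-down."""
    yield depth, " ".join(words(node))
    if not isinstance(node, str):
        # Each child of a node is one of its immediate constituents.
        for child in node:
            yield from ic_analysis(child, depth + 1)


if __name__ == "__main__":
    for depth, text in ic_analysis(TREE):
        print("  " * depth + text)
```

Because every leaf of the nested structure is a word, every word is listed as a constituent of its own, which is the property of phrase structure analyses noted above.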

IC-analysis in dependency grammars
As a rule, dependency grammars do not employ IC-analysis, since their principle of syntactic ordering is not inclusion but rather asymmetrical dominance-dependency between words. When an attempt is made to incorporate IC-analysis into a dependency-type grammar, the result is a kind of hybrid system, and the analysis differs from the one a phrase structure grammar produces. Since dependency grammars view the finite verb as the root of all sentence structure, they cannot and do not acknowledge the initial binary subject-predicate division of the clause associated with phrase structure grammars. For the general understanding of constituent structure, this means that dependency grammars do not acknowledge a finite verb phrase (VP) constituent, and that many individual words do not qualify as constituents and so do not show up as constituents in the IC-analysis. Thus in the example sentence "This tree illustrates IC-analysis according to the dependency relation", many of the phrase structure grammar constituents do not qualify as dependency grammar constituents:


[Figure: E-ICA-02.jpg, dependency tree of the example sentence]

This IC-analysis views neither the finite verb phrase "illustrates IC-analysis according to the dependency relation" nor the individual words "tree", "illustrates", "according", "to", and "relation" as constituents.

While the structures that IC-analysis identifies for dependency and constituency grammars differ in significant ways, as the two trees just produced illustrate, both views of sentence structure acknowledge constituents. The constituent is defined in a theory-neutral manner:


 * Constituent: A given word/node plus all the words/nodes that that word/node dominates

This definition is neutral with respect to the dependency vs. constituency distinction. It allows one to compare the IC-analyses across the two types of structure. A constituent is always a complete tree or a complete subtree of a tree, regardless of whether the tree at hand is a constituency or a dependency tree.
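The definition can be applied mechanically to a dependency tree, as the following sketch shows for the dependency example sentence. The head-to-dependent attachments in `DEPENDENTS` are an assumption chosen to be consistent with the constituents the text reports, and the names `SENTENCE`, `DEPENDENTS`, `dominated`, `constituent` and `constituents` are hypothetical.

```python
# Words are identified by position so that each constituent can be put
# back into sentence order.
SENTENCE = (
    "This tree illustrates IC-analysis according to the dependency relation"
).split()

# Head position -> positions of its dependents. These attachments are an
# assumption for illustration; the finite verb (position 2) is the root.
DEPENDENTS = {
    2: [1, 3, 4],  # illustrates -> tree, IC-analysis, according
    1: [0],        # tree -> This
    4: [5],        # according -> to
    5: [8],        # to -> relation
    8: [6, 7],     # relation -> the, dependency
}


def dominated(node):
    """Return the node plus every node it dominates."""
    nodes = [node]
    for child in DEPENDENTS.get(node, []):
        nodes.extend(dominated(child))
    return nodes


def constituent(node):
    """Return the constituent rooted in the given word, in sentence order."""
    return " ".join(SENTENCE[i] for i in sorted(dominated(node)))


def constituents():
    """Return one constituent per word of the sentence."""
    return [constituent(i) for i in range(len(SENTENCE))]


if __name__ == "__main__":
    for text in constituents():
        print(text)
```

Under these assumed attachments, a word that has dependents never forms a constituent by itself, because its constituent always includes those dependents. Only words without dependents are one-word constituents.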

Constituency tests
The IC-analysis for a given sentence is usually arrived at by way of constituency tests. Constituency tests (e.g. topicalization, clefting, pseudoclefting, pro-form substitution, answer ellipsis, passivization, omission, coordination, etc.) identify the constituents, large and small, of English sentences. Two illustrations follow of the manner in which constituency tests deliver clues about constituent structure and thus about the correct IC-analysis of a given sentence. Consider the phrase "The girl" in the following trees:


[Figure: Thegirlishappy.png, trees of the sentence "The girl is happy"]

The acronym BPS stands for "bare phrase structure", which indicates that the words themselves are used as the node labels in the tree. Focusing again on the phrase "The girl", the tests that apply unanimously confirm that it is a constituent, as both trees show:


 * ...the girl is happy - Topicalization (invalid test because test constituent is already at front of sentence)
 * It is the girl who is happy. - Clefting
 * (The one) Who is happy is the girl. - Pseudoclefting
 * She is happy. - Pro-form substitution
 * Who is happy? -The girl. - Answer ellipsis

Based on these results, one can safely assume that the noun phrase "The girl" in the example sentence is a constituent and should therefore be shown as one in the corresponding IC-representation, which it is in both trees. Consider next what these tests tell us about the verb string "is happy":


 * *...is happy, the girl. - Topicalization
 * *It is is happy that the girl. - Clefting
 * *What the girl is is happy. - Pseudoclefting
 * *The girl so/that/did that. - Pro-form substitution
 * What is the girl? -*Is happy. - Answer ellipsis

The star * indicates that the sentence is not acceptable in English. Based on data like these, one might conclude that the finite verb string "is happy" in the example sentence is not a constituent and should therefore not be shown as a constituent in the corresponding IC-representation. Hence this result supports the IC-analysis in the dependency tree over the one in the constituency tree, since the dependency tree does not view "is happy" as a constituent.
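The comparison between the two analyses can be made explicit with a short sketch that applies the theory-neutral definition of the constituent to both kinds of tree and sets the result against the test outcomes listed above. The tree shapes and the phrasal labels `S`, `NP` and `VP` are assumptions, since the figure is not reproduced here, and all names in the code are hypothetical. The `TESTS` table only records the judgments given in the text; it does not compute them.

```python
WORDS = ["The", "girl", "is", "happy"]

# Each tree maps a node to its children. Integer nodes are word positions;
# string nodes are phrasal labels that carry no word of their own.
# Both shapes are assumptions made for illustration.
CONSTITUENCY = {"S": ["NP", "VP"], "NP": [0, 1], "VP": [2, 3]}
DEPENDENCY = {2: [1, 3], 1: [0]}  # is -> girl, happy; girl -> The


def dominated_words(tree, node):
    """Word positions carried by a node and by everything it dominates."""
    positions = {node} if isinstance(node, int) else set()
    for child in tree.get(node, []):
        positions |= dominated_words(tree, child)
    return positions


def constituents(tree):
    """Return the set of word strings that form complete subtrees."""
    nodes = set(tree) | {c for kids in tree.values() for c in kids}
    return {
        " ".join(WORDS[i] for i in sorted(dominated_words(tree, n)))
        for n in nodes
    }


# Judgments as reported in the text: True = acceptable, False = starred,
# None = the test does not apply.
TESTS = {
    "The girl": {
        "topicalization": None,
        "clefting": True,
        "pseudoclefting": True,
        "pro-form substitution": True,
        "answer ellipsis": True,
    },
    "is happy": {
        "topicalization": False,
        "clefting": False,
        "pseudoclefting": False,
        "pro-form substitution": False,
        "answer ellipsis": False,
    },
}


def passes(string):
    """True if every applicable test accepts the string as a constituent."""
    applicable = [r for r in TESTS[string].values() if r is not None]
    return all(applicable)


if __name__ == "__main__":
    for string in TESTS:
        print(
            string,
            "| tests:", passes(string),
            "| constituency:", string in constituents(CONSTITUENCY),
            "| dependency:", string in constituents(DEPENDENCY),
        )
```

With these assumed trees, both analyses treat "The girl" as a constituent, while only the constituency tree treats "is happy" as one, which is the mismatch with the test results that the text points out.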