Minimalist program

In linguistics, the minimalist program is a major line of inquiry that has been developing inside generative grammar since the early 1990s, starting with a 1993 paper by Noam Chomsky.

Following Imre Lakatos's distinction, Chomsky presents minimalism as a program, understood as a mode of inquiry that provides a conceptual framework which guides the development of linguistic theory. As such, it is characterized by a broad and diverse range of research directions. For Chomsky, there are two basic minimalist questions—What is language? and Why does it have the properties it has?—but the answers to these two questions can be framed in any theory.

Goals and assumptions
Minimalism is an approach developed with the goal of understanding the nature of language. It models a speaker's knowledge of language as a computational system with one basic operation, namely Merge. Merge combines expressions taken from the lexicon in a successive fashion to generate representations that characterize I-Language, understood to be the internalized intensional knowledge state as represented in individual speakers. By hypothesis, I-language—also called universal grammar—corresponds to the initial state of the human language faculty in individual human development.

Minimalism is reductive in that it aims to identify which aspects of human language, as well as the computational system that underlies it, are conceptually necessary. This is sometimes framed as questions relating to perfect design (Is the design of human language perfect?) and optimal computation (Is the computational system for human language optimal?). According to Chomsky, a human natural language is not optimal when judged based on how it functions, since it often contains ambiguities, garden paths, etc. However, it may be optimal for interaction with the systems that are internal to the mind.

Such questions are informed by a set of background assumptions, some of which date back to the earliest stages of generative grammar:


 * 1) Language is a form of cognition. There is a language faculty (FL) that interacts with other cognitive systems; this accounts for why humans acquire language.
 * 2) Language is a computational system. The language faculty consists of a computational system (CHL) whose initial state (S0) contains invariant principles and parameters.
 * 3) Language acquisition consists of acquiring a lexicon and fixing the parameter values of the target language.
 * 4) Language generates an infinite set of expressions given as a sound-meaning pair (π, λ).
 * 5) Syntactic computation interfaces with phonology: π corresponds to phonetic form (PF), the interface with the articulatory-perceptual (A-P) performance system, which includes articulatory speech production and acoustic speech perception.
 * 6) Syntactic computation interfaces with semantics: λ corresponds to logical form (LF), the interface with the conceptual-intentional (C-I) performance system, which includes conceptual structure and intentionality.
 * 7) Syntactic computations are fully interpreted at the relevant interface: (π, λ) are interpreted at the PF and LF interfaces as instructions to the A-P and C-I performance systems.
 * 8) Some aspects of language are invariant. In particular, the computational system (i.e. syntax) and LF are invariant.
 * 9) Some aspects of language show variation. In particular, variation reduces to Saussurean arbitrariness, parameters and the mapping to PF.
 * 10) The theory of grammar meets the criterion of conceptual necessity; this is the Strong Minimalist Thesis introduced in Chomsky (2001). Consequently, language is an optimal association of sound with meaning; the language faculty satisfies only the interface conditions imposed by the A-P and C-I performance systems; PF and LF are the only linguistic levels.

Strong minimalist thesis
Minimalism develops the idea that human language ability is optimal in its design and exquisite in its organization, and that its inner workings conform to a very simple computation. On this view, universal grammar instantiates a perfect design in the sense that it contains only what is necessary. Minimalism further develops the notion of economy, which came to the fore in the early 1990s but was still peripheral to transformational grammar. Economy of derivation requires that movements (i.e., transformations) occur only if necessary, specifically to satisfy feature-checking, whereby an interpretable feature is matched with a corresponding uninterpretable feature. (See the discussion of feature-checking below.) Economy of representation requires that grammatical structures exist for a purpose: the structure of a sentence should be no larger or more complex than required to satisfy constraints on grammaticality.

Within minimalism, economy—recast in terms of the strong minimalist thesis (SMT)—has acquired increased importance. The 2016 book entitled Why Only Us—co-authored by Noam Chomsky and Robert Berwick—defines the strong minimalist thesis as follows:

"The optimal situation would be that UG reduces to the simplest computational principles which operate in accord with conditions of computational efficiency. This conjecture is ... called the Strong Minimalist Thesis (SMT)."

Under the strong minimalist thesis, language is a product of inherited traits as developmentally enhanced through intersubjective communication and social exposure to individual languages (amongst other things). This reduces to a minimum the "innate" component (the genetically inherited component) of the language faculty, which has been criticized over many decades and is separate from the developmental psychology component.

Intrinsic to the syntactic model (e.g. the Y/T-model) is the assumption that social and other factors play no role in the computation that takes place in narrow syntax, which Chomsky, Hauser and Fitch refer to as the faculty of language in the narrow sense (FLN), as distinct from the faculty of language in the broad sense (FLB). Thus, narrow syntax only concerns itself with interface requirements, also called legibility conditions. SMT can be restated as follows: syntax, narrowly defined, is a product of the requirements of the interfaces and nothing else. This is what is meant by "Language is an optimal solution to legibility conditions" (Chomsky 2001:96).

Interface requirements force deletion of features that are uninterpretable at a particular interface, a necessary consequence of Full Interpretation. A PF object must consist only of features that are interpretable at the articulatory-perceptual (A-P) interface; likewise, an LF object must consist only of features that are interpretable at the conceptual-intentional (C-I) interface. The presence of an uninterpretable feature at either interface will cause the derivation to crash.

Narrow syntax proceeds as a set of operations—Merge, Move and Agree—carried out upon a numeration (a selection of features, words etc., from the lexicon) with the sole aim of removing all uninterpretable features before being sent via Spell-Out to the A-P and C-I interfaces. The result of these operations is a hierarchical syntactic structure that captures the relationships between the component features.
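The convergence condition described above can be illustrated with a minimal Python sketch. It is not drawn from the literature: the Feature class, the converges function, and the feature names are hypothetical, and deletion of checked features is modelled simply as filtering them out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    name: str            # illustrative feature name, e.g. "tense" or "Case"
    interpretable: bool  # True if the relevant interface can read the feature


def converges(features):
    """Full Interpretation: every feature reaching the interface is interpretable."""
    return all(f.interpretable for f in features)


# Hypothetical LF object that still carries an uninterpretable feature.
lf_object = [Feature("tense", True), Feature("Case", False)]
print(converges(lf_object))  # False: the derivation crashes

# Assumption: checking has deleted the uninterpretable feature before Spell-Out.
checked = [f for f in lf_object if f.interpretable]
print(converges(checked))    # True: the derivation converges
```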

Technical innovations
The exploration of minimalist questions has led to several radical changes in the technical apparatus of transformational generative grammatical theory. Some of the most important are:
 * the elimination of the distinction between deep structure and surface structure in favour of a derivational approach
 * the elimination of X-bar theory in favour of bare phrase structure (see below)
 * the elimination of indexation in favour of Move or Agree
 * the elimination of the notion of government in favour of feature-checking
 * the idea that feature-checking—which matches interpretable and uninterpretable features, and subsequently deletes the latter—might be responsible for all structure-building operations, including Merge, Move, and Agree (see below)
 * the idea that syntactic derivations proceed by clearly delineated stages called "phases" (see below)
 * the specification that there are exactly two points where syntax interacts with other components: a "spell-out" point between syntax and the interface with phonetic form (PF), and an additional point of interaction with logical form (LF)

Basic operations
Early versions of minimalism posit two basic operations: Merge and Move. Earlier theories of grammar, as well as early minimalist analyses, treat phrasal and movement dependencies differently from current minimalist analyses. In the latter, Merge and Move are different outputs of a single operation. Merge of two syntactic objects (SOs) is called "external Merge". Move is defined as an instance of "internal Merge", and involves the re-merge of an already merged SO with another SO. How Move should be formulated continues to be actively debated, but the differences between current proposals are relatively minute.

More recent versions of minimalism recognize three operations: Merge (i.e. external Merge), Move (i.e. internal Merge), and Agree. The emergence of Agree as a basic operation is related to the mechanism which forces movement, which is mediated by feature-checking.

Merge
In its original formulation, Merge is a function that takes two objects (α and β) and merges them into an unordered set with a label, either α or β. In more recent treatments, the possibility of the derived syntactic object being un-labelled is also considered; this is called "simple Merge" (see Label section).

In the version of Merge which generates a label, the label identifies the properties of the phrase. Merge will always occur between two syntactic objects: a head and a non-head. For example, Merge can combine the two lexical items drink and water to generate drink water. In the Minimalist Program, the phrase is identified with a label. In the case of drink water, the label is drink since the phrase acts as a verb. This can be represented in a typical syntax tree, with the name of the derived syntactic object (SO) determined either by the lexical item (LI) itself, or by the category label of the LI.

Merge can operate on already-built structures; in other words, it is a recursive operation. If Merge were not recursive, then this would predict that only two-word utterances are grammatical. (This is relevant for child language acquisition, where children are observed to go through a so-called "two-word" stage. This is discussed below in the implications section.) If a new head (here γ) is merged with a previously formed syntactic object (a phrase, here {α, {α, β}}), the function has the form Merge(γ, {α, {α, β}}) → {γ, {γ, {α, {α, β}}}}. Here, γ is the head, so the output label of the derived syntactic object is γ.
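Labelled, recursive Merge can be sketched with nested sets. The following Python sketch rests on stated assumptions: lexical items are modelled as strings, derived objects as frozensets of the form {label, {head, non-head}}, and the item will is a hypothetical choice for the new head γ. The function names are illustrative, not standard.

```python
def label_of(so):
    """Return the label of a syntactic object.

    A lexical item (modelled as a string) labels itself; a derived object
    {label, {head, non_head}} carries its label as its only string member.
    """
    if isinstance(so, str):
        return so
    return next(member for member in so if isinstance(member, str))


def merge(head, non_head):
    """Labelled Merge: the head projects, giving {label, {head, non_head}}."""
    return frozenset({label_of(head), frozenset({head, non_head})})


# First application: Merge(drink, water) -> {drink, {drink, water}}
vp = merge("drink", "water")

# Merge is recursive: its output can be the input to a further application.
# Merge(will, {drink, {drink, water}}) -> {will, {will, {drink, {drink, water}}}}
tp = merge("will", vp)

print(label_of(vp))  # drink
print(label_of(tp))  # will
```

Because the inner pair is an unordered set, the sketch encodes hierarchy and labels but says nothing about linear order.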



Chomsky's earlier work defines each lexical item as a syntactic object that is associated with both categorial features and selectional features. Features (more precisely, formal features) participate in feature-checking, which takes as input two expressions that share the same feature, and checks them off against each other in a certain domain. In some but not all versions of minimalism, projection of selectional features proceeds via feature-checking, as required by locality of selection:

Selection as projection: In the bare phrase structure tree for the sentence The girl ate the food, a notable feature is the absence of distinct labels (see Label below). Relative to Merge, the selectional features of a lexical item determine how it participates in Merge:
 * eat: The lexical item eat is a transitive verb, and so assigns two theta-roles (Agent, Theme). Theta-roles can be represented as D-features on V, notated V_{D,D}, and these D-features force the verb to merge with two DPs. In the tree, the first application of Merge generates the Verb-Complement sequence (ate the food), with the DP the food in complement position. The second application of Merge generates the equivalent of a Specifier-VP sequence (the girl ate the food), with the DP the girl in specifier position.
 * PAST: The lexical item for "past tense" is represented as the feature PAST. Tense requires the presence of a DP subject and a verb; this is notated as T_{D,V} (or, more precisely, as T_{EPP:D.NOM,V}, where "EPP" stands for the "extended projection principle" feature and NOM stands for "nominative case"). Tense first merges with a V-projection, and the output then combines with the DP subject the girl, which, in some sense, merges twice: once within the V-projection, and once within the T-projection. (See discussion of Move below.)
 * C_∅: The lexical item for clause-typing is a phonologically null C_∅. By hypothesis, all sentences are clauses (CPs), so the root clause The girl ate the food is analyzed as CP. Given the assumption that all phrases are headed (endocentric), CP must be headed by C. C selects TP, notated as C_{T}.

Feature-checking: When a feature is "checked", it is removed.


 * Merge(V,DP) checks off one of the D features of V. We see this on the intermediate V projection, where the complement position is realized by the DP the food. This D-feature is then "checked" and we can see one of the D features is removed at the intermediate V-projection. Merge(V,DP) applies a second time, and the maximal V in the tree has no D features because at this stage of the derivation both D features have been "checked". Specifically, the D feature of the intermediate V-projection is "checked" by the DP the girl in the specifier position of V.
 * Merge(T,VP) checks off the V-feature of T; Merge(T,DP) checks off the D-feature of T.
 * Merge(C,TP) checks off the T-feature of C.
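The checking steps listed above can be traced in a minimal Python sketch. It assumes a simplified encoding that is not taken from the source: each syntactic object carries a tuple of unchecked selectional features that must be checked in order, and the class name SO and its fields are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SO:
    """A syntactic object: a category plus its unchecked selectional features."""
    category: str
    selects: tuple = ()   # features still to be checked, in checking order
    form: str = ""


def merge(head, dependent):
    """Merge head with dependent, checking off the head's next selectional feature."""
    if not head.selects or head.selects[0] != dependent.category:
        raise ValueError(f"{head.category} does not select {dependent.category} here")
    return SO(head.category, head.selects[1:], f"[{head.form} {dependent.form}]")


the_food = SO("D", form="the food")
the_girl = SO("D", form="the girl")

v_bar = merge(SO("V", ("D", "D"), "ate"), the_food)   # one D-feature left on V
vp = merge(v_bar, the_girl)                            # no D-features left on V
t_bar = merge(SO("T", ("V", "D"), "PAST"), vp)         # V-feature of T checked
tp = merge(t_bar, the_girl)                            # D-feature of T checked
cp = merge(SO("C", ("T",), "C"), tp)                   # T-feature of C checked

print(v_bar.selects, vp.selects, tp.selects, cp.selects)  # ('D',) () () ()
```

A Merge that does not satisfy the head's next selectional feature, such as merging C directly with the V-projection, raises an error in this sketch, which loosely mirrors the requirement that selection be satisfied locally.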

Locality of selection (LOS) is a principle that forces selectional features to participate in feature checking. LOS states that a selected element must combine with the head that selects it either as complement or specifier. Selection is local in the sense that there is a maximum distance that can occur between a head and what it selects: selection must be satisfied within the projection of the head.

Move
Move arises via "internal Merge".

Movement as feature-checking: The original formulation of the extended projection principle states that clauses must contain a subject in the specifier position of TP/IP. In the tree for The girl ate the food, there is an EPP feature. This is a strong feature which forces re-Merge (also called internal Merge) of the DP the girl. The EPP feature is a subscript to the T head, which indicates that T needs a subject in its specifier position. This causes the movement of the DP the girl to the specifier position of T.
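The difference between external and internal Merge can be made concrete with a small Python sketch. It assumes unlabelled set-formation Merge and models lexical items as strings; the helper names contains and merge_kind are hypothetical.

```python
def merge(a, b):
    """Simple Merge as unordered set formation."""
    return frozenset({a, b})


def contains(so, target):
    """True if target is so itself or occurs anywhere inside so."""
    if so == target:
        return True
    return isinstance(so, frozenset) and any(contains(m, target) for m in so)


def merge_kind(a, b):
    """Internal Merge re-merges an object already contained in the other one."""
    return "internal" if contains(a, b) or contains(b, a) else "external"


vp = merge("the girl", merge("ate", "the food"))
t_bar = merge("T", vp)

print(merge_kind("T", vp))            # external: T is new to the structure
print(merge_kind(t_bar, "the girl"))  # internal: the girl is already inside t_bar

tp = merge("the girl", t_bar)         # the subject now occurs twice in the structure
```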

Label
A substantial body of literature in the minimalist tradition focuses on how a phrase receives a proper label. The debate about labeling reflects the deeper aspiration of the minimalist program, which is to remove all redundant elements in favour of the simplest analysis possible. While earlier proposals focus on how to distinguish adjunction from substitution via labeling, more recent proposals attempt to eliminate labeling altogether, but they have not been universally accepted.

Adjunction and substitution: Chomsky's 1995 monograph entitled The Minimalist Program outlines two methods of forming structure: adjunction and substitution. The standard properties of segments, categories, adjuncts, and specifiers are easily constructed. In the general form of a structured tree for adjunction and substitution, α is an adjunct to X, and α is substituted into SPEC, X position. α can raise to aim for the Xmax position, and it builds a new position that can either be adjoined to [Y-X] or is SPEC, X, in which it is termed the 'target'. At the bottom of the tree, the minimal domain includes SPEC Y and Z along with a new position formed by the raising of α which is either contained within Z, or is Z.

Adjunction: Before the introduction of bare phrase structure, adjuncts did not alter information about bar-level, category information, or the target's (located in the adjoined structure) head. An example of adjunction using the X-bar theory notation is the sentence Luna bought the purse yesterday. Observe that the adverbial modifier yesterday is sister to VP and dominated by VP. Thus, the addition of the modifier does not change information about the bar-level: in this case the maximal projection VP. In the minimalist program, adjuncts are argued to exhibit a different, perhaps more simplified, structure. Chomsky (1995) proposes that adjunction forms a two-segment object/category consisting of: (i) the head of a label; (ii) a different label from the head of the label. The label L is not considered a term in the structure that is formed because it is not identical to the head S, but it is derived from it in an irrelevant way. If α adjoins to S, and S projects, then the structure that results is L = {<H(S), H(S)>, {α, S}}, where the entire structure is replaced with the head S, as well as what the structure contains. The head is what projects, so it can itself be the label or can determine the label irrelevantly.

In the new account developed in bare phrase structure, the properties of the head are no longer preserved in adjunction structures, as the attachment of an adjunct to a particular XP following adjunction is non-maximal. Such an account is applicable to XPs that are related to multiple adjunction.


 * Adjunction (definition): <H(S), H(S)>, where Label = {<H(S), H(S)>, {α, S}} (S = head)

Substitution forms a new category consisting of a head (H), which is the label, and an element being projected. Some ambiguities may arise if the features raising, in this case α, contain the entire head and the head is also XMAX.
 * Substitution (definition): Label = {H(S), {α, S}}.

Labeling algorithm (LA): Merge is a function that takes two objects (α and β) and merges them into an unordered set with a label (either α or β), where the label indicates the kind of phrase that is built via merge. But this labeling technique is too unrestricted since the input labels make incorrect predictions about which lexical categories can merge with each other. Consequently, a different mechanism is needed to generate the correct output label for each application of Merge in order to account for how lexical categories combine; this mechanism is referred to as the labeling algorithm (LA).


 * Labeling via selection and agreement. In a series of articles, Chomsky has proposed that labels are determined by a labeling algorithm which operates after syntactic structures have been built. This mechanism departs from previous versions of generative grammar in that the labels of a phrase are now determined endocentrically. A number of proposals have been put forward to explain the exact nature of the labeling algorithm. In earlier discussions, Chomsky hypothesizes that determining the label of a set-theoretic object (α, β) depends on either semantic selection or agreement holding between α and β. Although this formulation of the LA is consistent with the basic principles of X-bar theory, reference to external relations like semantic selection and agreement is at odds with the goal of developing a parsimonious account.
 * Labeling Algorithm (version 1): The output of Merge (α, β) is labeled by α if:
 * (a) α selects β as its semantic argument, or
 * (b) α agrees with β, meaning, β is led to α because of Spec-head agreement (feature checking) (Chomsky 1995b;2000)


 * Labeling via external and internal Merge. In this version of the LA, proposed by Chomsky in 2008, clause (a) means that the output of Merge(V, DP) would be labelled V because V is a lexical item. Clause (b) means that if a syntactic object is re-introduced into the derivation via internal Merge (as it is when a subject DP moves to Spec,TP), then the output of Merge(DP, T) would be labelled T. However, this version of the LA uses a disjunctive definition of labelling, with one clause for external Merge (clause a) and one for internal Merge (clause b).
 * Labeling algorithm (version 2): The output of Merge (α, β) is labeled by α if
 * (a) α is a lexical item (LI), or
 * (b) β is internally merged to α (Chomsky 2008: 145)
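Version 2 of the labeling algorithm is a two-clause disjunction, which a short Python sketch can spell out. The Item class, its fields, and the function name are hypothetical; whether β was internally merged is passed in as a flag rather than computed from the derivation, which is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    label: str
    is_lexical: bool   # True for a lexical item (LI), False for a phrase


def label_v2(alpha, beta, beta_internally_merged=False):
    """Return alpha's label if clause (a) or clause (b) holds, else None."""
    if alpha.is_lexical:            # clause (a): alpha is a lexical item
        return alpha.label
    if beta_internally_merged:      # clause (b): beta is internally merged to alpha
        return alpha.label
    return None


v = Item("V", is_lexical=True)
dp = Item("D", is_lexical=False)
t_bar = Item("T", is_lexical=False)

print(label_v2(v, dp))                                   # V (clause a)
print(label_v2(t_bar, dp, beta_internally_merged=True))  # T (clause b)
print(label_v2(t_bar, dp))                               # None (neither clause applies)
```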


 * Labeling via prominence. Chomsky (2013) suggests an even simpler LA based on the notion of "prominence". In this version of the LA, the label is independently determined by whichever properties of UG (universal grammar) allow us to identify a "prominent" lexical item. Due to this revision, it becomes questionable whether Merge plays any role at all in labelling/projection, since it is now redundant.
 * Labeling algorithm (version 3): The label/head of an SO (syntactic object) Σ is the most prominent Lexical Item within Σ. (Chomsky 2013)


 * Labeling via set formation: Further simplification is given in Chomsky (2000), where Merge is simplified to an elementary set-formation operation, meaning that syntactic objects (SOs) are no longer associated with non-terminal nodes, like projections. A label is now only the syntactically relevant "head" of a phrase, determined independently by the LA. With this theory, labeling leaves bare phrase structure (BPS) completely projection-free.


 * Labeling algorithm (version 4): Merge(α, β) = {α, β}.

Recently, the suitability of a labeling algorithm has been questioned, as syntacticians have identified a number of limitations associated with what Chomsky has proposed. It has been argued that two kinds of phrases pose a problem. The labeling algorithm proposes that labelling occurs via minimal search, a process where a single lexical item within a phrasal structure acts as a head and provides the label for the phrase. It has been noted that minimal search cannot account for the following two possibilities:


 * {H, H} where both constituents are lexical items.
 * {XP, YP} where neither constituent is a lexical item.

In each of these cases, there is no lexical item acting as a prominent element (i.e. a head). Given this, it is not possible through minimal search to extract a label for the phrase. While Chomsky has proposed solutions for these cases, it has been argued that the fact that such cases are problematic suggests that the labeling algorithm violates the tenets of the minimalist program, as it departs from conceptual necessity.
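The two problem cases can be reproduced in a minimal Python sketch of minimal search. It is a deliberately shallow model under stated assumptions: lexical items are strings, phrases are frozensets, and search inspects only the two immediate members of the set. The function name and the example items are hypothetical.

```python
def minimal_search_label(so):
    """Label a two-member set with its unique lexical item, if there is one.

    Lexical items are modelled as strings and phrases as frozensets.
    Returns None when minimal search finds no unique head.
    """
    heads = [member for member in so if isinstance(member, str)]
    return heads[0] if len(heads) == 1 else None


xp = frozenset({"the", frozenset({"tall", "girl"})})
yp = frozenset({"ate", frozenset({"the", "food"})})

print(minimal_search_label(yp))                       # ate: the {H, XP} case succeeds
print(minimal_search_label(frozenset({"H1", "H2"})))  # None: {H, H}, two candidate heads
print(minimal_search_label(frozenset({xp, yp})))      # None: {XP, YP}, no lexical item
```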

Other linguistic phenomena that create instances where Chomsky's labeling algorithm cannot assign labels include predicate fronting, embedded topicalization, scrambling (free movement of constituents), and stacked structures (which involve multiple specifiers).

Given these criticisms of Chomsky's labeling algorithm, it has been recently argued that the labeling algorithm theory should be eliminated altogether and replaced by another labeling mechanism. The symmetry principle has been identified as one such mechanism, as it provides an account of labeling that assigns the correct labels even when phrases are derived through complex linguistic phenomena.

Agree
Starting in the early 2000s, attention turned from feature-checking as a condition on movement to feature-checking as a condition on agreement. This line of inquiry was initiated in Chomsky (2000), and formulated as follows:


 * Agree: α can agree with β if and only if:
 * (a) α carries at least one unvalued and uninterpretable feature and β carries a matching interpretable and valued feature
 * (b) α c-commands β
 * (c) β is the closest goal to α
 * (d) β bears an unvalued uninterpretable feature (from Zeijlstra 2012)
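The four conditions on Agree can be read as a search procedure, sketched below in Python. Several simplifications are assumptions of the sketch rather than claims from the source: c-command is not computed but supplied as a list of the nodes the probe c-commands, ordered closest first; an inactive closest goal simply makes Agree fail; and the feature names and values are illustrative. The sketch values only the probe and does not model any valuation of the goal's own features.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    unvalued: set = field(default_factory=set)   # unvalued, uninterpretable features
    valued: dict = field(default_factory=dict)   # interpretable features and their values


def agree(probe, domain):
    """Try to value the probe's features from the closest matching goal.

    `domain` lists the nodes the probe c-commands, closest first (clause b).
    Returns the goal on success, or None if Agree fails.
    """
    for goal in domain:
        matching = {f for f in probe.unvalued if f in goal.valued}   # clause (a)
        if not matching:
            continue
        if not goal.unvalued:     # clause (d): the closest goal (c) must be active
            return None
        for f in matching:
            probe.valued[f] = goal.valued[f]
        probe.unvalued -= matching
        return goal
    return None


t = Node("T", unvalued={"phi"})
subject = Node("the girl", unvalued={"Case"}, valued={"phi": "3sg"})
obj = Node("the food", unvalued={"Case"}, valued={"phi": "3sg"})

goal = agree(t, [subject, obj])
print(goal.name, t.valued)  # the girl {'phi': '3sg'}
```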

Many recent analyses assume that Agree is a basic operation, on par with Merge and Move. This is currently a very active area of research, and there remain numerous open questions:


 * 1) Is Agree a primitive operation?
 * 2) What is the "direction" of the Agree relation: does it apply top-down, bottom-up, or both?
 * 3) Is Agree a syntactic operation, a post-syntactic operation that applies at PF, or both?
 * 4) Is the Agree relation restricted to certain feature types?
 * 5) Is the Agree relation subject to locality restrictions?
 * 6) Which phenomena are best modelled by the Agree relation?
 * 7) Is the Agree relation conditioned by other factors, or does it apply freely?
 * 8) How does Agree interact with other operations such as Merge and Label?

Co-indexation as feature checking: co-indexation markers such as {k, m, o, etc.}

Derivation by phase
A phase is a syntactic domain first hypothesized by Noam Chomsky in 1998. It is a domain where all derivational processes operate and where all features are checked. A phase consists of a phase head and a phase domain. Once any derivation reaches a phase and all the features are checked, the phase domain is sent to transfer and becomes invisible to further computations. The literature shows three trends relative to what is generally considered to be a phase:


 * 1) All CPs and some vPs are phases: Chomsky originally proposed that CP and vP (with transitive and unergative verbs) constitute phases. This was proposed based on the phrases showing the strong phase effects discussed below.
 * 2) A specified set of phrases are phases: CP, DP (based on parallels between DP and CP), all vPs, TP (in some languages)
 * 3) Every phrase is a phase, with moved constituents cycling through all intermediate phrase edges.
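The effect of transfer on a phase can be pictured with a small Python sketch. The Phase class and its fields are hypothetical, and the sketch assumes that the phase head and edge stay visible while the transferred domain does not.

```python
from dataclasses import dataclass, field


@dataclass
class Phase:
    head: str
    edge: list = field(default_factory=list)     # material at the phase edge
    domain: list = field(default_factory=list)   # the phase domain
    transferred: bool = False

    def transfer(self):
        """Send the phase domain to transfer."""
        self.transferred = True

    def accessible(self):
        """Material still visible to further computation."""
        visible = [self.head] + self.edge
        if not self.transferred:
            visible += self.domain
        return visible


vp_phase = Phase("v", edge=["who"], domain=["see"])
print(vp_phase.accessible())  # ['v', 'who', 'see']

vp_phase.transfer()
print(vp_phase.accessible())  # ['v', 'who']
```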

Strong phases: CP and vP
A simple sentence can be decomposed into two phases, CP and vP. Chomsky considers CP and vP to be strong phases because of their propositional content, as well as their interaction with movement and reconstruction.

Propositional content: CP and vP are both propositional units, but for different reasons. CP is considered a propositional unit because it is a full clause that has tense and force: example (1) shows that the complementizer that in the CP phase conditions finiteness (here past tense) and force (here, affirmative) of the subordinate clause. vP is considered a propositional unit because all the theta roles are assigned in vP: in (2) the verb ate in the vP phase assigns the Theme theta role to the DP the cake and the Agent theta-role to the DP Mary.

(1) John said [CP that Mary will eat the cake ].

(2) [CP Mary [vP ate the cake ]].

Movement: CP and vP can be the focus of pseudo-cleft movement, showing that CP and vP form syntactic units: this is shown in (3) for the CP constituent that John is bringing the dessert, and in (4) for the vP constituent arrive tomorrow.

(3) a. Mary said [CP that John is bringing the dessert].
    b. What Mary said was [CP that John is bringing the dessert].

(4) a. Alice will [vP arrive tomorrow].
    b. What Alice will do is [vP arrive tomorrow].

Reconstruction: When a moved constituent is interpreted in its original position to satisfy binding principles, this is called reconstruction. Evidence from reconstruction is consistent with the claim that the moved phrase stops at the left edge of CP and vP phases.

(5) a. [Which picture of himself_k] did John_k think ___ Fred_j liked __?
    b. [Which picture of himself_j] did John_k think ___ Fred_j liked __?
 * Reconstruction at left edge of CP phase: In (5), the reflexive himself can be understood as being co-referential with either John or Fred, where co-reference is indicated by co-indexation. However, the constituent that contains himself, namely the sentence-initial phrase [which picture of himself], is not c-commanded by either John or Fred, as is required by Principle A of the Binding Theory. The fact that co-indexation of himself with either one of John or Fred is possible is taken as evidence that the constituent containing the reflexive, namely [which picture of himself] has moved through a reconstruction site—here the left edge of the lower CP phrase—from where it can satisfy Principle A of the Binding Theory relative to the DP John.
 * Reconstruction at left edge of vP phase: In (6), bound variable anaphora requires that the pronoun he must be c-commanded by every student, but Condition C of the Binding Theory requires that the R-expression Mary be free. However, these requirements cannot be satisfied by the sentence-initial constituent that contains both he and Mary, namely the phrase [which of the papers that he gave Mary]. The fact that the sentence is nevertheless well-formed is taken to indicate that this phrase must have moved through a reconstruction site first, from where it is interpreted. The left edge of the vP phase is the only position where these binding requirements could be satisfied: (i) every student c-commands the pronoun he; (ii) Mary is free from any c-commanding DP.

(6) [Which of the papers that he_k gave Mary_j] did every student_k __ ask her_j to read __ carefully?

Phase edge
Chomsky theorized that syntactic operations must obey the phase impenetrability condition (PIC) which essentially requires that movement be from the left-edge of a phase. The PIC has been variously formulated in the literature. The extended projection principle feature that is on the heads of phases triggers the intermediate movement steps to phase edges.

Phase impenetrability condition (PIC)
Movement of a constituent out of a phase is (in the general case) only permitted if the constituent has first moved to the left edge of the phase (XP).

The edge of a head X is defined as the residue outside of X', in either specifiers of X or adjuncts to XP.

English successive cyclic wh-movement obeys the PIC. Sentence (7) has two phases: vP and CP. Relative to the application of movement, who moves from the (lower) vP phase to the (higher) CP phase in two steps:


 * Step 1: who moves from the complement position of VP to the left edge of vP, and the EPP feature of the verb forces movement of who to the edge of vP.
 * Step 2: who moves from the left edge of the lower vP phase to the specifier of the (higher) CP phase.

(7) [CP Who did you [vP see who ]]?
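The two-step derivation of (7) can be checked against the PIC with a minimal Python sketch. The encoding is an assumption made for illustration: each position records only the phase that contains it and whether it sits at that phase's left edge. The names Position and obeys_pic are hypothetical.

```python
from typing import NamedTuple


class Position(NamedTuple):
    phase: str      # the phase that most immediately contains this position
    at_edge: bool   # True if the position is at the left edge of that phase


def obeys_pic(path):
    """Check every step of a movement path against the PIC as stated above."""
    for source, target in zip(path, path[1:]):
        leaves_phase = source.phase != target.phase
        if leaves_phase and not source.at_edge:
            return False
    return True


base = Position("vP", at_edge=False)      # complement position inside vP
vp_edge = Position("vP", at_edge=True)    # left edge of vP
spec_cp = Position("CP", at_edge=True)    # specifier of CP

print(obeys_pic([base, vp_edge, spec_cp]))  # True: movement in two steps
print(obeys_pic([base, spec_cp]))           # False: the vP edge is skipped
```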

Medumba wh-movement
Another example of PIC can be observed when analyzing A'-agreement in Medumba. A'-agreement is a term used for the morphological reflex of A'-movement of an XP. In Medumba, when the moved phrase reaches a phase edge, a high low tonal melody is added to the head of the complement of the phase head. Since A'-agreement in Medumba requires movement, the presence of agreement on the complements of phase heads shows that the wh-word moves to the edges of phases and obeys PIC.

Example:

Sentence (2a) has a high-low tone on the tense marker nɔ́ʔ and the verb ʤʉ̀n, and is therefore grammatical.

(2a) [CP á wʉ́ Wàtɛ̀t nɔ́ɔ̀ʔ [vP ⁿ-ʤʉ́ʉ̀n á?]]

'Who did Watat see?'

Sentence (2b) does not have a high-low tone on the tense marker nɔ́ʔ and the verb ʤʉ̀n, and is therefore ungrammatical.

(2b) *[CP á wʉ́ Wàtɛ̀t nɔ́ʔ [vP ⁿ-ʤʉ́n á?]]

* 'Who did Watat see?'

To generate the grammatical sentence (2a), the wh-phrase á wʉ́ moves from the vP phase to the CP phase. To obey PIC, this movement must take two steps since the wh-phrase needs to move to the edge of the vP phase in order to move out of the lower phase.


 * Step 1: First, the wh-phrase moves from the complement of VP to the edge of the vP phase to avoid violating PIC. In this position, the agreement is expressed on the verb ʤʉ̀n and surfaces as a high low (HL) tone melody (ⁿ-ʤʉ́ʉ̀n). The agreement is expressed on the verb which is the head of the complement of the v phase head.
 * Step 2: Now that it is at the edge of the vP phase, the wh-phrase is able to leave the vP phase and move to the Spec-C position of the CP phase. Agreement is expressed on the tense nɔ́ʔ as a high low tone melody (nɔ́ɔ̀ʔ). The tense marker on which agreement is expressed is the head of the complement of the C phase head.

One can confirm that A'-agreement only occurs with movement by examining sentences where the wh-phrase does not move. In sentence (2c) below, there is no high-low tone melody on the tense marker nɔ́ʔ and the verb fá, since the wh-word does not move to the edge of the vP and CP phases.

(2c) [m-ɛ́n nɔ́ʔ fá bɔ̀ á wʉ́ á]

'The child gave the bag to who?'
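
The contrast between (2a), (2b) and (2c) can be summarised in a small toy model. The Python sketch below is purely illustrative: the names `PHASES`, `a_bar_agreement` and `is_well_formed` are hypothetical, and the model assumes only what the text states, namely that a moving wh-phrase stops at every phase edge and that each stop marks the head of the complement of that phase head with the high low melody.

```python
# Toy model of A'-agreement under PIC. All names here are hypothetical.
# Each phase head is paired with the head of its complement, as in the text:
# the complement of v is headed by the verb, the complement of C by tense.
PHASES = [("v", "verb"), ("C", "tense")]  # ordered bottom-up


def a_bar_agreement(wh_moves: bool) -> dict:
    """Return which heads are predicted to carry the high-low (HL) melody.

    If the wh-phrase moves, PIC forces it through the edge of every phase
    on its path, and each stop triggers agreement on the head of that
    phase head's complement. If it stays in situ, nothing is marked.
    """
    marking = {complement_head: False for _, complement_head in PHASES}
    if wh_moves:
        for _phase_head, complement_head in PHASES:
            marking[complement_head] = True  # wh-phrase stopped at this edge
    return marking


def is_well_formed(wh_moves: bool, observed: dict) -> bool:
    """A surface form is predicted grammatical only if its marking matches."""
    return observed == a_bar_agreement(wh_moves)
```

Under these assumptions, `a_bar_agreement(True)` marks both heads, as in (2a); a moved wh-phrase with no marking, as in (2b), fails `is_well_formed`; and an in-situ wh-phrase with no marking, as in (2c), passes.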

Cycle
The spell-out of a string is assumed to be cyclic, but there is no consensus about how to implement this. Some analyses adopt an iterative spell-out algorithm, with spell-out applying after each application of Merge. Other analyses adopt an opportunistic algorithm, where spell-out applies only if it must. And yet others adopt a wait-til-the-end algorithm, with spell-out occurring only at the end of the derivation.
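
The three timing options can be contrasted in a minimal sketch. The function `spell_out_points` and the predicate `must_spell_out` are hypothetical names introduced for illustration; in particular, what forces spell-out under the opportunistic option is left open in the text, so it is passed in here as an assumption.

```python
from typing import Callable, List


def spell_out_points(
    n_merges: int,
    policy: str,
    must_spell_out: Callable[[int], bool] = lambda step: False,
) -> List[int]:
    """Return the Merge steps (1-based) after which spell-out applies.

    policy is one of "iterative", "opportunistic" or "wait-til-the-end".
    must_spell_out is an assumed, caller-supplied condition that is only
    consulted under the opportunistic policy.
    """
    steps = range(1, n_merges + 1)
    if policy == "iterative":
        # Spell-out applies after each application of Merge.
        return list(steps)
    if policy == "opportunistic":
        # Spell-out applies only at steps where it is forced.
        return [s for s in steps if must_spell_out(s)]
    if policy == "wait-til-the-end":
        # Spell-out applies once, at the end of the derivation.
        return [n_merges] if n_merges > 0 else []
    raise ValueError(f"unknown policy: {policy}")
```

For a derivation with four Merge steps, the iterative policy yields `[1, 2, 3, 4]` and the wait-til-the-end policy yields `[4]`; the opportunistic result depends entirely on the condition supplied.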

There is no consensus about the cyclicality of the Agree relation: it is sometimes treated as cyclic, sometimes as acyclic, and sometimes as counter-cyclic.

Principles and parameters
From a theoretical standpoint, and in the context of generative grammar, the Minimalist Program is an outgrowth of the principles and parameters (P&P) model, considered to be the ultimate standard theoretical model that generative linguistics developed from the early 1980s through to the early 1990s. The Principles and Parameters model posits a fixed set of principles (held to be valid for all human languages) that, when combined with settings for a finite set of parameters, could describe the properties that characterize the language competence that a child eventually attains. One aim of the Minimalist Program is to ascertain how much of the Principles and Parameters model can be taken to result from the hypothesized optimal and computationally efficient design of the human language faculty. In turn, some aspects of the Principles and Parameters model provide technical tools and foundational concepts that inform the broad outlines of the Minimalist Program.

X-bar theory
X-bar theory—first introduced in Chomsky (1970) and elaborated in Jackendoff (1977) among other works—was a major milestone in the history of the development of generative grammar. It contains the following postulates:
 * Each phrase has a head (endocentric) and it projects to a larger phrase.
 * Heads are feature complexes that consist of primitive features.
 * The general X-bar schema in (1) is a property of universal grammar (UG):
 * (1) X' → X...
 * X″ → [Spec, X'] X'
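
The schema in (1) can be illustrated with a short sketch that builds a labelled bracketing in two steps, one per rule. The class `XBar` and its field names are hypothetical, and the head-before-complement order is an assumption made only for display.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class XBar:
    """A single X-bar projection (hypothetical helper for illustration)."""
    category: str                      # e.g. "N" or "V"
    head: str                          # the lexical item that projects
    complement: Optional[str] = None   # material introduced by X' -> X ...
    specifier: Optional[str] = None    # material introduced in [Spec, X']

    def bracket(self) -> str:
        # Rule 1: X' -> X ...  (head plus optional complement)
        x_bar = f"[{self.category}' {self.head}"
        if self.complement:
            x_bar += f" {self.complement}"
        x_bar += "]"
        # Rule 2: X'' -> [Spec, X'] X'  (optional specifier plus X')
        parts = [p for p in (self.specifier, x_bar) if p]
        return f"[{self.category}'' " + " ".join(parts) + "]"
```

For example, `XBar("N", "picture", complement="of Mary", specifier="the").bracket()` returns `[N'' the [N' picture of Mary]]`. Note that the intermediate X' level is always built, even when there is no specifier or complement.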

In the chapter "Phrase Structure" of The Handbook of Contemporary Syntactic Theory, Naoki Fukui identifies three kinds of syntactic relationships: (1) Dominance: the hierarchical categorization of the lexical items and constituents of the structure; (2) Labeling: the syntactic category of each constituent; and (3) Linear order (or Precedence): the left-to-right order of the constituents (essentially the existence of the X-bar schemata). Whereas X-bar theory encoded all three relationships, bare phrase structure encodes only the first two. Of the three X-bar claims listed above, Claims 1 and 2 have survived almost entirely in their original form through the development of grammatical theory, whereas Claim 3 has not. Claim 1 is itself later eliminated in favour of projection-less nodes.

In 1980, the principles and parameters (P&P) approach emerged, marking a shift away from rule-based grammars: rule systems were replaced by multiple modules of UG such as X-bar theory, case theory, etc. During this time, phrase structure (PS) rules disappeared because they proved to be redundant, since they recapitulate what is in the lexicon. Transformational rules survived with a few amendments to how they are expressed: complex traditional rules no longer need to be defined individually and can be reduced to a general schema called Move-α, which means that anything can be moved anywhere. The two sub-theories that withstood time within P&P are X-bar theory and Move-α. Of the fundamental properties mentioned above, X-bar theory accounts for hierarchical structure and endocentricity, while Move-α accounts for unboundedness and non-local dependencies. A few years later, an effort was made to merge X-bar theory with Move-α by suggesting that structures are built from the bottom up (using adjunction or substitution depending on the target structure):
 * Features are discharged as soon as a head projects. This follows from the idea that phrases are endocentric (headed): the head is the obligatory component of a phrasal constituent and projects its essential features.
 * There is no X-bar schema, and no requirement for maximal projections to be specified as bar levels. This is a consequence of the claim that features are discharged by projection of the head.
 * At any given bar level, iteration is possible. This is based on the idea that phrase structure composition is infinite.
 * Adjunction is responsible for movement and structure-building. This is based on the idea that transformational operations are fundamental.
 * Projections are closed by agreement. This is based on the idea that in some languages (Japanese), phrases do not close and elements can be added to keep expanding them. (This is not the case in English.)

X-bar theory had a number of weaknesses and was replaced by bare phrase structure (BPS), although BPS borrowed some X-bar notions. Labeling in bare phrase structure in particular was adapted from the conventions of X-bar theory; however, obtaining the "barest" phrase structures requires some departures. BPS differs from X-bar theory in the following ways:
 * 1) BPS is explicitly derivational. That is, it is built from the bottom up, bit by bit. In contrast, X-bar theory is representational: a structure for a given construction is built in one fell swoop, and lexical items are inserted into the structure.
 * 2) BPS does not have a preconceived phrasal structure, while in X-bar theory every phrase has a specifier, a head, and a complement.
 * 3) BPS permits only binary branching, while X-bar theory permits both binary and unary branching.
 * 4) BPS does not distinguish between a "head" and a "terminal", while some versions of X-bar theory require such a distinction.
 * 5) BPS incorporates features into its structure, such as Xmax and Xmin, while X-bar theory contains levels, such as XP, X', X.
 * 6) BPS accommodates cross-linguistic variation, since maximal projections can be identified at either the XP level or the X' level, whereas X-bar theory treats only XP as the maximal projection.
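
Several of these contrasts can be made concrete in a minimal sketch of bottom-up, strictly binary structure building. The names `SO`, `merge` and `is_maximal` are hypothetical, and using the head's lexical item as the label is a simplification that assumes no two items in the structure share the same form.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class SO:
    """A syntactic object: a lexical item or the output of Merge."""
    label: str                                  # the head's lexical item
    parts: Optional[Tuple["SO", "SO"]] = None   # exactly two parts, or none

    def is_minimal(self) -> bool:
        # A minimal projection dominates nothing.
        return self.parts is None


def merge(head: SO, other: SO) -> SO:
    """Binary Merge: the head projects, so its label labels the result."""
    return SO(label=head.label, parts=(head, other))


def is_maximal(node: SO, parent: Optional[SO]) -> bool:
    # A maximal projection does not project any higher: it either has no
    # parent or its parent carries a different label.
    return parent is None or parent.label != node.label
```

Building a verb phrase with `see = SO("see")`, `shoe = SO("shoe")` and `vp = merge(see, shoe)` shows that projection status is relational rather than stipulated: `see` is minimal but not maximal, `vp` is maximal but not minimal, and `shoe` is both at once, with no non-branching or bar-level nodes needed.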

The progression of syntactic structure runs from X-bar theory (the theory preceding BPS) to specifier-less structure. BPS satisfies the principles of UG using at minimum two interfaces, the conceptual-intentional and sensorimotor systems, or a third condition not specific to language but still satisfying the conditions put forth by the interfaces. The main reasoning behind the transition from X-bar theory to BPS is the following:
 * 1) Eliminating the notion of non-branching domination
 * 2) Eliminating the necessity of bar-level projections

Functionalism
In linguistics, there are differing approaches taken to explore the basis of language: two of these approaches are formalism and functionalism. It has been argued that the formalist approach can be characterized by the belief that rules governing syntax can be analyzed independently from things such as meaning and discourse. In other words, according to formalists, syntax is an independent system (referred to as the autonomy of syntax). By contrast, functionalists believe that syntax is determined largely by the communicative function that it serves. Therefore, syntax is not kept separate from things such as meaning and discourse.

Under functionalism, there is a belief that language evolved alongside other cognitive abilities, and that these cognitive abilities must be understood in order to understand language. In Chomsky's theories prior to MP, he had been interested exclusively in formalism, and had believed that language could be isolated from other cognitive abilities. However, with the introduction of MP, Chomsky considers aspects of cognition (e.g. the conceptual-intentional (CI) system and the sensory motor (SM) system) to be linked to language. Rather than arguing that syntax is a specialized model which excludes other systems, under MP, Chomsky considers the roles of cognition, production, and articulation in formulating language. Given that these cognitive systems are considered in an account of language under MP, it has been argued that in contrast to Chomsky's previous theories, MP is consistent with functionalism.

Dependency grammar
There is a trend in minimalism that shifts from constituency-based to dependency-based structures. Minimalism falls under the dependency grammar umbrella by virtue of adopting bare phrase structure, label-less trees, and specifier-less syntax.


 * bare phrase structure: Merge does away with non-branching nodes and bar levels, which are replaced by minimal projections (XMIN) and maximal projections (XMAX). A minimal projection does not dominate other lexical items or categories; a maximal projection is unable to project any higher.
 * label-less trees: to simplify phrase structures, Noam Chomsky argues that category labels are unnecessary and therefore need not be included, leading to what are now known as label-less trees. In lieu of a specific category in the projection, the lexical item that is classified as the head becomes its own label.
 * specifier-less syntax: the generalization of Abney's (1987) DP hypothesis gives rise to specifier-less syntax. Lexical items that would have been analyzed as specifiers in earlier versions of X-bar theory (e.g. determiners were introduced in [Spec,N], and auxiliaries in [Spec,V]) become the heads of their own phrases. For example, D introduces NP as a complement; T introduces VP as a complement.

First language (L1) acquisition
As discussed by Helen Goodluck and Nina Kazanina in their 2020 paper, certain aspects of the minimalist program provide insightful accounts of first language (L1) acquisition by children.


 * Two-word stage: Merge is the operation where two syntactic elements are brought together and combined to form a constituent. The head of the pair determines the constituent's label, but the element that becomes the head depends on the language. English is a left-headed language, such that the element on the left is the head; Japanese is a right-headed language, such that the element on the right is the head. Merge (a critical operation in MP) can account for the patterns of word-combination, and more specifically word-order, observed in children's first language acquisition. In first language acquisition, it has been observed that young children combine two words in ways that are consistent with either the head-initial or head-final pattern of the language they are learning. Children learning English produce "pivot" words (e.g. see) before "open" words (e.g. shoe), which is consistent with the head-initial pattern of English, whereas children learning Japanese produce "open" words before "pivot" words.
 * Emergence of headed combinations: Within the minimalist program, bare phrase structure, described in detail above, accounts for children's first language acquisition better than earlier theories of phrase structure building, such as X-bar theory. This is because, under bare phrase structure, children do not need to account for the intermediate layers of structure that appear in X-bar theory. The account of first language acquisition provided under bare phrase structure is therefore simpler than that provided under X-bar theory. In particular, children typically progress from (unordered) conjunctions to headed combinations. This trajectory can be modelled as a progression from symmetric Merge (where the output label of the derived syntactic object is indeterminate) to asymmetric Merge (where the output label of the derived syntactic object is determinate; i.e. endocentric/headed).
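
Both points can be sketched in a few lines. The functions `linearize` and `merge_label` are hypothetical names, and treating the "pivot" word as the head follows the description above rather than any independent analysis.

```python
from typing import FrozenSet, Optional, Tuple


def linearize(head: str, complement: str, head_initial: bool) -> str:
    """Order a head and its complement according to the language's headedness."""
    return f"{head} {complement}" if head_initial else f"{complement} {head}"


def merge_label(
    a: str, b: str, head: Optional[str] = None
) -> Tuple[FrozenSet[str], Optional[str]]:
    """Return the merged (unordered) pair together with its label.

    Symmetric Merge (head=None): the label is indeterminate, modelled as None.
    Asymmetric Merge: one of the two elements projects and supplies the label.
    """
    if head is not None and head not in (a, b):
        raise ValueError("head must be one of the merged elements")
    return frozenset((a, b)), head
```

With the pivot word as head, `linearize("see", "shoe", head_initial=True)` gives the English-type order `see shoe`, while `head_initial=False` gives the open-before-pivot order described for Japanese. Calling `merge_label("see", "shoe")` leaves the label undetermined, whereas `merge_label("see", "shoe", head="see")` yields a headed combination.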

Criticisms
In the late 1990s, David E. Johnson and Shalom Lappin published the first detailed critiques of Chomsky's minimalist program. This technical work was followed by a lively debate with proponents of minimalism on the scientific status of the program. The original article provoked several replies and two further rounds of replies and counter-replies in subsequent issues of the same journal.

Lappin et al. argue that the minimalist program is a radical departure from earlier Chomskyan linguistic practice that is not motivated by any new empirical discoveries, but rather by a general appeal to perfection, which is both empirically unmotivated and so vague as to be unfalsifiable. They compare the adoption of this paradigm by linguistic researchers to other historical paradigm shifts in the natural sciences and conclude that the adoption of the minimalist program has been an "unscientific revolution", driven primarily by Chomsky's authority in linguistics. The several replies to the article in Natural Language and Linguistic Theory Volume 18 number 4 (2000) make a number of different defenses of the minimalist program. Some claim that it is not in fact revolutionary or not in fact widely adopted, while others agree with Lappin and Johnson on these points, but defend the vagueness of its formulation as not problematic in light of its status as a research program rather than a theory (see above).

Prakash Mondal has published a book-length critique of the minimalist model of grammar, arguing that there are a number of contradictions, inconsistencies and paradoxes within the formal structure of the system. In particular, his critique examines the consequences of adopting some rather innocuous and widespread assumptions or axioms about the nature of language as adopted in the Minimalist model of the language faculty.

Developments in the minimalist program have also been critiqued by Hubert Haider, who has argued that minimalist studies routinely fail to follow scientific rigour. In particular, data compatible with hypotheses are filed under confirmation, whereas crucial counter-evidence is largely ignored or shielded off by making ad hoc auxiliary assumptions. Moreover, the supporting data are biased towards SVO languages and are often based on the linguist's introspection rather than on attempts to gather data in an unbiased manner by experimental means. Haider further refers to the appeal to an authority figure in the field, with dedicated followers taking the core premises of minimalism for granted as if they were established facts.

Works by Noam Chomsky

 * Chomsky, Noam. 2013. Problems of Projection. Lingua 130: 33–49.
 * Chomsky, Noam. 2008. On Phases. In Foundational Issues in Linguistic Theory. Essays in Honor of Jean-Roger Vergnaud, eds. Robert Freidin, Carlos Peregrín Otero and Maria Luisa Zubizarreta, 133–166. Cambridge, Massachusetts: MIT Press.
 * Chomsky, Noam. 2007. Approaching UG From Below. In Interfaces + Recursion = Language?, eds. Uli Sauerland and Hans Martin Gärtner, 1–29. New York: Mouton de Gruyter.
 * Chomsky, Noam. 2005. Three Factors in Language Design. Linguistic Inquiry 36: 1–22.
 * Chomsky, Noam. 2004. Beyond Explanatory Adequacy. In Structures and Beyond. The Cartography of Syntactic Structures, ed. Adriana Belletti, 104–131. Oxford: Oxford University Press.
 * Chomsky, Noam. 2001. Derivation by Phase. In Ken Hale: A Life in Language, ed. Michael Kenstowicz, 1–52. Cambridge, Massachusetts: MIT Press.
 * Chomsky, Noam. 2000. New horizons in the study of language and mind. Cambridge, UK; New York: Cambridge University Press.
 * Chomsky, Noam. 2000. Minimalist inquiries: the framework. In Step by Step: Essays on Minimalist Syntax in Honor of Howard Lasnik, eds. Roger Martin, David Michaels and Juan Uriagereka, 89–155. Cambridge, Massachusetts: MIT Press.
 * Chomsky, Noam. 1995. The Minimalist Program. Cambridge, Massachusetts: The MIT Press.
 * Chomsky, Noam. 1993. A minimalist program for linguistic theory. In The View from Building 20: Essays in Linguistics in Honor of Sylvain Bromberger, eds. Kenneth L. Hale and S. Jay Keyser, 1–52. Cambridge, Massachusetts: MIT Press.

Works on minimalism and its applications

 * Citko, Barbara and Martina Gračanin-Yuksek. 2020. Merge: Binarity in (Multidominant) Syntax. Cambridge, Massachusetts: MIT Press.
 * Smith, Peter W., Johannes Mursell, and Katharina Hartmann (eds.) 2020. Agree to Agree: Agreement in the Minimalist Programme. Berlin: Language Science Press. ISBN 978-3-96110-214-3 (Digital). doi:10.5281/zenodo.3528036
 * Cipriani, Enrico. 2019. Semantics in Generative Grammar. A Critical Survey. Lingvisticae Investigationes 42(2): 134–85. doi: https://doi.org/10.1075/li.00033.cip
 * Stroik, Thomas. 2009. Locality in Minimalist Syntax. Cambridge, Massachusetts: MIT Press.
 * Boeckx, Cedric (ed). 2006. Minimalist Essays. Amsterdam: John Benjamins.
 * Epstein, Samuel David, and Seely, T. Daniel (eds). 2002. Derivation and Explanation in the Minimalist Program. Malden, MA: Blackwell.
 * Richards, Norvin. 2001. Movement in Language. Oxford: Oxford University Press.
 * Pesetsky, David. 2001. Phrasal Movement and its Kin. Cambridge, Massachusetts: MIT Press.
 * Martin, Roger, David Michaels and Juan Uriagereka (eds). 2000. Step by Step: Essays on Minimalist Syntax in Honor of Howard Lasnik. Cambridge, Massachusetts: MIT Press.
 * Epstein, Samuel David, and Hornstein, Norbert (eds). 1999. Working Minimalism. Cambridge, Massachusetts: MIT Press.
 * Fox, Danny. 1999. Economy and Semantic Interpretation. Cambridge, Massachusetts: MIT Press.
 * Bošković, Željko. 1997. The Syntax of Nonfinite Complementation. An Economy Approach. Cambridge, Massachusetts: MIT Press.
 * Collins, Chris. 1997. Local Economy. Cambridge, Massachusetts: MIT Press.
 * Brody, Michael. 1995. Lexico-Logical Form: a Radically Minimalist Theory. Cambridge, Massachusetts: MIT Press.

Textbooks on minimalism

 * Adger, David. 2003. Core Syntax. A Minimalist Approach. Oxford: Oxford University Press.
 * Boeckx, Cedric. 2006. Linguistic Minimalism. Origins, Concepts, Methods and Aims. Oxford: Oxford University Press.
 * Bošković, Željko and Howard Lasnik (eds). 2006. Minimalist Syntax: The Essential Readings. Malden, MA: Blackwell.
 * Cook, Vivian J. and Newson, Mark. 2007. Chomsky's Universal Grammar: An Introduction. Third Edition. Malden, MA: Blackwell.
 * Hornstein, Norbert, Jairo Nunes and Kleanthes K. Grohmann. 2005. Understanding Minimalism. Cambridge: Cambridge University Press.
 * Lasnik, Howard, Juan Uriagereka and Cedric Boeckx. 2005. A Course in Minimalist Syntax. Malden, MA: Blackwell.
 * Radford, Andrew. 2004. Minimalist Syntax: Exploring the Structure of English. Cambridge: Cambridge University Press.
 * Uriagereka, Juan. 1998. Rhyme and Reason. An Introduction to Minimalist Syntax. Cambridge, Massachusetts: MIT Press.
 * Webelhuth, Gert (ed.). 1995. Government and Binding Theory and the Minimalist Program: Principles and Parameters in Syntactic Theory. Wiley-Blackwell.