Phraseme

A phraseme, also called a set phrase, fixed expression, idiomatic phrase, multiword expression (in computational linguistics), or idiom, is a multi-word or multi-morphemic utterance in which at least one component is selectionally constrained or restricted by linguistic convention, so that it is not freely chosen. In the most extreme cases, there are expressions such as X kicks the bucket ≈ ‘person X dies of natural causes, the speaker being flippant about X’s demise’, where the unit is selected as a whole to express a meaning that bears little or no relation to the meanings of its parts. All of the words in this expression are chosen restrictedly, as part of a chunk. At the other extreme, there are collocations such as stark naked, hearty laugh, or infinite patience, where one of the words is chosen freely (naked, laugh, and patience, respectively) based on the meaning the speaker wishes to express, while the choice of the other (intensifying) word (stark, hearty, infinite) is constrained by the conventions of the English language (hence, *hearty naked, *infinite laugh, *stark patience). Both kinds of expression are phrasemes, and can be contrasted with “free phrases”, expressions where all of the members (barring grammatical elements whose choice is forced by the morphosyntax of the language) are chosen freely, based exclusively on their meaning and the message that the speaker wishes to communicate.

Major types of phraseme
Phrasemes can be broken down into groups based on their compositionality (whether or not the meaning they express is the sum of the meanings of their parts) and the type of selectional restrictions that are placed on their non-freely chosen members. Non-compositional phrasemes are what are commonly known as idioms, while compositional phrasemes can be further divided into collocations and clichés, with pragmatemes forming a subtype of cliché.

Non-compositional phrasemes: Idioms
A phraseme is an idiom if its meaning is not the predictable sum of the meanings of its components, that is, if it is non-compositional. Generally speaking, an idiom is not intelligible to someone hearing it for the first time without having learned it. Consider the following examples (an idiom is indicated by elevated half-brackets: ˹ … ˺):


 * ˹rock and roll˺ ‘a Western music genre characterised by a strong beat with sounds generated by guitar, piano, and vocalists’
 * ˹cheek by jowl˺ ‘in close association’
 * ˹the game is up˺ ‘your deceit is exposed’
 * ˹[X] comes to [NX’s] senses˺ ‘X becomes conscious or rational again’
 * ˹put [NY] on the map˺ ‘make the place Y well-known’
 * ˹bull session˺ ‘long informal talk on a subject by a group of people’

In none of these cases are the meanings of any of the component parts of the idiom included in the meaning of the expression as a whole.

An idiom can be further characterized by its transparency, the degree to which its meaning includes the meanings of its components. Three types of idioms can be distinguished in this way—full idioms, semi-idioms, and quasi-idioms.

Full idioms
An idiom AB (that is, composed of the elements A ‘A’ and B ‘B’) is a full idiom if its meaning does not include the meaning of any of its lexical components: ‘AB’ ⊅ ‘A’ and ‘AB’ ⊅ ‘B’.
 * ˹put [NY] through its paces˺ ‘to test Y thoroughly’
 * ˹go ballistic˺ ‘suddenly become very angry’
 * ˹by heart˺ ‘remembering verbatim’
 * ˹bone of contention˺ ‘reason for quarrels or fights’

Semi-idioms
An idiom AB is a semi-idiom if its meaning
 * 1) includes the meaning of one of its lexical components, but not as its semantic pivot (see below),
 * 2) does not include the meaning of the other component and
 * 3) includes an additional meaning ‘C’ as its semantic pivot:
 * ‘AB’ ⊃ ‘A’, and ‘AB’ ⊅ ‘B’, and ‘AB’ ⊃ ‘C’.

The semantic pivot of an idiom is, roughly speaking, the part of the meaning that defines what sort of referent the idiom has (person, place, thing, event, etc.); in ˹private eye˺ ‘private investigator’ below, for instance, the pivot is ‘investigator’. More precisely, the semantic pivot is defined, for an expression AB meaning ‘S’, as that part ‘S1’ of AB’s meaning ‘S’ such that ‘S’ [= ‘S1’ ⊕ ‘S2’] can be represented as a predicate ‘S2’ bearing on ‘S1’, i.e., ‘S’ = ‘S2’(‘S1’) (Mel’čuk 2006: 277).
 * ˹private eye (I)˺ : ‘private investigator’
 * ˹sea anemone˺ : ‘predatory polyp dwelling in the sea’
 * Rus. ˹mozolit´ glaza˺ : ‘be in Y’s sight too often or for too long’ (lit. ‘make corns on Y’s eyes’)

Quasi-idioms or weak idioms
An idiom AB is a quasi-idiom, or weak idiom, if its meaning
 * 1) includes the meanings of both of its lexical components, neither as the semantic pivot, and
 * 2) includes an additional meaning ‘C’ as its semantic pivot:
 * ‘AB’ ⊃ ‘A’, and ‘AB’ ⊃ ‘B’, and ‘AB’ ⊃ ‘C’.


 * Fr. ˹donner le sein à Y˺ : ‘feed the baby Y by putting one teat into the mouth of Y’
 * ˹start a family˺ : ‘conceive a first child with one’s spouse, starting a family’
 * ˹barbed wire˺ : ‘[artifact designed to make obstacles with and constituted by] wire with barbs [fixed on it in small regular intervals]’
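
The three inclusion conditions above can be illustrated with a minimal Python sketch. It assumes a toy representation that is not part of the source: each meaning is a hand-assigned set of labels, and ‘AB’ ⊃ ‘A’ is approximated by set membership. The function name classify_idiom and the label sets are hypothetical and serve only to make the definitions concrete.

```python
from typing import FrozenSet, Sequence


def classify_idiom(meaning: FrozenSet[str],
                   components: Sequence[str],
                   pivot: str) -> str:
    """Classify an idiom by which component meanings its meaning includes.

    meaning    -- hand-assigned labels making up the idiom's meaning 'AB'
    components -- labels for the meanings of the lexical components 'A', 'B'
    pivot      -- label of the semantic pivot of 'AB'
    """
    if pivot not in meaning:
        raise ValueError("the pivot must be part of the idiom's meaning")
    included = [c for c in components if c in meaning]
    if not included:
        return "full idiom"      # 'AB' ⊅ 'A' and 'AB' ⊅ 'B'
    if pivot in included:
        raise ValueError("a component meaning is the pivot; "
                         "the semi- and quasi-idiom definitions do not apply")
    if len(included) == len(components):
        return "quasi-idiom"     # 'AB' ⊃ 'A', 'AB' ⊃ 'B', 'AB' ⊃ 'C'
    return "semi-idiom"          # 'AB' ⊃ 'A', 'AB' ⊅ 'B', 'AB' ⊃ 'C'


# Toy decompositions of three examples from the text (assumed, simplified).
by_heart = classify_idiom(frozenset({"remembering", "verbatim"}),
                          ["by", "heart"], pivot="remembering")
private_eye = classify_idiom(frozenset({"private", "investigator"}),
                             ["private", "eye"], pivot="investigator")
barbed_wire = classify_idiom(frozenset({"artifact", "wire", "barbed"}),
                             ["barbed", "wire"], pivot="artifact")

print(by_heart)     # full idiom
print(private_eye)  # semi-idiom
print(barbed_wire)  # quasi-idiom
```

Real semantic decomposition is far richer than label membership; the sketch only mirrors the logical shape of the three definitions.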

Compositional phrasemes
A phraseme AB is said to be compositional if the meaning ‘AB’ = ‘A’ ⊕ ‘B’ and the form /AB/ = /A/ ⊕ /B/ (“⊕” here means ‘combined in accordance with the rules of the language’). Compositional phrasemes are generally broken down into two groups: collocations and clichés.

Collocations
A collocation is generally said to consist of a base (shown in capitals), a lexical unit chosen freely by the speaker, and a collocate, a lexical unit chosen as a function of the base.
 * heavy ACCENT : ‘strong accent’
 * sound ASLEEP : ‘asleep such that one is hard to awaken’
 * ARMED to the teeth : ‘armed with many or with powerful weapons’
 * leap YEAR : ‘year in which February has 29 days’

In American English, you make a decision, and in British English, you can also take it. For the same meaning, French says prendre [= ‘take’] une décision; German, eine Entscheidung treffen/fällen [= ‘meet/fell’]; Russian, prinjat´ [= ‘accept’] rešenie; Turkish, karar vermek [= ‘give’]; Polish, podjąć [= ‘take up’] decyzję; Serbian, doneti [= ‘bring’] odluku; Korean, gyeoljeongeul hada 〈naerida〉 [= ‘do 〈take/put down〉’]; and Swedish, fatta [= ‘grab’]. This shows that the verbs (glossed in brackets) are selected as a function of the noun meaning ‘decision’. If instead of DÉCISION a French speaker uses CHOIX ‘choice’ (Jean a pris la décision de rester ‘Jean has taken the decision to stay’ ≅ Jean a … le choix de rester ‘Jean has ... the choice to stay’), the verb has to be FAIRE ‘make’ rather than PRENDRE ‘take’: Jean a fait 〈*a pris〉 le choix de rester ‘Jean has made the choice to stay’.

A collocation is semantically compositional since its meaning is divisible into two parts such that the first corresponds to the base and the second to the collocate. This is not to say that a collocate, when used outside the collocation, must have the meaning it expresses within the collocation. For instance, in the collocation sit for an exam ‘undergo an exam’, the verb SIT expresses the meaning ‘undergo’; but in an English dictionary, the verb SIT does not appear with this meaning: ‘undergo’ is not its inherent meaning, but rather a context-imposed meaning.
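
The idea that a collocate is chosen as a function of its base can be sketched as a simple lookup. The following Python fragment is only an illustration: the table holds a subset of the verb and noun pairs cited above, and the names SUPPORT_VERBS, collocates_for, and is_conventional are hypothetical, not an established resource or API.

```python
from typing import Dict, List, Tuple

# (language, base noun) -> conventional verb(s), taken from the examples above.
# English "take" is the British English option mentioned in the text.
SUPPORT_VERBS: Dict[Tuple[str, str], List[str]] = {
    ("English", "decision"): ["make", "take"],
    ("French", "décision"): ["prendre"],
    ("French", "choix"): ["faire"],
    ("German", "Entscheidung"): ["treffen", "fällen"],
    ("Russian", "rešenie"): ["prinjat´"],
    ("Turkish", "karar"): ["vermek"],
}


def collocates_for(language: str, base: str) -> List[str]:
    """Return the verbs conventionally selected by the base, if listed."""
    return SUPPORT_VERBS.get((language, base), [])


def is_conventional(language: str, verb: str, base: str) -> bool:
    """Check whether the verb is a listed collocate of the base."""
    return verb in collocates_for(language, base)


print(is_conventional("French", "prendre", "décision"))  # True
print(is_conventional("French", "prendre", "choix"))     # False (*a pris le choix)
print(collocates_for("French", "choix"))                 # ['faire']
```

The lookup is keyed on the base, not on the meaning of the verb, which mirrors the point that the collocate cannot be predicted from what the speaker wants to say alone.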

Clichés
Generally, a cliché is said to be a phraseme consisting of components of which none are selected freely and whose usage restrictions are imposed by conventional linguistic usage, as in the following examples:
 * in the wrong place at the wrong time
 * you’ve seen one, you’ve seen ’em all!
 * no matter what
 * we all make mistakes
 * one thing after another

Clichés are compositional in the sense that their meaning is more or less the sum of the meanings of their parts (though not strictly so in, for example, no matter what), and clichés (unlike idioms) would be completely intelligible to someone hearing them for the first time. They are not completely free expressions, however, because they are the conventionalized means of expressing the desired meanings in the language.

For example, in English one asks What is your name? and answers My name is [N] or I am [N], but to do the same in Spanish one asks ¿Cómo se llama? (lit. ‘How are you called?’) and one answers Me llamo [N] (‘I am called [N]’). The literal renderings of the English expressions are ¿Cómo es su nombre? (lit. ‘What is your name?’) and Soy [N] (‘I am [N]’), and while they are fully understandable and grammatical they are not standard; equally, the literal translations of the Spanish expressions would sound odd in English, as the question ‘How are you called?’ sounds unnatural to English speakers.

A subtype of cliché is the pragmateme, a cliché where the restrictions are imposed by the situation of utterance:
 * English - Will you marry me? : [when making a marriage proposal]
 * Russian - Bud´(te) moej ženoj! (lit. ‘Be my wife!’) : [when making a marriage proposal]
 * English - Best before… : [on a container of packaged food]
 * Russian - Srok godnosti – … (lit. ‘Deadline of fitness is …’) : [on a container of packaged food]
 * French - À consommer avant … (‘To be consumed before …’) : [on a container of packaged food]
 * German - Mindestens haltbar bis … (‘Keeps until at least …’) : [on a container of packaged food]

As with clichés, the conventions of the languages in question dictate a particular pragmateme for a particular situation—alternate expressions would be understandable, but would not be perceived as normal.
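
The classification of lexical phrasemes described so far can be summarised as a small decision procedure. The sketch below is an assumption-laden simplification: the four boolean features and the function name phraseme_type are hypothetical, and the feature values for each example are assigned by hand following the descriptions in the text.

```python
def phraseme_type(any_restricted: bool,
                  compositional: bool,
                  has_free_base: bool,
                  situation_bound: bool = False) -> str:
    """Map hand-assigned features of an expression to a phraseme type.

    any_restricted  -- at least one component is not freely chosen
    compositional   -- the meaning is the sum of the meanings of the parts
    has_free_base   -- one component (the base) is chosen freely
    situation_bound -- the restriction comes from the situation of utterance
    """
    if not any_restricted:
        return "free phrase"
    if not compositional:
        return "idiom"
    if has_free_base:
        return "collocation"
    return "pragmateme" if situation_bound else "cliché"


examples = {
    "kick the bucket": phraseme_type(True, False, False),
    "stark naked": phraseme_type(True, True, True),
    "we all make mistakes": phraseme_type(True, True, False),
    "Best before…": phraseme_type(True, True, False, situation_bound=True),
}

for expression, kind in examples.items():
    print(f"{expression}: {kind}")
```

The order of the tests reflects the hierarchy in the text: compositionality separates idioms from the rest, and pragmatemes are reached only as a special case of clichés.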

Phrasemes in morphology
Although the discussion of phrasemes centres largely on multi-word expressions such as those illustrated above, phrasemes are known to exist on the morphological level as well. Morphological phrasemes are conventionalized combinations of morphemes such that at least one of their components is selectionally restricted. Just as with lexical phrasemes, morphological phrasemes can be either compositional or non-compositional.

Non-compositional morphological phrasemes
Non-compositional morphological phrasemes, also known as morphological idioms, are actually familiar to most linguists, although the term “idiom” is rarely applied to them—instead, they are usually referred to as “lexicalized” or “conventionalized” forms. Good examples are English compounds such as harvestman ‘arachnid belonging to the order Opiliones’ (≠ ‘harvest’ ⊕ ‘man’) and bookworm (≠ ‘book’ ⊕ ‘worm’); derivational idioms can also be found: airliner ‘large vehicle for flying passengers by air’ (≠ airline ‘company that transports people by air’ ⊕ -er ‘person or thing that performs an action’). Morphological idioms are also found in inflection, as shown by these examples from the irrealis mood paradigm in Upper Necaxa Totonac:

ḭš-tḭ-tachalá̰x-lḭ
PAST-POT-shatter-PFV
‘it could have shattered earlier (but didn’t)’

ḭš-tachalá̰x-lḭ
PAST-shatter-PFV
‘it could have shattered now (but hasn’t)’

ka-tḭ-tachalá̰x-lḭ
OPT-POT-shatter-PFV
‘it could shatter (but won’t now)’

The irrealis mood has no unique marker of its own, but is expressed in conjunction with tense by combinations of affixes “borrowed” from other paradigms—ḭš- ‘past tense’, tḭ- ‘potential mood’, ka- ‘optative mood’, -lḭ ‘perfective aspect’. None of the resulting meanings is a compositional combination of the meanings of its constituent parts (‘present irrealis’ ≠ ‘past’ ⊕ ‘perfective’, etc.).

Compositional morphological phrasemes
Morphological collocations are expressions such that not all of their component morphemes are chosen freely: instead, one or more of the morphemes is chosen as a function of another morphological component of the expression, its base. This type of situation is quite familiar in derivation, where selectional restrictions placed by radicals on (near-)synonymous derivational affixes are common. Two examples from English are the nominalizers used with particular verbal bases (e.g., establishment, *establishation; infestation, *infestment; etc.), and the inhabitant suffixes required for particular place names (Winnipeger, *Winnipegian; Calgarian, *Calgarier; etc.); in both cases, the choice of derivational affix is restricted by the base, but the derivation is compositional, forming a morphological gap. An example of an inflectional morphological collocation is the plural form of nouns in Burushaski.

Burushaski has about 70 plural suffixal morphemes. The plurals are semantically compositional, consisting of a stem expressing the lexical meaning and a suffix expressing PLURAL, but for each individual noun, the appropriate plural suffix has to be learned. Unlike compositional lexical phrasemes, compositional morphological phrasemes seem only to exist as collocations: morphological clichés and morphological pragmatemes have yet to be observed in natural language.
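
The base-dependent choice of affix in the English derivational examples above can be sketched in the same spirit. The tables below contain only the four bases cited in the text; the names NOMINALIZER, INHABITANT, and selects are hypothetical, and the suffixes are stored as labels so that no spelling rules need to be modelled.

```python
from typing import Dict

# base -> the affix it conventionally selects (from the examples in the text)
NOMINALIZER: Dict[str, str] = {"establish": "-ment", "infest": "-ation"}
INHABITANT: Dict[str, str] = {"Winnipeg": "-er", "Calgary": "-ian"}


def selects(table: Dict[str, str], base: str, suffix: str) -> bool:
    """True if the base is listed as selecting this suffix."""
    return table.get(base) == suffix


print(selects(NOMINALIZER, "establish", "-ment"))   # True  (establishment)
print(selects(NOMINALIZER, "establish", "-ation"))  # False (*establishation)
print(selects(INHABITANT, "Calgary", "-ian"))       # True  (Calgarian)
print(selects(INHABITANT, "Calgary", "-er"))        # False (*Calgarier)
```

As with lexical collocations, the meaning contributed by the affix is the same in each pair; only the form has to be listed per base.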