Indo-European copula

A feature common to all Indo-European languages is the presence of a verb corresponding to the English verb to be.

General features
This verb has two basic meanings:
 * In a less marked context it is a simple copula (I’m tired; That’s a shame!), a function which in non-Indo-European languages can be expressed quite differently.
 * In a more heavily marked context it expresses existence (I think therefore I am); the dividing line between these is not always easy to draw.

Some languages have shared these functions between several verbs: Irish, Spanish and Persian all have multiple equivalents of to be, making a variety of distinctions.

Many Indo-European languages also use the verb "to be" as an auxiliary for the formation of compound (periphrastic) tenses (I’m working; I was bitten). Other functions vary from language to language. For example, although to be is a stative verb in its basic meanings, English also puts it to work as a dynamic verb in fixed collocations (You are being very annoying).

The copula is the most irregular verb in many Indo-European languages. This is partly because it is more frequently used than any other, and partly because Proto-Indo-European offered more than one verb suitable for use in these functions, with the result that the daughter languages, in different ways, have tended to form suppletive verb paradigms.

This article describes the way in which the irregular forms have developed from a series of roots.

*h1es-
The root *h1es- was certainly already a copula in Proto-Indo-European.

The e-grade *h1es- (see Indo-European ablaut) is found in such forms as English is, Irish is, German ist, Latin est, Sanskrit asti, Persian ast, Old Church Slavonic jestĭ.

The zero grade *h1s- produces forms beginning with /s/, like German sind, Latin sumus, Vedic Sanskrit smas, etc.

In PIE, *h1es- was an athematic verb in -mi; that is, the first person singular was *h1esmi; this inflection survives in English am, Pashto yem, Persian am, Sanskrit asmi, Bengali first-person verb ending -ām, Old Church Slavonic esmĭ, etc.

This verb is generally reconstructed for Proto-Indo-European thus:

*bhuH-
The root *bhuH- (which did not have ablaut variations in the protolanguage) probably meant 'to grow', but also 'to become'.

This is the source of the English infinitive be and participle been. Also, for example, the Scottish Gaelic "future" tense bithidh; the Irish imperative bí, past bhí and future beidh; the Welsh bod (along with the other b- initial forms); Persian imperative bov, past bud and future bâš; and the Slavic infinitive and past, etc. for example Russian быть (byt’), был (byl).

PIE *bh became Latin /f/, hence the Latin future participle futūrus and perfect fuī; Latin fīō 'I become' is also from this root, as is the Greek verb φύω (phúō), from which physics and physical are derived.

*bhuH- was a preterito-present verb, i.e. it took imperfect endings in the present, and can be reconstructed as follows:

*h2wes-
The root *h2wes- may originally have meant "to live", and has been productive in all Germanic languages. The e-grade is present in the German participle gewesen, the o-grade (*wos-) survives in English and Old High German was, while the lengthened e-grade (*wēs-) gives us English were. (The Germanic forms with /r/ instead of /s/ result from grammatischer Wechsel.) See Germanic strong verb: Class 5.
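As a rough illustration of the suppletion described so far, the English forms mentioned in the sections above can be grouped by the root they are traced to. This is only a minimal sketch in Python: the names `ENGLISH_BE` and `forms_by_root` are hypothetical, the root labels use the ASCII notation of this article (with *bhuH- standing for the root behind be and been), and forms whose origin is disputed, such as are, are deliberately left out.

```python
# Hypothetical lookup table: English form -> PIE root it is traced to in the text.
# Only forms explicitly mentioned in the root sections are included.
ENGLISH_BE = {
    "am": "*h1es-",
    "is": "*h1es-",
    "be": "*bhuH-",
    "been": "*bhuH-",
    "was": "*h2wes-",
    "were": "*h2wes-",
}


def forms_by_root(paradigm):
    """Group the forms of a paradigm by the root each one continues."""
    grouped = {}
    for form, root in paradigm.items():
        grouped.setdefault(root, []).append(form)
    return grouped


if __name__ == "__main__":
    # A paradigm is suppletive when its forms continue more than one root.
    for root, forms in forms_by_root(ENGLISH_BE).items():
        print(root, "->", ", ".join(forms))
```

Grouping the forms this way makes the suppletion visible at a glance: a single English verb draws on three separate roots even before the disputed forms are considered.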

*h1er-
This has been claimed as the origin of the Old Norse and later Scandinavian languages' present stem: Old Norse em, ert, er, erum, eruð, eru; the second person forms of which were borrowed into English as art and are. It has also been seen as the origin of the Latin imperfect (eram, eras, erat) and future tenses (ero, eris, erit).

However, other authorities link these forms with *h1es- and assume grammatischer Wechsel (/s/→/r/), although this is not normally found in the present stem. Donald Ringe argues that the copula was sometimes unaccented in Pre-Proto-Germanic, which would then have triggered voicing under Verner's law. He explains the Germanic first person singular form *immi in this way, deriving it from earlier *ezmi, since -zm-, but not -sm-, was assimilated to -mm- in Germanic (for which other evidence exists as well). Furthermore, the third person plural form *sindi (from PIE *h1sénti) shows that this word, too, was unaccented. If the accent had been preserved, it would have become *sinþi, but that form is not found in any Germanic language. In this view, it is likely that stressed and unstressed varieties of the copula (with corresponding voiceless and voiced fricatives) existed side by side in Germanic, and the involvement of a separate root *h1er- is unnecessary.

The Latin forms could be explained by rhotacism.
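The rhotacism explanation can be sketched in a few lines of Python. Latin rhotacism turned a single s between vowels into r; the sketch below applies that rule to un-rhotacized stand-in forms such as `esam` and `eso`, which are hypothetical inputs assumed here for illustration rather than attested spellings. The function name `rhotacize` and the vowel inventory are likewise assumptions of this sketch.

```python
import re

# Vowels treated as triggering environments (an assumption of this sketch).
VOWELS = "aeiouāēīōū"


def rhotacize(form: str) -> str:
    """Replace a single s standing between two vowels with r."""
    return re.sub(rf"(?<=[{VOWELS}])s(?=[{VOWELS}])", "r", form)


if __name__ == "__main__":
    # Hypothetical pre-rhotacism stand-ins for the imperfect and future forms.
    for pre_form in ["esam", "esas", "esat", "eso", "esis", "esit"]:
        print(pre_form, "->", rhotacize(pre_form))
```

Forms in which s is not flanked by vowels on both sides, such as est or esse, pass through unchanged, which is consistent with r appearing only in the imperfect and future.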

*steh2-
The root *(s)teh2- meant "to stand". From this root comes the present stem of the so-called "substantive verb" in Irish and Scottish Gaelic, tá and tha respectively, as well as taw in Welsh. On the absence of the initial s- in Celtic, see Indo-European s-mobile.

In Latin, stō, stare retained the meaning "to stand" until local forms of Vulgar Latin began to use it as a copula in certain circumstances. This usage survives today: several Romance languages (Galician-Portuguese, Spanish, Catalan) use it as one of their two copulae, and there is also a Romance tendency for a past participle derived from *steh2- to replace the original one of the copula (this occurs in French, Italian and the main dialects of Catalan). See also Romance copula.

Although in Dutch this verb retains its primary meaning of "stand", it is used in an auxiliary-like function that only has a secondary meaning of "standing", for example: ik sta te koken ("I am cooking", literally "I stand to cook"). While it is not a full copula (it can normally only be used as an auxiliary with another verb), it does have shades of meaning that resemble those of the Italian sto cucinando ("I am cooking"). The intransitive verbs zitten ("to sit"), liggen ("to lie") and lopen ("to walk/run") are used in similar ways.

In Swedish, which usually lacks gerund forms, the corresponding stå is often used similarly, along with sitta ("to sit"), ligga ("to lie") and gå ("to walk").

In Hindustani, the past tense forms of the copula honā "to be", namely «tʰā», «tʰe», «tʰī» and «tʰī̃», are derived from Sanskrit «stʰā». Gujarati has a cognate verb «tʰavũ» "to happen"; cf. also the Bengali aorist «tʰā-» (to stay).

Sanskrit
The Vedic Sanskrit root as (to be) is derived from the Indo-European root *h1es-.

bhū (to be) is derived from Indo-European *bhuH-.

Hindi-Urdu
In modern Hindi-Urdu (Hindustani), the Sanskrit verb अस् (as) (to be), which is derived from the Indo-European root *h1es-, has developed into the present indicative forms of the verb होना ہونا (honā) (to be). The infinitive होना ہونا (honā) itself is derived from the Sanskrit verb root भू (bʱū), which is derived from the Indo-European root *bhuH-. The indicative imperfect forms of होना ہونا (honā) come from Sanskrit स्थित (stʰita) "standing, situated", which is derived from the PIE root *steh₂- (“to stand”). होना ہونا (honā) is the only verb in Hindi-Urdu that has present indicative, imperfect indicative, presumptive mood and present subjunctive conjugations; all other verbs in Hindi-Urdu lack them.

The verb होना / ہونا (honā) can be translated as "to be", "to exist", "to happen" or "to have" depending on the context, and when used in the third person it can also be translated as "there is/are". Many verb conjugations in Hindi-Urdu are derived from participles and hence are marked for gender and number, and they agree with either the object or the subject of the sentence depending on the grammatical case of the subject. When the subject is in the ergative or the dative case (see: dative construction and quirky subject), the verb agrees in gender and number with the object of the sentence; when the subject is in the nominative case, it agrees with the subject.
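The agreement rule just described can be written down as a small decision function. The following Python sketch is purely illustrative: the function name `agreement_target` and the plain-string case labels are hypothetical, and the sketch covers only the three subject cases named in the text.

```python
def agreement_target(subject_case: str) -> str:
    """Return which argument a participle-based verb form agrees with.

    subject_case is one of "nominative", "ergative" or "dative",
    the three cases mentioned in the text.
    """
    if subject_case in {"ergative", "dative"}:
        # Ergative or dative subject: gender/number agreement is with the object.
        return "object"
    if subject_case == "nominative":
        # Nominative subject: agreement is with the subject itself.
        return "subject"
    raise ValueError(f"case not covered by this sketch: {subject_case!r}")


if __name__ == "__main__":
    for case in ("nominative", "ergative", "dative"):
        print(case, "->", agreement_target(case))
```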

Bengali
Bengali is considered a zero-copula language, although there are notable exceptions. In the simple present tense there is no verb connecting the subject to the predicative (the "zero verb" copula). When the predicate expresses ideas of existence, location, or possession, however, the verb আছ- (ach) is used, which can be roughly translated as "to exist" or "to be present".


 * In the past tense, the incomplete verb আছ- (ach) is always used as the copula, regardless of the nature of the predicative.
 * For the future tense and non-finite structures, the copula is supplied by the verb হওয়া (howa), with the exceptions being the possessive and locative predicatives for which the verb থাকা (thaka, "to remain") is utilized.
 * Bengali does not have a verb for possession (i.e. "to have", "to own"). Instead, possession is expressed by the verb আছ- (ach) (for the present and past tenses) and the verb থাকা (thaka) (for the future tense), inflected to agree with the possessed object, with the possessor in the genitive case.
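The choice of copula described above depends on two things, the tense and the kind of predicate, and can be summarised as a lookup. The Python sketch below follows the rules exactly as stated in this section; the function name `bengali_copula`, the string labels for tenses and predicate types, and the use of romanised verb names as return values are all assumptions made for illustration.

```python
def bengali_copula(tense: str, predicate: str):
    """Pick the copula verb for a tense and predicate type.

    tense:     "present", "past", "future" or "nonfinite"
    predicate: "equative", "existential", "locative" or "possessive"
    Returns a romanised verb name, or None for the zero copula.
    """
    if tense == "present":
        # Zero copula, except for existence, location and possession.
        if predicate in {"existential", "locative", "possessive"}:
            return "ach"   # আছ-
        return None
    if tense == "past":
        # ach is used regardless of the predicate.
        return "ach"       # আছ-
    if tense in {"future", "nonfinite"}:
        # howa by default; thaka for possessive and locative predicates.
        if predicate in {"possessive", "locative"}:
            return "thaka"  # থাকা
        return "howa"       # হওয়া
    raise ValueError(f"tense not covered by this sketch: {tense!r}")


if __name__ == "__main__":
    for tense in ("present", "past", "future"):
        for predicate in ("equative", "locative"):
            print(tense, predicate, "->", bengali_copula(tense, predicate))
```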

Bengali verbs are highly inflected and are regular, with only a few exceptions. They consist of a stem and an ending; they are traditionally listed in Bengali dictionaries in their "verbal noun" form, which is usually formed by adding -a to the stem: for instance, করা (kôra, to do) is formed from the stem কর (kôr). The stem can end in either a vowel or a consonant.

Nepali
The copula verb of Nepali has two sets of conjugations. The हो (ho) set is used in sentences that equate two things, like त्यो किताब हो (tyo kitāb ho, “That is a book.”) The छ (cha) set is used in sentences that describe something, or locate where something is, like त्यो ठूलो छ (tyo ṭhūlo cha, “That is big.”). Singular present tense forms of the copulas in Nepali are shown in the table below:

Persian
With regard to the function of the verb ‘to be’ as a copula, the most conspicuous feature of Modern Persian is the evolution of an existential be, hast (exists), out of ast (is). In fact, a study of the forms and functions of ‘to be’ reveals certain characteristics specific to Persian that are worth considering, even without taking into account the diachronic evolution of Modern Persian and its relation to ancient Iranian languages (such as Old Persian and Avestan), whose usage of the verb ‘to be’ seems closer to that of Sanskrit. Paradoxically, although Persian is apparently the only Indo-European language that has created an existential be out of the copula, it has simultaneously made extensive use of the latter to produce a general paradigm for conjugating all Persian verbs.

Historically speaking, like most Indo-European languages that use suppletive roots to denote ‘to be’, Persian integrates the Proto-Indo-European (PIE) verbs *h1es- (to be) and *bhuH- (to grow > to become > to be). Hence, while the Persian infinitive būdan (to be) < PIE *bhuH- forms the past stem of the verb (e.g. Persian būd- ‘was’) or acts as an auxiliary verb in the formation of the pluperfect of other verbs, its present tense is based solely on the derivatives of PIE *h1es-. It is, in fact, from the inflection of PIE *h1es- (to be) that six present stems have been created and assigned to the 1st, 2nd, and 3rd person singular and plural to act as the present-tense conjugation of Persian būdan (to be), as shown in the following table.

As an example, in the following sentences, the present forms of the verb 'to be' are used as copulas or predicates:

Furthermore, as endings added to verb stems, these inflectional forms have been grammaticalized into a general paradigm for the conjugation of all other verbs; the endings were once auxiliary verbs that evolved into enclitics. This generalized conjugational paradigm is also applied to the past tense of the verb būdan (shown in the table below).

What is linguistically notable, however, is the emergence of an existential be out of the copula, viz. hast (exists) out of ast (is). The evolution of this exceptional form might go back to ancient Iranian languages, where ast could have two variants (cf. Avestan, which has both as- and has- < PIE *h1es- ‘be’). In the next phase, what we may call a pseudo-verb appeared, viz. the verb hastan (to exist), which evolved analogically from hast (exists) and is conjugated like any other Persian verb (e.g. hast-am, literally *‘(I) am existence’ → ‘I exist’).

The simple past conjugation of the verb būdan (to be) is in fact formed by a double copula, in the sense that both the stem and the ending are copulas: the past stem of the verb, būd-, is derived from PIE *bhuH-, while the endings are from the suppletive forms of PIE *h1es- (to be), with the exception of the 3rd person singular, which has a zero ending for all Persian verbs in the past tense. The present perfect conjugation of the verb būdan (to be) is also a double-copula paradigm, as it is produced by adding the enclitic copulas to the past participle of the verb, būde (been).

The pseudo-verb hastan (to exist) has only a simple present tense; in addition, it is truly and purely existential only in the third person singular (hast). The verb is in fact the product of this very form, an "existential is", hast (he/she/it exists). For the other persons the conjugation has to use enclitic copulas. These copulas are, in turn, derived from the inflection of PIE *h1es- (to be), as if the predicative "to be" had been an auxiliary verb that turned into an enclitic to provide six endings for the 1st, 2nd and 3rd persons (singular and plural). However, as noted above, the 3rd person singular has no ending in the case of hastan. That is to say, the existential hast (exists), which is like the alter ego of the copula ast (is), takes no ending, while the present stems of all other verbs take an archaic ending -ad in the 3rd person singular.
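The "stem plus enclitic copula" pattern described above can be sketched as simple string concatenation. In the Python sketch below, the list `ENDINGS` uses a common romanisation of the six person endings (am, i, zero, im, id, and); these spellings are not given in the text and are an assumption of the sketch, as is the helper name `conjugate`.

```python
# Person endings in a common romanisation (assumed for this sketch).
# Order: 1sg, 2sg, 3sg (zero ending), 1pl, 2pl, 3pl.
ENDINGS = ["am", "i", "", "im", "id", "and"]


def conjugate(stem: str, endings=ENDINGS):
    """Attach each person ending to the stem, in paradigm order."""
    return [stem + ending for ending in endings]


if __name__ == "__main__":
    # Simple past of būdan: the stem būd- (< *bhuH-) plus endings (< *h1es-).
    print(conjugate("būd"))
    # Present of the pseudo-verb hastan: hast plus the same endings,
    # with the bare form hast in the 3rd person singular.
    print(conjugate("hast"))
```

Both paradigms share the zero ending in the 3rd person singular, which is why the bare forms būd and hast surface there.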

Greek
The Ancient Greek verb eimi (I am) is derived from the Indo-European root *h1es-.

Dual is not shown in the table.

The participles are based on the full-grade stem ἐσ- in Homeric, according to Smyth.

Italic languages
Except for Latin, the older Italic languages are very scarcely attested, but we have in Oscan set (they are), fiiet (they become), fufans (they have been) and fust (he will be), and in Umbrian sent (they are). This section will explain Latin, and the Romance languages that have evolved from it.

Esse and the forms beginning with (e)s- are from the root *h1es-, while the forms beginning with f- are from the root *bhuH-. For the forms beginning with er-, see *h1er- above. Stāre is derived from the root *steh2-.

In Spanish, Catalan, Galician-Portuguese and, to a lesser extent, Italian, there are two parallel paradigms: ser/èsser/essere from Latin esse "to be" on the one hand, and estar/stare from Latin stare "to stand" on the other.

In several modern Romance languages, the perfect is a compound tense formed with the past participle as in English, but the old Latin perfect survives as a commonly used preterite in Spanish and Portuguese, and as a literary "past historic" in French, Italian and Catalan.

There is a tendency for a past participle derived from stare (or more specifically its supine, statum) to replace that of the main copula derived from esse. For example, the French participle été comes from statum.

Germanic languages
The Proto-Germanic verb for 'to be', *wesaną, and its conjugations are derived from the Proto-Indo-European verb *h2wes- (‘stay overnight, camp’) and from the optative of *h1es-. Proto-Germanic retained the dual, but only in the first and second person.


 * Old English kept the verbs wesan and bēon separate throughout the present stem, though it is not clear that it made the kind of consistent distinction in usage that we find, for example, in Spanish. In the preterite, however, the paradigms fell together. Old English has no participle for this verb.
 * The plural forms in Modern Swedish (indicated in brackets) were in common use in formal written language until the mid-20th century, but are now no longer in use except in deliberately archaising texts. The preterite subjunctive is also increasingly being replaced by the indicative, or past participle.
 * Dutch, like English, has abandoned the original second-person singular forms, replacing them with the second-person plural forms. However, while in English the old forms are still in limited and deliberately archaic use, in Dutch they have disappeared entirely and are no longer known or used at all. The forms listed in the plural are the historical plural forms, the 'jij' and 'gij' forms. Dutch formed a new plural pronoun 'jullie' with inflection similar to the 1st and 3rd person plural, but it would be redundant to list them here.

Slavic languages

 * In Russian, the present forms are archaic and no longer in common use, except for the third person forms, which are used in "there is/are" type phrases.
 * In Ukrainian, the present tense forms of the verb "бути" have all but disappeared from contemporary language, except for the third person form, which is used in existential phrases; єсть (jesť) is archaic and encountered only in poetry. All participles have turned into other parts of speech: the future and past active participles have become the present and past active adverbial participles respectively, and the resultative participle has become the past tense of verbs.
 * In Serbo-Croatian the forms jesam, jesi, jeste and so on are used as the basic form of the Present Tense "to be" (i.e. I am, you are etc.), while the forms budem, budeš, bude etc. are used only for the formation of the Future Perfect.
 * In Bulgarian, the forms бъда, бъдеш, etc. are not used by themselves but only in compound forms (future ще бъда, subjunctive да бъда). In this respect they closely follow the usage (and non-usage) of perfective verbs. As such, the verb has its own forms for the aorist (бидох, биде, биде, бидохме, бидохте, бидоха), the imperfect (бъдех, бъдеше, бъдеше, бъдехме, бъдехте, бъдеха) and the resultative participle (бъдел). Another verb, бивам, with a fully regular conjugation type III paradigm, completes an aspect triple: imperfective съм, perfective бъда, secondary imperfective бивам. The perfective aorist has lost its original meaning and is now used only to form the compound conditional mood (бих чел = I would read). All participles except the resultative participle (бил) have lost their function and are now used as regular adjectives with changed meanings (същ = same, бивш = previous, ex-, бъдещ = future).
 * In Polish, the present forms, except for jest and są, have turned into suffixes (-m, -ś, -śmy, -ście) used primarily to construct the past tense and the conditional clitic. The modern conjugation comes from attaching these suffixes onto the third person singular form jest.

Baltic languages
In Lithuanian, the paradigm būnu, būni, būna, etc. is not considered archaic or dialectal but rather a special use of the verb būti, to be, mostly used to describe repeated actions or states, or habits.

Celtic languages
In the Celtic languages there is a distinction between the so-called substantive verb, used when the predicate is an adjective phrase or prepositional phrase, and the so-called copula, used when the predicate is a noun.

The conjugation of the Old Irish and Middle Welsh verbs is as follows:

The forms of the Old Irish present tense of the substantive verb, as well as Welsh taw, come from the PIE root *stā-. The other forms are from the roots *es- and *bhū-. Welsh mae originally meant "here is" (cf. yma 'here').

Irish and Scottish Gaelic
In modern Gaelic, person inflections have almost disappeared, but the negative and interrogative are marked by distinctive forms. In Irish, particularly in the south, person inflections are still very common for the tá/bhí series.

The verb bí
† archaic forms

Gaelic (bh)eil and Irish (bh)fuil are from Old Irish fuil, originally an imperative meaning "see!" (PIE root *wel-, also in Welsh gweled, Germanic wlitu- "appearance", and Latin voltus "face"), then coming to mean "here is" (cf. French voici < vois ci and voilà < vois là), later becoming a suppletive dependent form of at-tá. Gaelic robh and Modern Irish raibh are from the perfective particle ro (ry in Welsh) plus ba (lenited after ro).

Modern Welsh
The present tense in particular shows a split between the North and the South. Though the situation is undoubtedly more complicated, King (2003) notes the following variations in the present tense as spoken (not as written according to the standard orthography):


{| class="wikitable"
|- style="background: #efefef;"
! colspan="2" rowspan="2" |
! colspan="2" scope="col" style="font-weight: normal;" | Affirmative (I am)
! colspan="2" scope="col" style="font-weight: normal;" | Interrogative (Am I?)
! colspan="2" scope="col" style="font-weight: normal;" | Negative (I am not)
|- style="background: #efefef;"
! scope="col" | Singular
! scope="col" | Plural
! scope="col" | Singular
! scope="col" | Plural
! scope="col" | Singular
! scope="col" | Plural
|-
! rowspan="3" scope="row" style="background: #efefef;" | North
! scope="row" style="background: #efefef;" | First person
| dw || dan || ydw? || ydan? || (dy)dw || (dy)dan
|-
! scope="row" style="background: #efefef;" | Second person
| &mdash;, (r)wyt || dach || wyt? || (y)dach? || dwyt || (dy)dach
|-
! scope="row" style="background: #efefef;" | Third person
| mae || maen || ydy? || ydyn? || dydy || dydyn
|-
! rowspan="3" scope="row" style="background: #efefef;" | South
! scope="row" style="background: #efefef;" | First person
| rw, w || ŷn, &mdash; || ydw? || ŷn? || (d)w || ŷn
|-
! scope="row" style="background: #efefef;" | Second person
| &mdash;, (r)wyt || ych || wyt? || ych? || &mdash; || (ych)
|-
! scope="row" style="background: #efefef;" | Third person
| mae || maen || ydy?, yw? || ŷn? || dyw || ŷn
|}

For example, the spoken first person singular dw i'n is a contraction of the formal written yr ydwyf fi yn. The Welsh f /v/ is the fricative analogue of the nasal /m/, the PIE suffix consonant for the first person singular.


{| class="wikitable"
|- style="background: #efefef;"
! colspan="2" rowspan="2" |
! colspan="2" scope="col" style="font-weight: normal;" | Affirmative (I am)
! colspan="2" scope="col" style="font-weight: normal;" | Interrogative (Am I?)
! colspan="2" scope="col" style="font-weight: normal;" | Negative (I am not)
|- style="background: #efefef;"
! scope="col" | Singular
! scope="col" | Plural
! scope="col" | Singular
! scope="col" | Plural
! scope="col" | Singular
! scope="col" | Plural
|-
! rowspan="3" scope="row" style="background: #efefef;" | Preterite
! scope="row" style="background: #efefef;" | First person
| bues || buon || fues? || fuon? || fues || fuon
|-
! scope="row" style="background: #efefef;" | Second person
| buest || buoch || fuest? || fuoch? || fuest || fuoch
|-
! scope="row" style="background: #efefef;" | Third person
| buodd || buon || fuodd? || fuon? || fuodd || fuon
|-
! rowspan="3" scope="row" style="background: #efefef;" | Imperfect
! scope="row" style="background: #efefef;" | First person
| roeddwn || roedden || oeddwn? || oedden? || doeddwn || doedden
|-
! scope="row" style="background: #efefef;" | Second person
| roeddet || roeddech || oeddet? || oeddech? || doeddet || doeddech
|-
! scope="row" style="background: #efefef;" | Third person
| roedd || roeddyn || oedd? || oeddyn? || doedd || doeddyn
|-
! rowspan="3" scope="row" style="background: #efefef;" | Future
! scope="row" style="background: #efefef;" | First person
| bydda || byddwn || fydda? || fyddwn? || fydda || fyddwn
|-
! scope="row" style="background: #efefef;" | Second person
| byddi || byddwch || fyddi? || fyddwch? || fyddi || fyddwch
|-
! scope="row" style="background: #efefef;" | Third person
| bydd || byddan || fydd? || fyddan? || fydd || fyddan
|}

Bod also has a conditional, for which there are two stems. The bas- stem is more common in the North, and the bydd- stem is more common in the South:
{| class="wikitable"
|- style="background: #efefef;"
! colspan="2" rowspan="2" |
! colspan="2" scope="col" | Affirmative
! colspan="2" scope="col" | Interrogative
! colspan="2" scope="col" | Negative
|- style="background: #efefef;"
! scope="col" | Singular
! scope="col" | Plural
! scope="col" | Singular
! scope="col" | Plural
! scope="col" | Singular
! scope="col" | Plural
|-
! rowspan="3" scope="row" style="background: #efefef;" | bydd-
! scope="row" style="background: #efefef;" | First person
| byddwn || bydden || fyddwn? || fydden? || fyddwn || fydden
|-
! scope="row" style="background: #efefef;" | Second person
| byddet || byddech || fyddet? || fyddech? || fyddet || fyddech
|-
! scope="row" style="background: #efefef;" | Third person
| byddai || bydden || fyddai? || fydden? || fyddai || fydden
|-
! rowspan="3" scope="row" style="background: #efefef;" | bas-
! scope="row" style="background: #efefef;" | First person
| baswn || basen || faswn? || fasen? || faswn || fasen
|-
! scope="row" style="background: #efefef;" | Second person
| baset || basech || faset? || fasech? || faset || fasech
|-
! scope="row" style="background: #efefef;" | Third person
| basai || basen || fasai? || fasen? || fasai || fasen
|}

Hittite
The Hittite verb "to be" is derived from the Indo-European root *h1es-.

Armenian
The Classical Armenian present tense derives from PIE *h1es- (cf. sg. em, es, ē; 3rd pl. en).

Albanian
The Albanian copula shows two distinct roots. The present jam ‘I am’ is an athematic root stem built from PIE *h1es-. The imperfect continues the PIE imperfect of the same root but was rebuilt based on the 3rd person singular and plural. The preterite, on the other hand, comes from the thematic aorist of PIE *kʷel- ‘turn’ (cf. Ancient Greek épleto ‘he turned’, Armenian eɫew ‘he became’, Old Irish cloïd ‘turns back, defeats’). Analogical or otherwise indirect reflexes are italicized below.