Substrate in Romanian



The proposed substratal elements in Romanian are mostly lexical items. To determine whether a word belongs to the substratum, it is first compared with Latin and with the languages with which Romanian came into contact, and checked as a possible internal formation. If none of these yields a match, the word is compared with Albanian vocabulary, Thracian remnants or reconstructed Proto-Indo-European words.

In addition to vocabulary, some other features of Eastern Romance, such as phonological features and elements of grammar (see Balkan sprachbund), may also derive from Paleo-Balkan languages.

Romanian developed from the Common Romanian language, which in turn developed from Vulgar Latin. According to a widely accepted theory, the territory where the language formed was a large one, lying both north and south of the Danube (encompassing the regions of Dacia, Moesia, and possibly Illyria), more precisely to the north of the Jireček Line. Other scholars place the origin of the Romanian language in the Balkan Peninsula, strictly south of the Danube. The Cambridge History of the Romance Languages, published in 2013, concluded that the "historical, archaeological and linguistic data available do not seem adequate" to determine the territory where the development of the Romanian language began.

Lexical items
The study of the substrate involves comparative methods applied to:
 * 1) Albanian and its reconstructed ancient precursor – Proto-Albanian – an Indo-European language and the only surviving representative of the Albanoid branch, belonging to the Paleo-Balkan group of antiquity. Albanian varieties are today spoken by approximately 6 million people in the Balkans, primarily in Albania, Kosovo, North Macedonia, Serbia, Montenegro and Greece. Albanian, especially the Tosk dialect, also represents one of the core languages of the Balkan Sprachbund.
 * 2) Thraco-Dacian or Thracian, a language that, although almost unattested, has left traces in toponymy and inscriptions.
 * 3) Proto-Indo-European, if none of the other languages yielded any results.
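
The order of comparisons described above can be read as a simple decision procedure. The Python sketch below is only an illustration under stated assumptions: the function name `classify_etymology`, its boolean evidence flags and the returned labels are all hypothetical, introduced here for clarity. Actual etymological work weighs phonological and semantic evidence and does not reduce to yes/no checks.

```python
def classify_etymology(*, latin_etymon=False, contact_loan=False,
                       internal_formation=False, albanian_parallel=False,
                       thracian_trace=False, pie_root=False) -> str:
    """Return a rough label for a Romanian word (hypothetical helper).

    Each flag is an assumed, pre-judged summary of whether a plausible
    match was found in that comparison.
    """
    # Ordinary explanations are excluded first.
    if latin_etymon:
        return "inherited from Latin"
    if contact_loan:
        return "loan from a contact language"
    if internal_formation:
        return "internal formation"

    # Only then are the substratum comparisons tried, in the listed order.
    if albanian_parallel:
        return "possible substratum (Albanian parallel)"
    if thracian_trace:
        return "possible substratum (Thraco-Dacian trace)"
    if pie_root:
        return "possible substratum (Proto-Indo-European comparison)"
    return "unknown origin"


# Illustrative call: a word with no Latin or loan explanation
# but with an Albanian parallel.
print(classify_etymology(albanian_parallel=True))
```

A Latin etymon takes precedence even when an Albanian parallel also exists, which mirrors the ordering in the text: the substratum comparisons are a fallback, not an alternative tried in parallel.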

Comparative methods applied to Albanian
In general, words assumed to belong to the substratum can be placed into two categories:

those related to nature and the natural world, and those used in pastoral life, for:
 * terrain: ciucă, groapă, mal, măgură, noian;
 * bodies of water: bâlc, pârâu;
 * flora: brusture, bung(et), ciump, coacăză, copac, curpen, druete, leurdă, ghimpe, mazăre, mărar, mugure, sâmbure, spânz, strugure, țeapă;
 * fauna: balaur, bală, baligă, barză, brad, călbează, căpușă, cioară, cioc, ciut, ghionoaie, măgar, mânz(at), murg, mușcoi, năpârcă, pupăză, rață, strepede, șopârlă, știră, țap, viezure, vizuină;
 * food: abur, brânză, fărâmă, grunz, sarbăd, scrum, urdă, zară;
 * clothing: bască, brâu, căciulă, sarică;
 * housing: argea, cătun, gard, vatră;
 * body (some initially used for livestock): buză, ceafă, ciuf, grumaz, gușă, rânză;
 * related activities: baci, bâr, buc, grapă, gresie, lete, strungă, țarc, zgardă.

Other words from the substratum are: bucur(ie), ciupi, copil, cursă, fluier, droaie, gata, ghiuj, jumătate, mare (adj), moș, scăpăra. Words that are possibly from the substratum, but on which linguists do not generally agree, are: arichiță, băiat, băl, brâncă, orbalț, borț, bulz, burduf, burtă, codru, Crăciun, creț, cruța, curma, daltă, dărâma, fluture, lai, mătură, mire, negură, păstaie, scorbură, spuză, stăpân, sterp, stână, traistă.

Comparative methods applied to Thraco-Dacian and/or other Indo-European languages
The comparative method can be extended to other languages of the Indo-European family, including ones from which Romanian could not have borrowed directly or indirectly, in order to reconstruct Thraco-Dacian substratum words. This yields results with varying degrees of probability. Between 80 and 100 words belong to this category.

Substratum words like mal (1. shore, bank; 2. ravine; regionally, a raised portion of land smaller than a hill and with abrupt sides) have almost identical correspondents in Albanian mal (mountain), but they can also be related to toponyms like Dacia Maluensis, later renamed by the Romans to Dacia Ripensis (rīpa, meaning bank or shore, has been inherited in Romanian as râpă, the abrupt side of a hill).

The names of all rivers longer than 500 km, and half of the names of rivers between 200 and 500 km long, derive from the pre-Latin substratum, according to linguist and philologist Oliviu Felecan. Similarly, linguist Grigore Brâncuș states that almost the entire major hydronymy has been transmitted from Dacian to Romanian. Other linguists have pointed out that the present Romanian forms of these hydronyms indicate that they were borrowed from Slavs or Hungarians.



Phonetic, morphological and syntactic features
A couple of phonetic changes are generally agreed to reflect substratum influence:

 * the post-alveolar fricative ș (/ʃ/) comes from the voiceless fricative s in a soft position, for example Lat. serpens > Rom. șarpe.
 * rhotacism of the consonant n, seen only marginally in Romanian, is a general rule for lexical items of Istro-Romanian and Tosk Albanian that predate contact with Slavic languages (before c. 600 CE).

Several others have been attributed to substratum influence by some researchers, but there is no general consensus among scholars. One example is the development of the vowel "ă". Linguists Alexandru Philippide and Grigore Brâncuș regard the spontaneous evolution of unstressed "a" (as in Lat. camisia > Rom. cămașă) and of stressed "a" before /n/ or before a consonant cluster beginning with /m/ as substratum influence in Romanian, noting that the vowel is also found in Bulgarian and Albanian. Linguist Marius Sala points out that these changes can also be seen as a tendency of the spoken language to differentiate between forms of a paradigm, comparable to the development of similar central vowels in Portuguese or Neapolitan.

Likewise, the morphological and syntactical features attributed to substratum, identified by comparison to Albanian and other languages of the Balkan sprachbund, are subject to scholarly debate since the grammatical structure of Thraco-Dacian is unattested.

A difficult research topic
Numerous linguistic studies and research papers discuss the problems of the substrate in Romanian, which some consider the most controversial and difficult part of the study of the Romanian language, since its nature and development could explain the evolution of Latin into Romanian.

Some linguists (including Sorin Olteanu, Sorin Paliga and Ivan Duridanov) propose that a number of words presented in the standard literature as borrowings from a Slavic language or from Hungarian may actually have developed from reconstructed (not attested) words of local Indo-European languages, and were then borrowed from Romanian by the neighboring languages. Though the substratum status of many Romanian words is not much disputed, their status as Dacian words is controversial, some more than others, since there are no significant surviving written examples of the Dacian language. Many of the possible pre-Roman lexical items of Romanian have Albanian parallels. If they are in fact substratum words cognate with the Albanian ones, and not loanwords from Albanian, this indicates that the substrate language of Romanian may have been on the same Indo-European branch as Albanian.

Other languages
The Bulgarian Thracologist Vladimir Georgiev developed the theory that the Romanian language has a "Daco-Moesian" language as its substrate, a hypothesized language that, according to him, had a number of features which distinguished it from the Thracian language spoken further south, across the Haemus range.

There are also some Romanian substratum words in languages other than Romanian, which they entered via Romanian dialects. For example, bryndza is a type of cheese made in Eastern Austria, Poland, the Czech Republic (Moravian Wallachia), Slovakia and Ukraine, the name being derived from the Romanian word for cheese (brânză).