Origin of the Albanians

The origin of the Albanians has been the subject of historical, linguistic, archaeological and genetic studies. The first mention of the ethnonym Albanoi was made in the 2nd century AD by Ptolemy, describing an Illyrian tribe that lived around present-day central Albania. The first certain attestation of medieval Albanians as an ethnic group is in the 11th century, when they continuously appear in Byzantine sources.

Albanians have a western Paleo-Balkan origin. Besides the Illyrians, theories regarding which specific ancient Paleo-Balkan group participated in the origin of the Albanians vary between attributing a Thracian, a Dacian, or another Paleo-Balkan component whose language is unattested. Among those scholars who support an exclusively Illyrian origin, there is a distinction between those who propose a direct continuity from Illyrian times, and those who propose an in-migration of a different Illyrian population. However, these propositions are not mutually exclusive. The Albanians are also among the European ethnic groups with the highest number of common ancestors within their own group, even though they share ancestors with other ethnic groups.

Albanian is an Indo-European language and the only surviving representative of its own branch, which belongs to the Paleo-Balkan group, having its formative core in the Balkans after the Indo-European migrations in the region. Early Proto-Albanian speakers came into contact with Doric Greek (West Greek) from the 7th century BCE, and with Ancient Macedonian during the 5th–4th centuries BCE. Thereafter they also had contacts with Koine Greek. Proto-Albanian speakers came into contact with Latin after the Roman conquest of the Western Balkans in the 2nd century BCE, but the major Latin influence on Proto-Albanian occurred from the first years of the Common Era onwards, when the Western Balkans were eventually incorporated into the Roman Empire after the Great Illyrian Revolt (6–9 CE). Latin loanwords were borrowed through the entire period of spoken Latin in the Western Balkans, reflecting different chronological layers and penetrating into almost all semantic fields. Proto-Albanian speakers were Christianized under the Latin sphere of influence, specifically in the 4th century CE.

All aspects of Albanian tribal society have been directed by the Albanian traditional law code, which is of interest to Indo-European studies as it reflects many legal practices of great antiquity that find precise echoes in Vedic India and ancient Greece and Rome. The surviving pre-Christian elements of Albanian culture indicate that Albanian mythology and folklore are of pagan Paleo-Balkanic origin.

Arbënesh
The two ethnonyms used by Albanians to refer to themselves are Arbënesh(ë)/Arbëresh(ë) and Shqiptar(ë). Arbënesh is the original Albanian endonym and forms the basis for most names of Albanians in foreign languages and the name of Albania as a country. Greek Arvanitai, Alvanitai and Alvanoi, Turkish Arnaut, Serbo-Croatian Arbanasi and others derive from this term. The ethnic name Albanian was used by Latin and Byzantine sources in the forms arb- and alb- since at least the 2nd century AD, and eventually in Old Albanian texts as an endonym. The ancient attestation of the ethnic designation is not considered strong evidence of an Albanian continuity in southern Illyria, since there are many examples in history of an ethnic name shifting from one ethnos to another. Nevertheless, the ancient ethnonym gave rise to the old Albanian endonym, which was generalized early to all the tribes of Illyria who spoke the same idiom. The process was similar to the spread of the name Illyrians from a small group of people on the Adriatic coast, the Illyrioi.

Albanians gradually replaced their old endonym with the term Shqiptar, a change most likely triggered after the Ottoman conquests of the Balkans in the 15th century. The words Shqipëri and Shqiptar are attested from the 14th century onward, but it was only at the end of the 17th and the beginning of the 18th centuries that the placename Shqipëria and the ethnic demonym Shqiptarë gradually replaced Arbëria and Arbëreshë amongst Albanian speakers. The usage of the old endonym Arbënesh/Arbëresh, however, persisted and was retained by Albanian communities which had migrated from Albania and adjacent areas centuries before the change of the self-designation, namely the Arbëreshë of Italy, the Arvanites of Greece as well as the Arbanasi in Croatia. As such, the medieval migrants to Greece and later migrants to Italy during the 15th century are not aware of the term Shqiptar.

References to Albania

 * In the 2nd century BC, the History of the World written by Polybius mentions a location named Arbona (Ἄρβωνα; Latinised form: Arbo), to which some Illyrian troops under Queen Teuta scattered and fled in order to escape the Romans. Arbona was perhaps an island in Liburnia or another location within Illyria.
 * The names Albanoi and Albanopolis have been attested in ancient funeral inscriptions in present-day North Macedonia. The toponym Albanopolis has been found on a funeral inscription in Gorno Sonje, near the city of Skopje (ancient Scupi), present-day North Macedonia. It was excavated in 1931 by Nikola Vulić and its text was curated and published in 1982 by Borka Dragojević-Josifovska. The inscription in Latin reads "POSIS MESTYLU F[ILIUS] FL[AVIA] DELVS MVCATI F[ILIA] DOM[O] ALBANOP[OLI] IPSA DELVS" ("Posis Mestylu, son of Flavia Delus, daughter of Mucat, who comes from Albanopolis"). It dates to the end of the 1st century AD and the beginning of the 2nd century AD.
 * In the 2nd century AD, Ptolemy, the geographer and astronomer from Alexandria, drafted a map that shows the city of Albanopolis (located northeast of Durrës) in the Roman province of Macedonia and the tribe of Albanoi, who were viewed as Illyrians by later historians.
 * The ethnonym Albanos was found on a funeral inscription from ancient Stobi in present-day North Macedonia, near Gradsko about 90 km to the southeast of Gorno Sonje. The inscription in ancient Greek reads "ΦΛ(ΑΒΙΩ) ΑΛΒΑΝΩ ΤΩ ΤΕΚΝΩ ΑΙΜΙΛΙΑΝΟΣ ΑΛΒΑΝΟ(Σ) ΜΝΗΜ(Η)Σ [ΧΑΡΗΝ]" ("In memory of Flavios Albanos, his son Aemilianos Albanos"). It dates to the 2nd/3rd century AD.
 * In the 6th century AD, Stephanus of Byzantium, in his important geographical dictionary entitled Ethnica (Ἐθνικά), mentions a city in Illyria called Arbon (Ἀρβών), and gives an ethnic name for its inhabitants, in two singular number forms, i.e. Arbonios (Ἀρβώνιος; pl. Ἀρβώνιοι Arbonioi) and Arbonites (Ἀρβωνίτης; pl. Ἀρβωνῖται Arbonitai). He cites Polybius (as he does many other times in Ethnica).

References to the Albanians in medieval sources
Michael Attaleiates (1022–1080) mentions the term Albanoi twice and the term Arbanitai once. The term Albanoi is used first to describe the groups which rebelled in southern Italy and Sicily against the Byzantines in 1038–40. The second use of the term Albanoi is related to groups which supported the revolt of George Maniakes in 1042 and marched with him throughout the Balkans against the Byzantine capital, Constantinople. The term Arbanitai is used to describe a revolt of Bulgarians (Boulgaroi) and Arbanitai in the theme of Dyrrhachium in 1078–79. It is generally accepted that Arbanitai refers to the ethnonym of medieval Albanians. As such, it is considered to be the first attestation of Albanians as an ethnic group in Byzantine historiography. The use of the term Albanoi in 1038–40 and 1042 as an ethnonym related to Albanians has been a subject of debate. In what has been termed the "Ducellier-Vrannousi" debate, Alain Ducellier proposed that both uses of the term referred to medieval Albanians. Era Vrannousi counter-suggested that the first use referred to Normans, while the second did not necessarily have an ethnic connotation and could be a reference to the Normans as "foreigners" (aubain) in Epirus, which Maniakes and his army traversed. The debate has never been resolved. A newer synthesis about the second use of the term Albanoi by Pëllumb Xhufi suggests that the term Albanoi may have referred to Albanians of the specific district of Arbanon, while Arbanitai referred to Albanians in general regardless of the specific region they inhabited.


 * The Arbanasi people are recorded as being 'half-believers' and speaking their own language in a Bulgarian text found in a Serbian manuscript dating to 1628; the text was written by an anonymous author and, according to Radoslav Grujić (1934), dates to the reign of Samuel of Bulgaria (997–1014), or possibly, according to R. Elsie, to 1000–1018.
 * In History, written in 1079–1080, Byzantine historian Michael Attaleiates referred to the Albanoi as having taken part in a revolt against Constantinople in 1043 and to the Arbanitai as subjects of the duke of Dyrrhachium. It is disputed, however, whether the "Albanoi" of the events of 1043 refers to Albanians in an ethnic sense or whether "Albanoi" is a reference to people from southern Italy under an archaic name (there was also a tribe of Italy by the name of Albani). However, a later reference to Albanians by the same Attaleiates, regarding the participation of Albanians in a rebellion in 1078, is undisputed. That rebellion was led by Nikephoros Basilakes, doux of Dyrrhachium.
 * Some authors (like Alain Ducellier, 1968) believe that Arvanoi are mentioned in Book IV of the Alexiad by Anna Comnena (c. 1148). Others believe that this is a wrong reading and interpretation of the Greek phrase ἐξ Ἀρβάνων (i.e. ‘from Arvana’) found in the original manuscript and in one edition (Bonn, 1839) of the Alexiad.
 * The earliest Serbian source mentioning "Albania" (Ar'banas') is a charter by Stefan Nemanja, dated 1198, which lists the region of Pilot (Pulatum) among the parts Nemanja conquered from Albania (ѡд Арьбанась Пилоть, "de Albania Pulatum").
 * In the 12th to 13th centuries, Byzantine writers used the name Arbanon (Ἄρβανον) for a principality in the region of Kruja.
 * The oldest reference to Albanians in Epirus is from a Venetian document dating to 1210, which states that “the continent facing the island of Corfu is inhabited by Albanians”.
 * A Ragusan document dating to 1285 states: "I heard a voice crying in the mountains in Albanian" (Audivi unam vocem clamantem in monte in lingua albanesca).

Pre-Indo-European linguistic substratum
Pre-Indo-European sites are found throughout the territory of Albania, such as in Maliq, Vashtëm, Burimas, Barç, Dërsnik in Korçë District, Kamnik in Kolonja, Kolsh in Kukës District, Rashtan in Librazhd and Nezir in Mat District. As in other parts of Europe, migratory Indo-European tribes entered the Balkans and contributed to the formation of the historical Paleo-Balkan tribes, to which Albanians trace their origin. The previous populations, during the process of assimilation by the immigrating IE tribes, played an important part in the formation of the various ethnic groups generated by their long symbiosis. Consequently, the IE languages that developed in the Balkan Peninsula, in addition to their natural evolution, have also been affected by the idioms of the assimilated pre-Indo-European people. In terms of linguistics, the pre-Indo-European substrate language spoken in the southern Balkans has probably influenced pre-Proto-Albanian, the ancestor idiom of Albanian. The extent of this linguistic impact cannot be determined with precision due to the uncertain position of Albanian among Paleo-Balkan languages and their scarce attestation. Some loanwords, however, have been proposed, such as shegë 'pomegranate' and lëpjetë 'orach'; compare the Pre-Greek term for 'monk's rhubarb'. Albanian is also the only language in the Balkans which has retained elements of the vigesimal numeral system (njëzet 'twenty', dyzet 'forty'), which was prevalent in the pre-Indo-European languages of Europe, such as Basque, which broadly uses vigesimal numeration.

This pre-Indo-European substratum has also been identified as one of the contributing factors to the customs of Albanians.

Attestation
The first attested mention of Albanian occurred in 1285 at the Venetian city of Ragusa (present-day Dubrovnik, Croatia) when a crime witness named Matthew testified: "I heard a voice crying in the mountains in Albanian".

The earliest attested written specimens of Albanian are Formula e pagëzimit (1462) and Arnold Ritter von Harff's lexicon (1496). The first Albanian text written with Greek letters is a fragment of the Ungjilli i Pashkëve (Passover Gospel) from the 15th or 16th century. The first printed books in Albanian are Meshari (1555) and Luca Matranga's E mbsuame e krështerë (1592).

However, as Fortson notes, Albanian written works existed before this point; they have simply been lost. The existence of written Albanian is explicitly mentioned in a letter attested from 1332, and the first preserved books, including both those in Gheg and in Tosk, share orthographic features that indicate that some form of common literary language had developed.

Toponymy
In the Balkans and southern Italy, several toponyms and river and mountain names which have been attested since antiquity can be explained etymologically via Albanian, or have evolved phonologically through Albanian and were later adopted in other languages. Inherited toponyms from a Proto-Albanian language and the date of adoption of non-Albanian toponyms indicate in Albanology the regions where the Albanian language originated, evolved and expanded. Depending on which proposed etymology and phonological development linguists support, different etymologies are usually used to link Albanian to Illyrian, Messapic, Dardanian, Thracian or an unattested Paleo-Balkan language.
 * Brindisi is a town in southern Italy. Brundisium was originally a settlement of the Iapygian Messapians, descendants of an Illyrian people who migrated from the Balkans to Italy during the Late Bronze/Early Iron Age transition. The name highlights the ties between Messapic and Albanian, as Messapic brendo (stag) is linked to Old Gheg bri (horns).
 * Bunë is a river in northwestern Albania, near the cities of Shkodër and Ulcinj (Ulqin). The majority of scholars consider it a directly inherited hydronym from Illyrian Barbanna. A less accepted proposition by Eqrem Çabej considers it an unrelated name which derives from buenë (overflow of waters). The hydronym Bunë, via which Slavic Bojana emerged, is often seen as an indication that Albanian was spoken in the pre-Slavic era in southern Montenegro.
 * Drin is a river in northern Albania, Kosovo and North Macedonia. Similar hydronyms include Drino in southern Albania and Drina in Bosnia. It is generally considered to be of Illyrian origin.
 * Durrës is a city in central Albania. It was founded as an ancient Greek colony and greatly expanded in Roman times. It was known as Epidamnos and Dyrrhachion/Dyrrhachium. Dyrrhachium is of Greek origin and refers to the position of the city on a rocky shore. The modern names of the city in Albanian (Durrës) and Italian (Durazzo) are derived from Dyrrhachium/Dyrrhachion. An intermediate, palatalized antecedent is found in the form Dyrratio, attested in the early centuries AD. The palatalized /-tio/ ending probably represents a phonetic change in the way the inhabitants of the city pronounced its name. The preservation of old Doric /u/ indicates that the modern name derives from populations to whom the toponym was known in its original Doric pronunciation. The initial stress in Albanian Durrës presupposes an Illyrian accentuation on the first syllable. Theories which support local Illyrian-Albanian continuity interpret Durrës < Dyrratio as evidence that Albanian-speakers continuously lived in coastal central Albania. Other theories propose that the toponym does not necessarily show continuity but can equally be the evolution of a loanword acquired by a Proto-Albanian population which moved into the city and its area in late antiquity from northern Albanian regions.
 * Epidamnos is the oldest known name of Durrës and it is the first name under which the ancient Greek Corinthian colony was known. It is widely considered to be of Illyrian origin, as first proposed by linguist Hans Krahe, and is attested in Thucydides (5th century BC), Aristotle (4th century BC), and Polybius (2nd century BC). Etymologically, Epidamnos may be related to Proto-Albanian *dami (cub, young animal, young bull) > dem (modern Albanian) as proposed by linguist Eqrem Çabej.
 * Erzen is a river in central Albania. It derives from Illyrian Ardaxanos (*daksa "water", "sea") found in Daksa and the name of the Dassareti tribe.
 * Ishëm is a river in central Albania. It is recorded as Illyrian Isamnus in antiquity. Albanian Ishëm derives directly from Isamnus and indicates that its ancestral language was spoken in the area.
 * Mat is a river in northern Albania. It is generally considered to be of Illyrian origin and originally meant "river bank, shore". It evolved within Albanian as an inherited term from its ancestral language, which indicates that this language was spoken in the Mat river valley. A similar hydronym, Matlumë, is found in Kaçanik.
 * Nish (Niš) is a city in southeastern Serbia. It evolved from a toponym attested in Ancient Greek as ΝΑΙΣΣΟΣ (Naissos), which achieved its present form via phonetic changes in Proto-Albanian and thereafter entered Slavic. Nish might indicate that Proto-Albanians lived in the region in pre-Slavic times. When this settlement happened is a matter of debate: Proto-Albanians might have moved into the area relatively late in antiquity, in what might have been an eastern expansion of Proto-Albanian settlement, as no other toponyms known in antiquity in the area presuppose an Albanian development. The development of Nish < Naiss- may also represent a regional development in the late-antique Balkans which, while related, may not be identical with Albanian.
 * Vjosë is a river in southern Albania and northern Greece. In antiquity, it formed part of the boundary between Illyrian and Epirotic Greek languages. In the early Middle Ages, the Vjosa (in Greek, Aoos or Vovousa) river valley was settled by Slavic peoples. A gradual evolution within Albanian and a borrowing by Slavic-speakers or a borrowing from Slavic *Vojusha into Albanian have been proposed for Albanian Vjosë. Both propositions are disputed. Regardless of the etymology, the Vjosë valley is an area of Albanian-Slavic linguistic contact from the 6th-7th century onwards.
 * Vlorë is a city in southwestern Albania. It was founded as the ancient Greek colony Aulona (/Avlon/) in the pre-Roman era. Albanian Vlorë is a direct derivation from ancient Greek Aulon. A proposed Slavic intermediation from *Vavlona has been rejected as it does not conform to Albanian phonological development. The toponym has two forms, Vlorë (Tosk) and Vlonë (Gheg), which indicates that it was already in use among the population of Northern Albania before the appearance of rhotacism in Tosk.
 * Shkodër is a city in northwestern Albania. It is one of the most significant settlements in Albania and in the pre-Roman era it was the capital of the Illyrian kingdom of Genthius. In late antiquity Scodra was a Romanized city, which even relatively late in the Middle Ages had a native Dalmatian-speaking population which called it Skudra. Slavic Skadar is a borrowing from the Romance name. Whether Albanian Shkodër/Shkodra is a direct development of Illyrian Scodra or the development of a Latin loanword in Proto-Albanian is a subject of debate. In theories which reject a direct derivation from Scodra, the possible break in linguistic continuity from the Illyrian form is invoked as an indication that Albanian was not spoken continuously in Shkodra and the surrounding area from pre-Roman times to late antiquity.
 * Shkumbin is a river in central Albania. It derives from Latin Scampinus which replaced Illyrian Genusus, as recorded in Latin and ancient Greek literature. A Slavic intermediation has been rejected. Its inclusion in Latin loanwords into Proto-Albanian and phonetic evolution coincides with the historical existence of a large Roman town (near present-day Elbasan) which gave the river its new name.
 * Shtip (Štip) is a city in eastern North Macedonia. It was known in antiquity as Astibo-s. It is generally acknowledged that Slavic Štip was acquired via Albanian Shtip. Arguments similar to those in the case of Nish have emerged about the date of settlement of Proto-Albanians in eastern Macedonia.

Linguistic reconstruction
Albanian is attested in a written form beginning only in the 15th century AD. In the absence of prior data on the language, scholars have used Albanian linguistic contacts with Ancient Greek, Latin and Slavic to identify its historical location. The precursor of Albanian can be considered a completely formed independent IE language since at least the first millennium BCE, with the beginning of the early Proto-Albanian phase. Proto-Albanian is reconstructed by way of the comparative method between the Tosk and Gheg dialects and between Albanian and other Indo-European languages, as well as through contact linguistics studying early loanwords from and into Albanian and structural and phonological convergences with other languages. Loanwords into Albanian, treated through its phonetic evolution, can be traced back as early as the first contacts with Doric Greek (West Greek) from the 7th century BCE, and with Ancient Macedonian during the 5th–4th centuries BCE, but the most important are those from Latin (dated to the period 167 BCE to 400 CE) and from Slavic (dated from c. 600 CE onward). The evidence from loanwords allows linguists to reconstruct in great detail the shape of Albanian native words at the points of major influxes of loans from well-attested languages.

Pastoralism
Albanian possesses a rich and "elaborated" pastoral vocabulary, which has been taken to suggest that Albanian society in post-Roman times was pastoral, with widespread transhumance and stock-breeding, particularly of sheep and goats. Joseph takes interest in the fact that some of the lexemes in question have "exact counterparts" in Romanian. The fact that the Albanian language reflects a clear pastoralist stage does not allow conclusions about the Proto-Albanian speakers' way of life during classical antiquity, as only the speech of the mountain pastoralists managed to survive the Great Migrations.

Albanian-speakers appear to have been cattle breeders given the vastness of preserved native vocabulary pertaining to cow breeding, milking and so forth, while words pertaining to dogs tend to be loaned. Many words concerning horses are preserved, but the word for horse itself is a Latin loan. The original Palaeo-Balkan word for 'horse', preserved in Albanian mëz or mâz 'foal', from *me(n)za- 'horse', underwent a later semantic shift 'horse' > 'foal' after the loan from Latin caballus into Albanian kalë 'horse'. The Albanian name Mazrek(u), which means 'horse breeder' in Albanian, is found throughout all Albanian regions, and notably it was the name used by the Kastrioti noble family to highlight their tribal affiliation (Albanian: farefisní). Also the Palaeo-Balkan word for 'mule' has been preserved in Albanian mushk(ë) 'mule'.

Hydronyms
Concerning the inheritance of hydronymic vocabulary, it has been noted that there were no lexemes relating to seamanship in the Proto-Indo-European language. PIE hydronyms reconstructed so far refer to swamps, marshes, lakes, and riverine environments, but not to the sea. For instance, the Greek term thalassa "sea" is Pre-Greek, not an inherited Indo-European word. The derivation of the Albanian term for "sea" (det) from Proto-Albanian *deubeta, as a cognate of Proto-Germanic *deupiþō- "depth", which was proposed by some Albanologists, is firmly dismissed by present-day historical linguists. Instead, a borrowing from Greek δέλτα delta "river delta" has been proposed recently. At least two other Albanian terms from the same semantic field are early Greek loanwords: pellg "pond, basin, depth" from πέλαγος pelagos "sea", and zall "riverbank, river sand" from αἰγιαλός "sea-shore". Both underwent a semantic shift in Proto-Albanian, indicating for this language a change in location after its contact with Ancient Greek. All Albanian words relating to seamanship also appear to be loans.

Words referring to large streams and their banks tend to be loans, but lumë ("river") is native, as is rrymë (the flow of river water). Words for smaller streams and stagnant pools of water are more often native, except pellg. Albanian has maintained since Proto-Indo-European a specific term referring to a riverside forest (gjazë), as well as its words for marshes. Albanian has maintained native terms for "whirlpool", "water pit" and (aquatic) "deep place", leading Orel to speculate that Albanian was likely spoken in an area with an excess of dangerous whirlpools and depths. The term mat, meaning "height", "beach", "bank/shore" in Northern Albanian and "beach", "shore" in Arbëresh, is inherited from Proto-Albanian *mata < *mn̥-ti "height" (cf. Latin mŏns "mountain"), after which the river Mat (and the region with the same name) in north-central Albania was named, which can be explained as "mountain river". The meaning "bank/shore" hence would have emerged only at a later time (cf. German Berg "mountain" in relation to Slavic *bergъ "bank/shore").

Vegetation
Regarding forests, words for most conifers and shrubs are native, as are the terms for "alder", "elm", "oak", "beech", and "linden", while "ash", "chestnut", "birch", "maple", "poplar", and "willow" are loans.

Social organization
The original kinship terminology of Indo-European was radically reshaped; changes included a shift from "mother" to "sister", and were so thorough that only three terms retained their original function: the words for "son-in-law", "mother-in-law" and "father-in-law". All the words for second-degree blood kinship, including "aunt", "uncle", "nephew", "niece", and terms for grandchildren, are ancient loans from Latin.

Overall patterns in loaning
Openness to loans has been called a "characteristic feature" of Albanian. The Albanian original lexical items directly inherited from Proto-Indo-European are far fewer in comparison to the loanwords, though loans are considered to be "perfectly integrated" and not distinguishable from native vocabulary on a synchronic level. Although Albanian is characterized by the absorption of many loans, even, in the case of Latin, reaching deep into the core vocabulary, certain semantic fields nevertheless remained more resistant. Terms pertaining to social organization are often preserved, though not those pertaining to political organization, while those pertaining to trade are all loaned or innovated.

While the words for plants and animals characteristic of mountainous regions are entirely original, the names for fish and for agricultural activities are often assumed to have been borrowed from other languages. However, considering the presence of some preserved old terms related to sea fauna, some have proposed that this vocabulary might have been lost over time after Proto-Albanian tribes were pushed inland during invasions. Wilkes holds that the Slavic loans in Albanian suggest that contacts between the two populations took place when Albanians dwelt in forests 600–900 metres above sea level.

Greek
Linguistic contact between Albanian and Greek has been securely dated to the Iron Age. Contacts between the respective post-PIE languages which gave rise to the two languages also occurred in earlier times. Common traces of the Mediterranean-Balkan substratum are considered to date to the common Indo-European phase of Albanian and Greek (cf. Graeco-Albanian). Innovative creations of agricultural terms shared only between Albanian and Greek, such as *h₂(e)lbʰ-it- 'barley' and *spor-eh₂- 'seed', were formed from non-agricultural Proto-Indo-European roots through semantic changes to adapt them for agriculture. Since they are limited to Albanian and Greek, they can be traced back with certainty only to their last common Indo-European ancestor, and cannot be projected back into Proto-Indo-European. Shortly after they had diverged from one another, Albanian, Greek and Armenian also underwent a longer period of contact (as can be seen, for example, in the irregular correspondence: Greek σκόρ(ο)δον, Armenian sxtor, xstor, and Albanian hudhër, hurdhë "garlic"). Furthermore, intense Greek–Albanian contacts have certainly occurred thereafter, with ongoing connections between them in the Balkans from ancient times up to the present day.

Ancient Greek loans in Proto-Albanian originated from two distinct geographical and historical groups: borrowings from the Greek colonies on the Adriatic coast from the 7th century BCE, either directly or indirectly through trade communication in the hinterland; and direct borrowings from Greek-speaking populations of ancient Macedonia during the 5th–4th centuries BCE, before the replacement of Ancient Macedonian with Koine Greek. Several Proto-Albanian terms have been preserved in the lexicon of Hesychius of Alexandria and other ancient glossaries. Some of the Proto-Albanian glosses in Hesychius are considered to have been loaned into Doric Greek as early as the 7th century BCE. Witczak (2016) specifically points to seven words recorded by the Greek grammarian Hesychius of Alexandria (5th century AD), and particularly to the term ἀάνθα 'a kind of earring', which was first attested in the work of the choral lyric poet Alcman (fl. 7th century BCE). This means that the ancestors of the Albanians were in contact with the northwestern part of Ancient Greek civilization and probably borrowed words from Greek cities (Dyrrachium, Apollonia, etc.) in the Illyrian territory, colonies which belonged to the Doric division of Greek, or from contacts in the Epirus area. The earliest Greek loans began to enter Albanian circa 600 BC, and are of Doric provenance, tending to refer to vegetables, fruits, spices, animals and tools. This stratum reflects contacts between Greeks and Proto-Albanians from the 8th century BC onward, with the Greeks being either colonists on the Adriatic coast or Greek merchants inland in the Balkans. The second wave of Greek loans began after the split of the Roman empire in 395 and continued throughout the Byzantine, Ottoman and modern periods.

According to Hermann Ölberg, the modern Albanian lexicon may include 33 words of ancient Greek origin, although it can be increased if the Albanian lexicon is properly evaluated. An argument claimed by some scholars as an indication of a location of Albanian further north than present-day Albania in antiquity is the number of loanwords from Ancient Greek, mostly from Doric dialect, which is considered by them relatively small, even though Southern Illyria neighbored the Classical Greek civilization and there were a number of Greek colonies along the Illyrian coastline. For instance, according to Bulgarian linguist Vladimir I. Georgiev there is limited Greek influence in Albanian (See Jireček Line of Roman times), and if Albanians had been inhabiting a homeland situated in modern Albania continuously since ancient times, the number of Greek loanwords in Albanian should be higher. However, the number of surviving loanwords is not a valid argument, as many Greek loans were likely lost through replacement by later Latin and Slavic loans, just as notoriously happened to most native Albanian vocabulary. On the other hand, the specifically Northwestern/Doric affiliations and ancient dating of Greek loans imply a specifically Western Balkan Albanian presence to the north and west of Greeks specifically in antiquity, though Huld cautions that the classical "precursors" of the Albanians would be "'Illyrians' to classical writers", but that the Illyrian label is hardly "enlightening" since classical ethnology was imprecise.

Evidence of a significant level of early linguistic contact between Albanian and Greek is provided by ancient common structural innovations and phonologic convergence such as:
 * the rise of the close front rounded vowel /y/ (documented in Attic and Koine Greek);
 * the rise of dental fricatives;
 * the voicing of voiceless plosives after nasal consonants;
 * the replacement, with a form that featured a prefix, of the inherited present tense 3rd person singular of the verb "be" (documented in Koine Greek).

Those innovations are limited only to the Albanian and Greek languages and are not shared with other languages of the Balkan sprachbund. Since they precede the Balkan sprachbund era, those innovations date to a prehistoric phase of the Albanian language, spoken at that time in the same area as Greek and within a social frame of bilingualism in which early Albanians had to be able to speak some form of Greek.

Latin and early Romance loans
Latin loans are dated to the period of 167 BC to 400 AD. 167 BC coincides with the fall of the kingdom ruled by Gentius and reflects the early date of the entry of Latin-based vocabulary in Albanian. It entered Albanian in the Early Proto-Albanian stage and evolved in later stages as a part of the Proto-Albanian vocabulary and within its phonological system. Albanian is one of the oldest languages that came into contact with Latin and adopted Latin vocabulary. It has preserved 270 Latin-based words which are found in all Romance languages, 85 words which are not found in Romance languages, 151 which are found in Albanian but not in Eastern Romance and its descendant Romanian, and 39 words which are found only in Albanian and Romanian. The contact zone between Albanian and Romanian was likely located in eastern and southeastern Serbia. The preservation of Proto-Albanian vocabulary and linguistic features in Romanian highlights that at least partly Balkan Latin emerged as Albanian-speakers shifted to Latin.

The other layer of linguistic contacts of Albanian with Latin involves Old Dalmatian, a western Balkan derivative of Balkan Latin. Albanian maintained links with both coastal western and central inland Balkan Latin formations. Hamp indicates there are words that follow Dalmatian phonetic rules in Albanian, giving as an example the word drejt 'straight' < d(i)rectus matching developments in Old Dalmatian traita < tract. Romanian scholars Vatasescu and Mihaescu, using lexical analysis of Albanian, have concluded that Albanian was also heavily influenced by an extinct Romance language that was distinct from both Romanian and Dalmatian. Because the Latin words common to only Romanian and Albanian are significantly fewer than those that are common to only Albanian and Western Romance, Mihaescu argues that Albanian evolved in a region with much greater contact with Western Romance regions than with Romanian-speaking regions, and located this region in present-day Albania, Kosovo and Western North Macedonia, spanning east to Bitola and Pristina.

The Christian religious vocabulary of Albanian is mostly Latin as well, including even basic terms such as "to bless", "altar", and "to receive communion". This indicates that Albanians were Christianized under the Latin-based liturgy and ecclesiastical order which would be known as "Roman Catholic" in later centuries.

Slavic
The contacts began after the South Slavic migrations to Southeastern Europe in the 6th and 7th centuries. The modern Albanian lexicon contains around 250 Slavic borrowings that are shared among all the dialects. Slavic settlement probably shaped the present geographic spread of the Albanians. It is likely that Albanians took refuge in the mountainous areas of northern and central Albania, eastern Montenegro, western North Macedonia, and Kosovo. Long-standing contact between Slavs and Albanians might have been common in mountain passages and agriculture or fishing areas, in particular in the valleys of the White and Black branches of the Drin and around the Shkodër and Ohrid lakes. Such contact with one another in these areas has caused many changes in Slavic and Albanian local dialects. Historical linguist Eric P. Hamp, analyzing the influence of substrates on the Old Serbo-Croatian language, has concluded that the toponymic and Romanian evidence indicate that the South Slavs who became Serbo-Croatian speakers settled in a zone of former Albanoid speech, which reasonably explains why the resultant population was well-predisposed to preserve the richest system of lateral consonant distinctions and alternations among the later Slavic-speaking peoples.

The evolution of the ancient toponym Lychnidus into Oh(ë)r(id) (city and lake), which is attested in this form from 879 CE, required an early long-standing period of Tosk Albanian–East South Slavic bilingualism, or at least contact, resulting from the Tosk Albanian rhotacism -n- into -r- and Eastern South Slavic l-vocalization ly- into o-.

As Albanian and Slavic have been in contact since the early Middle Ages, toponymical loanwords in both belong to different chronological strata and reveal different periods of acquisition. Old Slavic loanwords into Albanian develop early Slavic *s as sh and *y as u within Albanian phonology of that era. Norbert Jokl defined this older period from the earliest Albanian-Slavic contacts to 1000 AD at the latest, while contemporary linguists like Vladimir Orel define it as between the 6th and the 8th century AD. Newer loanwords preserve Slavic /s/ and other features which no longer show phonological development within Albanian. Such toponyms from the earlier period of contact in Albania include Bushtricë (Kukës), Dishnica (Përmet), Dragoshtunjë (Elbasan), Leshnjë (Leshnjë, Berat and other areas), Shelcan (Elbasan), Shishtavec (Kukës/Gora), Shuec (Devoll) and Shtëpëz (Gjirokastër), Shopël (Iballë), Veleshnjë (Skrapar) and others. Similar toponyms in a later period produced different results e.g. Bistricë (Sarandë) instead of Bushtricë or Selcan (Këlcyrë) instead of Shelcan. Part of the toponyms of Slavic origin were acquired in Albanian before undergoing the changes of Slavic liquid metathesis (before ca. the end of the 8th century). They include Ardenicë (Lushnjë), Berzanë (Lezhë), Gërdec and Berzi (Tiranë) and a cluster of toponyms along the route Berat-Tepelenë-Përmet. Labëri, from the Albanian endonym, resulted through the Slavic liquid metathesis, and was reborrowed in that form into Albanian.

Unidentified Romance language hypothesis
It has been concluded that the partial Latinization of Roman-era Albania was heavy in coastal areas, in the plains, and along the Via Egnatia, which passed through Albania. Madgearu notes that in these regions the survival of Illyrian names and the depiction of people with Illyrian dress on gravestones is not enough to prove successful resistance against Romanization, and that there were many Latin inscriptions and Roman settlements there. Madgearu concludes that only the northern mountain regions escaped Romanization. He also concludes that in some areas a Latinate population that survived until at least the seventh century passed on to Albanian local place names that had mixed characteristics of Eastern and Western Romance.

Archaeology


The Komani-Kruja culture is an archaeological culture attested from late antiquity to the Middle Ages in central and northern Albania, southern Montenegro and similar sites in the western parts of North Macedonia. It consists of settlements usually built below hillforts along the Lezhë (Praevalitana)-Dardania and Via Egnatia road networks which connected the Adriatic coastline with the central Balkan Roman provinces. Its type site is Komani and the nearby Dalmace hill in the Drin river valley. Only limited excavation campaigns took place until the 1990s. Objects from a vast area covering nearby regions, the entire Byzantine Empire, the northern Balkans and Hungary, and sea routes from Sicily to Crimea were found in Dalmace and other sites, coming from many different production centres: local, Byzantine, Sicilian, Avar-Slavic, Hungarian, Crimean and even possibly Merovingian and Carolingian. Within Albanian archaeology, based on the continuity of pre-Roman Illyrian forms in the production of several types of local objects found in graves, the population of Komani-Kruja was framed as a group which descended from the local Illyrians who "re-asserted their independence" from the Roman Empire after many centuries and formed the core of the later historical region of Arbanon. As research focused almost entirely on grave contexts and burial sites, settlements and living spaces were often ignored. Yugoslav archaeology proposed an opposite narrative and tried to frame the population as Slavic, especially in the region of western Macedonia. Archaeological research has shown that these sites were not related to regions then inhabited by Slavs and that even in regions like Macedonia, no Slavic settlements had been founded in the 7th century.

What was established in this early phase of research was that Komani-Kruja settlements represented a local, non-Slavic population which has been described as Romanized Illyrian, Latin-speaking or Latin-literate. This is corroborated by the absence of Slavic toponyms and survival of Latin ones in the Komani-Kruja area. In terms of historiography, the thesis of older Albanian archaeology was an untestable hypothesis as no historical sources exist which can link Komani-Kruja to the first definite attestation of medieval Albanians in the 11th century. Archaeologically, while it was considered possible and even likely that Komani-Kruja sites were used continuously from the 7th century onwards, it remained an untested hypothesis as research was still limited. Whether this population represented local continuity or arrived at an earlier period from a more northern location as the Slavs entered the Balkans remained unclear at the time but regardless of their ultimate geographical origins, these groups maintained Justinianic era cultural traditions of the 6th century possibly as a statement of their collective identity and derived their material cultural references to the Justinianic military system. In this context, they may have used burial customs as a means of reference to an "idealized image of the past Roman power".

Research greatly expanded after 2009 and the first survey of Komani's topography was produced in 2014. Until then, except for the area of the cemetery, the size of the settlement and its extension remained unknown. In 2014, it was revealed that Komani occupied an area of more than 40 ha, a much larger territory than originally thought. Its oldest settlement phase dates to the Hellenistic era. Proper development began in late antiquity and continued well into the Middle Ages (13th-14th centuries). This indicates that Komani was a late Roman fort and an important trading node in the networks of Praevalitana and Dardania. During the Avar-Slavic raids, communities from present-day northern Albania and nearby areas clustered around hill sites for better protection, as was also the case in other areas like Lezha and Sarda. During the 7th century, as Byzantine authority was reestablished after the Avar-Slavic raids and the prosperity of the settlements increased, Komani saw an increase in population and a new elite began to take shape. The increase in population and wealth was marked by the establishment of new settlements and new churches in their vicinity. Komani formed a local network with Lezha and Kruja, and in turn this network was integrated in the wider Byzantine Mediterranean world, maintained contacts with the northern Balkans and engaged in long-distance trade. Tom Winnifrith (2020) says that the Komani-Kruja culture shows that in that area a Latin-Illyrian civilization survived, to emerge later as Albanians and Vlachs. The lack of interest among Slavs for the barren mountains of Northern Albania would explain the survival of Albanian as a language.

Paleo-Balkan linguistic theories


The general consensus is that Albanians originate from one or possibly a mixture of Paleo-Balkan peoples but which specific peoples besides Illyrians is a matter of continuing debate.

Messapic is the only sufficiently attested ancient language via which commonly accepted Illyrian-Albanian connections have been produced. It is unclear whether Messapic was an Illyrian dialect or if it diverged enough to be a separate language, although in general it is treated as a distinct language. Dardanian in the context of a distinct language has gained prominence in the possible genealogy of the Albanian language in recent decades.

Vladimir I. Georgiev, although accepting an Illyrian component in Albanian, and even not excluding an Illyrian origin of Albanian, proposed as the ancestor of Albanian a language called "Daco-Mysian" by him, considering it a separate language from Thracian. Georgiev maintained that "Daco-Mysian tribes gradually migrated to the northern-central part of the Balkan Peninsula, approximately to Dardania, probably in the second millennium B.C. (or not later than the first half of the first millennium B.C.), and thence they migrated to the areas of present Albania". Based on shared innovations between Albanian and Messapic, Eric P. Hamp has argued that Albanian is closely related to Illyrian and not to Thracian or Daco-Moesian, maintaining that it descended from a language that was sibling of Illyrian and that was once closer to the Danube and in contact with Daco-Moesian. Due to the paucity of written evidence, what can be said with certainty in current research is that on the one hand a significant group of shared Indo-European non-Romance cognates between Albanian and Romanian indicates at least contact with the 'Daco-Thraco-Moesian complex', and that on the other hand there is some evidence to argue that Albanian is descended from the 'Illyrian complex'. From a "genealogical standpoint", Messapic is the closest at least partially attested language to Albanian. Hyllested & Joseph (2022) label this Albanian-Messapic branch as Illyric and in agreement with recent bibliography identify Greco-Phrygian as the IE branch closest to the Albanian-Messapic one. These two branches form an areal grouping - which is often called "Balkan IE" - with Armenian.

The Illyrian linguistic theory has some consensus, but the Illyrian language is too little attested for definite comparisons to be made. Further issues are linked to the definitions of "Illyrian" and "Thracian", which are vague and are not applied to the same areas which were considered to be part of Illyria and Thrace in antiquity. For instance, Martin Huld argues that the classical "precursors" of the Albanians would be "'Illyrians' to classical writers", but that the Illyrian label is hardly "enlightening" because ethnology in classical antiquity was imprecise. It is also uncertain whether Illyrians spoke a homogeneous language or rather a collection of different but related languages that were wrongly considered the same language by ancient writers. In contemporary research, two main onomastic provinces have been defined in which Illyrian personal names occur: the southern Illyrian or south-eastern Dalmatian province (Albania, Montenegro and their hinterland) and the central Illyrian or middle Dalmatian-Pannonian province (parts of Croatia, Bosnia and western Serbia). The region of the Dardani (modern Kosovo, parts of northern North Macedonia, parts of eastern Serbia) saw the overlap of the southern/south-eastern, Dalmatian and local anthroponymy. A third area around modern Slovenia, sometimes considered part of Illyria in antiquity, is considered to have been closer to Venetic, which is no longer considered to be related to Illyrian. The conceptual paucity of the label 'Illyrian' makes its usage uncomfortable to some scholars. For this reason, some in current research call the ancestor of Albanian 'Albanoid', in reference to a "specific ethnolinguistically pertinent and historically compact language group", which still remains relatable with Messapic. The term 'Albanoid' for the ancestor of Albanian was used for the first time by Hamp, who developed the thesis about the Proto-Albanoid dialects, spoken in the central-western Balkans including the historical regions of Dardania, Illyria proper, Paeonia, Upper Moesia, western Dacia and western Thrace.

Albanian shows traces of satemization within the Indo-European language tree; however, the majority of Albanologists hold that, unlike most satem languages, it has preserved the distinction of /kʷ/ and /gʷ/ from /k/ and /g/ before front vowels (merged in satem languages), and there is a debate whether Illyrian was centum or satem. On the other hand, Dacian and Thracian seem to belong to satem. A clear isogloss that distinguishes Albanoid languages and Thracian is the palatalization of the IE labiovelars, which in Albanoid was present well before Roman times, while the IE labiovelars clearly did not palatalize in the pre-Roman period in Thracian or in the area where it was spoken.

The debate is often politically charged, and to be conclusive, more evidence is needed. Such evidence unfortunately may not be easily forthcoming because of a lack of sources.

Illyrian
The very first recorded mention of a connection between Illyrians and Albanians is in 1709, attributed to the German philosopher and mathematician Gottfried Leibniz, most famous for being the co-inventor of calculus along with Isaac Newton. In a series of letters, he first speculated Albanian to be related to the other Slavic languages along the Adriatic, but soon changed his mind and connected the Albanian language to that of the ancient Illyrians.

In terms of linguists or historians, the theory that Albanians were related to the Illyrians was proposed for the first time by the Swedish historian Johann Erich Thunmann in 1774. The scholars who advocate an Illyrian origin are numerous. Those who argue in favour of an Illyrian origin maintain that the indigenous Illyrian tribes dwelling in South Illyria (including today's Albania) went up into the mountains when Slavs occupied the lowlands, while another version of this hypothesis states that the Albanians are the descendants of Illyrian tribes located between Dalmatia and the Danube who spilled south.

Some of the arguments for the Illyrian-Albanian connection have been as follows:
 * From what is known of the territories of the old Balkan populations (Greeks, Illyrians, Thracians, Dacians), Albanian is spoken in a region where Illyrian was spoken in ancient times.
 * There is no evidence of any major migration into Albanian territory since the records of Illyrian occupation. Because descent from Illyrians makes "geographical sense" and there is no linguistic or historical evidence proving a replacement, the burden of proof lies on the side of those who would deny a connection of Albanian with Illyrian.
 * The Albanian tribal society has preserved the ancient Illyrian social structure based on tribal units. In addition, Çabej analyzed the morphology of some tribal names and pointed out that the Illyrian suffix -at, which appeared in the names of Illyrian tribes such as Docleatae, Labeatae, Autariates and Delmatae, corresponds to the suffix -at that appeared in the names of 15th-century Albanian tribes such as Bakirat and Demat; in Albania today, the suffixes of the names of some villages, such as Dukat and Filat, match the Illyrian one.
 * Many of the words that remain attested for Illyrian have an Albanian explanation, and a number of Illyrian lexical items (toponyms, hydronyms, oronyms, anthroponyms, etc.) have been linked to Albanian.
 * Words borrowed from Latin (e.g. Latin aurum > ar "gold", gaudium > gaz "joy", etc.) date back to before the Christian era, while the Illyrians on the territory of modern Albania were the first of the old Balkan populations to be conquered by the Romans, in 229–167 BC; the Thracians were conquered in 45 AD and the Dacians in 106 AD.
 * The characteristics of the Albanian dialects Tosk and Gheg in the treatment of native words and loanwords from other languages have led to the conclusion that the dialectal split occurred after the Christianisation of the region (4th century AD) and at the time of the Slavic migration to the Balkans or thereafter, between the 6th and 7th centuries AD, with the historic boundary between the Gheg and Tosk dialects being the Shkumbin river, which straddled the Jireček line.

Messapic
Messapic is an Iron Age language spoken in Apulia by the Iapygians (Messapians, Peucetians, Daunians), who settled in Italy as part of an Illyrian migration from the Balkans in the transitional period between the Bronze and Iron Ages. As Messapic was attested after over 500 years of development in the Italian peninsula, it is generally treated as linguistically distinct from Illyrian. Both languages are placed in the same branch of Indo-European. Eric Hamp has grouped them under "Messapo-Illyrian", which is further grouped with Albanian under "Adriatic Indo-European". Other schemes group the three languages under "General Illyrian" and "Western Paleo-Balkan". Messapian shares several exclusive lexical correspondences and general features with Albanian. It is unclear whether Messapian and Albanian share common features because of a common ancestral Illyrian idiom or whether these are features which developed in convergence among the languages of their grouping in the territory of Illyria. Shared cognates and features indicate a closer link between the two languages. The cognates include Messapic aran and Albanian arë ("field"), biliā and bijë ("daughter"), menza- (in the name Manzanas) and mëz ("foal"), brendion (in Brundisium) and bri ("horn"). Some Messapian toponyms like Manduria in Apulia have no etymological forms outside Albanian linguistic sources. Other linguistic elements such as particles, prepositions, suffixes and phonological features of the Messapic language find singular affinities with Albanian.

Thracian or "Daco-Moesian"
Aside from an Illyrian origin, Thracian or "Daco-Moesian" origins have also been hypothesized based on linguistic arguments that had been claimed as evidence, although in current historical linguistics the documented Thracian material clearly points to a different language than Albanian or its reconstructed precursor, whereas the "Daco-Mysian" hypothetical relation is highly based on speculations that have been thoroughly dismantled by other scholars.

Scholars who support a Dacian origin maintain on their side that Albanians moved southwards between the 3rd and 6th centuries AD from the Moesian area. Others argue instead for a Thracian origin and maintain that the proto-Albanians are to be located in the area between Niš, Skopje, Sofia and Albania or between the Rhodope and Balkan Mountains, from which they moved to present-day Albania before the arrival of the Slavs.

German historian Gottfried Schramm speculated that the Albanians derived from the Christianized Bessi, after their remnants were allegedly pushed by Slavs and Bulgars during the 9th century westwards into today's Albania. Archaeologically, there is absolutely no evidence of a 9th-century migration of any population, such as the Bessi, from western Bulgaria to Albania. Also according to historical linguistics the Thracian-Bessian hypothesis of the origin of Albanian should be rejected, since only very little comparative linguistic material is available (the Thracian is attested only marginally, while the Bessian is completely unknown), and at the same time the individual phonetic history of Albanian and Thracian clearly indicates a very different sound development that cannot be considered as the result of one language. Furthermore, the Christian vocabulary of Albanian is mainly Latin, which speaks against the construct of a "Bessian church language". The elite of the Bessi tribe was gradually Hellenized. The low level of borrowings from Greek in the Albanian language is a further argument against the identification of Albanian with the Bessi. Also the dialectal division of the Albanian-speaking area in the Early Middle Ages contradicts the alleged migration of Albanians into the hinterland of Dyrrhachium in the first decades of the 9th century AD, especially because the dialectal division of a linguistic space is in general a result of a number of linguistic phenomena occurring during a considerable span of time and requires a very large number of natural speakers.

Cities whose names follow Albanian phonetic laws, such as Shtip (Štip), Shkupi (Skopje) and Nish (Niš), lie in areas believed to have been historically inhabited by Thracians, Paionians and Dardani; the latter are most often considered an Illyrian tribe by ancient historians. While there is still no clear picture of where the Illyrian-Thracian border was, Niš is mostly considered Illyrian territory.

There are some close correspondences between Thracian and Albanian words. However, as with Illyrian, most Dacian and Thracian words and names have not been closely linked with Albanian (v. Hamp). Also, many Dacian and Thracian placenames were made out of joined names (such as Dacian Sucidava or Thracian Bessapara; see List of Dacian cities and List of ancient Thracian cities), while modern Albanian does not allow this. Many city names were composed of an initial lexical element affixed to -dava, -daua, -deva, -deba, -daba, or -dova, which meant "city" or "town". Endings in more southern regions are exclusively -bria ("town, city"), -disza, -diza, -dizos ("fortress, walled settlement"), -para, -paron, -pera, -phara ("town, village"). Most Illyrian names are composed of a single unit; many Thracian ones are made of two units joined. Several Thracian place-names end in -para, for example, which is thought to mean 'ford', or -diza, which is thought to mean 'fortress'. Thus in the territory of the Bessi, a well-known Thracian tribe, we have the town of Bessapara, 'ford of the Bessi'. The structure here is the same as in many European languages: thus the 'town of Peter' can be called Peterborough, Petrograd, Petersburg, Pierreville, and so on. But the crucial fact is that this structure is impossible in Albanian, which can only say 'Qytet i Pjetrit', not 'Pjeterqytet'. If para were the Albanian for 'ford', then the place-name would have to be 'Para e Besseve'; this might be reduced in time to something like 'Parabessa', but it could never become 'Bessapara'. And what is at stake here is not some superficial feature of the language, which might easily change over time, but a profound structural principle. This is one of the strongest available arguments to show that Albanian cannot have developed out of Thracian or Dacian.

Bulgarian linguist Vladimir I. Georgiev posits that Albanians descend from a Dacian population from Moesia, now the Morava region of eastern Serbia, and that Illyrian toponyms are found in a far smaller area than the traditional area of Illyrian settlement. According to Georgiev, Latin loanwords into Albanian show East Balkan Latin (proto-Romanian) phonetics, rather than West Balkan (Dalmatian) phonetics. Combined with the fact that the Romanian language contains several hundred words similar only to Albanian, Georgiev proposes that Albanian formed in Dardania, in the Roman province of Moesia Superior, where his "Daco-Mysian" construct had allegedly been spoken probably since the 2nd millennium BCE or not later than circa 500 BCE. He suggests that Romanian is a fully Romanised Dacian language, whereas Albanian is a partly Romanized "Daco-Mysian" language. Georgiev's theory, however, has been challenged and dismantled by other scholars. Noel Malcolm suggests Romanian and Aromanian originated in the Southern Balkans from Romanized Illyrians.

Apart from the linguistic theory that Albanian is more akin to East Balkan Romance (i.e. Dacian substrate) than West Balkan Romance (i.e. Illyrian/Dalmatian substrate), Georgiev also notes that marine words in Albanian are borrowed from other languages, suggesting that Albanians were not originally a coastal people. According to Georgiev the scarcity of Greek loan words also supports a "Daco-Mysian" theory: if Albanians originated in the region of Illyria there would surely be a heavy Greek influence. Historian John Van Antwerp Fine, who does define "Albanians" in his glossary as "an Indo-European people, probably descended from the ancient Illyrians", nevertheless states that "these are serious (non-chauvinistic) arguments that cannot be summarily dismissed." Romanian scholars Vatasescu and Mihaescu, using lexical analysis of Albanian, have concluded that Albanian was also heavily influenced by an extinct Romance language that was distinct from both Romanian and Dalmatian. Because the Latin words common to only Romanian and Albanian are significantly fewer than those that are common to only Albanian and Western Romance, Mihaescu argues that Albanian evolved in a region with much greater contact with Western Romance regions than with Romanian-speaking regions, and located this region in present-day Albania, Kosovo and Western North Macedonia, spanning east to Bitola and Pristina.

An argument against a Thracian origin (which does not apply to Dacian) is that most Thracian territory was on the Greek half of the Jireček Line, aside from varied Thracian populations stretching from Thrace into Albania, passing through Paionia and Dardania and up into Moesia; it is considered that most Thracians were Hellenized in Thrace (v. Hoddinott) and Macedonia.

The Dacian theory could also be consistent with the known patterns of barbarian incursions. Although there is no documentation of an Albanian migration, "during the fourth to sixth centuries the Rumanian region was heavily affected by large-scale invasion of Goths and Slavs, and the Morava valley (in Serbia) was a possible main invasion route and the site of the earliest known Slavic sites. Thus this would have been a region from which an indigenous population would naturally have fled".

Genetic studies
Various genetic studies have been done on the European population, some of them including current Albanian population, Albanian-speaking populations outside Albania, and the Balkan region as a whole. Albanians share similar genetics with neighbouring ethnic populations with close clusters forming primarily with mainland Greeks and southern Italian populations.

Y-DNA
The three haplogroups most strongly associated with Albanian people are E-V13, R1b and J2b-L283.


 * E-V13, the most common European sub-clade of E1b1b1a (E-M78) represents about 1/3 of all Albanian men and peaks in Kosovo (~40%). The current distribution of this lineage might be the result of several demographic expansions from the Balkans, such as that associated with the Balkan Bronze Age, and more recently, during the Roman era with the so-called "rise of Illyrian soldiery".    The peak of the haplogroup in Kosovo, however, has been attributed to genetic drift.
 * R1b-M269 represents about 1/5 of Albanian men, mostly under clades R-Z2103 and R-PF7562. It is linked with the introduction of the Indo-European languages in the Balkans. The oldest R1b (> R-RPF7562) sample in historically Albanian-inhabited regions has been found in EBA Çinamak, northern Albania, 2663-2472 calBCE (4045±25 BP, PSUAMS-7926). In the same site during the Iron Age, half of the men carried R-M269 (R-CTS1450 x1).
 * J2b-L283 represents 14-18% of Albanian men. It peaks in northern Albania. The oldest J-L283 (> J-Z597) sample in Albania found in MBA Shkrel as early as the 19th century BC. It first spread from the northwestern Balkans southwards during the EBA/MBA with cultures likes Cetina (Dalmatia) and Cetina-derived groups which have yielded most J-L283 samples in antiquity. In a 2022 study, J-L283 and its paternal clade J-M241 were found in three out of seven Daunian samples. In IA Çinamak(northern Albania), half of the samples belonged to J-L283.
 * Y haplogroup I is represented by I1, which is more common in northern Europe, and I2, several sub-clades of which are found in significant amounts in the South Slavic population. The specific I sub-clade which has attracted most discussion in Balkan studies is currently referred to as I2a1b, defined by SNP M423. This clade has higher frequencies to the north of the Albanophone area, in Dalmatia and Bosnia. The expansion of I2a-Din took place during Late Antiquity and the Early Middle Ages, and today it is common in Slavic-speaking peoples.
 * Haplogroup R1a is common in Central and Eastern Europe, especially in Slavic nations (and is also common in Central Asia and the Indian subcontinent). In the Balkans, it is strongly associated with Slavic areas.

A study by Battaglia et al. (2008) reported haplogroup distributions among Albanians in Albania itself, among Albanians in North Macedonia, and for the two groups combined.

A study by Peričić et al. (2005) reported Y-DNA haplogroup frequencies in Albanians from Kosovo, with the E-V13 subclade of haplogroup E1b1b representing 43.85% of the total (Albanians from other regions have slightly lower percentages of E-V13, but similar J2b and R1b).

A study on the Y chromosome haplotypes DYS19 STR and YAP and on mitochondrial DNA found no significant difference between Albanians and most other Europeans.

Larger samples collected by volunteer-led projects show that Albanians belong largely to the Y-chromosome haplogroups J2b2-L283, R1b-Z2103/BY611 and E-V13, which derive from ancient Balkan populations.

In a 2013 study which compared one Albanian sample to other European samples, the authors concluded that it did not differ significantly from other European populations, especially groups such as Greeks, Italians and Macedonians.

mtDNA
Another study of old Balkan populations and their genetic affinities with current European populations was done in 2004, based on mitochondrial DNA from the skeletal remains of some old Thracian populations from south-eastern Romania, dating from the Bronze and Iron Ages. The material comprised fossil bones of 20 individuals from the Bronze Age, dated to about 3200–4100 years ago and belonging to cultures such as Tei, Monteoru and Noua, found in graves in necropolises in south-eastern Romania, namely at Zimnicea, Smeeni, Candesti, Cioinagi-Balintesti, Gradistea-Coslogeni and Sultana-Malu Rosu; and fossil bones and teeth of 27 individuals from the early Iron Age, dating from the 10th to 7th centuries BC, from the Hallstatt era (the Babadag culture), found in the far south-east of Romania near the Black Sea coast, in settlements in Dobruja, namely Jurilovca, Satu Nou, Babadag, Niculitel and Enisala-Palanca. After comparing this material with the present-day European population, the authors concluded:

Computing the frequency of common point mutations of the present-day European population with the Thracian population has resulted that the Italian (7.9%), the Albanian (6.3%) and the Greek (5.8%) have shown a bias of closer [mtDna] genetic kinship with the Thracian individuals than the Romanian and Bulgarian individuals (only 4.2%).

Autosomal DNA
Analysis of autosomal DNA, which examines all genetic components, has revealed that few rigid genetic discontinuities exist in European populations, apart from certain outliers such as Saami, Sardinians, Basques, Finns and Kosovar Albanians. One such analysis found that Albanians, on the one hand, have a high amount of identity-by-descent (IBD) sharing, suggesting that Albanian speakers derive from a relatively small population that expanded recently and rapidly in the last 1,500 years. On the other hand, they are not wholly isolated or endogamous, because Greek and Macedonian samples shared much higher numbers of common ancestors with Albanian speakers than with other neighbors, possibly as a result of historical migrations, or else perhaps of smaller effects of the Slavic expansion in these populations. At the same time, the sampled Italians shared nearly as much IBD with Albanian speakers as with each other.

In Lazaridis et al. (2022), a transect of samples from Albania dating from the EBA to the present day was tested. The population of Albania "appears to be largely made up of the same components in similar proportions" since the MBA. The core part of this profile consists of 50% Anatolian Neolithic Farmers, 20–25% Caucasus Hunter-Gatherers and 10–15% Eastern Hunter-Gatherers. According to this study, the speakers of Albanian, as well as of Greek and other Paleo-Balkan languages, go back directly to the migration of Yamnaya steppe pastoralists into the Balkans about 5,000 to 4,500 years ago, whose admixture with the local populations generated a tapestry of various ancestries, which in Albanians resulted in the above-mentioned components.

Another study, from 2023, concludes that "a significant proportion" of modern Albanians' paternal ancestry comes from Western Balkan populations, "including those traditionally known as Illyrians".

Italian hypothesis
Laonikos Chalkokondyles (c. 1423–1490), the Byzantine historian, considered the Albanians to be an extension of the Italians. The theory has its origin in the first mention of the Albanians, made by Attaliates (11th century), although it is disputed whether this mention refers to Albanians in an ethnic sense: "...For when subsequent commanders made base and shameful plans and decisions, not only was the island lost to Byzantium, but also the greater part of the army. Unfortunately, the people who had once been our allies and who possessed the same rights as citizens and the same religion, i.e. the Albanians and the Latins, who live in the Italian regions of our Empire beyond Western Rome, quite suddenly became enemies when Michael Dokeianos insanely directed his command against their leaders..."

Caucasian hypothesis
One of the earliest theories on the origins of the Albanians, now considered obsolete, incorrectly identified the proto-Albanians with an area of the eastern Caucasus, separately referred to by classical geographers as Caucasian Albania, located in what roughly corresponds to modern-day southern Dagestan and northern Azerbaijan, and bordering Caucasian Iberia to its west. This theory conflated the two Albanias, supposing that the ancestors of the Balkan Albanians (Shqiptarët) had migrated westward in the late classical or early medieval period. The Caucasian theory was first proposed by Pope Pius II in his writings, later taken up by Renaissance humanists who were familiar with the works of classical geographers, and further developed by the early 19th-century French consul and writer François Pouqueville. It was rendered obsolete later in the 19th century, when linguists showed that Albanian is an Indo-European, rather than a Caucasian, language.

Pelasgian hypothesis
An outdated 19th-century theory holds that Albanians specifically descend from the Pelasgians, a broad term used by classical authors to denote the autochthonous, pre-Indo-European inhabitants of Greece and the southern Balkans in general. However, there is no evidence about the possible language, customs and existence of the Pelasgians as a distinct and homogeneous people, and thus any particular connection to this population is unfounded. This theory was developed by the Austrian linguist Johann Georg von Hahn in his work Albanesische Studien in 1854. According to Hahn, the Pelasgians were the original proto-Albanians, and the languages spoken by the Pelasgians, Illyrians, Epirotes and ancient Macedonians were closely related. In Hahn's theory the term Pelasgians was mostly used as a synonym for Illyrians. This theory quickly attracted support in Albanian circles, as it established a claim of precedence over other Balkan nations, particularly the Greeks. In addition to establishing a "historic right" to territory, this theory also established that the ancient Greek civilization and its achievements had an "Albanian" origin. The theory gained staunch support among early 20th-century Albanian publicists. This theory is rejected by scholars today. In contemporary times, with the Arvanite revival of the Pelasgian theory, it has also recently been borrowed by other Albanian-speaking populations, both within Albania and from Albania in Greece, to counter the negative image of their communities.

De Rapper, Gilles (2009). "Pelasgic Encounters in the Greek–Albanian Borderland: Border Dynamics and Reversion to Ancient Past in Southern Albania." Anthropological Journal of European Cultures. 18 (1): 60–61. “In 2002, another important book was translated from Greek: Aristides Kollias’ Arvanites and the Origin of Greeks, first published in Athens in 1983 and re-edited several times since then (Kollias 1983; Kolia 2002). In this book, which is considered a cornerstone of the rehabilitation of Arvanites in post-dictatorial Greece, the author presents the Albanian speaking population of Greece, known as Arvanites, as the most authentic Greeks because their language is closer to ancient Pelasgic, who were the first inhabitants of Greece. According to him, ancient Greek was formed on the basis of Pelasgic, so that many Greek words have an Albanian etymology. In the Greek context, the book initiated a 'counterdiscourse' (Gefou-Madianou 1999: 122) aiming at giving Arvanitic communities of southern Greece a positive role in Greek history. This was achieved by using nineteenth-century ideas on Pelasgians and by melting together Greeks and Albanians in one historical genealogy (Baltsiotis and Embirikos 2007: 130–431, 445). In the Albanian context of the 1990s and 2000s, the book is read as proving the anteriority of Albanians not only in Albania but also in Greece; it serves mainly the rehabilitation of Albanians as an antique and autochthonous population in the Balkans. These ideas legitimise the presence of Albanians in Greece and give them a decisive role in the development of ancient Greek civilisation and, later on, the creation of the modern Greek state, in contrast to the general negative image of Albanians in contemporary Greek society. They also reverse the unequal relation between the migrants and the host country, making the former the heirs of an autochthonous and civilised population from whom the latter owes everything that makes their superiority in the present day.”

"Free Dacian" hypothesis
In the late 19th and early 20th century, the Romanian linguist Hasdeu speculated that the Albanians originated from the free Dacians (i.e., according to him, the Costoboci, the Carpi and the Bessi), after their alleged migration southwards from outside the Danubian or Carpathian limes during Roman Imperial times. His outdated methods are linguistically unsustainable and his reconstructed narrative is not based on any factual evidence. This unfounded narrative was revived in the late 20th century by the Romanian historian I. I. Russu, who maintained it persistently and in disregard of the scientific knowledge achieved in the meantime. Despite having a background in history, he made claims in the fields of philology and comparative linguistics, eager to prove the autochthony of the Romanian people in their present-day heartlands (mainly north of the Danube and in Transylvania). According to him, the pre-Roman lexical element shared with Romanian ("Traco-Dacian"), the massive Roman lexical element in Albanian, and the small Ancient Greek element indicate an origin from the Thracian Carpi beyond the northeastern borders of the Empire, in the Carpatho-Danubian areas where Romanization could have been averted. Russu himself reported Pedersen's argument that precisely the large Latin influence and the small Ancient Greek influence speak in favor of the Illyrian origin of Albanian, which raises the question of why he ignored the fact that the large Latin influence actually indicates the location of Albanian within the Roman world and not outside it. Russu's linguistic analysis contains clear errors, and above all his mode of argumentation goes even beyond Hasdeu's romantic narrative.