User:Coriordan7/sandbox

Domari is an endangered Indic language, spoken by older Dom people scattered across the Middle East and North Africa. The language is reported to be spoken as far north as Azerbaijan and as far south as central Sudan, in Turkey, Iran, Iraq, Palestine, Israel, Jordan, Egypt, Sudan, Libya, Tunisia, Algeria, Morocco, Syria and Lebanon. Based on the systematicity of sound changes, it is fairly certain that the names Domari and Romani derive from the Indic word ḍom. The language itself derives from an Indo-Aryan language and shares many similarities with Punjabi and Rajasthani, two languages that originated in India. The Arabs referred to the Dom as nawar, as they were a nomadic people who originally migrated to the Middle East from India.

Domari is also known as "Middle Eastern Romani", "Tsigene", "Luti", or "Mehtar". There is no standard written form. In the Arab world, it is occasionally written using the Arabic script and has many Arabic and Persian loanwords. Descriptive work was done by Yaron Matras, who published a comprehensive grammar of the language along with an historical and dialectological evaluation of secondary sources (Matras 2012).

Domari is an endangered language, and younger generations are shifting away from it, according to Yaron Matras. In certain areas such as Jerusalem, only about 20% of these Dom people, known as “Middle Eastern Gypsies”, speak the Domari language in everyday interactions. The language is mainly spoken by the elderly in the Jerusalem community. The younger generation is more influenced by Arabic, and most therefore know only basic words and phrases. The modern-day community of Doms in Jerusalem was established when the nomadic people settled inside the Old City, from 1940 until it came under Israeli administration in 1967 (Matras 1999).

Domari is classified as a shifting language at level 7 on the Expanded Graded Intergenerational Disruption Scale (EGIDS), meaning that the language is used among the child-bearing generation but is not being passed down to children. At the time of the last large linguistic study of Domari in 2012, the language was estimated to have between 50 and 70 fluent speakers. At present, however, the number is estimated to be closer to 10.

Dialects
The best-known variety of Domari is Palestinian Domari, also known as "Syrian Gypsy", the dialect of the Dom community of Jerusalem, which was described by R. A. S. Macalister in the 1910s. Palestinian Domari is an endangered language with fewer than 200 speakers; the majority of the 1,200 members of the Jerusalem Domari community are native speakers of Palestinian Arabic.

Other dialects include:
 * Nawari in Syria, Jordan, Lebanon, Israel, Palestine and Egypt.
 * Kurbati (Ghorbati) in Syria and western Iran
 * Helebi in Egypt, Libya, Tunisia, Algeria and Morocco
 * Halab/Ghajar in Sudan.
 * Karachi (Garachi) in northern Turkey, northern Iran and the Caucasus
 * Marashi in Turkey
 * Barake in Syria
 * Churi-Wali in Afghanistan
 * Narikurava in southern India

Some dialects may be highly divergent and not mutually intelligible. Published sources often lump together dialects of Domari and the various unrelated in-group vocabularies of diverse peripatetic populations in the Middle East. Thus there is no evidence at all that the Lyuli, for example, speak a dialect of Domari, nor is there any obvious connection between Domari and the vocabulary used by the Helebi of Egypt (see discussion in Matras 2012, chapter 1).

The small Seb Seliyer language of Iran is distinctive in its core vocabulary.

Status
Jerusalem Domari is fluently spoken only among the elder generation in the Dom community. These nomadic people have been bilingual for many generations; recently, however, there has been a language shift towards Arabic, the dominant language of the region. In the 1940s, the Dom began to abandon their nomadic culture and began settling and working in the local economy. This led to the assimilation of Dom children into the primary school system, which marked the first generation to grow up in an academic environment alongside Arab children. Consequently, this 1940s generation does not speak Domari fluently. Arabic replaced their native Domari and became the language of cross-generation communication. The Dom population of Jerusalem is estimated at about 600-900 members, fewer than 10% of whom can communicate effectively in Jerusalem Domari.

Comparison with Romani
Domari was once thought to be the "sister language" of Romani, the two languages having split after the departure from the Indian subcontinent, but more recent research suggests that the differences between them are significant enough to treat them as two separate languages within the Central zone (Hindustani) group of languages. The Dom and the Rom are therefore likely to be descendants of two different migration waves out of India, separated by several centuries.

There are nevertheless remarkable similarities between the two beyond their shared Central zone Indic origin, indicating a period of shared history as itinerant populations in the Middle East. These include shared archaisms that have been lost in the Central Indo-Aryan languages over the millennium since Dom/Rom emigration, a series of innovations connecting them with the Northwestern zone group, indicating their route of migration out of India, and finally a number of radical syntactical changes due to superstrate influence of Middle Eastern languages, including Persian, Arabic and Byzantine Greek.

Orthography
Since Domari is a minority Middle-Eastern language for a specific community of speakers, it did not have a standard orthography for many years; therefore many writers have used differing spelling systems (similarly to what happened with Ladino). Most Middle-Easterners used the Arabic script, while scholars made do with a modified Pan-Vlakh Latin-based alphabet.

Modified Pan-Vlakh orthography
In 2012, Yaron Matras used such a system in his recent publications on this subject where the Pan-Vlakh orthography served as a basis, with several modifications:
 * Romani j changed to y
 * Romani c use limited to the accented form č for /tʃ/, the /dʒ/ counterpart being denoted by dž
 * Doubled vowel letters for long vowels (aa ee ii oo uu)
 * Diphthongs denoted with vowel pairs (ai au ei eu oi and so on ... )
 * Additional letters in use for Semitic-derived words and names (ḍ ḥ ṣ ṭ ẓ ġ q ‘ ’ and so on ... )

Pan-Domari Alphabet
A new Semitic-flavored Latin-based pan-alphabet has recently been introduced by some scholars for the purpose of codifying written Domari.

The Pan-Domari Alphabet, which was invented in 2015, is a Semitic-flavored simplification of the previous Matras notation:
 * Y is used for /j/, and w for /w/, as in English
 * X is used for the sound /x/—the well-known guttural {kh} of Greek, Russian, and Middle Eastern languages
 * Q stands for /q/, the uvular plosive sound heard in the Semitic languages
 * Circumflexes are used to mark long vowels <â ê î ô û> and certain fricative/affricate consonants <ĉ ĝ ĵ ŝ ẑ> (={ch gh j sh zh})
 * Underdots under letters represent pharyngeal(-ized) consonants <ḍ ḥ ṣ ṭ ẓ> (IPA /d̪ˤ ħ sˤ t̪ˤ zˤ/)
 * Other letters include þ (thorn) and ð (edh) for the interdental fricatives /θ ð/, the characters <ʾ> (ʾalef/hamzaʾ—IPA /ʔ/) and <ʿ> (ʿayn—IPA /ʕ/), and the letter <ə> for the vowel sound shəwaʾ.
 * The diphthongs are now denoted by vowel + approximant digraphs.
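The digraph equivalents given above for the circumflex consonants (ĉ ĝ ĵ ŝ ẑ = ch gh j sh zh) can be applied mechanically when the accented letters are unavailable. The following is a minimal sketch, not an official tool: the function name `ascii_consonants` is hypothetical, and long vowels are left unchanged because the text does not state their plain-text alternates.

```python
import unicodedata

# Digraph spellings listed in the text for the circumflex consonants.
DIGRAPHS = {"ĉ": "ch", "ĝ": "gh", "ĵ": "j", "ŝ": "sh", "ẑ": "zh"}
# Normalize the keys so lookups work regardless of how this file is encoded.
DIGRAPHS = {unicodedata.normalize("NFC", k): v for k, v in DIGRAPHS.items()}


def ascii_consonants(word: str) -> str:
    """Replace circumflex consonant letters with their digraph spellings."""
    word = unicodedata.normalize("NFC", word)
    out = []
    for ch in word:
        lower = ch.lower()
        if lower in DIGRAPHS:
            rep = DIGRAPHS[lower]
            # Keep an initial capital, e.g. "Ŝ" becomes "Sh".
            out.append(rep.capitalize() if ch != lower else rep)
        else:
            out.append(ch)  # vowels and all other letters pass through
    return "".join(out)


print(ascii_consonants("ŝōna"))
```

Only the five consonant letters are touched, so the output is not necessarily pure ASCII; any further substitutions would need the spelling alternates from the alphabet table.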

The Pan-Domari Alphabet is shown in this table:

NOTES

§ Spelling alternates are shown for certain of these sounds (e.g. when typing on an ASCII or typewriter keyboard, or when/where computers cannot show the proper accented Domari letters); these alternates are also used on the KURI’s Learn Domari article series.

1 The letter fe may be sounded either as a labiodental /f/ or a bilabial [ɸ] fricative, depending on the context, or origin of a given word/name.

2 The letter ĝe usually represents a voiced velar fricative /ɣ/, but may be sounded as a uvular [ʁ] in words/names derived from Arabic, Persian, and Urdu.

3 The letter ne usually represents a voiced dental nasal /n̪/; however, it manifests as a velar [ŋ] before the letters g ĝ k q x, but as a palatal [ɲ] before the letters ĉ ĵ y.

4 The letter re represents a flapped [ɾ] or a trilled [r] rhotic, depending on the position within a word/name, and whether it appears singly or doubly.

5 The letter ve shows up mainly in words and names derived from foreign loans, and may represent either a voiced labiodental /v/ or a voiced bilabial [β] fricative.

6 The letter xe (pronounced as KHEH) usually represents a voiceless velar fricative /x/, but is often sounded as a uvular [χ] in the many loan words and loan names derived from Arabic, Persian, and Urdu.

7 The vowel letter called ŝǝwaʾ (its name derives from the cognate Hebrew vowel point for this very same sound) represents the mean-mid central spread neutral vowel as it exists in the English words about, taken, pencil, lemon, and circus. While its normal manifestation is indeed [ə], it may vary in the direction of either a higher-mid [ʌ] or a fronted lower-mid [ɜ] one, depending on the dialect spoken.


SPECIAL NOTE: The plain (unaccented) letters c and j are only found in foreign loan words and loan names, as shown in the above table.
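The contextual rule in note 3 (the letter ne sounded as [ŋ] before g ĝ k q x, as [ɲ] before ĉ ĵ y, and as dental [n̪] elsewhere) can be written as a small lookup. This is an illustrative sketch only: the function name `n_allophones` is hypothetical, and the test words "anka" and "anya" are invented strings, not attested Domari words.

```python
import unicodedata

# Letters that trigger each allophone, as listed in note 3.
VELAR_TRIGGERS = set(unicodedata.normalize("NFC", "gĝkqx"))
PALATAL_TRIGGERS = set(unicodedata.normalize("NFC", "ĉĵy"))


def n_allophones(word: str) -> list[str]:
    """Return the allophone chosen for each letter n, in order."""
    word = unicodedata.normalize("NFC", word.lower())
    result = []
    for i, ch in enumerate(word):
        if ch != "n":
            continue
        # The following letter decides the realization; "" at word end.
        nxt = word[i + 1] if i + 1 < len(word) else ""
        if nxt in VELAR_TRIGGERS:
            result.append("ŋ")
        elif nxt in PALATAL_TRIGGERS:
            result.append("ɲ")
        else:
            result.append("n̪")  # default dental nasal
    return result


print(n_allophones("gand"))  # n precedes d, so the default applies
```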

Vowels
There are five main vowel sounds; however, this inventory shows the variation and quantity of short vowels. Most are interchangeable with a neighboring vowel sound; however, all of the sounds in the inventory are identical to those of the local Palestinian Arabic (Matras 1999).

Consonants
Most of these consonants show the influence of Palestinian Arabic, for example in gemination; however, consonants such as [p], [g], [tʃ] and [h] are not found in the local dialect. Linguists speculate that these sounds belong to the pre-Arabic component. The alveopalatal affricates [tʃ] and [dʒ] also differ in sound from Arabic. However, these affricates are often considered interchangeable with the sibilants [ʃ] and [ʒ], and many speakers have recently tended toward the simpler sibilants.

Domari's voiced glottal stop is a unique aspect of its inventory, a phoneme that is often considered impossible in the International Phonetic Alphabet.

Stress
The biggest difference in expression of language between Arabic and Domari is where the stress is placed. Arabic has phoneme-level stress while Domari is a language of word-level stress. The Domari language emphasizes stress on the final syllable, as well as grammatical markers for gender and number. Most nouns, besides proper nouns, adopted from Arabic sound distinct because of the unique stresses in Domari (Matras 1999). Domari is thought to have borrowed many words and much grammatical structure from Arabic; however, this is not entirely true. Complex verbs and most core prepositions did not transfer into the grammar of the Domari language. The syntactic typology remains independent of Arabic influence. It is also important to note that the numerals used by the Doms were inherited from Kurdish. Even though Domari was influenced by local Arabic, the language also felt the impact of Kurdish and certain dialects of Iranian in its grammar.

Syllable Structure
Most Domari roots are constructed with two or three syllables. The five main types of syllable structures for forming word roots are CV, CVC, CCV, CCVC, and VC, with examples of each shown below. Clear trends govern which consonants occupy which positions, with many onset clusters showing [r], [l], or [f] in the second position. On the other hand, no onset clusters contain [t], [tʃ], [dʒ] or [l] in the first position.
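As a rough illustration of the five syllable templates, a romanized syllable can be reduced to a consonant/vowel skeleton and checked against the list. This sketch rests on simplifying assumptions that are not from the source: each letter counts as one segment, the vowel letters are a, e, i, o, u (with or without diacritics), and the names `cv_shape` and `is_allowed` are hypothetical.

```python
import unicodedata

# The five syllable types listed in the text.
ALLOWED_SHAPES = {"CV", "CVC", "CCV", "CCVC", "VC"}
VOWELS = set("aeiou")  # assumption: base vowel letters of the romanization


def cv_shape(syllable: str) -> str:
    """Map each letter to C or V, ignoring combining diacritics."""
    decomposed = unicodedata.normalize("NFD", syllable.lower())
    shape = []
    for ch in decomposed:
        if unicodedata.combining(ch):
            continue  # skip macrons, circumflexes, underdots, etc.
        shape.append("V" if ch in VOWELS else "C")
    return "".join(shape)


def is_allowed(syllable: str) -> bool:
    """True if the syllable matches one of the five listed templates."""
    return cv_shape(syllable) in ALLOWED_SHAPES


print(cv_shape("kar"), is_allowed("kar"))
```

Digraphs such as dž would be counted as two consonants here, so a fuller treatment would need to tokenize the orthography first.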

Numerals
Here is a table of the numerals (1-10, 20, and 100) in Hindi, Romani, Domari, Lomavren, and Persian for comparison.

Verb Derivation
Domari verbs are derived from non-verbs, including both nouns and adjectives, and from Arabic verb roots.

The suffix -(h)o-/-(h)r- derives verbs from adjectives, while the suffix -k(ar)- derives verbs from nouns.

In the example below, wida, the Domari root for 'old', is marked with the suffix -hr- to produce the verb form 'grew old.'

wida-hr-a

old-VBZ-M

'He grew old.'

These two verbalizing suffixes are also used when a verb root is borrowed from Arabic.

In the example below, the Domari marker -hr- is added to the Arabic verb root for 'marry,' dẑawwiz.

ū day-os dẑawwiz-hr-i ekak

and mother-3SG marry-VBZ.PAST-F one

'And her mother married someone.'

Negation
To express negation, the prefix in- is added to the beginning of the verb and the suffix -e' is added to the end.

For example:

yaʕni in-kar-ad-e' masakl-e mai hukum-e-ki wala did hukum-e-ki

PTCL NEG-do-3PL-NEG problems-OBL.F with government-OBL.F-ABL nor against government-OBL.F-ABL

'Well, they don't cause any trouble [either] with the government nor against the government.'
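The negation pattern in this example wraps the inflected verb in a prefix and a suffix (in- before the verb, and the suffix written -e' in the example). A minimal sketch of that wrapping, using the hyphenated segmentation of the gloss line; the function name `negate` is hypothetical, and the sketch ignores any sound changes at the morpheme boundaries:

```python
def negate(verb: str) -> str:
    """Wrap a hyphen-segmented verb form in the negation markers."""
    return f"in-{verb}-e'"


# The verb from the example above: kar-ad 'do-3PL'.
print(negate("kar-ad"))  # in-kar-ad-e'
```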

Tense, aspect and modality
The system of expressing tense-aspect modality in Domari affects the construction of verbs in two dimensions. The first, the aspectual dimension, marks the difference between the present or non-perfective stem and the past or perfective stem. The difference is represented by the presence or absence of a perfective marker. The second dimension, the temporal dimension, is represented by vowel markers found at the end of the verb.

Two external tense markers are used in Domari to mark the temporal category. The first, -i, marks the progressive tense, which denotes a proximate, ongoing activity or event. The second, -a, marks a remote event that one cannot see or experience in the present context. When used with the present stem, the remote marker characterizes the imperfect past (karam-a 'I was doing, I used to do') and conveys a remote or inaccessible event that has not yet been completed. When used with the past stem, the remote marker characterizes the pluperfect past (kardom-a 'I had done, I would have done') and denotes a remote or inaccessible event that has been completed.

The absence of both external tense markers can also mark a change in tense. For example, the present stem without an external tense marker marks the subjunctive (karam 'that I do, I should do').
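The interaction of the two dimensions can be summarized as a lookup from (stem type, external marker) to a tense label. The sketch below covers only the combinations described above and raises an error for anything else. The names `TENSE_LABELS` and `inflect` are hypothetical, and pairing the progressive marker -i with the present stem is an assumption, since the text does not state which stem it attaches to.

```python
from typing import Optional

# (stem type, external marker) -> label; None means no external marker.
TENSE_LABELS = {
    ("present", "-i"): "progressive",  # assumed pairing, see lead-in
    ("present", "-a"): "imperfect",
    ("past", "-a"): "pluperfect",
    ("present", None): "subjunctive",
}


def inflect(stem: str, stem_type: str, marker: Optional[str] = None):
    """Return (form, label) for a combination described in the text."""
    key = (stem_type, marker)
    if key not in TENSE_LABELS:
        raise ValueError(f"combination not described in the text: {key}")
    form = stem if marker is None else stem + marker
    return form, TENSE_LABELS[key]


print(inflect("karam", "present", "-a"))
print(inflect("kardom", "past", "-a"))
print(inflect("karam", "present"))
```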

Noun Morphology
The largest class of nouns in Domari are simple morphemes that are not constructed from any morphological processes. Nouns that are constructed through morphological derivation are made mostly with category-changing suffixes.

Gender Inflection
A distinct class of nouns in Domari is inflectionally marked for gender. For example, gori 'horse' and kuri 'house' both have the feminine ending -i. On the other hand, mana 'bread' and qrara 'Bedouin' both have the masculine ending -a.

Plurality
Plurality of nouns is expressed with the marker -e, as shown below.

xudwar-e

child-PL

'children'

The affix used to express plurality in the oblique case is -an. An example is shown below.

lak-ed-om xudwar-an

see-PAST-1SG child-OBL.PL

'I saw the children.'
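The two plural markers can be captured in a small table keyed by case. This is only a sketch of the suffixation shown in the examples; the names `PLURAL_SUFFIX` and `pluralize` are hypothetical, and no stem alternations are modeled.

```python
# Plural markers from the text: -e (nominative) and -an (oblique).
PLURAL_SUFFIX = {"nominative": "-e", "oblique": "-an"}


def pluralize(stem: str, case: str = "nominative") -> str:
    """Attach the plural marker for the given case to a noun stem."""
    return stem + PLURAL_SUFFIX[case]


print(pluralize("xudwar"))             # xudwar-e
print(pluralize("xudwar", "oblique"))  # xudwar-an
```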

Indefiniteness
The category of indefiniteness is marked in nouns with the suffix marker, -ak.

gony-ak

sack-INDEF

'a sack'

Pronouns
The nominative 1st and 2nd person pronouns in Domari are derived from Indo-Aryan pronouns. Domari has ama and atu in the singular form, and eme and itme in the plural.

Demonstrative pronouns are another common class of pronouns in Domari, used to identify a particular entity and often to remove any ambiguity as to who or what the narrator is referring to. Demonstratives can be used to distinguish objects and actors that are physically present in the conversation taking place. In the example below, a man refers to his wife with the demonstrative eraki, and the physical presence of his wife enables the listener to identify her by this pronoun.

day-os er-a-ki māmi-m dir-i

mother-3SG this-OBL.F-ABL aunt-1SG daughter-PRED.SG

'This one's mother is my paternal cousin.'

In some cases, Domari also uses an additional set of morphemes, called enclitic subject pronouns, that attach to the interrogative pronoun or to the presentative particle. These pronouns are only used for the 3rd person and inflect for both gender and number, as shown below:

kate-ta? 'Where is he?'

kate-ti? 'Where is she?'

kate-te? 'Where are they?'

Case
Case marking on nominal categories in Domari can be categorized into two major groups, Layer 1 and Layer 2 case inflections. Layer 1 case markings are generally used to distinguish between the nominative and oblique in nouns and demonstrative pronouns. Layer 2 case inflections take on different semantic roles including the benefactive -ke, sociative/comitative -san, locative -ma, ablative and prepositional -ki, and dative -ta/-ka.

The examples below show how "the boy" takes different endings depending on whether it acts as the subject, the direct object, or the indirect object. The -as marking indicates the oblique case, showing that the boy is the object of the verb. In the third example, the -ke marking is added to indicate that the boy is receiving something; in this case, he is receiving the words of the person speaking.

er-a ŝōna

arrived.PAST-M boy

'The boy [subject] arrived.'

lake-d-om  ŝōn-as

see-PAST-1SG boy-OBL.M

'I saw the boy [direct object].'

pandži ŝir-d-a   ŝōn-as-ke

3SG   say-PAST-M boy-OBL.M-BEN

'He said to the boy [indirect object].'

Bilingual Suppletion
In Domari, bilingual suppletion occurs when constructing comparative adjectives. The Arabic word is borrowed for the comparative and superlative form of adjectives. Tilla in Domari is the word for 'big', but the Arabic word, akbar, is borrowed to say 'bigger.'

Particles
In Domari, particles can be used to indicate quotations, interjections, and modality. Many Domari particles have Arabic origins. The particle for marking quoted speech is qal, which is actually the Arabic past-tense 3SG verb for "he said". The Arabic particle yimkin, used to say "perhaps," also serves the same particle function in Domari.

Basic Word Order
The basic word order of Domari is mixed, although the majority of lexical verbs come before direct and indirect objects, and existential verbs are generally found in the final position. For example:

hu ka-št-a taz-a masi

he eat-PROG-3.SG fresh-M meat

'He eats fresh meat.'

gand gulda hi

sugar sweet is

'Sugar is sweet.'

Structuring Verbal Clauses
In Domari, however, it is not generally a default word order that regulates the ordering of the verb relative to other constituents. Instead, verbal clauses are structured such that the pre-verbal field and post-verbal field serve distinct functional roles. The post-verbal field contains the majority of the information in the clause, as it carries the verb's arguments, often the direct object, and the motivation or purpose of the action. In default object order, the direct object generally precedes the indirect object. The pre-verbal field is often less occupied, as seen in the example below, in which the pre-verbal field is empty.

However, the pre-verbal field can be used to name the principal actor in the situation and also to convey temporal and locative information, as it does in the example below.

In structuring the post-verbal field, the given information is generally placed closer to the verb than the new information.

Structuring Other Constituents
In structuring the genitive-possessive in Domari, the pattern is generally:

Head-PossSG + Modified-ABL

An example is shown below:

kury-os + kazz-as-ki

house + man-OBL.M-ABL

'The man’s house'

In structuring the comparative form of adjectives, the comparative precedes the head noun, as in the example below:

yaʕni ama akbar min nadẑwa-ki di wars

PTCL I bigger from Najwa-ABL two year

'So I am two years older than Najwa'