Wikipedia:Reference desk/Archives/Language/2023 January 25

= January 25 =

Many questions
--40bus (talk) 20:00, 25 January 2023 (UTC)
 * 1) Is there any script where numerals are cased?
 * 2) Does Spanish allow word-final consonant clusters?
 * 3) In Latin, numbers 18 and 19 are formed by subtracting 2 and 1, respectively, from 20: duodēvīgintī and ūndēvīgintī. Similarly, numbers 28 and 29 are again formed by subtraction: duodētrīgintā and ūndētrīgintā. Each group of ten numerals through 100 follows the patterns of the 20s but 98 is nōnāgintā octō and 99 is nōnāgintā novem rather than *duodēcentum and *ūndēcentum respectively. The question is: is there any Romance language that has retained this system?
 * 4) Is there any language which allows sonorant + obstruent combinations word-initially, like nsub, lkok and rpur?
 * 5) Is there any language in Europe which allows velar nasal [ŋ] in onset?
 * 6) Is there any script where letters have more than two cases?
 * 7) Does writing direction depend on the language or on the script?
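
The subtractive pattern described in question 3 can be sketched in a few lines of Python. This is only an illustration: the function name and the table are hypothetical, the forms for 18, 19, 28, 29, 98 and 99 are taken from the question itself, and the remaining tens (40 to 80) are filled in from standard Classical Latin rather than from this thread.

```python
# Hypothetical helper illustrating the Latin "subtract from the next ten" pattern.
# Tens 40-80 are supplied from standard Classical Latin, not from the question.
TENS = {
    20: "vīgintī", 30: "trīgintā", 40: "quadrāgintā", 50: "quīnquāgintā",
    60: "sexāgintā", 70: "septuāgintā", 80: "octōgintā", 90: "nōnāgintā",
}


def latin_eight_or_nine(n: int) -> str:
    """Return the Latin name of a number from 18 to 99 that ends in 8 or 9."""
    if not 18 <= n <= 99 or n % 10 not in (8, 9):
        raise ValueError("only numbers 18..99 ending in 8 or 9 are covered")
    if n >= 98:
        # Exception noted in the question: 98 and 99 are additive, not subtractive.
        return TENS[90] + (" octō" if n == 98 else " novem")
    next_ten = (n // 10 + 1) * 10
    # "duodē-" = two from, "ūndē-" = one from the following ten.
    prefix = "duodē" if n % 10 == 8 else "ūndē"
    return prefix + TENS[next_ten]


for number in (18, 19, 28, 29, 98, 99):
    print(number, latin_eight_or_nine(number))
```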

The only question I can answer is #7. It depends on language. Most languages are written from left to right. But Hebrew is written from right to left. Georgia guy (talk) 20:18, 25 January 2023 (UTC)
 * Uhm, no, that's wrong. It clearly depends on the script – whenever the same language can be written in, say, both Hebrew and Latin, or both Arabic and Cyrillic, etc., then invariably the Hebrew/Arabic version will be right-to-left, and the Latin/Cyrillic version will be left-to-right. Conversely, I'm not aware of any script that's written left-to-right in one language but right-to-left in another (there's some variation in Chinese, but that's a stylistic variation within one language, not between languages). Fut.Perf. ☼ 20:45, 25 January 2023 (UTC)


 * Prior to about 500 BCE, Ancient Greek, using early versions of the Greek alphabet, was written right-to-left. This changed over time to boustrophedon before flipping around to the left-to-right style we see today. As noted at History of the Latin script, early Latin writing could go right-to-left as well. So, there have been examples of languages using the Latin and Greek scripts that were written right-to-left. -- Jayron 32 20:59, 25 January 2023 (UTC)
 * True, of course. But that doesn't contradict the generalization: At any given point in time, it was a property of the script that it could be written R>L, L>R, boustrophedon or any combination of them. Even then, to the best of my knowledge, these habits wouldn't have differed significantly between languages, such that two languages shared the same alphabet but one of them consistently wrote it R>L and the other L>R. Fut.Perf. ☼ 08:03, 26 January 2023 (UTC)
 * Most writers are right-handed, everywhere in the world, and that doesn't seem to have much influence on the direction of the script. Some scripts nowadays are prone to change their writing direction depending on the context, such as Chinese or Japanese scripts. 176.128.237.169 (talk) 18:31, 28 January 2023 (UTC)
 * I don't think it's context as much as Western impact. Traditionally, the scripts have been written with horizontal lines going from right-to-left, but when written vertically they're now mostly written left-to-right, due to Western influence. 惑乱 Wakuran (talk) 19:37, 28 January 2023 (UTC)
 * Yes, in fact; so it depends on the context in which the text is written. The separateness of each grapheme also makes the script's direction easier to change. 176.128.237.169 (talk) 16:48, 29 January 2023 (UTC)


 * About word-initial [ŋ] in Europe, just look up the article on velar nasal: the list shows you examples from Albanian and Irish. Fut.Perf. ☼ 20:49, 25 January 2023 (UTC)
 * In Albanian it's an allophone of /n/ before /ɡ/, while in Irish it's the result of an initial mutation taking place after certain determiners, prepositions, etc, and therefore cannot occur after a pause (would be the same in other Celtic languages). --Theurgist (talk) 19:25, 26 January 2023 (UTC)


 * About word-final consonant clusters in Spanish, the answer is: very rarely. More than 70% of the syllables are open. It normally only happens in learned borrowings from Latin or modern loanwords, for instance trans, bíceps, or post. Qoan (talk) 21:08, 25 January 2023 (UTC)
 * Qoan, I'm sure Spanish in general is derived from Latin. Georgia guy (talk) 21:11, 25 January 2023 (UTC)
 * There is a difference between words that evolved alongside of the rest of Spanish as it diverged from Latin, and more modern direct borrowings from Latin. -- Jayron 32 21:12, 25 January 2023 (UTC)
 * Regarding #3: This page covers all of the major Romance languages except Romanian. Wikipedia has an article titled Romanian numbers which covers that one.  The answer appears to be likely not.  -- Jayron 32 21:12, 25 January 2023 (UTC)
 * Jayron32, you can find a Romanian number by looking up the English name for the number in Wiktionary (example: two) then going to the Translations template to find the Romanian word for that number. Georgia guy (talk) 21:19, 25 January 2023 (UTC)
 * You could do that. Or you can click the link I provided.  Which I think is easier.  But you do you.  -- Jayron 32 21:58, 25 January 2023 (UTC)
 * Ad #4: The Calabrian 'Ndrangheta comes to mind, which at least looks the part. Not sure to what extent the n is actually pronounced. This and other Calabrian words appear to be cases where an initial unstressed vowel has been dropped (aphaeresis (linguistics)). --Wrongfilter (talk) 22:02, 25 January 2023 (UTC)
 * To #4, in Slavic languages like Russian and Polish, word-initial clusters of /l/ or /r/ + obstruent are quite common. Nasal + obstruent clusters aren't common in Slavic as far as I know. In other languages, apparent nasal + obstruent clusters may often actually be prenasalized stops or have syllabic nasals, but perhaps there are some with true nonsyllabic nasal + obstruent clusters. —Mahāgaja · talk 22:17, 25 January 2023 (UTC)
 * I have not noticed Slavic examples of initial liquid+obstruent; share a couple? —Tamfang (talk) 08:34, 26 January 2023 (UTC)
 * Lviv, Rzhev --82.166.199.42 (talk) 08:40, 26 January 2023 (UTC)
 * And, for an example that is nasal + obstruent (despite what I said above), Mstislav. —Mahāgaja · talk 12:15, 26 January 2023 (UTC)
 * I would not call those quite common. There are isolated examples of various etymologies, chiefly toponyms, found across Slavic languages. For Serbo-Croatian, see e.g. Rgotina, wikt:rđa 'rust' or wikt:rdeč 'red' for Slovene (both coming from wikt:Reconstruction:Proto-Indo-European/h₁rewdʰ-). No such user (talk) 13:13, 26 January 2023 (UTC)
 * Quite the opposite, these are the regular outcome of Havlík's law, and toponyms are a small minority among them. For more examples, see the descendants of wikt:Reconstruction:Proto-Slavic/lъbъ (e.g. Polish łby) or of wikt:Reconstruction:Proto-Slavic/mъxъ (e.g. Polish mchy) or of wikt:Reconstruction:Proto-Slavic/rъtъ (e.g. Czech rty) or of pretty much any Proto-Slavic lemma starting with lъ or mъ or rъ --2A02:5080:1C00:C700:E17D:2B63:D229:1308 (talk) 19:05, 26 January 2023 (UTC)
 * In Polish and East Slavic the /r/ is part of the syllable onset, but in Serbo-Croatian it's a syllable on its own, as it would be in the other Slavic languages with a syllabic /r/: Macedonian, Czech and Slovak. But the OP asked for word-initial combinations, not for syllable onsets, so it technically qualifies. However, Slovene doesn't, because such an ⟨r⟩ is in fact pronounced /ər/. --Theurgist (talk) 19:14, 26 January 2023 (UTC)
 * "However, Slovene doesn't, because such an ⟨r⟩ is in fact pronounced /ər/." The truth is out there concerning this one; there certainly is an epenthetic schwa somewhere, and it is clearly audible at Forvo. It is not recorded in e.g. Kostelski slovar, which apparently uses some IPA-derived notation. For example, I don't hear a significant difference (apart from tone) in Slovene and Serbo-Croatian pronunciations of e.g. tržnica. No such user (talk) 10:55, 27 January 2023 (UTC)


 * Relating to #3 (though not answering it), there are several other examples of languages where the numbering system changes as you approach a multiple or power of ten. One example that comes to mind is Finnish yhdeksän "9", which is yksi ("1") with a suffix, and kahdeksan "8", which is kaksi ("2") with the same suffix. Another example is the tens in Russian. Apart from 40, which is unique, the numbers 20-80 are transparent compounds of unit-ten, e.g. шестьдеся́т (šestʹdesját) "60" = шесть (šestʹ) "6" + де́сять (désjatʹ) "10" (minus the soft sign at the end). But "90" is девяно́сто (devjanósto), which clearly starts with a form of де́вять (dévjatʹ) "9" and appears to end with сто (sto) "100", but nobody's sure what the bit in the middle is. --ColinFine (talk) 22:33, 25 January 2023 (UTC)


 * For #1, perhaps see text figures. This is not the same thing as letter case, but it has a similar visual effect. Shells-shells (talk) 22:49, 25 January 2023 (UTC)


 * Style guides often recommend not beginning a sentence with a number in digits because digits can't be capitalized. Years ago I was reading an article in Reader's Digest, Canada edition, and noticed that it was using a different rule.  It generally used old-style digits, but if a sentence began with a number in digits, that number was written in lining digits.  So a sentence like "800 to 900 people were affected" would look something like "800 to goo people were affected".  I have never seen this done anywhere else. --142.112.220.65 (talk) 05:55, 26 January 2023 (UTC)


 * Re #1: Roman numerals in Latin script may be in lower case (i, ii, iii, iv, ...) and upper case (I, II, III, IV, ...). --Lambiam 23:08, 25 January 2023 (UTC)

As for no. 1, Unicode now has both upper-case and lower-case versions of archaic Greek letters used only as numbers in modern times (koppa, sampi, stigma/digamma), but the usage justification for this is not entirely clear (some might say that some of the case forms were basically invented by Michael Everson)... AnonMoos (talk) 23:16, 25 January 2023 (UTC)
 * Unicode's "usage justification" is pretty much "yeah, alright, sounds legit" and not much else. If you don't believe that, the history of emoji basically boils down to the fact that one mobile phone operator in one country, in the age before international standardization, decided to include some pictograms in its phone character sets for text messaging on its phones, and skip ahead a few years and suddenly, emoji are in Unicode. Unicode's current architecture allows for about 1.1 million characters. As of now, just shy of 150,000 of those potential characters are in use. They're not really hurting for space, so they can afford to play a bit fast and loose with usage justification. -- Jayron 32 00:11, 26 January 2023 (UTC)
 * They decided to merge the CJK ideograms based on Chinese hanzi, though. I remember that some Asians got angry that they didn't do the same for the Greek-derived alphabets of Latin and Cyrillic, and the reply was that it would be too technically complicated to adjust for the different Lower Case standards. 惑乱 Wakuran (talk) 17:45, 26 January 2023 (UTC)
 * Wakuran -- In the 1990s, Chinese (both Beijing and Taipei) and Koreans were all basically satisfied with the results of Unicode efforts up to that point, while Japanese were often dissatisfied. There were some attempts at setting up character sets rival to Unicode which would be more satisfactory for Japanese needs, but I can't find anything about them by searching now, so I doubt they flourished... AnonMoos (talk) 01:14, 28 January 2023 (UTC)
 * Are you thinking of TRON code and Mojikyō? Double sharp (talk) 23:15, 31 January 2023 (UTC)
 * Ancient Greek writing had no cases, but capital numerals were used in printed modern Greek publications predating Everson, as seen here in a book published in 1867. For the majuscules of archaic letters, the typographers appear to have been inspired by versions of the archaic (monocase) shapes, basically enlarging or raising them to caps height. --Lambiam 11:50, 26 January 2023 (UTC)
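
The "about 1.1 million" figure quoted above for Unicode's code space can be checked with a little arithmetic. The sketch below assumes the usual description of the code space as 17 planes of 65,536 code points each; the constant names are made up for illustration.

```python
import sys

# Assumed layout: 17 planes (0 through 16) of 0x10000 code points each.
PLANES = 17
CODE_POINTS_PER_PLANE = 0x10000

total_code_points = PLANES * CODE_POINTS_PER_PLANE

# Surrogate code points U+D800..U+DFFF are reserved for UTF-16 and can
# never be assigned to characters.
surrogates = 0xDFFF - 0xD800 + 1
assignable = total_code_points - surrogates

print(f"total code points: {total_code_points:,}")
print(f"excluding surrogates: {assignable:,}")
# Python exposes the highest code point it supports as sys.maxunicode.
print(f"highest code point: U+{sys.maxunicode:X}")
```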

4: Georgians are extremely permissive about consonant clusters, so that they have Mtkvari flowing through Mtskheta, etc. Nasal+obstruent onsets are typical for Maghrebi Arabic, so that they have Mhraïer, Msila, etc. --82.166.199.42 (talk) 07:51, 26 January 2023 (UTC)
 * Nasal+obstruent onsets are also common in Albanian, so that they have Mbrostar, Ndroq, etc. --82.166.199.42 (talk) 11:07, 26 January 2023 (UTC)

6: Unicode includes several tricase sets in Latin Extended-B: Ǳ ǲ ǳ, Ǆ ǅ ǆ, Ǉ ǈ ǉ, Ǌ ǋ ǌ; each such digraph is considered a single letter of Gaj's Latin alphabet. --82.166.199.42 (talk) 08:29, 26 January 2023 (UTC)
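
The three case forms of these digraphs can be inspected with Python's standard string methods and the `unicodedata` module. This is a small sketch; `case_forms` is a hypothetical helper name, and it relies on Python's built-in Unicode case mappings.

```python
import unicodedata


def case_forms(ch: str) -> tuple[str, str, str]:
    """Return the (uppercase, titlecase, lowercase) forms of one character."""
    return ch.upper(), ch.title(), ch.lower()


# Lowercase members of the four tricase sets: dz, dž, lj, nj digraphs.
for lower in "\u01f3\u01c6\u01c9\u01cc":
    upper, title, low = case_forms(lower)
    # General category "Lt" marks a titlecase letter, distinct from "Lu" and "Ll".
    categories = [unicodedata.category(c) for c in (upper, title, low)]
    print(upper, title, low, categories)
```

Titlecase is the form used when only the first letter of a word is capitalized, which is why it differs from the all-caps form for these digraphs.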


 * In Arabic, letters can have up to four different forms (or cases), depending on where they occur in a word: beginning, middle or end, or stand-alone. There's no such thing as a capital letter in Arabic, however. Xuxl (talk) 16:18, 26 January 2023 (UTC)
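
The four positional forms can be seen in Unicode's compatibility block "Arabic Presentation Forms-B", taking the letter beh (U+0628) as an example. This is only an illustration: ordinary Arabic text stores the base letter and leaves the choice of form to the renderer, so these code points are not how text is normally encoded. The variable names are hypothetical.

```python
import unicodedata

# Presentation forms of ARABIC LETTER BEH (U+0628): U+FE8F through U+FE92.
BEH_FORMS = range(0xFE8F, 0xFE93)

forms = {}
for cp in BEH_FORMS:
    ch = chr(cp)
    # The decomposition looks like "<initial> 0628": a positional tag plus the base letter.
    tag, base = unicodedata.decomposition(ch).split()
    forms[tag] = ch
    print(f"U+{cp:04X}", unicodedata.name(ch), tag, base)
```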

4: Initial nasal+stop is quite common in Bantu languages, but it's a syllabic nasal. —Tamfang (talk) 08:34, 26 January 2023 (UTC)

8. Is there any Romance language where ⟨ce⟩ and ⟨ci⟩ are pronounced [ke] and [ki]? 9. Why is the name of the letter G in English [d͡ʒiː] and not [ɡiː]? --40bus (talk) 21:29, 27 January 2023 (UTC)

8. Not that I know of. Palatalization seems to have appeared very early. 9. Norman French impact. Haven't you asked about this about a gillion times already? Are you still not satisfied with the answers? 惑乱 Wakuran (talk) 21:56, 27 January 2023 (UTC)
 * Wakuran -- Sardinian is somewhat of an exception; see Sardinian_phonology... AnonMoos (talk) 01:19, 28 January 2023 (UTC)
 * Aahh. Sardinian still seems to follow the Italian orthographic standard with the spellings che and chi, though. 惑乱 Wakuran (talk) 03:04, 28 January 2023 (UTC)
 * Dear 40bus, I have already explained several times that the modern English lettername [dʒiː] is A COMPLETELY REGULAR PHONOLOGICAL DEVELOPMENT of the ancient Roman lettername [ge] along a path through Vulgar Latin, Old French, Middle English, and the Great Vowel Shift. What part of "COMPLETELY REGULAR PHONOLOGICAL DEVELOPMENT" are you having trouble understanding, and why are you basically wasting the time of people here with your meaningless persistence?? AnonMoos (talk) 01:10, 28 January 2023 (UTC)
 * Why didn't English borrow letter names from Dutch? --40bus (talk) 21:08, 28 January 2023 (UTC)
 * The Dutch were asking too much rent. --Lambiam 21:52, 28 January 2023 (UTC)
 * Why would it? Really, you've been told repeatedly that asking why a language developed in one way rather than another isn't very productive. This is especially true of English, which is absolutely loaded with irregularities.--User:Khajidha (talk) (contributions) 22:53, 28 January 2023 (UTC)
 * I note that in Norman French pronunciation, the letter G had a /dʒ/ variant (as opposed to the modern French /ʒ/), and the name of the letter was presumably /d͡ʒe/ like in Ecclesiastical Latin from 9th century France, contemporaneous with the Normans settling in Normandy. That's not to say that the pronunciation of the letter name in English is due to Norman influence: looking at English words of French origin, many of the dates given are a couple of centuries later than the Norman invasion, although that may merely be the date of the earliest available evidence for the words. But it's a possible reason, since the Normans didn't invade the Low Countries (which remained under the control of the Franks), although they did invade Italy, and in Italian the letter name is pronounced /d͡ʒi/ (and they have the /dʒ/ variant of G in words, of course). So this is somewhat OR but I blame the Normans.  Card Zero  (talk) 05:36, 29 January 2023 (UTC)

40bus -- If you've been paying any attention at all, then you know that French heavily influenced Middle English. Dutch had a moment of semi-influence on English in the late 15th century, resulting in "h" being added to the spelling of "ghost", and a few other minor things. However, the influence of Dutch on English has been nothing compared to the influence of French on English. If you think that it should have been the other way around, then you need to write an alternate-history novel or something, because disregarding past replies to questions in your future questions really does not accomplish anything whatsoever... AnonMoos (talk) 17:11, 29 January 2023 (UTC)


 * How about an alternate history in which 40bus absorbed the answers … —Tamfang (talk) 19:27, 29 January 2023 (UTC)

10. Is there any Balto-Slavic language where voiced sonorants also trigger voicing of voiceless obstruents, so that a nonsense word like sleda would be pronounced [zleda] rather than [sleda]?
 * Polish phonology gives examples such as kot rudy [kɔd‿ɾudɨ] --Crash48 (talk) 08:33, 31 January 2023 (UTC)

11. Is there any language which allows consonant clusters differing only in voicing like [gk], [xɣ] and [ll̥]? --40bus (talk) 20:30, 30 January 2023 (UTC)
 * English, across morpheme boundary, as in upbeat --Crash48 (talk) 08:33, 31 January 2023 (UTC)
 * And inside morphemes? --40bus (talk) 21:47, 1 February 2023 (UTC)


 * The thing to understand about language change is that, while it may follow certain historical patterns, those patterns are arbitrary and not predictable. That is, we can look at the past and come up with a pattern that describes some kind of linguistic change, such as Grimm's law, but we can't say why Grimm's law happened. There is no way to set up a predictive theory of language change, such that one could say what languages will do and why they will do it. When you ask questions like "Why didn't (something happen)", it will always be an unanswerable question from a linguistic perspective. Language change may follow patterns, but it is always arbitrary. -- Jayron 32 12:51, 31 January 2023 (UTC)

Cased numerals: that used to be the case in English, in the sense that lining figures were used in all-cap text, e.g. headlines, and text figures in normal text. (That's the same environment that gets you the third Unicode casing form with digraphs.) But not capitalizing the first digit in a sentence, or a proper name

Besides Albanian and several Italian languages with #/ŋɡV/, there are the Samoyedic languages such as Nenets, which has #/ŋV/ in eastern dialects. — kwami (talk)