Wikipedia:Reference desk/Archives/Language/2014 May 22

= May 22 =

== Chinese help ==
What are the characters in these images?
 * File:Ultimate PC & Mac Gallery denies being involved in the Edison Chen scandal.jpg
 * File:2.10 demonstration.jpg

Also, how would you say the following in Mandarin Chinese?
 * "No we only sell pirated software"
 * "denial of involvement in the Edison Chen celebrity sex photo scandal posted outside Ultimate Mac"

Thanks, WhisperToMe (talk) 11:05, 22 May 2014 (UTC)
 * The largest banner headline says 社會民主連線. Note that it's traditional Chinese, not simplified Chinese. That's all I have time/patience to do. If you're stuck on translating something you can read but can't type, you can learn to write Chinese characters by hand. You can then copy a character into an OS that lets you enter characters by stroke (iOS and Mac OS both do; I don't know which versions of Windows support this). Doing this is much easier than it looks: strokes always go in a particular order, your writing doesn't have to look nice, and the OS's predictive capabilities help a lot in resolving ambiguities (letting you choose between likely characters) and in guessing what you mean long before you've finished writing. -- JohnBlackburne wordsdeeds 12:01, 26 May 2014 (UTC)
 * Thank you for looking into it! I'll check for that kind of software. WhisperToMe (talk) 14:50, 28 May 2014 (UTC)

== Polish word "żółć" ==
Is the Polish word "żółć" (with four letters) the longest word (in any language) in which each letter has at least one diacritical mark? —Wavelength (talk) 19:48, 22 May 2014 (UTC)
 * Did you mean the longest? ---Sluzzelin talk  19:50, 22 May 2014 (UTC)
 * Yes, I did, and I am now correcting my question. Thank you.
 * —Wavelength (talk) 20:04, 22 May 2014 (UTC)
 * Someone (half-seriously) suggests things like "Ćśśśśśśśś" in this discussion, but I realize that's not quite what you meant (though "Ćśśśś", five letters, does get almost 18,000 hits). "żółćżeż", also mentioned in that thread, sounds interesting, but one of the seven letters lacks a diacritic, unfortunately. ---Sluzzelin talk  20:20, 22 May 2014 (UTC)


 * This says that a Romanian word can have up to 4 diacritics. So, if there were a 4-letter Romanian word with each letter containing a diacritic, it would match żółć as the longest so far. --   Jack of Oz   [pleasantries]  20:26, 22 May 2014 (UTC)


 * Thank you, Sluzzelin and Jack of Oz, for your replies and links. Incidentally, the Vietnamese word được (with four letters) contains four diacritical marks, but two of them are on one letter, so this word is not one in which each letter has at least one diacritical mark.
 * —Wavelength (talk) 14:07, 23 May 2014 (UTC)
 * I think you might have some luck with Skolt Sami. I've found the words ââ´ǩǩ and čää´čč on this word list, though how we count the freestanding acute accents I'm not sure. Kahastok talk 21:18, 23 May 2014 (UTC)
 * Your wish is my command. A Romanian word for 'teat', 'nipple', or 'udder' is țâță. Angr (talk) 21:36, 23 May 2014 (UTC)
 * There you go then. --  Jack of Oz   [pleasantries]  06:55, 24 May 2014 (UTC)


 * I'm wondering whether there might be a Japanese word of at least four characters from the set がぎぐげござじずぜぞだぢづでどばびぶべぼぱぴぷぺぽ. 86.190.50.244 (talk) 19:53, 23 May 2014 (UTC)
 * Thank you. I had not considered syllabaries or the dakuten.
 * —Wavelength (talk) 20:17, 23 May 2014 (UTC)
 * The Japanese noun ジグザグ (jiguzagu) means "zigzag".
 * —Wavelength (talk) 00:51, 24 May 2014 (UTC)


 * Oh yeah, good one! Shame we can't have the plural ジグザグズ. 86.190.50.244 (talk) 02:16, 24 May 2014 (UTC)
 * Czech has the verb šířit, whose second person singular is the 5-letter šíříš and whose third person singular is the 4-letter šíří. I think the longest such word could be found in a language such as Czech, Slovak, Latvian, or some Turkic or Sami language, because their alphabets contain enough diacriticized letters among both vowels and consonants. I wouldn't expect to find it in Vietnamese (where đ is the only diacriticized consonant), Polish (where four out of six diacriticized consonants do not occur before vowels), or Romanian (despite the țâță example given above). --Theurgist (talk) 00:20, 24 May 2014 (UTC)


 * You're right for Vietnamese. The longest possible string is three, as in đương, because none of the possible syllable finals has a diacritic. Itsmejudith (talk) 09:39, 24 May 2014 (UTC)
 * Let's still not rule out Polish as strongly as we did Vietnamese. It's not too hard to find 3-letter solutions and 4-letter and 5-letter near-solutions (having only one non-diacriticized letter) in Polish. Still, not all of the combinations between the six consonants and the three vowels that have diacritics are permitted by the language's orthographical rules. --Theurgist (talk) 14:51, 25 May 2014 (UTC)


 * I was looking at a collection of Native American texts a few months ago and some languages struck me as having an inordinate number of letters, both consonants and vowels, with diacritics. Unfortunately I didn't have a lot of time with the collection and had to focus on the task at hand, so I can't remember which particular languages used the heavily "diacriticized" alphabets, although I think they may have been from the Athabaskan family. This might be a place to look for a longer word that meets your criteria. --William Thweatt TalkContribs 05:44, 24 May 2014 (UTC)


 * Does a vowel mark in an abugida count as a diacritic? —Tamfang (talk) 07:19, 25 May 2014 (UTC)


 * No, it does not. According to Abugida (version of 20:39, 22 May 2014), "[i]n general, a letter of an abugida transcribes a consonant. Letters are written as a linear sequence, in most cases left to right. Vowels are written through modification of these consonantal letters, either by means of diacritics (which may not follow the direction of writing the letters), or by changes in the form of the letter itself." (I am revising some indentations in this section, in harmony with WP:TPOC, point 8: Fixing format errors.)
 * —Wavelength (talk) 15:23, 25 May 2014 (UTC)
 * I don't see how that quote supports your assertion "No, it does not". On the contrary, it says "Vowels are written... by means of diacritics", which would seem to suggest the answer to Tamfang's question is "Yes, it does". Angr (talk) 17:33, 25 May 2014 (UTC)
 * Yes, a vowel mark in an abugida counts as a diacritic, and I understand why I interpreted the text incorrectly.
 * —Wavelength (talk) 17:46, 25 May 2014 (UTC)


 * Did Wavelength subtly scold me for somehow violating the indenting convention? I used one colon to respond to the OP; ought I to have used some other number? —Tamfang (talk) 07:50, 26 May 2014 (UTC)
 * This is a record of my revision of 17:46, 25 May 2014 (UTC).
 * —Wavelength (talk) 15:04, 27 May 2014 (UTC)


 * In that case—why not include abjads? Abjads that include vocalization can have multiple (usually optional) diacritics on every character of fairly commonplace words. הסרפד  (call me Hasirpad) 22:42, 25 May 2014 (UTC)

Interesting discussion. I wonder if any of it disproves the claim by the Guinness Book of World Records that the archaic French spelling of Héréhérétué, an atoll in French Polynesia, is the word with the most accent marks in any language (the claim is mentioned in our article on "Hereheretue"). — SMUconlaw (talk) 01:37, 26 May 2014 (UTC)


 * It surely isn't. In languages employing agglutination and vowel harmony such as Turkish, it's not hard to find examples such as gözlükçülüğü "his work as an optician" or üçüncülüğü "his state of being third" (each having 7 accented letters). Some languages with recently devised orthographies may use diacritics to represent tones - check out the Wikipedias in Yoruba, Sango and Navajo to get an idea of the extent to which these languages make use of modified letters. --Theurgist (talk) 09:10, 26 May 2014 (UTC)
 * Perhaps what the Guinness Book of World Records meant was acute accents and not accent marks (i.e., diacritics) in general? — SMUconlaw (talk) 09:15, 26 May 2014 (UTC)
 * Here too you can find examples in agglutinative languages (though these aren't words you'll find in most dictionaries): "újjáválaszthatóságáról" (Hungarian for "about his or her re-electability") has seven. As for Héréhérétué, French has another, more common word which is normally (i.e. not archaically) spelled with five "é"'s: hétérogénéité. ---Sluzzelin  talk  10:18, 26 May 2014 (UTC)
 * Perhaps you should update the "Hereheretue" article. — SMUconlaw (talk) 10:55, 26 May 2014 (UTC)
 * Done. Kahastok talk 11:22, 26 May 2014 (UTC)


 * If I'd been smart enough to think of combining cases and possessive forms before Sluzzelin's comment about Hungarian reminded me of that, I'd have come up with üçüncülüğümüzü "our state of being third" (definite accusative), which has 9 accented letters, or even with the hypothetical predicative formation üçüncülüğümüzmüşsünüz "you (pl.) were our state of being third", which - unless I've got the declension wrong (I'm not too good at Turkish) - contains as many as 12 diacriticized letters. --Theurgist (talk) 13:54, 26 May 2014 (UTC)
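
The letter counts quoted in this thread can be checked mechanically. The following is only a rough sketch, not a definitive test: it assumes that "has a diacritic" can be approximated by "the Unicode character name contains WITH" (for example LATIN SMALL LETTER L WITH STROKE), which covers precomposed Latin letters such as ł and đ that a plain decomposition check would miss. The function names are made up for illustration. The heuristic does not cover kana with dakuten, freestanding accents such as the Skolt Sami ´, or abugida and abjad vowel marks.

```python
import unicodedata


def has_diacritic(ch: str) -> bool:
    """Heuristic (an assumption, see above): the Unicode name contains ' WITH '."""
    return " WITH " in unicodedata.name(ch, "")


def count_diacriticized(word: str) -> int:
    """Count the characters that carry at least one diacritic."""
    # NFC composes base letter + combining mark into one code point where possible.
    composed = unicodedata.normalize("NFC", word)
    return sum(has_diacritic(ch) for ch in composed)


def all_diacriticized(word: str) -> bool:
    """True if the word is non-empty and every character carries a diacritic."""
    composed = unicodedata.normalize("NFC", word)
    return bool(composed) and all(has_diacritic(ch) for ch in composed)


if __name__ == "__main__":
    for w in ["żółć", "țâță", "šíříš", "được", "żółćżeż", "üçüncülüğümüzü"]:
        print(w, all_diacriticized(w), count_diacriticized(w))
```

Under this heuristic żółć and țâță pass with four letters each and šíříš with five, while được fails because its final c is unmarked and żółćżeż fails because of the plain e.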