User:Jasy jatere/Diacritics

Fleshing out the differences
(this section is not a part of the above proposal, but a separate deliberation on the understanding about the level on which use of English in naming conventions applies)

As I understand it, there is disagreement about the level on which UE applies. At least the following possible levels of application can be distinguished, starting from broad and going down to very narrow interpretations.


 * 1. UE applies to English as a linguistic system of form-meaning correspondences. Case in point: The Hague to be preferred over Den Haag.
 * 2. UE applies to the script English is written in, i.e. the Roman script. Case in point: Thessaloniki to be preferred over Θεσσαλονίκη. If not further narrowed down, Meißen to be preferred over Meissen.
 * 3a. UE applies to the subset of Roman script comprising all letters known to English speakers plus diacritics on them. Cases in point: René Descartes to be preferred over Rene Descartes, but Meissen to be preferred over Meißen, because e is known to speakers of English, but ß is not.
 * 3b. UE applies to all letters that can be input on a standard QWERTY keyboard in any variant of US_xxxx. Basically the same as one above, diacritics are no problem but ß is.
 * 4a. UE applies to the subset of Roman script known as the English alphabet (a-zA-Z). No diacritics. Case in point: Rene Descartes to be preferred over René Descartes, Charlotte Bronte to be preferred over Charlotte Brontë, Cerenkov radiation to be preferred over Čerenkov radiation, menage a trois to be preferred over ménage à trois.
 * 4b. UE applies to all letters that can be input on a standard QWERTY keyboard in the "basic" variant of US. Same as one above.
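
The character-set tests behind levels 2, 3a and 4a can be sketched in Python. This is only a rough illustration under stated assumptions: the names `base_letters` and `narrowest_level` are made up here, a "diacritic" is taken to mean a Unicode combining mark that separates from its letter under NFD decomposition, and level 1 (whether an English exonym exists) cannot be decided from the characters at all.

```python
import unicodedata


def base_letters(text):
    """Return text with combining marks removed after NFD decomposition."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def narrowest_level(title):
    """Return the narrowest level ("4a", "3a" or "2") a title satisfies,
    or None if it is not written in Roman script (hypothetical helper)."""
    # Compose first so that "e" + combining acute counts as one letter.
    title = unicodedata.normalize("NFC", title)
    letters = [ch for ch in title if ch.isalpha()]
    # Level 4a: only the plain English alphabet a-zA-Z.
    if all(ch.isascii() for ch in letters):
        return "4a"
    # Level 3a: a-zA-Z once detachable diacritics are removed.
    stripped = [ch for ch in base_letters(title) if ch.isalpha()]
    if all(ch.isascii() for ch in stripped):
        return "3a"
    # Level 2: every remaining letter is named as a Latin letter in Unicode.
    if all(unicodedata.name(ch, "").startswith("LATIN") for ch in stripped):
        return "2"
    return None
```

Under these assumptions René Descartes lands on 3a and Meißen on 2, as in the list above. Letters such as ł or ø have no NFD decomposition, so the sketch groups them with ß at level 2, which may or may not be the intended reading of 3a.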

I think that everybody agrees on 1. and 2.; the question is whether Wikipedia should agree to narrow the interpretation down to 3. or 4. Jasy jatere (talk) 12:52, 23 February 2008 (UTC)
 * I certainly do not agree with (2) Meißen to be preferred over Meissen --Philip Baird Shearer (talk) 14:38, 23 February 2008 (UTC)
 * You are right, I could have been more precise. I meant everybody agreed on Thessaloniki to be preferred over Θεσσαλονίκη. The treatment of Meissen depends on the position on 3 and 4. I edited 2 so that you should now be able to agree with it, I think. 80.61.183.71 (talk) 16:13, 23 February 2008 (UTC)
 * I prefer Thessalonica myself. This entire framework omits the position we presently state, and actually follow: do whatever English does: Meissen, but Groß Gerau. Septentrionalis PMAnderson 17:39, 23 February 2008 (UTC)
 * this is not a framework, the list is just intended to collect the arguments made on this talk page over and over again ad infinitum. If we have some structured table where the pros and cons are listed, it will be easier to refer to
 * UE applies to all letters that can be input on a standard QWERTY keyboard in any variant of US_xxxx. Basically the same as one above, diacritics are no problem but ß is.
 * ß can be typed on US_intl by pressing AltGr and s, and what else can be typed on US_intl is dependent on operating system and browser, so that point is completely flawed. - MTC (talk) 08:00, 24 February 2008 (UTC)

I think we may be losing the raison d'etre for the standard described as common English usage, which is usability. There is therefore a balance to consider: common English usage for well-known places, native usage for less well-known places or ones with no wide usage of a transliterated name (or where diacritics are just stripped off out of habit, not for any other reason). As I've mentioned elsewhere, there are many places in central Europe where only the current native spelling really makes sense to strip away Latin to Cyrillic to Latin double transliterations, etc. and where there are multiple English "variants" even in use today for a single place or geographic feature. At the end of the day, we want the reader to be able to (a) read the article and (b) go to a book store, buy a map, and look up the places they read about. For much of Eastern/Central Europe, that involves native spelling. Without considering usability, the arguments on both sides are being conducted in a vacuum. My Windows keyboard is set up for English, Latvian, Lithuanian, Estonian, Romanian, Polish, German, and Russian (a bit more of a challenge to use). For the most part, right-alt (GRE) plus the letter gets the right version as the large majority of letters only have one diacritic in each language, e.g., ļ in Latvian, ł in Polish, so writing place names natively is not an issue anymore. Just a thought. —PētersV (talk) 02:26, 27 February 2008 (UTC)

arguments in favor of 2., not beyond

 * "correct" spelling
 * diacritics serve to disambiguate topics e.g. rose/rosé, Dauphine/Dauphiné, Bø/Bo/Bő/Bō (from Knepflerle) Jasy jatere (talk) 16:56, 25 February 2008 (UTC)
 * there is no well-defined transliteration scheme to get rid of diacritics. Should Jürgen become Jurgen or Juergen? (from Knepflerle) Jasy jatere (talk) 17:32, 25 February 2008 (UTC)
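
The last two points can be illustrated with a short Python sketch. It is an illustration only: `strip_marks`, `expand_german` and the `GERMAN_EXPANSIONS` table are hypothetical names introduced here, and the ae/oe/ue/ss expansion is just the familiar German convention, not a scheme proposed anywhere on this page.

```python
import unicodedata

# Assumed mapping: the conventional German way of writing umlauts and ß
# when the special letters are unavailable.
GERMAN_EXPANSIONS = {
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
    "ß": "ss",
}


def strip_marks(text):
    """Drop combining marks after NFD decomposition (plain stripping)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def expand_german(text):
    """Replace umlauts and ß by their conventional German expansions."""
    composed = unicodedata.normalize("NFC", text)
    return "".join(GERMAN_EXPANSIONS.get(ch, ch) for ch in composed)
```

Here `strip_marks("Jürgen")` yields Jurgen while `expand_german("Jürgen")` yields Juergen, so two plausible schemes already disagree on one name. Plain stripping also maps rosé onto rose, which is the disambiguation problem raised above, and it leaves ß untouched because ß carries no separable mark.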

arguments in favor of 3., not beyond

 * people get confused by ß, but not by à and é
 * some transliteration systems make use of letters that are in Roman script, but not within the range [a-zA-Z], for instance the International Alphabet of Sanskrit Transliteration, which has ū Ū ṛ Ṛ ṝ Ṝ ḷ Ḷ ḹ Ḹ, among others. There is no Anglicization of Sanskrit. Depriving lemmas of their diacritics would go against common scientific practice in the field since 1912. Jasy jatere (talk) 16:39, 23 February 2008 (UTC)

arguments in favor of 4.

 * basic layout on QWERTY computers. Users find it difficult to change layouts
 * people get confused by à and é
 * people cannot pronounce à and é
 * English only uses those letters
 * This is the alphabet taught in school in English-speaking countries
 * wikipedia users cannot be required to learn other languages
 * some people's browsers display boxes when encountering rare diacritics, like a w with a dieresis (¨), i.e. ẅ.

Please include additional arguments above in the appropriate sections, but please discuss the validity of arguments below to keep the list of arguments neat. Jasy jatere (talk) 12:52, 23 February 2008 (UTC)

argument "correct spelling"
To me, this argument is paramount. An encyclopedia is meant to educate people about the facts, and not to dumb the orthography down. — Nightstallion 13:49, 23 February 2008 (UTC)
 * To me, this argument is fallacious. The correct English spelling of Meissen is Meissen, as the correct English spelling of Nuremberg is Nuremberg, and the correct English spelling of Florence is Florence. Meißen, Nürnberg, Nuernberg, Firenze are linguistic facts, probably worth mentioning in the relevant geographic articles, but not their titles. This is one of the things interwiki links are good for; one of the problems with hypercorrection is the damage to Wikipedia as a whole. (Where is a German Wikipedian to find out about English preferences for Nuremberg, Meissen, and Florence if not here? His own wikipedia is unlikely to tell him; any more than the Italian Wikipedia mentions Kalifornien.) 17:48, 23 February 2008 (UTC)
 * The argument "correct spelling" applies only where there is no English exonym, i.e. an English name for the town (or other entity). For Firenze and Nürnberg it is very clear that the correct way of rendering them in English is Florence and Nuremberg respectively. The question is what wp should do with towns and persons that do not have a clear-cut English name, like Düren. Should the diacritic remain or should it go? This is where the argument "correct spelling" kicks in. It argues that the most "correct" way to spell that town's name is "Düren" and that "Duren" would be "wrong". Jasy jatere (talk) 18:30, 23 February 2008 (UTC)
 * My point exactly. I'm 100% sure about Meissen vs Meißen, but I could be wrong there. — Nightstallion 12:39, 24 February 2008 (UTC)


 * The claim about exonyms is almost a tautology; if the correct English spelling is not the local one, the English word must be an exonym ("almost" covers Paris, where the exonym and the endonym are spelled alike). Therefore this is substantially true, and entirely trivial. Septentrionalis PMAnderson 23:11, 24 February 2008 (UTC)
 * Let's distinguish the cases 1) where foreign language and English are completely different (Firenze/Florence), 2) where they differ in the use of a diacritic, but the one without diacritic is the well established English name (Zurich/Zürich) and 3) where English has no name (Düren). The argument "correct spelling" will only apply to 3). Any attempt to establish that English has no name for an entity will most likely be circular, in the sense that one checks texts and only finds the version with diacritics. On the other hand, the absence of an English exonym should be the null hypothesis for very local topics, say a small river in Turkey like Üçköprü.
 * All this amounts very much to the same as "Use what a majority of English sources use"; I certainly agree. Use of a differently spelled exonym in English sources trumps local spelling. Still, technical limitations, notational convenience, or ignorance causing a different spelling do not create an instant exonym. Only if reputable sources with editorial skill use a different spelling will the name be an exonym (like Zurich, for instance). Jasy jatere (talk) 17:24, 25 February 2008 (UTC)
 * The claim here is that English cannot adopt Zürich, but only Zurich, as the English name. I do not believe this; English has adopted Göttingen.


 * The practical difference from what we do now would be to always adopt the non-local spelling where there is one; not only Zurich, but Lyons and Leghorn. I am tempted to endorse this, if only to watch the nationalist screaming when it is enforced; but even I think it more trouble than it is worth. Septentrionalis PMAnderson 21:37, 25 February 2008 (UTC)
 * I actually do think that your suggestion to use the non-local spelling if there is one is a good one, provided that this spelling is not the result of technical limitations, notational convenience, or ignorance. And I do think that technical limitations, notational convenience and ignorance are common even among widely read sources if their focus happens to be something other than the origin of the person under discussion. Jasy jatere (talk) 16:09, 26 February 2008 (UTC)
 * I guarantee that if that phrasing is adopted, the cry will go up that Zurich is the result of "technical limitations... or ignorance".
 * Well, that can be countered by http://www.flughafen-zuerich.ch/ZRH/default.asp?ID_site=1&sp=de&hp=1, which has the ü in German but not in English. The airport people are not ignorant, lazy, or technically limited, as their German page proves.
 * I'm not quite sure what notational convenience means here, although it sounds like several perfectly good reasons to adapt a foreign word. Septentrionalis PMAnderson 05:32, 28 February 2008 (UTC)
 * I use "ss" as a notational convenience when typing German when I don't care about the ß. Some people never use capitals out of notational convenience. Very similar to laziness, I think Jasy jatere (talk) 14:31, 29 February 2008 (UTC)

argument "people get confused by ß"

 * This is correct as a claim of fact; readers do. Every one (IIRC) of the Meissen discussions has included someone inveighing against Meiben.


 * This should not be surprising: the eszett combines two forms, the long s and the ornate z, both of which are archaic in English; indeed, confusing long s with f is a standard joke. Readers of English with no German (and, by WP:NC, our titles should be chosen for them) will not recognize it as a combination of two Latin letters, and if by some chance they do, they will read it as sz - which was correct in Goethe's time, but apparently not now. Septentrionalis PMAnderson 23:08, 24 February 2008 (UTC)

argument "well established transliteration systems use diacritics"
Finnish is transliterated using "diacritics", or in this case the Finnish letters ä, ö and å. All English-language materials produced by the Finnish government or its agencies will use ä, ö and å in personal names and placenames. Any proposal to omit these in Wikipedia is contrary to practice by all informed translators of Finnish. Elrith (talk) 01:36, 1 March 2008 (UTC)

argument "people get confused by á, è, î etc"

 * á is not very different from a. Most people will make the link between á and the base letter. Jasy jatere (talk) 16:32, 23 February 2008 (UTC)
 * Argument against: Is it really clear to the hypothetical uninformed reader that ı is a variant of i while ḷ is a variant of l? Is it clear to him that ð is a variant of d and not of o? Is it clear that ǫ is a variant of o and not of p? Is it clear what letters į, ų and y are variants of? Are these letters really any less confusing than ß or ə? Haukur (talk) 22:46, 23 February 2008 (UTC)
 * maybe one needs to distinguish between '`"~^ (non-touching and rather common) and ogoneks (touching and unfamiliar). I think that ḷ is a variant of l and not of some other letter is the null hypothesis, same for ı being a variant of i. I would not consider ð a (straightforward) variant of d since the shape of d is not present in it.Jasy jatere (talk) 17:30, 25 February 2008 (UTC)
 * Yes, there are clearly some tricky issues here. For my part I think ı looks a bit like a short l and that ḷ looks a bit like an upside-down i. My main point is that I don't think there is a clear dividing line between a) variants of a-z and b) other letters present in Latin alphabets. I'm fine with using them all, ß included. But of course we should still use familiar English forms where those exist, e.g. Gauss rather than Gauß and Thor rather than Þórr. Haukur (talk) 23:32, 25 February 2008 (UTC) Corrected. I had the Gauss/Gauß example in the wrong order - hope that doesn't make non-sense of NS's agreement. Haukur (talk) 09:43, 26 February 2008 (UTC)
 * I agree, yes. — Nightstallion 07:39, 26 February 2008 (UTC)
 * What's the point? Pro-diacritics editors have had control of English Wikipedia for a while & they'll never give up that control. English speaking/reading editors will have to continue to suffer under the oppression. GoodDay (talk) 17:15, 27 February 2008 (UTC)
 * How does reading Niinistö instead of Niinisto oppress you? In my mind, it would be more sensible to argue that a Finnish speaker having to read Niinisto would be a form of cultural oppression. It may be news to you, but English is an international language and is not the property of countries that have made it their official language. Elrith (talk) 01:38, 1 March 2008 (UTC)

argument "keyboard problems on QWERTY"

 * There's no requirement to spell English correctly to contribute - other editors tidy up, and will add diacritics. It's how they got into the articles in the first place. (from Knepflerle) —Preceding unsigned comment added by Jasy jatere (talk • contribs) 16:15, 25 February 2008 (UTC)

argument "pronunciation difficulties"

 * Szczecin is surely harder to pronounce, even without diacritics, than Kraków Jasy jatere (talk) 12:52, 23 February 2008 (UTC)

argument "English does not use other letters"

 * Britannica, Encarta, the BBC, the New York Times, Time magazine do use these letters (from Knepflerle) Jasy jatere (talk) 16:54, 25 February 2008 (UTC)

argument "wikipedians cannot be required to learn foreign languages"

 * Where is the evidence that readers have to know French to read the word rosé, pied-à-terre, Besançon ? (from knepflerle) Jasy jatere (talk) 17:00, 25 February 2008 (UTC)