Wikipedia talk:Naming conventions (standard letters with diacritics)/Archive 1

So what actually is this saying?
Noticeable by its absence is any indication of what the author actually wants to do with diacritics.

Are we intending to encourage or discourage their use?

FWIW my opinion is that articles should have titles which are correct, and alternate spellings should be catered for using REDIRECTs labelled with R from title without diacritics. HTH HAND —Phil | Talk 11:16, 30 January 2006 (UTC)


 * I concur with your proposal. -- Nightstallion (?) 12:34, 30 January 2006 (UTC)


 * I think that it says that all diacritics used in modern languages should be used, while some diacritics typical only to old, inactive languages that have new form of spelling now should be avoided. It also excludes languages using other than Latin alphabet. The proposed label might be added to the policy.--SylwiaS | talk 19:35, 30 January 2006 (UTC)
 * I fully agree. For example, "Jaromir Jagr" is an incorrect spelling; when it is used, it is only for technical reasons. Wikipedia is supposed to be an encyclopedia, so it should give correct information. - Mike Rosoft 01:35, 8 February 2006 (UTC)
 * Ditto. Priority in choice of spelling for a proper name (e.g., of a ruler or other person, or of a geographic entity) should be given to that regarded as correct by the people with whom the name originates.  That is the practice in serious scholarship, and we should not patronize any parties involved by talking down to them and homogenizing to the least common denominator.  logologist|Talk 07:22, 8 February 2006 (UTC)
 * I know I'm repeating myself here, but -- 100% support from my side. -- Nightstallion (?) 11:53, 8 February 2006 (UTC)
 * Totally false. Jaromir Jagr, the spelling routinely used in thousands upon thousands of newspaper, television, and magazine reports about this person, is most definitely not a misspelling.  It is never a misspelling to use the English alphabet when writing in the English language.  We have every bit as much right to use our own alphabet as anybody else does.  We can certainly choose to include diacritics in his name; to do so because not to do so would be a misspelling is patently false.  Gene Nygaard 10:13, 5 February 2007 (UTC)


 * Gene, I have read that argument from you many times. Wards kan bee speled rong youing tha English alphabet wen riting in tha Inglish langwage two. Bendono 12:29, 5 February 2007 (UTC)

From here after the first version of the guideline formulation had been written down --Francis Schonken 22:23, 8 February 2006 (UTC)

With provision 4 this seems like a reasonable proposal. I see no harm it can do. But perhaps it would be useful if the author (or anybody else interested in this) could show us few examples where this policy would actually force us to rename an article? I see little point in creating a policy that would do nothing. This has to 1) solve some current disputes 2) make us rename some articles.--Piotr Konieczny aka Prokonsul Piotrus Talk 17:21, 8 February 2006 (UTC)
 * Note that according to the proposal, anybody who wants to write an article on a subject which should include diacritics, let's say the Łobzów district of Kraków, would have to do BOTH of the following:
 * 1) Find 10 reliable publications about the district written in English which only use the diacritics version when referring to the place.
 * Please bear in mind here that less than 1% of wikipedia articles provide 10 references.
 * 2) Do a google search (or something equivalent) and from that make a convincing argument that 20% of all English webpages referring to Łobzów outside of wikipedia use the diacritics. Here is a test but note that even if I tried to exclude wikipedia, still at least two of the first 10 google pages are copies of wikipedia articles.

But don't take my word for it. Let's ask the author of the proposal (or anybody else) whether any of the following pages would have to be moved if this was accepted: Stefán Ingi 19:52, 8 February 2006 (UTC)
 * Straße des 17. Juni (German)
 * Jyväskylä (Finnish)
 * Alliance française (French)
 * Tromsø (Norwegian)
 * the brothers Ó Siochfhradha (Irish)
 * Gàidhealtachd (Scottish Gaelic)
 * Antonín Dvořák (Czech)
 * Łódź (Polish)
 * Davíð Oddsson (Icelandic)
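
For illustration only (this is not part of the proposal text): the two conditions discussed above, at least 10 reliable English-language publications using the diacritics form and at least a 20% share of web hits, can be read as a simple check. The function name, parameter names, and the way hits are counted are assumptions made for this sketch; the hit counts would have to come from whatever search the editor actually ran.

```python
def meets_diacritics_conditions(hits_with: int, hits_without: int,
                                english_refs_with: int,
                                min_share: float = 0.20,
                                min_refs: int = 10) -> bool:
    """Rough reading of the two conditions quoted in this thread.

    hits_with / hits_without: search-hit counts for the spelling with and
    without diacritics (hypothetical inputs, however they were obtained).
    english_refs_with: number of reliable English publications that use the
    diacritics spelling.
    """
    total = hits_with + hits_without
    if total == 0:
        # No web evidence at all: the share condition cannot be shown.
        return False
    share = hits_with / total
    return share >= min_share and english_refs_with >= min_refs
```

Both conditions have to hold at once, which is why a name with plenty of web hits but few English publications would still fail under this reading.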

@Piotr: The only question is if it would make a positive difference for wikipedia: if it would bring down the number & length of discussions about diacritics that would already be something; if it would bring down the number of WP:RM requests (either by additionally defining some types of moves as "non-controversial", or even better, because nobody would even see the need any more to move or to WP:RM certain diacritics-related pagenames) that would be a good point in itself, wouldn't it? Even today a new Village pump topic was started with emotions going high: Wikipedia:Village pump (policy)#Using diacritics (or national alphabet) in the name of the article (quote: "Man, I feel like the bottom man in a dogpile.", etc). If that could be tackled more easily by a guideline many people feel easy with, then that's enough for me. Even if not a page needs to be changed.

@Stefan: --Francis Schonken 22:23, 8 February 2006 (UTC)
 * "ß" and "ð" are presently outside the scope of this guideline, see Naming conventions (standard letters with diacritics). The present guideline wouldn't change a thing about Straße des 17. Juni and Davíð Oddsson, neither, for example, about Weißenhof or Weissenhof...
 * "Łobzów" wouldn't be so difficult. Appears Wladyslaw IV Vasa was born there. It appears to have a famous garden. It appears to have a lot of restaurants presently. So I could find at least 10 references in English for it having a lot of restaurants. If you'd reject these (wikipedia is not a tourist guide), even finding 10 non-wikipedia references in English for the "Wladyslaw IV birthplace" + "the famous garden" seems not too difficult. Anyhow, I had a higher Google result on "Łobzów" than on "Lobzow". Further, note that at the WikiProject Geography of Poland naming conventions are being built presently (there's a vote going on at the moment): so in the future there might come a guideline that takes over (unless that guideline says nothing about diacriticals).
 * Antonín Dvořák might have to move back to Antonin Dvorak (at first sight less than 20% of google results, but would need further checking), but then he's without diacritics at the Radio Prague website (the same website spells of course Antonín Dvořák on their pages in the Czech language, so this is not about "technical limitations"); further, he lived a few years in America (and many Americans would of course throw the diacritics overboard when they had to write his name); and at Wikipedia his compositions are disambiguated with "(Dvorak)" and not by "(Dvořák)", except his 7th symphony (see Category:compositions by Antonin Dvorak) - so the prevailing spelling is even at Wikipedia "Antonin Dvorak" - wouldn't hurt to see all these a bit on the same line would it? Note also that there is a WikiProject Composers - if they decide to make a naming conventions guideline the naming could be settled on the Czech name too - I wouldn't have a problem with that! Even Naming policy (Czech) could be policy within a week or so...
 * Alliance française has far above 20% of google hits, and as it is an organisation that publishes a lot (and about which a lot is published) 10 reference works wouldn't be a problem, would it? So this would stay where it is.
 * Well, I did three examples now, maybe you do some?
 * Hmm, there seem even more conventions around, limiting the scope of this proposal, than I had realised. So it seems very difficult to foresee which moves this proposal will mandate. That certainly makes me worry. Another problem I have, aside from the instruction creep of asking people to dig up 10 references and combing through google results before they can write an article, is that I just don't agree that we need to have a fallback convention severely limiting the use of diacritics. In fact I don't think the use of them needs to be limited at all in articles on Irish, Norwegian, Polish, Croatian or German subjects and others. And it seems to me that this is a feeling shared by most editors who are actively working on these articles and for the most part we should just let them get on with it. So finally, rather than give examples, I would simply suggest that we drop this proposal. Stefán Ingi 23:01, 8 February 2006 (UTC)
 * Concur heartily. I propose one overarching convention:  that the presumption will favor use of authentic spellings, with diacritics, in all cases.  logologist|Talk 23:18, 8 February 2006 (UTC)


 * I strongly oppose that. It is exactly the same type of arguments used by people to argue that an 80-year long American citizen, who never used diacritics in his own name at least as an adult from anything found after extensive arguing, an American educated college physics professor who devised a chess rating scale known throughout the world as the Elo rating system (though the words for "rating system" differ of course, but always "Elo" without any diacritics, in pretty much any language, at least in anything published before the name was butchered and corrupted by Wikipedia, though sometimes capitalized as ELO by people who mistake it for an acronym),  should be at Árpád Élő.  Furthermore, while the ship's manifest when he left to come to America as a child did include diacritics in his name, it was different diacritics, in addition to two l's instead of one in his surname, in that manifest.  Nobody has ever provided any contemporary to that usage evidence whatsoever as to the spelling of his name before he came to America, other than that ship's manifest information which I provided.  Gene Nygaard 10:44, 5 February 2007 (UTC)


 * Furthermore, that 10 publications bit is ludicrous when it comes to something that has thousands upon thousands of publications in English which use only the form without diacritics. Plus, that very same corruption due to Wikipedia's past misusage can play a role in this usage in other "reliable sources" (which in Wikijargon usage is a property of the publication and not of the accuracy of the contents of that publication) as well. Gene Nygaard 10:44, 5 February 2007 (UTC)

Moving up to guideline
Since the discussion on some related NC proposals (Czech, Swedish,... - see links above) appears to be concluded, I see no further problem to move this general treatment of diacritics up to NC guideline too --Francis Schonken 08:10, 2 March 2006 (UTC)


 * Francis, I think you need to advertise this widely and run a strawpoll before you change it into a guideline. The page has had only 5 editors and that is hardly enough of a consensus to create a guideline which has an impact on lots of articles across en.wikipedia. --Philip Baird Shearer 09:02, 2 March 2006 (UTC)


 * Has been advertised widely, for several weeks (WP:UE, current surveys, Wikipedia:Village pump (policy)#Using diacritics (or national alphabet) in the name of the article – that discussion was active for several weeks with many more than "only five" contributors – ... and many other places) - having none of these on your watchlist would be a lame excuse, if you're interested in this topic;
 * Doesn't need a straw poll: Polls are evil (the reasons given there are especially to the point in the diacritics debate, think e.g. of the "groupthink" of all the ethnic/nationalist groups of wikipedians)
 * Please don't indulge in useless debates, if you don't have anything to communicate regarding the content of the project page. --Francis Schonken 09:47, 2 March 2006 (UTC)
 * It seems clear to me that there is absolutely no consensus for this proposal. Aside from the objections that I have already stated, I might point out that this would be about as inconsistent as can be imagined. E.g. we should use diacritics for Serbian names according to the Cyrillic convention but would have to trawl through the strict conditions before we could use them for Croatian names. This is despite the fact that these two nations use a language which is very closely related and uses the same set of diacritics. Also, with the Czech and Swedish proposal pages, both of which Francis started, we would have two islands where diacritics can be applied regularly (perhaps with a few exceptions) but for every nation which borders these we would have to do the same trawling for any page which wanted to use diacritics.
 * Finally, to reiterate my main point: This proposal goes against the current practice on Wikipedia. Therefore there needs to be demonstrated a lot of support for it before the shift which it dictates is carried out. Stefán Ingi 10:16, 2 March 2006 (UTC)

The five editors of the page (six if you include my reversal). Apart from yourself none has made more than one edit.
 * 11:08, 28 January 2006 Francis Schonken (start)
 * 11:17, 30 January 2006 Phil Boswell (→Scope - )
 * 15:42, 9 February 2006 CesarB (→See also - Wikipedia:Naming conventions (technical restrictions)#Browser support limitations)
 * 08:38, 10 February 2006 TShilo12 (→Other - dab Hebrew, avoid redirects on other languages, changing description of Chinese)
 * 14:33, 20 February 2006 Nikai m (→Rationale - sp)
 * 08:56, 2 March 2006 Philip Baird Shearer (Reverted)

Francis, as an old hand in this controversial area you will be well aware (I certainly am) that there are strong feelings among many editors about using Google or other search engines to decide issues like these. I suspect many people would object to such a suggestion. Further there are those who argue that blog pages should not carry the same weight as research papers, books and other encyclopaedia entries.

So at the moment I know that you do not have a true consensus for changing this from a proposal into a guideline. Whether you have a consensus is debatable. --Philip Baird Shearer 10:23, 2 March 2006 (UTC)


 * @Philip: your criticism is too absurd for words. It comes down to: "write a lower quality guideline, so that other wikipedians feel massively compelled to edit it". I did invite others to improve the text:
 * "[...] could you have a look at Naming conventions (standard letters with diacritics)? I mean, both w.r.t. (ab)use of the English language and content of the thing, [...] --Francis Schonken 13:56, 14 February 2006 (UTC)"
 * reply: "Looks OK to me, [...] Bill 14:21, 14 February 2006 (UTC)"
 * So, no, Bill does not appear in the list of minor changes to the guideline proposal. Which proves the absurdity of your newly invented method for assessing guideline proposals. To be remarked also that you're moving way out of consensus by even imagining that such flimsy method would be acceptable to the wikipedia community.


 * I know that you have trouble accepting google test *is* part of wikipedia consensus. It's a how-to guideline, so I need not defend that I rely on it. Of course that guideline is a lot about caution re. the application of google test. That's one of the reasons why this NC is only on "standard letters with diacritics": Google is unreliable in filtering out non-standard letters like ß, þ and ð (I commented on that at wikipedia talk:naming conventions (thorn)). The same unreliability does not exist for standard letters with diacritics, see for example naming conventions (Swedish) (but that's of course not the only testing I did to check reliability of filtering out variants of the same word with and without diacritics).
 * Further, it's not "the search engine that decides", as you erroneously try to present it. There's only a check required that the version with diacritics is not totally uncommon (20% is not really a high threshold, and takes account of the internet's bias towards diacritic-less variants). A minimum of references that use the variant with diacritics in English, is a requirement of no lesser stature.


 * @Stefan: Your criticism is inconsistent: first you ask me to prove the guideline changes something (following Piotr who didn't see the need for a guideline if it doesn't change anything), then you reproach me it *would* change some things (on which you exaggerate, but that's another point). Why should I answer to such inconsistencies?
 * It has been established long ago that the same rule for all words with diacritics wouldn't work (the famous diacritics poll). So the "standard letters with diacritics" NC distinguishes between languages, and is also an invitation to come up with NC's for languages that would be problematic (I invited Haukur not so long ago to come with such proposal for Icelandic at wikipedia talk:naming conventions (thorn), of course, if you wouldn't have seen that invitation, I invite you likewise!). I wrote/copied the Czech NC, taken all time together, including talk page, in less than half an hour. I don't think, for example, that a Croatian NC would take more than that. Maybe Croatian isn't even problematic seen its proximity to Serbian (and Serbo-Croatian that has both Latin and Cyrillic spelling? – I'm not that much of an expert in those languages)? - Anyway, if there would be a deeper problem, in that case the "standard letters with diacritics" NC would probably only offer a temporary solution, until the specific guideline is written, but the diacritics NC may help in writing such guideline (like it helped me while writing the Swedish NC - without knowing Swedish).


 * You're trying to make a reproach of me writing some of the NC's for specific languages. How absurd can you be? I wrote them (or a part of them), even collaborated to the hockey NC (as if I know anything about ice hockey). What problem could you have with that? They all settled disputes over *differences of opinion* regarding current practice, and did so as straight derivations of the diacritics NC proposal. So, please, don't make problems where there are none. --Francis Schonken 16:01, 2 March 2006 (UTC)


 * I have thought and still do think that if this were to become policy then it would change many names, or at least make a lot more effort for those defending them. I disagreed with Piotr when he said that it would not change anything. I asked you to confirm that it would change something but the examples I took weren't very good and nobody offered any other examples so it was inconclusive. I'm also saying that since I think that it changes many things, then it has to be shown to have some sort of Consensus before it can be put up as a guideline. As Philip says, whether that consensus exists is debatable, from looking around on Wikipedia I would say that it definitely doesn't, if you disagree then you should offer some evidence for that statement.
 * Also, I wasn't reproaching you for writing these specific guidelines, I was just pointing it out for the benefit of people who were to come to this discussion. Perhaps there wasn't much point in doing that, I'm not sure. I apologise that I worded it in such a way that you took it to imply that I was reproaching you. Stefán Ingi 16:41, 2 March 2006 (UTC)

Spanish accents
I would like to know if this proposal covers Spanish accents. My interest is in regards to accents in a person's name. Should accents be used in the article name if that person's original name has them? Joelito 14:38, 3 March 2006 (UTC)
 * As it stands, this proposal would cover Spanish accents so if you wanted to include them and this proposal were a policy or guideline you would have to go through the motions to justify the accents every time. But, this is just a proposal so instead you might just as well look around you, e.g. on the list of Prime Ministers of Spain and see whether the accents are used there. In the example I'm taking they are. Stefán Ingi 14:47, 3 March 2006 (UTC)
 * Well as the proposal stands it would be very hard to prove points 2 and 3 if the person is not well covered in English publications. For example Eddie Miró, a person known by all the people from Puerto Rico as a television show host has had very little English coverage. It would be very hard to provide references in English for him. Possibly 7-10 relevant English references could be found. Joelito 14:58, 3 March 2006 (UTC)
 * Yes, I think that in many cases it would mean a lot of unproductive work. That's one of the reasons why I am opposed to this proposal Stefán Ingi 17:52, 3 March 2006 (UTC)
 * I think that the burden to prove the spelling should be on the non-native name (English), not the other way around. When it comes to articles about the non-English world, much of the work is done by non-native English speakers, who are more familiar with local spellings. Such work usually gets copyedited and such by native English speakers, and if at that point they think a move to a more English-friendly name is useful, they should do the search and if there is a much more widely used English name variant, a move can be done.--Piotr Konieczny aka Prokonsul Piotrus Talk 15:44, 5 March 2006 (UTC)

Speaking of Spanish accents, there is a small group of people who regularly edit Major League Soccer articles who have decided among themselves, without any other notification or documentation, that every MLS soccer player who now has US citizenship should have no accents in their names, even if they were born in countries where their names would normally be accented. Blank Verse 07:26, 29 March 2006 (UTC)

Japanese
An independent mediation supported a past change to the Japanese MoS so that now the inclusion of macrons in titles of articles with Japanese content is acceptable. (Don't blame me!) Japanese romanization utilizes (ō), (ū), and ('). freshgavin ΓΛĿЌ  03:19, 20 April 2006 (UTC)

Technicality and an alternative proposal
As was raised in the very beginning, there is a problem with the provision "There are at least 10 reliable publications that are fully in English". Besides the question 'what is meant by 'reliable' (and note that on WP:RS there is no specific list or easy 'how to determine' process)' there are many, many cases when there are much fewer than 10 publications. Lots of smaller towns or villages are not mentioned in 10+ English publications, the same problem is with many historical personas who might be notable in their country but are barely (or not at all) known outside. But if the proposal passes, then we will be forced to invent the undiacriticized versions of many names, thus for example Okopy Świętej Trójcy would become Okopy Swietej Trojcy because this former village is apparently almost completely unknown to the English-speaking world. Instead I'd like to draw attention to a similar naming convention, which proposes a different approach and seems to attract mostly positive comments. The Naming conventions (geographic names) proposed policy supports the use of English names, but states that if there is no widely accepted English name (with 'widely' being defined later) then local name should be used. I personally believe that this policy is more realistic, and it can be expanded beyond geographic names to other 'rare' names.--Piotr Konieczny aka Prokonsul Piotrus Talk 21:51, 5 June 2006 (UTC)
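
As a side note for anyone wondering what the "undiacriticized" form of a title would mechanically look like (for example when creating a redirect from the title without diacritics), here is a minimal Python sketch. It is only an illustration, not something the proposal prescribes: the function name and the small extra mapping for Ł/ł are assumptions, and letters such as ß, þ and ð are deliberately left untouched since they are outside the scope being discussed.

```python
import unicodedata

# Ł/ł have no Unicode decomposition, so they need an explicit mapping.
# This table is illustrative only and not exhaustive.
EXTRA_MAP = str.maketrans({"Ł": "L", "ł": "l"})


def strip_diacritics(title: str) -> str:
    """Return the title with combining diacritical marks removed."""
    # NFD splits e.g. "ś" into "s" plus a combining acute accent.
    decomposed = unicodedata.normalize("NFD", title.translate(EXTRA_MAP))
    # Drop the combining marks and keep everything else as it is.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)


print(strip_diacritics("Okopy Świętej Trójcy"))  # Okopy Swietej Trojcy
```

Whether the stripped form should be the article title or only a redirect is exactly the question under discussion here; the sketch takes no side on that.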


 * FYI,
 * "[...] the old gates at the Ramparts of the Holy Trinity (Okopy Sw. Trojcy) [...]" can be found on this page: http://www.personal.psu.edu/users/w/x/wxk116/sjk/jazch1.html;
 * From the edit history of Ramparts of the Holy Trinity :
 * 20:10, 27 May 2006 Halibutt m (moved Stronghold of the Holy Trinity to Ramparts of the Holy Trinity: moved to a proper name)
 * [...]
 * 13:00, 15 March 2006 Calgacus m (moved Okopy Świętej Trójcy to Stronghold of the Holy Trinity)
 * [...]
 * 02:50, 15 March 2006 Molobo m (moved Stronghold of the Holy Trinity to Okopy Świętej Trójcy: commonly accepted names(Kulturkampf not Culturecombat for example))
 * 17:58, 10 March 2006 Matthead m (moved Okopy Świętej Trójcy to Stronghold of the Holy Trinity: understandibility in English)
 * [...]
 * 00:06, 1 July 2005 Witkacy m (Okopy Swietej Trojcy moved to Okopy Świętej Trójcy)
 * ...if you ask me not the best example of a stable Polish name... Up till now 20% of the total number of edits to that page have been page moves... For me this rings a bell that maybe a good guideline would be better than this move-warring... no?
 * Then you also make a link to Naming conventions (geographic names) which is at a no-consensus for "proposal F" state, afaik *longer than the diacritics proposal exists* - I wouldn't boast too much on the "near to consensus" state of Naming conventions (geographic names). Anyway, I don't even see "competition" there, its parameters for determining a choice for a name are comparable to those of the more general "diacritics" proposal - and it certainly isn't a guideline that would be less complex to put in practice.
 * For instance, when applying the recommendations of that proposal (version F) to your example, I'd need to check Britannica, Columbia, Encarta, Google Scholar and Google Books. I pick one of these 5 recommended reference sources (Google Books):
 * "Okopy Świętej Trójcy" - did not match any documents.
 * "Okopy Swietej Trojcy" - did not match any documents.
 * 2 pages on "Ramparts of the Holy Trinity"
 * "Stronghold of the Holy Trinity" - did not match any documents.
 * 1 page on "Okopy Sw. Trojcy"
 * IMHO, this confirms the present name of the article at Ramparts of the Holy Trinity, and not any version of the Polish name. But yeah, true, if the "geographic names" proposal would be guideline I'd need to check 4 more reference sources in the same fashion. For the "standard letters with diacritics" proposal, the case would already have been settled: translation appears indicated... no need to discuss about Polish versions with or without diacritics. --Francis Schonken 10:08, 6 June 2006 (UTC)
 * Your search above is actually misleading. The 3 (not 2) pages on Ramparts are actually:
 * The first should be discarded because it does not refer to the village but to the ramparts of the castle.
 * This is the same case; note the lower case used in "ramparts" (surely if it were the village's name it would use upper case?).
 * The third source is the only one which capitalizes it.
 * Finally it should be noted that while translation makes sense in a literary text (like #2 or discussions about it like #3) it makes no sense when we are referring to the geographical place.--Piotr Konieczny aka Prokonsul Piotrus Talk 17:30, 9 June 2006 (UTC)
 * My feeling is this: If a particular town or structure isn't being written up in any English publications, then by some standards we shouldn't have an article on it at all, because it doesn't pass the "Notability" test. On the other hand, I do see some advantages to having articles about not-necessarily-in-the-press things, such as schools and towns and some obscure bits of history.  So I'm willing to accept the idea of having an article at Wikipedia about it, as long as the article is titled by whatever the most common English name is.  If there's no English name, then okay, I'd say list the title, but without diacritics. Personally, even though I have a passing familiarity with several languages, I find diacritic names jarring to look at in what is supposed to be an English reference work, because it looks like an article has been written solely for the use of the locals in that area, which makes it less accessible to other nationalities (including people on other continents for whom English is a second language anyway). It's unsettling to see a name that is so clearly unpronounceable to most English speakers.  So I would rather see the title use the non-diacritic version, which is how it would probably show up anyway in an English-language newspaper if they *did* end up writing an article about the town.  And in this case, the problem would be self-correcting.  If a town with an odd spelling did become famous in English-language literature, and genuinely notable, then the Wikipedia article would have a standard to follow:  If the article was showing up in English-language newspapers with diacritics, a move could be requested to the more commonly-used version of the name.  But until then, I'd say let's stick with the "no diacritics unless it can be shown that it's common usage in English" guideline. --Elonka 18:06, 9 June 2006 (UTC)

Related poll
Interested editors are invited to participate in: a poll on whether or not to use diacritics in the titles of Polish monarchs. --Elonka 18:13, 13 June 2006 (UTC)

Major rewrite
I took a stab at simplifying and condensing the proposed guideline to make it easier to read and understand. If I removed anybody's favorite section, please feel free to add it back in. :) --Elonka 06:05, 26 June 2006 (UTC)


 * I think it reads quite well now. Honestly, i think this is all common sense and I would like to see this get moved up one to become a guideline (eventually). Masterhatch 17:51, 26 June 2006 (UTC)

somewhat related
There is a somewhat related poll here Talk:Voss-strasse if anyone is interested in adding their two cents. Masterhatch 17:47, 27 June 2006 (UTC)

Scope addition
As regards the other "letters not included in this guideline", such as þ and Đ and ß, what is the feeling about adding wording such as: "Because of the limited geographic regions in which these letters are used, English-speakers in other parts of the world (especially those for whom English is a second language) often find these symbols incomprehensible and unpronounceable. As a result, this guideline recommends that their use be avoided in article titles." --Elonka 20:46, 27 June 2006 (UTC)
 * That makes sense. Personally, i don't know the difference between the sounds the diacritics make and i am a native speaker of English. Masterhatch 22:36, 27 June 2006 (UTC)

Other options
I hope that we have finally reached the agreement that linking through redirects is not a relevant issue, and that we can concentrate on English usage in Wikipedia articles. There are some important points to be made:
 * Wikipedia conventions are just that, conventions of Wikipedia. They are not natural laws, and we can choose to implement any naming convention we choose, including numbering articles by timestamp at creation or some other scheme. Use English is a convention of Wikipedia because we decided so, not because that's how it must be.
 * There is no central authority on correct usage in English.
 * Popular usage is not always correct - even if most sources refer to Tories, the party's proper name is still the Conservative Party.
 * We choose article titles which we judge as the most appropriate for Wikipedia, not those that are the most correct (United States is properly called United States of America) nor always those which are most commonly expected (China is overwhelmingly used to mean People's Republic of China in the real world).
 * There are sources in English which use diacritics regularly, those which use them irregularly, those which use just certain diacritics or use them just in certain languages, and those that don't use them at all.
 * Foreign words used in English text don't automatically become English words.

So, we should approach this question with an open mind. Use English does not require us to not make exceptions for classes of special cases. It also does not require us to use or omit diacritics, as neither omitting nor including them is wrong in English per se. We should discuss what the advantages and disadvantages of using diacritics are. One obvious advantage I see is providing the additional information. The one real practical disadvantage mentioned so far is that they make it hard to search for the name inside the browser page.

There is the real possibility that both ways are equally correct and that it's a matter of taste, and tastes are hard to change through debate. In such cases, we should be looking for widely acceptable rather than hypercorrect solutions. IMO, we should aim to avoid the ridiculous situations when the spelling of people's names depends on and changes with where they currently live or work, or when the same first or last name is spelt differently in different article titles without a clear criterion.

The rewritten proposal is much better than previous attempts, but it has two major problems: (1) it's too long, and (2) it goes against the current practice, which has many supporters, and which will be hard to change, even if this is promoted into a guideline. Keep in mind that putting a guideline tag on something doesn't magically make all articles conform with it nor all editors agree with it.

Other solutions include:
 * Use no diacritics at all
 * This would have the advantage of being short, clear and easy to enforce, but as the current proposal says, it would sometimes force us to use wrong titles even for English names, which makes it unacceptable.


 * Always use the original spelling
 * This would also be short and clear, but it would be simply wrong for monarchs and many other historic people, which makes it unacceptable.

There are other, more nuanced options, none of which should exclude per-case decisions in special circumstances, nor using English names when they are spelt entirely differently.
 * Use whatever makes more sense, or what the first editor used if no choice is substantially better
 * This is how BE/AE spelling and CE/AD era notations are handled.


 * Use diacritics if the common English spelling is the same as the original one, but without the diacritics.
 * This is more or less how place names are handled.


 * Use the original spelling unless the person has legally adopted the spelling without diacritics or regularly uses it even in their native language.
 * This would cover naturalized citizens or other people who have genuinely changed their name and language identity, while leaving most articles as they are now.

In short, I'd prefer names to be spelt by default the way properly translated English sources from the country of origin spell them (i.e. use the local transliteration), but any of the nuanced solutions are acceptable to me. Zocky | picture popups 01:29, 28 June 2006 (UTC)


 * What is this "natural laws" thing you keep talking about? are we discussing physics? Well, we decided to use english because this is the English language section of wikipedia. It would be kinda strange to use korean here, now wouldn't it? anyways, that is why there are multiple language sections on wikipedia.
 * Well, the Brits have their English and the Yanks have theirs. For wikipedia, we blend the two and use the most common form of English.
 * While popular usage isn't always correct, wikipedia has a policy of using the most common form of English in usage because wikipedia is for the layman and it is the most common form of English that the layman understands the best.
 * See above, we use the most common form, unless there is a disambig problem, of course.
 * That is why we go with the most common form. Simple, eh? If a word or name is most commonly written with diacritics, then wikipedia should use diacritics. If it is most commonly written without diacritics, then wikipedia should follow suit. Masterhatch 07:26, 28 June 2006 (UTC)
 * Look, the only foreseeable solution I can see is that for names, places, etc. where diacritics are most commonly used in English, they keep the diacritics here in wikipedia article titles. For articles where English most commonly drops diacritics, wikipedia should reflect that. Isn't that simple? How can anyone logically argue against that? That way both sides win. People who like diacritics get them on articles where they are most commonly found in English, and people who don't like diacritics don't have to have them rammed down their throats for words that they almost never see them on in daily life. Masterhatch 07:26, 28 June 2006 (UTC)


 * The above was inserted into the middle of my comment, which made it hard to read. All I can say is that it has been previously established on numerous occasions that reader ignorance is not a valid concern for editorial decisions, so most of Masterhatch's comment is irrelevant. We also know that common usage is one of the factors used for these decisions, not the overriding deciding factor, which makes the rest irrelevant. Zocky | picture popups 10:30, 28 June 2006 (UTC)

Zurich -- (Talk:Zürich/Archive1) --Philip Baird Shearer 07:43, 28 June 2006 (UTC)

I've reverted some of the changes to the proposal:
 * (1) categories don't have redirects: So it's Category:Compositions by George Frideric Handel (without diacritic) if we have the composer at George Frideric Handel, and Category:Compositions by Camille Saint-Saëns (with diacritic) if we have the composer at Camille Saint-Saëns. I tried to draw a bit more attention to the category aspect of being consistent (since categories don't have redirects). The section is also important because it makes clear that being consistent applies only to the name of a topic, not to "copying all diacritics of a language".
 * (2) this is NC not MoS : refers to "first contributor" rule (copied from MoS) that was removed by me from this NC proposal, since the "first contributor" rule can only be used when the competition is between varieties that are acceptable in English. In other words, one doesn't fight non-English nationalistic POV by inciting people to start as many wrongly named articles as possible, to give way to "I was the first" claims.
 * (3) remove "national varieties" doubling (Irish not nat. var. of Eng) : don't know why the "national varieties of English" link, that was already in the intro of Naming conventions (standard letters with diacritics) was doubled in a rephrased format in one of the subsections. Note that Irish is not a "variety of English" (the rephrased intro was a bit more ambiguous on this point).
 * (4) keep all commented out proposals until further notice : why delete some, and keep some others? Some of the deleted ones were pages in Wikipedia "naming conventions" format, some of those that were kept, were merely a link to an encyclopedia article (so not guidance on how such specifics are handled in wikipedia page names) --Francis Schonken 09:41, 28 June 2006 (UTC)


 * If you really mean this "After the choice has been made whether a name is written with or without diacritics in a page name, all other Wikipedia pages" then I find it unacceptable. It is a step way beyond WP:NC. The reason for redirects is so that Wikipedia can accommodate other names for the same subject.
 * I see the confusion I generated: changed that to content pages. I hope "content page" is clear enough as a concept, or should "except redirect pages and disambiguation pages" be added to that? --Francis Schonken 11:24, 28 June 2006 (UTC)
 * I think that when there is a dispute over the name which can not be resolved then first is a reasonable compromise. If not how does one decide as a page has to have a name? It cuts down on revert wars while an alternative consensus is reached.
 * WP:RM always leads to a resolution (even if "stalling by lack of consensus"). And it has a slight bias towards "keep where it is" (while 60% is the usual threshold for a move). And part of the rules are, as far as I know, the WP:RM should not start from a place where the page has just been moved to (recently there was still a WP:RM vote broken off for that reason). --Francis Schonken 11:24, 28 June 2006 (UTC)
 * Depends on what you mean by "Irish". Let's call it "Irish English", as that reduces the ambiguity.
 * Irish English (Hiberno-English, a variety of English) is not the same as Irish language (which is a Goidelic language, so a variety of Gaelic, not a variety of English). "Irish English" is not mentioned at the Irish dab page. Best to avoid confusing terminology. --Francis Schonken 11:24, 28 June 2006 (UTC)
 * Only those which affect national varieties of English should be mentioned here. The rest should not, because there are potentially hundreds of these and there is no reason why this general guideline should be explicitly subservient to other potentially POV-laden guidelines like this proposed one: Naming policy (Czech):
 * Czech names: almost all names with diacritics use it also in the title (and all of them have redirect). Adding missing diacritics is automatic behavior of Czech editors when they spot it. So for all practical purposes the policy is set de-facto (for Cz names) and you can't change it.
 * "Only those which affect national varieties of English" - the title of the section is "Specific languages using the (extended) Latin alphabet". Neither the Irish nor the Māori language is a national variety of English. French has a more profound effect on UK English than on US English; it seems also that, for instance, Spanish has a more profound effect on US English than on UK English. But this is not the point of this section (since these US/UK style variants are treated in the MoS). If "böttger ware" turns up in Webster's, with a diacritic as in German, this is not an issue limited to "national varieties of English", but it should be part of Wikipedia's diacritic-related guidelines. FYI, German is a language using "extended Latin alphabet". --Francis Schonken 11:24, 28 June 2006 (UTC)
 * --Philip Baird Shearer 10:16, 28 June 2006 (UTC)

Catering for dumbed-down and lazy usage
A café is a café is a café. It is a French word which we, English speakers, have adopted to mean a particular type of building. The word is not an English word and it would be incorrect to treat it as such--it is a French word used by English speakers. Sure, some people spell it "cafe", some people even pronounce it "caff" - that's fine, local variation is good - language develops and evolves. One day, in a few decades' time, the spelling "café" may seem quite alien - at that time, it will have been fully adopted. The word role, once spelt rôle, is an example of a French word which has been fully adopted into English and whose original spelling looks odd to most. This is the distinction between a non-English word in common usage among English speakers, and a fully adopted word. Misspelling café is one thing, but proper names are quite another - "Antonín Dvořák" is spelt one way and there is no alternative. Ultimately, we are trying to write an encyclopedia, to produce something authoritative. In a casual e-mail I may miss the diacritics due to laziness and in the understanding that the recipient would understand who I was referring to. We are not writing a casual e-mail, we are not texting our friends and we are not instant messaging our colleagues. As such, we should not treat language as if we were. It would also be wrong to go too far in the opposite direction and become too prescriptive about language, insisting that those who don't use diaereses in words such as the verb meander (thus meänder) are somehow wrong or illiterate. Of course, we are still a dynamic work that is able to stay current and appeal to all, but we have at our disposal a wider range of tools for using the correct characters of many languages than anyone has ever had in history. Wikipedia is such that if someone doesn't understand, or recognise, a particular character they are able to look it up and educate themselves. We should make the effort to provide truth.

"Diacritics should only be used in an article's title, if it can be shown that the word is routinely used in that way, with diacritics, in common usage" is entirely flawed as a guideline. Common usage varies from continent to continent, from country to country and from culture to culture. I am sure it is common for most people who are writing about Dvořák to spell his name "Dvorak" because they aren't sure how to get that funny "ř" character to appear (this was particularly the case in the days of the typewriter). "Dvorak" would therefore be considered common usage, but this doesn't detract from "Dvorak" being wrong through-and-through when referring to the man Dvořák. --Oldak Quill 17:17, 28 June 2006 (UTC)
 * In that case, the article about the man should definitely include the proper spelling of his name, in the body of the text. This guideline is not referring to the main article, but strictly to the titles of articles, and trying to come up with a consistent method which allows for ease of linking, reading, and finding, for the vast majority of English speakers.  If I, myself, were looking for an article on the man you mentioned, I would type "Dvorak" into the search box, not something with diacritics. --Elonka 17:36, 28 June 2006 (UTC)


 * I cannot see your reasoning. Surely the title of an article should be spelt correctly? This is particularly the case in Wikipedia as we have redirects which allow us to use correct characters in titles without inconveniencing our visitors. Redirects allow us to both maintain a high standard of spelling and lexical correctness while making the browsing experience easy for the visitor. Potentially, all people who have primarily Cyrillic names could have Cyrillic article titles, redirects would ensure that finding the correct article would be as easy as finding a non-Cyrillic counterpart (Pyotr Ilyich Tchaikovsky, or whichever transliteration we chose to use, would redirect to Пётр Ильич Чайкoвский). Of course, using non-Latin alphabets in titles goes too far for most so it is something we don't do. But diacritics are easy to use and understand, they are part of our Latin alphabet- we should not incorrectly label people and things due to sheer institutional laziness. --Oldak Quill 17:46, 28 June 2006 (UTC)


 * I have to disagree with you, Quill. I am a strong believer that the most common form of English should be used in article titles (except of course in the event of disambig problems). Diacritics, in most cases, are foreign to English, and if you look around, most people, places, and things that have diacritics in their native language lose them when mentioned in English. Take Jaromir Jagr for example: his name in Czech includes diacritics. If you look around publications in English, the vast majority of the time the diacritics are dropped, even on his hockey sweater. The native spelling, with the diacritics, should be (and is) shown in the first line of the first paragraph of the article. The title should be the most common form found in English, whether the most common form includes diacritics or not. Your basic reasoning is that English is spelling the names wrong. Well, I am telling you, English is not spelling it wrong, it is just spelling it English. That is what, in fact, got me into this. I used to not care either way if there were diacritics in titles until someone called the English spelling wrong. That lit a fire in my arse because it is pure ignorance for someone to call an accepted English spelling wrong. I don't go and say that "États-Unis d'Amérique" is spelt wrong and they must spell it the English way! I understand that French has its spellings and you must understand that English has its spellings. Masterhatch 18:37, 28 June 2006 (UTC)


 * I would say that as long as a name is written in a Latin-derived alphabet and there is no commonly accepted English name (such as Spain for España, Rome for Roma etc) then the name should be written with its original diacritics. A name is a name, it is either spelled correctly or incorrectly and we shouldn't start disfiguring it. The majority of non-English names, whether they are of places or people are too little known in English to have commonly accepted English spellings. Simply because the name of a Czech village or a minor figure from Paraguayan history has diacritics that are not normally used in English doesn't mean they should be removed when written in English as there are no commonly accepted English spellings for such names. Booshank 19:21, 28 June 2006 (UTC)
 * Well, maybe some of those places that aren't well enough known to English speakers aren't notable enough to have their own article. If those places can't be found in an English atlas, then why would wikipedia have an article about them? If there are no English publications in regards to that place (or person), are they really notable enough to have an article? Most small Czech villages can be found in a thorough English atlas. If those English atlases have diacritics, then wikipedia should follow suit. If those atlases drop the diacritics, then, again, wikipedia should follow suit. Same with people. If there are no or only a very few English publications about a person, then is he/she really notable enough to have an article? If there are English publications about that person, have a look at them and, again, whichever form is most common (with or without diacritics), then use that form. I have no problem with the use of diacritics if that is the most common way to spell that name in English. I only have a problem with the use of diacritics when the most common way of writing that name in English drops those very diacritics. Masterhatch 20:15, 28 June 2006 (UTC)


 * I don't believe a single source should ever be used to push an agenda. Just because a single atlas rejects diacritics (for the sake of space, perhaps) does not mean that we should follow suit. Diacritics should only be excluded if a non-diacriticed version has become more popular in English. On a side point, whether there are any publications on something in a particular language (where there might be plenty in another language) is not a measure of notability. Czech villages should be covered as extensively as British villages in Wikipedia (the latter of which will be covered far more thoroughly in the English language), for example. Articles on small Czech villages which won't be widely known enough among English speakers to have their own spelling variants (regardless of what a particular atlas states) should always be given the correct Czech name. --Oldak Quill 20:53, 28 June 2006 (UTC)
 * Single source? you must have misunderstood my comment. I would never say that a single source is good enough. Anyways, to clarify, for notability sakes, if there is no mention of a small czech village in an English reference book, then i ask, is it really notable enough for an article? But that is side tracking and that is a debate for a different day and a different place, so I won't discuss it further. Back to my point, if most English atlases and reference books aren't using diacritics for a czech town (city, village, person, whatever), then wikipedia shouldn't either. If most English atlases and reference books are using diacritics, then wikipedia should too. Funny thing is, with all this arguing back and forth no one has told me what is wrong with that. It is fair for everyone and it follows the wikipedia naming convention policy to a tee. Masterhatch 02:28, 30 June 2006 (UTC)


 * [Was replying to Masterhatch and had an editing conflict. This is a response to Masterhatch, but I completely agree with Booshank]. Of course "États-Unis d'Amérique" isn't wrong; this example is not analogous to what I have been saying. États-Unis d'Amérique and United States of America are both correct, it is "Etats-Unis d'Amerique" which is not. If one is going to use French words then spell them correctly - E is a different letter to É. In many languages with diacritics, forgetting the diacritic can result in an entirely different word with an entirely different meaning. In Afrikaans, for example, if one forgets the diacritic on the ë in the word "hoërskool" (meaning high school) and so produces "hoerskool", one would be expressing "whore school". This is an example which demonstrates the fact that a letter with and a letter without a diacritic are different letters and to confuse them is to arrogantly thrust the English non-use of diacritics onto loaned words. Jaromir Jagr chooses to spell his name differently in English, to transliterate his name for an English-speaking sport. That is perfectly acceptable and we use his adoptive English name in articles. But we, as an encyclopedia, cannot thrust new names onto people because it suits us, because it is easier for us to type. Proper nouns, and some adopted words, exist outside the rules of the language by which they are adopted - they are words which should still be treated with the spelling rules of the language from which they came until their usage is so common that those rules are dropped. My name is Oldak Quill and I wouldn't expect speakers of a language which doesn't use "Q" or "Qu" to change my name to "Oldak Kwill" to give themselves an easier time - it is just incorrect. Dropping the diacritics of a French or Polish word is nothing short of transliteration and is entirely comparable to changing "Quill" to "Kwill". Transliterations can only ever give rough and fuzzy approximations of a word and should therefore be avoided as much as possible.--Oldak Quill 19:17, 28 June 2006 (UTC)


 * You are missing my point (and I am probably missing yours) that we here on wikipedia aren't about changing English, but following the majority of English publishers to arrive at the "most common form of English". It is simple: if the majority of English publications drop the diacritics, then wikipedia must follow suit. If the majority of other English encyclopaedias and reputable publishers don't use diacritics for names, why should we on wikipedia? As I have said before, if the majority of reputable English publishers are using diacritics for a given name, then, by all means, include them in the article's title in wikipedia. I am not trying to eliminate diacritics from wikipedia, I am just trying to make sure that wikipedia doesn't stray too far from "the most commonly used form of English".


 * See my reply below. Reputable English encyclopaedias do use diacritics in many, many names. --Oldak Quill 20:46, 28 June 2006 (UTC)


 * Agreeing with Oldak.--Piotr Konieczny aka Prokonsul Piotrus Talk 22:02, 28 June 2006 (UTC)

Not a place for change
Really, I think it is all pretty simple: if the majority of reputable English publications include diacritics, then wikipedia should too. If the majority of reputable English publications don't use diacritics, then wikipedia shouldn't either. It's a case by case situation. What is wrong with that? Honestly, if people have problems with English dropping diacritics, don't come to wikipedia with your grievances. Wikipedia isn't here to change the English language or how English does things. Go to Webster's or Oxford or whoever. Wikipedia is not the place for change. Masterhatch 20:15, 28 June 2006 (UTC)


 * I have no desire to make Wikipedia radically different from other works of reference in English. As I said, if it has come to be that a word without diacritics is more common than one with, then it should be the one used (such as role instead of rôle). What I do object to is purposefully going out of our way to force change by rejecting diacritics as standard when they play a very valid rôle ;) in many words. Further, no works of reference that I know of force change in people's names to exclude diacritics. Dvořák is Dvořák is Dvořák, there is simply no alternative way to spell this name - the diacritics are not just aesthetic but functional. I am striving to do exactly what you accuse me of doing: all I see in this proposal is an enforced rejection of diacritics for the sake of dumbing-down and laziness. --Oldak Quill 20:46, 28 June 2006 (UTC)
 * I am not standardly rejecting diacritics!! That would be ignorant of me. I will repeat what I have said all along, because people for some reason keep missing it: if most English works use diacritics, then wikipedia should. If most works don't, then wikipedia shouldn't. This is a case by case situation, not a blanket covering! Someone please tell me why that won't work? Masterhatch 02:28, 30 June 2006 (UTC)


 * Nonsense. There are zillions of reputable English publications discussing Antonin Dvorak and other Dvoraks, and things such as the Dvorak Simplified Keyboard as well as its inventor August Dvorak.  And there are probably as many  English publications using Antonin Dvořak as Antonín Dvořák, so why is the former a redlink as I write this?  Redirects are cheap enough that we could even include August Dvořák for fools who mistakenly think "Dvořák is Dvořák is Dvořák". Gene Nygaard 16:56, 18 August 2006 (UTC)

Dumbed down?

 * Café/cafe : IMHO, it's rather "dumbed down" to state that in English only the version with diacritic is correct:
{| class="wikitable"
! colspan=2 | Webster's 1981 international printed edition
! colspan=2 | OED minidictionary (1994)
|-
| align=center | café || align=center | cafe || align=center | café ||
|-
| align=center | café au kirsch ||
|-
| align=center | café au lait ||
|-
| align=center | café brûlot ||
|-
| || align=center | cafe car
|-
| align=center | café chantant ||
|-
| align=center | café concert ||
|-
| align=center | café crème ||
|-
| || align=center | cafe curtain
|-
| align=center | café noir ||
|-
| align=center | café society ||
|}


 * I was using café as an example - café is, as far as I can tell, the most commonly used form and the form used by most reputable dictionaries (including the current OED). I did not deny that some people did use the word "cafe" (in fact, I stated that they did) nor that derivative words might use that spelling (such as "cafe curtain"). --Oldak Quill 20:46, 28 June 2006 (UTC)
 * Well, "dumbing down" comes from calling a less used version a misspelling (that's the word you used). even caff (Brit : café ) is in the addenda of the 1981 international printed Webster's, and so not a "misspelling". --Francis Schonken 21:31, 28 June 2006 (UTC)
 * I did not call caff a misspelling, I called cafe a misspelling. I was quite wrong about "cafe" being a misspelling though, but that isn't the discussion at hand. In trying to get my point across I erroneously emphasised something too strongly. All of my points (except calling cafe a misspelling) still stand. --Oldak Quill 22:02, 28 June 2006 (UTC)
 * Tx for taking the point. My next point is that "Antonin Dvorak" is not a misspelling in English. And that was your main point (at least your main example). I'm prepared to discuss whether the article on this composer should be at Antonin Dvorak or at Antonín Dvořák. In fact I already did, twice, as you can see in and  above on this page. As a result of these discussions, among others Category:Compositions by Antonin Dvorak was moved to Category:Compositions by Antonín Dvořák. I'm prepared to discuss the page naming for this composer again, but not on the basis of the "dumbed down" assumption that Antonin Dvorak is a "misspelling". --Francis Schonken 22:29, 28 June 2006 (UTC)
 * "ř" is an entirely different letter to "r". "Dvořák" is a different word to "Dvorak" with a different pronunciation. "ř" produces that particular "j" sound where "r" would produce, well... you know. It is not appropriate to replace "ř" with "r" just because they look similar. --Oldak Quill 23:11, 28 June 2006 (UTC)
 * "Antonin Dvorak" is not a misspelling in English. --Francis Schonken 23:17, 28 June 2006 (UTC)
 * "Antonin Dvorak" is not a misspelling in English. Gene Nygaard 12:20, 5 February 2007 (UTC)


 * Antonin Dvorak : As you might know, this composer lived and worked a few years in the USA. I always wondered how they wrote his name in the music school where he was director at that time? How was his name written on the concert programs when his music was performed during his stay in New York? Does anyone have any info on that? --Francis Schonken 20:11, 28 June 2006 (UTC)
 * That would be a good piece of info, but the spelling at the time is probably not what we want to use as a criterion. It would make us spell a bunch of historic names weirdly, if nothing else. Zocky | picture popups 13:01, 29 June 2006 (UTC)
 * Neither do I intend to do so! Unless when it would be clear that in English, the composer used the diacritic-less version of his name exclusively. In that case we'd have the same situation as for Arnold Schoenberg, who clearly changed his name to a diacritic-less variant when moving to the USA. For this composer the version of his name with diacritics could be considered a "misspelling" in English. And only in English - all other languages write Arnold Schönberg afaik. --Francis Schonken 13:52, 29 June 2006 (UTC)

Diacretics are not English, full stop
This is the English language Wikipedia, so diacretics should not be used, other than to indicate the form used in an original language. We don't use Chinese characters, and there is no more justification for using diacretics, other than that to some degree we can get away with it. Words written with diacretics are not English, and that is the end of the matter. Chicheley 20:51, 28 June 2006 (UTC)
 * This is simply not true. English does make use of diacritics natively (such as the diaeresis). Further, it makes extensive use of diacritics in loan words because a letter with a diacritic is not the same as the similar-looking letter without one. --Oldak Quill 20:55, 28 June 2006 (UTC)


 * I disagree, but that counts for little. Of rather more importance are the vast number of reliable sources which use diacritics where conventions dictate that they should be used. Angus McLellan (Talk) 21:43, 28 June 2006 (UTC)


 * I must agree with the title of this section: "Diacretics are not English, full stop". Indeed "diacretics" is not an English word. "Diacritics" is. And they are used in some English words, all dictionaries agree on that. Sorry for the pun. Couldn't resist :) --Francis Schonken 22:58, 28 June 2006 (UTC)

Foreign words are not English words. Full or partial stop or whatever. Zocky | picture popups 11:43, 29 June 2006 (UTC)
 * Of course loanwords are English words. Some of these have diacritics. See Naming conventions (standard letters with diacritics)--Francis Schonken 13:59, 29 June 2006 (UTC)
 * Personal names and other foreign words used in English texts are not loanwords, they're cited foreign words. Zocky | picture popups 14:04, 29 June 2006 (UTC)
 * Wasn't talking about proper nouns but about loanwords. The question was whether there are "English" diacritics. Proper nouns are afaik of no use when trying to prove whether diacritics are part of English or not. Loanwords, on the contrary, are useful in that context. And then the answer is yes, some diacritics are English. --Francis Schonken 15:07, 29 June 2006 (UTC)

Technology, not "it's not English," is why diacritics were historically stripped in English
The question was whether there are "English" diacritics. et al...
 * The notion that diacritics are "not English" is fundamentally rooted in pre-computer typesetting technology, whose ultimate technological achievement was the Linotype machine. Typesetting consisted of assembling a set of molds for all the letters and spaces in a line of text, then pouring hot metal into the assembled mold to create the line of print. It was simply not feasible (if not mechanically, then certainly not economically) to create Linotype machines which could do what we can do now from any computer keyboard, that is, type (almost) any language on the planet. And so, what you had were Linotype machines set up by language, with a few "extras" tossed in which were used often enough that including them in the set of additional "special" characters was not overly burdensome.
 * The result, when printing foreign names in English, particularly in the case of Eastern Europe, were all sorts of variations from just the elimination of diacritics to a seemingly endless supply of semi-transliterations.
 * References recognized this and began moving away from non-diacriticalized versions as early as the 1980s, as phototypesetting technology began rolling out in force (having become commercially affordable in the late 1970s), using instead the original-language form. However, it's only more recently that typing technology for the masses has caught up, from the RIGHT-ALT ("AltGr") language character shift standard to all the standard font faces supporting all the extant "code pages" and computer programs accepting alternate font "code pages" besides just Latin and "extended" Latin.
 * Just in the tiny little article naming corner of the world I find myself embroiled in, we find the following question (leaving issues of monarchal titles, other languages, etc., out). What, for example, defines common English usage and/or current English usage for the Polish "Władysław"? Let's do the "Wiki" thing (google et al. searches, library searches, etc.):
 * Wladyslaw (drop the diacritics),
 * Vladislav,
 * Ladislau,
 * Ladislas,
 * along with the native Władysław.
 * Here we find three variants based on some sort of transliteration, one based on stripping the diacritics, and the native "Władysław," quite frequently used in major/popular references (not just obscure academic history journals) and simply indexed for English usage as if there were no diacritics.
 * I submit that "Władysław is not English" is a red herring. Enough current English references, and going back 3 decades, simply use the native Polish syntax. So, what are the arguments in favor of the four non-Polish variants above?
 * A person "won't know how to type the ł's"—a moot point because redirects handle all the variants.
 * It looks strange—articles should mention all the historical English usage variants; I think it would be a net benefit for (many) English speakers to have a less parochial view of the Latin script; since it already appears to be generally accepted that one can use the native syntax within articles, there's no need to restrict the title.
 * Typing those ł's is damn inconvenient−shift to Polish keyboard, RIGHT-ALT plus "l", it really couldn't be easier... łłłłłłłłłłłłłłłłłłłłłłłłłłłłłłłłłłłłłłłłłłłłłł (repeating key). Took less than 30 seconds to install Polish keyboard support on my PC.
 * A person typing it without the diacriticals won't find the Wiki article from a search engine—perhaps once upon a time, but as far as I can tell, search engines now pretty much ignore diacritics.
 * I can't save the file in Notepad—save as UTF-8.
 * It seems exceedingly odd to me that an encyclopedia which exists only because of computer technology, which provides a built in editor supporting all the (major) "font pages" so one can insert all manner of characters with the click of a mouse, would insist that article titles remain rooted in a technology that became obsolete thirty years ago.
 * English usage no longer means "restricted to the English language character set." Yes, "Władysław is not English"—it's Polish, but "Władysław" is accepted and increasingly preferred English usage. —Pēters J. Vecrumba 03:49, 19 November 2006 (UTC)
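As a rough illustration of the redirect and search points above, here is a minimal Python sketch of how a diacritic-free variant of a title could be derived, for instance to decide which redirect to create. The names `strip_diacritics` and `EXTRA_FOLDS` are hypothetical and not part of any Wikipedia tooling. One detail worth noting: the Polish "ł" has no Unicode decomposition, so removing combining marks alone leaves it untouched and an explicit mapping is needed. The mapping below is an assumption and deliberately incomplete.

```python
import unicodedata

# Letters that have no Unicode canonical decomposition, so they survive
# mark-stripping and need an explicit fold. Illustrative and incomplete.
EXTRA_FOLDS = str.maketrans({
    "ł": "l", "Ł": "L",
    "ø": "o", "Ø": "O",
    "đ": "d", "Đ": "D",
})

def strip_diacritics(title: str) -> str:
    """Return a diacritic-free variant of a title (hypothetical helper)."""
    # NFD splits e.g. "ę" into "e" plus a combining ogonek.
    decomposed = unicodedata.normalize("NFD", title)
    # Drop the combining marks, keeping the base letters.
    without_marks = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    # Fold the letters that NFD cannot decompose.
    return without_marks.translate(EXTRA_FOLDS)

if __name__ == "__main__":
    for name in ["Władysław", "Lech Wałęsa", "Bučkovski"]:
        print(name, "->", strip_diacritics(name))
```

A redirect from each stripped form to the title with diacritics would cover readers who type the plain form; whether search engines fold the two forms together is a separate matter that this sketch does not address.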


 * Don't be silly.
 * It isn't generally a technology limitation. Newspapers and magazines have long been able to include diacritics when they choose to.  They do not choose to.
 * It's a "moot point because redirects handle all the variants"? Utter hogwash.  First of all, redirects don't just happen.  Just look through my contributions listing, and see all the redlinks mentioned in my edit summaries that are still red even after I have called the problems to the attention of those watching the articles.
 * Furthermore, there are much broader implications than what happens when you type something into the go box, or when you put a link into an article. The use of diacritics or not also has significant effects on searches in various search engines; and on the major search engines, there are so many different parameters that affect the results that those results are never predictable.  I can show you a great many cases in which diacritics do indeed make significant differences, on various search engines.
 * We English speakers have every damn bit as much right to establish our own identity in the characters we use in our language as the users of some other language who can't think of any better way to establish their identity than to see how cute they can get with the squiggles they put on the letters they use. I get tired of hearing from far too many editors claiming not that we should choose to use the options with diacritics here on Wikipedia, but rather that it is an "error" for us not to do so.  It is never an error to use the English alphabet when writing in English.  We can at times choose to retain some other language's characters for some purposes; that does not mean that we are in any way obligated to do so.
 * I never used to be in favor of an "English as an official language" law in the United States. After dealing for two years on Wikipedia with POV-pushers who insist on sticking diacritics in hundreds of places where they clearly do not belong, I am now urging my representatives in Congress to not only pass such a law, but not to pass some wimpy, empty mollification of constituents clamoring for such laws by passing some meaningless nonsense, but rather to make a real law with real language police with real authority.
 * It may have taken you less than 30 seconds to install the Polish keyboard on your computer, but big fucking deal. What's the point? Yes, I can do that too.  But the caps on my keys do not change.  What the hell am I supposed to do?  Be like those monkeys they talk about in math classes, who given enough time would duplicate all the works ever written?  Do I just hit keys at random, until something shows up that resembles what I'm supposed to be looking for?  Does that keyboard include "dead keys" that don't do anything themselves, but rather change what happens with the next key you hit?  I may get it installed in 30 seconds, but it is hard to teach an old dog like me new tricks.  It would take me 30 years to learn how to use that Polish keyboard, and I likely don't have that much left.  By the way, I have for many years had both the German and the Norwegian keyboards installed on my computers.  Even there, where I know what the letters are, I have found it more trouble than it is worth to try to learn the keyboard layout.  Rather, I find it much, much simpler to remember a few numbers and use the Alt-numeric keypad method to create them as I need them (and in most cases, it is things like Alt-134 that I remember, the old DOS operating system versions which still work with Windows, rather than the Alt-0229 Windows version which I had to look up because I don't know it).  Of course, in the case of Alt-0248 and Alt-0216, which weren't available in DOS, I learned the Windows numbers.
 * Then, what in the world do I do for the 500 or so other languages used here on the English Wikipedia. How many keyboards can I install?  How much memory does each one of them take up?  How am I ever going to remember what I need to do to switch to the one I want to use, let alone remember what the layout of the keys is if I do figure that out?
 * Your example of the "native" Władysław spelling is the most common late-20th century/early 21st century spelling in the Polish language, of a name that has had dozens of variant spellings both in the Polish language and in other languages throughout history, including the times when many of the people bearing that name lived, and including various other languages actually spoken by the people bearing equivalent names under whatever spelling. Gene Nygaard 03:04, 3 February 2007 (UTC)
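For readers unfamiliar with the two kinds of Alt-numeric codes mentioned above, the following Python sketch shows the distinction under a stated assumption: codes without a leading zero are looked up in the DOS code page (taken here to be CP437) and codes with a leading zero in the Windows code page (taken here to be Windows-1252). The actual code pages depend on the system's regional settings, and `alt_code_char` is a hypothetical helper, not a Windows API.

```python
def alt_code_char(code: str) -> str:
    """Map an Alt-numeric keypad code to a character (illustrative only).

    Assumption: a leading zero selects Windows-1252, otherwise CP437.
    Real systems use whatever OEM and ANSI code pages are configured.
    """
    codepage = "cp1252" if code.startswith("0") else "cp437"
    return bytes([int(code)]).decode(codepage)

def available_in_cp437(char: str) -> bool:
    """Check whether a character can be encoded in CP437 at all."""
    try:
        char.encode("cp437")
        return True
    except UnicodeEncodeError:
        return False

if __name__ == "__main__":
    for code in ["134", "0229", "0248", "0216"]:
        print("Alt-" + code, "->", alt_code_char(code))
```

Under these assumptions Alt-134 and Alt-0229 both yield "å", while "ø" and "Ø" have no CP437 code point at all, which is consistent with the remark that they were not available in DOS.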

Another thing
Another thing: We indeed don't use Chinese characters, but we do use the transliteration which the Chinese use. Zocky | picture popups 13:04, 29 June 2006 (UTC)
 * For that reason romanization systems (like pinyin) are defined as outside the scope of this proposal at Naming conventions (standard letters with diacritics) --Francis Schonken 13:59, 29 June 2006 (UTC)
 * So, the Chinese and Serbs get to use their own transliteration, but Swedes and Czechs don't? Zocky | picture popups 14:03, 29 June 2006 (UTC)
 * Chinese: see Naming conventions (Chinese), this is a romanization guideline.
 * Serbian:
 * Serbian Cyrillic script (Српска Ћирилица), see Naming conventions (Cyrillic): "Latin spelling is used" (the language has a dual writing system, no need to start a romanization in English directly from the Cyrillic spelling)
 * Serbian Latin script (Srpska Latinica), this is the "Latin spelling" of Serbian. It has diacritics. Naming conventions (standard letters with diacritics) intends to deal with these.
 * Swedish: see Naming conventions (Swedish) (proposal). As of writing this, that Swedish proposal is as much a proposal as Naming conventions (standard letters with diacritics). If you think the "Swedish" NC proposal nearer to consensus, please go improve it, and try to find consensus for it.
 * Czech: see Naming conventions (Czech) (proposal). As of writing this, that Czech proposal is as much a proposal as Naming conventions (standard letters with diacritics). If you think the "Czech" NC proposal nearer to consensus, please go improve it, and try to find consensus for it. --Francis Schonken 15:07, 29 June 2006 (UTC)

OK, forget Serbian and take Macedonian, which uses a Latin spelling very similar to Serbian's, but only as a transliteration. Keeping a person with the same last name at Buckovski or Bučkovski (which both would spell Bučkovski themselves), depending on whether they're from Serbia or Macedonia, sounds unworkable.
 * I really have no clue what you're trying to get at. How to romanize the Macedonian Cyrillic script is described at Naming conventions (Cyrillic). Indeed, there, it is described as "may be written as Serbian" (with a few specifics/variants). If you have a problem with that, please direct your concerns to Wikipedia talk:Naming conventions (Cyrillic). Naming conventions (standard letters with diacritics) is not about romanizing Cyrillic scripts. If you want the people involved in the romanization of Cyrillic script languages to read your suggestions, then this talk page is not the right place, Wikipedia talk:Naming conventions (Cyrillic) is. --Francis Schonken 16:19, 29 June 2006 (UTC)

My idea is that all languages should be treated the same - use the same spelling as used in English texts produced in the country of the language's origin. Zocky | picture popups 15:50, 29 June 2006 (UTC)
 * Don't see what that would solve. These English texts produced in the country of the language's origin don't all use the same spelling. And a side-effect would be that you'd make the current *agreement* on National varieties of English (as described at WP:MoS) explode: "the country of the language's origin" would be the UK in that case I suppose, so you'd get all the USA people against you. --Francis Schonken 16:19, 29 June 2006 (UTC)

Read that as "the country of the original language's origin", of course. In other words, spell Slovenian names as English texts produced in Slovenia do, spell Chinese names as English texts produced in China do, and spell American names as English texts produced in the US do.
 * UK would still be "the country of the original language's origin" when speaking about English (the original language) --Francis Schonken 16:49, 29 June 2006 (UTC)

The problem with this proposal excluding romanization is that it would e.g. force Serbian and Croatian names to drop diacritics while the same names used in Macedonia would keep them. Imagine a situation where the presidents of Serbia and Macedonia had the same first or last name, which includes a diacritic both in the Serbian Latin spelling and in the Macedonian romanization. A sentence saying "Sasa Cacic visited Saša Čačovski in Skopje" would look ridiculous. Zocky | picture popups 16:34, 29 June 2006 (UTC)


 * Here is an example showing that English texts produced in the country of the language's origin don't all use the same spelling:
 * Lech Wałęsa,
 * English web page of the Polish Ministry of Foreign Affairs
 * Lech Walesa,
 * on an English page of the Polish Radio
 * on the English pages of the Lech Walesa Institute - note that the Lech Walesa Institute was founded by Lech Walesa himself, so he'd know how to write his name in English wouldn't he?
 * Note that all the mentioned websites are Polish (.pl), and that for the Polish pages of each of these websites always the version with diacritics is used... (I mean: the differences in the English spelling don't result from the often gratuitously assumed "laziness" in this case).
 * So, no, I don't think Zocky's alternate proposal would solve much.


 * Neither for Chinese for that matter, Lao Tzu as well as Laozi (and some other variants) can be encountered in English texts produced in China. --Francis Schonken 16:47, 29 June 2006 (UTC)


 * Of course they don't all use the same spelling, but that's in no way different from English texts produced in English speaking countries, but it would still be the same rule for all languages. Zocky | picture popups 17:19, 29 June 2006 (UTC)

There are two aspects to your proposal (apart from the US/UK English thing, but that could be worked around by formulating the principle carefully):


 * 1. Use local sources in English for determining spelling in English Wikipedia : This has several problems, for one that it would be less compatible with the current provisions of naming conflict. For example for Lech Walesa/Wałęsa, using the table provided by that guideline:
 * {| border=1
|-
! width=60% | Criterion
! width=20% align="center" | Lech Walesa
! width=20% align="center" | Lech Wałęsa
|-
| 1. Most commonly used name in English
| align="center" | 1
| align="center" | 0
|-
| 2. Current undisputed official name of entity
| align="center" | 0
| align="center" | 1
|-
| 3. Current self-identifying name of entity (in English!)
| align="center" | 1
| align="center" | 0
|-
| colspan=3 | 1 point = yes, 0 points = no. Add totals to get final scores.
|}
 * This is a weighted result. It doesn't give precedence to a single principle, and it is compatible with the present "diacritics" proposal. What you propose is that a single principle gets precedence, a principle that doesn't apply equally to all countries/languages (not all countries/languages produce readily available "reliable sources" in English covering everything that is notable about the country, for instance - for several countries the majority of reliable sources in English are produced outside the country).
 * So, as far as your "published in English in home country single principle" proposal is concerned: this might seem a good idea on first sight, but I foresee too many problems, and won't support it.
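The tally described above ("1 point = yes, 0 points = no. Add totals to get final scores") can be sketched in a few lines of Python. The point values are taken directly from the Lech Walesa/Wałęsa table; the variable names are hypothetical, and the sketch assumes each criterion counts equally, as the naming conflict table implies.

```python
# Criteria from the naming conflict table, in order.
CRITERIA = [
    "Most commonly used name in English",
    "Current undisputed official name of entity",
    "Current self-identifying name of entity (in English)",
]

# Points per candidate title, one entry per criterion (1 = yes, 0 = no).
POINTS = {
    "Lech Walesa": [1, 0, 1],
    "Lech Wałęsa": [0, 1, 0],
}

# Add the points to get the final score for each candidate.
totals = {name: sum(points) for name, points in POINTS.items()}

# The candidate with the highest total is the one the table favours.
preferred = max(totals, key=totals.get)

if __name__ == "__main__":
    for name, total in totals.items():
        print(f"{name}: {total} of {len(CRITERIA)}")
    print("Preferred by this tally:", preferred)
```

With the values given in the table, the form without diacritics scores 2 and the form with diacritics scores 1.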
 * "self-identifying name of entity (in English)" is roughly what I'm talking about, and in this case would probably more commonly be the one with diacritics. In fact, for almost all foreign names, items 2 and 3 give points to the name with diacritics, which trumps the English common usage. I guess that's why most articles are at the names with diacritics now. Also, Use English says that for languages which use the Latin alphabet, no transliteration is necessary, which I interpret as "use the original spelling". Zocky | picture popups 02:43, 30 June 2006 (UTC)
 * I've no idea what, in sum, you're trying to say:
 * Currently "self-identifying name of entity" should determine for 33% (the other two thirds being "official name" and "common name in English"), per the naming conflict guideline;
 * Then you say: no, "self-identifying name of entity" should determine for 100%, it is an appropriate formulation of the "published in English in home country single principle";
 * Then you say: no, "self-identifying name of entity" should determine for 0% while it trumps English common usage (which, furthermore, it obviously didn't in the example given above).
 * ...all in all a quite confusing comment.
 * Also, your quote of Use English is very questionable. The sentence where you quote from has quite clearly "If there is no commonly used English name". Arguably, for example, "Andre" is the common English format of the French name "André". Seems also as if you never read the guideline till the end. It has very clearly: "There is disagreement over what article title to use when a native name uses the Latin alphabet with diacritics", in Use English. It is (a part of) that dispute we're trying to solve with the present "diacritics" NC proposal. Your comments above seem so confused to me, that I still don't know what grounds you have to either support the thing getting solved, or not. --Francis Schonken 07:38, 30 June 2006 (UTC)


 * The problem here is the idea that everything has an "English name", which is simply not true. Some things are named in other languages and English uses them as citations. With substantial usage some of these become English words, and sometimes the spelling changes (that's how "Andrew", the real English equivalent of "André" came about). But in most cases where diacritics are used there are no English words, just cited foreign ones.
 * "Self-identifying name" to my mind is simple - what the person or entity uses themself. I have never said that anything trumps common English usage automatically or common English names at all (in fact, I supported titles like Oder, Drave, Save, Styria, etc.). I just meant to comment that if the above template is applied, diacritics would win in most cases, even if versions without diacritics were really "English".

How citations are rendered is a matter of choice, but there's no magic formula that says that dropping the funny dots makes a foreign name or word English. Zocky | picture popups 03:33, 15 July 2006 (UTC)


 * 2. treat the Latin alphabet languages and those with other native scripts with the same rules.
 * In fact I agree with you there. The "caveat" for the non-Latin alphabet languages is a practical one. Wikipedians have elaborated guidelines for Japanese, Chinese, etc... I think they did a good job. I'm not remotely experienced enough in these languages to doubt their assertions that on some level somewhere a more "formal" linguistic romanization system should be used, like pinyin, which results in some diacritics being used. Anyway, that's a different problem, and is, for those languages, covered by active guidelines. I don't think it would be a good idea to undermine that work. Of course, in the short term the guidance for the natively "dual script" languages (how many are there: 2 or 3?) should be clear. Which for Serbian means that, unless the "Cyrillic" naming conventions page is updated in view of the impending diacritics NC guideline, things will be as said if and when this diacritics guideline goes live (change "Latin spelling is used" to "Latin spelling is used including native diacritics" on the Cyrillic NC page, and the thing would be settled too, without Serbian names needing to be changed).
 * Whether at a later stage the Japanese, Cyrillic, Chinese, etc. guidelines are to be brought in line with the "Latin alphabets" diacritics guideline is not a problem to be solved now. Maybe it never happens. If it happens, and it's a language I'm remotely acquainted with (Greek might fall in that category), I'd support a diacritic-free romanization. --Francis Schonken 19:08, 29 June 2006 (UTC)