Wikipedia talk:Naming conventions (thorn)

I'm going to post notifications about this new proposal at Village pump (policy) and some other places now. --Francis Schonken 10:11, 7 January 2006 (UTC)

Counterargument
I don't think this proposal is a good idea, here is some reasoning.

Produces idiosyncratic forms
If we transliterate thorn every time it occurs but leave other Icelandic characters alone then we will produce very idiosyncratic forms. Let's take Þorgerður Katrín Gunnarsdóttir as an example. If we transliterate the thorn but leave the other characters intact we get "Thorgerður Katrín Gunnarsdóttir". This is not a form which really sees any use. Here's a Google search.


 * 429 English pages for "Þorgerður Katrín Gunnarsdóttir" -wikipedia
 * 2 English pages for "Thorgerður Katrín Gunnarsdóttir" -wikipedia
 * doubtful result, see below --Francis Schonken 14:43, 7 January 2006 (UTC)
 * This is the search I ran: It yields 2 (or 3) hits when I run it (I'm in London). - Haukur 15:17, 7 January 2006 (UTC)
 * Which isn't a search for English pages, hence the low result. --Francis Schonken 15:29, 7 January 2006 (UTC)
 * By restricting the search to one language you won't get any more results. I get 2 results and both of them are in English. Now, if you can produce more webpages which contain this idiosyncratic form then please name them. - Haukur 19:39, 7 January 2006 (UTC)

We would end up with a very rare form.

Can run counter to the most common principle
Even if we were to decide to transliterate all Icelandic characters into ASCII, something which is done reasonably often, we would still often end up with a form which may not be the most commonly used in English.


 * 418 English pages for "Þorgerður Katrín Gunnarsdóttir" -"Thorgerdur Katrin Gunnarsdottir" -wikipedia
 * 99 English pages for "Thorgerdur Katrin Gunnarsdottir" -"Þorgerður Katrín Gunnarsdóttir" -wikipedia

Most people who have any interest at all in someone like Þorgerður Katrín Gunnarsdóttir, the Icelandic minister of education, will already know a few things about Iceland and Icelandic orthography and will be better served with including the 'Þ' in her name.

Produces ambiguity
Transliterating 'þ' into 'th' can produce ambiguity. For example both 'Þorsteinsson' and 'Thorsteinsson' exist as separate last names in Iceland.

Is too specific
We may well need a naming convention for Icelandic people - there are e.g. some issues with the use of patronyms/matronyms which might be useful to spell out. We also need more discussion on post-ASCII characters and when to use them. A few words on thorn would be at home in a convention on either topic. But making a separate naming convention on a single character seems like overkill to me.

Proposal has a misleading example
Of course the article on the Norse god of thunder should be at Thor rather than Þórr. That's because Thor is the commonly used English name. This has little bearing on the use of 'Þ' in general.

Problems can be solved
Of course some fraction of the people reading the article on Þorgerður Katrín Gunnarsdóttir will not be familiar with the letter 'þ'. For those people we should provide a transliteration within the article - maybe with the 'foreignchar' template. - Haukur 13:06, 7 January 2006 (UTC)


 * I get 490 English pages for "Thorgerður Katrín Gunnarsdóttir" -wikipedia
 * I'm wanting to assume good faith, and that you had a technical glitch or whatever, but your numbers don't seem anywhere near to reliable results.
 * Further, I get 523 English pages for "Thorgerdur Katrín Gunnarsdóttir" -wikipedia, which, with the maybe technical limitations the Google system has, would seem "most common" in a first approach.
 * Anyway, wouldn't it be better if you put together a counter-proposal? I mean, criticism is easy, but maybe better make a proposal either on this talk page, or wherever you like, according to what you think would be most acceptable to the wikipedia community? "We may well need a naming convention for Icelandic people" - what stops you from proposing one?
 * Also, re. "We also need more discussion on post-ASCII characters and when to use them" - I don't know whether we mean the same by "discussion"; anyway "post-ASCII characters" have already been discussed for several months at Naming conventions (Unicode) (draft), for instance user:Curps is involved in this talk. --Francis Schonken 14:43, 7 January 2006 (UTC)


 * There's nothing stopping me from proposing a new naming convention for Icelandic people except that I have limited time and resources to spend on Wikipedia and have to prioritize somewhat. I'm currently working more than I should like on project pages and less than I should like on articles. - Haukur 15:15, 7 January 2006 (UTC)
 * As you please, but with the same amount of energy you put into writing what you wrote on this page, you could've made an alternative guideline proposal too. Naming conventions (Unicode) (draft) has been going on for months, without producing anything more than a draft. So I decided to start helping things to move on, with the limited amount of time I have available. --Francis Schonken 15:46, 7 January 2006 (UTC)


 * A search for "Thorgerdur Katrin Gunnarsdottir" will also yield pages where "Þorgerður Katrín Gunnarsdóttir" occurs which is why those have to be filtered out to get a meaningful result. - Haukur 15:19, 7 January 2006 (UTC)
 * I don't think it would be useful to go into too much detail about the finer points of google testing. Anyhow, what I know is that in diacritics discussions google testing is almost never helpful (google does weird things with diacritics) - the same apparently goes for Þ (I also knew that for Eszett it is not possible to use google testing). Anyway, in short: you also filtered out the google results where both Þorgerður Katrín Gunnarsdóttir and Thorgerdur Katrín Gunnarsdóttir occur, like this webpage: http://womenministers.government.is/ : Þorgerður Katrín Gunnarsdóttir under the photo; Thorgerdur Katrín Gunnarsdóttir two times in the English text (FYI, on a meeting held in Iceland, on an Icelandic website). Sorry, this kind of google testing (certainly when relying on "filtering out") is not possible for characters like Þ --Francis Schonken 15:46, 7 January 2006 (UTC)
 * Google is not a very precise tool but it's better than nothing and it's enough to reveal that on the Internet, even on English-language pages, the spelling Þorgerður Katrín Gunnarsdóttir is most common while Thorgerdur Katrin Gunnarsdottir also sees some use but Thorgerður Katrín Gunnarsdóttir is found almost nowhere. - Haukur 19:45, 7 January 2006 (UTC)


 * I agree with Haukur that this proposal is not a good idea. Stefán Ingi 17:45, 7 January 2006 (UTC)
 * As do I. Our readers are not stupid. If they don't know the thorn yet, they'll learn. I'm very much against dumbing down an encyclopedia to make it "easier". - Nightstallion (?) 20:01, 7 January 2006 (UTC)
 * Don't confuse "dumbing down" (removing information to suit the lowest common denominator) with making the encyclopedia more accessible. It's silly to have characters in a page title that a reader can't even recognise, let alone pronounce and/or make sense of. By all means include the "correct" spelling of a name in the article, but making an article technical and incomprehensible to people who aren't experts in the field is the exact opposite of what WP tries to do in most cases. Stevage 03:49, 8 January 2006 (UTC)
 * Forgive me for saying so, but I fail to understand why having a þ in the title of an article makes it incomprehensible. A reader not familiar with it will wonder what it is, find a transliteration as well as an indication of what part of the world it's associated with in the first sentence, and then no longer be confused.  She'd have a much easier time with it than I do with an article like Heine–Borel theorem, of which I don't understand a single word after "in."  That's a technical and inaccessible article (and yes we have them and need them on Wikipedia); Þorgerður Katrín Gunnarsdóttir is just a regular article with a funny-looking letter in the title.  Please don't overstate the situation here.  Chick Bowen 03:59, 16 January 2006 (UTC)

Intelligible to those interested in the subject
I would imagine that most readers who navigate to an article using þ would (i) either be sufficiently familiar with the letter to read it without problems (indeed, to possibly be majorly annoyed by the deterioration of information incurred by an arbitrary transliteration) or (ii) be sufficiently curious to take 10 seconds of their life to learn this letter. Our readers are expected to be curious and wanting to learn things. For example, about Icelandic subjects. Arbor 18:02, 21 January 2006 (UTC)


 * I agree. - Nightstallion (?) 19:54, 22 January 2006 (UTC)

Support
I think this is a good proposal. When I first came here from the "Proposals" page, I had never heard of this character. Even after having read everything on the proposal and discussion page, I still find the character confusing. For the average English reader, this character is not recognizable or pronounceable. It's unclear how to alphabetize it, it's difficult to link to, and if someone is browsing a category that has an article title with this character, it could be as incomprehensible as if the title were in Chinese or Arabic or Hebrew. Standard characters that are in common use in English, should be the standard for any English-language Wikipedia article. Elonka 02:49, 21 January 2006 (UTC)

I also support this idea; although it probably does need a caveat for Modern Icelandic names, those which do not, or not yet, have names in English. Althing does, though. It should be remembered, however, that to most of the world, thorn is as unknown, and as hard to type, as any Chinese or Malay Malayalam character. Septentrionalis 23:45, 25 January 2006 (UTC)
 * Malay language uses the Roman alphabet in the same configuration as English. Perhaps you mean Jawi? Chick Bowen 18:35, 27 January 2006 (UTC)
 * Malayalam script, used in parts of southern India, would fit with Chinese as writing that is not just, like Russian and Arabic (and probably Jawi), incomprehensible, but hopelessly, desperately incomprehensible to most English speakers. --Jerzy•t 22:41, 27 March 2006 (UTC)


 * I strongly support this. Indeed I would prefer that all article titles, without exception, be transliterated, and any character not in standard use in English not appear in article titles (except redirects), although they may well be used in the body of an article. DES (talk) 21:10, 27 January 2006 (UTC)
 * Sorry, DES, define "in standard use in English." Would you include characters with diacritics ('ö', 'é', etc.)? Chick Bowen 22:29, 27 January 2006 (UTC)
 * Actually I would prefer that diacritics were not used in article titles, except in those few cases where a word with diacritics is the most commonly used form in English, and maybe not even then. As a fall-back, I would prefer that only those diacritics which show up with some frequency in English-language publications be allowed. In any case I would ban all letters from non-roman alphabets (including letters once used in Old English but no longer used in modern English, such as Ash and Thorn) from use in article titles. DES (talk) 22:58, 28 January 2006 (UTC)
 * Out of curiosity, where would you move Encyclopædia Britannica? - Haukur 23:00, 28 January 2006 (UTC)
 * There are always borderline cases, but I would tend to favor Encyclopaedia Britannica, with of course a mention of the ligature in the lead sentence. DES (talk) 23:49, 28 January 2006 (UTC)

PS: vowels with diacriticals are irrelevant to this guideline. --Francis Schonken 10:03, 28 January 2006 (UTC)


 * No they are not. You can't just treat a single letter in isolation, if you want to transliterate you have to have some sort of coherent system for handling Icelandic names. - Haukur 11:47, 28 January 2006 (UTC)


 * Vowels with diacriticals are irrelevant to this guideline. --Francis Schonken 12:29, 28 January 2006 (UTC)


 * *sigh* How do you propose the name "Þorlákshöfn" be represented? - Haukur 15:01, 28 January 2006 (UTC)


 * How would you? Þorlákshöfn? Thorlakshofn appears OK for the title of this English page on an Icelandic website - So I moved Þorlákshöfn to a more common spelling. Would that in any way be a controversial move?--Francis Schonken 15:31, 28 January 2006 (UTC)


 * Yes, it would, as you very well know. Every Icelandic municipality article we have uses diacritics as appropriate - Haukur 15:44, 28 January 2006 (UTC)


 * ...then your way of moving it back would not be less controversial (including the fact that you created five double redirects in the process). There's no guideline currently covering the move you just performed. If you don't have time to write such guideline (as you said above), and if nobody else wrote it, then it doesn't exist. Naming conventions (thorn), on the other hand, conforms to Naming conventions, which is official policy (see: Naming conventions (thorn)). If any of you have something convincing to say why Naming conventions (thorn) would not conform to official policy, go ahead. But I don't think it a good idea to let Wikipedia be ruled by imaginary guidelines, especially not by imaginary guidelines that scorn *existing* official policy. --Francis Schonken 16:20, 28 January 2006 (UTC)


 * Okay, I've now reverted your changes to the redirects as well.


 * The facts on the ground are important. What people editing the articles are actually doing is important, not "imaginary". Your interpretation of policy pages (to which you so heavily contribute) is not the alpha and omega :) - Haukur 16:50, 28 January 2006 (UTC)

Well, let's start Naming conventions (standard letters with diacritics) --Francis Schonken 10:23, 28 January 2006 (UTC)
 * Yes, let's. I am sympathetic with the objection to dealing with letters one by one. But that's not what we are doing. There are diacritics (whether vowels and consonants are similar or distinct.) The debate on Zürich IMO established some principles (that IMO extend to all the many diacritics, including Polish Ł and ł, but let's leave that for that page), leaving a few smaller groups for attention. The ligatures are pretty much of a clear group with similar properties: at least as lower case, æ and œ seem to survive in at least British English, and they also are a recent part of our linguistic heritage. With the French Œ (called in English ethel (or eðel), pronounced with the hard or voiced th of than) and Scandinavian multilingual Æ (English name ash), they make a small but pretty coherent group; the only real gray case for this group is ß (German Eszett, often transliterated as "ss"), which started as a clear ligature of S and Z, but is pretty hard to recognize by now. Thorn is pretty clearly neither modern English, a diacritic variation, nor a ligature, and the same goes for Yogh. Ð (upper case Eth or Edh) looks a lot like a diacritic; I put it with thorn and yogh, because ð (lower-case Eth) is hard to think of as a diacritic. (There's irony here: a Vietnamese upper-case D-with-stroke character is hard to distinguish from upper-case Eth, but the corresponding lower-case Vietnamese letter does look like a lower-case d-with-stroke.) --Jerzy•t 22:41, 27 March 2006 (UTC)


 * Strong support All article titles should be entirely in the English language alphabet. We shouldn't even have to put up with attempts to write the English language Wikipedia in foreign languages. It is illegitimate interference, and I don't suppose any other language encyclopedia has to put up with it. Nathcer 15:53, 25 March 2006 (UTC)
 * It's not, per se, the naïvete of our compadre's diktat, that leaves me blitzkrieged in ennui and angst, even anomie: it's more the chauvinism of seeing one's own local patois as the only one beset with pidginization. --Jerzy•t 22:41, 27 March 2006 (UTC)
 * Support. --Jerzy•t 22:41, 27 March 2006 (UTC)
 * Strong support. This hasn't been discussed in a while, so I'm not sure if I've missed the boat, but I totally agree with this. I find it annoying that people feel the need to excessively use foreign characters in the English Wikipedia. Obviously, the thorn should be used on the Icelandic WP, but not the English one. We have a couple of special characters like æ, but that is used in regular writing (I saw a number of examples in British English while I was in London). Dbinder (talk) 11:52, 18 July 2006 (UTC)

Convention off track
It seems to me that the text as it currently stands is less than ideal for several reasons. First of all, it is proposed that this should be a convention as in this article. As such it should be codifying the current practice but, as has been pointed out, all articles on relevant Icelandic localities and, as far as I can see, also on relevant Icelandic people, use the Þ in the relevant places. Therefore, writing in a convention that we should not be using Þ seems to be going against convention. I might also note that the English wikipedia is not unique in using the Þ. Check the interwiki links at Þingvellir, Þjórsá and Þorgerður Katrín Gunnarsdóttir.

Another problem with this proposal is that it is far too specific. As our page on thorn notes, this letter is only used in writing Old English and Icelandic. I haven't seen any article on an Old English subject use the letter in the title so this proposal seems to only affect articles related to Iceland. As such it would be much more natural to discuss the use of diacritics and ligatures in Icelandic at the same time.

Now, even though the proposal is maybe less than ideal, the discussion has been very good. In particular, several points have been raised which I think are very good to bear in mind: Stefán Ingi 00:14, 29 January 2006 (UTC)
 * 1) We must take care and not go against the common names principle. This applies e.g. to Althing, which is there and not at Alþingi (the modern Icelandic form) or Althingi, but quite properly both of these exist as redirects.
 * 2) Redirects are easy to make and should be used unsparingly. This is generally the case currently, e.g. Thorgerdur Katrin Gunnarsdottir, Thjorsa and Thingvellir but it certainly wouldn't do harm to emphasise this.
 * 3) We must think of the indexing. In categories in which all or almost all of the entries will be related to Iceland it might be acceptable to have Þ in the index but this is debatable. In general categories I do not think this is acceptable. See e.g. where Þingvellir is under Þ. The ideal solution would be a change of software which would sort Þ as Th or even better, give two entries, one at Þ and one at Th. It is very common in indices to have multiple entries when it is reasonable to expect that people might look under many names. This, however, requires a software change so for a more immediate solution, I feel we should reindex the entries manually. By this I mean, edit the Þingvellir entry and make it read  . I feel that the correct way of doing this would be suggesting it here, see if people like it, implement it, see if people still like it and then write it in a convention. I would be very willing to help in doing this.
 * See below for a suggestion of Phil which I think is superior to what I came up with. Stefán Ingi 11:55, 30 January 2006 (UTC)
 * 4) We need to address that readers might be unfamiliar with Þ. Therefore we should always give some version of the title using only the 26 characters and this must feature prominently in the article, i.e. in or before the first sentence. There might not be any single best way of achieving this. See Þingvellir and Þorgerður Katrín Gunnarsdóttir for examples.
 * On that categorisation aspect, are you aware that it is quite acceptable to add a category to a REDIRECT? I have done so at Thingvellir, and also added R from title without diacritics which I commend to you as A Good Thing. HTH HAND —Phil | Talk 10:29, 30 January 2006 (UTC)
 * Thank you for that info. I was sure I had read somewhere that categories for redirects didn't work but clearly I should have gone and tested it. I also think R from title without diacritics is very useful. Stefán Ingi 11:55, 30 January 2006 (UTC)

Oppose
I dislike this proposal very much. Insisting on redirects from other likely spellings is fine, but insisting on an incorrect or inconsistent spelling is absurd. This seems to be an attempt to push the loss of diacritics (etc.) from article titles by the back door, one letter at a time. Þ is no more or less special than ß, and roughly the same arguments apply to both. I can foresee a success here being used to justify similar policies for ß, š, Å, Ĳ, and all the others. Treating letters individually is absolutely not the way to go. Yes, for some people they're unfamiliar and difficult to type, but that's what redirects are for. Please abandon this proposal. --Stemonitis 12:13, 4 February 2006 (UTC)
 * Very strong oppose in most cases. I'll try to keep this very brief relative to my potential to write essays on the issue. I assume that arguments about usage in historical names have been made already, so I won't comment about that other than to say that it is my personal preference to use forms closest to native/original spellings for proper nouns (and adjectives). For contemporary names, especially Icelandic ones (are there other large groups of modern names that use thorn?), it is very difficult for me to understand why anyone would want to intentionally misspell a person's name. "Popular usage" should have no factor, since there is such a thing as the "correct" form of names: official documents (legal names), etc., or even a person's preference. Ardric47 09:50, 8 May 2006 (UTC)
 * What we're all going for here is to try to implement the policy to use English names. This will necessarily result in something different from the original if the original contains elements that cannot appear in English. The question is which elements are to be construed as ones that cannot appear in English. Is any difference from the original to be considered a misspelling? The reductio ad absurdum of that position would be the argument (which I'm sure I'm not the first to raise) that Mao Zedong should be moved to 毛泽东 and Myanmar to [[Image:Myanmar long form.png]]. No one wants to misspell anything, but we are trying to decide where to draw the line between that extreme and what's acceptable. I don't think you've really addressed that question. - Nat Krause(Talk!) 04:31, 10 May 2006 (UTC)
 * Thorn is considered to be part of the Latin alphabet as far as I know. Ardric47 04:59, 10 May 2006 (UTC)
 * I think that the term "Latin alphabet" is somewhat ambiguous: it refers to a variety of European scripts which have a lot of overlap. The Wikipedia article on Latin alphabet currently defines "Latin alphabet" as having 26 letters, although I think this is a bit of an overstatement. Certainly, thorn is a letter of the Icelandic alphabet, which is itself essentially a Latin alphabet. It has the distinction of being one of the few letters in European scripts which is not a modification of one of the original Roman letters. So, I don't think it's accurate to say that thorn is part of the Latin alphabet without qualifying which Latin alphabet is meant. - Nat Krause(Talk!) 23:40, 10 May 2006 (UTC)
 * This report considers thorn (or apparently more properly þorn) to be part of "the" Latin alphabet, as does Unicode (the characters are "Latin Capital Letter Thorn" and "Latin Small Letter Thorn"). Ardric47 01:39, 11 May 2006 (UTC)


 * Oppose. Using standard English names where such exist is common sense and already policy, and thus no guideline is needed to support it. Using non-existent transliterations where no standard English names exist is controversial and needs discussion, but this is not the way to discuss it; eth, for example, needs to be treated differently in Icelandic (typically transliterated to <d>, I believe) and Old English (transliterated to <th>). This policy is therefore a bad idea. The treatment of thorn and similar letters should be decided on a language-by-language basis, not a letter-by-letter basis. - Haeleth Talk 10:03, 11 August 2006 (UTC)

eth?
Is there any likelihood of the treatment of this topic turning out different from that of the eth (Ð, ð)? If not, why don't we combine them and move this to Naming conventions (thorn and eth)? - Nat Krause(Talk!) 04:33, 10 May 2006 (UTC)

Consensus
From my own read of this page, it looks like there's a consensus to make this a formal guideline. Why has it been labeled as "inactive"? What else needs to happen for it to move forward and get out of "proposed" status? --Elonka 00:47, 12 September 2006 (UTC)
 * It is labelled inactive. This might be because it was discussed a bit in January, the reception was very mixed to say the least, and it was not moved up to being a policy or a guideline. Then the discussion sort of died out; maybe once a month somebody said something, and there were a handful of replies or fewer. Stefán Ingi 01:03, 12 September 2006 (UTC)