User talk:Wavelength/Sandbox 4/Alphabetization and collation

This page is for User:Noetica and User:Wavelength to discuss preparation of guidelines for alphabetization and collation on Wikipedia.

Previous discussions
Here is a permanent link to a preliminary discussion of the topic: User talk:Noetica - Wikipedia, the free encyclopedia [section 4: "Alphabetization (given names, surnames, domestic name order, thorn)"].


 * Something like that. I think navboxes should be used a lot more. They certainly can keep things orderly. Don't hesitate to come back here for more technical discussion as needed. I have a few resources to consult, and the topic interests me.
 * – ⊥¡ɐɔıʇǝo N  oetica! T– 20:26, 10 January 2009 (UTC)
 * I've refined and corrected things a little in my post above. – ⊥¡ɐɔıʇǝo N  oetica! T– 22:53, 10 January 2009 (UTC)


 * I have done the alphabetization of Esperantist. Ba Jin (listed at Esperantist) is a pseudonym, which I alphabetized at Ba.  Pope John Paul II (listed at Esperantist) is a titled name, which I alphabetized at John.  This reminds me of Cardinal, which is used as a middle name/title.  It also reminds me of Esquire, which is mentioned last in a name (or maybe I should say "mentioned after a name").
 * Some telephone directories have all Mc and Mac (and maybe M' ) names in a section between the L section and the M section. Also, Mackenzie (with a lowercase k) could be analyzed as being in the M section, rather than in the section for Mc and Mac.  Several Mac names have two forms which differ only as to the capitalization of the next letter.
 * In my previous work on Wikipedia, I have listed items in ASCII-code order, with numerals before letters. If numerals are ordered as the words they represent, then there is ambiguity with 1492, which could be read as "one thousand four hundred ninety-two" or as "fourteen (hundred) ninety-two", and likewise with 2009.  See User:Wavelength/Articles started, sections 2 to 7.
 * Recently, when I added M.C. Mehta v. Union of India (Oleum Gas Leak Case) to List of environmental lawsuits, I left the order as I had arranged it before, but I noticed another problem: the new entry differed from another one (M. C. Mehta v. Kamal Nath) in the spacing of the initials. Perhaps one is right and one is wrong, according to a guideline somewhere on Wikipedia.
 * (All of this is giving me images of crazy quilting.)
 * -- Wavelength (talk) 07:38, 13 January 2009 (UTC)
 * Yes, I can understand your experiencing the crazy-quilting effect. I have edited the lists on the page myself. I do urge a move of Þórbergur Þórðarson to Thorbergur Thortharson; and even without that move, Thorbergur Thortharson would be much better for standard English usage, as in these lists. Such an adaptation is quite normal. We don't refer to Thor Heyerdahl as "Þór", or whatever the original form would be! I have also fixed some punctuation, capitalisation, and the like. The Esperanto word Internacio is best translated as International (SOED, "international": [B. n.] 3 (I-.) Any of various socialist organizations founded for the worldwide promotion of socialism or Communism; spec. = First International, Second International, Third International, Fourth International below. Also, a member of any of these organizations. L19.).
 * One entry was an error, due to confusion with an almost exact namesake. I removed it (see edit summaries). There are articles for several Russians with that same surname, as opposed to first given name and also surname; and while there is a disambiguation page there is not, so far, a DAB tag at the top of every affected page.
 * Language and languages were not designed for strictly rational collation such as alphabetising. We do the best we can, in an imperfect universe. I think we have it sorted out well enough this time. The larger matter of making WP guidelines to deal adequately with alphabetising is separate and more problematic.
 * – ⊥¡ɐɔıʇǝo N  oetica! T– 00:56, 14 January 2009 (UTC)


 * }
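
Several of the points raised in the discussion above (ASCII-code order with numerals before letters, the differing spacing of initials in "M.C." and "M. C.", and the adaptation of Þórbergur Þórðarson to Thorbergur Thortharson) can be illustrated with a sort key. The following is only a minimal sketch in Python, not a proposed guideline: the function name <code>collation_key</code>, the transliteration table, and the sample entry "Zamenhof" are assumptions made for illustration, and the sketch deliberately leaves the Mc/Mac question and the ordering of numerals as words untouched.

```python
import re
import unicodedata

# Hypothetical transliteration table. The discussion only mentions thorn and
# eth (Þórðarson -> Thortharson); the uppercase mappings are assumed by analogy.
TRANSLITERATION = {"Þ": "Th", "þ": "th", "Ð": "Th", "ð": "th"}


def collation_key(name: str) -> str:
    """Return a key for sorting list entries (illustrative only)."""
    # 1. Replace letters that have no accent-stripped ASCII form.
    text = "".join(TRANSLITERATION.get(ch, ch) for ch in name)
    # 2. Strip diacritics: decompose, then drop the combining marks.
    text = unicodedata.normalize("NFD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # 3. Remove the space between adjacent initials, so that
    #    "M. C. Mehta" and "M.C. Mehta" collate identically.
    text = re.sub(r"(?<=\b[A-Za-z]\.)\s+(?=[A-Za-z]\.)", "", text)
    # 4. Ignore case. ASCII digits still sort before letters.
    return text.casefold()


entries = ["Zamenhof", "Þórbergur Þórðarson", "Ba Jin", "1492", "Thor Heyerdahl"]

# Plain code-point order puts "Þ" after "Z"; the key files it under "Th".
by_code_point = sorted(entries)
by_key = sorted(entries, key=collation_key)
```

With the key, the two Mehta cases share the prefix "m.c. mehta v." regardless of how the initials were spaced, and the Icelandic name sorts among the T entries rather than after Z. Whether entries should be displayed in the adapted form, or merely sorted by it, is a separate editorial question that the sketch does not settle.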

Here is a permanent link to a subsequent discussion of the topic: Wikipedia talk:Manual of Style - Wikipedia, the free encyclopedia [section 42: "Alphabetization and collation"].

Here is a permanent link to a third discussion of the topic: Wikipedia talk:Lists - Wikipedia, the free encyclopedia [section 29: "Alphabetization and collation"].

{| class="navbox collapsible collapsed" style="text-align: left; border: 0px; margin-top: 0.2em;"
! style="background-color: #88aaff;" | Third discussion on Lists talk page
|-
|

I am proposing that Wikipedia have a set of guidelines for alphabetization and collation. Here is a permanent link to a preliminary discussion of the topic: User talk:Noetica - Wikipedia, the free encyclopedia [section 4: "Alphabetization (given names, surnames, domestic name order, thorn)"].




 * }

Here is a permanent link to a subsequent discussion of the topic: Wikipedia talk:Manual of Style - Wikipedia, the free encyclopedia [section 42: "Alphabetization and collation"].

-- Wavelength (talk) 22:16, 31 January 2009 (UTC) [I have updated the permanent link and the archived discussion. -- Wavelength (talk) 02:58, 20 February 2009 (UTC)] -- Wavelength (talk) 06:16, 28 April 2009 (UTC)

At this time, the third discussion has added nothing substantial. -- Wavelength (talk) 06:26, 28 April 2009 (UTC)
|}

Features
D1 = first discussion; D2 = second discussion; D3 = third discussion; D4 = fourth discussion; W = User:Wavelength; N = User:Noetica; Ar = User:Army1987; Re = User:Reywas92; Pe = User:Pegship; Fe = User:FeanorStar7; Ng = User:Nowimnthing; Ta = User:Taiwan boi; Kz = User:Kmzundel; Bo = User:Bookgrrl; Ja = User:JackofOz; Kp = User:Knepflerle; Wo = User:Woodstone
