Wikipedia:Categorization/Sorting names

Ordering names in a category
The default order in which articles are listed on a category page can be changed. For general instructions and conventions, see the sort keys section of Wikipedia:Categorization. There are two techniques for defining a sort order different from the one that would result from the page name:
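 * 1) Adding {{DEFAULTSORT:sortkey}} to the article sets the category sort key for every category in that article that has no sort key of its own, whether the category is listed before or after it.
 * 2) Per listed category, overriding the DEFAULTSORT: [[Category:Category name|sortkey]].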
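
As a rough illustration, the two techniques can be rendered as wikitext strings with a pair of small Python helpers. The function names `defaultsort` and `category_link` and the category name used below are hypothetical, chosen only for this sketch; the strings they emit follow the DEFAULTSORT and piped category link forms described above.

```python
from typing import Optional


def defaultsort(sort_key: str) -> str:
    """Render a page-wide sort key (technique 1) as wikitext."""
    return "{{DEFAULTSORT:" + sort_key + "}}"


def category_link(category: str, sort_key: Optional[str] = None) -> str:
    """Render a category link; a sort key given here (technique 2)
    overrides the page-wide DEFAULTSORT for this one category."""
    if sort_key is None:
        return f"[[Category:{category}]]"
    return f"[[Category:{category}|{sort_key}]]"


if __name__ == "__main__":
    print(defaultsort("Surname, Forename"))
    print(category_link("Example category"))
    print(category_link("Example category", "Forename Surname"))
```

A category link without its own key falls back to the page-wide key, which is why the second helper treats the sort key as optional.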

The sort key should mirror the article's title as closely as possible, while omitting disambiguating terms. Some exceptions are made, however, to force correct collation.

Please note that some named individual animals have titles included in the article name (for example, Sergeant Stubby, a dog with a formal military rank) and are therefore subject to this guideline.

Sort by surname
If the article is titled "Forename Surname", the category should be added to the article as [[Category:Category name|Surname, Forename]] (or: {{DEFAULTSORT:Surname, Forename}}) so that it will be sorted by surname (surname and family name are used interchangeably in this article). However, there are exceptions depending on customs, where a person lives and when they lived. If the country is not listed, try consulting Names of persons: national usages for entry in catalogues, listed in the bibliography section. It documents how librarians and institutions in each country sort names. However, the sort value it gives may be inappropriate outside that country.
 * Arabic names or Islamic names historically had no family or given names, but a full chain of names. These names should be sorted as they are written out. However, after 1900, Arabic names became similar in structure to those of Western names, and these should be sorted as if they were Western names. Certain areas form exceptions: for example, in Malaysia, Islamic names follow a patronymic pattern, as do a subset in Afghanistan, Pakistan and Bangladesh.
 * Modern names with Abu, Abd, Abdel, Abdul, ben, bin and bint are considered compound names, and the particles are integral to the name. Osama bin Laden is sorted {{DEFAULTSORT:Bin Laden, Osama}}. Mounir Fakhry Abdel Nour is sorted {{DEFAULTSORT:Abdel Nour, Mounir Fakhry}}.
 * Burmese names have no surnames or patronymic system, so they are sorted as they are written. However, if the person's common name includes an honorific, the name should be sorted on the elements that follow the honorific. U Thant is sorted {{DEFAULTSORT:Thant, U}}.
 * Chinese names, Korean names, Vietnamese names and Cambodian names are generally written with the family name first: Mao Zedong is sorted {{DEFAULTSORT:Mao, Zedong}}.
 * Eritrean and Ethiopian (Habesha) names that use a patronymic system are sorted as they are written.
 * Icelandic names are generally patronymic and occasionally matronymic, with a person's last name derived from their father's or mother's given name. For example, Arnaldur Indriðason is the son of Indriði G. Þorsteinsson. Normally a patronymic name is sorted as it is written. However, on English Wikipedia, the DEFAULTSORT value is Western order, overridden for Icelandic categories, where the sort key is as the name is written. Arnaldur Indriðason is sorted, while the Icelandic category of photographers is done,  . For the listas parameter in project templates on article talk pages, use the DEFAULTSORT value (since it mainly categorises in non-Icelandic categories), e.g.,.
 * Indonesian names may be sorted by surname or in the order they are written, depending on the ethnic background of the individual. Names of the Javanese (the most populous ethnic group in Indonesia) do not generally include surnames and may be sorted in the order they are written.
 * Japanese names for people born after 1885 follow Western order. For people born before 1885, names followed the same practice as Chinese names.
 * There are exceptions. Sumo wrestlers, geishas, kabuki actors, and practitioners of traditional crafts and arts may take professional names. These names follow the same practice as Chinese names. Sumo wrestler Toyohibiki Ryūta's sort value is.
 * Malaysian names usually use a patronymic system and are sorted as they are written. There are exceptions; most notably, Malaysian Chinese names are handled as regular Chinese names.
 * Portuguese names (Portugal only) are commonly composed of one or two given names and two family names. In a compound family name, the first name is the mother's maiden name and the second is the father's surname. These names should be sorted on the last element, the father's name. Francisco da Costa Gomes is sorted {{DEFAULTSORT:Gomes, Francisco da Costa}}.
 * Spanish names are similar to Portuguese names in that they are commonly composed of one or two given names and two family names. However, in a compound family name, the first name is the father's name, while the second name is the mother's name. The sort value depends on how many names are in the article's title. For Gabriel García Márquez, with two family names and one given name, the sort is {{DEFAULTSORT:Garcia Marquez, Gabriel}}. For José Ignacio García Hamilton, with two family names and two given names, the sort is {{DEFAULTSORT:Garcia Hamilton, Jose Ignacio}}. Be careful, as the article's title may include any combination of given names and family names.
 * Thai names have only contained a family name since 1915, and the name follows the Western pattern of "given name, family name". However, people in Thailand are known and addressed by their given name. In categories mostly containing articles about Thai people, all names should be sorted with the given name first. For example, Thaksin Shinawatra is sorted as "Thaksin Shinawatra". The category page should indicate that its entries are sorted this way and why; the Thai people category template can be used for this. In categories that contain relatively few Thai names, those names should be sorted without the "given name before family name" exception, which applies only to categories that predominantly contain Thai names and are sorted entirely the Thai way. user:cewbot is now maintaining sort keys in Thai-people categories.
 * Most Muslim Turkish names before 1934 had no surname. After 1934, people adopted surnames.
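
The conventions in this list reduce to a few basic shapes of sort key. The following Python sketch shows three of them; the function names are hypothetical, and the code only formats a title that an editor has already classified. It does not decide which convention applies to a given person, which depends on the cultural and historical factors described above.

```python
def surname_first_key(title: str) -> str:
    """Western order: 'Forename [Middle] Surname' -> 'Surname, Forename [Middle]'.
    Assumes the surname is the single last word of the title."""
    parts = title.split()
    if len(parts) < 2:
        return title
    return f"{parts[-1]}, {' '.join(parts[:-1])}"


def family_name_first_key(title: str) -> str:
    """Names written family name first: 'Family Given' -> 'Family, Given'.
    Assumes the family name is the single first word of the title."""
    parts = title.split(maxsplit=1)
    if len(parts) < 2:
        return title
    return f"{parts[0]}, {parts[1]}"


def as_written_key(title: str) -> str:
    """Patronymic and similar names: the key is the title as written."""
    return title


if __name__ == "__main__":
    print(surname_first_key("Forename Surname"))
    print(family_name_first_key("Mao Zedong"))
    print(as_written_key("Arnaldur Indriðason"))
```

Compound family names, particles and honorifics need the extra handling described in the individual entries and are deliberately left out of this sketch.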

Historical patronymic names
The patronymic system was once common throughout Europe and in some other parts of the world. See Patronymic for the list of systems used in each country. Patronymic names should be sorted on the first name. The following notes help identify the relevant historical people in some of the more common languages:
 * East Slavic languages (Russian and Ukrainian) use the endings -ovich, -ovych, -yevich and -yich to form patronymics for men. For women, the endings are -yevna, -yivna, -ovna, -ivna or -ichna. For example, in Russian, a man named Ivan with a father named Nikolay would be known as Ivan Nikolayevich or 'Ivan, son of Nikolay'.
 * Irish names were formed by using Mac for "son of", Ó or Ua for "grandson of", Ní for "daughter of the grandson of", Nic for "daughter of the son of" and finally, Uí for "wife of the grandson of". The transition to fixed surnames began around 1000 and was completed after 1200. An example would be Ailill mac Dúnlainge, son of Dúnlaing mac Muiredaig.
 * Jewish names were formed by using ben or bar for "son of" and bat for "daughter of". Permanent surnames started in the Iberian Peninsula around 1000 and spread eastward over the next 700 years.
 * Scandinavian names (Danish, Swedish and Norwegian) were formed by using the endings -son, -søn or -sen to indicate "son of", and -dóttir, -dotter or -datter for "daughter of". Denmark outlawed the patronymic system in 1828, Sweden in 1901 and Norway in 1923. However, the countries started to abandon the patronymic system much earlier. The nobility and academics started using surnames in the mid-1500s and the middle class around 1700, with most people having surnames in the 1800s. An example of a patronymic name would be Sverker Karlsson, the son of Karl Sverkersson. See also the section about Icelandic names above.
 * Scottish names began using fixed surnames around the 12th century, though patronymic naming continued in some areas until the 1700s. In the Gaelic language, the word meaning son is mac. The word meaning daughter is nic. Máel Coluim mac Donnchada was the son of Donnchad mac Crínáin and is sorted as written, on the first name.
 * Welsh names before the 1536 Act of Union were mostly patronymic, although people had by then been using fixed surnames for over 100 years. The patronymic practice continued after 1536 and is still used today. In the Welsh language, the word meaning son is ap or ab. The word meaning daughter is merch or verch (modern spelling ferch). Rhiryd ap Bleddyn was the son of Bleddyn ap Cynfyn and is sorted as written, on the first name.
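
The free-standing patronymic words mentioned in this list (mac, nic, ap, ab, verch, ferch, ben, bar, bat) can be spotted mechanically, which may help when reviewing a batch of titles. The Python sketch below is only a heuristic under stated assumptions: the marker set and function names are hypothetical, suffix-based systems such as -ovich or -son are not handled, and the code cannot tell whether a person lived before or after fixed surnames were adopted, so an editor still has to make that call.

```python
# Free-standing patronymic words taken from the list above (lower-cased).
PATRONYMIC_MARKERS = {"mac", "nic", "ap", "ab", "verch", "ferch", "ben", "bar", "bat"}


def has_patronymic_marker(title: str) -> bool:
    """True if a marker word stands between the first and last word of the title."""
    words = title.split()
    return any(word.lower() in PATRONYMIC_MARKERS for word in words[1:-1])


def patronymic_key(title: str) -> str:
    """Historical patronymic names are sorted on the first name, i.e. as written."""
    return title


if __name__ == "__main__":
    for name in ("Rhiryd ap Bleddyn", "Ailill mac Dúnlainge", "Douglas MacArthur"):
        print(name, has_patronymic_marker(name))
```

A surname such as MacArthur is a single word, so it is not flagged; only a separate marker word inside the name counts.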

Nobility

 * Kings, queens, emperors, emirs, sultans, popes and others known by their official names should be sorted as spelled out. An ordinal number is converted to an Arabic numeral with a leading zero. Louis IX of France's sort value is {{DEFAULTSORT:Louis 09 of France}}. In some cases, redundant information can be left off the sort key used for a particular category.
 * European princes and princesses are sorted by their given name. Prince William is sorted . Because of the prevalence of princes with the same name, Arabic or Muslim princes are sorted by their given name, but a second name (usually their father's given name preceding bin or ibn) is added. Prince Talal bin Abdul-Aziz Al Saud, whose father is King Abdul-Aziz, is sorted.
 * British peers are sorted by name of the title rather than surname, e.g. Robert Gascoyne-Cecil, 3rd Marquess of Salisbury is alphabetized under "Salisbury", not "Gascoyne-Cecil" or "Cecil": {{DEFAULTSORT:Salisbury, Robert Gascoyne-Cecil, 3rd Marquess of}}.
 * Some peers are almost invariably known by some name other than their peerage (which will not, in such cases, appear in the article title); for example, Frederick North, Lord North (who was 2nd Earl of Guilford) or Anthony Eden (who was 1st Earl of Avon). This should be followed for most categories, sorting them under North,... and Eden,...; but categories directly relating to the peerage should still sort them under it, that is, under Guilford and Avon, respectively.
 * Unless necessary for identification, Sir, Dame, Lord and Lady should be omitted from the sort value.
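
The ordinal rule for monarchs and popes is mechanical enough to sketch in Python. The function names below are hypothetical, and the two-digit width is an assumption: the guideline only asks for a leading zero. The sketch converts any word that is a well-formed upper-case Roman numeral, so a title containing a stray capital initial such as "X" would be converted too; it is an illustration rather than a safe bulk-editing tool.

```python
import re

ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}

# Matches a non-empty, well-formed upper-case Roman numeral.
ROMAN_RE = re.compile(
    r"^(?=[IVXLCDM])M*(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})$"
)


def roman_to_int(numeral: str) -> int:
    """Convert a well-formed Roman numeral to an integer."""
    total = 0
    for index, char in enumerate(numeral):
        value = ROMAN_VALUES[char]
        next_value = ROMAN_VALUES[numeral[index + 1]] if index + 1 < len(numeral) else 0
        # A smaller value before a larger one is subtracted (IV, IX, XL, ...).
        total += -value if value < next_value else value
    return total


def regnal_key(title: str, width: int = 2) -> str:
    """Replace Roman-numeral words with zero-padded Arabic numerals."""
    words = []
    for word in title.split():
        if ROMAN_RE.match(word):
            words.append(str(roman_to_int(word)).zfill(width))
        else:
            words.append(word)
    return " ".join(words)


if __name__ == "__main__":
    print(regnal_key("Louis IX of France"))
    print(regnal_key("Henry VIII of England"))
```

Zero-padding matters because sort keys are compared as text: without it, "Louis 10" would sort before "Louis 9".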

Other exceptions

 * Eliminate epithets: e.g. "Saint" in Saint Alban:.
 * Generational suffixes (e.g., "Jr." or "III") should be placed at the end of the sort key, rather than with the surname: Robert J. Smith II sorts as {{DEFAULTSORT:Smith, Robert J. II}}, not {{DEFAULTSORT:Smith II, Robert J.}}.
 * Hyphens, apostrophes and periods/full stops are the only punctuation marks that should be kept in sort values. All other punctuation marks should be removed. The one exception is that the apostrophe should be removed from names beginning with O'. For example, Eugene O'Neill is sorted {{DEFAULTSORT:Oneill, Eugene}}.
 * Clerical titles, academic titles, military titles and honorifics should not be used in sorting. For example, Martin Luther King Jr. is sorted {{DEFAULTSORT:King, Martin Luther Jr.}}, without the titles "Doctor" or "Reverend" that reflect his academic and clerical achievements.
 * Surnames beginning with Mac or Mc are sorted as they are spelled. Douglas MacArthur is sorted {{DEFAULTSORT:MacArthur, Douglas}} and Malcolm McDowell is sorted {{DEFAULTSORT:McDowell, Malcolm}}. This also follows the British standard (BS 3700:1988) and the ISO 999:1996 standard for preparing indexes.
 * Names with particles or prefixes are a complex field and there are exceptions and inconsistencies. Examples of particles are af, al, dall, da, de, della, di, do, dos, du, el, la, o, and von. Whether or not to include the particle in sorting can be up to the individual's personal preference, traditional cultural usage or the customs of one's nationality.
 * Generally, Dutch, French, German, Italian, Portuguese, Spanish, and Swedish names do not include lowercase particles in sorting, but do include uppercase particles. For example, Otto von Bismarck is sorted {{DEFAULTSORT:Bismarck, Otto von}}, Jean de La Fontaine is sorted {{DEFAULTSORT:La Fontaine, Jean de}}, and Alberto Di Chiara is sorted {{DEFAULTSORT:Di Chiara, Alberto}}.
 * American, Australian, Canadian, and English names generally sort on the prefix, regardless of capitalization. However, there are discrepancies between different sources on whether to sort on the prefix or not.
 * In Belgium, Dutch/Flemish and French/Walloon names sort differently by time period. For people in the Southern Netherlands (Belgium) before 1830, surnames are sorted on the body of the surname and not on the prefix(es). For example, Rogier van der Weyden is sorted {{DEFAULTSORT:Weyden, Rogier van der}} and Gérard de Lairesse {{DEFAULTSORT:Lairesse, Gerard de}}. In contrast, Belgian people since 1830 are sorted on the prefix. For example, Paul van Ostaijen is sorted {{DEFAULTSORT:Van Ostaijen, Paul}} and Christian de Duve is sorted {{DEFAULTSORT:De Duve, Christian}}.
 * In South Africa and Namibia, Dutch/Afrikaans and German surnames are sorted by prefix, e.g. F. W. de Klerk is sorted {{DEFAULTSORT:De Klerk, F. W.}}.
 * In modern Arabic or Islamic names, the prefixes al and el, regardless of capitalization, are never part of a family name for indexing. For example, Osama Al-Muwallad is sorted under "Muwallad" and Ezzat el Kamhawi is sorted under "Kamhawi".
 * Sometimes the name containing the prefix is not a family name, but a description of where the person is from. In these cases, the sort value is the entire name as spelled. For Peire de Corbiac, "de Corbiac" describes where Peire is from, the town of Corbiac. So, the name means 'Peire of or from Corbiac' and is sorted {{DEFAULTSORT:Peire de Corbiac}}.
 * Sometimes a given name is combined with neither a surname nor a peerage title; it is preferable to sort on the first name in these cases. Example: for Augustine of Hippo, use {{DEFAULTSORT:Augustine of Hippo}} or simply {{DEFAULTSORT:Augustine}}.
 * Some people are known primarily by their first name only. When it is not possible to set the first name alone as the article title, as with many articles in Category:Brazilian footballers, sort with the first name first to make the article easier to find in the categories. For example, Leonardo Araújo is commonly known as Leonardo, and should be sorted under "Leonardo" rather than "Araújo".
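
Two of the rules above, generational suffixes at the end of the key and the lowercase versus uppercase particle distinction for continental European names, can be combined in a short Python sketch. The function name `particle_aware_key` is hypothetical, the suffix set is an assumed sample, and the particle set is the one listed above plus "van" and "der" from the Belgian example. The sketch does not cover the country-specific exceptions (English-speaking countries, Belgium since 1830, South Africa, Arabic al/el, descriptive place names), which need an editor's judgment.

```python
GENERATIONAL_SUFFIXES = {"Jr.", "Sr.", "II", "III", "IV"}

# Particles named in the list above, plus "van" and "der" from its examples.
PARTICLES = {
    "af", "al", "dall", "da", "de", "della", "di", "do", "dos", "du",
    "el", "la", "o", "von", "van", "der",
}


def particle_aware_key(title: str) -> str:
    """Build 'Surname, Given [suffix]' where capitalised particles join the
    surname and lowercase particles stay with the given names."""
    words = title.split()
    if len(words) < 2:
        return title

    # A generational suffix goes at the very end of the key, not with the surname.
    suffix = ""
    if len(words) > 2 and words[-1] in GENERATIONAL_SUFFIXES:
        suffix = words.pop()

    # Walk left from the last word, absorbing capitalised particles into the
    # surname while always leaving at least one given name.
    start = len(words) - 1
    while (
        start > 1
        and words[start - 1].lower() in PARTICLES
        and words[start - 1][0].isupper()
    ):
        start -= 1

    key = f"{' '.join(words[start:])}, {' '.join(words[:start])}"
    return f"{key} {suffix}" if suffix else key


if __name__ == "__main__":
    for name in (
        "Otto von Bismarck",
        "Jean de La Fontaine",
        "Alberto Di Chiara",
        "Robert J. Smith II",
    ):
        print(particle_aware_key(name))
```

Because the loop stops at the first lowercase particle, "de" in Jean de La Fontaine stays with the given name while "La" is kept with the surname.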