Wikipedia talk:Overcategorization/Archive 9

BLP, ethnicity, gender
Resolve arguments about differences between guidelines. Add "ethnicity, gender," to BLP, matching all other guidelines.

To avoid repeating myself ad infinitum:
 * All categorization is required to be both notable and relevant.
 * Certain quibblers have noted that ethnicity and gender are not specifically listed in WP:BLP.
 * WP:BLP is a "policy", while Categorization, Categorization of people (WP:COP), Category names, WP:EGRS, and Overcategorization (especially WP:OC) are "guidelines".
 * Certain quibblers argue that policy trumps guidelines for these special cases.
 * Thus, (non-notable or irrelevant) ethnicity and gender might be allowed for living people, but removed for the dead, undead, or incorporeal.
 * This is difficult to enforce or implement (and was certainly never the intent of the policy).

Please visit the Talk section above to certify the slight wording change. --William Allen Simpson (talk) 15:16, 7 March 2011 (UTC)

Removed links to avoid confusion
At WP:OC there were links to WP:N and WP:TRIVIA, neither of which has anything to do with categories. We can understand the concepts of "importance" (which I changed "notable" to) and "trivia" without irrelevant Wikilinks that only serve to muddy the waters. If you don't know the meaning of "important" and "trivia", read a dictionary. If they need defining, we must do that here, as relates to categories, not by deferring to other guidelines that were not written with categories in mind. Here's my edit: Fences & Windows 01:50, 9 March 2011 (UTC)


 * The concept of notability needs to be applied consistently across Wikipedia. We should NOT be redefining notability here. This guideline does need to clarify how the notability guideline applies to categories, and a link to that guideline is relevant. -- Donald Albury 12:14, 9 March 2011 (UTC)
 * Not if the concept of "notability" as defined in the notability guideline doesn't apply to categories. Which I think it doesn't - it's about deciding whether a particular subject "is notable", not whether a particular property is a notable feature of some subject.  --Kotniski (talk) 12:54, 9 March 2011 (UTC)

I have restored the links. Please don't make significant changes without spending a week or so here first. And you cannot redefine notability or trivia here. They are the same across wikipedia. Although a subject may be notable, each categorization MUST be in regard to that notability, not some trivia. --William Allen Simpson (talk) 15:21, 9 March 2011 (UTC)
 * But if you click on the links and read the target pages, you will find (I think) that the concepts defined there simply do not apply to the context in which the words are used here. In other words, you're sending your potential reader on a wild goose chase, looking for explanation of a term on a page that we know doesn't contain any explanation of that term (as used here). --Kotniski (talk) 15:34, 9 March 2011 (UTC)


 * I think that the concept that notability is established by "significant coverage in reliable sources that are independent of the subject" is pertinent to categorization, in that an article should be placed in a category only if that membership is supported by "significant coverage in reliable sources that are independent of the subject." We don't need categories added based on rumors, deduction, or original research. -- Donald Albury 00:05, 10 March 2011 (UTC)
 * Fair point; but if we're going to extend the scope of the "notability" concept to cover facts as well as topics, we should probably say something about that explicitly in the notability guideline (and then perhaps link to that section). Or simply define here what we mean by notability in relation to facts. At the moment the notability guideline you want to link to is very clearly oriented towards deciding the notability of topics, without giving any indication that similar criteria could be applied to assertions of fact.--Kotniski (talk) 07:19, 10 March 2011 (UTC)
 * As you describe it, that's not notability, that's verifiability. Of course membership of a category should be verifiable and ideally verified in the article. That is not the same as notability, which is an article inclusion guideline. Fences & Windows 21:48, 10 March 2011 (UTC)


 * Well, I don't agree. You made a change and two of us objected. At this point, if you want to change this guideline, you will need to find a consensus to do so. -- Donald Albury 09:57, 10 March 2011 (UTC)
 * That's a rather unhelpful thing to write. We're in the process of discussing to decide on the best thing to do. Just saying "I disagree" or "we objected" doesn't get us anywhere - we need to know why you disagree (and indeed, what you specifically disagree with).--Kotniski (talk) 10:57, 10 March 2011 (UTC)
 * What do you suggest, other than removing the links? -- Donald Albury 11:50, 10 March 2011 (UTC)
 * Well, we could write out explicitly what we mean by "notable" and "trivial" in this context. It doesn't seem to be written anywhere else.--Kotniski (talk) 11:52, 10 March 2011 (UTC)
 * If you support "notability" and "trivia" being linked, then you must explain how those guidelines apply to categories, as the pages themselves do not discuss categories (WP:TRIVIA is about trivia/miscellaneous sections in articles). Expecting each categorisation of a topic to have received significant coverage in multiple reliable sources is totally unrealistic, and strays very far from the purpose of categories, which is navigation among related articles. Must we have had several news articles written about the year of someone's birth or their age to categorise them according to their birth year? Must we have had a book written about someone coming from their home town before we can include that category? "Notability" is not the right term here and the links only confuse - they are simply a fig leaf for the fact that we've failed to properly explain what our criteria are for including a category on an article. Objecting simply on the grounds of "it's always been that way" and "you've not spent enough time here" (William Allen Simpson, you don't WP:OWN this guideline) is just being obstructionist. Saying "I object!" is not how we reach consensus - that involves debate. I suppose I will have to open an RfC about this if the regulars on this page are going to dig their heels in. Fences & Windows 21:48, 10 March 2011 (UTC)

OK, I propose the following wording for the section in question:

== Non-defining or trivial characteristic == 

:Example: Bald People, Famous redheads, Age of death

In general, categorize by what may be considered notable in a person's life, such as his or her career, origin and major accomplishments. Notability of a defining characteristic in a person's life is established by significant coverage of that characteristic in reliable sources that are independent of the subject. In contrast, someone's tastes in food, their favorite holiday destination, or the number of tattoos they have may be considered trivial. Such things may be interesting information for an article, but not useful for categorization. If something could be easily left out of a biography, it is likely not a defining characteristic.

Note that this also includes grouping people by trivial circumstances of their deaths, such as categorizing people by the age at which they died or by whether they still had unreleased or unpublished work at the time of their death. Even though such categories may be interesting to some people, they aren't particularly encyclopedic.

That eliminates the link but brings in what I think is important, the definition of notability. -- Donald Albury 00:13, 11 March 2011 (UTC)
 * Looks good; but as in the discussion referred to in the previous thread, we must keep clear in our minds, and in our words, whether we are talking about the question of whether to assign an article to an existing category, or whether to allow a particular category to exist. This guideline seems to be intended to be about whether particular categories should exist, so we're not talking about a specific person - we're talking about whether the category represents a property which would generally be expected to be a notable feature of the people who have it. Once a category exists, we tend to just put articles into it as appropriate according to the sourced statements in the article.--Kotniski (talk) 07:37, 11 March 2011 (UTC)


 * I think Kotniski is onto something. The whole issue of how to determine which categories to put someone in is not an issue here. You may be able to find multiple articles that are independent and discuss someone being bald, but being bald is still trivial. The issue is that it is not something like nationality, place of origin, place of education, employer, career, elected office held or so forth that is either a distinct and clear thing about someone or a notable thing. This page is focused on what types of categories should not be created. Thus we discuss that Category:Roman Catholic athletes is not a useful category; however, which actual athletes should also be categorized as Category:American Roman Catholics, Category:Italian Roman Catholics and so forth is an issue of categorizing people and not an issue of creating categorizations. I do wonder if a separate page that would discuss the dos and don'ts of putting people in categories is worthwhile. Such as "do not put people in a category when there is no evidence that they belong there" or "do not put someone in Category:American Roman Catholics if they were baptized at birth but there is no evidence they ever went to mass or publicly identified as a Roman Catholic". John Pack Lambert (talk) 22:15, 28 April 2011 (UTC)

Contradiction between categories and WP:OC#EGRS
I wanted to know if there is a reason we have Category:Sportspeople by ethnicity. There seems to be a clear contradiction between WP:OC and the existence of this category; surely all that it achieves is encouraging the creation of subcategories without anyone showing that there is a substantial article to create.

When a person appears to be able to have up to 7 (at least) ethnicities under current Wikipedia usage, it seems difficult to avoid issues of WP:UNDUE. Tetron76 (talk) 10:18, 5 April 2011 (UTC)


 * A person really needs to be prominent in some sort of way within their ethnicity. We once had categories of "people by religion." So some politician who was supposedly religion X would be listed that way even if he was never seen to practice religion X and violated most of its principles. I had thought we had put a stop to this. Maybe Mohammed Ali and Islam and maybe African American, but not most people. More than one seems preposterous and pretentious on the part of the biographer/editor. The ubiquitous "they" have traced Obama's antecedents to include "Irish" but that hardly defines him and would be silly to use IMO. (Good luck! :)  Student7 (talk) 15:56, 7 April 2011 (UTC)


 * The listed ethnicities are "Sami", "Romani", "Basque" and "Catalan". The fact that Category:Sportspeople by nationality is a subcategory seems to suggest an odd ordering of the tree. Nationality is not ethnicity. There has been a fight to get rid of the "European-American basketball players" and such categories but we have made little progress on it. In the case of "Romani people", they are a trans-national ethnicity, and so having this as a sub-set of Category:Romani people by occupation does make sense. John Pack Lambert (talk) 22:21, 28 April 2011 (UTC)
 * "We once had categories of people by religion" was said above? I think that this is a misnomer. We still have categories of people by religion, including Category:American Roman Catholics and hundreds of others. We have dismantled many of the religion+occupation categories that cover non-notable intersects, but it is not clear that all our religion+writers cats are being limited to intersects where the religion affects the writing. J. R. R. Tolkien clearly belongs in Category:British Roman Catholics because his Catholicism is known and notable and some even see it influencing his works in subtle ways, but I do not think he should go in Category:Roman Catholic writers. John Pack Lambert (talk) 22:23, 1 February 2013 (UTC)

Category:Indian films by topic
Does overcategorization apply to Category:Indian films by topic? Please advise what is the rule on this. Wiki-uk (talk) 12:47, 29 April 2011 (UTC)
 * So far no reactions here. I have added the phrase "Note: please avoid overcategorization and overpopulation." on top of the category. My feeling is that as long as an average maximum of two or three main themes of a movie are added, it should be fine. Please note an extreme example of 200 keywords on the IMDB page for Devdas (2002). Wiki-uk (talk) 04:57, 4 May 2011 (UTC)

Category:Priory of Sion hoax
Here is a note about removing lots of folks from Category:Priory of Sion hoax.

I saw Isaac Newton (1643-1727) linked up to the Category:Priory of Sion hoax, which is dealt with in Priory of Sion (hoax) (ca. 1920-1956). As far as I can conclude, the life span of Isaac Newton precludes him from either partaking in or debunking the hoax, although he was a very skillful debunker of mint counterfeits in his time. I think it is wrong to link up real individuals mentioned in a category reflecting a:
 * less-than-notable
 * fiction story
 * because it links from a fictional universe to the real universe.
I removed the vast majority of guys mentioned in the hoax who didn't partake in the controversy around it (1953 onwards). But I don't know how this relates to Overcategorization; I just have a gut feeling that it is an overcategorization de luxe. Rursus dixit. ( m bork3 !) 10:15, 17 May 2011 (UTC)


 * I guess maybe WP:OC. More? Rursus dixit. ( m bork3 !) 10:22, 17 May 2011 (UTC)

Non-defining vs. trivial clarification proposed
The first entry in this guideline is titled "non-defining or trivial characteristic". I think this is unnecessarily confusing, because if the history of CFD has come to any broad conclusions (and there are few), it has affirmed that a "defining" characteristic is not the same as saying that the characteristic is "non-trivial". "Defining" means something significantly more central to the thing than simply stating it is "non-trivial", "interesting", or "notable". We run into this confusion quite regularly at CFD, and this page may well be the source of the confusion. We haven't been able to reach a consensus on what "defining" means exactly, and I don't want to attempt yet again to craft a definition, but what I am proposing is to separate this first entry into two separate ones: "Non-defining characteristics" and "trivial characteristics", since they are not at all the same type of overcategorization. Any objections, or other thoughts on this? Good Ol’factory (talk) 23:51, 5 September 2011 (UTC)


 * With no comments in favour or against, I'll go forward with this small change. But please still raise the issue here if you wish to discuss it. Good Ol’factory (talk) 01:56, 12 September 2011 (UTC)


 * No problems with the change. Here's a definition of "defining" to get the ball rolling: A characteristic of a subject that is "defining" is one that reliable, secondary sources commonly and consistently define, in prose, the subject as having. For example: "Subject is an adjective noun ..." or "Subject, an adjective noun, ..."; here, subject to being used both commonly and consistently, each of adjective and noun may be deemed "defining".  Uniplex (talk) 07:11, 12 September 2011 (UTC)


 * I think that's a pretty good start. Maybe this could be included as an example of what would constitute a defining characteristic. The main problem in the past with definitions has been trying to find a definition that is exhaustive, but maybe the key is to define it non-exhaustively using several examples. I think what you have set out is a great example that could be used. I've already added two examples of "rules of thumb"; maybe this should be included as well? Good Ol’factory (talk) 20:40, 27 September 2011 (UTC)
 * Yes, that was my thought too. Perhaps it can slot in as an "including" bullet point before the existing two "excluding" ones? Maybe the final words need a tweak: "may be deemed" -> "are likely to be"? Maybe the grammatical terms would be more accessible if replaced with real examples? Please go ahead making any such changes as you see fit. Uniplex (talk) 05:34, 28 September 2011 (UTC)
 * I agree that this one should be mentioned before the other two, since this one is positive and the other two are negative or exclusionary. For each bullet point we include, we could also include examples gleaned from CFD consensus on what is vs. what is not defining. I'll give this some more thought and try to incorporate your suggestions. Good Ol’factory (talk) 06:30, 28 September 2011 (UTC)
 * In fact, I've added some of these words and an example at WP:CAT; feel free to BRD if it's not right. Uniplex (talk) 09:58, 7 October 2011 (UTC)

Eponymous categories
A very recent change to section Overcategorization added without discussion:

I do not believe this is correct. Such a change would contribute to massive pollution of CfD, dealing with millions of these existing categories. Bad idea. Bad process.

This is a fairly common and obvious scheme to consolidate categories, and keep things manageable. Once there are 2 or 3 articles on a topic, it's much better to put them into 1 eponymous category, and maintain it as a subcategory of the relevant topics. Moreover, it violates the assumptions of the section, that there be no potential for growth. Current TV series come to mind. One-season wonders might have no eponymous category, but once there are 2 seasons (main article and 2 season "list of episodes" articles), there SHOULD be a category. --William Allen Simpson (talk) 08:51, 8 October 2011 (UTC)

Proposed replacement: --William Allen Simpson (talk) 09:03, 8 October 2011 (UTC)
 * Two articles? Once there are two articles about a topic, it should be maintained in a category? That's pretty small and would result in a flood of thousands of categories that only contain "[TV SHOW]" and "[LIST OF TV SHOW EPISODES]"--how would that be helpful in navigation? —Justin (koavf)❤T☮C☺M☯ 12:42, 8 October 2011 (UTC)


 * Unless you have consensus to remove the bit about being part of a scheme, none of this is going to work. Because people will always come around and say that "Categories named after American television series" is a scheme, and that such epo categories should be allowed.  --Kbdank71 19:37, 12 October 2011 (UTC)


 * Support the present version. I have followed cfd closely for some years and the only editor who has confused 'Categories named after' with 'part of an established scheme' is User:William Allen Simpson. There has been for many years a presumption against eponymous categories, and particularly against sprawling eponymous categories with no clear inclusion criteria. As I have observed many times, Category:Categories named after American television series is a 'category of categories' and as such is quite different from (say) Category:Albums by artist, an established sub-categorisation scheme. Occuli (talk) 00:04, 25 October 2011 (UTC)
 * Keep as is. We do not want eponymous categories. They are useful for topics like Abraham Lincoln, but in general it is best to avoid putting categories in categories with the same names. Pop musicians tend to be in so many categories as it is; having every one in a category with the name of the article is a really bad idea. John Pack Lambert (talk) 22:27, 1 February 2013 (UTC)