User:Nederlandse Leeuw/Examining the phrase a large overall accepted sub-categorization scheme

The phrase a large overall accepted sub-categorization scheme is part of the WP:SMALLCAT guideline, and has been since it was first developed in December 2006. But it is unclear what it means, why and how this specific wording was developed, and what good examples of it are. More importantly, there is widespread disagreement over whether it really works to prevent the deletion or merger of certain important/helpful categories, or whether it can be employed as a pretext to oppose the deletion or merger of any category whatsoever, no matter how unimportant/unhelpful that category is. As a result, there is no consensus on how to interpret it, or on whether it should stay in the WP:SMALLCAT guideline as it is, be amended to be clear and work as intended, or be removed from the WP:SMALLCAT guideline for serving no apparent useful purpose.

The present text seeks to examine all these questions.

Development
Reconstruction made by Nederlandse Leeuw (me) on 3 August 2023:

The basic text of the WP:SMALLCAT guideline was haphazardly put together in December 2006. The most relevant diffs are between 18:22, 21 December 2006 and 22:15, 22 December 2006. At the time it was still a proposed guideline. It was put together through unilateral actions of several editors (including Dugwiki, jc37, Tim!, Circeus and others) making it up as they went along, sometimes reverting each other and almost edit-warring in the process. There was virtually no talk page discussion (just Dugwiki making two comments explaining their own edits).

To be fair, that is how many early policies and guidelines on English Wikipedia were made; whatever stuck became customary law. It's only later that amendments were formally proposed and voted on, but per WP:PGCHANGE a lot can still be WP:BOLDly amended. Most disagreements in December 2006 about SMALLCAT were apparently about exactly this: examples of what a large overall accepted sub-categorization scheme looks like, how many items there should be in a category to be exempt, and whether this number should remain vague or unspecified.

But honestly, I have no idea what they were trying to say, and I believe they also didn't really understand each other. We just ended up with the present text of a large overall accepted sub-categorization scheme when people stopped unilaterally changing it and reverting each other, and up until this day there is disagreement about how to interpret it. It is true that we cannot ignore it just because there is disagreement about how to interpret it. It is part of a current guideline, to be adhered to for as long as it exists, even though its practical application will be problematic for as long as its meaning is unclear. I certainly support amending it to clarify what it means. Cheers, Nederlandse Leeuw (talk) 19:23, 3 August 2023 (UTC)

Reconstruction made by jc37 (one of the initial developers, who knows more about it than I, Nederlandse Leeuw, do) on 24 July 2023:


 * In November 2006, User:Radiant! created the Overcategorization page, based upon many CfD discussion results -


 * In December 2006, Radiant! added the section "No potential for growth" to Wikipedia:Overcategorization -

As can be seen, originally, "No potential for growth" was the title. The title did not use the word "small" until I added it several edits later.


 * In August 2007, Radiant! changed the words "two or three" to "a handful" -


 * Then "a handful" changed to "a few" (by me), in the next edit -

There were many reasons to change from a set number. For one thing, it had become divisive. Things were getting nominated due to numbers alone, without actually looking to see if it was part of an overall system. (And had also begun to be set for Speedy Deletion.) As can be seen, "Songs by artist" had really become contentious over this. For example, this was the edit right after Radiant! initially added the section. Which was then re-written in the next edit here.

Another reason is semi-related - gaming the system. If you set a finite amount, then: "anything over that amount should be an automatic Keep, right?" Or so went the argument. It also was leading to category "stuffing". As it's not that difficult to find anything anywhere that could maybe fit under a category, just to prevent its deletion.

So an indeterminate amount, handled on a case-by-case basis at WP:CFD, was seen to be better.

That said, there have always been those who want a set amount, because they have the seemingly idealistic hope that it would reduce discussions at CfD, or that it might dissuade category creators from making small categories. Neither of which has been borne out over the years. - jc37 14:21, 24 July 2023 (UTC)

Disagreements over interpretation
''These case studies are just examples. They are not meant to rehash old discussions, or cast the comments or actions of certain users in a bad light. They are only intended to illustrate the point that there are frequent disagreements over how to interpret the phrase a large overall accepted sub-categorization scheme, and why that makes categorisation discussions difficult, whereas a guideline should guide users to make such discussions easier.''

Disagreements over "sub-categorization scheme"
Reconstruction: There was an article about a band named "Counterfit", which apparently only produced one album ever, Managing the Details of an Undertaking, and then disbanded. The articles on the band and the album were later soft-deleted on 28 October 2021: (Unsourced, fails WP:MUSIC. They released one album 20 years ago on a non-notable label, then disbanded two years later.)
 * Categories for discussion/Log/2011 April 17

As of 17 April 2011, there were apparently 3 categories, containing 5 items in total:
 * Category:Counterfit, only containing main article Counterfit about the band itself, and 2 subcategories (which were apparently created later):
 * Category:Counterfit albums (later deleted on 5 November 2021 as empty)
 * Category:Counterfit album covers (later deleted on 8 May 2021 as empty)


 * Nominator sought to delete Category:Counterfit: Per WP:OC, "articles directly related to the subject typically are already links in the eponymous article in question". That is the case here. There are only 4 albums and their covers. The albums themselves would already go in a well-established albums by artist category, thus no need for a redundant eponymous category.
 * Editor A: Keep and recategorize the contents of Category:Counterfit albums and Category:Counterfit album covers back into Category:Counterfit. This is completely useless subcategorization. The entire subject area of this band encompasses only 5 articles; there is no need for further subdivision. The albums cat isn't even used correctly: 2 of the releases in it are EPs, not albums. Are we really going to have subcats (albums & EPs) just to hold 2 articles each (the band broke up years ago & only ever released 2 albums & 2 EPs)? That makes no sense at all. (...)
 * Nominator and Editor A exchange arguments.
 * Editor B: Delete – the articles should certainly not be reorganised back to an undefined top category. Any albums should be in an albums category. (...)
 * Editor A: The top category is hardly "undefined": It says right at the top what the topic area is (articles relating to the band Counterfit). The albums are in albums categories: albums by year and by genre. However, it hardly makes sense to have a "by artist" album category in this case when the artist in question only released 2 albums and is no longer active. (...)
 * Editor B: We categorise articles by 'defining characteristics', and it is impossible to describe Managing the Details of an Undertaking without saying it is an album by Counterfit: ergo 'Counterfit album' is defining. (It doesn't bother me at all to consider EPs as albums.) We have had endless debates about 'eponymous musician categories' and no-one has previously raised any objection to albums categories. WP:SMALLCAT explicitly states "... unless such categories are part of a large overall accepted sub-categorization scheme, such as subdividing songs in Category:Songs by artist".
 * Editor A: (...) the whole point is that this is a tiny topic area that isn't ever going to get any larger. I never said it didn't make sense to categorize albums by foo as an album by foo, I said that it doesn't make sense to create an albums by foo category if will only ever contain 1 or 2 articles. It may not bother you to consider EPs albums, but it bothers the Albums project, hence why we have separate Category:Albums by artist and Category:EPs by artist. So if we are going to be pointlessly strict about "overall accepted sub-categorization schemes", we will have Category:Counterfit albums and Category:Counterfit EPs each of which will only ever contain 2 articles. What usefulness does that have to a reader? How does it help them to find or navigate between articles of the topic area "Counterfit"?
 * Nominator: Can you point me to where it bothers the Albums project? I've found discussions where it bothers some people more than others, but no consensus or a mention in WP:ALBUMS. While you can say there is consensus for the categorization scheme of Category:EPs by artist, there is nothing that says an EP by Foo should be categorized under Category:Foo EPs, yet it does say within the project that every album should be categorized under Category:Foo albums at WikiProject Albums/Article body.
 * (several exchanges)
 * Editor A: (...) Just as your [Editor B's?] assertion that because Category:Albums by artist has a "large overall accepted sub-categorization scheme", every album article must be placed in a Category:Foo albums to fit that scheme (even if it would be the only article ever to inhabit that category). By the same token, Category:EPs by artist has a "large overall accepted sub-categorization scheme" and EP articles should follow the same convention, yes? This, of course, meaning even more categories that will only ever contain 1 or 2 articles.
 * I don't see it as tangential. The root problem here is the instruction at Category:Albums by artist that "all single-artist album articles should have subcategories here, even if it's the only album the artist has recorded." You've said that we must stick to that instruction, as it represents a "large overall accepted sub-categorization scheme" (per WP:SMALLCAT). I think it's stupid to require the creation of categories that will only ever contain a few articles. That's the central issue affecting this category, as well as Category:The Hippos and other eponymous categories I created to group very small topic areas, rather than have them spread out across multiple cats/subcats.


 * Editor C: Delete. I would say there's not enough in this case to justify an eponymous category. I support having an albums category first. An eponymous category should only follow if there are several subcategories and articles that need grouping.
 * Closer: The result of the discussion was: Delete.

Analysis
 * A point of disagreement was whether to consider EPs as albums or not, because this would determine what the sub-categorization scheme even was, and whether it applied. The only example stated by WP:SMALLCAT was subdividing songs in Category:Songs by artist; it didn't say what to do with albums (or album covers) or EPs by artists, let alone bands. Categorising EPs as albums allegedly clashed with certain standards of WP:ALBUMS, although it wasn't clear whether there was consensus on those standards either.
 * Otherwise, the disagreement was not so much about what the phrase a large overall accepted sub-categorization scheme meant as about whether it should be "pointlessly" strictly applied. Given that the band in question had already disbanded about 10 years earlier, there was no (realistic) potential for growth (a phrase in the full title of WP:SMALLCAT, Small with no potential for growth, which Editor A indirectly invoked) for albums, EPs or any other production by that band. Editor A argued it would be better to just put all 5 items in 1 category to help the reader navigate.
 * Nominator, Editor B and Editor C jointly agreed on deletion per WP:OC, taking a large overall accepted sub-categorization scheme into account, over the navigational argument, the lack of potential for growth, and the (unclear?) WP:ALBUMS standards, as advanced by Editor A.
 * Ten years later, all remaining articles and categories were deleted anyway, the articles as non-notable.

Disagreements over "large overall"

 * Categories for discussion/Log/2014 August 21
 * Nominator: [Due to recent deletion of non-notable mayor biographies], this is now a WP:SMALLCAT, more or less permanently stalled out at two articles with no realistic prospect for expansion. Delete Mayors of Kirkland Lake and upmerge entries to and . No prejudice against future recreation if we ever actually have something like five or six articles to file in it.
 * Editor A: Keep  as these categories "are part of a large overall accepted sub-categorization scheme" (WP:SMALLCAT).
 * Nominator: No, they aren't. A "large overall accepted sub-categorization scheme" would be if every municipality in Ontario had its own separate subcategory for its mayors — "some larger places do while smaller places don't" does not satisfy that criterion.
 * Editor A: Category:Mayors of places in Ontario currently has subcats for 42 other places - it looks like a "large overall accepted sub-categorization scheme" to me. However, the 2 articles currently in the category don't indicate that the people are notable because of being mayors/reeves (they are notable as MPs) so per WP:COP the mayors cat tag could be removed from those articles and hence the category deleted as empty. I've struck my keep !vote.

Analysis
 * Nominator and Editor A disagreed on whether every municipality in Ontario [having] its own separate subcategory for its mayors was required, or whether the then-existence of subcats for 42 other places (out of a total of 444 municipalities in Ontario) was sufficient, in order to demonstrate the existence of a large overall accepted sub-categorization scheme.
 * Apparently, Nominator focussed on the word "overall" (which was not satisfied, because only 43 out of an "overall" potential of 444 municipalities had a mayor subcategory), while Editor A focussed on the word "large" (because 43 was arguably a pretty "large" number of subcategories).
 * No agreement was reached. Editor A only struck their keep !vote because the two mayors of Kirkland Lake in question were only notable for being Canadian Members of Parliament (MPs). Therefore, the question whether all potential subcategories should exist, or if some under-populated subcategories should/could exist even if not all potential subcategories existed, remained unresolved.

Finding of Fact: "reasonable editors can reach differing conclusions"

 * Arbitration/Requests/Case/SmallCat dispute/Proposed decision
 * Arbitration Committee (11/11): (...) SmallCat has been part of a guideline since 2006. Originally named "No potential for growth", it was changed after editors were using it to delete categories based purely on numbers. There has been an ongoing desire, never reaching consensus, to apply a strict numerical threshold for SmallCat (jc37 evidence). Use of such numerical thresholds, even if phrased as a "rule of thumb" or similar such phrase, in CFDs is therefore not supported by the guideline. However, reasonable editors can reach differing conclusions about other elements of the guideline, including the potential for growth and whether categories.