Wikipedia:Proposed category reorganizations/Re-populating actor categories

Initial proposal
I would like people working in Category:Film actors to consider adding ALL film actors back into this category. The decision to subcategorize everyone was made before there were category TOCs. Nationality is an artificial distinction to many actors who are multi-national. I could see the utility of being able to browse through a category of all actors. You would still have the choice of browsing by nationality. -- Samuel Wantman 21:15, 21 January 2006 (UTC)

First Stage: Concerns, Analysis, Brainstorming, etc...

 * I think that's crazy. I don't see how you're supposed to browse a category with thousands of articles in it.  The table of contents helps if you know the name...which you don't need a category for in the first place.  What you're proposing defeats the purpose of subcategories and goes against precedent in other categories.  For instance, it's possible that I'd want to browse thru all American people, not just American botanists, or Californians.  But we don't keep Americans in both categories, because being in the subcategory automatically puts you in the parent category.  You can always browse the subcategories, too.  Categories as big as film actors was last week are simply unbrowsable, but when subcategorized, they can still be browsed, while people can also check out Category:Mexican film actors and so on, getting more information without any more effort.  NickelShoe 21:57, 21 January 2006 (UTC)
 * As per Categorization. It does make sense, for instance, for award-winners to be both in the subcat and the parent cat, because award-winners are a small number of actors, and because a browser might not have any idea an actor won the particular award.  But all people have nationalities, and where the nationality is in question, then we can leave it here in Category:Film actors or in multiple subcats, such as Category:American film actors and Category:Canadian film actors.  Duplication here doesn't seem any more useful than duplication at other occupations, and such a large occupation makes duplication more cumbersome here. NickelShoe 22:06, 21 January 2006 (UTC)
 * It is not as big as you think. Most of the film actor by nationality subcategories do not have many people in them.  I suspect the combined listings would be about three times bigger than the American listings. It doesn't defeat the purpose of the subcategories; they would still exist.  The past precedent was because of an inability to easily browse through large categories.  With a table of contents, this would not be difficult to browse through.  I am not proposing getting rid of the subcategories; I am just proposing to add the complete list of film actors here.  I do think there is a point when large categories would be unwieldy, probably when there are more than about 600 names for each letter of the alphabet.  I don't think this category would be that big.  I see advantages, and I don't see any disadvantages.  It has never been against policy to have articles listed in categories and their grandchild subcategories.  The guidelines (whose rewrite I just finished facilitating) say that these decisions should be made to help people browse through categories.  Since I am just proposing to add to what we already have, I don't see how it makes things worse, and it might make it better for some people who would like to see the complete list.  For example, let's say someone is trying to remember the name of the actor and all they remember is that it started with a B (or was it a D?)  The way things are now, it would be hard to find if you don't know the nationality of the actor.  It would be easy if they were all here. -- Samuel Wantman 22:20, 21 January 2006 (UTC)
 * It seems very awkward to me to have only the people with unknown or multiple nationalities listed on this page. It should be all or none. -- Samuel Wantman 22:21, 21 January 2006 (UTC)
 * I think the line of unbrowsability is much lower than that, and in any case, this category was that big.  I've moved hundreds of people out of it in the past few days.  I know several of the subcats are pretty small--like Category:Welsh film actors.  But there are a lot of nationalities, and if you don't pull out some nationalities into subcats, this category has four hundred actors per letter (because there were plenty of people not in here before I started subcatting).  Well, I wasn't counting, but you're free to look at my contributions to get an idea.


 * I can only speak for myself, but if I can only remember what letter something possibly started with, the category stops being useful to me at around fifty names. I'm sure others can use it up to two hundred.


 * I suppose leaving unknown nationalities here is a little weird...I was thinking as a temporary solution until they were either in multiple subcats or somebody actually included their nationality in the article. NickelShoe 22:30, 21 January 2006 (UTC)
 * Since you don't like browsing through more than 50 names, you would not be forced to; you could browse the subcats. But many people might want to. It certainly helps deal with the people of unknown nationality, and in general might make for less work.  The natural inclination is for editors to add people to this category.  Why fight it?  I can imagine that this would be a shock to you if you've just recategorized hundreds of articles.  All I can say is that I'll help you put them back.  I would like to hear from others about this.  And, BTW, I'd like to make the same change to film directors.  Quick, can you tell me what the nationality is for Roman Polanski? -- Samuel Wantman 22:40, 21 January 2006 (UTC)
 * But...if I know his name, why don't I just type that in? (And it should go without saying that I'm going to hold off any recategorization up or down until there's some kind of consensus here.) NickelShoe 22:45, 21 January 2006 (UTC)
 * I guess I should be more explicit about my point about Roman Polanski. The point I'm making is that "by nationality" often constrains categories artificially.  There is often nothing particularly notable about an actor or director being from one country or another.  They are, like in Polanski's case, often from one country, raised in another, got famous in a third and then moved somewhere else.  My point is that while some people might find the distinctions of nationality to be notable, they often are not.  Because of that, I'd prefer seeing categories populated at a higher level when possible.  This would not apply to Category:People by nationality because that is the entire point of the category.  Everywhere else, when nationality intersects with profession, I'd like to populate the smaller categories by nationality and larger categories by profession unless nationality is integral to the profession (like politicians). -- Samuel Wantman 23:05, 21 January 2006 (UTC)
 * I have also been doing some work on subcategorization. I guess my greatest concern is that a consensus be established and that the consensus then be posted at the top of the discussion page of the relevant categories.  Otherwise, I believe, what we'd tend to get, (and what we probably had before the subcategorization effort) is some individuals only in the grandparent category and some individuals only in the grandchild (by nationality) category, and some in both.  That, to me, is worse than the alternatives of either a very big grandparent category, or having to go to the grandchild categories to browse.  Whichever way the consensus goes, I would help in the effort to make things consistent.  I don't think the multi-nationals or unknown nationalities are a problem.  Multi-nationals can be listed in multiple nationality categories.  Unknown nationalities can be researched to determine their nationality.  In general, I tend to prefer very small categories :P  but I guess that with the TOC, 200 names (1 page) per letter, or less, is reasonable. -- LiniShu 23:28, 21 January 2006 (UTC)

I decided to do some analysis of this category. There are about 3300 entries in Category:Film actors. Almost all don't seem to be listed in the nationality subcategories. For the subcategories there are:
 * 1113 Americans
 * 3 Argentines
 * 203 Australians
 * 5 Austrians
 * 83 British, which overlap with 107 English, 17 Scottish and 8 Welsh
 * 59 Canadians
 * 2 Danes
 * 23 French
 * 5 Germans
 * 13 from Hong Kong
 * 7 Irish
 * 4 Israelis
 * 4 Italians
 * 71 Mexicans
 * 2 Poles
 * 5 Russians
 * 13 from Singapore
 * 3 Spaniards
 * 6 Swedes
At most there would be about 5000 total names in Category:Film actors, which would not make the category very much more crowded than it is now. I also notice that there is already Category:Actors by nationality, which makes me wonder why we would need Film actors to also be broken up by nationality. Film actors is already a subcategorization of Actors. -- Samuel Wantman 00:12, 22 January 2006 (UTC)
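
For anyone who wants to check the "about 5000" figure, the counts quoted in the analysis above can be added up as follows. This is only a rough sketch: the variable names are made up for illustration, the 3300 figure is the approximate one quoted, and it assumes nobody appears in both the parent category and a subcategory, so the result is an upper bound rather than an exact count.

```python
# Approximate number of entries listed directly in Category:Film actors,
# as quoted in the analysis (assumed not to overlap with the subcategories).
parent_category = 3300

# Subcategory counts as quoted in the analysis. British overlaps with
# English, Scottish and Welsh, so adding all four overcounts somewhat.
subcategories = {
    "American": 1113, "Argentine": 3, "Australian": 203, "Austrian": 5,
    "British": 83, "English": 107, "Scottish": 17, "Welsh": 8,
    "Canadian": 59, "Danish": 2, "French": 23, "German": 5,
    "Hong Kong": 13, "Irish": 7, "Israeli": 4, "Italian": 4,
    "Mexican": 71, "Polish": 2, "Russian": 5, "Singapore": 13,
    "Spanish": 3, "Swedish": 6,
}

subcategory_total = sum(subcategories.values())
upper_bound = parent_category + subcategory_total

print("In subcategories:", subcategory_total)
print("Upper bound for a fully populated category:", upper_bound)
```

The upper bound comes out a little over 5000, which is consistent with the estimate given in the analysis.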
 * I support the inclusion of e.g. American film actors in the Film actors category. It's true that many people might find the category too big to browse, and so would use the subcategories instead--but if people are interested for whatever reason in film actors in general, then as big as that is, that's what would be best for them.


 * Look, my copy of the Video Hound book has thousands of movies listed in it, and sometimes I'll browse through it to see what I'll find. Other times I'll be interested in finding a specific kind of film and I'll browse through the book's categories.  Why does what works in a printed reference work not work for an online encyclopedia?


 * The point I want to stress is that the Film actors category ought to have some use, otherwise it should be a category empty except for the subcategories. Who would be interested in the set of "Film actors who are not listed in a subcategory"?  To whom is that a useful group?


 * When a consensus is reached, putting the guidelines on the top of each category is of course a great idea.


 * The question of whether you need actors by nationality and film actors by nationality is a good one. In terms of what people might be looking for, it's not clear why having separate subcategories for say French actors and French film actors would be helpful.  I've thought the same thing about musicians by state and musical groups by state--I spent an evening recently separating out the individual musicians from the musical groups into the two respective subcategories for Massachusetts, and now I'm wondering if someone interested in seeing the Massachusetts musical scene at a glance wouldn't be better served by a single category that combines both the individuals and the bands.  Sometimes there's an obvious way to break articles up that really is not very useful.

Nareek 04:42, 22 January 2006 (UTC)

The more general question is "When is a category too big?" The push to subcategorize started before the implementation of category table of contents. I don't know if this question has been really addressed since then. My take on this is that categories get subcategorized into tiny pieces often because it is not possible to do a database selection. These small categories like Category:Polish film directors have minimal browsing value compared to being able to browse through Category:Film directors. These multi-attribute categories are often too small. As a guideline, it seems reasonable that categories be populated at the same level as articles about the subject. We don't have an article about Polish film directors but we do have one about Film directors. It also seems that above about 6000 articles a category would become very unwieldy. -- Samuel Wantman 08:40, 22 January 2006 (UTC)


 * Thanks, Samuel, for taking the time to do some numbers analysis for this discussion :) And also, thanks for raising the question of "When is a category too big?"  I think it would be helpful if we also had some consensus on that.  Not a hard and fast rule, and I could imagine there being exceptions in some specialized areas of knowledge, but, I would appreciate, for example, when I look at Category:American actors and think, "wow, this category is really big", being able to go back to the consensus that has been reached, and to answer myself "ok, it's not too big, yet."  Your suggested guideline of categories being populated at the same level as articles about the subject is a good one, with the possible exception of when that category is just absolutely too big per the consensus that I hope we'll arrive at.
 * Another point I'd like opinions on - this discussion began with the question of whether articles should be in both the grandparent category Category:Film actors and the grandchild categories (Category:American film actors, for example); I would like to know whether the articles in Category:American film actors should also be listed in their other grandparent, Category:American actors. That is a category that might be over 6000 if all entries were listed there. (Sorry, I haven't counted yet how many are there right now). Actually, as I think about it, there may be many articles in Category:American actors or Category:Actors that are not yet listed in any of the subcategories of Category:Actors by medium.  So, the count of about 5000 names for Category:Film actors might be higher if everyone was listed there who should be listed there. -- LiniShu 12:44, 22 January 2006 (UTC)


 * Category:American actors has 4355 entries at this time. This would not count those that are only listed in a grandchild American actors by medium subcategory (whether recategorized by myself and others, or listed directly in the grandchild subcat in the first place.) Category:American actors currently has a too large tag on it.  If you read the discussion page, it looks like there was previously a too large tag, which Samuel removed, noting so, with his reasons, on the discussion page, in August 2005.  The tag was added again, without discussion, on 14 January 2006.  I began, that same day, slowly, to depopulate the category into the American actors by medium subcategories. --LiniShu 13:21, 22 January 2006 (UTC)


 * What exactly does "unbrowsable" mean? You can browse through an unabridged dictionary or a multi-volume encyclopedia, and plenty of people do. If you mean it's not practical to look through the entire category to find someone specific whose name you've forgotten, then that might be a pretty low number (depending on your patience).  But there are plenty of other reasons you might be browsing a category like "actors."  Maybe you want to see what kind of names actors have.  Maybe you want to see how many actors are in Wikipedia.  Whatever--if there aren't any reasons people might be interested in actors in general, then why have anyone in the category?  The fact that a particular size might be cumbersome for a particular use is an argument for adding subcategories, but it's not an argument for eliminating the category.  It's not like it's taking up physical space.

Nareek 16:34, 22 January 2006 (UTC)

My $.02: First, since previous upgrades to the Wikimedia software, large categories no longer pose the technical problems that they once did. So the size of a category alone should not be the rationale for breaking into subcategories. Second, "profession by nationality" is an extremely arbitrary breakdown -- many professionals simply do not have any intrinsic association with a nationality. So while I have no objection to creating these subcategories for those professionals who do have a strong association by nationality, these subcats should not mean that the primary category is de-populated. older ≠ wiser 16:57, 22 January 2006 (UTC)

So it sounds like there is general consensus for the notion that categories should be fully populated at the "topic article level". I still don't know the criteria for where the upper limit is. Is it Category:Film actors, Category:Actors, Category:Entertainers, Category:Celebrities or Category:People? I strongly agree with populating Film actors, am not certain about Actors, and tend to think anything higher is too much. One criterion is to not put an article into every possible category, because that clutters the article with categories. I'd propose that we only populate to the level of notoriety (there's probably a better way to say this). If people are notable for being Directors, that is the highest level of populating film people categories. There are still grey areas with this: Poets or Writers? Film actors or Actors? Any ideas? What criteria do we use to make these decisions? -- Samuel Wantman 21:06, 22 January 2006 (UTC)


 * I favor categories that are not subdivided into a hundred subcategories. Unless one already knows that a professional is "Scottish" rather than simply "British", he or she might not be found by browsing at all. Categories are already sorted by alphabet. It'd be much simpler, overall, if people were identified as "Scots" and "Actors", rather than as "Scottish actors". -Will Beback 09:08, 23 January 2006 (UTC)

Points that I think we're on the way to establishing, so far:
 * There really is no such thing as a category that is too big, by any kind of absolute standard.
 * It is reasonable to keep categories farther up the hierarchy populated, to the "topic article level" and/or "to the level of notoriety" (which could maybe be defined by the opening sentence of the articles?) Our articles say things like "Clark Gable was an American film actor..." or "Sir Alec Guinness was an Oscar-winning English actor..." - note the use of the word actor rather than entertainer, celebrity, or person :)
 * The contributors to this discussion have different preferences for what level of categorization/subcategorization they find helpful for browsing, and presumably our encyclopedia users will also. Even to the same individual, different methods of browsing may be helpful on different occasions, depending on the object of the browse. We can accommodate all by using the subcategories that currently exist or others that might be requested, and also by keeping categories farther up the hierarchy populated too, according to the criteria in the bullet point immediately above, or as requested.
 * The possibility of "cluttering" articles with too many categories and subcategories should be considered, but is a secondary concern to having useful categories, possibly at multiple levels.
 * --LiniShu 12:09, 23 January 2006 (UTC)

I've asked Brion Vibber (a developer) about whether there are any technical concerns about having large categories. -- Samuel Wantman 07:13, 24 January 2006 (UTC)


 * Thank you --LiniShu 11:28, 24 January 2006 (UTC)

This is the response from Brion Vibber. "THERE IS NO ISSUE WITH THE SERVERS DUE TO LARGE CATEGORIES". So the only issue is the awkwardness of browsing large categories. -- Samuel Wantman 02:04, 25 January 2006 (UTC)

There is certainly no need for any pause to the improvements to the accuracy of categorisation. Things are much better than they were a year ago, but there is still lots to do. No search system is as good as a category system which does some of the work for people. I have found and categorised hundreds and hundreds of articles which were only in the subject category or only in the national category, and the combined categories filter articles through so they are in both. It shouldn't be forgotten that for many occupations and nationalities there are quite small numbers of articles, and without the precise categories they would belong in two big categories, so the six "Fooian xers" would be lost among say 1,252 Fooians (upmerged from 200 different categories) and 2,213 Xers (upmerged from 150 different categories). Both of these categories would be useless seas of unfamiliar names for most people. It is not a question of whether random browsing is possible, but rather a matter of helping people to target their browsing. People who know exactly what they are looking for can use the search box. By contrast the category system is a navigational tool which helps people to identify groups of related topics. This is especially relevant to people from smaller countries, as the few articles on people of each occupation of their nationality would get lost in a sea of foreigners. In any case starting the conversation with a category where nationality is not quite as relevant as in some other occupations (though it is still very relevant, especially for actors from outside the English-speaking world) has put this discussion on a somewhat false footing. CalJW 05:10, 25 January 2006 (UTC)
 * I don't think anyone is advocating that we remove any of the categories that have been added over the last year. There would still be all the "Fooian fooers" categories.  This discussion is about adding all the fooers from all countries back into a single category.  Some people might look at these large categories and find them useless, but they could still move to the smaller subcategories.  I often find the small categories useless and wish they were combined.  This option to look at the contents in small pieces or in one large category would add flexibility to the categorization scheme and would serve the needs of more users. -- Samuel Wantman 06:18, 25 January 2006 (UTC)


 * I am also very much opposed to dividing categories of people into subcategories solely based on nationality. I doubt that everyone browsing Film Directors really cares to look through the lens of the person's nationality.  Why limit the ability to look at the category in different ways?  It doesn't affect the subcategories at all; those are still listed at the top of the page and readily accessible.  Cacophony 09:10, 25 January 2006 (UTC)