User talk:Tim!/Ceci n'est pas une pipe

Reply to actor-by-series cfd comments
Hi Tim. Since you and I often disagree on the issue of actor-by-TV-series, I wanted to reply to your essay for reference to give my side of the issue.

You mentioned that "it is commonly argued at CFD that actor categories are non-defining." In fact, whether or not starring in the show is "defining" has nothing to do with my reasoning for why I am opposed in general to actor-by-TV-series categories.

The reasons I recommend deleting almost all of the actor-by-series categories are:

1) They are entirely redundant with the cast list in the main article. Let's look at it from the perspective of a Wiki reader. If I'm a reader who wants to peruse, for example, articles about actors who starred in The 4400, the most likely scenario is that I will do a search for "The 4400" and read the main article. There is a cast list within that article that already includes links to the entire cast of actors on that show. Moreover, this method of finding the actors is at least as easy for the average reader, if not easier, than using the category system to find actors for the show. Additionally, the main article gives plenty of auxiliary information, like character names and plot synopses, that I can use if, for example, I want to find the actor who played a particular part but I don't know the actor's name.

Thus Category:The 4400 cast is made almost entirely redundant by the cast list within The 4400 itself. It is unlikely that a reader would find any added benefit from having both a category and a cast list within the article, so there is no pressing need to have both.

2) Since there is no pressing need for these categories, the next question becomes whether or not there is a downside to keeping them. In this case, there are downsides to keeping these categories around and allowing them to proliferate. For one thing, by creating categories that are redundant with their main article you are effectively doubling the number of wiki entries that need to be maintained by the editor community.  So rather than just having one article, The 4400 for example, that editors need to keep track of, you now also have a second item, Category:The 4400 cast, that editors must maintain independently of the main article.  You've almost doubled the amount of work per article when it comes to keeping cast indexes up to date.

Also, there is the issue of category dilution within an individual actor's article. By that I mean that the more categories an article appears in, the harder it becomes for a reader to find a category they are looking for related to that article. If an article has 100 categories, it becomes more difficult for someone looking for specific information related to that article to use those categories to find what they want. The problem is that a fair number of actors notably appear in the casts of multiple television series. Voice actors like Tara Strong (one of my favorites) in particular are extremely prolific and can star in literally hundreds of shows and films during their career.

So to me it's a matter of both whether or not the category is necessary, and whether or not including the category has a downside to the community. If there were no downside, I wouldn't care, and if the category gave readers a significant benefit, I would support it. But actor-by-TV-series categories are not useful to the reader compared to just using the main article, and they also pose potential problems for editors and readers alike.

Therefore the quality of "defining characteristics" has nothing to do with my opposition. Rather, it is a technical matter of maintenance, redundancy of information and ease of use for readers and editors.

Just my two cents. I think restricting these categories to regular cast is definitely a step in the right direction, as it helps minimize the potential problems I mentioned. But overall I think we'd be better off simply excluding these categories altogether with very rare exception. (The main exception would be categorizing actors from a set of related series, like Star Trek, which isn't as easily listed within a single main article.)

Thanks, and hope that clears up the reasoning behind my disagreement! Dugwiki 18:38, 15 January 2007 (UTC)

"too much information" and usability
Hi Tim! In your essay you say that complaining about too many categories is equivalent to complaining about too much information. I beg to differ. Because categories are functionally different from the information in the article -- they are an aid to navigating the encyclopedia -- they are not solely (or even primarily) there for the purposes of conveying information. A tag, for instance, is used just to convey information (see Tag (metadata), Tag cloud and Keyword (Internet search), and compare with some of the entries under Classification). Categories in Wikipedia are there for navigation, like a classification system.

You probably understand that distinction, but I wanted to reiterate it because that functional difference means that inclusion of lots of categories is very different than, say, writing a longer article. Because the categories are primarily functional (not informational), basic user interface and usability type stuff applies. And it's pretty clear that there is some reasonable upper limit of utility in terms of numbers of categories; probably something less than ten. When people are looking at information arranged in a series, they see the first few, and the last few, and anything that "jumps out" for font/style/etc. Otherwise they miss a lot of stuff. So they see the top line, the bottom line, some of the stuff on the left & right of the categories, and a large indigestible mass in the middle.

If you look at George W. Bush, for instance; I'm using the default monobook style, and categories are at the bottom as they are for most people. Just trying to read through this list of categories is sort of interesting, but it's very hard to use it as a navigational device. I, like basically everyone else studied in usability studies, find myself reading the top line, the bottom line, and my eyes run away from the middle. I have to force myself to read through the whole list of categories if I want to fully digest them and make sure I haven't missed any. (The lack of a discernible order--alphabetic, for instance--doesn't help but wouldn't go far toward fixing the problem. And only alphabetic would help at all -- any order other than alphabetic is not going to be readily understood by people not familiar with the system.)

If I were to look up GWB, and then say, hmm, I want to see him in context with other similar people -- the ordinary reader is going to look for a) time information (when he was born for instance); b) president; c) political views. We could add a thousand categories for him, all of which would be true, and all of which would be interesting to some people, but they would basically be lost. Because someone, looking through the list to find "Air National Guard Officers" would have to read the entire list, top left to bottom right, to find it somewhere in the middle. And they might miss it, because usability shows that even if your eyes are tracking the lines, if you are not concentrating on every entry in the sequence you can miss it.

So lumping categories together as "tags" at the bottom is not helpful as a navigation aid.

Then the question is whether they are helpful as categories themselves. First of all, people won't be getting to those category pages very often if the category sections on articles are all user-unfriendly and hard to navigate. But let's assume that viewers get to the category pages through "search", inter-text links directly to categories, the occasional determined or observant person who picks out the right category in a long list, and people who have very few categories (an ever-decreasing list if we adopt the "more categories = more information" philosophy).

Then the category itself is basically an alphabetical list of things (unless people forget to sort alphabetically for the category structure). The MediaWiki software is fairly flawed w/r/t the categories, so a category is really just an automatically-generated list which is in all ways inferior to a manually created list (in the "Article" format):
 * (a) It can't be sorted (lists now can);
 * (b) It has no possibility for annotations (lists do);
 * (c) It has no possibility for references (lists do);
 * (d) It can scroll over to multiple pages with no navigation -- subcategories are particularly confusing here, because people who intuitively grok that the "continued" message means that pages roll over after 200 don't intuitively grok that subcategories roll over, because of the way the subcategories/pages are broken down on a category page.
 * (e) It has no way for the casual viewer to tell if things are missing, and no way for interested editors to make that apparent to casual viewers; lists can include sections that are empty or redlinked or just obviously missing, and editors can make notes that works are missing;
 * (f) Even knowledgeable or serious viewers might not be able to tell if things are missing from a category, for usability reasons; the "article" format that a list is in is designed for easier reading and it is therefore easier for knowledgeable people to spot missing information, whereas, again, the "category" format contains little or no typographical cues to help readers avoid eye exhaustion;
 * (g) Categories have no way to sort and organize any way other than alphabetical; lists can be organized according to whatever criteria make the most sense -- date, type of work, genre, collaborations, awards won, etc.
 * (h) If you monitor categories for inclusion / exclusion (have you tried this? It's really informative) you will see that it is much, much harder to tell when an article has been removed from or added to a category. You have to basically know the category by heart (very difficult for larger categories), avoid eye strain to see differences, and then, if you see a difference, you have to figure out what it is -- which is really hard when it is a deletion from a category. Then you have to go to the article and the article history to see what was going on. Again, while this is annoying if people are adding categories to articles, it is practically impossible if people are removing categories from articles. Category:Liberals is a recent example. I had this category on watchlist, but I didn't know that all kinds of crazy people were being added to it inappropriately, until two very different people whose individual articles were on my watchlist both triggered an alert. Then I went to the category & saw that it was vastly overpopulated and had been made so by a number of different people. To figure out what had been happening, I had to open up each & every article in that category, read it and/or determine whether it had been added in the big broad-brushed swoop. OTOH, if this had been a manually-generated list, instead of the automatically-generated list of a category, it would have been in my watchlist and I could easily have seen what and when and by whom edits were made.

... Basically everybody's arguments against the actor-by-performance categories relate to this critical point about usability, tied in with an important slippery-slope argument -- that is, that when someone creates a category people want to (a) include everything relevant in it; and (b) include other similar categories. So it's very very difficult to simply try to create categories for "exceptions" or define categories as "exceptional performances", "career-defining", and so on. The tendency is for people to do all or nothing. So if you have a type of category, then people will keep on creating all of that type of category, and adding all relevant members & subcategories to that category. ... And that inevitably leads to 120 different "cast of film x" categories for John Wayne, which makes it impossible to actually use the category structure.

I'd be happy to hear your thoughts on this.

--lquilter 17:55, 3 February 2007 (UTC)


 * To quote Categorization:


 * Categories help users navigate through Wikipedia via multiple taxonomies
 * Categories are for defining characteristics, and should be specific, neutral, inclusive and follow certain conventions.

Just to focus on actor by performance, and to take the Tom Baker article as the example:

The Tom Baker article contains information about Doctor Who. A reader would want to be able to navigate to this article in various ways, one being through the category system from other articles about Doctor Who, starting at the highest level Category:Doctor Who, although exactly where under it is not particularly relevant.

It is clearly a defining characteristic of Baker's career that he was in this series, and in fact much of the article is devoted to it. It is neutral.

The so-called "clutter" is really a secondary problem for biographical articles in particular. The question becomes: of the many possible categories, which should be added?

But to quote from categorization again "If you go to the article from the category, will it be obvious why the article was put in the category? Is the category subject prominently discussed in the article?"

An article about a regular cast member of a show will discuss that role, so it is obvious why the article was put in the category. This would not apply to a guest star, as few actors are known for one-off appearances, so such an appearance may not even be mentioned in the article.

Then we come to splitting, which would only be required in unusual circumstances, and in fact is already a practice in use. See Isaac Newton. The number of categories we could apply to Newton could probably outstrip even George W. Bush. For example, there are Isaac Newton's occult studies: Newton could be described as an astrologer or alchemist, but we find neither Category:Astrologers nor Category:Alchemists on the main article about Newton. Instead there is the occult studies article, which falls under astrology and alchemy. Your example article John Wayne is a pretty long article, and if any of his roles were described in more detail than they already are, it would be a candidate for splitting as per Newton. Tim! 11:07, 4 February 2007 (UTC)


 * I don't disagree at all that Doctor Who is defining for Tom Baker; nor that many of the potential categories are defining for people. "Defining" is, I think, a threshold consideration for adding a category; it's necessary. But is it sufficient?  I mean, should all things that are defining be categories?  That's the problem that we face, I think.  Because, of course, even taking a narrow view of "defining" (things that pretty much everyone agrees on, and not things important to one or another subgroup), people might be defined by a very, very large set of things. Tom Baker, for instance; defined occupationally by his role on Doctor Who (my favorite Doctor Who, I'll note); also defined by a lot of basic biographical facts (dates, places, language, gender, ethnicity/race, sexuality, marital status), and so on.  I mean, even to get a good head start on the very basic defining attributes of a person -- from a biographical perspective -- can lead to a lot of categories, and thus the usability problem I outlined.


 * I'll have to think more about the splitting proposal. My initial thoughts are that it would work fine for large articles, like Ronald Reagan or John Wayne. But I'm not sure if it would help with shorter articles and less notable biographies. The problem we run into with categories is that, inherently, they suggest to people a system that is inclusive and comprehensive -- in other words, all members of a category, and all examples of that type of category. So it's very difficult to keep a category's members, or a type of category, limited only to some hidden set of "notable" entities. (As you've proposed elsewhere, and as I've talked about elsewhere, too.) ... That leads us straight back down to the slippery slope of (a) many categories; and/or (b) many CFD discussions.


 * I guess, finally, I just feel like categories create automatically generated, alphabetical lists, which are useful in a fairly limited set of circumstances; and "lists" (article format) are a much more flexible way of handling things -- better edit tracking, better ability to add notes, better ability to sort. The manual editing is done in either case -- adding categories, or adding items to a list -- but it's easier to police & watch in article format. Can you explain what's wrong with (article format) lists, both as cast lists for shows, and as credit lists for actors? --lquilter 03:42, 5 February 2007 (UTC)


 * As to the Doctor Who example above for Tom Baker, the question isn't whether the role is important to Tom Baker or a defining role. It's whether adding a category called "Doctor Who actors" adds any significant search functionality for the reader. For example, let's say I'm a reader and I want to read about actors who played The Doctor. The odds are very high that the first thing I would try is to type "Doctor Who" in the search box. The Doctor Who article that appears already contains a full cast list with links to actor articles of all the actors who played The Doctor, along with the dates the actor played him. So using just this main article, I can easily navigate all the actors who played The Doctor. There would be no incentive or reason for me to further look for a category.


 * So to me there ought to be a search utility benefit to the reader from the category. If the reader isn't likely to want to use the category instead of the main article, then creating the category is almost doubling the amount of editorial maintenance for cast list upkeep for essentially no practical reason. Combine that with the potential for possibly having too many categories per article for some articles, and it just seems like cast list categories are generally a bad idea. Dugwiki 16:57, 5 February 2007 (UTC)