Wikipedia:Auto-categorization

Manual help needed for categorization and "See also" conversion
One of the benefits of categories is that instead of having to maintain 50 different "See also" lists in 50 different articles, there's a single central list of all the articles on a given topic, still easily accessible from each member article. Wikipedia has many clusters of articles with "See also:" lists that would benefit from conversion into one or more categories. I have started experimenting with various techniques of clustering articles, the results of which you can see below. Right now, I'm simply taking articles that have lots of "See also" links and highlighting the articles linked to and the categories those articles may be found in. My hope is that others, having a clump of related articles all in one place, will be able to easily form categories and delete the majority (if not all) of the "See also" links in the articles themselves. I leave it entirely up to you whether or not you think any of these suggestions are worth implementing. Because of the semi-automated nature of the clustering process, it's likely that at least some of the suggestions are bogus. -- Beland 06:26, 25 Dec 2004 (UTC)

Comments about how useful these listings are and how they might be improved are welcome. It's probably a good idea to delete article clusters that you've fixed from the lists (or at least mark them as complete), to avoid duplication of effort. -- Beland 07:52, 25 Dec 2004 (UTC)
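
The clustering step described above (take articles with many "See also" links, then show the linked articles and the categories they already sit in) might be sketched roughly as follows. This is only an illustration, not the script actually used to build the listings: the `pages` mapping of title to wikitext, the function names, and the assumption that "See also" sections use a standard `== See also ==` heading are all hypothetical.

```python
import re

# Assumed markup conventions; real articles vary more than this.
SEE_ALSO = re.compile(r"^==+[ \t]*See also[ \t]*==+[ \t]*$",
                      re.IGNORECASE | re.MULTILINE)
HEADING = re.compile(r"^==+.*==+[ \t]*$", re.MULTILINE)
WIKILINK = re.compile(r"\[\[([^\]|#]+)")
CATEGORY = re.compile(r"\[\[Category:([^\]|]+)")


def see_also_links(wikitext):
    """Return link targets found in the "See also" section, if any."""
    start = SEE_ALSO.search(wikitext)
    if start is None:
        return []
    rest = wikitext[start.end():]
    nxt = HEADING.search(rest)
    section = rest[:nxt.start()] if nxt else rest
    targets = [t.strip() for t in WIKILINK.findall(section)]
    # Category tags at the bottom of a page are not "See also" links.
    return [t for t in targets if not t.startswith("Category:")]


def categories_of(wikitext):
    """Return the names of the categories a page is tagged with."""
    return [c.strip() for c in CATEGORY.findall(wikitext)]


def cluster_report(pages):
    """List (title, see-also targets, categories per target), longest lists first."""
    report = []
    for title, text in pages.items():
        targets = see_also_links(text)
        cats = {t: categories_of(pages.get(t, "")) for t in targets}
        report.append((title, targets, cats))
    report.sort(key=lambda row: (-len(row[1]), row[0]))
    return report
```

Each row of the report puts a clump of related articles in one place, along with any categories that already cover them, which is the information a human editor needs to decide whether a new category is worthwhile.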

Articles about numbers.
 * Auto-categorization/numbers

Articles about letters.
 * Auto-categorization/letters

Articles with the longest lists of "See also" links.
 * Auto-categorization/see-also-1

Articles that need manual categorization (by topic)
The following links are to lists of articles that need categorization. Categories that need similar attention are listed at Category:Wikipedia categories in need of attention. See Categorization for guidelines and syntax.

Wikipedia namespace
These are uncategorized articles in the Wikipedia: namespace. I have already partially sorted them. - Beland 06:09, 25 Dec 2004 (UTC)

/Wikipedia namespace

Containing USA state names
Update from 7 Jan 2005 database dump posted at Auto-categorization/update1. -- Beland 02:40, 28 Jan 2005 (UTC)

Residents albums
I created a category for albums by The Residents. Every album they've released already has an article (and they've released a lot of albums). I've categorized about half of them, and couldn't be arsed to finish the job since I don't actually care much about them anyway. Most of the album articles also need some cleanup (e.g. actually linking to The Residents). Bearcat 10:39, 30 Jan 2005 (UTC)
 * I categorized all of the albums and did minor cleanup work. If anyone else wants to work on it, have at it. --Woohookitty 00:45, 1 Feb 2005 (UTC)

The albums are a mess. Some of the information is wrong. In a lot of cases the main page gives one release date and the album page gives a different one. --Mattwj2002 03:38, 20 Mar 2005 (UTC)
 * I did some cleanup work on a few of these today. Chyel 17:27, Apr 11, 2005 (UTC)

Unsorted

 * Special:Uncategorizedpages displays an alphabetical list of pages that are not categorized. It is cached (which means it may be slightly out of date) and is not editable.  Discuss on Wikipedia talk:Special:Uncategorizedpages.

Completed work
See Auto-categorization/done.

Pending auto-categorization runs
These projects have been implemented or are being implemented in code, and are here for public inspection and hopefully eventual approval.

I have begun the automated portion of Wikipedia namespace categorization. The first batch is currently posted on User:Pearle/on-deck. By bot charter, three days will be allowed for comments and concerns, which you can make on this page's talk page. -- Beland 03:07, 24 Apr 2005 (UTC)


 * A second batch of Wikipedia namespace categorization is posted. By bot charter, three days will be allowed for comments and concerns, which you can make on this page's talk page. -- Beland 01:49, 5 May 2005 (UTC)

Proposed auto-categorization runs
These projects have not yet been implemented or have not yet been precisely defined.

Category:Towns in Japan
Could be easily auto-sorted and subcategorized by prefecture. -- Beland 08:18, 20 Nov 2004 (UTC)
 * This (and, similarly, Category:Villages in Japan) has been done (manually). -- Rick Block 13:38, 18 Apr 2005 (UTC)
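
For reference, an automated version of the sort by prefecture could look something like the sketch below. It assumes that each town article links to its prefecture as `[[<Name> Prefecture]]` and that subcategories would be named "Towns in <Name> Prefecture"; both are guesses made for illustration, and the function names are hypothetical.

```python
import re

# Matches a wikilink whose target ends in " Prefecture" (assumed convention).
PREFECTURE_LINK = re.compile(r"\[\[([^\]|#]+ Prefecture)(?:[|#][^\]]*)?\]\]")


def prefecture_of(wikitext):
    """Return the first linked prefecture name, or None if there is none."""
    m = PREFECTURE_LINK.search(wikitext)
    return m.group(1).strip() if m else None


def subcategory_for(wikitext, kind="Towns"):
    """Suggest a per-prefecture subcategory, or None when it cannot be guessed."""
    pref = prefecture_of(wikitext)
    return f"Category:{kind} in {pref}" if pref else None
```

Passing `kind="Villages"` would cover the similar Category:Villages in Japan case. Articles for which the function returns `None` would still need manual sorting.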

Main article
In cases where there is a category with the same title as an article, we could add the corresponding category to the article; e.g. we would add [[Category:Wikipedia|*]] to Wikipedia, for existing articles/categories. With a bit more checking, "[[Category:Xs|*]]" could be added to "X". In a similar way, the page "List of X" should include [[Category:X|*List]]. -- User:Docu


 * In these cases, it would be nice to have a sentence at the top of the category page linking to the main article. I don't know if it would be best to put the main article in the category, or merely link to it in the see also section.  Sometimes one is clearly better than the other.  Frequently, either is OK. --ssd 05:00, 7 Oct 2004 (UTC)


 * I think there is a template for doing that now. -- Beland 02:03, 8 Oct 2004 (UTC)

The idea is to place the article in the category itself (sorted under "*") rather than only in the category description (where it should go as well). It has already been done for all countries and dependencies.

Even if the category is linked under "see also" in the article, I think the article should always also be part of the category. - User:Docu
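
A rough sketch of the rule proposed in this section: tag an article with the category of the same title using the sort key "*", and tag "List of ..." pages with the sort key "*List". The function names and the `existing_categories` set are hypothetical, and the plural case, which needs more checking, is deliberately left out.

```python
import re


def main_article_category(title, existing_categories):
    """Return (category name, sort key) for a main article or list page, else None."""
    if title.startswith("List of "):
        topic = title[len("List of "):]
        topic = topic[:1].upper() + topic[1:]  # category names start with a capital
        return (topic, "*List") if topic in existing_categories else None
    return (title, "*") if title in existing_categories else None


def add_category(wikitext, name, sort_key):
    """Append the category tag unless the page already carries that category."""
    present = re.search(r"\[\[Category:" + re.escape(name) + r"[ \t]*(\||\]\])",
                        wikitext)
    if present:
        return wikitext
    return wikitext.rstrip("\n") + f"\n[[Category:{name}|{sort_key}]]\n"
```

Leaving already-tagged pages untouched keeps the run safe to repeat, at the cost of not correcting an existing sort key.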

Problem with this
Sometimes the article with the matching name is a disambiguation (DAB) page, not an article (see Romance and Category:Romance). How do we fix these erroneous auto-categorizations? I tried to edit the page and the only thing there is a template. Bookgrrl 02:38, 18 July 2006 (UTC)
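
One possible guard against this, sketched under the assumption that disambiguation pages can be recognized by a template such as `{{disambig}}` (the template names checked here, the `pages` mapping, and the function names are illustrative guesses, not a description of how any existing bot works):

```python
import re

# Assumed disambiguation template names; the real set may differ.
DAB_TEMPLATE = re.compile(r"\{\{\s*(disambig|disambiguation|dab)\s*(\||\}\})",
                          re.IGNORECASE)


def is_disambiguation(wikitext):
    """True if the page appears to carry a disambiguation template."""
    return DAB_TEMPLATE.search(wikitext) is not None


def eligible_main_articles(pages, existing_categories):
    """Titles that match a category name and are not disambiguation pages."""
    return sorted(
        title for title, text in pages.items()
        if title in existing_categories and not is_disambiguation(text)
    )
```

With a filter like this, a page such as Romance would be skipped and left for a human to decide which article, if any, is the main article for Category:Romance.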

Category:Communes of the Calvados département
All the articles found in the list in Communes of the Calvados département should belong to Category:Communes of the Calvados département. However, (a) some already belong to this category (I tried doing this manually, but eventually gave up. Some things are better left to bots), and (b) some may contain Category:Towns of Normandy or Category:Cities, towns and villages of France, which have to be removed, as they contain (either directly or through the other) Category:Communes. -- Itai 01:59, 12 Oct 2004 (UTC)
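
The per-article edit requested here could be sketched as below: strip the two broader categories, then add the département category if it is missing. This is a minimal illustration that assumes category tags appear as plain `[[Category:...]]` text, optionally with a sort key; the function names are hypothetical.

```python
import re

TARGET = "Communes of the Calvados département"
REDUNDANT = ["Towns of Normandy", "Cities, towns and villages of France"]


def category_pattern(name):
    """Regex for a category tag with an optional sort key and trailing newline."""
    return re.compile(r"\[\[Category:" + re.escape(name) + r"(\|[^\]]*)?\]\]\n?")


def recategorize(wikitext):
    """Drop the broader categories and make sure the target category is present."""
    for name in REDUNDANT:
        wikitext = category_pattern(name).sub("", wikitext)
    if not category_pattern(TARGET).search(wikitext):
        wikitext = wikitext.rstrip("\n") + f"\n[[Category:{TARGET}]]\n"
    return wikitext
```

Articles that already belong to the category, as in case (a), come out unchanged apart from the removal of the broader tags, so running the function over the whole list is safe.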

Category:Plants
I support a Category:Plants by classifications subcategory starting a tree of orders and families, since both are partly present as subcategories. Taxoboxes can be easily used to perform the sorting. Circeus 01:38, Feb 24, 2005 (UTC)
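
As an illustration of how taxoboxes could drive the sorting, the sketch below pulls the order and family out of a taxobox and proposes them as category names. It assumes a single-template taxobox with `| ordo = ...` and `| familia = ...` parameters on their own lines, and that categories are named after the taxon; both are assumptions, and the function names are hypothetical.

```python
import re


def taxobox_rank(wikitext, rank):
    """Return the taxon given for a rank (e.g. "ordo"), or None if absent."""
    pattern = (r"^[ \t]*\|[ \t]*" + re.escape(rank)
               + r"[ \t]*=[ \t]*(.+?)[ \t]*$")
    m = re.search(pattern, wikitext, re.MULTILINE)
    if m is None:
        return None
    value = m.group(1)
    # Prefer the link target when the value is a wikilink like [[Rosales]].
    link = re.search(r"\[\[([^\]|]+)", value)
    name = link.group(1) if link else value
    return name.strip("' ")  # drop italic/bold quote marks and spaces


def suggested_categories(wikitext):
    """Order and family names, in that order, for use as category names."""
    ranks = [taxobox_rank(wikitext, r) for r in ("ordo", "familia")]
    return [r for r in ranks if r is not None]
```

Pages whose taxobox lacks either rank simply yield a shorter list and would need a manual look.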