Wikipedia talk:Historical archive/Complete list of encyclopedia topics

Is this a list of both articles we have AND stubs we have? (on a procedural note, we have trouble getting people to update the Biographical listings page - see my effort 9/16.) --MichaelTinkler


 * The lists include the topics we already have (as of today, without subpages), and also some topics from selected lists I found on the web. The idea was to list all topics we should cover. With about 16.000 words, it is far from complete, but it's a start. See especially the letter 'Q', I found many words there, but it's still a small list that we could easily cover. Let's proclaim a "Q-week" ;)

What are the various misspellings and OldStyleLinksThatWe'reNowTryingToAvoid for? :-) --User:Koyaanis Qatsi


 * With this number of topics, I can't check them all manually :( If you have an idea of how to handle that, I'm open to suggestions...

I wish I did. So should we eliminate redundancies and redirects?

These pages seem (unsurprisingly) to get rather long. Perhaps it would be a good idea to split each page into existing subjects and subjects that should exist? --Pinkunicorn

This is a good idea, Magnus--I don't know why we didn't do it before. :-) We'll have to link to it from requested articles. I would agree that 16,000 words will not come close to exhausting the number of Wikipedia articles we'll write. The value in this list, though, is that it's a list of topics that other encyclopedists (or is it only we, ourselves??) have seen fit to write articles on--so, we'll have passed a significant milestone when we have written articles on all of these topics.

(Hey Magnus, if you haven't done this yet, it would be fantastic to list all the encyclopedia topics that www.dmoz.org lists in their encyclopedia directory -- the list is free. That way we can make sure we're covering all the topics that Britannica, Encarta, and Columbia Concise are covering.)

Of course, if we make it a habit of writing endless "stubs" just to say that we've got articles on the subjects, I think we'll in the end dilute the usefulness of the project...that bothers me. My theory is that at some point, Wikipedia will have achieved such breadth that it will have to grow in depth if it is to grow at all, and at that point, the project will be a lot more interesting to bona fide experts. For instance, if a classicist now observes that we're surrounding an article on the Temple of Apollo at Delphi with zillions of lightweight stubs, he won't want to participate, because he'll think the project is lightweight. I think that's probably the reaction of most academics to Wikipedia right now. But once we start really exploring subjects in-depth, the classicist's interest will be piqued. --User:LMS


 * speaking as an academic, there's no estimating (over or under) their intellectual snobbery, but the REAL problem with academics and wikipedia is the culture of the signature, which is why nupedia is your only hope for that. And you've seen what peer review does to initiative; I'd hesitate to suggest myself for the Charlemagne article there, because "people I know" would read it and associate it with me. Here at wikipedia I feel much less constrained, if only because I know that it's collaborative. On the other hand, I'm at least 3 years away from the real tenure crunch, so my level of paranoia is lower. --MichaelTinkler


 * I added the encyclopedia.com and encarta "Q" topics from dmoz (I have to download and convert each page manually). Britannica doesn't have a list on dmoz.org (or I couldn't find it). Problem is, right now, I have several word lists as test files (a topic per line) here, and a program that sorts them, deletes duplicates (more or less), and converts them to wiki format. But, I get 36 pages (A-Z,0-9) which I'd have to upload each time there is a major change. So, I think the best thing to do would be to collect as many sources as possible BEFORE uploading the entire thing again. I will go through the dmoz lists tomorrow. I'll also try to use the dmoz archive structure list (23MByte!). Any word lists or URLs with word lists would be very appreciated. Please put the URLs here, if you have a word list, just append it to New topics. At what point should I stop? 50.000 words, as the preliminary benchmark? I think the Q list is already quite complete, it might be fun to cover that one completely ("The Qte encyclopedia?":) -- Magnus Manske
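A minimal sketch of the pipeline described above: merge several word lists (one topic per line), drop duplicates, sort, and emit one wiki-formatted page per initial character. This is not Magnus's actual program, which is not shown on this page. The function names, the `[[...]]` link syntax, the " - " separator, the case-insensitive duplicate check, and the choice to file odd initials under "0" are all assumptions made for illustration.

```python
from collections import defaultdict


def normalize(topic: str) -> str:
    """Collapse runs of whitespace and strip the ends (assumed cleanup step)."""
    return " ".join(topic.split())


def page_key(topic: str) -> str:
    """Pick the page for a topic: 'A'-'Z' or '0'-'9' (36 pages in all).

    Topics starting with anything else go to '0'; that choice is an assumption.
    """
    first = topic[0]
    if first.isascii() and first.isalnum():
        return first.upper()
    return "0"


def build_pages(word_lists):
    """Merge several word lists into {page key: wiki text}."""
    seen = set()
    pages = defaultdict(list)
    for words in word_lists:
        for raw in words:
            topic = normalize(raw)
            if not topic:
                continue
            # Compare case-insensitively so "Quark" and "quark" count as one topic.
            fold = topic.casefold()
            if fold in seen:
                continue
            seen.add(fold)
            pages[page_key(topic)].append(topic)
    return {
        key: " - ".join(f"[[{t}]]" for t in sorted(topics, key=str.casefold))
        for key, topics in sorted(pages.items())
    }


if __name__ == "__main__":
    lists = [
        ["Quantum mechanics", "Quark", "Qatar"],
        ["quantum mechanics ", "Quasar", "42nd Street"],
    ]
    for key, text in build_pages(lists).items():
        print(f"/{key}: {text}")
```

Folding case and whitespace before comparing is one way to catch near-duplicates that an exact-match pass would keep; whether that is desirable depends on how much the capitalisation of a title is meant to matter.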

Hey Magnus, as usual, I am amazed at your creativity. Thanks. Another thing to consider is to put links to other encyclopedias' article URLs after the Wikified name, in between single brackets. So, for example, one bit would look like this:


 * ... - Jesus Christ - ...

--User:LMS


 * Nice idea, but wouldn't it be better to put that link at the bottom of the article itself instead? In the topic list it might clutter the page. Also, I'd have to look up and edit every single entry manually :( Or am I missing the point here? --Magnus Manske


 * Well, I thought the main point of this list was to provide people with ideas about what we should have articles on. So at the same time that someone looks at a topic on your list, it would be nice to see a link to an article about that topic. Instant easy research! --User:LMS

I am quite happy to announce that my local word database has passed the 120.000 words! Not counting crap, there should be about 100.000 left. I also used the New topics list, and uploaded the Q page again as an example. I'll upload all letters once we decide on a "partly finished version". --Magnus Manske

Magnus, why are there duplicates (e.g. Quantum mechanics) on the /Q page? Missing a "sort -u" somewhere? --User:Robbe

Somewhere, before the list is officially finished, you should add a comment to the effect that very many of these proposed article names do not follow Wikipedia's naming conventions. Many use plurals and upper case when we'd use singular and lower case. --User:LMS


 * I added a warning, and I'm working on ways to reduce plurals semi-automatically. I'll probably start uploading the lists I have later today. These should keep us busy for a while ;)
 * one of the great things about the 'complete list' is that it will help identify those illicit plurals and refactor into appropriate entry-title! Thanks again, Magnus. --MichaelTinkler
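One possible semi-automatic check for the plural problem, sketched under assumptions: flag a title as a likely plural only when a naive singular form of it is already in the list, and leave the final decision to a human. The suffix rules and function names below are hypothetical and are not the method Magnus used. English plurals are irregular enough that this will both miss cases and raise false alarms.

```python
def singular_guesses(title: str):
    """Yield naive singular forms for an English title (heuristic only)."""
    if title.endswith("ies") and len(title) > 3:
        yield title[:-3] + "y"      # Galaxies -> Galaxy
    if title.endswith("es") and len(title) > 2:
        yield title[:-2]            # Foxes -> Fox
    if title.endswith("s") and not title.endswith("ss") and len(title) > 1:
        yield title[:-1]            # Quarks -> Quark


def plural_candidates(titles):
    """Return sorted (plural, singular) pairs where both forms occur in the list."""
    by_fold = {t.casefold(): t for t in titles}
    pairs = []
    for title in titles:
        for guess in singular_guesses(title):
            match = by_fold.get(guess.casefold())
            if match is not None:
                pairs.append((title, match))
                break
    return sorted(pairs)


if __name__ == "__main__":
    sample = ["Quark", "Quarks", "Galaxy", "Galaxies", "Glass", "Physics"]
    for plural, singular in plural_candidates(sample):
        print(f"{plural} -> {singular}?")
```

Requiring the singular to be present keeps titles such as "Physics" or "Glass" from being flagged, at the cost of missing plurals whose singular is not yet on the list.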

The "basic vocabulary" page includes zillions of words that are not names of any plausible encyclopedia article. We shouldn't be encouraging people to write articles on basic vocabulary; we should be encouraging them to write articles on basic encyclopedia topics, which is totally different. --User:LMS

I don't believe basic encyclopedia topics will work. Who could decide what 10% of history is most significant? I still prefer the Q-week idea - just grind through the topics and people who know more will pick them up. Although I'm doing K-months and a lot of tat is slipping through. --User:TwoOneTwo

Huh? Why won't it work? Of course it will work, if we work on it. It doesn't have to be exact. It just has to be useful. --User:LMS

You're probably right about the "basic vocabulary", but do you really think people will go through more than 100.000 topics searching for ones from their special area? Click on A, find a topic and write, yes, but ...

I got the "basic biology topics" by collecting them on biology pages. Might be easier for the other topics, too. Well, at least you got rid of a subpage;) --Magnus Manske

I tried going through the topic list one by one looking for sport topics, but it's just *impossible* with the list in its present form. It might be marginally easier if you formatted the words into a single-column or two-column list. It's a real needle-in-a-haystack exercise, though, and I doubt whether it's a particularly good use of my time when I already know of important articles yet to be written. --User:Robert Merkel

As far as I can see, the main functions of these enormous link lists are:


 * 1) To suggest new topics worthy of an article
 * 2) To ensure that every article in Wikipedia has a link to it, so that none will become "lost"

However, now that we've got the new Wikipedia software running, we've got special:WantedPages to fill the first role and special:AllPages to fill the second. Furthermore, the existence of these pages significantly reduces the ability of special:LonelyPages to find articles without meaningful links to them; I don't really see the links from these pages being particularly useful to someone browsing through the encyclopedia. So I'm thinking it might be a good idea to "do something about them," to allow orphans to be truly noted as orphans.

A couple of ideas come to mind, in increasing order of difficulty. We could delete these articles entirely, we could remove the links to existing articles, or we could ask the wiki software maintainers if it would be possible to exclude these particular articles from the links database that the Orphan-finder uses. Anyone else have any ideas? Bryan Derksen, Wednesday, April 3, 2002

I think the function of this page was #1 above, & not #2. We didn't at that time have the orphan-finding capability. Also, a link to an article on these pages will be largely worthless, IMO. You make good points. I don't consider the pages themselves of much use, but in the meanwhile I'll run them through Notepad, Ctrl-H deleting [ and ]. User:Koyaanis Qatsi, Wednesday, April 3, 2002

Actually I stopped with A. I realized that having them linked did have a benefit, which would be: showing which of the topics are already created, and which might need creation still. Comments? User:Koyaanis Qatsi

 * #1 was indeed my intention when I created these pages, with the beneficial side-effect of "which we have" as you said above. It might be a good idea to have these pages ignored by the software. I'll look into it when I have time. Magnus Manske, Wednesday, April 3, 2002

I'll restore the pages. The other option--editing them manually to eliminate links we already have--would clear up the "orphans" issue but would take a long time & too much (IMO) dedication to expect of people. Cheers, User:Koyaanis Qatsi

Perhaps we could impose upon someone with access to the guts of Wikipedia to come up with some sort of automatic script to use on these pages and these pages alone, to remove links to existing articles but not to non-existing ones? I suppose I should bring this up on the mailing list now, to see what the practicalities of these sorts of things may be. Bryan Derksen, Wednesday, April 3, 2002
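The kind of script asked for here could look roughly like the following sketch. It assumes the list pages consist of `[[Title]]` links (the square brackets mentioned earlier in this discussion) joined by " - ", and that a set of existing titles is available from somewhere. `existing` below is a hypothetical stand-in for that lookup, not a real Wikipedia interface, and the title normalisation rule is likewise an assumption.

```python
import re

# Matches [[Title]] and [[Title|label]]; group 1 is the link target.
LINK = re.compile(r"\[\[([^\]|]+)(?:\|[^\]]*)?\]\]")


def drop_existing_links(wikitext: str, existing: set, sep: str = " - ") -> str:
    """Keep only the links whose target is not in `existing`.

    Titles are compared after stripping whitespace and capitalising the
    first letter; that normalisation rule is an assumption of this sketch.
    Any text on the page that is not a link is discarded.
    """
    def canon(title: str) -> str:
        title = title.strip()
        return title[:1].upper() + title[1:]

    known = {canon(t) for t in existing}
    kept = [m.group(0) for m in LINK.finditer(wikitext)
            if canon(m.group(1)) not in known]
    return sep.join(kept)


if __name__ == "__main__":
    page = "[[Aardvark]] - [[Abacus]] - [[abelian group]] - [[Abyssal plain]]"
    print(drop_existing_links(page, {"Aardvark", "Abelian group"}))
```

Because the list pages would then link only to missing articles, they would stop counting as an incoming link for any existing article, which is the effect wanted for the orphan finder.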

Conducting a little experiment, I manually removed all links to existing articles from Complete list of encyclopedia topics (obsolete)/A. Before I did this, there were 730 orphan articles listed. I removed 353 links from the original 2241 links on that page, most of them to existing articles (a few were to bad article names, such as those containing &, that produced nonfunctional links). After I did it, there were 737. A very naive estimate would seem to indicate that about 200 orphans are being "hidden" by these lists of links. Bryan Derksen, Tuesday, April 16, 2002

Next, a bigger experiment: I did the page Complete list of encyclopedia topics (obsolete)/A2 too. Of the approximately 4152 links on that page, I removed 758; most of these were links to existing articles, though I also removed ones with bad ASCII in them along the way. Before I did this there were 737 orphans; afterward there were 765. There were 28 orphans hiding in this page, for a total of 35 for all of the As. This suggests that my estimate of 200 was low, and that there are closer to 900 orphans hidden in these link pages. Bryan Derksen, Friday, April 19, 2002
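The step from 35 to "closer to 900" is not spelled out. One reading, assumed here, is a straight per-letter extrapolation: if the A pages hide 35 orphans and each of the 26 letters hides about as many, the total comes to roughly 900. The counts are taken from the two experiments above; the uniform-per-letter assumption belongs to this sketch and ignores the digit pages.

```python
# Orphan counts reported before and after cleaning each page.
before_a, after_a = 730, 737      # page /A
before_a2, after_a2 = 737, 765    # page /A2

# Orphans that surfaced once the links were removed: 7 + 28.
hidden_in_a = (after_a - before_a) + (after_a2 - before_a2)

# Assumption: every letter hides about as many orphans as A does.
LETTERS = 26
estimate = hidden_in_a * LETTERS

print(hidden_in_a, estimate)
```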


 * The problem with this page may simply be that its name doesn't say what it means. It's taken me three months to realize it, and that didn't happen until I chanced upon the above discussion. The intuitive sense that I get from the title is that it's a list of all the articles in Wikipedia; that has limited usefulness when you can use the search function. I'm inclined to believe that a lot of new people will also wrongly take the title at face value, and not as an inspiration to write about a topic that is dear to their hearts. Without that hook they may not stick around very long. The motives described above make sense, but I don't think the point is getting across. The second issue that this discussion made me examine is the concept of basic vocabulary. My first impression here was that this seems to run contrary to the "Wikipedia-is-not-a-dictionary" rule. I even went to that page and tried a few links. "Talk" gave me a stub about the need for someone to write about the phenomenon of talk, "up" spoke about the z-axis in mathematics, and "rent" spoke of a Broadway musical. It makes me wonder. User:Eclecticology


 * Yeah, this list would probably be better titled something like "Ideas for encyclopedia articles." As for its purpose and value, I'm not sure; I know that I've never used it myself, but during my work going through all those links I did come across a lot of nonexistent articles I was tempted to click on and add a little stub for. So it might well be of use to some people. Bryan Derksen


 * My imagination tells me that adding these little stubs could result in making matters worse. Stub articles, unless one makes the effort to read them, leave the false impression that the article has already been written. As long as it shows up as needing to be done, it's more obvious to whoever is looking for something to do. The Stub articles link is more irritating than anything else. It begins with the zero length articles twenty at a time (not even 50 at a time like recent changes) with no opportunity to jump ahead. Many of the zero length articles may simply be waiting for deletion, or may otherwise be empty for good reason. It can be frustrating to go through a number of pages of lists before finding something to sink one's teeth into. Finding work among orphans is far more rewarding. User:Eclecticology
 * Yeah, those 0-length articles have been bugging me too. I asked Jimbo for sysop status so I could cut a swath of destruction through them, deleting all those which had nothing worth saving in their histories (which I anticipate would be over 90% of them), but although he said he had no problem with that I have not yet felt any new powers suffuse my being. I will send him another email to see if it's slipped his mind, and hopefully get to work on cleaning. I also agree on the little stubs, BTW; when I write a "little stub" I try to make it at least a full-blooded paragraph long, with plenty of obvious hooks for more knowledgeable people to hang additional material on. Bryan Derksen
 * In case you didn't notice, I got my shiny new sysop powers and have used them to clear away the majority of those 0-length articles from stub articles; you start getting to interesting stubs on the third page in, now. The remaining 0-length articles all have talk: pages that contain enough stuff to make me reluctant to delete them outright at this time. Bryan Derksen

Complete list of encyclopedia topics (obsolete)/B: of 4211 links, removed 463. 741 orphans to begin with, 759 after. 18 new orphans. Bryan Derksen, Saturday, April 20, 2002

I am pleased (very pleased) to announce that I have found a way to nearly-automatically selectively remove links to existing articles from these pages. I'm going to do the rest of them without further running commentary here. Bryan Derksen, Friday, May 31, 2002


 * There, finished. The orphan count went from 452 to 782 when I did that last batch, so I uncovered roughly 330 orphans (a handful were added or subtracted by other activities during that time). Adding to that the 53 I uncovered by hand previously, there were about 383 orphans hiding in these pages. I feel vindicated. :) Bryan Derksen, Saturday, June 1, 2002
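The totals in this comment can be checked directly from the figures reported on this page; nothing below is new data. The by-hand count is the sum of the orphans uncovered on /A, /A2 and /B.

```python
# Orphans uncovered by hand, from the before/after counts reported above.
by_hand = {
    "A": 737 - 730,
    "A2": 765 - 737,
    "B": 759 - 741,
}
by_hand_total = sum(by_hand.values())      # 7 + 28 + 18

# Orphans uncovered by the nearly-automatic pass over the remaining pages.
automatic = 782 - 452

total_hidden = by_hand_total + automatic
print(by_hand_total, automatic, total_hidden)
```

The result falls between the two earlier estimates of about 200 and about 900.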

Now that I've gone to all the trouble of invalidating the name of these articles by removing all the links to existing articles and leaving only the non-existing ones behind, I guess it's time that we (read: I, since I'm the troublemaker, after all) should move these to an article with a more appropriate name. What would be a good one? List of suggested topics comes to mind off the top of my head, or perhaps List of topic ideas since we may not want to suggest that some of these be turned into articles. Anyone have other ideas or preferences? Bryan Derksen, Sunday, June 2, 2002


 * What is the use of having these pages in the first place? These lists seem to contain thousands of names which are incorrect per established wikipedia naming conventions. Why should we encourage the misnaming of articles that will have to be renamed later? Who uses these pages? Sorry, I'm being the trouble maker now. --User:maveric149
 * I've never used them myself, and doubt I ever will, so don't expect a defence to come from my corner. Still, having just spent a day "cleaning" them, I would feel a certain irrational knee-jerk upset to see them simply deleted. You could have saved me a bunch of work by suggesting that earlier. :) Anyway, I'll go along with whatever consensus is reached; all I wanted to do was to get those hidden orphans out into the open, and I've accomplished that. Bryan Derksen
 * Sorry, haven't been near a computer for a couple days. If we decide to keep these pages and somehow fix all the incorrectly named edit links, then I suggest a title of List of possible encyclopedia topics or better yet List of potential encyclopedia topics --User:maveric149
 * I sympathize with Bryan's dilemma about losing his children, but he must look at it as a positive sign that he has raised his children well. Still, all he has succeeded in doing is showing us how useless these pages really are. Most of the entries on these pages are themselves orphans. After doing a random check I found that out of any ten entries eight will be orphans. In the absence of the original context that led to their being added to the list, I would remove all these orphan entries. If they're important enough they'll certainly reappear. Perhaps lists of wanted articles could be generated from time to time in a way similar to the most wanted. That, however, could be very long. -- User:Eclecticology, Monday, June 10, 2002


 * Fortunately, it's been a while since I did that work and so the memory has faded. I won't even jerk my knee now if someone were to delete all this. :) Bryan Derksen
 * I find that these pages are of little use, for reasons already mentioned above: many incorrect titles, and they create many hidden orphans (and also "falsely" contribute to other pages such as most wanted). It is a good idea to have some suggestions for articles, but maybe there's a more structured way. I think there's something like that around divided into categories, but I can't get to the name right now. If we're not going to delete them, we might put them in a different namespace, solving the false linking issues. Just my .02, User:jheijmans

So, are we all agreed that these pages should be simply deleted, then? Since these pages are rather big and impressive-looking, the result of much work by various folks, it's probably best to spread the word far and wide before we do that. But if nobody has any objections here, I'll add them to the votes-for-deletion list and see if I can rouse objections from elsewhere. Bryan Derksen, Tuesday, June 18, 2002


 * I agree that these pages should go, but you are right -- these pages do represent a lot of work. We should spread the word. Perhaps list this on the deletion queue. --User:maveric149, Tuesday, June 18, 2002

Keep this page!
Please do not delete this page, under any circumstances - it, along with associated subpages and this talk page (and, more critically, their histories), contain important documentation of the early days of Wikipedia - documentation which will (someday) be very important to historians. Trust me (as someone who worked on something else important where we threw away the records "because they are obviously useless junk" - which is now driving historians of technology to despair) on this one! Noel (talk) 13:38, 10 Nov 2004 (UTC)