Wikipedia talk:Encarta Encyclopedia topics

Please follow *our* naming conventions
Argh! Things like Absolute Value or Absolute Value, The are red links because they do not follow our basic naming conventions. I fixed a few of these, but a great many more also need to be fixed. Oh, and we do have an article about Absolute value. It would probably be best to combine this list with the Britannica and other encyclopedia lists that are not subject based. Keeping it separate is a duplication of effort and may also not be legal. A combined list would at least not be so obvious. --mav 19:28, 27 July 2005 (UTC)
 * In that case, I think that redirects should be made for some of the articles, so that if a new encyclopedic list is created, the red list will be much shorter when dealing with others' naming conventions. I personally try to create several redirects for that very reason. Leonsimms 21:36, 27 July 2005 (UTC)
 * I've dropped a note to that effect on the project page. We should do the same for reverse-name presentations (e.g. Anthony Comstock or Comstock, Anthony; Auguste Comte or Comte, Auguste). Cheers! -- BD2412 talk 23:01, July 30, 2005 (UTC)

Renumbered sections
When the lists have gone through an initial pass of removing the blue links, I think that the pages and sections should be renumbered to reflect the number of remaining red links at the time rather than the original 30,000+ list. This will also remove some of the unnecessary section headers for only a few articles. Leonsimms 21:36, 27 July 2005 (UTC)
 * Yes, the less clutter the better. Bluemoose 21:52, 27 July 2005 (UTC)

Some curious things
When confirming links on page 15, I came across a few things that I thought I'd mention here: Not really big points, just curious. Matt 01:39, 28 July 2005 (UTC)
 * Search 'Redl' on Encarta and you not only get Alfred Redl (which I was confirming) but also Colonel Redl (a movie), which isn't on this list (or under 'C'), so it seems this list is incomplete (I've noticed this several times).
 * Search 'Corin Redgrave' on Encarta and it gives you an article titled as such, but it's actually an article on Michael Redgrave, his father (Corin is mentioned in the text). Their version of a redirect, I guess. We have articles on both, iirc.
 * Search 'Redruth' on Encarta and it only returns an article titled 'Trevithick, Richard'. We have an article on Redruth, UK (as well as Trevithick), but I wonder how Redruth got into this list when Encarta doesn't seem to have such an article.

Inappropriate "or" links
Where we have, for example:
 * 1) Georges d' Amboise or Amboise, Georges d'
should we simply delete the second link, as it is not the Wikipedia way of formatting?

Bluemoose 08:40, 28 July 2005 (UTC)


 * My inclination is to create a redirect for the second link only if it is something that someone might plausibly either type into the search box or use as a link within an article. Bluap 09:50, 28 July 2005 (UTC)

And where we have, for example:
 * 1) American or American (river)
should we not delete the first link, then redirect or create an article for the second?

Bluemoose 08:40, 28 July 2005 (UTC)
 * Above link to "American" has been disambiguated as of this post. -- BD2412 talk 23:00, 3 October 2005 (UTC)
 * I'm not sure we need to create redirects for all of those. When you have something like:


 * 1) Bloomington or Bloomington (Illinois)
 * 2) Bloomington or Bloomington (Indiana)
 * 3) Bloomington or Bloomington (Indiana, United States)
 * 4) Bloomington or Bloomington (Minnesota)
 * 5) Bloomington or Bloomington (Minnesota, United States)
 * we've got articles on all of those (and many other Bloomingtons), but our naming conventions are different. My inclination would be to leave them in place on the first "blue sweep", as it allows us to see what the Encarta article is about, but remove them completely once someone has checked that the articles exist. OpenToppedBus - My Talk 08:54, July 28, 2005 (UTC)


 * If Bloomington (Illinois) is listed on the Bloomington disambiguation page, do we really need to create a redirect to Bloomington, Illinois (which is the correct Wikipedia naming convention)? Bluap 09:50, 28 July 2005 (UTC)


 * Exactly - I don't think we do need to. But we do need to check that it is listed on the disambiguation page before removing it from these lists. OpenToppedBus - My Talk 10:33, July 28, 2005 (UTC)


 * I think so too -- no need to do the redirects from those un-Wikipedia-like links. However, in some borderline cases I did create the redirect, e.g. Anaconda (city) versus Anaconda, or the American versus American (river) mentioned above. -- Marcika 22:40, 28 July 2005 (UTC)


 * Redirects are fun and cheap - people are taught to search for different things in different ways, so for something like Bloomington (Illinois), I think it makes sense. -- BD2412 talk 23:04, July 30, 2005 (UTC)

Useless red links
There are certain very odd red links created here which are essentially useless even as redirects: I plan to eliminate these wholesale, unless there are specific objections. -- BD2412 talk 01:17, August 3, 2005 (UTC)
 * Butte (city, Montana)
 * Butte (city)
 * C Broad or Broad, C or Broad, C (harlie) or C Broad (harlie)
 * Joseph R., Jr. Biden
 * E Biggs or Biggs, E or Biggs, E (dward George) or E Biggs (dward George)
 * no objection from this quarter Courtland 02:52, August 3, 2005 (UTC)
 * Go for it! Lou I 11:58, 3 August 2005 (UTC)
 * Be sure that we have corresponding articles though. Err on the side of caution. Martin (Bluemoose) 13:16, 3 August 2005 (UTC)
 * I'm trying to "fix" them first - adding periods to unpunctuated initials, taking "city" out of the place names, and so forth, to see if article names pop up. -- BD2412 talk 13:53, August 3, 2005 (UTC)


 * I think that some of those strange links were caused by whatever process was used to list the Encarta articles here. The articles exist on Encarta, but the list here is inaccurate for some reason. I think that something choked on parentheses, and, sometimes, commas. For example:
 * C Broad or Broad, C or Broad, C (harlie) or C Broad (harlie) don't exist on Encarta; they should be:
 * C(harlie) D(unbar) Broad or Broad, C(harlie) D(unbar) (corresponding to C.D. Broad)
 * i.e., Encarta uses parentheses in names to indicate parts of the name that are usually not written in full. For example, Encarta writes Louis B. Mayer as Louis B(urt) Mayer (or Mayer, Louis B(urt)). Somehow, when these links were transferred, spaces were added between the initials and parentheses and names were reordered incorrectly, and sometimes dropped.
 * Some more examples:
 * Joseph R., Jr. Biden
 * Listed on Encarta as Biden, Joseph R., Jr.,
 * meaning Joseph R. Biden, Jr.
 * E Biggs or Biggs, E or Biggs, E (dward George) or E Biggs (dward George)
 * Listed on Encarta as E(dward George) Power Biggs or Biggs, E(dward George) Power,
 * meaning E. Power Biggs
 * -- Mateo SA | talk 22:57, August 17, 2005 (UTC)
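The convention described above (an initial followed by its expansion in parentheses, and surname-first ordering) can be illustrated with a minimal sketch. The function name `encarta_to_natural` is hypothetical, and the rules are an assumption inferred only from the examples in this thread; this is not how the lists were actually generated, and it is only meant for personal names, not place names such as Bloomington (Illinois).

```python
import re


def encarta_to_natural(title: str) -> str:
    """Guess a natural-order name from an Encarta-style title (hypothetical helper)."""
    # "B(urt)" -> "B.": keep the initial, drop the parenthesised expansion.
    title = re.sub(r"\b([A-Z])\([^)]*\)", r"\1.", title)
    # Split "Surname, Given[, Suffix]" on commas, ignoring empty pieces.
    parts = [p.strip() for p in title.split(",") if p.strip()]
    if len(parts) == 1:
        return parts[0]  # already in natural order
    surname, given, *suffix = parts
    name = f"{given} {surname}"
    if suffix:
        name += ", " + ", ".join(suffix)  # e.g. "Jr."
    return name


print(encarta_to_natural("Mayer, Louis B(urt)"))           # Louis B. Mayer
print(encarta_to_natural("Biggs, E(dward George) Power"))  # E. Power Biggs
print(encarta_to_natural("Biden, Joseph R., Jr."))         # Joseph R. Biden, Jr.
```

The output still needs a human check against the actual article title (for instance, the sketch gives "C. D. Broad" where the article is C.D. Broad).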


 * We should not redirect the silly ones. If we have an article for it, and sensible redirects, then it should be deleted from the list; even ones that we don't have articles for should have the silly alternatives deleted. It is up to editors' discretion what is sensible. Thanks -- Martin - The non-blue non-moose 23:04, 17 August 2005 (UTC)

Other language Wikis
I've found a few articles that are redlinks here, but which have articles in other-language wikipedias. Keep an eye out for those, we may need no more than a translation to fix them! -- BD2412 talk 23:40, August 5, 2005 (UTC)
 * Even if you can't speak the language you can sometimes steal their pictures (by putting them into wikicommons!), which is always very satisfying. Martin - The non-blue non-moose

Culling blue links properly
I become very nervous whenever someone starts trimming out blue links wholesale because I can't be sure that they have checked each of the links to see if it actually corresponds to an Encarta article. Probably 95% of the time it does, but there are always those cases (Pearl River or Opie) where the two articles have absolutely nothing to do with each other, or where nothing exists. Because the coverage in Wikipedia has gotten so good, it's really easy to assume that someone has written the "right" article or at least created a disambiguation page that points to the "right" article. This is NOT true. I'm aware of this in particular because I did that same thing when I first became part of this project. I cleared out most of page 2 without following a single blue link. We cannot lose sight of the larger goal of total coverage in the short-term goal of clearing out the blue links and getting the % complete higher. Reflex Reaction 21:05, 9 August 2005 (UTC)
 * I'm the new person that you think might not be getting it right, but it was me that made the note about Opie, and Pearl River (Mississippi-Louisiana) has existed since September 2003. 82.35.34.11 05:14, 10 August 2005 (UTC)
 * It sounds like then I have nothing to worry about from you and your contributions. Welcome to the project and my sincere thanks for your contributions. Hopefully this will serve as a warning to other new not so experienced contributors.  Reflex Reaction 13:22, 10 August 2005 (UTC)

Start point
Following the squabble over page 4, yes the starting point was wrong, but they have been wrong across the board. The actual number of missing articles is probably less than 4,000, not 6,300, but it's more consistent to continue on the same basis as before. 82.35.34.11 20:53, 10 August 2005 (UTC)
 * Thanks for finishing page 4. Martin (Bluemoose) 21:07, 10 August 2005 (UTC)

Updated starting estimate
I have updated the estimate of the initial number of missing articles to 4,000. The multiplier comes to less than 4,000, and there are some fresh blue links in the cleared pages and certainly more potential redirects - I've hardly looked at cases involving diacritics. Of course some articles have been created since this was started. We will never have a meaningful "precise" figure, and I suggest that 4,000 will be a suitable start point to calculate the percentage complete once the blue links and redirects on the missing five pages are done. This may have to be changed if the results for the other five pages are dramatically different, but that is unlikely. 82.35.34.11 18:39, 14 August 2005 (UTC)


 * I don't quite follow how you get ~4000 rather than ~6500. For consistency with the EB2004 figures, the idea was to count how many pages were missing _before_ we created the redirects - ie the total number of red links on the original list. Bluap 19:06, 14 August 2005 (UTC)
 * By multiplying the current number of missing articles on the 13 pages by 36,433/26,000 - then there is an upwards adjustment to reflect the fact that some have been created since this page was started and a downwards adjustment to reflect the fact that there are still more redirects to be done. Many of the redirects are just garbled names. If the Britannica figures were calculated before the creation of simple redirects they are patently misleading. The percentage completion given for Britannica purports to be the percentage of the missing articles which have been created but it is not. This is misrepresentation, and it should surely not be repeated just because it has been done before. Accuracy should be a higher priority. Also, once the percentage count is underway, progress will be faster with a lower count as 1 is a larger percentage of 4,000 than of 6,300. 82.35.34.11 19:25, 14 August 2005 (UTC)
 * Well, if it was done in the same way they would all be comparable; now they are not, but I guess you know better than everyone else. It is not misrepresentation: this is a Wikipedia project, not a court of law (i.e. it doesn't actually matter), and while it is impossible to say how many redirects there are, it is easy to say how many red links there are. Martin - The non-blue non-moose 21:53, 14 August 2005 (UTC)


 * I stand by my comments. The figures for the different encyclopedias aren't comparable because the proportion of mistaken red links is likely to vary greatly according to the degree of difference between each encyclopedia's formatting policies and Wikipedia's formatting policies. The project is about missing articles, not red links. Also, I wasn't the first person to reduce the estimate, someone else cut it from 6,300 to 4,800. 82.35.34.11 17:29, 15 August 2005 (UTC)


 * As the person who had done two earlier estimates (6,300 and 4,800), the most recent change was a then current (and very conservative) estimate of remaining articles, not the initial articles. With the redirects and character weirdness perhaps the number of needed articles is closer to 4,000, but I don't think extrapolation from a single page or even several is the answer. We are still working on the "pruned" percentage; when we have an adequately pruned list, redirects and character weirdness included, that list can be used as the initial estimate of remaining work. It will be less than 4,000 and closer in spirit to the original convention. Reflex Reaction 21:25, 15 August 2005 (UTC)
 * How did you arrive at the 3,700 figure? I don't see how it can be right, given that I haven't gone through the unpruned pages looking for redirects with the same thoroughness as the pruned ones (in fact I've hardly touched them) and so far as I know no-one else is doing so. The redirects that could be created on those five pages are in addition to the figure of around 4,000 rather than something that can be taken away from it. Also, the current version of the project page doesn't add up. I know they are estimates, but it looks odd. 82.35.34.11 19:25, 16 August 2005 (UTC)
 * Also, I don't understand what you mean by "Conservative estimate" in this case. It can mean "low" or "cautious", but in this case it is cautious to estimate high to avoid underestimating how much there is to be done. 82.35.34.11 19:39, 16 August 2005 (UTC)
 * I guess I should have explained more when I changed the number because there is an error which I just fixed. Your work showed that ~200 topics were left per page after a thorough cleaning (redirects created).  Before the redirects were done each page was about ~325-350.  Using the numbers available for completely culled pages I (wrongly) estimated that 200 would remain after the redirects were made.  I should have used the 325 or 350 number not the 200 number because I was estimating the amount of remaining work to be done not an estimate of new articles to be created.  I have changed the estimate of remaining work to 4,500 (2,791 + 5*(350)) which is a better estimate of remaining work for the last five pages. My mistake.  Reflex Reaction 21:50, 16 August 2005 (UTC)
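The two estimates debated in this thread can be reproduced with a short sketch. The constants come from the comments above; plugging the 2,791 figure into the extrapolation is an assumption, since the thread does not say which count was actually multiplied, and the function names are made up for illustration.

```python
TOTAL_TOPICS = 36_433            # size of the original list
TOPICS_ON_PRUNED_PAGES = 26_000  # topics covered by the 13 pruned pages
UNPRUNED_PAGES = 5


def extrapolate_missing(missing_on_pruned: int) -> float:
    """Scale the missing count on the pruned pages up to the whole list."""
    return missing_on_pruned * TOTAL_TOPICS / TOPICS_ON_PRUNED_PAGES


def remaining_work(missing_on_pruned: int, per_unpruned_page: int) -> int:
    """Current count plus an assumed per-page count for each unpruned page."""
    return missing_on_pruned + UNPRUNED_PAGES * per_unpruned_page


print(round(extrapolate_missing(2_791)))  # 3911, i.e. "less than 4,000"
print(remaining_work(2_791, 350))         # 4541, rounded to 4,500 above
```

The first figure estimates articles still to be written; the second estimates entries still to be checked, which is why it is higher.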

Pruned percentages - Just to be clear, I calculated the pruned percentages this way
 * Single pages: 1 - (current count - 275)/2000. About 275 articles per page need to be created or need a redirect; the rest need to be pruned. So for page 5, 1 - (410 - 275)/2000 = 93% cleared, with 135 more to be trimmed.
 * Total count: There are 2,259 articles that need to be cleared based on the current count. 1 - (total to be culled / starting amount) = 1 - 2,259/36,433 = 93.8%
 * Okay, but the figures aren't very accurate. I would have simply counted the percentage of the 2000 deleted, and jumped to 100% when a page was fully pruned. But it can be dropped when the pruning is finished. 82.35.34.11 11:56, 28 August 2005 (UTC)
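For anyone wanting to recheck the pruned percentages, here is a minimal sketch of the calculation as described above. The 275-per-page floor and the 2000-topic page size are the figures quoted in this section; the function names are hypothetical.

```python
def page_cleared(current_count: int, expected_left: int = 275,
                 page_size: int = 2000) -> float:
    """Fraction of a page already pruned, given how many entries remain."""
    still_to_trim = current_count - expected_left
    return 1 - still_to_trim / page_size


def overall_cleared(to_cull: int, starting_amount: int = 36_433) -> float:
    """Fraction of the whole list already pruned."""
    return 1 - to_cull / starting_amount


print(f"{page_cleared(410):.2%}")       # 93.25%
print(f"{overall_cleared(2_259):.2%}")  # 93.80%
```

Because the 275 floor is itself an estimate, a page can read slightly above or below 100% when it is in fact fully pruned.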

Adding search
I assume that no one would object to me adding search capabilities to the lists but just to be safe, I only added it to the first five pages. If there are no complaints by tomorrow, I will add the rest of the alphabet. If anyone is interested in the spreadsheet that I used to create the lists (using work of others before me) please leave me a message on my user talk. Reflex Reaction 16:36, 24 August 2005 (UTC)
 * Good work, it makes checking that our articles match Wikipedia's faster. 82.35.34.11 12:01, 28 August 2005 (UTC)

Questions...
I am highly interested in this project and have been working on it here and there...I am still uncertain however about redirects. Should a redirect be created whenever a last name/first name appears on this list? For example, we have an article on George Washington, but if Washington, George is on the Encarta list, should I make that into a redirect?

I am eager to celebrate the fact that WP owns Encarta. Rock on, everyone. Paul 12:13, 26 August 2005 (UTC)


 * I would tend to make a redirect, unless the reversed order is just patent nonsense, and nothing that anyone would type into the search box. Bluap 14:28, 26 August 2005 (UTC)
 * I've done plenty of these redirects, but they're pretty useless for general purposes as they never produce any links. There isn't really much reason to do them except they'll stop red links coming up on future projects of this type. Sorting out middle names (and adding them to our articles where they are missing, which is quite often) is more important I think. 82.35.34.11 11:59, 28 August 2005 (UTC)

Could you clarify what you mean about middle names? Thanks Paul 15:09, 28 August 2005 (UTC) P.S. I envision a WP party, similar to a LAN party, where several wikipedians, with single-minded devotion, tear through this entire list. Bawls and beer will flow. Anyone who's down, say so. Paul 15:09, 28 August 2005 (UTC)

WTF???
WHY was the project page deleted?????? Paul 15:11, 28 August 2005 (UTC)


 * I'm not entirely sure, but I found this page about the topic. It's some sort of copyright issue. Mateo SA | talk 15:48, August 28, 2005 (UTC)
 * Very odd. I can't see a copyright violation from merely listing articles that are included in a reference - and even if there were, the lists have now been so thoroughly pruned that they no longer resemble the list of topics covered by Encarta. -- BD2412 talk 15:52, 28 August 2005 (UTC)