Wikipedia talk:Press releases/March 2005

How to distribute this release
I have a fat directory of US newspapers here, and we can plunder the press log from September's release. Also needed: mailing lists, newsgroups, cool websites. +sj +

SOMEONE VANDALIZED THE PAGE, FIX IMMEDIATELY
 * It was fixed 2.5 hours before the above line was added. I'd be happy to delete the above line and this reply. David Brooks 22:22, 18 Mar 2005 (UTC)

Comments on content
Is it too late to restructure this press release? A standard press release puts the "news" in the first sentence. This one has a long exposition, with terms that general readers would not understand, like subject-area portals. Press releases should not confuse or confound. Fuzheado | Talk 06:53, 15 Mar 2005 (UTC)
 * "putting the 'confound' back into 'press release'"...  No, not too late at all.  We still have a few days to hammer it out.  It will mainly be released in English, in any case.  +sj  +  11:48, 15 Mar 2005 (UTC)

Press releases, like news stories themselves, are often shaped like an inverted pyramid. In any case, my question on this press release is, so what? I'm not disagreeing that Wikipedia's cool, but what is the import of reaching 500,000 articles other than the fact that it is a large round number? For example, is this a thousand times larger than any other encyclopedia? I'm not sure I have an answer other than it is a big number, but if others do, they should highlight it! --Reagle 23:31, 15 Mar 2005 (UTC)


 * I think the 500,000 number is enough on its own and the short length of the press release is the right size. We could put up some numbers of Encarta or Britannica as comparison, but I would caution against it. Fuzheado | Talk 00:33, 16 Mar 2005 (UTC)

Determining article #500,000
Do the developers have some way of determining which article is #500,000? I tried loading the "Statistics" and "New Pages" pages at the same time, and found that there were 500,193 articles, and that the 194th most recently-created article was Battle of Bean's Station. Hopefully, there's some more scientific way of doing this?

Anyway, it seems to be copyvio, so I've removed it for now, and I'm contacting the user who posted it.

Thanks, -- Creidieki 22:47, 17 Mar 2005 (UTC)
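The index arithmetic in the comment above can be sketched as follows. This is only an illustration: the function name is made up, and it assumes every entry on New Pages counts toward the total and that nothing was deleted in between, which later replies on this page show is not guaranteed.

```python
def rank_from_newest(current_count: int, target: int) -> int:
    """Position in a newest-first list of the article numbered `target`.

    Assumes (hypothetically) that the newest entry is article number
    `current_count` and that each older entry is exactly one lower.
    """
    if target > current_count:
        raise ValueError("target article has not been created yet")
    # The newest entry has rank 1, so the offset is shifted by one.
    return current_count - target + 1


# Figures quoted in the comment: 500,193 articles, looking for #500,000.
print(rank_from_newest(500_193, 500_000))
```

With the quoted figures this gives the 194th most recent entry, matching the position reported above.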


 * It turns out that it was from a US government website, and public domain. I had done a Google search, and had found another site which erroneously claimed copyright without attribution. Well, I'm glad of that. (Also, I accidentally edited Battle of Collierville instead.) -- Creidieki 23:17, 17 Mar 2005 (UTC)


 * It's hard to say for sure, but thanks to an active bunch of count-watchers on IRC, the deletion log, and alterego's and JRM's snapshots, it's a fair guess that the 500k'th article was Involuntary settlements in the Soviet Union. Congrats to Mikkalai :)  +sj  +  08:25, 18 Mar 2005 (UTC)

5,000 editors in the past month?
5,000 editors in the past month? Where is that from? 119 07:24, 18 Mar 2005 (UTC)
 * Projection from the December stats --> probably 7,000 of our "over 10 edits this month" editors. Since both the projection and the definition of "active" are vague, I picked a more modest number and left it at "over".  +sj  +  08:04, 18 Mar 2005 (UTC)

Recent changes
I see some significant changes to the text over the past couple of hours, all by 119... they add a lot of bulk to the body of the press release, and don't seem to clarify its thrust very much. I'm undoing many of those changes for now; in particular since others who looked at the release and felt good about it did so after seeing the previous, fairly stable version. Perhaps we can talk about how much of this information to include in the body of future press releases, ahead of time. +sj +  10:21, 18 Mar 2005 (UTC)
 * Quite frankly, I agree with SJ - 119 did a lot of rewriting, and I don't think it was an improvement. →Raul654 10:22, Mar 18, 2005 (UTC)
 * Bulk is not a good thing. We went through this with the million-article press release too; everybody wants to get some particular thing included that they feel is important, but in reality, to be effective you need to cut 90% of the stuff you'd like to cram in there. The information in a press release should be representative, not exhaustive, and you can only afford to hit a handful of key points. The revised version as it is now is pretty good. --Michael Snow 17:22, 18 Mar 2005 (UTC)
 * Then trim it... The introductory paragraph is now more cheerleading-like, and usable specifics on Wikipedia's popularity (#4 Alexa "Reference", 50m hits) have been replaced with the wonderfully obscure "millions of visitors." "English Article"? Did Wikipedia publish its 500,000th article on English, or its 500,000th article in its English edition? That some website has added songs, 1GB of images, and sub-portals hardly seems remarkable--will even a blogger cover that, much less a major regional newspaper? Further, the previous growth of Wikipedia is certainly important contextual information, especially for two sentences. 119 19:20, 18 Mar 2005 (UTC)

Publicizing the article in-house
Would it be appropriate to post notices on Peer review, Village Pump, and/or the mailing lists saying that Involuntary settlements in the Soviet Union is about to become very well-known, and it might be helpful for us if it were more complete? I'm not sure quite where that sort of post would be appropriate, or whether some people might object to "inflating" the article. Mainstream news sources might not understand the difference between the software's definition of "article" and the Encyclopaedia Britannica's. -- Creidieki 12:28, 18 Mar 2005 (UTC)
 * Sounds good to me, really. Some people might object, but the result does seem to have more pros than cons.-- Kizor 16:06, 18 Mar 2005 (UTC) On second thought, it seems likely that the article will get plenty of attention due to the press release alone, especially since the main page links to the release. -- Kizor 16:45, 18 Mar 2005 (UTC)

Nyuk nyuk nyuk
The press release seems to have a definite pro-Wikipedia bias to me. We should work to make it more NPOV. -- Kizor 16:06, 18 Mar 2005 (UTC)
 * Why does something not in the main article space have to conform to the NPOV policy? -- Matt Crypto 16:32, 18 Mar 2005 (UTC)
 * Frankly, because it was fun to say. 'Nyuk nyuk nyuk' is a reference to the Three Stooges comedy group.-- Kizor 16:45, 18 Mar 2005 (UTC)

How to determine the 500,000th article?
Many will be wondering this, as it is not strictly possible simply by looking at the database. The only way to tell is by looking at the data I have made available on my website. The way to tell is to browse to this directory. The files are sorted in order of the time that the pages were served from the Wikipedia server. User:JRM/Sandbox contained NUMBEROFARTICLES, and we used that page because it was very quick loading. There are two versions, one of which contained ?action=purge at the end of the URL. It is one of the latter pages that we are primarily interested in. Because my clock is synched with the atomic clock at NIST just up the road from my house, we can say that the 500,000th article occurred at precisely 20:54:46 UTC, and this is the page that evidences that (notice the 500,000). You will notice at the end of the file name three numbers. These are the HOUR-MINUTE-SECOND that the page was served to me. At that very same second I was also served New_Pages, and that page is here. As you can see, Milon's Secret Castle was the newest article, however, Cyrius had a look at the source code and found that a new article does not change the counter unless it has both a link and a comma. The first version of Milon's Secret Castle did not meet this requirement. Therefore, the next article down the list is Involuntary settlements in the Soviet Union. As you can see here, this article did meet that status on creation. More information is available on my website. Please download and archive the directory. --Alterego 17:43, Mar 18, 2005 (UTC)
 * I hate to run into the chance to be stripped of the glory, but could you explain this "link-comma" issue in more detail: why, when, etc., it was decided? How does it work? (I am sure there were plenty of other commaless new articles down the road. How were they discarded (if at all), and why did this unlucky Secret Castle have to be discarded by eyeball search?) Mikkalai 18:17, 18 Mar 2005 (UTC)
 * I am guessing the selection for which articles are counted is mainly for historic reasons; check out the March 7, 2001 announcement. Thue | talk 19:20, 18 Mar 2005 (UTC)
 * THIS IS WRONG! The requirement is not that it has both a link and a comma. MediaWiki contains a switch that allows the Wiki's administrator to choose between "has a link" and "has a comma" as a requirement. I asked the developers, and it's set to use "has a link". -- Cyrius|✎ 19:40, 18 Mar 2005 (UTC)
 * Just to be clear to everyone: I didn't make the decision to have any article as The One lol :) I was simply trying to explain why the decision to choose the soviet article was made. As I said initially on my website, it's hard to make that choice. I'm still not clear on the reasoning used and was under the impression we were going to call it a tie. I apologize for misremembering that it could be a comma OR a link.  --Alterego 22:19, Mar 18, 2005 (UTC)
 * It's not a comma or a link. It's just a link. There's a switch in the software to shift it to commas. -- Cyrius|✎ 23:01, 18 Mar 2005 (UTC)
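For readers trying to follow the rule as Cyrius describes it, here is a rough sketch. It is not MediaWiki's code: the function name and the `method` parameter are invented for illustration, a "link" is assumed to mean the page text contains `[[`, and anything else the real software may check is ignored.

```python
def counts_as_article(wikitext: str, method: str = "link") -> bool:
    """Toy version of the counting rule described in this thread.

    `method` stands in for the administrator's switch: "link" (the
    setting reported as in use here) or "comma". Both names are
    assumptions made for this sketch.
    """
    if method == "link":
        # Assumption: a wiki link is detected by its opening brackets.
        return "[[" in wikitext
    if method == "comma":
        return "," in wikitext
    raise ValueError(f"unknown method: {method!r}")


# One rule or the other applies, never both at once.
print(counts_as_article("A stub with a [[link]] but no comma"))
print(counts_as_article("A stub with a comma, but no link"))
print(counts_as_article("A stub with a comma, but no link", method="comma"))
```

The point of the sketch is only that the switch selects a single test, so a new page without a comma can still move the counter when the "link" setting is active.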

There's independent reasoning to support Involuntary settlements in the Soviet Union, without the effect of this mistake. See Wikipedia talk:Half-million pool. --Michael Snow 21:05, 18 Mar 2005 (UTC)

Great, now Angela's misquoting me on her blog. I am not saying that Milon's Secret Castle is the 500,000th article. I am saying that Alterego's reasoning for eliminating it was wrong. Michael Snow has explained his independent reasoning for selecting the Involuntary settlements article (Wikipedia talk:Half-million pool), and I don't have a problem with it. My previous objections were based on my belief that the only basis for selecting the 500kth article was the flawed assumption.

To summarize: We don't know for sure which article was the 500,000th. Milon's Secret Castle could have been it, and Involuntary settlements in the Soviet Union was likely to be it. -- Cyrius|✎ 23:01, 18 Mar 2005 (UTC)
 * 1) Articles were being created at a furious rate, making it impossible to easily determine which article crossed the milestone.
 * 2) Roughly six articles were considered possibilities, Milon's Secret Castle among them.
 * 3) MediaWiki does not count all "articles" in its article count, only those with links.
 * 4) Alterego wrongly interpreted something I said as meaning that MediaWiki requires a link and a comma to count an article.
 * 5) Alterego used that incorrect interpretation to eliminate Milon's Secret Castle and went on to declare Involuntary settlements in the Soviet Union the most likely 500kth article.
   * That's not true. I went to bed last night thinking it was a tie and reasoned that my excursion was just for fun. I woke up to a press release based on my data which said it was the soviet article. Wanting to support the foundation's decision, I attempted to reproduce the reasoning used. I made it very clear on my blog that I didn't think we could make a solid choice before I went to bed! --Alterego 00:58, Mar 19, 2005 (UTC)
     * Revise that to "Alterego ventures a guess that it's the settlements article, and a press release is written taking it as solid fact." I got the tone wrong on my statement. -- Cyrius|✎ 04:00, 19 Mar 2005 (UTC)
 * 6) Michael Snow used an independent method based solely on clock times to estimate that Involuntary settlements in the Soviet Union was the 500kth article.
 * 7) I start objecting to Alterego's flawed process without knowing that Michael Snow has come up with the same answer by a different method.
 * 8) None of this really matters at all, because according to developer JeLuF, the counter is wrong.


 * Sorry for the misquote. I thought when Brian discounted Milon's Secret Castle being the one, and you replied "this is wrong" that you were saying he was wrong to discount it, and therefore, it was that one... Sorry. :) Angela. 00:14, Mar 19, 2005 (UTC)
 * I think you've got it now. I was just saying it could not be eliminated, not that it was the right answer. Personally, I don't like either of them. Involuntary settlements... has placeholder section headings, and Milon's Secret Castle was an awful game. I could have gotten behind Bionic Commando. -- Cyrius|✎ 00:31, 19 Mar 2005 (UTC)
 * Please see latest comments here. I propose that at this point it is impossible to tell. Michael Snow's reasoning was incorrect and at this point I think server timestamps are incorrect (in addition to the counter being off). Let's just leave it as the soviet article? I like the sound of that so we can move on. --Alterego 01:07, Mar 19, 2005 (UTC)
 * I liked the idea of a tie, personally. Too late now, I guess. -- Cyrius|✎ 04:00, 19 Mar 2005 (UTC)

The calculations for the "13 stories tall" factoid
For the curious and cynical, here's the details of how I came up with this factoid:
 * At the average of 2,500 characters per article, this is 1.25 gigabytes of raw text, which if printed would form a stack about 13 stories tall.

First, shows that the average characters per article was last 2,434 in December 2004 and exhibiting a gradual upward trend. 2,500 seems about right for now. 2,500 times 500,000 is 1.25 billion characters. According to, single-spaced manuscripts have about "2500-3500 characters per page". This means 1.25 billion/3000 = about 417,000 pages.

According to, 20 pound paper is the standard copy paper weight "most frequently used by governments". According to, 20 bond paper has a thickness of about 0.0038 inches. Multiplying 417,000 pages by 0.0038, that's 1585 inches, or about 132 feet. Taking roughly 10 feet to be the height of a story, we have 13 stories. Deco 18:48, 18 Mar 2005 (UTC)
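The arithmetic above can be checked with a short script. The constants are the figures quoted in the comment (3,000 characters per page being the midpoint of the quoted range); the names are illustrative, and the `double_sided` flag is an added assumption that simply halves the sheet count.

```python
# Figures quoted in the calculation above; constant names are illustrative.
CHARS_PER_ARTICLE = 2_500
ARTICLES = 500_000
CHARS_PER_PAGE = 3_000          # midpoint of "2500-3500 characters per page"
SHEET_THICKNESS_IN = 0.0038     # 20 lb bond paper, inches per sheet
FEET_PER_STORY = 10             # rough height of one story


def stack_height_feet(double_sided: bool = False) -> float:
    """Height in feet of the printed stack under the stated assumptions."""
    total_chars = CHARS_PER_ARTICLE * ARTICLES       # 1.25 billion
    pages = total_chars / CHARS_PER_PAGE             # about 417,000
    sheets = pages / 2 if double_sided else pages    # two pages per sheet
    return sheets * SHEET_THICKNESS_IN / 12          # inches to feet


height = stack_height_feet()
print(f"{height:.0f} feet, about {height / FEET_PER_STORY:.0f} stories")
```

Every input is a round estimate, so the output is only as good as the guesses for characters per page and paper thickness.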


 * Haven't you forgotten to take into account that we write on both sides of the pages? Thue | talk 18:53, 18 Mar 2005 (UTC)


 * Of course, but we don't print on both sides of the page typically, unless we're printing a book (and you'd have trouble binding a book that big). In any case it's either 13 stories or 6.5 stories. Which do you think is most appropriate? Deco 19:04, 18 Mar 2005 (UTC)
 * I went ahead and put 6.5, specifying that the printing is double-sided. I hope this looks alright. Deco 19:10, 18 Mar 2005 (UTC)


 * I think the double-sided figure is best; the reason the number is interesting is that it is comparable with other encyclopedias, which are double-sided. Thue | talk 19:14, 18 Mar 2005 (UTC)
 * Other encyclopedias are usually printed on thinner paper than normal printed books. I'd like to see an estimated character count for some typical big conventional encyclopedia. I have a few-years-old CD-ROM edition of the Encyclopaedia Britannica that I never use because Wikipedia is better, but maybe I can count the characters somehow.
 * Britannica has around 55 million words. At 5 characters per word, this puts it at about 275 million characters, or roughly 1/5 our size. A stack of Britannica is about 6 feet tall. Multiply by 5 and you get 30 feet, or about 3 stories. →Raul654 08:16, Mar 19, 2005 (UTC)
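A quick check of the Britannica comparison, using only the figures quoted in this thread (55 million words, 5 characters per word, a 6-foot stack, and 1.25 billion characters for Wikipedia). The variable names are made up for the sketch. Note that 55 million words at 5 characters each works out to 275 million characters, so the unrounded ratio is a little under 5.

```python
# Figures quoted in the thread; names are illustrative only.
BRITANNICA_WORDS = 55_000_000
CHARS_PER_WORD = 5
BRITANNICA_STACK_FEET = 6
WIKIPEDIA_CHARS = 1_250_000_000   # 2,500 chars x 500,000 articles

britannica_chars = BRITANNICA_WORDS * CHARS_PER_WORD
ratio = WIKIPEDIA_CHARS / britannica_chars

# Scale the Britannica stack by the size ratio, assuming the same
# paper and print density (a simplification).
scaled_stack_feet = BRITANNICA_STACK_FEET * ratio

print(f"Britannica: {britannica_chars:,} characters")
print(f"Wikipedia is about {ratio:.1f} times larger")
print(f"Equivalent stack: about {scaled_stack_feet:.0f} feet")
```

Rounding the ratio up to 5 gives the 30-foot figure; without rounding the stack comes out a few feet shorter.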


 * I just realised, I probably grossly underestimated this, because the "average characters per article" figure includes stubs, and the official article count does not. I'll have to get my hands on some better data. Also, it excludes images, and the 3000 characters per page probably excludes spaces, making the underestimate worse. Deco 20:02, 18 Mar 2005 (UTC)
 * Looking at the database size stats and projecting, this looks about right after all. Hmm. Deco 20:09, 18 Mar 2005 (UTC)


 * How compressible is paper? Would the sheets on the bottom of the pile be squeezed to less than 0.0038"? David Brooks 21:23, 18 Mar 2005 (UTC)


 * The sheets towards the top of the stack would have a little bit of air between them, offsetting the compression effect towards the bottom of the stack. -- DV 12:51, 20 Mar 2005 (UTC)

I also did this calculation recently, and compared with the LOC (here) --Alterego 22:44, Mar 18, 2005 (UTC)


 * But how many books in the LOC are on the same topic, oh, say, "Our Wonderful Universe"? Just how many printed descriptions of Jupiter should you count? So, it's not gazillions; it's jillions at most. David Brooks 23:43, 18 Mar 2005 (UTC)

Lucky it wasn't an article about Star Wars Extended Universe
That would be very embarrassing.
 * Forget the Extended Universe, how about an article on sexual slang?

Are you sure it was luck? ;)

Number of languages with Wikipedia editions
The press release says there are articles "in over 120 languages." Yet, according to, there are 159 languages with at least one article other than the main page. Is there a reason the lower figure was used in the release? JamesMLane 03:34, 20 Mar 2005 (UTC)
 * If it were "over 130", then it would include -pedias with 6 articles. And "over 122" would be strange. The decision, while arbitrary (although formally true), has the merit of avoiding an accusation of overbragging. Mikkalai 04:17, 20 Mar 2005 (UTC)
 * I've switched it to 150 because it's (a) more accurate and (b) a nice, round number. →Raul654 06:09, Mar 20, 2005 (UTC)

Wikipedia Publishes 500,000th English Article
"if printed double-sided would form a stack about 66 feet or over 6 stories tall". Printed on what?--Jerryseinfeld 18:45, 20 Mar 2005 (UTC)

See the calculations for the factoid above. Incidentally, the press release should give the result of those calculations in metres as well as feet. We are an international project and should promote this news to international media. 66 feet is slightly over 20 metres. Jonathunder 06:05, 2005 Mar 21 (UTC)

spelling: tall storeys
Might be an idea to change "over 6 stories tall" to "over 6 storeys tall".

(Yes, I know "stories" is possible as a plural of storey, but it looks wrong and is not shown as such in either Wikipedia or Wiktionary. Important for credibility not to even seem to be guilty of a basic spelling mistake.)

194.203.153.1 11:33, 21 Mar 2005 (UTC) Harry


 * In US English, at least, one floor is a story, and multiple floors are stories. Looks right to me the way it is. Bollar 13:43, Mar 21, 2005 (UTC)


 * Since both spellings seem to be acceptable in US English, and since Wikipedia/Wiktionary gives only "storey", what would be the harm in changing it so it seems OK to speakers of both Englishes? 194.203.153.1 16:50, 21 Mar 2005 (UTC) Harry


 * Story is much more common than storey, and as far as I know, storey is chiefly used in British English. If you Google the four words, story/stories is used about 100 times more than storey/storeys. Then there's the style issue - the article shouldn't mix variants of the language. Perhaps someone should write a story entry! Bollar 17:11, Mar 21, 2005 (UTC)


 * Well yes, naturally there will be many more hits for story/ies than storey/s, since the word meaning "tales, history" is going to be a fair bit more common than the one meaning "floor, level". If there's a reliable way of Googling for only one sense of a word I'd love to hear about it, it would be very useful. (Just in case you didn't know, "storey" only refers to a floor, not a tale.)


 * You seem to be assuming that the rest of the article is in American English but I can see no reason to think that.


 * 194.203.153.1 19:08, 21 Mar 2005 (UTC) Harry

Interwiki
Could a sysop add the interwiki links for this article? I have the links for the Chinese Wiki below: zh-cn:Wikipedia:& zh-tw:Wikipedia:& thanks :) Shinjiman 16:26, 4 Apr 2005 (UTC)