Wikipedia talk:Modelling Wikipedia's growth/Archive 2

Update needed
We need to update the graphs, as well as remove the obsolete ones. --Christopher 12:19, Mar 1, 2005 (UTC)

Simpler model
The equation Y = .49X² + 13.59X + 175.6, which is simpler than the current one, fits surprisingly well with the data. --Bart133 (t) 16:31, 17 Mar 2005 (UTC)
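
For anyone wanting to check the fit, the quadratic above can be evaluated directly. This is a minimal sketch in Python. The post does not say what units X and Y are in, so the function name `simple_model` and the sample inputs are purely illustrative assumptions.

```python
def simple_model(x):
    """Evaluate the quadratic Y = 0.49*X^2 + 13.59*X + 175.6 quoted above.

    The units of X and Y are not stated in the post, so none are assumed here.
    """
    return 0.49 * x**2 + 13.59 * x + 175.6


# Illustrative inputs only; these are not real data points.
for x in (0, 10, 20):
    print(x, round(simple_model(x), 2))
```

Comparing these values against the actual article counts (in whatever units the original fit used) would show how well the simpler model holds up.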

Are the artifacts still bad?
The line about artifacts in the data making it impossible to see whether the trend is exponential, linear or whatever (which was probably appropriate just after the addition of all the towns) isn't really appropriate any more, is it? --62.79.161.178 15:37, 24 Jun 2005 (UTC)

My new model
Should I write a section on my own growth model? Image:Wikigrowthjul05.jpg --Ctrl buildtalk 23:36, 13 July 2005 (UTC)


 * Absolutely. I'd be interested to see how your model works. As an aside, how close were your predictions for this month (August 2005)? --Lisiate 01:59, 19 August 2005 (UTC)

Unclear what variable represents
In the formula used for the Dec 2003 model, what is d? --207.200.116.195 05:57, 8 August 2005 (UTC)

What happened around Oct. 2002?
What happened in October 2002 to cause such a big bump? --Bawolff 00:13, 2 October 2005 (UTC)


 * In October 2002 Ram-Man used the bot Rambot to add a very large number of articles about U.S. towns; these articles were automatically generated from U.S. census data. As you can see, at the time it made a big difference to the total number of English Wikipedia articles. --CheekyMonkey 11:44, 2 October 2005 (UTC)

Added plot of log(pages) vs. time to demonstrate exponential growth
I added a plot of log(English language pages) vs. time, and it looks VERY linear, demonstrating exponential growth.

To me this shows that the more pages are out there, the more people read them, the more readers are "converted" to editors, and these people then create new pages. I would like to see if people are starting to do statistical analysis of pages, page types etc. to create knowledge bases like Cyc.

I put it into a spreadsheet and was going to try to get the TREND function to work but no luck so far. Let me know if someone else is an Excel expert.

--Dan 21:48, 11 April 2006 (UTC)
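
The straight-line fit described above can also be done outside a spreadsheet. Below is a minimal sketch in Python of an ordinary least-squares line through (time, ln(pages)), which is the same kind of linear trend fit applied to the logarithm of the page count. The function name `fit_exponential` is an assumption for illustration, and the numbers are synthetic, not real Wikipedia counts.

```python
import math


def fit_exponential(times, pages):
    """Least-squares line through (t, ln(pages)).

    Returns (a, b) such that pages is approximated by a * exp(b * t).
    """
    logs = [math.log(p) for p in pages]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    b = sxy / sxx                      # slope of the log plot = growth rate
    a = math.exp(mean_y - b * mean_t)  # intercept, converted back from log scale
    return a, b


# Synthetic series that doubles every time step; NOT real Wikipedia data.
times = [0, 1, 2, 3, 4]
pages = [1000, 2000, 4000, 8000, 16000]
a, b = fit_exponential(times, pages)
print(f"a = {a:.1f}, growth rate b = {b:.4f}, doubling time = {math.log(2) / b:.2f}")
```

If the log plot really is linear, the slope `b` is the exponential growth rate and `ln(2) / b` gives the doubling time in the same time units as the input.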

Automatic modelling
I have made a gnuplot/ruby script which, given a file containing article creation dates, plots the size of Wikipedia, fits a few functions to it (exponential, logistic, power series), and generates PNG images of the results. Results are here and here. I can't generate the text files used as a basis myself, but it should be possible to use something like this to automatically keep this page up to date. The scripts themselves are linked from my user page. Amaurea 14:53, 23 April 2006 (UTC)
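
The first step of such a pipeline, turning article creation dates into a size-over-time series, could look roughly like the following. This is only a sketch in Python and is not Amaurea's script: the input format of the original files is not described here, so the example assumes the dates have already been parsed into `date` objects, and the function name `cumulative_size_by_month` is hypothetical.

```python
from collections import Counter
from datetime import date
from itertools import accumulate


def cumulative_size_by_month(creation_dates):
    """Return [((year, month), articles created up to and including that month), ...].

    Months in which no article was created are omitted from the result.
    """
    per_month = Counter((d.year, d.month) for d in creation_dates)
    months = sorted(per_month)
    totals = accumulate(per_month[m] for m in months)
    return list(zip(months, totals))


# Made-up dates for illustration only.
sample = [date(2001, 1, 15), date(2001, 1, 20), date(2001, 3, 2), date(2002, 10, 19)]
for month, total in cumulative_size_by_month(sample):
    print(month, total)
```

The resulting series is what the exponential, logistic or other candidate functions would then be fitted to.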

Google Trends
Look at the following graph:

http://www.google.com/trends?q=%22Wikipedia%22&ctab=0&date=all&geo=all

This information is far too good not to use. But where and how?


 * It's even more interesting when you compare it to some competitors:


 * http://www.google.com/trends?q=wikipedia%2C+encyclopedia%2C+encarta%2C+britannica&ctab=0&geo=all&date=all - mennonot 22:00, 19 May 2006 (UTC)
 * See Wikipedia talk:Awareness statistics. --Zoz (t) 23:38, 28 July 2006 (UTC)

The December 2003 model predictions vs. actual data
Is this section really relevant anymore? I think we've established sufficiently that the 2003 predictions were way low, and continuing on a monthly basis to show how much more mighty we are than we thought we'd be doesn't seem to be productive. It's absolutely stunning historical data, but at this point it's about 50% out of sync. Maybe a new prediction is in order? -- nae'blis 20:02, 31 August 2006 (UTC)


 * I removed this section. I agree with you. It makes no sense to keep showing that the predictions calculated in 2003 no longer hold. Diego Torquemada 12:59, 8 November 2006 (UTC)