Wikipedia:Productivity of Wikipedia Authors

By its very nature, Wikipedia lends itself well to international comparisons. For example, Wikipedia (DE) currently has some 478,000 articles, while the current article count of Wikipedia (EN) is reported at Special:Statistics. Does that mean that English-speaking Wikipedians write more than their German-speaking colleagues? And how does that compare to Wikipedia in other languages?

The results might give some clues about the intellectual climate in various countries.

Productivity of Wikipedia authors
A meaningful measure would be the Productivity of Authors, that is, the number of articles in a language relative to the number of potential authors. To be a potential author, a person must meet at least two requirements:


 * Command of the language. This must be at least sufficient to write an article, perhaps with the help of other Wikipedians. For practical purposes this is too vague a measure. Therefore, the following table takes only native speakers of each language into account, because these numbers are available. However, this gives only a lower limit for the number of potential authors: the real number will be higher, and therefore the real productivity will be lower than shown below.


 * Access to the Internet. Accurate statistics on this are available by geographic region. However, a language can span several geographic regions (English, for example, is spoken nearly all over the world), and conversely several languages may be spoken in one region (Switzerland, for example, is a country where 4 languages are spoken).

Thus, in the following table there is a column "countries" which lists the main countries where native speakers of a language reside and over which the Internet penetration was averaged.

Productivity of Wikipedia authors by languages
                  Native speakers          Internet-    Potential   Wikipedia  Productivity    Age as of Aug 2005
                                           penetration  authors     articles   (articles/'000
Language    Wiki  (mio)   (countries)      (%)          (million)   ('000)     pot. authors)   Begin     (months)
-----------------------------------------------------------------------------------------------------------------
English     en    322     USA,GB,CDN       66.6         221.1       684        3.0             Jan 2001  56
German      de    100     B,D,A,CH         56.6         55.5        274        4.8             May 2001  52
Japanese    ja    125     J                60.9         76.1        134        1.8             Sep 2001  48
French      fr    79      F,B,CDN,CH       49.9         35.9        148        3.9             Aug 2001  49
Polish      pl    43      PL               27.8         12.8        80         6.1             Sep 2001  48
Dutch       nl    21      B,NL             66.2         16.6        85         5.0             Aug 2001  49
Italian     it    61      I                48.8         34.2        55         1.6             Jan 2002  44
Swedish     sv    8.8     S                73.6         6.5         96         14.8            May 2001  52
Spanish     es    332     E,MEX,Latinam.   14.1         45.1        60         1.3             May 2001  52
Portuguese  pt    170     P,BR             13.5         23.0        61         2.6             Jun 2001  51
Chinese     zh    1080    RC               7.9          85.3        37         0.4             Oct 2002  34
Hebrew      he    5.1     IL               44.8         2.3         24         10.5            Jul 2003  23
Norwegian   no    4.6     N                68.2         3.1         32         10.2            Jan 2002  44
Finnish     fi    5.2     FIN              62.3         3.2         29         9.0             Sep 2002  35
Russian     ru    170     RUS              15.5         26.4        29         1.1             Nov 2002  33
Danish      da    5.3     DK               68.7         3.6         29         8.7             Feb 2002  43
Bulgarian   bg    9.0     BG               21.4         1.9         17         8.8             Dec 2003  41
Slovene     sl    2.0     SL               47.5         0.9         17         17.9            Jul 2003  23
Hungarian   hu    14      H                30.2         4.2         15         3.5             Jul 2003  23
Czech       cs    12      CS               34.5         4.1         14         3.4             Nov 2002  33
Turkish     tr    51      TR               9.9          5.0         4.6        0.9             Dec 2002  32
-----------------------------------------------------------------------------------------------------------------
Total                                                   666.8       1924.6     2.88 avg.

The 1924.6 thousand articles listed are ca. 96% of all Wikipedia.

Column Native Speakers comprises two sub-columns:


 * (mio) - Total number of native speakers (see List of languages by total speakers).
 * (countries) - countries taken into account when averaging the internet-penetration.

Column Internet-Penetration is the weighted average over the countries of native speakers of a language (see Interstat's Internet Usage Statistics).

Column potential Wikipedia-Authors is the product of number of native speakers and the average Internet-Penetration in their countries. This is the approximate number of people who might be capable of writing a Wikipedia article.

When you divide the number of Wikipedia articles (see the main page of each language) by the number of potential Wikipedia-Authors, you get the productivity of Wikipedia authors in articles per 1000 potential authors.
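The two column definitions above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the script used to build the table; the function names are made up for this example, and the input figures are the Swedish and Norwegian rows of the table.

```python
def potential_authors_million(native_speakers_million, internet_penetration_pct):
    """Potential authors (in millions) = native speakers x Internet penetration."""
    return native_speakers_million * internet_penetration_pct / 100.0


def productivity(articles_thousand, potential_million):
    """Articles per 1000 potential authors."""
    articles = articles_thousand * 1_000              # thousands -> single articles
    potential_in_thousands = potential_million * 1_000  # millions -> thousands of people
    return articles / potential_in_thousands


# Swedish row: 8.8 million native speakers, 73.6 % penetration, 96 thousand articles
sv_potential = potential_authors_million(8.8, 73.6)
sv_productivity = productivity(96, sv_potential)

# Norwegian row: 4.6 million native speakers, 68.2 % penetration, 32 thousand articles
no_potential = potential_authors_million(4.6, 68.2)
no_productivity = productivity(32, no_potential)

print(round(sv_potential, 1), round(sv_productivity, 1))
print(round(no_potential, 1), round(no_productivity, 1))
```

For these two rows the rounded results agree with the table. Some other rows differ slightly from a direct recalculation, presumably because the table was built from unrounded or country-weighted inputs.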

Findings

 * 1) The great disparity among language communities is surprising: the most productive one, Slovene, produced more than forty times as many Wikipedia articles per '000 potential authors as the least productive one, Chinese (17.9 versus 0.4). That is even more remarkable since Slovene is one of the youngest Wikipedias, whereas Chinese is 9 months older.
 * 2) All the Nordic languages rank very well (Swedish, Norwegian and Finnish, and even Danish is still far above average). That cannot be explained by their age alone: a number of older Wikipedias show lower productivity.
 * 3) Hebrew achieves an outstanding productivity, though it is one of the youngest Wikipedias in this list.
 * 4) Polish does remarkably well, although it is almost the same age as English, German and French.
 * 5) The languages most prominent in Wikipedia (English, German, French) are about average. For arithmetic reasons that is not really astonishing, since they dominate the totals. It does mean, however, that nearly 50% of all potential Wikipedians have about the same productivity (while, as said before, there are great differences among the others).
 * 6) Some widespread languages (e.g. Chinese, Russian, Turkish) have rather poor productivity in writing Wikipedia articles, though all of them are nearly 3 years old.
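
Some of the figures behind these findings can be rechecked directly from the table. The following minimal Python sketch does so using only values copied from the table; the variable names are illustrative. Because the table values are rounded, the results are approximate.

```python
# Values copied from the table (productivity in articles per 1000
# potential authors; potential authors in millions; articles in thousands).
productivity_by_wiki = {"sl": 17.9, "zh": 0.4}
potential_million = {"en": 221.1, "de": 55.5, "fr": 35.9}
total_potential_million = 666.8
total_articles_thousand = 1924.6

# Finding 1: how far apart are the most and least productive Wikipedias?
ratio_sl_zh = productivity_by_wiki["sl"] / productivity_by_wiki["zh"]

# Finding 5: share of all potential authors covered by English, German, French
share_en_de_fr = sum(potential_million.values()) / total_potential_million

# Overall productivity across all listed languages (the "avg." in the total row)
overall_productivity = total_articles_thousand / total_potential_million

print(f"Slovene / Chinese productivity ratio: {ratio_sl_zh:.1f}")
print(f"en+de+fr share of potential authors:  {share_en_de_fr:.1%}")
print(f"Overall articles per 1000 potential authors: {overall_productivity:.2f}")
```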

More comparisons/graphs


This graph compares a factor intended to represent the number of articles per person for Wikipedias with more than 10,000 articles (as of 27 May 2006). The "Articles per person" values were calculated as the number of articles in a given language divided by the number of people for whom that language is native. All numbers are based on the data given in the English and Polish Wikipedias. If an accurate value was not available, the average of the lowest and the highest number was taken. The bar for Ido was cut off, because its value is 5.34, whilst the next highest (Icelandic) is only 0.038.
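
The calculation described above can be sketched as follows. This is a minimal Python illustration, assuming that the "lowest and greatest number" refers to a range of native-speaker estimates; the function name and the sample figures are hypothetical and are not taken from the graph's data.

```python
from typing import Optional


def articles_per_person(articles: int,
                        speakers_low: float,
                        speakers_high: Optional[float] = None) -> float:
    """Articles divided by native speakers.

    If only a range of speaker estimates is known, the midpoint of the
    lowest and highest estimate is used (an assumption about the method).
    """
    if speakers_high is None:
        speakers = speakers_low
    else:
        speakers = (speakers_low + speakers_high) / 2
    if speakers <= 0:
        raise ValueError("number of native speakers must be positive")
    return articles / speakers


# Hypothetical language: 10,000 articles, 1 to 3 million native speakers
example_range = articles_per_person(10_000, 1_000_000, 3_000_000)

# Hypothetical language with a single estimate: 20,000 articles, 4 million speakers
example_exact = articles_per_person(20_000, 4_000_000)

print(example_range, example_exact)
```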

Ratio of active Wikipedians (those with more than 10 edits per month) to the number of native speakers, for Wikipedias with more than 10,000 articles. The values are given per mille (1/1000).