User:Aymatth2/Expandstats

This page gives statistics from a comparison of 27 articles flagged for expansion with 27 articles not flagged. It compares the degree to which the two sets of articles were expanded in 2010. Results are inconclusive due to the small sample size and the difficulty of ensuring like-for-like articles in the samples. In summary:
 * For the flagged articles, average expansion was 12.17%
 * For the unflagged articles, average expansion was 17.45%

These results should be treated with great caution, for reasons given below. A larger and more carefully designed study may show quite different results.

Methodology
To build the list of flagged articles, the editor
 * Checked "what links here" on template:expand
 * Made a list holding results 51-150
 * Selected entries that
    * Had the expand template at the top of the article (not just after a section header)
    * Had been flagged for expansion before 2010
 * Arbitrarily removed "similar" articles (e.g. picked just one of Telecommunications in Jersey, Telecommunications in Jordan, Telecommunications in Kazakhstan, Telecommunications in Kuwait)
 * This left 27 articles, for which the edit history showed the size at 1 January 2010 and the size at 22 December 2010
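The page does not say how "average expansion" was calculated from the two recorded sizes. One plausible reading, shown here only as a sketch, is the unweighted mean of each article's percentage growth between the two dates. The function names and the sizes in the example are hypothetical and are not data from the study.

```python
from statistics import mean


def expansion_pct(size_start, size_end):
    """Percentage change in article size between the two snapshot dates."""
    if size_start <= 0:
        raise ValueError("starting size must be positive")
    return 100.0 * (size_end - size_start) / size_start


def average_expansion(sizes):
    """Unweighted mean of per-article percentage growth (an assumed definition)."""
    return mean([expansion_pct(start, end) for start, end in sizes])


# Hypothetical (size at 1 January 2010, size at 22 December 2010) pairs
sample = [(4000, 4400), (10000, 10500), (2500, 3000)]
print(round(average_expansion(sample), 2))  # 11.67
```

A mean of per-article percentages gives every article equal weight. Dividing total growth by total starting size would instead weight large articles more heavily, and the two figures can differ noticeably.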

To build the list of unflagged articles, the editor used Special:Random to find a random sample of 27 articles. Articles were filtered to remove
 * Articles less than 2,500 characters long, since all but one of the flagged articles were larger than this
 * Articles created in 2010

Again, the editor recorded the size at 1 January 2010 and the size at 22 December 2010.
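The two exclusion rules for the random sample can be sketched as a simple filter. This is an illustration only: the `Article` record, its field names, and the example entries are hypothetical, and the page does not say at which date the 2,500-character threshold was applied.

```python
from dataclasses import dataclass
from datetime import date

MIN_SIZE = 2500  # characters; the threshold given in the text


@dataclass
class Article:
    title: str
    size: int       # assumed: size in characters at the time of sampling
    created: date   # date of the first revision


def eligible(article):
    """Keep articles of at least MIN_SIZE characters that were not created in 2010."""
    return article.size >= MIN_SIZE and article.created.year != 2010


def filter_sample(candidates, wanted=27):
    """Return the first `wanted` eligible articles, in the order they were drawn."""
    return [a for a in candidates if eligible(a)][:wanted]


# Hypothetical random draws, not articles from the study
draws = [
    Article("Example A", 1800, date(2007, 5, 1)),   # too short
    Article("Example B", 5200, date(2010, 3, 9)),   # created in 2010
    Article("Example C", 3100, date(2008, 11, 20)),
]
print([a.title for a in filter_sample(draws)])  # ['Example C']
```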

Comments
Both samples were very small, so any conclusions have to be treated with extreme caution. A mechanized approach with samples of 1,000 or more articles would give much more credible results. There were noticeable differences in the characteristics of the two samples. Specifically:
 * Tagged articles were generally much larger than untagged articles. This was not caused by tagging. The size of the tagged articles on the date they were tagged was generally greater than the size of randomly selected articles
 * Tagged articles had much more edit activity than the random articles. Many tagged articles had more than 50 edits in 2010, while none of the random articles did

This suggests that articles attracting more editor interest than average are more likely to be tagged for expansion. A more thorough study of the effects of tagging should adjust for this by comparing samples of tagged and untagged articles that have similar size and edit activity levels.
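One way such an adjustment might be approached is to pair each tagged article with an untagged article of similar size and edit count before comparing growth. The sketch below uses a greedy match within a relative tolerance. The function names, the 25% tolerance, and the example records are all assumptions for illustration and were not part of the original comparison.

```python
def close(a, b, tol):
    """True if a and b differ by at most tol, relative to the larger value."""
    return abs(a - b) <= tol * max(a, b)


def match_samples(tagged, untagged, size_tol=0.25, edits_tol=0.25):
    """Greedily pair each tagged article with one unused, similar untagged article."""
    available = list(untagged)
    pairs = []
    for t in tagged:
        for u in available:
            if close(t["size"], u["size"], size_tol) and close(t["edits"], u["edits"], edits_tol):
                pairs.append((t, u))
                available.remove(u)  # each untagged article is used at most once
                break
    return pairs


# Hypothetical records: size in characters, number of edits in the year
tagged = [
    {"title": "T1", "size": 8000, "edits": 60},
    {"title": "T2", "size": 3000, "edits": 10},
]
untagged = [
    {"title": "U1", "size": 2800, "edits": 9},
    {"title": "U2", "size": 7500, "edits": 55},
    {"title": "U3", "size": 20000, "edits": 2},
]
for t, u in match_samples(tagged, untagged):
    print(t["title"], "matched with", u["title"])
```

Tagged articles with no sufficiently similar untagged counterpart are dropped from the comparison, so a large pool of untagged candidates would be needed to keep the matched sample from shrinking.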