Wikipedia talk:Wikipedia Signpost/2015-05-20/In focus


 * How many times does this need to be said? Forget about article counts.  What matters is article quality.  If you can demonstrate the will to train editors and teach them better research and writing skills, you can plan a future of sorts for Wikipedia.  Of course, I'm alone crying in the wilderness on this.  I'm sure all the money is going to figuring out how to automate article writing and replace editors with AI.  Hail Silicon Valley.  How's that planned 45% unemployment going to work out for y'all?  See you on the barricades... Viriditas (talk) 04:12, 21 May 2015 (UTC)
 * And how many times does this need to be said? Not everyone has to work on Wikimedia sites in the same way. I am always seeing people telling other people how they should be spending their time to improve the sites. Bottom line is this: you work on what you find important, and other people will work on what they find important, and together (hopefully) we can improve things. That's the wiki way. That being said: yes, it would be nice if more people could work on fact-checking and adding/improving references. That's hard work, though, so only a minority of editors will want to put in the required effort (especially since [almost] none of us are getting paid to edit — a fact that makes your "money" and "unemployment" remarks seem quite irrelevant). - dcljr (talk) 17:57, 21 May 2015 (UTC)
 * Your article covers the technical reasons for why article counts on Wikimedia wikis are erroneous. You did a great job covering this topic.  But it isn't clear to me why this is an important issue.  Why do we need to measure the number of articles?  The unstated assumption is that article count milestones indicate a measure of success.  But I am far more concerned with the number of articles reviewed for accuracy, reliability, and prose quality, which is, IMO, a far greater metric for success. Erik Zachte touches briefly on "presenting article counts in a meaningful way" that indicates the importance of editors vetting for article quality reflected in one measure of the count.  For me, this is the most important aspect of your article, but it's mostly ignored.  Regardless of how we define an article, editor retention rates may impact article count milestones at a greater scale. As editor retention rates go down, article counts might decrease.  With automation increasing, experts are currently predicting that in the U.S. alone within the next two decades, "47% of the workforce have a high probability of being displaced by technology and another 19% had a medium probability of displacement".  In the longer term, jobs that require "critical thinking, innovative thinking and high emotional and social intelligence capabilities" will be safer from automation, but probably not for long.  With such a large pool of unemployed people to draw from, Wikipedia is in a unique position to welcome these people into its fold and continue to increase article counts and more importantly, article quality.  Far from being "irrelevant", this is the most important occurrence in modern history. Viriditas (talk) 21:09, 21 May 2015 (UTC)
 * Article count (and article length, reference count, article assessment and many others) is a way of measuring a number of things. "Success" is too vague a term.  Primarily these metrics are only useful for comparing projects against other projects (ideally of the same family) or against themselves at other points in time.  These types of figures, for example, partially motivated the WMF's attempt to increase participation from less developed countries.  (Arguably too great a leap was made from the figures to action without developing and testing an explanatory model, but that does not invalidate the usefulness of the figures.)  All the best: Rich Farmbrough, 22:55, 21 May 2015 (UTC).


 * That is such a nice way of putting it. I think you are exactly right. Thanks for doing this work. --Ori Livneh (talk) 01:03, 24 May 2015 (UTC)
 * I too don't care too much about the number of articles. The number is growing, and so is the quality of the stock of articles with editors adding some info here or there, or just tidying up typos (I make more than my share), improving grammar and style, etc. What we don't measure, probably because we can't, is delight. Good articles are all very well, and a great help to high school and college students faced with an assignment, but what I really hope happens from time-to-time is someone finding something on a topic that matters to them, even if it is not notable to the world at large. Finding out a little about a vessel one of one's ancestors sailed on, or an obscure provincial town they came from, and the like, is where Wikipedia really performs a unique service. Acad Ronin (talk) 19:32, 21 May 2015 (UTC)
 * My viewpoint on this issue is simple: if the MediaWiki software is going to report article counts (and let's face it, that feature is not going away), the counts should be as correct as possible. To the extent that they are not correct, people should understand why, and some thought should be given as to whether the situation can be improved. That's it. What people use the counts for, how they get interpreted, how people feel about reaching milestones, etc., are all less important to me than the accuracy (or lack thereof) in the counts themselves. - dcljr (talk) 23:38, 23 May 2015 (UTC)

Inconsistent information
In the second sentence, I read "a decrease of 281,624 articles in the English Wikisource (a 27% drop)", then later in the page, I read "in the English Wikisource, which increased by 281,199 (a 27% drop)". [//en.wikipedia.org/wiki/Wikipedia:Wikipedia_Signpost/2015-04-08/News_and_notes#Milestone_counts Signpost 2015-04-06] says "a decrease of 281,624 articles in the English Wikisource (a 27% drop)". What are the right figures? Cantons-de-l'Est (talk) 23:46, 21 May 2015 (UTC)
 * Sharp eyes! The first and last statements you have quoted are based on stats collected by a Perl script that I (try to) run manually each day. The "decrease of 281,624 articles" occurred between 02:56:35 UTC on 2015-03-29 and 01:47:31 UTC on 2015-03-30. The second statement is based on stats collected by EmausBot and posted to the various "/Tables" pages at Meta; see Article counts revisited/2015-03-29 changes to all recounted wikis for more details. In the case of the English Wikisource, the second quote should say "which decreased by 281,199" (I accidentally switched increase and decrease when I typed it up); this change occurred between 12:00 UTC on 2015-03-27 and 12:00 UTC on 2015-03-31. The differences in numbers are purely due to the different times at which the data were collected. This wouldn't have been an issue if I had saved a copy of the data my Perl script collected on 2015-03-29 and 2015-03-30, but I didn't think to do that before I ran the script too many times to still have access to the older information. (Oops.) - dcljr (talk) 22:53, 23 May 2015 (UTC)

Database hiccups
You missed db hiccups as a possible reason for article count mismatch. Updating the article count is not done in the same transaction as making the edit, so every now and then, something weird will probably happen and an article will not get counted. Over time, this probably makes a difference. Bawolff (talk) 03:40, 22 May 2015 (UTC)
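The drift described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not MediaWiki's actual code: the function name `simulate_counter_drift` and the `failure_rate` parameter are hypothetical, and the failure rate is made up for illustration. It models the page save and the counter update as two separate steps, where the second step is occasionally lost.

```python
import random


def simulate_counter_drift(num_creations, failure_rate, seed=0):
    """Return (pages_saved, stored_counter) after num_creations page saves.

    Hypothetical model: each page save always commits, but the separate
    article-count update is lost with probability failure_rate.
    """
    rng = random.Random(seed)
    pages_saved = 0
    stored_counter = 0
    for _ in range(num_creations):
        # Step 1: the edit itself is committed.
        pages_saved += 1
        # Step 2: the counter is updated separately and may be lost.
        if rng.random() >= failure_rate:
            stored_counter += 1
    return pages_saved, stored_counter


if __name__ == "__main__":
    saved, counter = simulate_counter_drift(1_000_000, failure_rate=0.0001)
    # A full recount would report `saved`; the stored counter lags behind.
    print("pages saved:", saved)
    print("stored counter:", counter)
    print("drift:", saved - counter)
```

Even a very small per-edit loss rate adds up over millions of page creations, which is consistent with a full recount producing a different number than the running counter.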

Stateless, historically and globally consistent measurements
People interested in the issue of how to measure Wikimedia projects' growth consistently may want to take a look at our work on metrics standardization and some of the principles we proposed to identify robust metrics. As to the issue of quality vs quantity, check out EpochFail's work on m:Research:Measuring_value-added and m:Research:WikiCredit --DarTar (talk) 19:46, 22 May 2015 (UTC)

Disambiguation pages
What is not taken into account for exclusion from the article count, the way redirects are, is the thousands of disambiguation pages, which by definition are also not articles. Since they are already treated differently by MediaWiki, it would be easy to do. -- geraki T L 07:47, 23 May 2015 (UTC)
 * Hmm. That definition seems to be what the English Wikipedia considers an "article" specifically for the purposes of its own guidelines and policies. You say it's "easy" to do, but I'm not sure how quickly pages can be checked in this way; it might be too slow to implement on the English WP, for example. (But IANAD, so I don't know.) - dcljr (talk) 00:07, 24 May 2015 (UTC)
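As a rough illustration of the proposal in this thread, the sketch below counts pages while skipping redirects and, optionally, disambiguation pages. It is not how MediaWiki computes its article count: the `Page` record, its `is_disambiguation` flag, and the `count_articles` function are hypothetical names invented for this example, and the sample titles are made up.

```python
from dataclasses import dataclass


@dataclass
class Page:
    title: str
    namespace: int            # 0 is assumed to be the main (article) namespace
    is_redirect: bool
    is_disambiguation: bool   # hypothetical flag; assumed to be already known


def count_articles(pages, exclude_disambiguation=True):
    """Count main-namespace pages that are neither redirects nor
    (optionally) disambiguation pages."""
    count = 0
    for page in pages:
        if page.namespace != 0:
            continue
        if page.is_redirect:
            continue
        if exclude_disambiguation and page.is_disambiguation:
            continue
        count += 1
    return count


if __name__ == "__main__":
    sample = [
        Page("Mercury (planet)", 0, False, False),
        Page("Mercury (element)", 0, False, False),
        Page("Mercury", 0, False, True),        # disambiguation page
        Page("Quicksilver", 0, True, False),    # redirect
        Page("Talk:Mercury", 1, False, False),  # not in the main namespace
    ]
    print(count_articles(sample, exclude_disambiguation=False))
    print(count_articles(sample, exclude_disambiguation=True))
```

Whether such a check is cheap enough on a large wiki depends on how the disambiguation status is stored and looked up, which this sketch simply assumes is available per page.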

Other projects
Any reasons why Commons and Wikispecies aren't recounted? I would imagine that Commons is tricky because it's more about files and media than pages. But Wikispecies seems straightforward. I do see a potential recounting issue in Wikispecies. A lot of taxon authority pages only contain their nationality, area of classification (e.g. botany) and their birth date and death date (or year). These authority pages will have mainspace articles pointing into them but if they don't have a comma, category, image, or interwiki link, these authority pages will be excluded from the count. OhanaUnitedTalk page 00:32, 26 May 2015 (UTC)