User:R. fiend/A Million Articles

It is all but certain that, before long, Wikipedia will have a million articles.

This is not exactly a good thing.

Let me explain. Wikipedia's size is an asset, to an extent. Britannica has a bit over 10% of this total, but, of course, size isn't everything. When it comes right down to it, Britannica is, and likely always will be, more reliable, and for an encyclopedia, that's a very important attribute. The "Wiki-" is nice and all, but the "-pedia" is what's really important. And having a million articles, in this sense, is something of a liability.

First of all, let me clarify. Wikipedia is close to having a million entries; it is far from having a million articles. The most specific information I have on this comes from a study I recently did. It's imperfect, but, as far as I know, it's the best we have, and it has been largely substantiated by other smaller studies. It can be found here. According to this, only about 70% can qualify as articles. The others are either substubs (too short; a sentence is not an article), disambiguation pages (navigational devices, not articles), lists or charts (not necessarily bad, but not meeting the strict definition of an article either), or pages with serious problems (needing major cleanup, having serious credibility problems, or ones that have fallen through the cracks and should have been deleted). The numbers are somewhat out of date, as Wikipedia has grown considerably in just a few months, but I have no reason to believe the percentages have changed dramatically for the better (and circumstantial evidence that they have changed for the worse). The estimated 12,352 deletables is a serious problem right there. I'm sure some could be completely overhauled and turned into decent little articles, but until they are, they are a liability. Wikipedia is just too big to be monitored by the good editors we have. It's unfortunate, but true. Vandalism goes unspotted for months at times, new entries are added by the truckload, and there just isn't the time to verify all of them. Wikipedia's popularity has increased greatly (which is good) but that has led to thousands of people adding anything that pops into their heads, and leaving us to clean up after them (bad). Who has the time (or the ability; Google is a lazy and imperfect tool) to check every new entry? If something is highly questionable, we have to go through a somewhat complicated and rather overloaded deletion process. An admin can spend hours just taking care of speedies.

The Seigenthaler incident highlights some of these problems. It's not surprising that his entry went unnoticed for so many months; it was lost in a sea of some 700,000 articles. It took him looking for himself to notice the problem. It was an isolated incident, but it is almost a statistical certainty that there are other such articles floating about, with similar incidents waiting to happen. When one does happen, it may be worse for Wikipedia. Somehow the word managed to get out to some that the problem was "fixed", when the only consequent change was minimal; it would not have prevented the initial incident, even if the perpetrator had not bothered to register (which he could have done in seconds anyway). Its happening again, when many people believed they were told the necessary precautions had been taken, will be a very difficult hurdle to clear. One million articles is far too large a playground for vandalism and unintentional mistakes to hide in.

One really bad article (and I don't even mean front-page-headline/potential-lawsuit/libelous bad) can do more to harm the project than a hundred good articles can do to repair it. True, there was an article in Nature magazine comparing Wikipedia to Britannica on their science articles, and Wikipedia didn't do much worse. But that comparison was confined to science articles, and there still were some substantial errors (and many smaller ones). The size of Wikipedia is rapidly rising, but the average quality seems to be dropping. It takes time to write a good article (and I'm not talking featured article or anything, just decent) and we have thousands of people who just aren't putting any effort into it at all. From my recent observations, we are becoming plagued with substubs. There was a time when you could do a 10 random article search and find one pretty bad article; in the last one I did, I found only one that I considered quite good. In the time it takes several editors to argue about whether a featured article candidate should have another image, or if the opening section is long enough, dozens of new additions have been made, nearly all of which need real work or removal.

So what can be done about this? Very little, unfortunately. If, as a group, Wikipedians were to become more concerned with quality and less with quantity, we might have a good start. The problem is that it's easy to make a bad edit; making good ones takes effort. But with effort we can make progress. We can realize that as Wikipedia grows, so does its misinformation. We can start questioning the idea that keeping fluff does no harm, and realize that it is a small part of the greater problem of a completely unmanageable Wikipedia, being overtaken by entropy and about to be crushed under its own weight. We can stop having five separate articles which could all be covered in one; it's four fewer articles to have to watch. We can try to have honest discussions about encyclopedic standards, instead of this attitude that anything that can be verified gets its own article. And we can start being more serious about verification. "Verifiable" should not mean "someone could potentially research this to see whether it's true or not"; it should mean "we've researched this and it is true".

Quality control is nearly impossible to maintain for one million entries, and while I'm sure the "one millionth article milestone" will be a big event, I won't be celebrating, which is sort of sad, really. I suppose there will be a press release in which the "millionth article" is identified, circulated, and celebrated, and I bet it will be a featured article-level entry on an unquestionably encyclopedic topic (like The early history of Benin). I suppose it's an open secret that in such cases the article is hand selected, being representative of the millionth article, even though with constant additions and deletions we can't really tell exactly when one million was reached (and if you could, it's much more likely to be something like Sid Pluthroo: "an awesome guy from east staley high school. joined the swim team in 04."). So be it. That's the way it's done, disingenuous as it seems.