User talk:Smallbones/Quality4by4

Explanation for lower starting score for Geography articles?
I'm curious if you've thought much about why Geography articles have lower starting quality than the other categories. I remember reading about rambot. Any idea what categories have the most bot-created stubs? -- Evoapps (talk) 21:15, 11 April 2016 (UTC)
 * In my incomplete & biased experience creating some geography articles (specifically those related to Ethiopia), the problem lies in not having enough material. It's very tempting to write a series of articles about the settlements (or rivers, or mountains, or roads, etc.) of a given country or administrative subunit, & it's not that hard to compile a list of them. And it's only a little harder to find some basic statistics about those settlements, such as location, administrative configuration, & population; even Ethiopia, which is not known for Internet savvy, has its census results online. After that point, it gets harder to find material on these geographic objects, sometimes exponentially harder. Yes, many cities & towns have well-documented histories, but for many more (as in 3-10 times more) either that history is drawn from sources Wikipedians would consider unreliable, or it doesn't exist at all. There were times when finding enough historical material to write a single sentence about a town in Ethiopia felt like a major accomplishment. (And one that I doubt many were aware of -- which was why I eventually drifted away from contributing to that subject.) And then a lot of settlements have next to no social or economic information available about them. In short, geographical articles are amazingly easy to start, but too often amazingly difficult to develop beyond Stub class. -- llywrch (talk) 23:26, 19 April 2016 (UTC)


 * This is fascinating -- thank you for your insight on the issue. As it relates to measuring article quality: when you read an incomplete article, are you able to get a sense of how much of that incompleteness is due to the unavailability of editors versus the difficulty of research? An alternative is that it depends on domain expertise, i.e., having written geography articles yourself. -- Evoapps (talk) 01:42, 20 April 2016 (UTC)
 * IMHO there are two ways to determine whether an article is incomplete. First & most reliable is to be familiar with the subject -- not necessarily an expert, but familiar enough that one knows the standard references, what the important topics are, & the current thinking about it -- & evaluate the article that way. An example would be articles on Classical Greece & Rome, a subject I enjoy reading about. One of the misconceptions about this topic is that there have been few new thoughts about many of its subjects: after all, the fact that Julius Caesar was assassinated on 15 March 44 BC hasn't changed in centuries, & we can be assured that it won't change in the near or distant future. However, the way we look at these subjects has changed: a modern Classicist is probably more likely to look at the social & economic effects of the assassination of Julius Caesar than to try to investigate the political decisions that led to his death -- which have probably been exhaustively studied. Things do change in the study of the history of ancient Greece & Rome; it's just that because there are fewer historians in this field than in others -- say, modern American history -- the change is notably slower. The point of my example is that if an article fails to cite sources more recent than, say, 1900, then it is clearly incomplete. (And I have found articles that cite 18th- & 19th-century sources, apparently because a copy was available online. Sad to say.) And if one is at all familiar with ancient history, one can also identify other topics that are overlooked: for example, I just left notes on two articles asking about the etymology of those placenames -- a topic that, knowing a little about not only Classical studies but placename research in general, I know has content available in the secondary literature. It may be a little difficult for someone who is not familiar with this field to research, but the information is out there. 
The second way to determine if an article is incomplete is to ask, "Does this article provide the information that a reader would reasonably expect to be there?" (This is my admittedly subjective method of determining the difference between a stub article & a start class article.) Consider geographical articles: a reader of Wikipedia who looks for an article on a settlement would reasonably expect to be told where that settlement is. That reader would also expect to find some other information -- the population, details of the history of the settlement, details of the people, what can be found there, etc. -- to varying degrees. For example, one would expect to find a mention of Tammany Hall in the Wikipedia article on New York City, but a 5,000 word discussion of local politics in an article about an African village would be surprising & excessive. But even an article about an African village would be improved if it contained something about what is there -- e.g., does it have a church? A mosque? A mission? A traditional shrine? And to repeat my earlier comment, it can be difficult from several thousand miles away to state what is there. FWIW, I remember reading an article in an academic journal lamenting that there is no public list of all religious missions in any African country, let alone all of Africa. Wikipedia can only report what has been published, & unfortunately only experience can determine what has been published on different topics. -- llywrch (talk) 17:16, 20 April 2016 (UTC)

So sorry that I missed this conversation earlier. I think the explanation below is consistent with Llywrch's, but adds a bit.

Notice that there are 3.8 times more GEO, Eastern Hemisphere than GEO, Western Hemisphere articles. The sample used here reflects that (see the data table at the bottom of the user page). GEO, E articles are much worse than GEO, W articles on average. This is probably because of bots being used to start articles, but also because there are more regular editors living in the US and Canada than almost anywhere in the East. UK GEO articles are more like GEO, W articles. South American GEO articles are more like GEO, E articles. The anglophone countries simply have more editors interested in them and more info in their own language, plus, as pointed out above, better internet access. I've thought of testing the "more editors interested" hypothesis by comparing community population vs. article size in the US, but people would probably think it is too obvious -- bigger towns get bigger articles. So I couldn't really get at "towns with more interested English-language editors get bigger articles."

Maybe the takeaway here is that we should ask Poles to translate their articles on Polish towns to English, Russians to translate their articles on Russian places to English, etc. Smallbones( smalltalk ) 01:24, 22 April 2016 (UTC)
 * Interesting image. One conclusion I draw from it is that there are a lot of geographical & biographical articles because they are easy to start -- but I suspect the share of biographical articles that are stubs is equivalent to that of geographical ones, because they are just as hard to develop into better articles. (For example, there are hundreds of ancient Egyptian pharaohs, but many are only names to us. And then there are sports figures who just meet the bar for notability, but about whom there is simply no further material.) Hopefully someday Wikipedia will reach the point where we can start rationalizing these stubs & deciding to merge them, delete them, or accept that they will never be more than 4 sentences in size. As for importing articles from other Wikipedias, I would hope that would work, but I'm honestly surprised how often Wikipedias in languages where there are better sources than in English -- e.g., French & German for ancient Egypt & ancient Greece & Rome -- are translations of English Wikipedia articles. I guess for some people it's easier to translate from English than it is to research in their native tongue. :-/ llywrch (talk) 22:14, 22 April 2016 (UTC)