User talk:GPHemsley/Archive 21

Language-population update project
Hi. The 18th edition of Ethnologue just came out, and if we divide up our language articles among us, it won't take long to update them. I would appreciate it if you could help out, even if it's just a few articles (5,000 articles is a lot for just me), but I won't be insulted if you delete this request.

A largely complete list of articles to be updated is at Category:Language articles citing Ethnologue 17. The priority articles are in Category:Language articles with old Ethnologue 17 speaker data. These are the 10% that have population figures at least 25 years old.

Probably 90% of the time, Ethnologue has not changed their figures between the 17th and 18th editions, so all we need to do is change "e17" to "e18" in the reference (ref) field of the language info box. That will change the citation for the article to the current edition. Please put the data in the proper fields, or the info box will flag it as needing editorial review. The other relevant fields are "speakers" (the number of native speakers in all countries), "date" (the date of the reference or census that Ethnologue uses, not the date of Ethnologue!), and sometimes "speakers2". Our convention has been to enter e.g. "1990 census" when a census is used, as other data can be much older than the publication date. Sometimes a citation elsewhere in the article depends on the e17 entry, in which case you will need to change "name=e17" to "name=e18" in the reference tag (assuming the 18th edition still supports the cited claim).
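
For anyone who prefers to script the simple cases, here is a minimal Python sketch of the two substitutions described above. The function name bump_edition and the exact wikitext spellings it matches ("|ref=e17" in the info box, "name=e17" in a reference tag) are assumptions for illustration; check them against the article source before relying on it, and only use it where the 18th edition really does support the same figures.

```python
import re

def bump_edition(wikitext: str, old: str = "e17", new: str = "e18") -> str:
    """Hypothetical helper: switch the Ethnologue edition code in article wikitext."""
    # Info box field, e.g. "| ref = e17" (assumed spelling of the ref field).
    wikitext = re.sub(
        rf"(\|\s*ref\s*=\s*){re.escape(old)}\b",
        lambda m: m.group(1) + new,
        wikitext,
    )
    # Named reference tags, e.g. <ref name=e17/> or <ref name="e17"/>.
    wikitext = re.sub(
        rf'(name\s*=\s*"?){re.escape(old)}\b',
        lambda m: m.group(1) + new,
        wikitext,
    )
    return wikitext
```

The sketch does not touch the "speakers" or "date" fields, which still need to be checked by hand against the new Ethnologue entry.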

Remember, we want the *total* number of native speakers, which is often not the first figure given by Ethnologue. Sometimes the data is too incompatible to add together (e.g. a figure from the 1950s for one country, and a figure from 2006 for another), in which case it should be presented that way. That's one use for the "speakers2" field. If you're not sure, just ask, or skip that article.

Data should not be displayed with more than two, or at most three, significant figures. Sometimes it should be rounded off to just one significant figure, e.g. when some of the component data used by Ethnologue has been approximated with one figure (200,000, 3 million, etc.) and the other data has greater precision. For example, a figure of 200,000 for one country and 4,230 for another is really just 200,000 in total, as the 4,230 is within the rounding margin of the 200,000. If you want to retain the spurious precision of the number in Ethnologue, you might want to use the sigfig template. (The first parameter in this template is the data; the second is the number of figures to round it off to.)
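
To make the rounding arithmetic concrete, here is a small Python sketch. It is not the sigfig template's implementation, just the same idea; the function name round_sigfigs is made up for this example, and it assumes whole-number speaker counts.

```python
def round_sigfigs(n: int, figures: int = 2) -> int:
    """Hypothetical helper: round a whole-number count to `figures` significant figures."""
    if n == 0:
        return 0
    # Position of the leading digit: 0 for 1-9, 1 for 10-99, and so on.
    magnitude = len(str(abs(n))) - 1
    # A negative ndigits rounds to tens, hundreds, etc.
    # Note: Python rounds exact ties to the nearest even digit.
    return int(round(n, figures - 1 - magnitude))

# The example above: 200,000 in one country plus 4,230 in another.
total = 200_000 + 4_230
print(round_sigfigs(total, 1))  # 200000
```

With the default of two figures, round_sigfigs(4230) gives 4200.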

Dates will often need to be a range of all the country data in the Ethnologue article. When entering the date range, I often ignore dates from countries that have only a few percent of the population, as often 10% or so of the population isn't even separately listed by Ethnologue and so is undated anyway.
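
As a rough illustration of how I arrive at a date range, here is a Python sketch. Everything specific in it is an assumption made for the example: the function name date_range, the 5% cutoff standing in for "a few percent", and the "1990–2006" output format. Treat it as a way of thinking about the problem, not a rule.

```python
def date_range(entries, min_share=0.05):
    """Hypothetical helper.

    entries: list of (population, year) pairs, one per country;
             year is None when Ethnologue gives no date.
    min_share: assumed cutoff below which a country's date is ignored.
    """
    total = sum(pop for pop, _ in entries)
    if total == 0:
        return "no date"
    # Keep only dated figures from countries with a meaningful share.
    years = [
        year
        for pop, year in entries
        if year is not None and pop / total >= min_share
    ]
    if not years:
        return "no date"
    lo, hi = min(years), max(years)
    return str(lo) if lo == hi else f"{lo}–{hi}"

# A 3,000-speaker figure from 1975 is about 1% of the total, so it is ignored.
print(date_range([(200_000, 1990), (50_000, 2006), (3_000, 1975)]))  # 1990–2006
```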

If Ethnologue does not provide a date for the bulk of the population, just enter "no date" in the date field. But if the population figure is undated, and hasn't changed between the 17th and 18th editions of Ethnologue, please leave the ref field set to "e17", and consider adding a comment saying so, so that other editors don't change it. In cases like this, the edition of Ethnologue that the data first appeared in may be our only indication of how old it is. We still cite the 14th edition in a couple dozen articles, so our readers can see that the data is getting old.

The articles in the categories linked above are over 90% of the job. There are probably also articles that do not currently cite Ethnologue, but which we might want to update with the 18th edition. I'll need to generate another category to capture those, probably after most of the Ethnologue 17 citations are taken care of.

Jump in at the WP:LANG talk page if you have any comments or concerns. Thanks for any help you can give!

— kwami (talk) 02:42, 4 March 2015 (UTC)