Wikipedia:WikiProject Languages/Open tasks

General

 * Answer requests for comments

Updates
Population data has been mostly updated from Ethnologue 16 to 17. However, an unknown number of articles which did not have the ref field set to "e16" slipped through the cracks. For instance, [//en.wikipedia.org/w/index.php?title=Cumanagota_language&oldid=575515965 Cumanagoto] did not have a ref'd population figure because E16 had mistakenly listed it as extinct. Articles which are not ref'd to Ethnologue could be checked in case E17 has a more recent figure.

User:PotatoBot helps keep ISO redirects in sync with changing WP articles and ISO standards. The results of the latest run are displayed at ISO 639 log and ISO 639 language articles missing.

Names at Spurious languages with asterisks have not been addressed.

Articles to improve: Category:Language articles with unknown population not citing Ethnologue 18

Articles citing previous editions of Ethnologue can be found in the following categories:
 * Category:Language articles citing Ethnologue 8 (empty as of Dec 2022)
 * Category:Language articles citing Ethnologue 9 (empty as of Dec 2022)
 * Category:Language articles citing Ethnologue 10 (empty as of Dec 2022)
 * Category:Language articles citing Ethnologue 11 (empty as of Dec 2022)
 * Category:Language articles citing Ethnologue 12 (empty as of Dec 2022)
 * Category:Language articles citing Ethnologue 13 (acy and twc as of Dec 2022)
 * Category:Language articles citing Ethnologue 14 (nul, yud as of Dec 2022)
 * Category:Language articles citing Ethnologue 15
 * Category:Language articles citing Ethnologue 16
 * Category:Language articles citing Ethnologue 17
 * Category:Language articles citing Ethnologue 18
 * Category:Language articles citing Ethnologue 19
 * Category:Language articles citing Ethnologue 20
 * Category:Language articles citing Ethnologue 21
 * Category:Language articles citing Ethnologue 22
 * Category:Language articles citing Ethnologue 23
 * Category:Language articles citing Ethnologue 24

Articles citing undated versions can be found in:
 * Template:Ethnologue
 * Template:Ethnolink
 * Category:Language articles citing Ethnologue undated

Most should be updated to cite the latest Ethnologue edition or another reliable source. However, references to old editions may remain appropriate in some cases: with undated citations; where an old edition shows the date or range of estimates of the source and that information has been lost from recent editions; or where the latest edition of Ethnologue merely cites an old edition, in which case we should cite the old edition ourselves.

Some articles cite Ethnologue without using a template such as e25. They can be found by searching for strings such as:
 * "Ethnologue: Languages of the World"
 * "ethnologue.com"
 * http://ethnologue.com
 * http://www.ethnologue.com
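
To triage such articles offline, one could scan wikitext for bare Ethnologue mentions that fall outside a citation template. The following is a rough sketch only: it assumes template calls look like `{{e25|...}}`, `{{Ethnologue...}}` or `{{Ethnolink...}}`, the helper name `bare_ethnologue_mentions` and both regular expressions are hypothetical, and nested templates are not handled.

```python
import re

# Template calls treated as proper Ethnologue citations (assumed spellings:
# edition templates such as {{e25|...}}, plus {{Ethnologue...}} and {{Ethnolink...}}).
TEMPLATE_CALL = re.compile(
    r"\{\{\s*(?:e\d+|ethnologue\d*|ethnolink)\b[^{}]*\}\}", re.IGNORECASE
)

# The bare strings listed above: the publication title and the site address,
# with or without a scheme and "www." prefix.
BARE_MENTION = re.compile(
    r"Ethnologue: Languages of the World|(?:https?://)?(?:www\.)?ethnologue\.com",
    re.IGNORECASE,
)


def bare_ethnologue_mentions(wikitext: str) -> list[str]:
    """Return Ethnologue mentions that remain after template calls are removed."""
    without_templates = TEMPLATE_CALL.sub("", wikitext)
    return [match.group(0) for match in BARE_MENTION.finditer(without_templates)]
```

An empty result means every Ethnologue mention found by these patterns sits inside one of the assumed templates; anything returned is a candidate for conversion.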

Short descriptions
All articles should have a short description. As of December 2022, about 1,000 language articles lacked one; they can be found with the search: -hastemplate:"Short description" hastemplate:"Infobox language"
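
The same search can be run through the MediaWiki search API instead of the search box. A minimal sketch, assuming the standard `action=query&list=search` module on en.wikipedia.org; the function name `build_search_url` and the default result limit are illustrative, and the sketch only builds the request URL rather than fetching it.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://en.wikipedia.org/w/api.php"

# Search string copied from the task description above.
MISSING_SHORTDESC_QUERY = (
    '-hastemplate:"Short description" hastemplate:"Infobox language"'
)


def build_search_url(query: str, limit: int = 50) -> str:
    """Build a search-API URL for the given query, restricted to article space."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": query,
        "srnamespace": 0,  # main (article) namespace only
        "srlimit": limit,
        "format": "json",
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_search_url(MISSING_SHORTDESC_QUERY))
```

The printed URL can be opened in a browser or passed to any HTTP client; paging through more than one batch of results would need the API's continuation parameters, which this sketch leaves out.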

Articles to be created

 * Keftiu language (Egyptian records of what may be Minoan; WP-de has an article)

Red links should either be redirected or have their own articles.


 * 1 Eastern Peripheral Nahuatl (Chiapas Nahuatl)
 * 2 Zapotec in List of endangered languages in Mexico (locations, index) [checked all small locales of Zap. of S. Mtns against Ethn; no matches. Checked large locales of N. Zap. of Valleys; no matches.] [add link to Mixtec of the Puebla-Oaxaca border if INALI name directed.]
 * 6 INALI names in Nahuatl dialects (need to also link plain-text English names in other classifications)
 * 2 Zenati languages
 * 1 Indo-Pacific languages (Toro; dbl check Isurava & Tagota. Multitree identities are unreliable)
 * Shümom language (in Bamum people and related articles)
 * 5 Template:Language Endangerment status

99.9% of ISO language names have articles, though not always one-to-one (e.g. Fulani, Zhuang, and Mazatec); the 0.1% which do not are spurious, dubious, or insufficiently attested to justify their own article, and are redirected to an article stating that.

The lists below are of self-links in our articles, language names from various sources which do not have articles or redirects, and suspicious cases to keep track of.
 * Lists for evaluation


 * INALI
 * 48 at INALI names for Mexican languages (27 Mixtec & 6 Nahuatl to be reviewed; 12 Zapotec & 3 others attempted). Even blue links may be wrong, due to confusion of similar town names or misidentification at Ethnologue.


 * AIATSIS
 * 7 potential languages w data. The AIATSIS db is periodically updated, with new languages confirmed.


 * Ethnologue 11
 * Holima ["near Dobu" – misreading of Molima?], Waelulu ["existence unconfirmed"; taken from V&V]


 * Voegelin (1977): 36 red-linked names; list doesn't bother with red links for what Loukotka says is unattested.
 * Blue links have not been checked. Many are presumably inadvertent homonyms rather than the language intended by V&V.


 * Ruhlen (1987)
 * S.Am.: 12 (see key) extremely obscure names of mostly unattested languages, not even listed in Campbell & Grondona 2012, and for only a few does Loukotka say anything other than 'unknown'. Those not found in Loukotka might be copy errors.
 * There are also at least half a dozen names in Ruhlen which take you to what is apparently the wrong article. One is a typo, 3 are unidentified, and 2 have perhaps just been reclassified.


 * Campbell & Grondona
 * 0 in List of unclassified languages of North America & List of unclassified languages of South America: There will never be anything to write on most of these, so they've been converted to self-links.


 * Linguist List local-use ISO


 * Glottolog: 25 at Talk:Glottolog; 93 more at WikiProject Languages/Glottolog languages without ISO codes (both for Glottolog 2.2)
 * WikiProject Languages/Glottolog 2.2 language names (see Talk for red links)
 * WikiProject Languages/Glottolog 2.3 language names (see Talk for red links)
 * WikiProject Languages/Glottolog 3.3 language names


 * Identity suspect: Nshi, Sotatipo, Lui, Pasto (wrong ISO?), Kanamarí and Karipuná (contradicted by E17), Gulei (marked "?" in list), Sonde, Ngoni, Pretoria-Tsonga (marked "§" in list) & Mangala


 * Circular links of ISO names with summary data: Loloish, Qiangic (3 listed + old name Pingfang, which I can't ID), unclassified Asian (Bhatola: presumably a Gond dialect, Warduji: presumably a Persian dialect), Hindi (Ghera: Pakistani enclave of unidentified Indian language), conlang codes (Kotava, Romanova: old articles were deleted as not-notable)


 * No 1-to-1 correspondence to ISO: Tracking only; no need to fix.
 * Gbaya language (Central African Republic), Gbaya language (Sudan), Syriac language


 * ISO languages without info box: Typically because there are problems in defining the language. Tracking only; no need to fix.
 * Minor languages covered in family article: Loloish (4)
 * Language uncertain: Mina, Majhwar
 * Rd. to script or history article: Epi-Olmec (undeciphered), Ancient Zapotec, Middle Korean
 * Rd. to spurious-language article: Parsi-Dari, Parsi, Tapeba


 * Newly discovered or unattested languages without ISO codes
 * Lubu (unattested and extinct)
 * Cuyama (unattested and extinct)

Requests for expansion
Images for articles in Category:Wikipedia requested photographs of languages.


 * Standard Moroccan Tamazight
 * Chilean Sign Language
 * Chadian Sign Language
 * Brazilian Sign Language
 * Indonesian sign languages

Requests for attention

 * We may need to distinguish Fernando Poo Creole English from Pichinglis, per talk page
 * we may want to split Tsotsitaal (Camtho/Shalambombo is Zulu-based, not Afrikaans)
 * review Haiǁom people (article history inadvertently deleted and never restored)
 * need to work out Língua Geral vs. Tupi language; holding off on info boxes at Lingua Geral and Língua Geral Paulista until then
 * Old and Middle Greenlandic language may be fictions, see Talk:Middle Greenlandic language
 * Verify if Carpathian Rusyn language should be a separate article from Rusyn language.

(no article Ashéninka people; Keres functions as the lang article but reads as a family article)

Tagged categories

 * Category:Ill-formatted IPAc-en transclusions (catches obvious screw-ups to the IPAc-en template)
 * Category:Articles needing IPA cleanup
 * Category:Language articles needing attention
 * Category:Languages articles needing expert attention

Category:Articles lacking sources
Only language varieties are included here. Subjects such as 'French language in Jordan' and 'Westernized Chinese language', though in bad shape, are not listed because they would not be representative of the many unreferenced articles that are not about specific varieties.
 * 2004–2014: (only articles with 'language', 'dialect', 'creole', or 'pidgin' in name are included; distilled from an insane number of articles)
 * English: Jewish English languages
 * Germanic: Central Franconian dialects, Eastphalian dialect, Hamburgisch dialect, Norwegian dialects, Orsamål dialect, Ripuarian language, Sognamål dialect
 * Romance: Chipilo Venetian dialect, Comasco-Lecchese dialects, Fornes dialects, Pavese dialect, Sabino dialect, Sutsilvan dialects (Romansh)
 * Slavic: Debar dialect, Reka dialect, Strumica dialect
 * Maltese: Qormi dialect, Żejtun dialect
 * Chinese: Luoyang dialect, Mango dialect, Qihai dialect, Weihai dialect, Ningbo dialect, Ganyu dialect, Fu'an dialect, Xuzhou dialect
 * other: Kfar Kama Adyghe dialect (Adyghe), Enuani dialect (Igbo), Thanjavur Marathi dialect, South Korean standard language


 * 2015: (thru Jun 23) Harbin dialect, Qingdao dialect, Southern Rural dialects, Dutch-based creole languages, Shilluk language, Old Montagnais language

Category:Orphaned articles
(same search terms as missing sources)
 * Ordek-Burnu language (moved to 'stele')

Open ISO issues
The following ISO requests for new languages from previous years were still open in 2016 Jan. The articles should be updated if they are accepted. (See the current list, reviewed to 2021-02.)

 * 2020-039: tki (Iraqi Turkman language)
 * 2020-009: nww (Ndwewe language)
 * 2019-007: rrm (Moriori language)
 * 2011-041: vsn (Vedic Sanskrit)
 * 2009-081: elr (Katharevousa Greek)
 * 2009-060: ecg (Ecclesiastical Greek)
 * 2006-084: gkm (Medieval Greek)

Articles proposed for deletion
including WP:AFD, WP:PROD and other processes

Articles to watch
The following are language articles which come under repeated POV attack, often for ethnic or nationalistic reasons. Feel free to add ones you've noticed, and to remove languages which have not been a problem for some time. That way, if one of us drops out from editing, the articles we've been watching hopefully won't go to pot.
 * Population inflation: Arabic (2015.10), Assamese, Azeri, Balochi, Bengali (2015.10), Bulgarian (2015.10), Cantonese (we have no estimate), Cherokee (2015.10), Egyptian Arabic, French, German, Greek, Gujarati, Hebrew, Hungarian, Italian, Korean, Kurdish, Nepali, Oromo, Portuguese, Angolan Portuguese (60% is an exaggeration, per ELL2), Punjabi (2015.10), Sindhi, Tajik, Tamil, Tati, Turkish, Ukrainian, Yue, many Indic languages and dialects being pushed as separate languages. Many of these will be caught by checking the top 100 at List of languages by number of native speakers or List of languages by total number of speakers.
 * (Note: Ethnologue 17 and the Swedish Nationalencyklopedin use Indian census data, which is not a RS because it does not have a consistent definition of Hindi. For example, part of the Awadhi population is listed under Awadhi, but most is counted as Hindi. This problem is acknowledged in the presentation of the census results, but has gotten lost in secondary sources.)


 * Serbo-Croatian & Croatian (subject to ARBMAC)
 * Saraiki dialect, Punjabi dialects, and "Panjistani" (requires text searches to purge repeated additions of contradictory claims of "Panjistani" to multiple articles)
 * Southern Luri language. It may be worthwhile splitting the Luri article, but so far the attempts to do so have been incompetent and motivated by OR redefinition of the language. The present description of the two varieties in the Luri article is so intertwined that splitting them would create something close to a content fork. — kwami (talk) 02:32, 4 September 2015 (UTC)
 * Assyrian Neo-Aramaic and Chaldean Neo-Aramaic, along with the ethnic articles. A seemingly chronic ethnic dispute.
 * Luganda and Baganda: deletion of ISO name
 * Misleading maps: Many national languages have had maps with half the world filled in because of emigration, with no apparent standard for what counts as a speaking population. Most of these will be caught by checking the top 100 at List of languages by number of native speakers.