Wikipedia talk:Language recognition chart

English
How exactly does "DFGLNQRUVWYZ ... and no other" = English? Brooklyn Nellie (Nricardo) 02:31, Mar 17, 2004 (UTC)
 * The author meant that English uses the Latin alphabet, with no further letters or diacriticals. I'll change that to be the entire Latin (uppercase) alphabet. Lisa Paul 07:16, 27 Jun 2004 (UTC)
 * It was my intention to include only characteristic letters (I excluded letters of the Latin alphabet which look the same as letters of the Cyrillic or Greek alphabet). Nikola 04:30, 12 Jul 2004 (UTC)
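
The exclusion described above can be sketched as a simple set difference. This is only an illustration: the lookalike list is an assumption, not taken from the chart. It covers Cyrillic lookalikes only (including Serbian Ј and Macedonian Ѕ), because N, Y and Z also resemble Greek capitals yet appear in the quoted string.

```python
import string

# Assumption: Latin capitals that have a near-identical Cyrillic capital.
# This list is illustrative and is not taken from the chart itself.
CYRILLIC_LOOKALIKES = set("ABCEHIJKMOPSTX")


def characteristic_latin_letters():
    """Return the Latin capitals that have no lookalike in the set above."""
    return "".join(
        c for c in string.ascii_uppercase if c not in CYRILLIC_LOOKALIKES
    )


print(characteristic_latin_letters())  # prints DFGLNQRUVWYZ
```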

A resource
This page attempts something similar to the Wikipedia chart and may have info not already on Wikipedia, if someone wants to compare. Also, this discussion in a LiveJournal community lists corrections and additions to the aforementioned web page.

Devanagari and other Indic scripts?
I don't know a darned thing about Asian alphabets, but maybe somebody could put in a sample of Hindi, Nepali, etc. Also nice would be a sample of Thai. - Lisa Paul 07:52, 27 Jun 2004 (UTC)


 * You may be interested in omniglot -- SS 20:05, 22 Aug 2004 (UTC)

Klingon? Are you serious?
Come on people. Does Wikipedia have to start to look like a dumping ground for losers? (yeah, I know I'm one, but still, I try to hide it from time to time) Nelson Ricardo 01:30, Aug 23, 2004 (UTC)

Klingon is being taken more seriously today than you may realise. It wouldn't surprise me if, in a few years, it became more widely spoken than Esperanto. AdmN 01:52, 23 Aug 2004 (UTC)

The purpose of this page was originally to enable users to see which language a new article is written in. As someone might write an article in Klingon... Nikola 08:16, 23 Aug 2004 (UTC)

Well, at the top it says "document" rather than "Wikipedia article"... and there are certainly documents written in Klingon. I wouldn't expect anyone to write an en.wikipedia article in Klingon, but then I wouldn't expect anyone to write an en.wikipedia article in French either. Perhaps this should be moved out of the Wikipedia namespace and mingled with the general articles. It's certainly fascinating, and it may be generally useful. And no, I did not add Klingon as a joke. -- SS 16:08, 23 Aug 2004 (UTC)

Namespace
I moved this page to the main namespace. I understand the initial motivation, but I don't see why we should hide this useful information from the general public. -- Taku 05:17, Dec 5, 2004 (UTC)
 * This looks to me like original research that is not appropriate for the Main namespace. Russ Blau (talk) July 6, 2005 17:03 (UTC)
 * I second the utility of this page, although it needs work. --babbage 13:08, 16 January 2007 (UTC)

Armenian/Georgian Alphabets
All I see is a bunch of question marks for the Armenian/Georgian alphabets. MonsterOfTheLake 17:15, 28 Dec 2004 (UTC)
 * Your computer doesn't have support for these fonts. Mikkalai 20:24, 28 Dec 2004 (UTC)

Languages using Arabic or Arabic-derived script
How does one tell apart Arabic, Farsi, and Urdu?


 * My totally non-expert observations indicate that Persian (Farsi) writing is more 'broken up' than Arabic, if that makes any sense. — Trilobite (Talk) 06:34, 8 Mar 2005 (UTC)
 * See Persian alphabet. Basically, Farsi, Kurdish (when not written in the Latin alphabet), Pashto, and other languages use an "extended" Arabic alphabet, including letters for sounds which are present in their language, but not in Arabic. Urdu, well, just look at it, because I can't describe that nearly as well. -Fsotrain09 01:04, 28 December 2006 (UTC)

Although both use the Arabic script, Arabic writing is much flatter, in that all letters sit upon the line, e.g. اللغة العربية (the Arabic language).


 * That's one style of Arabic writing; it's not universal. —Tamfang (talk) 18:39, 20 October 2012 (UTC)

Greek
I think the article spends too much time describing different ways of writing Greek, and not enough on the grammar and vocabulary. If the purpose here is to quickly figure out which language something is written in, so a translator can be contacted, there is no need to get into the details of monotonic vs. polytonic. Like Hebrew, Greek is instantly recognizable; no other alphabet is sufficiently close for confusion. (Most Greek that shows up here is in the Greek alphabet, not in "greeklish".)

However, it would be worth recognising the differences between Ancient and Modern Greek. While an educated native speaker of Greek can usually understand the ancient language, an ancient text would be better translated by a classicist. On the other hand, a modern text would be better translated by a native speaker, who is more aware of current cultural references.

Segv11 (talk/contribs) 22:53, 14 January 2006 (UTC)

Missing Greek characters
I think some of the upper-case Greek letters are missing from the list. Could someone fix that? I'm not good at generating non-Latin characters on my keyboard. Truthanado 02:40 23 March 2007

Done. Also added a lowercase "o" that was missing. Should we perhaps also add the accented characters (ά etc.), as we are after characters and not letters?

Arabic
The easiest way to tell that a text in Arabic script isn't Arabic is the extra letters that are found in Persian. Arabic has no pe (P) and no jhe, although jhe is a rare character in Persian (Farsi) too. It's hard to use words as a reference, apart from the obvious ones such as the article al in Arabic and the pronouns in each language.

The four letters:

پ pe ژ jhe چ che گ gaf

are missing from Arabic and found in Farsi. Pe, che, and gaf are very common.

Arabic uses the article "al" (the) very, VERY often, and it is the easiest way to confirm that something is an Arabic text.

It can appear in different forms, though:

ال الا

Spotting those letters at the beginning of a word (its right-hand side) is the easiest method of recognizing Arabic words.

examples

Also, personal pronouns in Arabic:

انا انت هو هي نحن انتم انتن هم هن

And common pronouns in Farsi:

من تو شما انها

Common verb endings in Farsi:

کرد شد است ام

This should be enough to give you an insight into recognizing Farsi and Arabic texts. The challenging one to identify is Urdu; someone should figure out a method of recognizing it, because it's so different from the other languages written in the Arabic script. About half of the vocabulary of any language written in the Arabic script is Arabic, EXCEPT Urdu, which is probably 70% Hindi.
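
The letter and article cues described above can be turned into a rough check. What follows is a minimal sketch rather than a tested classifier: the function name `guess_arabic_or_persian` and its return labels are hypothetical. Urdu, Kurdish and Pashto use the same extra letters, so a "persian-like" result only suggests that the text is not Arabic.

```python
# The four letters listed above: pe, jhe, che, gaf.
EXTRA_LETTERS = {"\u067e", "\u0698", "\u0686", "\u06af"}

# The Arabic definite article "al" (alef + lam), which begins a word.
AL_PREFIX = "\u0627\u0644"


def guess_arabic_or_persian(text):
    """Return 'persian-like', 'arabic' or 'unknown' from two simple cues."""
    # Cue 1: any of the extra letters points away from Arabic.
    if any(ch in EXTRA_LETTERS for ch in text):
        return "persian-like"
    # Cue 2: a word that starts with the article "al" points to Arabic.
    for word in text.split():
        if word.startswith(AL_PREFIX) and len(word) > len(AL_PREFIX):
            return "arabic"
    return "unknown"


print(guess_arabic_or_persian("اللغة العربية"))
```

Checking the extra letters first is a design choice: Persian texts contain many Arabic loanwords, so the article alone is the weaker of the two signals.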

Afrikaans
I'd say that this article is wrong about Afrikaans using no other letters. For example the Afrikaans word for 'morning' is môre, the word for 'bird' is voël, 'world' is wêreld and 'bridges' are brûe.

encyclopedic?
While the page is certainly useful, it's highly unencyclopedic, and I don't know if Wikipedia is a good place for it. If it is, it should at least start with a paragraph of discussion... Also note that recognising languages by squiggles is not always useful online, as many people still write in squiggleless ASCII due to their local technical limitations or a wish to communicate with others who might be encoding-challenged.

This page was never intended to be encyclopedic, but was intended almost as a help page to accompany Pages needing translation into English to assist people to identify what language an article was written in. It was originally created in the Wikipedia namespace and moved to the main encyclopedia as per the discussion above, and therefore I'm going to remove the unencyclopedic template. --Sepa 17:46, 18 April 2006 (UTC)

can only be Vietnamese?
Although the letters with a dot below mainly occur in Vietnamese, ẹ and ọ are also used in the Yorùbá language, ị and ụ are used in the Igbo language, and ạ is used in the Rotuman language. - ✉ Hello World! 03:27, 28 October 2007 (UTC)
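
The point that a dot-below letter narrows the candidates without settling the language can be sketched as a lookup. The table holds only the letters and languages named in this comment, so it is an assumption and far from complete. The name `candidate_languages` is hypothetical.

```python
import unicodedata

# Letters and languages exactly as listed in the comment above (not exhaustive).
DOT_BELOW = {
    "\u1ea1": {"Vietnamese", "Rotuman"},  # a with dot below
    "\u1eb9": {"Vietnamese", "Yoruba"},   # e with dot below
    "\u1ecd": {"Vietnamese", "Yoruba"},   # o with dot below
    "\u1ecb": {"Vietnamese", "Igbo"},     # i with dot below
    "\u1ee5": {"Vietnamese", "Igbo"},     # u with dot below
}


def candidate_languages(text):
    """Intersect the language sets of every dot-below letter found in text."""
    # NFC turns "letter + combining dot below" into the precomposed letter.
    normalized = unicodedata.normalize("NFC", text.lower())
    candidates = None
    for ch in normalized:
        langs = DOT_BELOW.get(ch)
        if langs is None:
            continue
        candidates = set(langs) if candidates is None else candidates & langs
    # An empty set means either no dot-below letter or no common language.
    return candidates if candidates is not None else set()


print(sorted(candidate_languages("\u1eb9 \u1ecd")))
```

A text that mixes, say, ạ and ị leaves only Vietnamese in this table, while ẹ and ọ alone leave both Vietnamese and Yoruba as candidates.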

Suggested French and Spanish additions
French: W only used in loan words [can only think of whisky and proper nouns]. Spanish: ll common at the starts of words.

SimonTrew (talk) 15:17, 4 March 2009 (UTC)

Hawaii
I added a link to Hawaiian language under basic Latin alphabet. I now see that it is mentioned, but not linked, a little farther down, where it says that it "may" have overscores in texts. I am happy to revert, undo, whatever, though I hope it would be referenced in one place or the other; I'm just not sure which should go (or I'd have just undone it myself). Advice please. SimonTrew (talk) 02:46, 15 April 2009 (UTC)

Namespace, 2nd
The page lies in the WP-Namespace currently, but according to the discussion a few items above, it should actually be in the main namespace (which I'd also prefer). For some reason, it was moved back here with the strange reason that "it causes some problems as an article". Well it's not an article after all, it's a table or a list anyway. After all, Language recognition chart still links here, which is quite a no-no after all. I suggest moving it back to main as in the discussion above. --PaterMcFly (talk) 10:22, 24 June 2009 (UTC)

Farsi?
I do see Persian, but no mention iirc of Farsi. What about changing to "Persian (Farsi)", or "Farsi (Persian)"? Nikevich (talk) 05:48, 11 October 2010 (UTC)
 * it's called Persian in English. Choyoołʼįįhí:Seb az86556 > haneʼ 05:52, 11 October 2010 (UTC)
 * Surely, but I also see "Farsi" in news stories and the like. "Persian" seems to be more traditional. However, I'm only suggesting, not passionate at all. (^_^) Regards, Nikevich (talk) 06:36, 11 October 2010 (UTC)

Maltese?
Should Maltese be included, or does it not have enough speakers? Nikevich (talk) 05:48, 11 October 2010 (UTC)

alphabet in article with no associated language
There's a bullet in the Characters section with A, Ą, Ã, B, C, D, E, É, Ë, F, G, H, I, J, K, L, Ł, M, N, Ń, O, Ò, Ó, Ô, P, R, S, T, U, Ù, W, Y, Z, Ż that has no language mentioned. If no one knows which language it is supposed to indicate, I'll take it out.--Wikimedes (talk) 19:58, 22 July 2013 (UTC)


 * I have identified it as the Kashubian alphabet.
 * —Wavelength (talk) 20:54, 22 July 2013 (UTC)
 * Very good. Thanks.--Wikimedes (talk) 05:06, 23 July 2013 (UTC)

Le
Is there a reason why a section is titled " French (le français)" and not " French (français)"? We don't give the article for any other language name. Apokrif (talk) 19:17, 18 November 2015 (UTC)

Syriac
I have added the Syriac alphabet. Bulahyatain (talk) 23:21, 22 January 2019 (UTC)

Added Maltese
I added a Maltese section. Bulahyatain (talk) 23:38, 22 January 2019 (UTC)

Language recognition chart
Hello sir. I want to add my language to Wikipedia:Language recognition chart and i don't know how. Will u do it for me ... The language is Tamazight or (berber) ... Plzz 😟 Massinissa014 (talk) 12:28, 29 September 2021 (UTC)

Wikipedia:Language recognition chart
Hello sir. I want to add my language to Wikipedia:Language recognition chart and i don't know how... Can u do it for me plllz. The language is Tamazight (ⵜⴰⵎⴰⵣⵉⵖⵜ); it's called berber too ... Plz add it men :) and thank you Massinissa014 (talk) 12:32, 29 September 2021 (UTC)
 * I'll look into taking care of it, at least to present the alphabet, probably in the next few days. Largoplazo (talk) 14:26, 29 September 2021 (UTC)
 * Sorry for the delay, but I just added it to the end of the Characters section. Please check to see whether I got it right. Largoplazo (talk) 23:16, 29 April 2022 (UTC)

Add more languages that use Cyrillic
Here is a list with more languages that use Cyrillic and how to distinguish them; maybe someone could help me add them: https://www.quora.com/How-do-I-tell-the-difference-between-languages-that-use-cyrillic-script — Preceding unsigned comment added by Vloxxity (talk • contribs) 21:48, 15 January 2023 (UTC)

Removal of extraneous information in French section
It was pointed out to me that the French section contains a fair bit of phonological and historical orthographic information, which isn't really useful for a page whose scope is to "help [...] determine the language in which a text is written."

I've boldly (WP:BOLD) gone ahead and removed this information, to refocus this section on describing more "naïvely" how to identify French. While the removed information is in the history, for convenience, please find below the points that I've significantly modified:


 * Common digrams for vowels, that either were historically diphthongs or long (au, ai, ei, ou, or final -ez), or are nasalized (an, en, in, on and more rarely un, where the n is muted to an m before b, p or m) possibly surrounded by mute letters for longer polygrams (e.g. eau, ein, ain, but oin is a common diphthong).
 * Common digrams as well for some consonants (ch, rarely sh, gu-, gu-) or semi-consonants (-ill-)
 * Final consonants of words are generally mute (notably s), except to form vocalic digrams.

Laogeodritt (Talk) 00:04, 23 January 2023 (UTC)

French à

 * Accented letters: [...] à only in the word à and at end of words.

What French word (other than à) has final à? —Tamfang (talk) 04:01, 9 February 2023 (UTC)
 * Là, çà (as in çà et là), delà, deçà, déjà, holà, voilà. Largoplazo (talk) 04:16, 9 February 2023 (UTC)
 * D'oh, I thought of là just as I clicked the notice of a change to this page. —Tamfang (talk) 04:37, 9 February 2023 (UTC)
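
The quoted rule (à only as the word à or at the end of a word) can be checked mechanically against a sample. This is a minimal sketch with a hypothetical helper name, `a_grave_violations`. It treats any run of letters as a word, so an elided form such as jusqu'à is split at the apostrophe.

```python
import re
import unicodedata

A_GRAVE = "\u00e0"  # the letter a with grave accent


def a_grave_violations(text):
    """Return the words in which a-grave occurs anywhere but the last letter."""
    # NFC makes sure "a + combining grave" is compared as one character.
    normalized = unicodedata.normalize("NFC", text.lower())
    # A word here is any run of letters (no digits, underscores or punctuation).
    words = re.findall(r"[^\W\d_]+", normalized)
    return [w for w in words if A_GRAVE in w[:-1]]


print(a_grave_violations("Voilà, déjà là, çà et là."))  # prints []
```

An empty result for a French sample is consistent with the rule, while a non-empty result hints that the text may be in another language that uses à, or that it contains a typo.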

Chakma and Burmese
I'd like to suggest that the Chakma and Burmese scripts be added to this chart. They are similar scripts that are currently missing from this chart. I don't know if there are any guidelines to determine if languages should be put in this list, but if so, they should be put under "Brahmic family of scripts" in "Characters". I also don't know in what order to put the characters. Alpha514 (talk) 15:26, 25 July 2023 (UTC)