Talk:List of languages by number of native speakers in India/Archive 2

Bhojpuri
Why is it not on here? --Maurice45 (talk) 19:01, 16 May 2009 (UTC)
 * I think the census data lumps it together with Hindi. In much the same way that Rajasthani or Khari-boli are considered dialects of Hindi, so too is Bhojpuri. It's a political thing to some extent. —Preceding unsigned comment added by 128.135.88.143 (talk) 06:17, 1 June 2009 (UTC)

Inclusion of English and Persian...
...while English is mentioned in running text, it's not in the actual tables. Same with Persian. While I suppose this made sense when the title was "List of Indian languages by number of native speakers", the current title has no such disclaimer ("List of Indian languages by number of native speakers in India?!"), so they should be added. http://www.censusindia.gov.in/Census_Data_2001/Census_Data_Online/Language/Statement1.htm Seems to indicate 11,688 for Persian and 226,449 for English, so any objection to adding that directly in? SnowFire (talk) 21:32, 11 December 2009 (UTC)

Don't merge with Languages of South Asia
I don't believe that this article should be merged with the list of South Asian languages, for the simple reason that there is such a vast list of languages in the subcontinent and, apart from the biggest ones, there is no crossover across countries. User:Georgiebest7 18:20, 5 December 2009
 * Yeah, it does seem rather strange. India needs its own article. Oppose merge. Copana2002 (talk) 18:53, 5 December 2009 (UTC)

I have removed the merge suggestion in the article, for I see no willingness to merge on the talk page. Arjun 024  07:37, 20 March 2010 (UTC)

note 1
I think note 1 links to a website that has been taken over by an advertising company, and thus there is no pdf anymore. —Preceding unsigned comment added by 121.45.211.39 (talk) 07:30, 17 May 2010 (UTC)

I just completed the chart, please DO NOT CHANGE ANY OF THE NUMBER.
This is the source and those are the official numbers. Please DO NOT CHANGE ANY OF THE NUMBERS, even if you cite a reliable source. The reason is that the number of speakers changes over time; if two languages are given numbers from different times, you can't compare them. Therefore, the figures for all languages should be from the same time for better comparison. Tarikur 21:12, 12 July 2007 (UTC)
 * Your data is that of the 1991 census. That's fine, but you should declare that, not just give naked external links. We should let the 1991 data stand in a separate column, but obviously, we should be looking towards adding the 2001 data as well. dab (𒁳) 07:35, 14 July 2007 (UTC)


 * Two points:
 * I don't think the 2001 data is freely available yet, at least on the web (I had searched a couple of weeks back)
 * The data in the table is not the number of speakers of "Indian languages", but rather the number of Indians speaking the listed language, i.e. the adjective Indian should apply to the speakers and not the languages. I don't know whether we should change the section title, article title or data source to take care of this discrepancy. Any suggestions ?
 * Abecedare 07:57, 14 July 2007 (UTC)

they haven't got round to publishing a lousy list of languages, six years after the census. Now that's pathetic :( I know, the Encarta numbers are flawed for languages spoken outside India as well. That's why we should sort by the 1991 census numbers. dab (𒁳) 09:18, 14 July 2007 (UTC)


 * Looks like the chart has been tampered with: the Encarta (2007) estimate of Bengali speakers is "207 million"?? Please correct it.


 * Some of the numbers are obviously wrong, and need fixing. The "official document" cited above no longer exists.  For example, the most common language, Hindi, was listed as having only 1% native speakers.  We also need to list here why the numbers don't add up to the numbers in the population, and do so with enough significant digits to justify our percentages here.  (For example, the page says that the totals add up to "about 127%" (3 significant digits), but numbers are quoted to 4.)  The data is not verifiable this way.  --Eliasen (talk) 22:33, 14 August 2010 (UTC)

Please keep the total number of native speakers based on the sources provided. DON'T change the numbers.
These are the sources: and. The numbers that those sources give us are the official numbers of native speakers. PLEASE DO NOT CHANGE ANY OF THE NUMBERS. Tarikur 06:26, 12 July 2007 (UTC)

What if the numbers are totally wrong and internally inconsistent, as they are today? --Eliasen (talk) 11:30, 18 August 2010 (UTC)

shouldn't urdu/hindi be combined?
They are mutually intelligible, and thus are considered by linguists to be one language. For political reasons, they are often treated separately, but it's misleading to have them treated separately on a "list of languages by native speakers". jackbrown (talk) 20:14, 25 August 2010 (UTC)

Another 27 to 43% of national population can understand or speak the language(Hindi).
Just removed this sentence: "Another 27 to 43% of national population can understand or speak the language(Hindi)". How could this be true? So the member means to say that around 70 - 86% of Indians understand Hindi? Can anyone accept that? Can someone please help put in official figures? I tried googling, but in vain.

Doctor muthu's muthu   wanna talk ? 04:31, 28 August 2011 (UTC)

Tamil or Telugu
Though Tamil has a larger number of speakers "globally", Telugu speaking populace is the second largest linguistic group "within" India.
 * That's because a large percentage of the Tamil-speaking population resides outside India as compared to Telugu, e.g. Sri Lanka, Malaysia, Singapore, South Africa, Mauritius etc.--Deepak D'Souza (talk • contribs) 21:34, 31 May 2007 (UTC)

Wikipedia is not showing correct data as far as I have analysed. Telugu is the second most spoken language in India. Also, people in 5 districts of Karnataka speak Telugu at home, though they speak Kannada outside, as they became part of Karnataka, where Kannada is the official language. It has also been observed that the outskirts of Bangalore (most of the villages which are now part of Bangalore) speak Telugu. But in the Bangalore Wikipedia article, this is not mentioned anywhere. — Preceding unsigned comment added by 125.16.142.226 (talk) 11:07, 13 September 2012 (UTC)

Comparical?
I have reverted the revisions made on 24 March 2013 by 117.192.144.212 (talk), which relied on the answers to questions put to Comparical. The answer pages from Comparical do not show the source of their information, and I could not find out who runs their site. Additionally, the changed figures were put in the column headed ' Encarta 2007 estimate', where they do not belong. Perhaps they could be put in a new column, but first we should find out who runs Comparical and where they got their information. Apuldram (talk) 12:09, 24 March 2013 (UTC)

List of languages by number of native speakers in states and districts
Where can I get data on the number of native speakers of each language by state and district?--Kaiyr (talk) 07:08, 19 July 2014 (UTC)

Bangal Language?
Where is the head count for the Bangal language? If Assamese can be considered a separate language because of its dialect (the Assamese alphabet is almost the same as the Bengali alphabet), then why can't Bangal? After 1947, people from Bangladesh (East Pakistan) moved to different parts of West Bengal, and their language is Bangal (though the alphabet is the same). — Preceding unsigned comment added by 115.242.115.30 (talk) 20:09, 22 June 2013 (UTC)
 * Bengali (also called Bangla) is second on the list. Richard-of-Earth (talk) 10:44, 4 February 2015 (UTC)
 * Language status has nothing to do with a language's alphabet (or even whether it has any at all). --JorisvS (talk) 11:08, 4 February 2015 (UTC)

WOW
I never even THOUGHT India spoke so many languages! My homework just got a WHOLE lot easier. — Preceding unsigned comment added by ArtemisHunt (talk • contribs) 15:43, 11 February 2015 (UTC)

Other censuses
Where can I get the results of all the other censuses for languages by number of native speakers in India?--Kaiyr (talk) 12:10, 3 February 2015 (UTC)
 * Here :

--Loup Solitaire 81 (talk) 10:43, 4 February 2015 (UTC)
 * http://www.censusindia.gov.in/Census_Data_2001/Census_Data_Online/Language/Statement5.aspx
 * http://www.censusindia.gov.in/Census_Data_2001/Census_Data_Online/Language/Statement6.aspx
 * http://www.censusindia.gov.in/Census_Data_2001/Census_Data_Online/Language/Statement7.aspx
 * http://www.censusindia.gov.in/Census_Data_2001/Census_Data_Online/Language/Statement8.aspx
 * http://www.censusindia.gov.in/Census_Data_2001/Census_Data_Online/Language/Statement1.aspx
 * http://www.censusindia.gov.in/Census_Data_2001/Census_Data_Online/Language/Statement4.aspx
 * The links don't open.--Kaiyr (talk) 09:40, 13 February 2015 (UTC)
 * Grrr. The website is not responding. Here are archives from 2007:

Richard-of-Earth (talk) 08:37, 14 February 2015 (UTC)

Numbers, numbers...
The numbers of native speakers are taken from the official census tables. The way they're structured, the numbers for some of the bigger languages (like Hindi) subsume the numbers for smaller ones (like Rajasthani or Awadhi). The tables in the articles here give the big number and don't have entries for the "smaller" languages that are subsumed under them (even though some have millions of speakers). Should we change that? Uanfala (talk) 13:46, 2 January 2016 (UTC)
 * I think it was like that once, but it got out of hand. On the 2001 census table you linked, there are a couple dozen languages grouped in with Hindi with more than a million speakers. Out of pride or some political agenda, people would want their pet language broken out from Hindi. What criteria would we use to justify doing so and in what way? Rajasthani language, for instance, per the article here on Wikipedia, should include Marwari, Malvi, and Nimadi in its count. Would we do that here? And what citation would we use to justify it? Richard-of-Earth (talk) 10:31, 3 January 2016 (UTC)
 * But in this article we only follow the census results. I was just wondering whether we should use the higher- or the lower-level entities. All the languages that are lumped in with Hindi are quite distinct and I don't think counting them separately is anything but basic common sense. As for Rajasthani, I assume the census takes it in its narrow sense. Uanfala (talk) 14:26, 3 January 2016 (UTC)
 * If it were done, it should be a new section and list only the ones with more than a million speakers. As far as I can tell, only Hindi has ones with more than a million speakers. You're welcome to add such a section and I would see it as useful. I would be happy to maintain it, that is, revert attempts to change the list from its cited form. I am just saying there is controversy about what is and is not called a language as opposed to a dialect or something else. The 2001 census called them "mother tongues", so we should call them that. Richard-of-Earth (talk) 20:04, 3 January 2016 (UTC)
 * Done. But this is just a bare table, ideally each entry should come with a note explaining what is and what is not included in this number. I don't know if the relevant information is available somewhere in the census papers. I'll just point out one issue: Sadri language and Nagpuria have separate entries in the table, but the links point to the same article. Similarly for Lambadi and Banjari. Oh well, we can't have a nice 1-to-1 mapping between census entities and articles. Uanfala (talk) 01:12, 6 January 2016 (UTC)
 * Nice. I will keep an eye on it. Richard-of-Earth (talk) 09:27, 6 January 2016 (UTC)

"Mother tongue" or "First language"
has replaced all occurrences of "mother tongue" with "first language". But, as pointed out at some point in the previous discussion on this talk page, it makes sense to keep the original terminology of the census, which this article is almost by definition entirely based on, and the census uses "mother tongue". Generally, I'm agnostic on the use of one or the other, but I think we should have a reason to alter the terminology of the source. The fact that tongue also happens to mean the anatomical organ isn't a reason. I imagine a possible direction might be to look in the different connotations of the two terms, but I'm not sensitive enough to this so I don't really know if "mother tongue" sounds in any way dismissive or devaluing. Any thoughts anyone? Uanfala (talk) 20:26, 4 April 2016 (UTC)
 * First of all, tongue is the organ in English. For speech, "language" is the word. This is not about some term being dismissive or devaluing (which is not the case). That there are people who do not differentiate the words is not enough reason, especially in an encyclopedia. I've seen source parroting many times, and it usually only serves to make the text less clear. The point is to make the text say what it needs to say, not to use different terms because that may be a little easier for us writers: If we want to clearly distinguish that we're talking about census-listed languages, we should just say that, "census-listed languages". I know of no good reason to use "mother tongue". --JorisvS (talk) 20:39, 4 April 2016 (UTC)
 * The source, the census, uses "mother tongue". That is a good reason for the article to use it. It is arrogant for Wikipedia to 'correct' a source. Although "tongue" is a common synonym for "language", the combination "mother tongue" conveys more than "first language". Also, it doesn't follow that someone's mother tongue is their first language (my brother's first language was not his mother tongue). That said, I think there is a case for "first language" to be used in the article, with an explanation of why it is being used there for "mother tongue". Apuldram (talk) 21:55, 4 April 2016 (UTC)
 * I reverted JorisvS and added a definition from the census itself. At http://www.censusindia.gov.in/Census_Data_2001/Census_Data_Online/Language/gen_note.html it says:
 * 3.1 Mother tongue is the language spoken in childhood by the person’s mother to the person. If the mother died in infancy, the language mainly spoken in the person’s home in childhood will be the mother tongue. In the case of infants and deaf mutes, the language usually spoken by the mother should be recorded. In case of doubt, the language mainly spoken in the household may be recorded.
 * From this "mother tongue" could be simply stated as "first language". We should reach a consensus and go with that. I have a mild favoritism for "mother tongue" with the definition I added, quoted from the general notes. Richard-of-Earth (talk) 06:44, 5 April 2016 (UTC)
 * @Apuldram. It's not about 'correcting' a source, but about establishing a consistent usage over Wikipedia. For example, by astronomers, the Kuiper belt is often defined as both including and excluding the scattered disk. If we were to follow the sources' specific meaning every time, our texts would become semantically jumbled and hard to follow. To prevent that, the Kuiper belt has been defined on Wikipedia to exclude the scattered disk, just noting that the usage is different for different sources (and sometimes even within sources!). Back to the issue here, the exact English for "Mother tongue is the language spoken in childhood by the person’s mother to the person." is "mother's language". --JorisvS (talk) 17:50, 5 April 2016 (UTC)

number of official languages?
Is 23 a misprint? Should it be 22? --Richardson mcphillips (talk) 14:32, 14 November 2016 (UTC)


 * See Languages with official status in India. There are 22 languages on that schedule, and those are considered constitutionally recognized official languages. English is also a constitutionally recognized official language, but is not mentioned on the schedule, so that makes 23. Richard-of-Earth (talk) 22:00, 14 November 2016 (UTC)

External links modified
Hello fellow Wikipedians,

I have just modified 6 external links on List of languages by number of native speakers in India. Please take a moment to review my edit. If you have any questions, or need the bot to ignore the links, or the page altogether, please visit this simple FaQ for additional information. I made the following changes:
 * Added archive https://web.archive.org/web/20061223112810/http://www.censusindia.net/results/slum1_m_plus.html to http://www.censusindia.gov.in/
 * Corrected formatting/usage for http://encarta.msn.com/media_701500404/languages_spoken_by_more_than_10_million_people.html
 * Added archive https://web.archive.org/web/20080201193939/http://www.censusindia.gov.in/Census_Data_2001/Census_Data_Online/Language/Statement1.htm to http://www.censusindia.gov.in/Census_Data_2001/Census_Data_Online/Language/Statement1.htm
 * Added archive https://web.archive.org/web/20081118143215/http://www.censusindia.gov.in/Census_Data_2001/Census_Data_Online/Language/Statement6.htm to http://www.censusindia.gov.in/Census_Data_2001/Census_Data_Online/Language/Statement6.htm
 * Added archive https://web.archive.org/web/20050109084200/http://www.ethnologue.com/show_country.asp?name=India to http://www.ethnologue.com/show_country.asp?name=India
 * Added archive https://web.archive.org/web/20041213203632/http://www.ciil.org/ to http://www.ciil.org/

When you have finished reviewing my changes, you may follow the instructions on the template below to fix any issues with the URLs.

Cheers.— InternetArchiveBot  (Report bug) 00:01, 22 May 2017 (UTC)

Dravidian speakers
The total number of Dravidian language native speakers based on the 2001 data given below (Kannada+Tamil+Telugu+Tulu+Gondi+Kurukh+Malayalam) = 20.61 — Preceding unsigned comment added by 115.64.14.71 (talk) 08:52, 14 October 2017 (UTC)
 * I don't know where the original percentages were taken from, so maybe we should find a source. However, we can't just add up the numbers for the major languages listed in the table as there are Dravidian languages that are each with less than 100,000 speakers (and hence haven't made it into our list) but that put together would represent a substantial proportion of the total. – Uanfala 10:17, 14 October 2017 (UTC)
 * Pinging, who's recently added these numbers to Dravidian languages, where they're sourced but I'm still not able to see where exactly in the source they are. – Uanfala 10:46, 14 October 2017 (UTC)
 * On the 2001 Census Data on Language page, Statement 9 lists 17 Dravidian languages. The total of the numbers given for those languages in Statement 1 is 214,172,874.  The total population is given as 1,028,610,328.  Kanguole 11:19, 14 October 2017 (UTC)
 * Thanks, but see my comment immediately above. I'm going to have to remove this from the article. – Uanfala 11:21, 14 October 2017 (UTC)
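
For reference, the share implied by the two census totals that Kanguole quotes above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the variable names are illustrative, and the figures are simply the ones quoted in this thread, not independently verified against the census tables.

```python
# Figures as quoted in this thread from the 2001 census (Statements 1 and 9).
dravidian_speakers = 214_172_874   # total quoted for the 17 Dravidian languages
total_population = 1_028_610_328   # total population as quoted above

# Share of the total population, expressed as a percentage.
dravidian_share_pct = dravidian_speakers / total_population * 100
print(f"{dravidian_share_pct:.2f}%")
```

On these figures the share comes to roughly 20.8%, slightly above the value obtained by adding up only the larger languages listed in the article's table, which is consistent with the point that the smaller languages are missing from that sum.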

External links modified
Hello fellow Wikipedians,

I have just modified one external link on List of languages by number of native speakers in India. Please take a moment to review my edit. If you have any questions, or need the bot to ignore the links, or the page altogether, please visit this simple FaQ for additional information. I made the following changes:
 * Added archive https://web.archive.org/web/20080324032158/http://www.censusindia.gov.in/Census_Data_2001/Census_Data_Online/Language/Statement4.htm to http://www.censusindia.gov.in/Census_Data_2001/Census_Data_Online/Language/Statement4.htm

When you have finished reviewing my changes, you may follow the instructions on the template below to fix any issues with the URLs.

Cheers.— InternetArchiveBot  (Report bug) 17:57, 14 December 2017 (UTC)

Removal of nonsensical phrase
After the mentions of how many Indians claimed to be bilingual and trilingual, the phrase "so that the total percentage of "native languages" is at about 127%." appears. What does that mean? I removed it since I really can't come up with any sensical meaning. Anyone disagree? Sitim.far (talk) 18:37, 2 February 2018 (UTC)
 * Quite right. Thanks. Batternut (talk) 16:33, 3 February 2018 (UTC)
 * It simply means that if you add the total percentage of speakers for each language (the last column of the first table in the article), you'll get more than 100% because multilinguals are counted for each of their native languages. – Uanfala (talk) 16:53, 3 February 2018 (UTC)
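
The effect described in the last reply can be illustrated with a toy example. The sketch below assumes a made-up population of four people and three placeholder languages ("A", "B", "C"); none of it is census data, and it only shows why per-language percentages can sum to more than 100% when each person is counted once for every language they report.

```python
from collections import Counter

# Hypothetical toy population: each person is the set of languages they report.
people = [{"A"}, {"A", "B"}, {"B"}, {"A", "B", "C"}]

# Count each person once per language they report.
counts = Counter(lang for person in people for lang in person)

# Percentage of the population reporting each language.
percentages = {lang: n / len(people) * 100 for lang, n in counts.items()}

# The per-language percentages overlap, so their sum can exceed 100.
total_pct = sum(percentages.values())
print(percentages, total_pct)
```

Here three of the four people report "A", three report "B" and one reports "C", so the column of percentages adds up to 175% even though there are only four people.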