Talk:Computational linguistics

Wiki Education Foundation-supported course assignment
This article is or was the subject of a Wiki Education Foundation-supported course assignment. Further details are available on the course page. Student editor(s): Joejacob95. Peer reviewers: Joejacob95.

Above undated message substituted from Template:Dashboard.wikiedu.org assignment by PrimeBOT (talk) 19:24, 17 January 2022 (UTC)

Wiki Education Foundation-supported course assignment
This article was the subject of a Wiki Education Foundation-supported course assignment, between 26 August 2019 and 11 December 2019. Further details are available on the course page. Student editor(s): Simransall.

Above undated message substituted from Template:Dashboard.wikiedu.org assignment by PrimeBOT (talk) 19:24, 17 January 2022 (UTC)

Wiki Education Foundation-supported course assignment
This article is or was the subject of a Wiki Education Foundation-supported course assignment. Further details are available on the course page. Student editor(s): Sibyllwang.

Above undated message substituted from Template:Dashboard.wikiedu.org assignment by PrimeBOT (talk) 18:14, 16 January 2022 (UTC)

Untitled
If you folks don't mind, I'm going to make some changes to the Origins section of this article some time in the next week. I'm going to add some material about the work of David G. Hays. He led the machine translation effort at RAND back in the 50s and 60s, was one of the authors of the ALPAC report, wrote the first textbook in computational linguistics, was instrumental in founding the Association for Computational Linguistics and the International Committee on Computational Linguistics, was the first editor of the ACL's journal, and so forth. I'll be adding an article about Hays himself, then I'll link it to this article and, as I said, make some changes to the history section.

AI is generally considered to have originated in 1956 with a conference at Dartmouth, where the term "artificial intelligence" was coined. MT was under way before that time. When MT morphed into CL during the mid-60s it maintained a somewhat separate identity from AI and, so far as I know, does to this day. AI, CL, and NLP are closely related, obviously, and some researchers may consider themselves to be two or three of those, or only one. Bill 21:22, 16 March 2006 (UTC)

Merge with NLP?
Judging from the discussions, no consensus seems to be emerging on whether this article should be merged with Natural Language Processing. I think they should be merged, since:
 * Although it might be historically justified to speak about a division, this is meaningless as the fields stand today. Whether you call it NLP or CL is today arbitrary and mostly a question of what aspect you stress.
 * Current articles in the Computational Linguistics journal, and at the Coling and ACL conferences, are judged on whether they are useful rather than on whether they give any insight into how humans process language.
 * Most of the material in this article belongs in the NLP article anyway.
 * We can't seem to come up with a definition of CL, or decide where to draw the line between the two "fields".
So I vote for a merge. Kallerdis (talk) 20:06, 29 February 2008 (UTC)

Dear All,

I am painfully aware of the many names used for this area (CL, NLP, Language Engineering, Human Language Technology, Language Technology, ...). Many of these terms denote nearly identical areas, but not all. My experience is that, in many countries, one uses two separate concepts, one narrow and one broad:


 * 1) The narrow one is defined (as also cited below by others) as the study of language using methods and inspiration from computer science. This definition puts much weight on automatic parsing, especially using linguistically motivated grammar models.
 * 2) The broad concept includes the narrow one as its core, but also includes much material from the areas where the core methods can be applied. It is also common for speech technology to be included as a part of the broad area.

I would not be alone in proposing that, in Wikipedia, we establish two concepts (= articles): one for the narrow concept, with the title computational linguistics, and another with the title language technology (or human language technology). Both could have an account of the possible synonyms or alternative terms. The distinction is, in any case, present in many languages other than English (Finnish 'tietokonelingvistiikka' vs. 'kieliteknologia', Swedish, Danish and Norwegian roughly 'datalingvistik' vs. 'språkteknologi').

Together with some colleagues, we would like to use Wikipedia as a platform for creating consistent multilingual terminologies for CL/LT. This would be feasible if there are consistent concepts which can be linked between languages.

-- Kimmo Koskenniemi, prof. of computational linguistics (but teaching language technology) --

HI....

I'm a student at a computational linguistics department (CoLi at Saarbruecken), and I have been a (visiting) student at a natural language processing department (HCRC at Edinburgh). I think each might have been close to being the biggest in the world in their respective "fields". But honestly, I can't tell the difference!!!

I think the key is: how are the terms commonly used? Once (while I was stationed at Edinburgh but making a weekend trip to Cambridge), a Cambridge prof came up to me and said, "So, you're from Edinburgh, I bet you do NLP." Yes, I replied, not wanting to get into the details about really being from CoLi at Saarbruecken. "Heh," he says, "NLP is totally unfounded. At Cambridge, we do computational linguistics." Now (for those of us who are not irony impaired ;) I think this is pretty strong evidence that the terms are used as synonyms within the field. And if they are used as synonyms within the field, then what exactly is this page describing?

PS... with the ACL quote... I think it might be decades old. If there is an "engineering/science" distinction between NLP and CL, then the ACL quote has got to be ironic because most of the research in their journal and at their conferences is clearly engineering.

-- pobody

I don't disagree that Natural Language Processing is useful and important. I'm just not convinced that Computational Linguistics is the same thing as Natural Language Processing.

According to the Association for Computational Linguistics web page defining Computational Linguistics ( http://www.aclweb.org/archive/what.html ), the definition is broader than appears on this page: ''computational linguistics is the scientific study of language from a computational perspective. Computational linguists are interested in providing computational models of various kinds of linguistic phenomena.''

I'm changing the article heavily and splitting large portions off into [Natural Language Processing] to reflect the fact that these two fields are related, but not at all the same thing. --jkominek

Well, but I do not think that NLP and CL do not overlap, I think they have a lot in common. At my university we do mostly NLP in computational linguistics. In the first year, there are the foundations (theoretical linguistics, computer science, logic, relations, trees, grammars), and the second year class is called "Natural language processing systems". According to the professor, computational linguistics IS about designing natural language processing systems. But I do not have anything written on my desk here (will deliver this later, ok?). I must also admit that he was CS prof before, so he may be biased ;-). Of course it is ok to have a separate entry, as CL and NLP are not synonyms.

I recommend deleting the sentence "... which is in the domain of computer science", as this implies NLP does not belong to CL. Instead, I would write "An important task of CL _is_ NLP" or something similar.

I also do not agree that CL is (only) a subfield of linguistics, as it is an interdisciplinary field, somewhere in between theoretical CS/maths, applied computer science, AI, and linguistics.

As I am new here, and still want to think about it, I will not make changes immediately (I also do not have the time at the moment to build proper, good-looking sentences, as I am at work).

Best regards,

-- zeno

Computational linguistics is the original term for the field of language processing that developed following the collapse of MT with the ALPAC report in the 1960s. Originally, it was thought that computers could perform translation of language quite easily, whereas in fact it's exactly the opposite. I would characterize the difference between computational linguistics and natural language processing as the difference between computer science and software development, i.e., between the theoretical and the practical. Some computational linguists are not really interested in natural language processing and prefer to work exclusively on theoretical problems, whereas others are actual practitioners of the field--same as in computer science.

Also, the assumption that computational linguists are linguists is wrong. Computational linguists came into the field from a variety of disciplines including mathematics, computer science and psychology. Computational linguistics predates AI. It is a separate and parallel field of those concerned with having computers process language in any way. Its major subfields include: speech generation, speech recognition, parsing theory, text generation and of course its original impetus, mechanical (or machine) translation.

Whereas today I expect you can easily study computational linguistics in a linguistics department, in the 1970s and early 1980s this was rare.

Dr. Robert A. Amsler, Sr. Computational Linguist, SEA/DOE

Computational linguistics would seem to involve more statistical methods (pattern recognition, Markov models, etc.) and NLP more determining parts of speech and trying to get meaning. Wouldn't it? User:KellyCoinGuy


 * No, both are used/covered by both "fields". --zeno 00:37, 21 Dec 2004 (UTC)
 * According to the Handbook of Computational Linguistics (2007), "Computational Linguistics" is used when a substantial amount of linguistic knowledge is incorporated into the computational model, whereas "natural language processing" usually refers to mainly stochastic language modeling. --Tomonori 03:31, 1 August 2007 (UTC)

I had it explained to me something like as follows:


 * Computational linguistics is where we make programs that should work, but don't and we don't know why
 * Natural language processing is where we make programs that shouldn't work, but do and we don't know why

- Francis Tyers · 14:36, 28 February 2009 (UTC)


 * Heh, that rings true to me. I studied cog sci at Yale in the late 1980s including some AI. Both terms are familiar to me and somewhat interchangeable. There were some folks using computers to analyze speech not as I/O for software, but in order to understand language as a psycholinguistic and physiological phenomenon; for these folks I would have used the term computational linguistics and not natural language processing. But that is only a useful distinction, to the extent that we want to draw a line between "computational linguistics" and "non-computational linguistics"—and I would argue that today computers are so omnipresent that the distinction is mostly historical.
 * I draw a parallel to 3D graphics. Prior to 1993, the term mostly referred to ugly green simulations created in DoD-funded university CS departments, and academic papers on more photorealistic stuff that was supposed to be decades away. Then almost overnight, the term got a new meaning: first-person shooter games like Marathon and Doom. SIGGRAPH never had a problem with disambiguation; it was all just "graphics".
 * A few years later, natural language leaped similarly from academic to practical, with interactive voice response (IVR) and machine translation. But we never settled on a generic word like graphics to cover it all. So I'm leaning toward a merge into "natural language processing" at the minor cost of some historical nuance. Whatever we choose, can somebody (who's more current on this stuff than I am) please find a supporting citation that asserts the breadth of the term?
 * Mrevan (talk) 11:07, 13 April 2009 (UTC)

It seems to me that the state of things is that the boundary between NLP and CL is unclear. I think the goal of any related Wikipedia articles should be to represent the state of things as accurately as possible, NOT to solve the clarity problem. Thus, both articles should clearly :) state the various opinions about these fields. Indquimal (talk) 22:06, 18 May 2009 (UTC)

As has been stated, CL and NLP have different origins and different goals - one to understand human language, one to produce more effective computer interfaces and deal with the problems natural languages serve up. At the moment, certain paradigms are dominant in NLP and carry over to CL, swamping techniques that come out of Linguistics, and this reflects the number of people coming in to the combined area from the AI and Lg sides. But most NLP does little to advance the aims of CL, and there are relatively few developments that both advance CL and deliver improvements in NLP performance - but these are the ones we should pay particular attention to. — Preceding unsigned comment added by Dmwpowers (talk • contribs) 03:49, 25 August 2012 (UTC)

To my mind these are absolutely different--but I don't disagree that the boundaries are blurry and that a dept. doing NLP, say, would likely be comfortable for someone who identifies as a CL. Keep separate and work on the defns. JJL (talk) 02:36, 26 August 2012 (UTC)

Computational Historical Linguistics
In the last decade computational methods have been applied to historical linguistics. This article does not cover this aspect. Really a separate article is required with a change to the title of this one to separate it. Adresia (talk) 18:52, 29 November 2007 (UTC)
 * Sounds good. Do you have some sources? —Preceding unsigned comment added by Dranorter (talk • contribs) 03:48, 8 June 2009 (UTC)
 * This case points, imo, to the coming state of affairs: Linguistic research of various types will increasingly be presumed to use computing among its array of tools. We ought to be planning for, or gradually inching our way toward, the time when "Computational linguistics" will have become a term of mainly historical interest, and all branches of linguistics that are amenable to formal analysis of any kind will automatically be presumed to have a computational presence.--IfYouDoIfYouDon't (talk) 05:39, 19 September 2016 (UTC)

Stub template
Is there any stub template for use in articles related to computational linguistics? I haven't found any, but I think it would be viable, if it really didn't exist. -- Sandius 20:09, 7 February 2006 (UTC)

"Free online introductory book"?
As of this time, the external link Free online introductory book on Computational Linguistics has not responded at all for several hours. I have no way of knowing if this is temporary, but the link should be removed if this continues. 75.15.115.31 (talk) 08:21, 15 December 2008 (UTC)
 * Found it in the Internet Archive. --Thüringer ☼ (talk) 12:46, 15 December 2008 (UTC)
 * Seems to be working now. Dranorter (talk) 02:24, 8 June 2009 (UTC)

Humour
About the field. Sorry, couldn't resist :) Note that the decision whether to merge being discussed above does kind of support this... prat (talk) 00:56, 5 August 2010 (UTC)

Delete "Counter-views"
I would vote for deleting the "counter-views" section, marked out as of questionable quality. These don't represent well-known views in the field or fields close to it, and appear to take issue with "computational theories of language" rather than the field(s) of CL. I am more an observer than scholar of CL (though I have taught courses on it), & I don't think it's accurate to see it as promulgating a "theory" that needs to be balanced with "counter-views." Most CL just says that language can be productively analyzed using some computational tools or others, and justifies this assertion with actual research results. The "counter-view" would therefore be that no, computers aren't useful in linguistic analysis, a claim for which I know neither factual nor scholarly support. CL doesn't entail any statement about the nature of language per se (although some CL practitioners certainly do have such views), and this section obscures that. Wichitalineman (talk) 16:33, 2 February 2013 (UTC)

Dear Wichitalineman, The above discourse is an instance of intolerance that leads to the annihilation of dissenters' voice(s) and the blocking of dialogue-constitutive universals [Habermas]. I have a few questions: (a) If computers or something x is a "tool" (as you said) for analyzing language or y-object, may we propose a full-fledged epistemological disciplinary technology, viz. CL, depending on a "tool"? (b) Regarding your statement, "The "counter-view" would therefore be that no, computers aren't useful in linguistic analysis, a claim for which I know neither factual nor scholarly support", I must say that I do not know of any such extremist position that claims "computers aren't useful in linguistic analysis", but I do know of "factual" and "scholarly support" for anti-AI positions (both strong and weak) that relate to CL's epistemological status in relation to cognitivism. I have got this support from Roger Penrose's Gödelian argument. — Preceding unsigned comment added by Debaprasad01 (talk • contribs) 20:39, 11 May 2015 (UTC)

Mathematical linguistics redirect
I don't think "Mathematical linguistics" should redirect to this page. If anything, "Math. Ling." is even more ill-defined, or more over-broad in use, than is the "Computational" term. The involvement of computer science in this putative subfield, under many of its apparent definitions, is at best secondary. (The mere suggestion that Mathematical ling. should redirect to Computational is reminiscent of the notion, still prevalent just a few years ago, that computing or computer programming was somehow "about" mathematics, or that a computer programmer was by definition a mathematician.)

Whoever created the redirect, I'll give you a short while to make restitution, no questions asked. Then the shaming shall begin. :~> (And no, I don't think I need to come up with a better place for "Mathematical linguistics" to (re)direct to. Redirecting here is inappropriate, in & of itself.) --IfYouDoIfYouDon't (talk) 05:09, 19 September 2016 (UTC)

New to Wikipedia editing
Hi all, I’m new to editing, and I’m also a part of a class that is taking a look at language in the digital age. I hope you all don’t mind if I attempt to improve my knowledge of the field by occasionally attempting to provide useful and pertinent information. Thank you Beardown25 (talk) 06:55, 10 September 2019 (UTC)

Approaches
At the moment, the Approaches section is very much essay-like rather than encyclopedic in style, and its prominent position in the article severely limits the usability of this page. (I guess this actually *was* an original essay, as it had been inserted as a single block.) I suggest cutting it down to 3-5 sentences per subsection or, alternatively, rewriting it from scratch, possibly taking an authoritative textbook as a basis (not to copy from, but to follow its classifications). Chiarcos (talk) 08:39, 20 August 2020 (UTC)

On the Sindhi Language section
As of writing (12:10 pm EST, Mon Nov 9 2020), there is a paragraph in this article which reads:

In recent days, the structural data of languages is available for several languages of the world other than English language. Computational linguistics work is in progress on Sindhi language because the structure, grammar and domain of Sindhi language is different from the other languages of the World. The computational linguistics models for English language are not suitable for Sindhi language. Viewing this, computational linguistics work on Sindhi language [27][28][29] has been started properly by developing methods, algorithms, linguistics tools (https://sindhinlp.com/), machine learning models and deep learning models since 2016 [30][31][32][33][34][35] to focus and solve the linguistics problems of Sindhi language. This work could lead to further important discoveries regarding the underlying structure of Sindhi, and could have any number of effects on the understanding of Sindhi as a language.

I hope that this paragraph clearly presents itself as being nonfactual. I would be happy to offer sources to this if need be, but to state that the language spoken in one particular region of South Asia has "structure, grammar, and domain... different from the other languages of the World" is quite obviously unverifiable, and to me smells of the same sort of linguistic supremacy propaganda we see time and time again in pseudo-academia. Indeed, if we are to believe every article which states that some given language is somehow the most 'suitable' for computational analysis – Sanskrit, Hindustani, Sindhi, Macedonian – then the speakers of the rest of the (W)orld's languages ought to put some thought into why their languages, despite being equally qualified as a method of communication to every other language in the (W)orld, are so computer unfriendly, and/or why the great computational-linguistic minds of these locales have not made the incomparable advances in syntax parsing that we would expect with such an innate advantage.

173.35.233.148 (talk) 17:19, 9 November 2020 (UTC)

A Commons file used on this page or its Wikidata item has been nominated for deletion
The following Wikimedia Commons file used on this page or its Wikidata item has been nominated for deletion:
 * Alan Turing Aged 16.jpg
Participate in the deletion discussion at the nomination page. —Community Tech bot (talk) 18:24, 26 February 2023 (UTC)