Talk:Natural language understanding

Cleanup
rm:

Steps of NLU:
 * General Natural language processing
 * Named Entity identification
 * part of speech tagging
 * Parsing
 * Semantic slot extraction
 * Dialog act identification

as if there was a canonical NLU methodology. rm

Approaches
 * Rule-based
 * Learning based

it's equally vacuous. When someone gives the article attention, the first will be mentioned, and as for the second ... (rhetorical ellipsis). —Preceding unsigned comment added by 74.78.162.229 (talk) 20:29, 10 July 2008 (UTC)
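For concreteness, the "Steps of NLU" quoted above can at least be made explicit. A toy sketch, purely illustrative (the lexicon, labels, and heuristics are all invented for this example; this is not a canonical methodology):

```python
# Toy sketch of the quoted "Steps of NLU", purely for illustration.
# The lexicon, labels, and heuristics are all invented for this example.

POS_LEXICON = {"turn": "VERB", "on": "PART", "the": "DET",
               "kitchen": "NOUN", "light": "NOUN", "please": "INTJ"}

def tokenize(text):
    # General NLP step: crude whitespace tokenisation.
    return text.lower().rstrip("?!.").split()

def tag_pos(tokens):
    # Part-of-speech tagging via a tiny lookup table.
    return [(t, POS_LEXICON.get(t, "X")) for t in tokens]

def find_entities(text):
    # Named-entity identification: capitalised non-initial words (crude).
    return [w for w in text.split()[1:] if w[:1].isupper()]

def extract_slots(tagged):
    # Semantic slot extraction: the NOUNs become one "device" slot.
    nouns = [t for t, p in tagged if p == "NOUN"]
    return {"device": " ".join(nouns)} if nouns else {}

def dialog_act(tokens):
    # Dialog-act identification by leading keyword.
    if tokens and tokens[0] in ("turn", "switch", "set"):
        return "command"
    if tokens and tokens[0] in ("what", "who", "is", "can"):
        return "question"
    return "statement"

def understand(text):
    tokens = tokenize(text)
    tagged = tag_pos(tokens)
    return {"act": dialog_act(tokens),
            "slots": extract_slots(tagged),
            "entities": find_entities(text)}
```

Real systems would of course use trained taggers and parsers rather than lookup tables; the point is only to show what each listed step might consume and produce.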


 * This comment from 7 years ago seems to have some interesting criticisms, but not one of them is explicit. "Vacuous" is about as meaningless a criticism as is possible. It would be helpful, when an expert in the field visits here, if they could make some actual improvement in the article. One paper on the Web mentions "greedy decoding with the averaged perceptron" modified by Brown clustering and case normalization, without any explanation of what these terms mean. If there is good current research in natural language understanding, it would be helpful if an expert expanded the article to explain the details. One excellent application, not currently listed in the article, is in language translation. David Spector (talk) 15:51, 30 March 2015 (UTC)

Speech Recognition
FTR, this really has relatively little to do with the subject of this article. Understanding proceeds indifferently from spoken or written language, except insofar as non-verbal communication is concerned, and of course that is outside the normal scope of either NLP or SR.

74.78.162.229 (talk) 19:57, 10 July 2008 (UTC)


 * Disagree. Speech is full of ambiguity, as is text/writing. The full understanding or interpretation of the semantics of speech (even down to the identification of phonemes) requires very much the same kinds of analysis as does the understanding of written language. David Spector (talk) 15:59, 30 March 2015 (UTC)

Article rewrite
This article can be kindly described as "hopeless". It has a few irrelevant short paragraphs and needs a 99.9999% rewrite. If no one objects, I will rewrite from scratch. There is nothing here that can be used. And there is NO point in merging with Natural language processing since this is "a field unto itself" and the merger will be the blind leading the blind, for the other article is no gem either. History2007 (talk) 20:40, 18 February 2010 (UTC)


 * Good work, the comments above are all by me, glad to see someone's done something with this. 72.228.177.92 (talk) 17:48, 20 August 2010 (UTC)


 * Thank you. History2007 (talk) 18:27, 20 August 2010 (UTC)


 * But also, yes. It is a really thorough article now! :) Pixor (talk) 17:07, 17 June 2012 (UTC)


 * It doesn't seem thorough to me at all. It doesn't say anything specific about how to analyze natural language text for its semantics. I can't write even the most primitive computer program to do this analysis based on this article. I also don't see any real explanation of any academic topics in the field. David Spector (talk) 16:03, 30 March 2015 (UTC)


 * I hope that this article will be more thorough, so that many interested students are properly guided. I don't think that it would be a good starting point for newcomers to NLU. Yijisoo (talk) 13:40, 18 September 2016 (UTC)

Dubious
The second paragraph of the opening section makes the (albeit well-argued) unsupported claim that understanding is more complex than generation. While this might be true, it isn't cited or referenced.

I'm inclined to believe that this isn't true, though. In a recent computational linguistics course I took, my professor repeatedly mentioned that good generation is much more difficult, because in understanding all of the information is there and only needs to be picked apart, whereas in generation, the computer has to make many "decisions" based on little else but context.

Anyway, I would consider removing this section until a source is found? I'm not sure if it adds a lot to the article, anyway. Thoughts? Pixor (talk) 17:06, 17 June 2012 (UTC)


 * Disagree strongly. It is almost obvious that generation of natural language text can be very easy. I've programmed this myself (for a medical history generator), and I have no background in formal NLP. It is equally obvious that determining the meaning of actual natural language text ranges from complex to impossible, depending on how much context information is needed. But, as to removing any section of the article, you haven't built a case for such an extreme action. Sources are certainly needed throughout the article, but deletions aren't a solution for lack of sources, they are just an avoidance. If a section has been removed, someone with the time to research this change should restore it. David Spector (talk) 16:13, 30 March 2015 (UTC)
 * Though I have an almost identical experience with NLP, this is technically OR, so it isn't really a valid argument against needing to source the material; however, it is obviously true enough that removing it while searching for a source seems unnecessary. OTOH, the fact that this reply is being made just over 3 years after your post (yours being a reply to a comment made about 2.75 years after the OP) suggests nobody cares enough to look; I'm not sure I do... Hppavilion1 (talk) 23:01, 13 April 2018 (UTC)
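Template-style generation of the kind described above (e.g. a medical history generator) really can be a few lines. A hypothetical sketch, not the actual program mentioned:

```python
# Hypothetical template-based generator, loosely in the spirit of the
# medical-history generator mentioned above (not the actual program).

TEMPLATE = ("The patient is a {age}-year-old {sex} presenting with "
            "{complaint}. Symptoms began {onset} ago.")

def generate_history(record):
    # Generation here reduces to slot filling: every linguistic
    # "decision" was already made when the template was written.
    return TEMPLATE.format(**record)
```

The reverse task (recovering age, sex, and complaint from a free-text clinical note) has to cope with arbitrary phrasing, which is the asymmetry under discussion.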

While the argumentation given is true, whether NLG is more difficult than NLU depends on the representation from which language is generated, and how much variation you want to put in the generation. I would not consider filling up slots in a template as proper NLG.

If you know how to get from a natural language to an abstract language representation, then you only have to reverse the process in order to do the output. So output is not more complex than input (this is case 1). On the other hand, if you know how to output, it means either your output scheme is at least as good as your input scheme (we're then back to case 1), or you know one way to output and that doesn't give you any guarantee regarding the input schemes. Still not convinced? When you input, you may face rules you don't know about, which makes the whole thing very difficult (often impossible, so you have to reject the phrases or take chances). When you output, you use a well-defined representation of the language, you know the semantics, you have a well-defined set of rules, and the only difficult part is to make sure you don't output something that has multiple meanings. Then, one might argue that we could be talking about a crazy natural language in which basically everything is ambiguous. In that case, we can just fall back to the worst-case scenario: case 1. Sparkles (talk) 23:00, 24 January 2018 (UTC)
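The reversibility argument above can be made concrete with a toy grammar whose parse and generate functions share one rule table (the rules are invented for this example):

```python
# Toy invertible grammar illustrating the "reverse the process" argument.
# The rule table is invented for this example.

RULES = {
    "turn on the {obj}":  {"act": "switch", "state": "on"},
    "turn off the {obj}": {"act": "switch", "state": "off"},
}

def parse(text):
    # Input side: the text may use constructions our rules don't cover,
    # in which case all we can do is reject it (or guess).
    for pattern, frame in RULES.items():
        prefix = pattern.split("{obj}")[0]
        if text.startswith(prefix) and len(text) > len(prefix):
            return dict(frame, obj=text[len(prefix):])
    return None  # unknown construction: the hard case on the input side

def generate(frame):
    # Output side: we only ever emit from representations we fully
    # control, so every known frame has a well-defined realisation.
    for pattern, f in RULES.items():
        if all(frame.get(k) == v for k, v in f.items()):
            return pattern.replace("{obj}", frame["obj"])
    raise ValueError("no rule for frame")
```

Round-tripping works for covered constructions (`generate(parse(...))` returns the original string), while `parse` must reject anything outside its rules; that is case 1 in the argument above.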

Bold removal
The process of disassembling and parsing input is more complex than the reverse process of assembling output in natural language generation because of the occurrence of unknown and unexpected features in the input and the need to determine the appropriate syntactic and semantic schemes to apply to it, factors which are pre-determined when outputting language.

Quite apart from whether this is dubious, the case is argued on a strange foundation, and surely some proper citations are demanded for a contention such as this.

For me, excision until properly reworked is a slam dunk.

Hiding in the foliage is Postel's law: that we tend to be generous in dealing with messy inputs, while stricter in generating outputs. But in a natural language setting, why should this be a default assumption? Why should the natural language system be an inherent complexity filter? Much of the messiness of natural language input actually conveys sophisticated social nuance, so the output is only simpler than the input if we're happy for the output to lack nuance.

Because of our historical relationship with our machines, we tend to accept this bargain automatically. But this social criterion should not then be baked into a sweeping statement that input is harder than output. It's almost certain that human speech generation contains a feedback loop back to the input system (which is in turn informed by the mirror neurons, and our cognitive model of the speech recipient(s)) in order to assess the inherent trade-offs between economy of utterance, tonal sideband, and risky ambiguity. Any good writer knows that careful output is way harder than parsing input.

In the interpreter community, it's taken for granted that interpreters input their foreign language(s), and output their birth language(s), because you so need to know precisely what you just meant at all possible levels.


 * Lýdia Machová at the Polyglot Gathering 2015

I'm pretty sure this was the talk where I heard an interpreter (this one seems very good at her trade) talking about making an exception to this rule: she actually does translate into her foreign language(s), out of a pure numbers game for the less common EU languages. But in the big languages (English, Spanish, German, French, Italian) this would never be allowed, because you can always find enough native-speaking translators with proficiency in any number of other EU languages.

Note also that she talks about (in this video, or one of her others on YouTube) how realtime translators have a very brief duty cycle. It's something like twenty minutes out of every hour (so you need 3× coverage for each language pair), because the task is so cognitively demanding that their brains explode. (She says that one ear is constantly devoted to listening back to what you just spoke, while the other ear is listening to the input and trying not to miss a single important word.)

Basically, we apply a very high standard to human input/output processes in a political context such as big EU gatherings, by which point no-one pretends that output is easier than input.

And then we implicitly give our machines a free pass, because we've all been weaned on the social media business model, where barely good enough is fan-effing-tastic (so long as it's free of direct monetary costs).

Input easy/output hard is completely crazy talk, and OR, and uncited, and a disservice to the larger article context.

There are definitely sober claims to be made about what will move the dial forward in the short term (and input will probably present more challenges in the short run), but that needs to be what we write here, and not this naive, sententious argumentation. &mdash; MaxEnt 18:02, 14 April 2018 (UTC)


 * Me again, even more opinionated than the first time, but this is fundamental to the article at hand, and worth belabouring.
 * It used to be commonly observed that Russians understood capitalism better than we do, because we swim in it, and they observe from without with a dissecting scalpel. But of course, the Russians understood certain things about capitalism as a formal model, but had terrible gut instincts about how that formalism played out in a complicated, messy world (which might actually be the larger understanding here). Surely there are parallels to be drawn with the sixty-year history of over-promise and under-deliver of the AI community, until their recent Chicago Cubs redemption under the banner of Deep Learning.


 * Kai-Fu Lee: We Are Here To Create (edge.org) — 26 March 2018


 * Lee is actually more of a hard-liner than I am on the scope of our recent breakthroughs. I would argue that much of the history of AI is littered with false assumptions about what would prove easy, and what would prove hard. Asymmetrical complexity would make my own list of happy-yet-ultimately-unjustifiable enabling assumptions (good for writing the grant, lousy for delivering a working system). There were many reasons that logic dominated the early days (in particular, statistical systems simply require more memory than any researcher could conceive of in a practical system). In the modern statistical approach (LSTM neural networks), input and output are largely regarded as symmetrical. Translation systems now work by building two language models, lashing these together, running one model forward (input) and the other model backward (output). No-one intended DeepDream; it's an automatic consequence of this style of computing that recognition and generation are two sides of the same coin.
 * I want to point out, also, the effect of formal education in this. After twenty years of formal education, hardly anyone passes through without internalizing the false perception that multiple-choice assessment is a natural language activity. Yet it isn't: it's highly artificial, where short sentences (in query form) are logically associated with even clearer answer strings (in math, however, this works as advertised). Much of our success in formal education depends upon acquiring a proficiency in this highly artificial assessment environment. (True assessment is a high-dimensional problem, one that proves intractable to solve, so we cheat with these rudimentary approximations.) So the mental model of the artificial multiple-choice environment is really beaten into us over a long period of time. And then when we interact with our speech appliances, it seems natural to fall back onto language proficiency in this artificial language subdomain, where we pretend that questions and answers exist out there, nicely matched up in this intrinsic form. It's precisely because of our false sense of this as a 'natural' performance domain that IBM went hard into the Watson (computer) project, which probably impressed people more than it should have, because the scholastic question domain is associated with advanced educational attainment, whereas it really should be associated with shallow thinking. (The questions on Jeopardy remain multiple choice in essence, only the student now has to supply his or her or its own choices, a trick not much different from the abacus wizards who learn to imagine the beads mentally and stop moving their fingers manually; the choices having always been somewhat of a prop in the first place.)
 * There's also a meme in science that asking the right question is harder than determining the right answer (this is only true at the highest echelon of inquiry—as wrong answers are career-limiting moves on most lower rungs of the ladder). People with the strongest understanding of an issue almost always are the same people who ask the best questions. So I guess we're imagining that our speech appliance gets to understand our multiple-choice-like constrained input, without ever inquiring back: what did you really mean here? (Another feature of the examination hall beaten into our psyches.) Because if our machines were outputting prescient questions (e.g. wait a minute—just what are you really trying to achieve here?), we sure wouldn't be falling over ourselves to call that the "easy" side of the problem.
 * I've spent thirty years of my adult life nosing up to these philosophical issues relentlessly, and that paragraph I removed, from where I now sit, could best be described as wandering into an inhabited garbage compactor wearing a blindfold, reflective sunglasses, and a welding mask. The very impulse to make this kind of assessment (x is easy, y is hard) now has sixty-year accrued history of disabuse. Everything one can validly assert in this connection stems from specific assumptions native to the performance domain (almost always an implicitly artificial performance domain). It's time to stop pretending that we're not already swimming in these artificial performance domains, because we are, and a little bit of Russian outside-the-box perspective is absolutely required here, if we are to survive our encounter with the voracious garbage-compactor inhabitant of false confidence.
 * My apology for this wall-of-text brain dump, but I know from long experience that some of these "nuances" are slow to sink in for people who have never sat down and puzzled this problem all the way through, within the full historical context. Kai-Fu Lee is a very smart guy and quite at the center of all this. I highly recommend his piece on Edge (above) as a starting point. &mdash; MaxEnt 19:38, 14 April 2018 (UTC)
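One concrete illustration of the input/output symmetry mentioned above: a toy encoder-decoder built from two vanilla RNN cells (plain Python, random untrained weights, all details invented for the sketch; a production system would use LSTMs or transformers):

```python
import math
import random

# Toy encoder-decoder built from two vanilla RNN cells, illustrating the
# structural symmetry of sequence-to-sequence systems. Weights are random
# and untrained: this sketches the architecture, not real translation.

random.seed(0)
DIM = 4  # hidden/embedding size, arbitrary for the sketch

def rand_matrix(n, m):
    return [[random.uniform(-1.0, 1.0) for _ in range(m)] for _ in range(n)]

def rnn_step(state, inp, W, U):
    # new_state = tanh(W @ inp + U @ state), with plain Python lists.
    return [math.tanh(sum(W[i][j] * inp[j] for j in range(DIM)) +
                      sum(U[i][j] * state[j] for j in range(DIM)))
            for i in range(DIM)]

W_enc, U_enc = rand_matrix(DIM, DIM), rand_matrix(DIM, DIM)
W_dec, U_dec = rand_matrix(DIM, DIM), rand_matrix(DIM, DIM)

def encode(source_embeddings):
    # Input side: fold the source sequence into one state vector.
    state = [0.0] * DIM
    for emb in source_embeddings:
        state = rnn_step(state, emb, W_enc, U_enc)
    return state

def decode(state, n_steps):
    # Output side: the mirror-image recurrence, unrolling the state back
    # into a sequence by feeding each output in as the next input.
    outputs, inp = [], [0.0] * DIM
    for _ in range(n_steps):
        state = rnn_step(state, inp, W_dec, U_dec)
        outputs.append(state)
        inp = state
    return outputs
```

encode() and decode() are mirror images of the same recurrence, which is the sense in which recognition and generation come out as two sides of the same coin.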

Assessment comment
Substituted at 00:57, 30 April 2016 (UTC)

Dubious Searle reference
Why is Searle's POV mentioned here specifically in respect to Watson? The placement of the citation gives the impression that he believes that *that* set of algorithms running on Watson failed to understand, rather than his more general epistemological view that no matter what algorithms were implemented it *couldn't* understand, which he posited decades before Watson. I don't believe any NLU researcher, or the Watson team, would claim they are in any way addressing the Chinese room argument in their work. The word 'understanding' simply implies that they are working at the semantic level, rather than surface syntax or morphology, for example. The problem they are addressing is technical, not philosophical, and I don't think Searle's position is relevant except to demonstrate that claiming a machine understands as people do is very shaky ground. It may be more useful to mention Searle in a section clarifying what 'understanding' means in this context. — Preceding unsigned comment added by 217.42.112.171 (talk) 12:27, 1 August 2016 (UTC)


 * The commenter above is bringing up an excellent point. While the WSJ article may be difficult to access, here is a topical line from the article: "Watson revealed a huge increase in computational power and an ingenious program. I congratulate IBM on both of these innovations, but they do not show that Watson has superior intelligence, or that it's thinking, or anything of the sort. Computational operations, as standardly defined, could never constitute thinking or understanding for reasons that I showed over 30 years ago with a simple argument."

- John Searle
 * It might be better and more neutral POV to say something more along the lines of: "However, despite its apparent success using machine learning, even Watson is still not demonstrative of true Natural Language Understanding as understood by experts such as John Searle". (just a rough draft proposal) Cuevasclemente (talk) 19:40, 28 August 2017 (UTC)

M2M communications, vision recognition, etc.
This paragraph is somewhat confusing, since it introduces many phrases without explaining their relevance to natural language processing. Can you provide any reliable sources for this paragraph? Jarble (talk) 01:56, 4 January 2019 (UTC)


 * Hello, I'm new to editing the wiki and I don't have my full bearings. I was hoping to use the Vognition Wiki Page to justify the text, but I have not finished it yet. I'd like to point out that I'm not describing NLP; I'm describing an NLU. I describe what an NLU is this way because it comes from the issued patent I invented: https://patents.google.com/patent/US9342500B2/en Please advise. Mvliguori 11:54, 4 January 2019 (UTC)

Flag for tone
NLU is the post-processing of text, after the use of NLP algorithms (identifying parts-of-speech, etc.), that utilizes context from recognition devices (automatic speech recognition [ASR], vision recognition, last conversation, misrecognized words from ASR, personalized profiles, microphone proximity etc.), in all of its forms, to discern meaning of fragmented and run-on sentences to execute an intent from typically voice commands. NLU has an ontology around the particular product vertical that is used to figure out the probability of some intent. An NLU has a defined list of known intents that derives the message payload from designated contextual information recognition sources. The NLU will provide back multiple message outputs to separate services (software) or resources (hardware) from a single derived intent (response to voice command initiator with visual sentence (shown or spoken) and transformed voice command message too different output messages to be consumed for M2M communications and actions).

I've marked two aspects of this paragraph that fall short of concision and clarity without trying to cover the whole enchilada (((((there are (six) parentheticals, including a nested parenthetical in this short passage))))).

Article omits deep learning

 * The state of the art in language understanding today employs deep learning in neural network architectures. At present, the article does not acknowledge this development, and hence presents an obsolete and misleading picture of the field. 2A00:23C7:5989:8E00:CD00:C260:5DAE:992C (talk) 13:13, 29 April 2020 (UTC)

Why a hyphen?
In my experience the vast majority of authoritative mentions of NLU do not use a hyphen. I don't know why this page has one. It also has a hyphen in NLP. If there are no concerns, I will take these out. Jmill1806 (talk) 11:34, 1 March 2024 (UTC)
 * I am updating this to record that I have moved the page to natural language understanding and subsequently taken the hyphen out of its use in the article. I also replaced it with the acronym NLU in most places regardless because that is included in the lede and matches common usage. I will take a look at other pages like natural-language programming, but the change is less obviously necessary with the less commonly used terms. Jmill1806 (talk) 08:50, 27 June 2024 (UTC)