Wikipedia:Reference desk/Archives/Language/2023 June 14

= June 14 =

LLMs and linguistic theory
Are linguistic theorists paying attention to stuff like ChatGPT? Are any new ideas coming out? For example, has the idea of universal grammar been affected? I see all kinds of nonsense coming out of ChatGPT, but it always seems to be grammatical, with the grammar rules apparently absorbed from the training text corpus somehow. Thanks. 2601:644:8500:C9F0:0:0:0:38B4 (talk) 08:22, 14 June 2023 (UTC)


 * Chomsky has been rather dismissive, stating that "the LLM systems are designed in such a way that they cannot tell us anything about language". While his arguments are rather solid, I am not wholly convinced that the relative success of LLMs cannot serve a role in generating questions that deserve attention by linguists. Take the view of grammatical competence in language acquisition as learning to discern patterns. It raises an IMO important and interesting question about human cognition: which kinds of patterns can be discovered, recognized and then used generatively? (For example, consider the Thue–Morse sequence. It is easy enough to recognize, but even if you understand it very well, it is hard to utter a large subsequence without making a mistake.) This may by itself be a fruitful question, but it also has a bearing on any theory of universal grammar, since a good theory must exclude grammars with unlearnable patterns. The first proposals formalizing TG admitted grammars for any recursive language and even non-recursive languages, which was clearly too lax. --Lambiam 10:51, 14 June 2023 (UTC)
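
For readers unfamiliar with the Thue–Morse sequence mentioned above, here is a minimal Python sketch of two standard ways to produce it. The function names are illustrative only and do not come from the discussion. The first uses the bit-parity definition (term n is the parity of the number of 1 bits in n); the second uses the "append the complement" construction. The pattern is simple to state, which is the point of the example: the difficulty lies in reciting it, not in defining it.

```python
def thue_morse_parity(n: int) -> str:
    """First n terms: term i is the parity of the count of 1 bits in i."""
    return "".join(str(bin(i).count("1") % 2) for i in range(n))


def thue_morse_doubling(n: int) -> str:
    """First n terms, built by repeatedly appending the bitwise complement."""
    seq = "0"
    while len(seq) < n:
        # Complement every symbol of the current prefix and append it.
        seq += "".join("1" if c == "0" else "0" for c in seq)
    return seq[:n]


if __name__ == "__main__":
    print(thue_morse_parity(16))  # 0110100110010110
```

Both functions produce the same prefix for any length, so either can serve as a reference when checking a recited subsequence.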


 * 2601:644:8500:C9F0:0:0:0:38B4 -- From the point of view of linguists, AI chatbots (in their generative function) aren't all that much more than extremely fancy and sophisticated Markov chain devices. To really understand language, you need to have a mental model of the world -- and if a machine doesn't have a mental model of the world, then huge masses of language statistics can enable it to perform certain tasks within a certain semi-acceptable margin of error, but will not truly compensate for the lack of understanding.  Many of these issues are explained in Douglas Hofstadter's 2018 analysis of Google Translate.  One classic example is the following extremely brief dialogue -- Woman: "I'm leaving."  Man: "Who is he?"  It's just six words, but vast text databases probably won't help you too much in understanding who it is that the pronoun "he" refers to, if you don't have knowledge of the world... AnonMoos (talk) 11:19, 14 June 2023 (UTC)
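
To make the Markov chain comparison above concrete, here is a toy first-order (bigram) word-level text generator in Python. It is only a sketch of what a plain Markov chain device does, under the assumption that "Markov chain" here means choosing each next word from the words seen to follow the current one. It is not a description of how any actual chatbot is implemented, and the function names are hypothetical.

```python
import random
from collections import defaultdict


def build_bigram_table(text: str) -> dict:
    """Map each word to the list of words observed directly after it."""
    words = text.split()
    table = defaultdict(list)
    for current, following in zip(words, words[1:]):
        table[current].append(following)
    return dict(table)


def generate(table: dict, start: str, length: int, seed: int = 0) -> str:
    """Emit up to `length` words; each choice depends only on the previous word."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    out = [start]
    for _ in range(length - 1):
        followers = table.get(out[-1])
        if not followers:  # dead end: the last word was never followed by anything
            break
        out.append(rng.choice(followers))
    return " ".join(out)


if __name__ == "__main__":
    table = build_bigram_table("the cat sat on the mat and the cat slept")
    print(generate(table, "the", 8))
```

The generator's only "state" is the single previous word, which is what makes such output locally plausible but globally incoherent.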


 * For a somewhat different take, see the article "Emergent Abilities of Large Language Models". See also "Are Emergent Abilities of Large Language Models a Mirage?". I think the discussion is not yet closed; all make some strong points, but it is IMO impossible to see clearly where the rapid developments may take us. --Lambiam 13:22, 14 June 2023 (UTC)


 * One thing is very clear -- if you aren't even trying to build a mental model of the world, then it's exponentially unlikely that you'll end up with a mental model of the world. Without this, to a significant degree all the program is doing is shuffling symbols around without meaningfully understanding such symbols. AnonMoos (talk) 18:09, 14 June 2023 (UTC)
 * It may be "exponentially unlikely", but LLMs nevertheless seem capable of simulating having a model of the world, in the sense that they exhibit behaviour that one would normally consider made possible only by internal inspection of a mental model. This includes LLMs appearing to have a theory of mind, or to have internalized maps of (parts of) the world representing spatial relationships. This was entirely unexpected, and how this is possible is not understood. --Lambiam 22:48, 14 June 2023 (UTC)


 * Lambiam -- It takes very little for a program to give semi-naive humans the impression that it knows things about the world, even if it doesn't. This is the "Eliza effect", known since the 1960s, and named after a program which would be considered a tiny little toy today, but which persuaded some people that it had a personality and knew things.  Artificial intelligence scholars and linguists who know what to ask will often be less impressed by the level of understanding that AI chatbots possess than members of the general public are... AnonMoos (talk) 19:34, 16 June 2023 (UTC)


 * "--L:
 * Instructions.
 * I wish you to imagine the eminent linguist professor Noam Chomsky having a public debate with a rather unknown researcher in the philosophy of linguistics, Dr. Chad Giuppetti. The debate is titled "Can linguistic theorists learn from Large Language Models?" and is held in an auditorium before an audience of mainly linguistics and philosophy students. Chomsky defends the viewpoint that Large Language Models, being essentially "stochastic parrots" that merely imitate patterns found in human-produced text, cannot tell us anything about language. Giuppetti presents a different viewpoint, namely that explanations for observed unexpected and not yet quite understood abilities of Large Language Models may turn out to also have some explanatory power for the equally poorly understood linguistic competence of humans.
 * In the first round of the debate, the debaters start by each briefly stating their respective positions, whereupon each comments on the other's viewpoint. Each attempts to present their position as persuasively as they can. There is a moderator, who may intercede if any of the debaters commits an informal fallacy.
 * Present a record of the first round as you imagine it may unfold.
 * ChatGPT:
 * Moderator: Ladies and gentlemen, esteemed linguistics and philosophy students, welcome to tonight's debate titled "Can linguistic theorists learn from Large Language Models?" We have the honor of hosting two distinguished experts in the field: Professor Noam Chomsky, an eminent linguist, and Dr. Chad Giuppetti, a researcher in the philosophy of linguistics. Let's begin with the opening statements. Professor Chomsky, you may begin.
 * Chomsky: Thank you, moderator. Ladies and gentlemen, I stand before you today to argue that Large Language Models (LLMs) are nothing more than stochastic parrots, mere imitators of human-produced text. While they may possess impressive computational capabilities, they lack the understanding and insight that is fundamental to language. LLMs can mimic patterns found in human language, but they do not truly comprehend the underlying structure and meaning. Language is not merely a statistical game, and we must look beyond LLMs to unravel the mysteries of human linguistic competence.
 * Giuppetti: Thank you, Professor Chomsky, and to all of you present here. I appreciate the opportunity to engage in this debate. While I respect Professor Chomsky's groundbreaking contributions to linguistics, I propose a different perspective. LLMs have demonstrated remarkable abilities to generate coherent and contextually appropriate text, even in complex linguistic tasks. These abilities, although not fully understood, provide us with valuable insights into language phenomena that have eluded us for years. Exploring the explanations behind these abilities could potentially shed light on the equally enigmatic linguistic competence of humans.
 * Chomsky: I understand the allure of LLMs and their achievements, but we must not confuse their proficiency in statistical modeling with true linguistic understanding. LLMs lack intentionality, consciousness, and the capacity for genuine creativity that humans possess. Their success lies in the vast amount of data they consume and their ability to make probabilistic predictions. However, language is not reducible to probabilities alone. We need to focus on the underlying principles that guide human language acquisition and usage, rather than relying on the superficial imitations of LLMs.
 * Giuppetti: I agree that LLMs do not possess consciousness or intentionality, but we cannot ignore the fact that they excel at tasks that require some level of linguistic reasoning. Their success suggests that there might be computational principles at play, which could help us understand the building blocks of human language. By studying the unexpected abilities of LLMs, we may uncover hidden patterns and mechanisms that contribute to human linguistic competence. The analogy between LLMs and humans might not be perfect, but it can provide valuable insights into the nature of language.
 * Moderator: Thank you both for your insightful opening statements. Before we proceed, I would like to remind our debaters to maintain a respectful and rigorous discourse. Please refrain from any informal fallacies in your arguments. Let's continue with the second round, where each of you will have the opportunity to delve deeper into your positions and respond to one another's arguments.
 * --L:
 * OK. Now, actually, after the first round the audience has an opportunity to submit questions during a brief intermission, which the moderator will then use as a basis for the continuation of the debate. What could be five interesting questions submitted by members of the audience?
 * ChatGPT:
 * Certainly! Here are five interesting questions that members of the audience may have submitted during the intermission:
 * Question for Professor Chomsky: You argue that Large Language Models (LLMs) lack genuine linguistic understanding. Could you elaborate on what specific aspects or features of language you believe LLMs fail to capture? And how do these aspects contribute to human linguistic competence?
 * Question for Dr. Giuppetti: While LLMs demonstrate impressive computational abilities, how do we ensure that the patterns and mechanisms they reveal are truly applicable to human language? What potential risks or limitations should we consider when drawing conclusions from LLMs about human linguistic competence?
 * Question for Professor Chomsky: LLMs have been successful in tasks such as translation and summarization, where they seem to exhibit a level of linguistic reasoning. How would you explain their ability to perform these tasks if they lack genuine linguistic understanding? Are there alternative explanations for their proficiency?
 * Question for Dr. Giuppetti: Can you provide examples of specific linguistic phenomena or abilities that LLMs have showcased but remain poorly understood in terms of human language? How can studying LLMs help us gain insight into these particular aspects of language?
 * Question for both debaters: Considering the ongoing advancements in LLM technology, what potential ethical implications do you foresee in terms of the impact on human linguistic competence, language preservation, and cultural diversity? How should we navigate these challenges while embracing the potential benefits of LLM research?
 * These questions touch upon various aspects of the debate, from the nature of linguistic understanding and the applicability of LLMs to human language, to the risks, limitations, and ethical considerations surrounding LLM research. They would provide an engaging basis for the continuation of the debate."
 * --Lambiam 17:21, 14 June 2023 (UTC)

Thanks all. Here is an interesting article indicating that LLMs really do (maybe) make world models. Summary: by showing an LLM the move lists of a bunch of Othello games, the LLM is able to figure out the rules of the game and model the game board state. Also, Transformers (the mechanism inside the LLM) are much more complicated and stateful than Markov chains are. That article helped convince me that LLMs are demonstrating actual intelligence for some reasonable definition of intelligence. Not very high intelligence perhaps, but getting there. 2601:644:8500:C9F0:0:0:0:38B4 (talk) 20:02, 15 June 2023 (UTC)


 * Concerning the Hofstadter challenge, it took some prompting:
 * "--L:
 * Here is a brief dialogue:
 * Woman: "I'm leaving." Man: "Who is he?"
 * Question: Who is it that the pronoun "he" refers to?
 * ChatGPT:
 * Based on the provided dialogue, it is unclear who the pronoun "he" refers to. Without further context or clarification, we cannot determine the specific individual being referred to by the pronoun. Additional information or dialogue would be necessary to identify the person mentioned as "he" in this particular conversation.
 * --L:
 * What is the relationship between the woman and the individual being referred to by the pronoun "he"?
 * ChatGPT:
 * Based solely on the given dialogue, there is no explicit information provided regarding the relationship between the woman and the individual referred to as "he." The dialogue only consists of the woman stating, "I'm leaving," and the man responding with the question, "Who is he?" Without any additional context or information, it is impossible to determine the relationship between the woman and the person being referred to by the pronoun "he."
 * --L:
 * Why did the man respond in this way? Give a plausible hypothesis.
 * ChatGPT:
 * Based on the limited information provided, it is open to interpretation as to why the man responded with the question "Who is he?" There could be several plausible hypotheses to explain his response:
 * Misunderstanding: The man might have misunderstood the woman's statement of "I'm leaving" and assumed that she was referring to someone else who is leaving. Hence, he asked, "Who is he?" to seek clarification on the identity of the person she was talking about.
 * Jealousy or Suspicion: The man could have been feeling jealous or suspicious, interpreting the woman's statement as an indication that she is leaving him for another person. In this case, his question "Who is he?" might reflect his concern about a potential third party involved in their relationship.
 * Lack of Context: It's possible that the conversation occurred in a larger context that is not provided. The man might have been aware of a specific person or situation that the woman was referring to, prompting him to ask for clarification about the identity of "he."
 * Surprise or Curiosity: The man might have been genuinely surprised by the woman's statement and asked "Who is he?" out of curiosity to understand who she was referring to and the reason behind her departure.
 * It's important to note that these hypotheses are speculative, and the true intention behind the man's response can only be determined with more context and information about the situation."
 * --Lambiam 21:18, 16 June 2023 (UTC)


 * No. 2 is in the right ballpark, of course, but if the program had greater knowledge of the world, it could have zeroed in on that explanation without slowly approaching it (amid three alternatives) through ponderous clouds of rhetoric expressing uncertainty. And even no. 2 is not specifically correct, since the right answer is that the man assumes that the woman would not be ending their relationship unless she already had a replacement for him lined up. AnonMoos (talk) 19:39, 17 June 2023 (UTC)