Wikipedia:Reference desk/Archives/Science/2023 January 15

= January 15 =

COVID RNA in the brain
The journal Nature reported that during autopsies, SARS-CoV-2 RNA had been detected in multiple anatomic sites, including the brain, up to 230 days after symptom onset in one case. My question is whether it is common for other non-SARS-CoV-2 viruses to be detected in the brain, and how does the 230 day post-infection detection window, as claimed here, compare to what we know about other viruses? Uhooep (talk) 01:51, 15 January 2023 (UTC)
 * Ebolavirus was found in patients' testes and eyes a year after they recovered, and in the case of the testes it actually infected a patient's sexual partner. Note that the eyes and testes, like the brain, sit behind immune-privileged barriers (the blood–ocular and blood–testis barriers, analogues of the blood–brain barrier). It seems likely that all viruses can hide from the immune system behind these barriers.  Abductive  (reasoning) 05:31, 15 January 2023 (UTC)
 * HIV and herpes simplex have also been shown to be able to "hide" in the brain – that is, remain present without causing any clinical symptoms. The generalization to "all viruses" is quite a leap. --Lambiam 12:06, 15 January 2023 (UTC)
 * So, every virus checked so far... In other words, "likely", all viruses. Abductive  (reasoning) 20:40, 15 January 2023 (UTC)

More detailed technical explanation of ChatGPT
The technical explanations of ChatGPT that I can find are quite vague, basically saying that it is a language model trained on a huge amount of input and now generating text conforming to the statistical properties of that input. I do not believe that mimicking statistical properties is sufficient to explain its ability to write code according to given specs, or to explain the concept of blockchain in language an eight-year-old can understand. I asked for the following, "Write a sonnet in the style of Shakespeare about a wilting lily." This was what came back.

Note the following. The form and style are indeed those of a Shakespeare-style sonnet. The content is true to what was requested. While obeying the restrictions imposed by the rhyme scheme (almost – "bright" does not quite rhyme with "pride"), and mostly following an acceptable iambic pentameter, the language used maintains an appropriate style. The result has an undeniable poetic, elegiac quality. To produce this, more machinery has to be involved than conforming to the statistics of a sample, however large. So my question is, where can I find a more detailed technical explanation? --Lambiam 15:38, 15 January 2023 (UTC)
 * We have an encyclopedic article with references here: ChatGPT. Philvoids (talk) 02:15, 16 January 2023 (UTC)
 * You chose the Shakespearean sonnet form on which ChatGPT has been trained, as examples show, and for which there is a distinct text corpus of 154 sonnets. Expect less authentic-seeming poetry if you challenge ChatGPT to emulate Emily Dickinson, Edward Lear, Ovidius Naso, William Wordsworth, Alfred, Lord Tennyson or Rudyard Kipling. Philvoids (talk) 14:05, 16 January 2023 (UTC)
 * The question you ask is about the exact inner workings of ChatGPT. However, the surrounding language has a general tone of disbelief – for instance "I do not believe that mimicking statistical properties is sufficient to explain its ability to write code according to given specs (...) To produce this, more machinery has to be involved than conforming to the statistics of a sample, however large."
 * While it is possible that the popular treatment (in the mainstream press, in blogs, etc.) of ChatGPT incorrectly describes what the ChatGPT team says, or that what the ChatGPT team says differs significantly from what ChatGPT actually does, I believe the simplest explanation is that ChatGPT is indeed a big statistical model. It includes lots of smart machine learning tricks, but I do believe it is a big matrix of weights without any hard-coding. (Well, except maybe for the top layer of censorship - ChatGPT will refuse to write pornographic or fascist material when asked too directly. But AFAICT there is no "Shakespeare generator" module.)
 * In the case where you believe such a feat to be impossible on theoretical grounds (rather than technical objections of whether today’s technology can reach metric X), I refer you to Alder's razor. The semi-famous 2004 essay is a bit long-winded in my view, but it aged fairly well. (Summarized: if leading experts on a topic say X is impossible based on very sound theoretical explanations, but a random gal produces a working prototype that does X, X is possible. Therefore (the author’s extreme argument goes), one should not listen to theoretical arguments that X is impossible. The only way to know if X is possible is to try to do it; even if you fail, you will learn more in the attempt than by idle speculation.) Tigraan Click here for my talk page ("private" contact) 15:12, 16 January 2023 (UTC)
 * Sometimes it is possible to show an impossibility on theoretical grounds; compare the pumping lemma for regular languages, which can be used to show that the context-free languages are a proper superset of the regular languages, a simple example of a CF language that is not regular being $(^n)^n$ – any number of opening brackets matched by an equal number of closing brackets. In particular, the recursive embedding of phrases seen in many natural languages including English cannot be properly attained by a generative Markov chain model, barring anacolutha. In English as in most other languages, a minute twist to a sentence can dramatically change its whole meaning. To be able to respond with the level of adequacy displayed, the model needs to perform a structural analysis of the input, and produce output that is also structurally coherent, with semantic elements being coordinated throughout. Is this apparent structural ability an emergent aspect of the statistical process, given enough context? When I read "GPT-3 takes the prompt, converts the input into a list of tokens, processes the prompt, and converts the predicted tokens back to the words we see in the response", and "Large language models generate text token by token", I can't help but think of an overblown Markov chain model. Why aren't ChatGPT's responses more like those produced by the human responders on the Wikipedia Reference desk? Ask ChatGPT a question about a standard Aristotelian syllogism, but with random words filling the argument slots:

All plutocrats are flexible. Theophilus is a plutocrat. Therefore, ...
 * Its response: "Therefore, Theophilus is flexible." There must be something going on under the hood that associates the pattern of a syllogism with the term "syllogism". (Note also that ChatGPT recognizes that the ellipsis is an implied question, and that in this case it is appropriate to respond, not just with "Theophilus is flexible", but with the full last line after "filling in the dots".) --Lambiam 16:46, 16 January 2023 (UTC)
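The contrast drawn above between token-level statistics and genuine structure can be made concrete with a toy sketch (my own illustration for this thread, not anything from ChatGPT's actual code). Recognizing the bracket language $(^n)^n$ needs an unbounded counter, while a first-order Markov model trained character by character on perfectly balanced strings happily emits unbalanced ones, because it only ever conditions on the previous symbol:

```python
import random
from collections import defaultdict

def is_balanced(s: str) -> bool:
    """Recognize the non-regular language (^n )^n with a counter
    (equivalent to a one-symbol pushdown stack)."""
    depth = 0
    for ch in s:
        depth += 1 if ch == "(" else -1
        if depth < 0:
            return False
    return depth == 0

def train_bigram(corpus):
    """A first-order Markov model: counts of P(next char | current char).
    '^' and '$' mark the start and end of a string."""
    counts = defaultdict(lambda: defaultdict(int))
    for s in corpus:
        for a, b in zip("^" + s, s + "$"):
            counts[a][b] += 1
    return counts

def sample(model, max_len=20) -> str:
    """Generate character by character from the bigram statistics."""
    out, cur = [], "^"
    while len(out) < max_len:
        nxt = random.choices(list(model[cur]),
                             weights=list(model[cur].values()))[0]
        if nxt == "$":
            break
        out.append(nxt)
        cur = nxt
    return "".join(out)

random.seed(0)
corpus = ["()", "(())", "((()))"]        # training data: all balanced
model = train_bigram(corpus)
samples = [sample(model) for _ in range(200)]
# The model reproduces the local statistics but not global balance:
unbalanced = [s for s in samples if not is_balanced(s)]
```

Running this, a large share of the generated strings are unbalanced, e.g. "(()" – the bigram model has no counter. The open question in the thread is whether a transformer with a long context window escapes this limitation in practice, or merely pushes it out beyond the nesting depths that occur in real text.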
 * If it had really understood Shakespeare's sonnets it would not just have gone for something like that, but would probably have gone on about his love and how they were getting old like a lily but he could still see the beauty. But it certainly looks promising. I guess the answer is we can get by without thinking for most things. NadVolum (talk) 17:32, 16 January 2023 (UTC)
 * But the poem clearly states that even in her wilted state, she holds / A certain charm, a beauty in her woe, and that the wilted lily is only A symbol of the beauty that we see—i.e., the beauty exists whether or not the lily itself is physically attractive. So the narrator indeed can still see the beauty despite the lily wilting. Shells-shells (talk) 18:52, 16 January 2023 (UTC)
 * It is (I think) a truism that readers or listeners of fiction, poetry and song lyrics not infrequently find meanings within texts that the writers had not (at least consciously) intended (though much art is already intentionally ambiguous), but which when pointed out seem valid: sometimes different readers will find different meanings in the same text. It is therefore likely that readers of an AI-generated poem will find meanings in it that were not "intended" (though all current AIs surely lack intentionality). {The poster formerly known as 87.81.230.195} 51.194.245.235 (talk) 05:51, 17 January 2023 (UTC)
 * I didn't mean to argue that the poem was written with 'intent', or to evaluate it on stylistic grounds; I merely wished to defend it against that specific criticism. Whether or not the inhuman poet really understood Shakespeare, it seems pretty clear to me that the poem indeed does [go] on about his love and how they were getting old like a lily but he could still see the beauty. I don't think too much interpretation is needed for one to arrive at that conclusion. Actually (and now I am evaluating its style) I think the poem is rather nicely arranged, with the tonal contrast between the first two and the last two stanzas. I also like the way the phrase symbol of pure beauty is reincorporated at the end with a different meaning. ChatGPT could certainly have done worse, and I must acknowledge it for at least making an effort to produce a poem instead of just copping out as Turing's conversationalist did. Shells-shells (talk) 07:52, 17 January 2023 (UTC)

This link opens a page of Shakespeare's sonnets in your browser. The interested reader may use their browser's find-on-page function to check whether Shakespeare actually used any phrases in the OP's example. Philvoids (talk) 23:12, 16 January 2023 (UTC)
 * Here is ChatGPT's poem again, with those words underlined that also occur in one of Shakespeare's 154 sonnets:


 * A few remarks. I have not underlined the "stop words" of the list found as the third item at . Although the form wilt occurs in the sonnets, it is as the second-person singular form of the verb will (I will, thou wilt), so I have also not underlined this. The words pride and bright occur as rhyming words, but not as a rhyming pair – Shakespeare's rhymes are proper ones. The words woe and show also occur as rhyming words, but again, not as a pair. None of ChatGPT's rhyming pairs is found in the sonnets, and I did not spot obviously similar phrases or tropes. Inversions of the word order as seen in "Gone is the lily's radiance and grace" are rare; one occurrence is in Sonnet CV ("Kind is my love to-day"). --Lambiam 20:08, 17 January 2023 (UTC)
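The kind of mechanical check described above – pairing off line-final words and comparing rhymes – can be automated with a short script. This is a toy sketch of my own for the thread; the rhyme test is only a crude spelling heuristic (it compares final letters, not sounds, so eye-rhymes and sound-alikes spelled differently will be misjudged):

```python
def rhyme_pairs(sonnet_lines):
    """Pair the line-final words of a 14-line Shakespearean sonnet
    according to the rhyme scheme ABAB CDCD EFEF GG."""
    assert len(sonnet_lines) == 14, "expected a 14-line sonnet"
    words = [line.split()[-1].strip(".,;:!?").lower() for line in sonnet_lines]
    # Index pairs that must rhyme under the Shakespearean scheme.
    scheme = [(0, 2), (1, 3), (4, 6), (5, 7), (8, 10), (9, 11), (12, 13)]
    return [(words[i], words[j]) for i, j in scheme]

def crude_rhyme(a: str, b: str, k: int = 2) -> bool:
    """Very rough check: do the last k letters coincide?
    Spelling only, not phonetics – treat the verdict with suspicion."""
    return a[-k:] == b[-k:]
```

Run over a candidate sonnet, this flags a pair like "bright"/"pride" while passing proper pairs such as "light"/"night"; a real analysis would need a pronunciation dictionary rather than letter comparison.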
 * Not sure of the utility of comparing only to sonnets by Shakespeare. I'm guessing there must be tens of thousands, maybe even hundreds of thousands, of things called "Shakespeare-style sonnet" or something similar in the ChatGPT training data, along with stuff like [//eduzaurus.com/free-essay-samples/adrienne-rich-and-william-shakespeare-comparative-analysis/] (I don't mean that one in particular, but stuff like that and other analyses etc.). Of course, assuming only this data goes into ChatGPT writing Shakespeare-style sonnets is also, I suspect, highly flawed. Nil Einne (talk) 06:52, 19 January 2023 (UTC)
 * Sure, but ChatGPT didn't come up with a prose essay about the sonnets of Shakespeare, it came up with a sonnet in Shakespeare's style. What else should its output be compared to except the sonnets that Shakespeare wrote? Shells-shells (talk) 07:07, 19 January 2023 (UTC)
 * Many a poem on the Internet not written by Shakespeare is called a "Shakespeare-style sonnet". The model does not "know" that in these cases the label "Shakespeare-style sonnet" refers to a specific nearby piece of text; the association of the label with some text having the pattern of three ABAB quatrains plus a couplet, all in iambic pentameters, is supposedly purely statistical. The tonal contrast between the first two and the last two stanzas you mentioned above is at least as striking. It is purely semantic in nature, but perfectly appropriate. How did that emerge from a statistical analysis of sequences of tokens? --Lambiam 08:45, 19 January 2023 (UTC)
 * "Following the release of ChatGPT, we have only a short blog post describing how it works. There has been no hint of an accompanying scientific publication, or that the code will be open-sourced." It is an instance of cultural evolution though. Modocc (talk) 14:43, 19 January 2023 (UTC)
 * Thanks. So the reason why I couldn't find a more detailed technical explanation is apparently that the details are a closely guarded business secret. The app is a potential cash cow, so I can understand why they may not want to make it easy to replicate the success. I think OpenAI should change its misleading name, though. --Lambiam 19:50, 19 January 2023 (UTC)
 * Well, the GPT-2 model is openly available: see https://huggingface.co/gpt2-xl. The GPT-3 model, which underpins ChatGPT, is licensed to Microsoft. Shells-shells (talk) 22:25, 19 January 2023 (UTC)