Wikipedia:Reference desk/Archives/Computing/2024 February 18

= February 18 =

Do LLMs reach B2 or C1 levels (in writing) in at least some languages?
As the title says. In my use of LLMs I am quite impressed by their mastery of language. Yes, I know it is all a (huge) statistical model behind it, but it is still impressive compared to pre-December 2022 attempts.

I made a cursory search, and no major search engine helped me find studies that assess which level some LLMs have reached in writing: say, the last version of GPT-3.5 or Bard in 2023, or even the latest GPT-3, which was less powerful but still a massive model.

From experience I would expect at least B2 or C1 (if not directly C2), but of course that is different from an official assessment. Pier4r (talk) 09:35, 18 February 2024 (UTC)


 * I haven't the foggiest what B2 or C1 level of language means. If you found a way of expressing it which wasn't jargon you might get an answer. NadVolum (talk) 13:30, 18 February 2024 (UTC)
 * Those are CEFR levels. It is the level of language fluency. When you learn a language you can take exams for these levels. If memory serves, C2 is completely fluent, C1 is close to fluency, and with B1 and B2 people can have basic conversations, but ones that are far beyond "what is your name? my favourite colour is blue". —Panamitsu (talk) 13:37, 18 February 2024 (UTC)


 * How would you measure such a thing? Undoubtedly there are tests, which some people might value, but those are made for humans. An AI may perform reasonably well on some aspects of language use and at the same time very poorly on others. In other words, such a test is useless. BTW, I never experimented with LLMs, so no experience. PiusImpavidus (talk) 19:59, 18 February 2024 (UTC)
 * Various models are routinely given exams or tests. --Lambiam 22:08, 18 February 2024 (UTC)
 * In the CEFR levels there are tests (for humans) in reading, writing, listening and speaking. I was wondering if there were studies applying writing/reading tests to LLMs.
 * Further, about the "tests, which some people might value": those levels are required for entire naturalization programs and for access to schools or universities. Pier4r (talk) 22:19, 21 February 2024 (UTC)
 * I didn't claim otherwise. But your comment on naturalisation programmes makes me think of the following. Some years ago, an anti-immigration party in my country had an idea to reduce immigration. To do it in some fair way, they implemented a language test. Immigrants had to follow a language course in their old country and only after passing the test, they qualified for immigration. It turned out that some people passed the test after learning the language for just a few weeks (when they were still barely understandable), whilst some native speakers failed; the test was barely more accurate than a coin toss. But that didn't matter; the test was effective in reaching its goal, namely reducing the number of people who qualified for immigration. PiusImpavidus (talk) 12:12, 22 February 2024 (UTC)
 * A similar test, the Deutsch-Test für Zuwanderer, was implemented in Germany as a condition for getting a residence permit, although the European Commission thinks it violates EU law. The test can be taken in Germany, though. --Lambiam 08:02, 24 February 2024 (UTC)

For information: CEFR is the Common European Framework of Reference for Languages, a framework published by the Council of Europe to describe the achievements of learners of foreign languages. Martin of Sheffield (talk) 09:55, 22 February 2024 (UTC)

Excess space in image metadata
When I edit metadata of my photos to be uploaded, for some reason there's always an excess space at the beginning of "Public domain" (before "P") in the Copyright holder field (as in this photo's metadata), even though I repeatedly hit backspace to remove it. No excess space appears in other editable fields. Any idea how to get rid of it? Brandmeister (talk) 15:43, 18 February 2024 (UTC)


 * This is something specific to Wikimedia Commons. I did not succeed in figuring out how the make-up of the table is generated; it must be something deep inside the system. Your best bet may be posting this question at the Commons:Village pump. --Lambiam 21:59, 18 February 2024 (UTC)