Wikipedia:Conlangs/Notability, verifiability, merit, completeness



Notability vs. Merit
I want to note that we are trying to determine notability here, not merit. A conlang may be a beautiful piece of work, but if it's just sitting ignored on a single webpage and spoken only by its creator, it's not notable. Likewise, a language like Klingon might not be a fine example of our art, but it's unquestionably notable due to its connection with Star Trek. A conlang may have such merit that it has attracted attention and become notable, but the second step there is very necessary. DenisMoskowitz 15:33, 2005 July 29 (UTC)


 * OK, it's good to be reminded about that. If we go with a two-tier system (any two minor criteria or any one major criterion), as seems to be the consensus so far, we need to make sure that the minor criteria are divided up or distinguished in such a way that a conlang can't get in just on two minor criteria that prove completeness or interestingness, without at least one minor criterion that proves fame, influence, etc.


 * I would also back up and revise my various proposals about minimum corpus size by saying these are minimum amounts of publicly available text in the language -- online, or in a book (should we rule out vanity press publications for this purpose?), or maybe even just available for inspection in a research library somewhere.


 * Some possible criteria that prove fame and influence:


 * A Google search excluding Wikipedia and mirrors finds at least N independent mentions of the language (by persons other than the creator(s)).
 * A Google search excluding Wikipedia and mirrors finds at least N independent extended discussions of the language (by persons other than the creator(s)).
 * Not that I object, but I think this will require quite some digging! You know how this kind of thing works: first the deletion police issue a VfD with a claim about non-notability, and the next thing is that the keepers will have to come up with the evidence. --IJzeren Jan
 * At least N professionally published books or magazine articles mention the language. (minor)
 * At least N professionally published books or magazine articles discuss the language at some length. (major; if N = 1, this includes Ithkuil)
 * Agreed, but in that case we might have to be more specific about "some length". --IJzeren Jan
 * Yes, for all these metacriteria we would need to make them more specific (set N = a particular number, define "at some length") to make them real proposals to be voted on. What do you think would be a good threshold for this one?  More than 1000 words? --Jim Henry | Talk 20:54, 31 July 2005 (UTC)
 * At least N people besides the creator(s) can read/write/speak the language.
 * It's already been shown that requiring speakers is unfairly biased against artlangs and towards auxlangs. I doubt most artlangers can speak their artlangs. ThomasWinwood 17:49, July 31, 2005 (UTC)
 * That's true, Thomas, but remember: these are inclusive criteria. In other words, if this condition is met, it adds to the notability of the conlang; but if it's not, that doesn't prove its non-notability. Of course, you can't measure artlangs by their number of speakers, but like it or not: art- and auxlangs are all dining at the same table! --IJzeren Jan 19:59, 31 July 2005 (UTC)
 * This is a recognition that the most famous artlangs do have some speakers -- though far fewer than the modestly successful auxlangs -- and I envision N being set a lot lower for artlangs than Almafeta's 50. --Jim Henry | Talk 20:54, 31 July 2005 (UTC)
 * The language is used in a work of fiction which is deemed notable, or whose author is deemed notable.
 * I think the following quotation from Importance should be included here too: "[a subject is notable when] there is clear proof that a reasonable number of people (eg. more than 500 people worldwide) are or were concurrently interested in the subject." --IJzeren Jan
 * That's unfairly high for artlangs. ThomasWinwood 17:49, July 31, 2005 (UTC)
 * True. But again, this is a criterion that, once met, contributes to notability. It doesn't work the other way round. --IJzeren Jan 19:59, 31 July 2005 (UTC)
 * Set N = 1 in all the above criteria (except maybe the first) to establish "verifiability" that may fall short of "fame and influence".


 * Some criteria that only prove completeness and expressivity:
 * thorough phonology and grammar (covering all the topics in the "Lingua Questionnaire")
 * lexicon of at least N words
 * This is vastly better than the one below it. ThomasWinwood 17:49, July 31, 2005 (UTC)
 * lexicon of at least N words with real definitions, not just one-word English glosses
 * "Real definitions"? All English words have "real definitions", so why not use the English word to define rather than repeating the OED? This is absurd. ThomasWinwood 17:49, July 31, 2005 (UTC)
 * I agree, this is nonsense. Should I now start counting my number of dictionary items where the translation consists of only one word? --IJzeren Jan 19:59, 31 July 2005 (UTC)
 * This is a complex enough issue that I'll reply in a new section below. --Jim Henry | Talk 20:54, 31 July 2005 (UTC)
 * existence of corpus of at least N words (if it seems no one has read it besides the author, this shows only completeness/expressivity, perhaps not enough for notability by itself)
 * at least one person besides the creator can read/write/speak the language
 * Again, unfairly biased towards auxlangs. ThomasWinwood 17:49, July 31, 2005 (UTC)
 * Like I said, while that is true, it is also not an aspect to be fully neglected. Even in the case of an artlang I think it's safe to say that a high number of speakers (what for? ask them!) contributes to its notability. --IJzeren Jan 19:59, 31 July 2005 (UTC)
 * My idea was that having even two speakers proves the language is in some sense complete and expressive; adding more speakers after that increases the evidence of fame, but doesn't add much evidence of completeness compared to the vast difference between a conlang with no speakers, or just one, and one with two. --Jim Henry | Talk 20:54, 31 July 2005 (UTC)
 * I'm not sure the "worked on by N people for M years" criteria establish either fame or completeness. --Jim Henry | Talk 22:07, 30 July 2005 (UTC)
 * Me neither. --IJzeren Jan
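
The two-tier rule discussed above (any one major criterion, or any two minor criteria, provided at least one of the minor ones shows fame or influence rather than only completeness) can be sketched roughly as follows. This is only an illustration of the logic under discussion, not an agreed policy: the `Criterion` class, its field names, and the example criteria are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    """A criterion that a conlang has been shown to meet (hypothetical model)."""
    name: str
    tier: str  # "major" or "minor"
    kind: str  # "fame" or "completeness"


def passes_two_tier(met):
    """Return True if the met criteria satisfy the two-tier rule.

    One major criterion is enough. Otherwise at least two minor criteria
    are needed, and at least one of them must be evidence of fame or
    influence, so that two completeness criteria alone do not suffice.
    """
    if any(c.tier == "major" for c in met):
        return True
    minors = [c for c in met if c.tier == "minor"]
    has_fame = any(c.kind == "fame" for c in minors)
    return len(minors) >= 2 and has_fame


# Example criteria, loosely based on the lists above (labels are assumptions).
big_lexicon = Criterion("lexicon of at least N words", "minor", "completeness")
full_grammar = Criterion("thorough phonology and grammar", "minor", "completeness")
book_mention = Criterion("mentioned in N published books", "minor", "fame")
book_discussion = Criterion("discussed at length in N books", "major", "fame")
```

With these definitions, a language meeting only `big_lexicon` and `full_grammar` would not pass, while adding `book_mention`, or meeting `book_discussion` alone, would.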

Lexicon properties that are evidence of completeness
At some point someone proposed a lexicon of at least 2,500 words as a minor criterion, and someone else objected that it was easy to use Langmaker or another program to transform an existing language, or automatically generate conlang words to go with an existing word list. My tentative proposal above,


 * lexicon of at least N words with real definitions, not just one-word English glosses

was perhaps badly phrased, but it was partly intended to guard against conlangs with such automatically generated vocabularies getting into Wikipedia primarily on the strength of their lexicon size. It was also intended to screen out naive conlangs that are merely relexes of or codes for English (or another natural language), whether the vocabulary was automatically generated or not. This has been discussed on the CONLANG mailing list and undoubtedly other venues more than once before; the idea I based this proposal on is that a conlang whose creator has seriously and independently thought out its semantics will have a fair number of words that cannot be glossed with a single English word. Some will be vaguer than any English word, some will be more specific, some will be both in different respects. That is not to say that a language whose lexicon consists mostly of one-word glosses is necessarily a relex; in the language center of the creator's brain it might be very different, and the public definitions in the lexicon might not represent the real wideness or narrowness of the words' meanings. But the lexicon (and grammar, sample texts with glosses, etc.) is what we have to go on in judging the language's completeness.

This is not to say that a relex (or apparent relex) conlang cannot possibly be notable for other reasons than its lexicon (interesting phonology or grammar, actual speakers, use in professional fiction etc.). I don't want to assert, either, that every word in a "real" conlang's lexicon must be an extended definition (or an extended list of approximate English equivalents); any two languages will probably have some points of semantic contact, and thus some words that can be validly defined with one or two words in the other. So N in the proposal above would be set a lot lower than N in the other proposal for a minimum lexicon size. Maybe this could be done statistically, e.g.:


 * language has at least 2,500 words in its lexicon;
 * a random sample of 50 entries from the lexicon shows at least 10 words with English (or other language) definitions that are more than one word.
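
As a rough illustration only, the two-part test above could be checked along these lines. The thresholds (2,500 words, a sample of 50, at least 10 multi-word definitions) come from the proposal; the function name, the dictionary representation of a lexicon, and the whitespace-based word count are assumptions made for the sketch.

```python
import random


def passes_lexicon_test(lexicon, min_size=2500, sample_size=50,
                        min_multiword=10, seed=None):
    """Check a lexicon against the proposed size-plus-sampling test.

    `lexicon` is assumed to map each conlang word to its definition string.
    A definition counts as "more than one word" if splitting it on
    whitespace yields more than one token (a crude approximation).
    """
    if len(lexicon) < min_size:
        return False
    rng = random.Random(seed)
    definitions = list(lexicon.values())
    # Never ask for a larger sample than the lexicon can supply.
    sample = rng.sample(definitions, min(sample_size, len(definitions)))
    multiword = sum(1 for definition in sample if len(definition.split()) > 1)
    return multiword >= min_multiword
```

Because the sample is random, a lexicon near the threshold may pass on one draw and fail on another; passing a fixed `seed` makes a given check repeatable.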

I'm not really satisfied with this, either. A good artlang might have carefully handmade words in a few important semantic fields, and an automatically generated portion as well (assigning autogenerated words to a list of animals and plants, maybe) which could bias the statistical sampling (like the auto-generated city and county articles in Wikipedia). But a mere minimum lexicon size, with no criteria for its properties, seems too open to abuse. Maybe it would be simpler just to say "lexicon is not entirely autogenerated"; but this would rule out Classical Yiklamu, which might be notable precisely because of its creator's controversial decision to assign an independent root to every entry in WordNet, resulting in a language with no derivational morphology. --Jim Henry | Talk 20:54, 31 July 2005 (UTC)


 * All points you raise are perfectly valid, Jim, but I honestly don't think this is a solution. First of all, because in many cases a large number of one word translations will say more about the quality of the lexicon than about the completeness of the language. Secondly, because I think we shouldn't exaggerate with our criteria: if we make them too complicated, nobody will use them anymore. And finally: you can't fetch everything with profiles, criteria and conditions. There should always be space for the "you'll know it when you see it" criterion, and therefore let's not exclude the possibility of including or excluding a conlang just by using our own wet finger.
 * BTW, I'm not too familiar with the Langmaker program (I once downloaded it but never used it), but I thought it generates only a list of words that fit certain phonological and orthographic parameters, without assigning any meanings? So, if the language in question is a relex of English, that's due to the conlanger's naiveness, not to the software.
 * No, I think 2,500 words without further specification should work fine for us. Look here and you'll see the number of languages given is manageable. The list also shows us that there are other problems involved, too. Look for example at no less than twelve languages by "Dtsdesign", all around the same size. Or Nunihongo, with 100,000 words that apparently have been generated by applying a sound change program. Or Eda, which claims 46,949 words but on the site I haven't found a lexicon or even a corpus of text samples. Or New English, which consists of the entire English language with the addition of a number of neologisms. You can of course try to filter out such languages by refining the criteria (although some of them might be notable for other reasons), but as long as any "minor criterion" requires another minor criterion, we might better stick to our "simple" limit of 2,500.
 * --IJzeren Jan 05:20, 1 August 2005 (UTC)


 * You're right that making the conlang notability policy too complicated will hinder it from being applied easily (if indeed it ever gets approved in the general vote). But I still think a minimum lexicon size should be a subcriterion of a general "completeness" criterion.  That is, a language should not get into Wikipedia just for being more complete than average (having a larger than average lexicon and one other criterion of similar import, e.g. existence of a fairly thorough reference grammar).  Besides the evidence of completeness there should also be some evidence that people besides the language's creator are interested in it: discussion in books, magazine articles, other people's websites, mailing list archives, etc.


 * The 12 languages by dtsdesign seem to all be described as "fictional naming languages", i.e. sources for character and place names in an RPG; they may not have any grammar except maybe compounding rules. And they all seem to be based on generating words with langmaker and matching them with the same list of English words.  --66.150.59.2 19:11, 1 August 2005 (UTC) (really Jim Henry | Talk; I got logged out when saving the above comment.)

Verifiability and Original Research
I thank DenisMoskowitz for noting that artistic merit is not the touchstone for inclusion. While I believe that notability *should* be a policy, and support its use, it is good to remember that the actual policies are verifiability and No original research. Note that verifiability does not mean merely that the existence of the thing can be verified, but that all significant claims about it can be verified from disinterested sources. No original research would strongly suggest that articles not be written by the creator or a close colleague.

Specifically with respect to conlangs, this means that a language should have been discussed sufficiently widely that there is a body of knowledge that extends beyond its creator and his/her close colleagues. If I may make an analogy, when superstring theory was something that Ed Witten was kicking around at physics department teas, the theory was not sufficiently verifiable for inclusion, even though it was being discussed by some of the pre-eminent minds in the field. It was only after it got "out and about" a bit that it became sufficiently verifiable.

Because of my idiosyncratic interest (arising out of their use in RPGs) I am not that familiar with the field, and would prefer not to speak on most specific issues. On general grounds, I feel that the guidelines proposed approximate a threshold at which it becomes reasonable to say that an article on the language contains verifiable information and is not original research. A language that has been learned by people other than its creator will, perforce, be critiqued and discussed more widely than one that has not, irrespective of artistic merit.

208.20.251.27 20:13, 30 July 2005 (UTC) The above edit made by me -- got logged out somehow. Robert A West 20:16, 30 July 2005 (UTC)
 * So, to sum up, active discussion about a language (as opposed to one that has gone without mention other than a summary on Langmaker) should be one of our criteria? Almafeta 09:27, 1 August 2005 (UTC)
 * That would seem consistent with Wikipedia's treatment of other subjects and the blanket prohibition on original research. Robert A West 15:58, 1 August 2005 (UTC)
 * I should add that, for most academic theories and constructs, "discussion" means "discussion in a peer-reviewed journal". Here I must confess my ignorance.  Are there peer-reviewed journals for constructed languages?  If not, is there a forum that serves the same purpose of acting as a filter, and a consensus of which would qualify as NPOV? Robert A West 17:07, 1 August 2005 (UTC)


 * There used to be the Journal of Planned Languages and the Journal of Unnecessary Languages. I don't know if there are any current venues of the same kind.  But the archives of those journals are still online and relevant for older languages that were discussed in them. --Jim Henry | Talk 20:17, 1 August 2005 (UTC)


 * Given the interest in the subject I see here, I am astonished to learn that there is not more peer-reviewed academic work going on. In its absence, how do you, personally, filter out the inevitable dross and concentrate on the serious research?  Pardon my ignorance, but I am genuinely interested to know.  Robert A West 08:05, 2 August 2005 (UTC)


 * True, there's not much. Langmaker.com started out as an online newsletter, but ceased to be one long ago. Here in the Netherlands we briefly had a Nederlands Genootschap voor Linguafictie, which also published a yearbook or something, but initiatives like that never last long. A year or two ago, another conlang journal was started, but never made it to the first issue. Why? Probably because the market is too small. Let's face it: conlanging is not an average hobby or artform. Before the coming of the Internet, it was virtually impossible to find enough people interested to make such a journal profitable (or at least covering the expenses); as far as there was anything, it was almost exclusively auxlang-oriented. And after the coming of the Internet, we basically have access to our possible clients, but since everything is now readily available online there is little need for such a newsletter. No, honestly, I'm quite surprised that despite all this a few newsletters existed at all, and that apart from that a few books have been published about the subject.
 * I've done quite some research in the field myself. In the beginning, I gathered lots of info, until I had over a thousand conlangs listed in a personal database, along with information about their creators and about the languages, and samples. I visited hundreds of websites; participated in the Conlang list and in several other mailing lists; frequented conlang directories and the like. At some point you'll start recognising patterns; you'll see certain names and languages mentioned again and again; you'll notice that some people are treated with a special kind of respect, and that their opinion about other conlangs also seems to count heavier. You'll start distinguishing between raw sketches and highly elaborated, detailed and complete languages, and everything that's in between. You'll recognise originality and see the difference between good and bad work. You'll notice soon enough that there are a few true masterpieces around. Etcetera etcetera etcetera.
 * Apart from reading a lot, I also wrote a lot, including the entire article about constructed languages in the Dutch Wikipedia. Where I didn't have certain bits of info at hand, I had to find them and sort them out myself.
 * Does that answer your question? --IJzeren Jan 08:49, 2 August 2005 (UTC)

What makes a work of art notable?
How do we decide what makes a particular painting worthy of an entry, rather than it just being another example of a painting? Why would one opera be notable, rather than simply a one line mention under Opera? Surely the reasons are unique to the individual work. Surely, indeed, the fact that they are unique to the individual work is what makes them points of notability.

The same applies to a minority artform, such as conlanging. What will make an individual conlang worthy of an entry will be highly individual. However, it should be clearly stated at the beginning of the article. Someone considering writing a conlang article should ask themselves "Why would the casual reader be interested in this?" If you can think of a good answer, start the article with it. If not, don't write it. If you find an article about a conlang and want to know whether it's worthy of inclusion, ask yourself, "As a casual reader who wants to know a bit about conlanging, would I follow the link in this article?" If so, keep. If not, follow the link, and see if the page it leads to arouses interest. If so, improve the wikipedia article. If not, and you consider yourself in a position to make an informed artistic judgement, you may consider a VfD. If you're not in a position to make an informed artistic judgement, leave well alone.

Pete Bleackley


 * Good analogy. If you go back and read the basic policies, you will find that artistic merit appears nowhere as a criterion for a work of art.  Instead, we consider criteria appropriate to whether the topic is encyclopedic.
 * Is there a corpus of discussion in reliable sources?
 * Is that discussion sufficiently broad to write an NPOV article about the topic?
 * Can a meaningful article be written using at least some secondary sources in order to avoid straying into original research? Note that an article about an artistic endeavor written entirely from primary sources constitutes the creation of a secondary source, and is almost certainly original research.
 * I quote from the OR policy: The fact that we exclude something does not necessarily mean that material is bad – Wikipedia is simply not the proper venue for it. We would have to turn away even Pulitzer-level journalism and Nobel-level science if its authors tried to publish it first on Wikipedia. Robert A West 17:25, 1 August 2005 (UTC)


 * With all due respect, I don't get the impression this is what Pete was talking about. The way I understood him, his point was: if you start writing an article about a conlang, then better be explicit about the reason why you think the language is notable. And I fully agree with that. We certainly don't want hundreds of minor conlangs here. I do not however agree with his implication that all we need is "good artistic judgement"; that's more or less the current situation we are trying to solve.
 * Furthermore, I don't really see the point of your remarks about Original Research. Who says that an article about a conlang must be original research? Is there ánything in the article about Wenedyk (just to pick an example I'm familiar with) that can be qualified as such? If you set up your article as something like: "Language X is a constructed language created by ... in year .... It plays a role in book/movie/conworld .... It is characterised by ... and a remarkable feature of it is .... The online lexicon consists of ... words", then I don't think you can say there is any original research involved. I suppose the number of words in the lexicon can hardly be counted as O.R., even if the number isn't mentioned anywhere explicitly, since anyone can easily count it. Of course, entire grammars are out of line (although several other Wikipediæ contain plenty of them). --IJzeren Jan 18:12, 1 August 2005 (UTC)


 * Not quite the point I was trying to make about artistic judgement - I meant that informed artistic judgement is a minimum requirement for being able to comment on the notability of a conlang. Pete Bleackley


 * It sounds like Robert A West was saying that an article whose only source is the conlang creator's website would be original research because it is written entirely from a primary source. I think this might be true.  That is why I proposed the two subsets of criteria, one of which is focused on evidence of fame, influence, etc.: mentions in books and articles and on the web, etc.  If the only information about the conlang that's available is on the author's website, then it doesn't matter how big its lexicon or thorough its grammar -- or maybe even how many thousands of words have been written in the language -- there's no evidence that anyone besides the author is interested in it. --Jim Henry | Talk 19:26, 1 August 2005 (UTC)


 * Well, not entirely. Most conlangs have a) their own website; b) an entry on Langmaker.com; c) been discussed in one or several mailing lists; d) a place in one or more other conlang directories. If the language has been used in a translation relay we have e), too. There's relatively little you need to do for that. But what value does that add? All there is to know can be traced back to one place, the website. Let's pick an example: you write an article about a certain book. Is it "Original Research" if you write the synopsis yourself, on the basis of the book itself, instead of copying it from some website and(/or) repeating it in your own words? --IJzeren Jan 19:38, 1 August 2005 (UTC)


 * Good analogy. If J. K. Rowling were to write an article on Harry Potter it would be POV or OR or both.  If I wrote a synopsis, and nothing more, it would be a pretty trivial article, and probably OR.  If I added my opinion of the book, it would be POV and OR.  If I describe various published opinions on the quality of the work, its relationship to other works of literature, its influence and so on, and source everything, then I have produced a great article.  And aren't we here to write great articles?  Robert A West 19:54, 1 August 2005 (UTC)

Summary
I'll try to summarize the above discussion.
 * No amount of evidence for artistic merit or completeness is enough by itself to justify inclusion in Wikipedia. (Jim Henry, DenisMoskowitz, Robert A West)
 * But evidence of being complete and expressive should be a minor criterion, weighted along with some evidence that the language is famous, influential, has speakers, etc. (Jim Henry)
 * Minimum lexicon size is a sufficient criterion without special strictures on its properties (IJzeren Jan, and, with reservations, Jim Henry)
 * Lexicon size should not be a minor criterion by itself, but a subcriterion of a minor criterion re: completeness (Jim Henry)
 * Any conlang article should note what makes the language unique or particularly interesting (Pete Bleackley)
 * Verifiability requires using sources other than the conlang creator's own site. Writing an article purely based on the information on the conlang creator's site would amount to original research. (Robert A West, Almafeta, and Jim Henry)
 * A modest, NPOV article based solely on the creator's own site doesn't constitute significant original research. (IJzeren Jan)

Is that a fair summary? --Jim Henry | Talk 20:17, 1 August 2005 (UTC)


 * I would add that if a well-sourced, NPOV, non-OR article can be written, that is proof that the subject is suitable for Wikipedia. Maybe there are exceptions, but they should be obvious if and when they occur.  Robert A West 08:15, 2 August 2005 (UTC)


 * So, if I understand you correctly, you consider that independent verifiability (based on how much has been written in/about it by people besides its creator) is necessary and sufficient irrespective of notable properties of the conlang itself (interestingness, completeness, artistic merit, size of corpus)?  That makes a certain amount of sense, given how hard it is to objectively define the latter criteria.  Please add this to Conlangs/Criteria. --Jim Henry | Talk 11:54, 2 August 2005 (UTC)


 * Will do. For VfD criteria, that is a correct statement of Wikipedia policy as I understand it.  Editorial judgment about whether to create the article in the first place will, naturally, involve a decision about whether the editor considers the conlang worth a mention, or whether an article about a group of conlangs would actually be more informative to potential readers.  Robert A West 16:47, 2 August 2005 (UTC)

Why should any conlangs be notable
It has occurred to me that the people who've been proposing all these VfDs have been taking the position that all conlangs are non-notable. Therefore the most important goal of whatever policy we decide on should be to establish that individual conlangs can be considered notable for inclusion - unless we have this firmly established, nobody will take any notice of any criteria we propose for notability. I'm going to put up some ideas at Conlangs/Why conlangs should be covered, which anyone can improve as they see fit. PeteBleackley 08:44, 9 September 2005 (UTC)