Talk:Controlled vocabulary

Untitled
If the below is the definition of controlled vocabulary:


 * The terms are chosen and organized by trained professionals (including librarians and information scientists) who possess expertise in the subject area.

Are the automatically generated lists at Amazon.com CAPs (Capitalized phrases) and SIPs (Statistically Improbable Phrases) controlled vocabularies?

--Jahsonic 23:47, 19 November 2005 (UTC)


 * Interesting questions. I'd say that the Amazon lists are examples of folksonomy; SIPs are something else entirely (quite innovative, I think). But neither of these is consciously organized or bureaucratically implemented, or beset by the inherent conservatism and inflexibility of top-down classification schemes. Bryan 00:14, 20 November 2005 (UTC)

Controlled vocabulary = taxonomy?
The first sentence of this article implies that a controlled vocabulary is synonymous with a taxonomy. But is that true? Can someone verify this? - 212.187.26.113 10:39, 30 January 2006 (UTC)


 * A taxonomy is a list of permitted words, but it suggests broader and narrower subjects for each term. A controlled vocabulary is also a list of permitted words, but it lists the synonyms and other closely related terms that are not permitted. The person who wrote that is probably thinking of the Library of Congress Subject Headings, which includes both features, but you could have a list that is just one or the other. GUllman 22:54, 30 January 2006 (UTC)
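
The distinction drawn in the reply above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the terms, the `USE_INSTEAD` and `BROADER` tables, and the `resolve` and `ancestors` helpers are all hypothetical and are not taken from any real scheme such as LCSH.

```python
# Hypothetical example data; none of these terms come from a real scheme.

# Controlled vocabulary in the sense described above: permitted terms,
# plus the non-permitted synonyms that point to them.
PREFERRED = {"Automobiles", "Bicycles"}
USE_INSTEAD = {"cars": "Automobiles", "motorcars": "Automobiles", "bikes": "Bicycles"}

# Taxonomy in the sense described above: each term points to its broader term.
BROADER = {"Automobiles": "Vehicles", "Bicycles": "Vehicles", "Vehicles": None}


def resolve(term):
    """Return the permitted term for `term`, or None if it is not listed."""
    if term in PREFERRED:
        return term
    return USE_INSTEAD.get(term.lower())


def ancestors(term):
    """Return the chain of broader terms above `term`, nearest first."""
    chain = []
    parent = BROADER.get(term)
    while parent is not None:
        chain.append(parent)
        parent = BROADER.get(parent)
    return chain
```

A list with only the first two tables would be a controlled vocabulary without hierarchy; a list with only `BROADER` would be a hierarchy without synonym control. A scheme that has both features would carry all three.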

missing definition in the first place
The first paragraph of the article doesn't make clear what a controlled vocabulary actually is. It mentions what it is used for ("to tag units of information"), who chooses and organizes the terms ("trained professionals"), what they are for ("can accurately describe..."), where they are published ("controlled vocabulary...are often published in..."), and what they are part of (CVs "form part of a larger universe of nomenclatural approaches...").

The half-sentence "A controlled vocabulary is a carefully selected list of words and phrases," is probably not an adequate definition of CVs. One (important) property of CVs is that terms and concepts are assigned to each other unambiguously, so that the vocabulary contains neither homonyms nor synonyms.

Also, because the possible uses of CVs are mentioned twice ("to tag units of information", "can accurately describe..."), I vote for a clean-up of the introductory paragraph. --80.135.172.224 14:57, 25 February 2006 (UTC) (gneer, not logged in; password lost, and WP doesn't send it)

Some random thoughts

 * A comparison should be made between controlled vocabulary and natural language. Free text search is natural language, yes, but it has more to do with indexing exhaustivity (you index almost everything as opposed to a few terms) than with being the polar opposite of controlled vocabulary.


 * If we are talking about indexing schemes (which may not be the case here), there are probably three kinds: controlled language (index terms taken from a predefined list), natural language (index terms taken from the text only), and free indexing (index terms taken either from the text or from anywhere else).


 * Strengths and weaknesses of controlled vocabulary vs natural language should probably include control of synonyms and polysemes and the use of scope notes to control homographs (all strengths), and lack of specificity and exhaustivity, slow updating, high input costs, difficulty of use by normal users and a few other weaknesses.


 * There should be mention of both thesauri and subject headings as two major examples of controlled vocabulary. Technically there's a slight difference between thesauri and subject heading schemes. Subject headings like LCSH evolved from the library environment, while thesauri were developed more for indexing of documents. As a result there are a few (minor?) differences: some LCSH terms are still displayed in indirect order, and subject heading terms tend to cover multiple concepts with phrase headings (pre-co-ordinate indexing), as opposed to the typically one-word terms (descriptors) in thesauri.

It used to be that equivalence, associated, broader and narrower terms were found only in thesauri, while subject headings at best had equivalence terms. I believe LCSH only added BT, NT and RT fairly recently (in the last few decades)? Nowadays the difference has narrowed quite a bit. Thesauri also tend to be more specialized for a narrow subject field and describe documents, while LCSH is more for describing the contents of library catalogs (books), which are wider in scope.

Aarontay 10:45, 30 January 2007 (UTC)
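
For readers unfamiliar with the relationship types mentioned above (equivalence, broader term BT, narrower term NT, related term RT), a thesaurus entry can be sketched as a small record. This is a rough illustration, not how LCSH or any particular thesaurus is stored: the `Descriptor` class, the helper functions, and the sample terms are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Descriptor:
    """One thesaurus entry with the usual relationship types."""
    label: str
    used_for: set = field(default_factory=set)  # equivalence: non-preferred terms
    broader: set = field(default_factory=set)   # BT
    narrower: set = field(default_factory=set)  # NT
    related: set = field(default_factory=set)   # RT (associative)


def add_broader(thesaurus, term, broader_term):
    """Record BT together with its reciprocal NT so the two stay consistent."""
    thesaurus.setdefault(term, Descriptor(term)).broader.add(broader_term)
    thesaurus.setdefault(broader_term, Descriptor(broader_term)).narrower.add(term)


def add_related(thesaurus, a, b):
    """Record RT in both directions, treating the relation as symmetric."""
    thesaurus.setdefault(a, Descriptor(a)).related.add(b)
    thesaurus.setdefault(b, Descriptor(b)).related.add(a)


# Hypothetical sample entries.
thesaurus = {}
add_broader(thesaurus, "Thesauri", "Controlled vocabularies")
add_related(thesaurus, "Thesauri", "Subject headings")
thesaurus["Thesauri"].used_for.add("Thesauruses")
```

A scheme that offers only equivalence terms, as subject headings are said above to have done originally, would populate `used_for` and leave the other three sets empty.
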
 * The taxonomy thing is covered on Corporate taxonomy, but currently it is a mess. There doesn't appear to be a really strong consensus on the definition, and I'm currently reading a paper that tries to disentangle the differences between thesauri, classification systems and taxonomies. In brief, taxonomies are based on both thesauri for labeling semantics and classification systems (either facets or hierarchies) for structure. The main point here is that taxonomies are generally organization-specific and more user-focused, as opposed to being based on literary warrant like most thesauri and classification schemes. They focus not just on books and catalogs, or on "connecting people to documents but also people to people". While taxonomies can play roles such as filtering search results and conveying the context of a search, their primary role is that of supporting browsing. There is a line about how taxonomies differ not in foundations but in deployment. But I suppose that view comes from the recently fashionable knowledge management/organization field?

Partially Controlled Vocabulary
I was surprised to find that this topic does not discuss the concept of a 'partially controlled vocabulary'. That is, a vocabulary consisting of a predefined core of terms, but allowing for extensions using alternative namespaces. This concept is already used in many software systems. Here's an example reference in a MARC discussion paper - see the section on Gender: http://www.loc.gov/marc/marbi/2012/2012-dp05.html.

TonyP (talk) 22:26, 16 June 2012 (UTC)
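
A partially controlled vocabulary of the kind described above can be sketched as a validation rule: accept a value if it is in the predefined core, or if it carries a prefix from an allowed extension namespace. The core terms, the namespace names, and the `namespace:term` syntax below are all hypothetical choices for illustration; they are not taken from the MARC discussion paper.

```python
# Hypothetical predefined core and extension namespaces.
CORE_TERMS = {"red", "green", "blue"}
ALLOWED_NAMESPACES = {"local", "vendor"}


def is_valid(value):
    """Accept a core term, or a 'namespace:term' value from an allowed namespace."""
    if value in CORE_TERMS:
        return True
    namespace, sep, term = value.partition(":")
    # A valid extension needs a separator, a known namespace, and a non-empty term.
    return bool(sep) and namespace in ALLOWED_NAMESPACES and bool(term)
```

Under these assumptions `is_valid("red")` and `is_valid("local:teal")` return `True`, while a bare `"teal"` or a value in an unknown namespace is rejected.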

External links modified
Hello fellow Wikipedians,

I have just added archive links to 2 external links on Controlled vocabulary. Please take a moment to review my edit. If necessary, add after the link to keep me from modifying it. Alternatively, you can add to keep me off the page altogether. I made the following changes:
 * Added archive https://web.archive.org/20101204132228/http://www.imresources.fit.qut.edu.au:80/vocab/ to http://www.imresources.fit.qut.edu.au/vocab/
 * Added archive https://web.archive.org/20090314094707/http://www.fao.org/aims/kos_list_type.htm to http://www.fao.org/aims/kos_list_type.htm

When you have finished reviewing my changes, please set the checked parameter below to true to let others know.

Cheers.—cyberbot II  Talk to my owner :Online 07:27, 13 February 2016 (UTC)

Are radio "procedure words" and "standard phraseology" controlled vocabulary?
I'm doing a deep dive on documenting two-way radio voice procedures, and am aware of several very controlled and restricted vocabularies, including Procedure words, the ICAO (aviation) phraseologies, and the IMO standard phraseologies Standard Marine Communications Phrases (and its predecessor, Seaspeak).

Should coverage of these things be included in this article, or should I create a new article Restricted vocabulary to cover these separately? PetesGuide (talk) (K6WEB) 16:33, 18 December 2017 (UTC)

External links, again
Why delete those references? If "Universal Data Element Framework" is seen to be relevant, those links are too. DEddy (talk) 00:42, 17 July 2018 (UTC)


 * The external links section was accumulating WP:LINKSPAM (and this is not a new problem, as the previous comment above from 2013 demonstrates). It is not Wikipedia's purpose to include a lengthy or comprehensive list of external links related to each topic (WP:LINKFARM). Attention to this article is best spent on improving its content by adding inline citations to reliable published sources, as the cleanup tag at the top of the article notes. The comment about Universal Data Element Framework is irrelevant, as that is an internal wikilink, not an external link. Biogeographist (talk) 10:27, 17 July 2018 (UTC)

Sentence needs re-write
Last sentence of 1st para of In library and information science: “In short, controlled vocabularies reduce ambiguity inherent in normal human languages where the same concept can be given different names and ensure consistency.”

The structure of the sentence is ambiguous: is ensuring consistency something that happens when the same concept is given different names, or something that controlled vocabularies do?

I recommend something like “In short, controlled vocabularies help to ensure consistency and reduce the ambiguity inherent in normal human languages where the same concept can be given different names.” or “In short, controlled vocabularies help to ensure consistency, and to reduce the ambiguity inherent in normal human languages where the same concept can be given different names.” instead. Sbauman (talk) 20:50, 2 May 2019 (UTC)