Talk:Shannon (unit)

Rowlett on the shannon
According to Rowlett's Dictionary of Units, the shannon "was originally called the bit [2], because when the message is a bit string and all strings are equally likely, then the information content turns out to equal the number of bits." I think this justifies adding a statement explaining that under the specified conditions, the information content is equal to the number of bits. Do others agree? Dondervogel 2 (talk) 20:47, 18 May 2015 (UTC)


 * Yup, I do. Especially with the "under the specified conditions", or equivalently, with the implication that the equivalence does not hold in general.  Until we get more references, that should do.  —Quondum 21:43, 18 May 2015 (UTC)
 * It is also interesting to note that it clearly distinguishes the two uses of "bit", such that it is clear that one is a unit of an amount of data, and one is a unit of information, i.e. that data and information are not equivalent quantities. —Quondum 21:47, 18 May 2015 (UTC)
 * Yes, I reckon that would make a useful addition to the bit article. For here though, Rowlett's full definition of the Shannon reads "a unit of information content used in information and communications theory. The definition is based on the idea that less-likely messages are more informative than more-likely ones (for example, if a volcano rarely erupts, then a message that it is erupting is more informative than a message it is not erupting). If a message has probability p of being received, then its information content is -log2 p shannons. For example, if the message consists of 10 letters, and all strings of 10 letters are equally likely, then the probability of a particular message is 1/26^10 and the information content of the message is 10(log2 26) = 47.004 shannons. This unit was originally called the bit [2], because when the message is a bit string and all strings are equally likely, then the information content turns out to equal the number of bits. One shannon equals log10 2 = 0.301 hartley or loge 2 = 0.693 147 nat. The unit is named for the American mathematician Claude Shannon (1916-2001), the founder of information theory." Dondervogel 2 (talk) 22:25, 18 May 2015 (UTC)
 * How about adding this sentence, citing Rowlett: "If a message is made of a sequence of bits, with all possible bit strings equally likely, the message's information content expressed in shannons is equal to the number of bits in the sequence." Dondervogel 2 (talk) 22:32, 18 May 2015 (UTC)
 * That sentence is good, but I think we need more. We probably need to say that the shannon is commonly referred to as the bit, though in this meaning it is not equivalent to the bit as a unit of data.  This common usage and the distinction might be worth a section. —Quondum 23:09, 18 May 2015 (UTC)
 * A sentence that indicates that a "shannon" is more commonly called a "bit" (even though the latter has another meaning as well) is what I hope to see. Leegrc (talk) 17:46, 19 May 2015 (UTC)
 * What I take from Rowlett's definition is that the shannon used to be called a bit, but that the word "bit" now has a new meaning (as a unit of computer storage) that is not synonymous with "shannon". Dondervogel 2 (talk) 18:39, 19 May 2015 (UTC)
 * Rowlett also says "bit (b) [2]: a unit of information content, now known also as the shannon. In information and communications theory, if a message has probability p of being received, then its information content is -log2 p shannons. This unit is often called the bit, because if the message is a bit string and all strings are equally likely to be received, then the information content is equal the number of bits." Note the use of "also".  Also, there is still the unavoidable fact that the usage of "bit" as a unit of information significantly dominates the use of "shannon".  Leegrc (talk) 19:17, 19 May 2015 (UTC)
 * I still see no evidence to support the claim that the Shannon is known (today) as a bit. Dondervogel 2 (talk) 20:17, 19 May 2015 (UTC)
 * We seem to be in one of those transition periods, where a new convention has been motivated and is being formally adopted. The benefits after the transition are clear, but in the interim there is a lot of resistance to the change, and a lot of use of the old term.  The use of the unit bit can be seen in this meaning in almost every text on information theory.  Finding texts that use the shannon seems to be a challenge.
 * Claude Shannon introduced the name "bit" and defined it as a unit of data, and then expressed information entropy in the same units. I've just been scanning his 1948 paper, and he defines it as a unit of data (though he uses the word "information" to mean what I'm calling data); he then proceeds to (in only two cases that I noticed) incidentally use the unit in relation to entropy.  He does not define it as a unit of entropy, as far as I can tell, and he makes absolutely no case for it as a unit of entropy.  He is quite sloppy about the use of the words "information" and "bit".  I found this, a whole chapter on the topic of quantifying and naming the unit (without the shannon being mentioned).  It is interesting to note that despite a serious attempt, all it succeeds in showing is that regarding a unit of data and a unit of entropy as the same is decidedly difficult to justify clearly, to the extent that one would conclude: "Why not just define a unit of information (and of information entropy) and be done with it?".
 * Anyhow, I can see a simple mention of the original/alternative/dominant name (bit) in the lead, plus a section on the history of the unit. But this article should also mention why it has been chosen to replace the bit as a unit, even though the transition is still at a very early point. —Quondum 20:33, 19 May 2015 (UTC)
 * If the papers mostly use "bit", it cannot be difficult to find one to support the claim. Can we add a couple of examples to the reference list? Dondervogel 2 (talk) 21:10, 19 May 2015 (UTC)
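
For anyone who wants to check the numbers in Rowlett's definition quoted above, here is a minimal sketch. The function and variable names are my own, chosen for illustration; only the formula (information content = -log2 p shannons) and the example values come from Rowlett.

```python
import math


def information_content_sh(p):
    """Information content, in shannons, of a message received with probability p."""
    return -math.log2(p)


# Rowlett's example: a 10-letter message, all 26**10 strings equally likely.
letters = information_content_sh(1 / 26**10)  # approximately 47.004 Sh

# A bit string of length n, with all 2**n strings equally likely,
# carries n shannons (the special case that motivated the name "bit").
n = 8
bit_string = information_content_sh(1 / 2**n)

# Conversion factors quoted by Rowlett.
SH_IN_HARTLEY = math.log10(2)  # approximately 0.301
SH_IN_NAT = math.log(2)        # approximately 0.693147

print(letters, bit_string, SH_IN_HARTLEY, SH_IN_NAT)
```

The equality between shannons and the number of bits holds only under the equal-likelihood condition; with any other distribution over bit strings the two numbers differ.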

Do we have much evidence that a "transition" from bit to shannon is underway? One possible citation for this transition is the document known as IEC_80000-13, but the source is behind a paywall, which is rather unusual for adopted standards in my experience. Leegrc (talk) 14:00, 20 May 2015 (UTC)


 * The first reference below (Lewis) is evidence that it occurs in print, FWIW. I should not claim that it is "underway"; that sort of implies the path is fairly deterministic or at least that it has some momentum, which I cannot comment on.  Unusual?  That seems to depend on the organization.  Many standards that I've wanted have been behind very high paywalls.  Weird ("let's severely limit access to our standards, that way they're sure to be adopted"), but nevertheless.  —Quondum 16:22, 20 May 2015 (UTC)

List of references
This section is intended for accumulation of references that might be used for the article. Anyone can add to it. —Quondum 21:54, 19 May 2015 (UTC)


 * Geoff Lewis (2013), Communications Technology Handbook, ch. 16, p. 150 – binit+bit+shannon, nat, decit+hartley
 * Emmanuel Desurvire (2009), Classical and Quantum Information Theory, §6.1, p. 84 – bit/symbol, nat/symbol
 * Francis T.S. Yu (2000), Entropy and Information Optics – bit, nat, hartley
 * Aiden A. Bruen, Mario A. Forcinito (2011), Cryptography, Information Theory, and Error-Correction, §10.2, p. 179 – "Shannon bit" (!)
 * Tammy Rush (2015), Entropy 181 Success Secrets – 181 Most Asked Questions On Entropy – What You Need To Know – shannon, nat, hartley (this may have been sourced from WP, maybe don't use it).
 * Sang-Hun Lee and Randolph Blake (1999), Visual Form Created Solely from Temporal Structure, Science 14 May 1999: 284 (5417), 1165-1168. [DOI:10.1126/science.284.5417.1165] – "the information (expressed in binary units: bits) ..."

Removal of mention as unit of entropy
Just because one standard does not list entropy as one of the quantities measured using the shannon as a unit does not necessarily justify removing its mention. The Shannon entropy of a random variable is simply the expected value of the information for the event, and thus inherently has the same unit; IEC 80000-13 is thus sufficient (from the perspective of a standard) to imply the removed statement, given the standard system of quantities. A citation-needed template would, however, be appropriate: a suitable reference should be given for the reader to be able to dig deeper on this. —Quondum 12:38, 22 May 2021 (UTC)
 * Point taken. I reinstated and then rearranged the opening sentence to avoid giving the impression the unit of entropy was defined (or even mentioned) by IEC 80000-13. Dondervogel 2 (talk) 14:05, 22 May 2021 (UTC)
 * I see what your objection was now. I agree that such an impression should be avoided.  —Quondum 17:46, 22 May 2021 (UTC)
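
To illustrate the point made in this thread that entropy is the expected value of the information content and so carries the same unit, here is a minimal sketch. The function names and the example distributions are assumptions of mine for illustration, not anything taken from IEC 80000-13.

```python
import math


def information_content_sh(p):
    """Information content, in shannons, of an event with probability p."""
    return -math.log2(p)


def entropy_sh(probs):
    """Shannon entropy in shannons: the probability-weighted mean of the
    information content of each outcome (zero-probability outcomes add nothing)."""
    return sum(p * information_content_sh(p) for p in probs if p > 0)


fair_coin = entropy_sh([0.5, 0.5])    # each outcome carries 1 Sh, so the mean is 1 Sh
biased_coin = entropy_sh([0.9, 0.1])  # less than 1 Sh: the likely outcome is less informative

print(fair_coin, biased_coin)
```

Because the entropy is just a weighted average of quantities expressed in shannons, the result is in shannons too, whichever standard does or does not say so explicitly.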

Putting words in the source's mouth?
The lead currently says "One shannon is defined by IEC 80000-13 as the information content of a single binary symbol (two possible states, say "0" and "1") when the a priori probability of either state is 1/2." I am a bit dubious that this is how it is really defined in the standard. It would more likely be defined as the negative logarithm of the a-priori probability of a given event occurring, with the base of the logarithm determining the unit. I do not have the reference to hand, though. —Quondum 03:55, 20 November 2022 (UTC)


 * The verbatim definition of shannon (from IEC 80000-13:2008) is "value of the quantity when the argument is equal to 2". It then goes on to state 1 Sh ≈ 0.693 nat ≈ 0.301 Hart. The corresponding definition of hartley is "value of the quantity when the argument is equal to 10". Dondervogel 2 (talk) 09:37, 20 November 2022 (UTC)
 * Yes, I realise that's cryptic, so I dug a little further. This definition of shannon is given in the context of the definition of information content, I(x), which is given in the form of the equation 'I(x) = lb(1/p(x)) Sh = lg(1/p(x)) Hart = ln(1/p(x)) nat, where p(x) is the probability of event x'. This definition of shannon seems circular to me, but at least it's self-consistent. Dondervogel 2 (talk) 10:05, 20 November 2022 (UTC)
 * If "I(x) = lb(1/p(x)) Sh = lg(1/p(x)) Hart = ln(1/p(x)) nat" is treated as a definition of all three units, and "value of the quantity when the argument is equal to 2" as explanation/clarification, there would be nothing circular. Would this be a fair way to describe these quotes in the context of IEC 80000-13:2008?  If so, I can reword it.
 * A problem with the current wording in the article is that the description fits a definition of entropy more closely than of information, which does not match the IEC definition you have just quoted. It also assumes too much context (specifically, a well-understood definition of "information content"), which is something we should not do in a lead.
 * As an aside, as you've quoted it, there is nothing to suggest that information is not an independent metrological dimension, which I like: as with the debate about the dimension of angle, this should not be pre-judged. —Quondum 16:04, 20 November 2022 (UTC)
 * No, that is not my interpretation of the IEC definition. The formula you quote is the definition of information content, not of its unit(s), while the definition of shannon is 'value of the quantity when the argument is equal to 2'.
 * A less cryptic version of the same definition might be something like 'information content associated with an event when the probability of the event occurring is 1/2'.
 * Dondervogel 2 (talk) 20:45, 20 November 2022 (UTC)
 * I've edited the lead to be "more correct", but from what you say here, I probably do not describe what the IEC defines correctly. We should probably tweak that definition more.  —Quondum 21:26, 20 November 2022 (UTC)
 * Tweaked according to your "less cryptic version". See what you think.  —Quondum 21:39, 20 November 2022 (UTC)
 * It looks good to me, but it bothers me that it relies so heavily on one individual's interpretation (mine). On the other hand, how else is one supposed to interpret "the quantity" and "the argument"? I just wish William Taft had had a say in finalising IEC 80000-13. Dondervogel 2 (talk) 21:52, 20 November 2022 (UTC)
 * Don't fret. Strive for Taft's principle in writing and read between the lines on reading.  I emphasize more coherent or useful interpretations.  It is all a calculated gamble in the interest of greater clarity.
 * I see that IEC 80000-13:2008 was reconfirmed. It will be interesting to see whether IEC 80000-15 addresses any of this.  —Quondum 00:55, 21 November 2022 (UTC)

Summary of verbatim definitions
Here's a summary of what we have so far. I will add to it when I find the time. Dondervogel 2 (talk) 21:42, 20 November 2022 (UTC)
 * information content (I(x)): I(x) = lb(1/p(x)) Sh = lg(1/p(x)) Hart = ln(1/p(x)) nat, where p(x) is the probability of event x
 * joint information content (I(x,y)): I(x,y) = lb(1/p(x,y)) Sh = lg(1/p(x,y)) Hart = ln(1/p(x,y)) nat, where p(x,y) is the joint probability of events x and y
 * shannon (Sh): value of the quantity when the argument is equal to 2
 * hartley (Hart): value of the quantity when the argument is equal to 10
 * natural unit of information (nat): value of the quantity when the argument is equal to e
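
A minimal sketch of how the definitions summarised above fit together numerically. It assumes the reading discussed in the previous section, namely that "the quantity" is I(x) and "the argument" is 1/p(x); that reading is an interpretation, not the standard's wording, and the function name is hypothetical.

```python
import math


def information_content(p):
    """Information content of an event with probability p, expressed in each unit:
    lb(1/p) Sh = lg(1/p) Hart = ln(1/p) nat."""
    return {
        "Sh": math.log2(1 / p),
        "Hart": math.log10(1 / p),
        "nat": math.log(1 / p),
    }


# Under the assumed reading, each unit is the information content
# when the argument 1/p equals 2, 10 or e respectively.
one_shannon = information_content(1 / 2)
one_hartley = information_content(1 / 10)
one_nat = information_content(1 / math.e)

print(one_shannon)  # Sh is 1; nat and Hart give the conversion factors
```

With p = 1/2 the "nat" and "Hart" entries come out at roughly 0.693 and 0.301, matching the conversion the standard states for 1 Sh.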