Talk:Minimum message length

This is complete nonsense -- blanking. User:194.51.2.34


 * Care to correct the two references, then, since you claim to understand it better? One is in philosophy of science.  The other is on a talk page. User:142.177.94.99

I was removing only your usage, which is idiosyncratic -- the material is well-covered at Occam's razor and the AIT article. 194.51.2.34 13:30, 22 Aug 2003 (UTC)

Should it be redirected to one of them, then? -- Oliver P. 13:35, 22 Aug 2003 (UTC)


 * Yes. This sat obviously for a long time without being filled, so I just backfilled it when I noticed.  I have no opinion about the material here, and maybe oversimplified it since I didn't care.  By all means redirect it.  Just don't leave links sitting open or blank pages.  User:142.177.94.99

Well, actually, leaving links sitting open is what the Wikipedia is all about! It's a wiki, you see. It's how we attract new content on new subjects. :) Now, where should it be redirected to...? -- Oliver P. 14:09, 22 Aug 2003 (UTC)


 * To Occam's Razor, if it is indeed a restatement of that principle. The issue raised with MML is also an issue with Occam's razor - all else is never equal, and one can't ignore the foundations of measurement which gave rise to the data, bits, signal. User:142.177.94.99


 * I'm very late to that discussion. But a key point about MML is that it can compare when things are not equal. --crt 11:34, 28 Sep 2004 (UTC)

I notice the link to the Needham & Dowe article links to Ockham's Razor, but does not link to the wiki page Occam's Razor. Perhaps it should do that instead, with a possible extra link at OR? And is that article the best in-print counterargument to anti-parsimony claims? --crt 11:34, 28 Sep 2004 (UTC)

Posterior?
"Therefore the posterior is always a probability, not a probability density."

What is a posterior? In this case, I know it isn't the common sense of a buttock. --Damian Yerrick 03:10, 8 Dec 2004 (UTC)
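The "posterior" is the posterior probability P(H | D) from Bayes' theorem: what you believe about a hypothesis after seeing the data. A minimal sketch of the article sentence's point (my own illustration, with made-up numbers, not from the article):

```python
# Bayes' theorem over a discrete hypothesis space: the posterior is a
# genuine probability. Hypotheses and data here are invented for illustration.
prior = {"fair": 0.5, "biased": 0.5}                          # P(H)
# Likelihood of observing 6 heads, 2 tails (binomial coefficient cancels):
lik = {"fair": 0.5**6 * 0.5**2, "biased": 0.8**6 * 0.2**2}    # P(D | H)
evidence = sum(prior[h] * lik[h] for h in prior)              # P(D)
posterior = {h: prior[h] * lik[h] / evidence for h in prior}  # P(H | D)
assert abs(sum(posterior.values()) - 1.0) < 1e-12  # probabilities sum to 1

# For a continuous parameter theta, MML states theta only to some finite
# precision delta, so the posterior *density* f(theta | D) is replaced by
# an ordinary probability of roughly f(theta | D) * delta. That is why the
# article says the posterior is always a probability, never a density.
```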

Merge MDL and MML?
I was under the impression that MDL and MML are significantly different (at least to people within the field), with the same order of significance as, say, the difference between best-first search and A*. I shall ask people who know these things to comment. njh 06:36, 22 December 2005 (UTC)

C.S. Wallace describes the difference as follows (Statistical and Inductive Inference by Minimum Message Length, p 408): "MDL differs from MML in two important respects. First, Rissanen wishes to avoid use of any "prior information". His view is very much non-Bayesian. Second, given data x, MDL aims to infer the model class Theta_k [i.e. model complexity, number of model parameters] which best explains the data, but does not aim to infer the parameter vector theta_k [i.e. actual values of parameters] except (in some versions) as a step involved in inferring the class." --pfh, 26 Dec 2005


 * Yes, Rissanen was very blunt on separating MDL from Bayesianism (Personal communication, 1991 :-) I'll see if I can find my hardcopies of his papers. User:Ejrrjs says What? 21:43, 4 January 2006 (UTC)
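Wallace's distinction can be made concrete with a toy two-part message (my own construction, not from the book): MML scores a fully specified model - the model class *and* its parameter stated to finite precision - by total message length, whereas Rissanen-style MDL targets only the class.

```python
import math

def two_part_length(data, theta, param_bits, model_bits):
    """Toy two-part message length in bits: name the model class, state the
    (quantised) parameter, then encode the binary data under that parameter."""
    h, n = sum(data), len(data)
    data_bits = -(h * math.log2(theta) + (n - h) * math.log2(1 - theta))
    return model_bits + param_bits + data_bits

# 18 heads in 20 flips. The fair-coin class has no free parameter; the
# biased-coin class pays to state theta (here, crudely, 4 bits of precision).
data = [1] * 18 + [0] * 2
fair = two_part_length(data, 0.5, param_bits=0, model_bits=1)    # 21.0 bits
biased = two_part_length(data, 0.9, param_bits=4, model_bits=1)  # ~14.4 bits
# The biased model wins despite paying for its parameter: the single
# comparison yields both the class and the parameter value, which is the
# MML side of Wallace's distinction.
```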

That's my understanding too: "MDL and MML are significantly different". Non-Bayesian v. Bayesian, infers the model class v. the full model, does not infer parameter values v. does. --LA, 12 January 2006.

External links modified (February 2018)
Hello fellow Wikipedians,

I have just modified one external link on Minimum message length. Please take a moment to review my edit. If you have any questions, or need the bot to ignore the links, or the page altogether, please visit this simple FAQ for additional information. I made the following changes:
 * Added archive https://web.archive.org/web/20081216122608/http://bjps.oxfordjournals.org/cgi/content/abstract/axm033v1 to http://bjps.oxfordjournals.org/cgi/content/abstract/axm033v1

When you have finished reviewing my changes, you may follow the instructions on the template below to fix any issues with the URLs.

Cheers.— InternetArchiveBot  (Report bug) 02:24, 1 February 2018 (UTC)

On definitions
I do not agree with the statement:

Shannon's A Mathematical Theory of Communication (1948) states that in an optimal code, the message length (in binary) of an event $$E$$, $$\operatorname{length}(E)$$, where $$E$$ has probability $$P(E)$$, is given by $$\operatorname{length}(E) = -\log_2(P(E))$$.

When a code is designed with $$\operatorname{length}(E) = \lceil -\log_2(P(E)) \rceil$$ (codeword lengths must be integers), the result is the "Shannon code", which is generally not optimal; Huffman coding gives the optimal prefix code. Also, I did not find this statement in Shannon's paper. Corrado Mencar (talk) 10:29, 13 December 2023 (UTC)
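The gap between the two constructions is easy to check numerically. Below is a small sketch (my own, not from Shannon's paper or the article) comparing Shannon code lengths ⌈−log₂ p⌉ with optimal Huffman lengths on an arbitrary three-symbol distribution:

```python
import math
import heapq

def shannon_lengths(probs):
    """Codeword lengths from the Shannon construction: ceil(-log2 p)."""
    return [math.ceil(-math.log2(p)) for p in probs]

def huffman_lengths(probs):
    """Optimal prefix-code lengths, via the usual Huffman merging."""
    heap = [(p, i, [i]) for i, p in enumerate(probs)]  # (prob, tiebreak, symbols)
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    tiebreak = len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)  # merge the two least-probable
        p2, _, s2 = heapq.heappop(heap)  # subtrees...
        for i in s1 + s2:
            lengths[i] += 1              # ...pushing their symbols one bit deeper
        heapq.heappush(heap, (p1 + p2, tiebreak, s1 + s2))
        tiebreak += 1
    return lengths

probs = [0.49, 0.49, 0.02]  # arbitrary illustrative distribution
shannon = sum(p * l for p, l in zip(probs, shannon_lengths(probs)))  # 2.08 bits
huffman = sum(p * l for p, l in zip(probs, huffman_lengths(probs)))  # 1.51 bits
```

Here the entropy is about 1.12 bits per symbol; Huffman gets within half a bit of it, while the Shannon code's rounding-up costs noticeably more when probabilities sit just below a power of 1/2.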