Talk:Markov chain Monte Carlo

Too complicated
In the introduction, the second paragraph says: "A good chain will have rapid mixing—the stationary distribution is reached quickly starting from an arbitrary position—described further under Markov chain mixing time." This seems wrong, since rapid mixing is the ability of the algorithm to go through the whole support of the stationary distribution in a reasonable time, and it has nothing to do with rapid convergence towards the stationary distribution. —Preceding unsigned comment added by 93.148.151.6 (talk) 13:11, 12 July 2010 (UTC)

The first paragraph needs to be dumbed down for the naive reader. What is this exactly? :) Kingturtle 07:27 30 May 2003 (UTC)


 * I'll work on it. Note that I didn't actually write it -- it was moved from Monte Carlo method. -- Tim Starling 07:36 30 May 2003 (UTC)


 * This still needs to be simplified so that someone unversed in statistics, or whatever this is, can understand it. Patadragon 20:47, 27 November 2006 (UTC)

It is still incredibly complicated. Does no one have any idea what this article is actually about? It is seven years since the first user pointed it out, for Pete's sake! 62.198.103.144 (talk) 03:59, 1 July 2010 (UTC)


 * Some things just are complicated. End of Story. -- Derek Ross | Talk 21:40, 4 April 2011 (UTC)

I agree - this introduction is gratuitously complicated. It takes a real expert and a bit of work to make it clear and simple. — Preceding unsigned comment added by 98.234.101.123 (talk) 18:05, 3 January 2013 (UTC)


 * Not end of story, sorry. For starters, try a History or Motivation section. An Examples section, maybe. If it was so incredibly complicated, why can so many people use it? ᛭ LokiClock (talk) 08:06, 3 June 2011 (UTC)

"Rejection sampling" is not a random walk algorithm
The list of random walk algorithms at the bottom of the page lists "rejection sampling" as the first entry. Is this really correct? As far as I know "rejection sampling" is just a technique to sample from a given distribution and is typically not connected to random walk methods. -- Jochen

I think what is described here as Rejection sampling is in fact correctly described as Importance sampling, and it is a key component of MCMC. -- Blaise F Egan

But does this make it a "random walk algorithm"? Where is the random walk? OK, I see that it was modified to be a little closer to random walks. Then the most suspicious one would be "slice sampling"? Is this somehow a "random walk algorithm", too? --Jochen 16:51, 27 Feb 2005 (UTC)

Slice sampling is a Markov chain Monte Carlo method: it defines a Markov chain which leaves a desired distribution invariant; successive steps are correlated but asymptotically come from the correct distribution. Rejection sampling draws independent, exact samples. There is no dependence on the previous state, so it isn't really a "walk". Adaptive rejection sampling (ARS) has this property also, although it may be used within an MCMC method. For example, ARS is an important part of the BUGS Gibbs sampling package, but this would be better put on the Gibbs sampling page. 128.40.213.241 13:43, 23 August 2005 (UTC)
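To make the distinction above concrete, here is a minimal Python sketch. It is only an illustration: the standard normal target, the uniform proposal on [-5, 5] (so the rejection sampler really targets a normal truncated to that interval), the step size, and all function names are assumptions chosen for the example, not anything taken from the article. Rejection sampling draws each point without looking at the previous one, while a random-walk Metropolis chain proposes each point from the current state, so consecutive values are correlated.

```python
import math
import random


def target_density(x):
    """Unnormalised standard normal density (illustrative target, assumed)."""
    return math.exp(-0.5 * x * x)


def rejection_sample(rng, n, half_width=5.0):
    """Draw n independent samples with a uniform proposal on [-half_width, half_width].

    The envelope height is 1.0 because target_density(x) <= 1 for every x.
    """
    samples = []
    while len(samples) < n:
        x = rng.uniform(-half_width, half_width)  # does not depend on earlier draws
        if rng.random() < target_density(x):
            samples.append(x)
    return samples


def metropolis_sample(rng, n, step=0.5, start=0.0):
    """Random-walk Metropolis chain: each proposal is centred on the current state."""
    x = start
    chain = []
    for _ in range(n):
        proposal = x + rng.uniform(-step, step)  # depends on the previous state
        if rng.random() < target_density(proposal) / target_density(x):
            x = proposal
        chain.append(x)  # a rejected proposal repeats the current state
    return chain


def lag1_autocorrelation(xs):
    """Sample correlation between consecutive elements of xs."""
    mean = sum(xs) / len(xs)
    num = sum((a - mean) * (b - mean) for a, b in zip(xs, xs[1:]))
    den = sum((a - mean) ** 2 for a in xs)
    return num / den


rng = random.Random(0)
print("rejection sampling:", lag1_autocorrelation(rejection_sample(rng, 5000)))
print("Metropolis chain:  ", lag1_autocorrelation(metropolis_sample(rng, 5000)))
```

With these settings the lag-1 autocorrelation should be close to zero for the rejection samples and clearly positive for the Metropolis chain, which is the sense in which only the latter is a "walk".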

Rejection sampling, Importance sampling, and MCMC algorithms are all different things. They are all Monte Carlo sampling techniques, nevertheless. See:

C. Andrieu, N. De Freitas, A. Doucet, M. I. Jordan. "An Introduction to MCMC for Machine Learning". Machine Learning, 50, 5-43, 2003. --129.82.47.113 20:33, 3 November 2005 (UTC)

Hybrid Monte Carlo (Would be better called `Hamiltonian Monte Carlo')
"Would be better called `Hamiltonian Monte Carlo'"? How is that encyclopedic? —Preceding unsigned comment added by 90.193.94.242 (talk) 01:38, 13 December 2007 (UTC)

Introduction lost in too many links
It remains a bit unclear to a non-expert on Bayesian statistics like myself what MCMC algorithms are used for in numerical practice. I.e., on what kind of problems (examples!) are they used, and what exactly is the benefit (getting a probability distribution, optimizing parameters, etc.)? The heavy use of expert terms like equilibrium distribution right in the introduction may provide a large amount of information, but makes it very hard, if not impossible, for a non-expert to get the point. The need to click through several linked articles kills any useful introduction, since one first has to read and understand all these other articles, keep them in mind simultaneously, and finally combine all of that with the text in this article. Except for supermind candidates, I wouldn't expect many people, even those with scientific expertise, to manage this, unless they already know what a Markov chain/MCMC algorithm is meant to be... --SiriusB (talk) 13:44, 13 May 2009 (UTC)

How to use MCMC for multidimensional integration?
The paragraph starting with "The most common application of these algorithms is numerically calculating multi-dimensional integrals." is not clear. The use of MCMC for integration is not trivial. Could somebody provide examples and references here for this? —Preceding unsigned comment added by 130.126.118.14 (talk) 03:53, 12 October 2010 (UTC)
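One way to read that sentence, sketched in Python under stated assumptions: if the quantity wanted is a ratio of integrals, (integral of f(x) p(x) dx) / (integral of p(x) dx), where p is known only up to a constant, a chain whose equilibrium distribution is proportional to p lets you approximate it by the average of f over the chain's states. The 2-D standard normal target, the choice f(x, y) = x^2 + y^2, the step size, the burn-in length, and the function names below are all illustrative assumptions, not taken from the article or a reference.

```python
import math
import random


def log_target(x, y):
    """Log of an unnormalised 2-D standard normal density (illustrative target, assumed)."""
    return -0.5 * (x * x + y * y)


def metropolis_expectation(f, n_steps, rng, step=1.5, burn_in=1000):
    """Estimate the expectation of f under the target by averaging f along a Metropolis chain."""
    x, y = 0.0, 0.0
    total = 0.0
    kept = 0
    for i in range(n_steps):
        # Symmetric random-walk proposal around the current state.
        nx = x + rng.uniform(-step, step)
        ny = y + rng.uniform(-step, step)
        diff = log_target(nx, ny) - log_target(x, y)
        # Accept with probability min(1, p(new) / p(old)); the normalising constant cancels.
        if rng.random() < math.exp(min(0.0, diff)):
            x, y = nx, ny
        # Discard early states, which still depend on the starting point.
        if i >= burn_in:
            total += f(x, y)
            kept += 1
    return total / kept


estimate = metropolis_expectation(lambda x, y: x * x + y * y, 60000, random.Random(1))
print(estimate)
```

For this particular target the exact answer is 2, so the printed estimate should land near that value. Note that the normalising integral of p is never computed, which is the practical appeal in higher dimensions.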

Desired distribution
"The state of the chain after a large number of steps is then used as a sample of the desired distribution." Is the value matched to a point in the distribution with that same value? What does matching the desired distribution do? ᛭ LokiClock (talk) 06:00, 22 May 2011 (UTC)


 * What I'm saying is, if you're creating the distribution out of these samples, then the distribution depends on the state of the chain, not your desires. That is, unless the actual result is the mapping from the states of the chain to a distribution. ᛭ LokiClock (talk) 03:55, 6 November 2011 (UTC)

"A Markov chain is constructed in such a way as to have the integrand as its equilibrium distribution."
The Markov chain, so constructed, is at the heart of the sampling process. Merely stating the goal of the construction is relatively uninformative. At least link to an article on how Markov chains are constructed so as to meet the goal. Jim Bowery (talk) 17:36, 22 December 2017 (UTC)

"Typically, Markov chain Monte Carlo sampling can only approximate the target distribution, as ..."
Folks, I gotta comment on this. I'm sincerely glad that every word of the current revision of the accompanying article has been contributed, even though IMO it cannot be turned into a decent article here: it's in the public domain now (as are all the previous revisions), and it can be improved in accuracy and clarity (via wikis or not, and whether any improved versions exist in the public domain or not). I assume it's not accurate as it stands, but it's probably useful as a starting point for those qualified to improve it and interested in doing so.
 * "Typically, Markov chain Monte Carlo sampling can only approximate the target distribution, as there is always some residual effect of the starting position. More sophisticated Markov chain Monte Carlo-based algorithms such as coupling from the past can produce exact samples, at the cost of additional computation and an unbounded (though finite in expectation) running time". Jerzy•t 00:58, 23 October 2018 (UTC)

Random walk Monte Carlo
Is it possible to know where the expression "random walk Monte Carlo", used to describe the Metropolis-Hastings algorithm, comes from? I'm asking because there is no clear, acknowledged definition of what a random walk is (unlike Markov chains, for which a simple definition is provided), and describing the Metropolis algorithm as "random walk Monte Carlo" can be a little confusing: one generally thinks of random walks as being martingales, while the process created by a Metropolis algorithm is not. If we go further and consider the process of a Metropolis-Hastings algorithm (which generalizes the Metropolis one), I find it quite impossible to think of it as a random walk at all. I'm planning to rewrite the whole part of the article that describes those algorithms, deleting the references to this so-called "random walk Monte Carlo". Ppong.it (talk) 13:50, 26 December 2018 (UTC)

Add a section on the parallel stretch move algorithm
As I understand it, the stretch move algorithm by Christen (2007) and Goodman & Weare (2010), and the parallel stretch move algorithm implemented in emcee, provide a variety of practical efficiency and convenience benefits for affine-invariant ensemble sampling, and would be worth covering in the article. They are discussed in https://iopscience.iop.org/article/10.1086/670067 among other places. ★NealMcB★ (talk) 21:07, 5 April 2022 (UTC)