Talk:Posterior probability

Untitled
Can anybody comment on the difference between posterior probability, the likelihood function, and conditional probability? It seems to me that the difference lies in which variable is treated as random and which is treated as fixed. In the book "Applied multivariate statistical analysis" by Richard A. Johnson & Dean W. Wichern, there seems to be no major difference between posterior probability and the likelihood function. Page 639 clearly implies that the observation is fixed while the parameter is random for a posterior probability, and page 178 explicitly states that "the expression considered as a function of mu and sigma for the fixed set of observations is called likelihood". Thus it seems to me that these two say the same thing under different names. The wiki page for the likelihood function, http://en.wikipedia.org/wiki/Likelihood_function, is a bit confusing for those (like me) who are not familiar with the notation; and the wiki page for posterior probability, http://en.wikipedia.org/wiki/Posterior_probability, explicitly says posterior = prior * likelihood function and likelihood function = p(x|theta). I guess the writer actually means that the likelihood function is numerically equal to p(x|theta), rather than being defined as p(x|theta).

I am not an expert in this field, so I dare not make modifications myself. Can anybody who really knows these concepts update the article to address my concerns and state the explicit relationship between these quantities, so that laymen can easily clear up their confusion? Thanks! —Preceding unsigned comment added by True bsmile (talk • contribs) 07:04, 17 June 2010 (UTC)

I hate to complain without trying to fix it, but I'm not in a position to right now. The definition on this page needs help. An expression of proportionality is not a definition: a definition needs to have an equals sign in it. It seems like the full definition could be easily derived from Bayes's rule, but I leave this to the experts. -Nonstandard (talk) 21:52, 5 August 2011 (UTC)
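For reference, a sketch of the definition with an equals sign, as it follows from Bayes' rule. It uses the symbols already in this thread (x for the observation, theta for the parameter) and assumes that the prior $$p(\theta)$$ and the likelihood $$p(x \mid \theta)$$ exist as densities and that the integral in the denominator is finite and non-zero; $$\theta'$$ is only a dummy integration variable introduced here:

$$p(\theta \mid x) = \frac{p(x \mid \theta)\, p(\theta)}{\int p(x \mid \theta')\, p(\theta')\, d\theta'}$$

Read this way, the likelihood is the factor $$p(x \mid \theta)$$ viewed as a function of $$\theta$$ with x held fixed, and it need not integrate to 1 over $$\theta$$. The posterior is that factor multiplied by the prior and divided by a constant that does not depend on $$\theta$$, which is where the proportionality statement comes from.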

About the example
How was the value in the first example - P(A|B) = 1/3 - derived?

I tried to recreate it using Bayes' Theorem: P(A|B) = P(B|A)*P(A) / P(B)

I took:

P(B|A) = (1/2)*(1/2) = 1/4

P(A) = 1/2

P(B) = (1/2)*(1/2) + (1/2)*1

And I got

P(A|B) = (1/4)*(1/2) / (1/4 + 1/2) = (1/8) / (3/4) = 1/6

Did I make a mistake anywhere?

193.40.37.71 (talk) 09:28, 27 February 2009 (UTC) Siim

Yes, you made a mistake. P(B|A) = 1/2. If A happens, you flip a fair coin, so in that case you get B with probability 1/2. —Preceding unsigned comment added by 68.198.48.12 (talk) 15:01, 21 March 2009 (UTC)
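The corrected calculation can be checked with exact fractions. This is only a sketch using the numbers quoted in this thread: P(A) = 1/2, P(B|A) = 1/2, and P(B|not A) = 1, the last value being read off the second term of the P(B) expression above rather than from the article itself. The variable names are made up for illustration.

```python
from fractions import Fraction

# Values taken from this thread; p_b_given_not_a = 1 is an assumption
# inferred from the "(1/2)*1" term in the P(B) expression above.
p_a = Fraction(1, 2)
p_b_given_a = Fraction(1, 2)
p_b_given_not_a = Fraction(1)

# Law of total probability: P(B) = P(B|A) P(A) + P(B|not A) P(not A)
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

# Bayes' theorem: P(A|B) = P(B|A) P(A) / P(B)
p_a_given_b = p_b_given_a * p_a / p_b

print(p_b, p_a_given_b)  # 3/4 1/3
```

With P(B|A) = 1/2 the numerator is 1/4 and the denominator is 3/4, which gives the 1/3 from the example.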

The mistake is in the wording of the example. Your error is in specifying which coin is known to be heads. Let me explain this visually:

1. HH

2. HT

3. TH

4. TT

These are the possible results of a double coin flip, each with an equal probability of 1/4. Knowledge that the second coin flip resulted in heads eliminates the possibility of #2 and #4 occurring. The remaining outcomes, HH and TH, both have an equal probability of occurring, thus the probability of the first coin flip being heads is still 1/2. The prior knowledge that landing two heads in a row is less likely than landing a head and a tail (in either order) made you overlook that the latter probability is in fact an amalgamation of two separate outcomes (HT and TH). A correct wording would be: A friend flips two coins and tells you that one of them is heads. What is the probability that the other is also heads? Answer: none. This is an impossible situation because if you're here reading this, you have no friends. Haha, kidding, but what the hell am I doing here when I have an exam to study for? —Preceding unsigned comment added by 99.244.50.68 (talk) 10:34, 30 April 2009 (UTC)
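The two wordings can be compared by enumerating the four outcomes listed above. This is a minimal sketch; the helper `conditional` is a made-up name, and reading "one of them is heads" as "at least one of the two coins is heads" is an assumption about what the friend's statement means.

```python
from fractions import Fraction
from itertools import product

# The four equally likely outcomes: HH, HT, TH, TT.
outcomes = list(product("HT", repeat=2))

def conditional(event, given):
    """P(event | given) by counting equally likely outcomes."""
    kept = [o for o in outcomes if given(o)]
    return Fraction(sum(1 for o in kept if event(o)), len(kept))

# Wording 1: we know the SECOND coin is heads.
first_heads_given_second_heads = conditional(
    lambda o: o[0] == "H", lambda o: o[1] == "H"
)

# Wording 2 (assumed reading): we only know AT LEAST ONE coin is heads.
both_heads_given_at_least_one = conditional(
    lambda o: o == ("H", "H"), lambda o: "H" in o
)

print(first_heads_given_second_heads)  # 1/2
print(both_heads_given_at_least_one)   # 1/3
```

Under the first wording only HH and TH remain, so the answer is 1/2. Under the second, HH, HT and TH remain, and only one of the three is HH, so the answer is 1/3.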

Posterior probability example
The example should probably include the term "posterior probability" in it somewhere, so that the completely uninitiated (the audience of this page) don't have to make tenuous inferences about how the example applies. 24.218.111.172 (talk) 13:59, 18 May 2013 (UTC)

Mathematical expert required
This page needs the input of an expert. It states that the posterior probability distribution can be calculated with Bayes' theorem. From the book "Bayesian Nonparametrics" by Ghosal it can be understood that it is possible to have a well-defined posterior (e.g. for the Dirichlet process) even when Bayes' theorem does not apply. The reason is that the theorem implies that the posterior is absolutely continuous with respect to the prior distribution. In the words of Ghosal: "The rough argument is that when the prior excludes a region, the posterior, obtained by multiplying the prior with the likelihood and normalizing, ought to exclude that region." Very informally, you can't create non-zero probabilities out of zero prior probabilities by multiplication.

An example without infinite dimensional parameters can be found in Theory of Statistics by Schervish. In example 1.36 Bayes' theorem does not apply. Quoted here verbatim from the book.

"Example 1.36. As an example in which Bayes' theorem does not apply, consider the case in which the conditional distribution of $$X$$ given $$\Theta = \theta$$ is discrete with $$P_\theta(\{\theta-1\})=P_\theta(\{\theta+1\})=1/2$$. Suppose that $$\Theta$$ has a density $$f_\Theta$$ with respect to Lebesgue measure. The $$P_\theta$$ distributions are not all absolutely continuous with respect to a single $$\sigma$$-finite measure. It is still possible to verify that the posterior distribution of $$\Theta$$ given $$X = x$$ is the discrete distribution with


 * $$Pr(\Theta=x-1|X=x) = \frac{f_\Theta(x-1)}{f_\Theta(x-1) + f_\Theta(x+1)}$$,

and $$Pr(\Theta=x+1|X=x) = 1 - Pr(\Theta=x-1|X=x)$$. Note that the posterior is not absolutely continuous with respect to the prior."

Thus, (1) the claim that Bayes' theorem can always be applied is incorrect, and (2) it is possible to have a well-defined posterior distribution (see the example) even if Bayes' theorem does not apply. Anne van Rossum (talk) 12:21, 27 September 2020 (UTC)
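To make the quoted two-point posterior concrete, here is a small numerical sketch. The formula is the one quoted from Example 1.36; everything else is an assumption for illustration only: the function names are made up, and the prior density $$f_\Theta$$ is taken to be the standard normal density, which the book does not specify.

```python
import math

def standard_normal_pdf(t):
    # Assumed choice of prior density f_Theta, for illustration only.
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def posterior_two_point(x, prior_density=standard_normal_pdf):
    """Posterior of Theta given X = x in the quoted example.

    Returns a dict mapping each of the two support points, x - 1 and
    x + 1, to its posterior probability.
    """
    lower = prior_density(x - 1)
    upper = prior_density(x + 1)
    p_lower = lower / (lower + upper)
    return {x - 1: p_lower, x + 1: 1.0 - p_lower}

# With a symmetric prior and x = 0, both points are equally likely.
print(posterior_two_point(0.0))
# With x = 1, the point theta = 0 has the higher prior density, so it
# receives most of the posterior mass.
print(posterior_two_point(1.0))
```

The posterior puts all of its mass on two points, each of which has prior probability zero under a prior with a Lebesgue density, which illustrates the remark in the quote that the posterior is not absolutely continuous with respect to the prior.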