Talk:Maximum a posteriori estimation

MAP is equivalent to MDL
The maximum a posteriori estimation method is equivalent to the minimum description length principle; perhaps the article could mention that. 134.184.26.154 (talk) 10:32, 10 October 2017 (UTC)

a priori
The page says: "The method of maximum a priori estimation then estimates θ as the mode of the posterior..."

Should it read, "The method of maximum a POSTERIORI estimation then estimates θ as the mode of the posterior..."?


 * Absolutely, that was a lapse brought about by the fact that maximum a posteriori incorporates a prior. Thanks for pointing that out. By the way: when you feel that an article needs a change, it is best to go ahead and make it yourself. Wikipedia is a wiki, so anyone is able to edit an article. You do not even need to log in, although there are some reasons why you might like to.


 * Wikipedia convention is to be bold. You do not need to be afraid of making mistakes.  If you are not sure how editing works, have a look at How to edit a page, or try out the Sandbox to test your editing skills.  New contributors are always welcome. Cheers, --MarkSweep 02:03, 21 July 2005 (UTC)

A number of external sources use the term "maximum a priori". Is this a synonym, a mistake, or a different concept? Cesiumfrog (talk) 23:12, 18 August 2016 (UTC)

ML vs conditional prob
ML(A | B) = conditional Prob ( B | A)

The math works out a little later, when Prob(B | A) is replaced by its Bayesian equivalent. - NAC (I forgot to sign in)
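
To spell out that step, here is a short sketch. It assumes ML(A | B) denotes the likelihood of A given the observation B and that P(A) is the prior on A; the symbol $$\mathcal{L}$$ is an assumed notation, not taken from the article:

 * $$\mathcal{L}(A \mid B) = P(B \mid A), \qquad P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)} \;\Rightarrow\; \arg\max_A P(A \mid B) = \arg\max_A \mathcal{L}(A \mid B)\, P(A).$$

The last step holds because P(B) does not depend on A, so dropping it does not move the maximum.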

Example
Shouldn't it read:


 * $$\pi(\mu) L(\mu) = \frac{1}{\sqrt{2 \pi} \sigma_m} \exp\left(-\frac{1}{2} \left(\frac{\mu}{\sigma_m}\right)^2\right) \prod_{j=1}^n \frac{1}{\sqrt{2 \pi} \sigma_v} \exp\left(-\frac{1}{2} \left(\frac{x_j - \mu}{\sigma_v}\right)^2\right),$$

Mantepse (talk) 10:07, 7 January 2008 (UTC)
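
A quick numerical check of the expression above may help, offered as a minimal sketch rather than a definitive implementation. It assumes the zero-mean Gaussian prior exactly as written in that formula; the function names, the grid bounds and the sample values are hypothetical and chosen only for illustration. Setting the derivative of the log of the product to zero gives the closed form used in `map_closed_form`, and the grid search maximizes the same log product directly.

```python
import math


def log_prior_times_likelihood(mu, xs, sigma_m, sigma_v):
    """Log of pi(mu) * L(mu) for a zero-mean Gaussian prior (assumption)."""
    log_prior = (-0.5 * (mu / sigma_m) ** 2
                 - math.log(math.sqrt(2 * math.pi) * sigma_m))
    log_lik = sum(
        -0.5 * ((x - mu) / sigma_v) ** 2
        - math.log(math.sqrt(2 * math.pi) * sigma_v)
        for x in xs
    )
    return log_prior + log_lik


def map_closed_form(xs, sigma_m, sigma_v):
    """Maximizer obtained by setting the derivative of the log product to zero."""
    n = len(xs)
    return sigma_m ** 2 * sum(xs) / (sigma_v ** 2 + n * sigma_m ** 2)


def map_grid_search(xs, sigma_m, sigma_v, lo=-10.0, hi=10.0, steps=20001):
    """Brute-force maximizer over an evenly spaced grid (hypothetical bounds)."""
    best_mu, best_val = lo, -math.inf
    for i in range(steps):
        mu = lo + (hi - lo) * i / (steps - 1)
        val = log_prior_times_likelihood(mu, xs, sigma_m, sigma_v)
        if val > best_val:
            best_mu, best_val = mu, val
    return best_mu


if __name__ == "__main__":
    xs = [1.0, 2.0, 3.0]  # made-up observations
    print(map_closed_form(xs, sigma_m=1.0, sigma_v=1.0))  # 1.5
    print(map_grid_search(xs, sigma_m=1.0, sigma_v=1.0))
```

With these made-up values the sample mean is 2.0, while the MAP estimate is pulled toward the prior mean of 0, which is the effect the prior factor in the product is supposed to have.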

Difference between Bayesian methods and MAP estimate
I don't understand this, and if I do understand it, it's a bad example:


 * As an example of the difference between Bayesian methods and using an MAP estimate, consider the case where we need to classify inputs $$x$$ as either positive or negative (for example, loans as risky or safe). Suppose there are just three possible hypotheses about the correct method of classification $$h_1$$, $$h_2$$ and $$h_3$$ with posteriors 0.4, 0.3 and 0.3 respectively. Suppose given a new instance, $$x$$, $$h_1$$ classifies it as positive, whereas the other two classify it as negative. Using the MAP estimate for the correct classifier $$h_1$$, we classify $$x$$ as positive, whereas the Bayesian would average over all hypotheses and classify $$x$$ as negative.

What do those posteriors mean? If it's no coincidence that they add up to 1, are they the probabilities of the respective methods being "the correct method"? In that case, why are they called "posteriors"? And also, how could one know these probabilities? In this case I would look for a more practically likely example.

If it's a coincidence that their sum is 1, are they the probabilities of the prediction of the respective method being correct? This data is very likely available, but why are they smaller than 0.5? Is this data enough for a Bayesian analysis? This is an interesting problem, but I don't see a connection with MAP estimation.

Marcosaedro (talk) 21:39, 13 February 2011 (UTC)
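
For what it's worth, the arithmetic in the quoted example can be sketched as follows. This assumes the first reading above, namely that 0.4, 0.3 and 0.3 are the probabilities of each hypothesis being the correct classifier given the data seen so far (which is why they sum to 1). The function names and the 0.5 decision threshold are hypothetical, chosen only for illustration.

```python
def map_prediction(posteriors, predictions):
    """Pick the single most probable hypothesis and use only its prediction."""
    best_hypothesis = max(posteriors, key=posteriors.get)
    return predictions[best_hypothesis]


def averaged_prediction(posteriors, predictions, threshold=0.5):
    """Weight every hypothesis's vote by its posterior probability."""
    p_positive = sum(
        p for h, p in posteriors.items() if predictions[h] == "positive"
    )
    return "positive" if p_positive > threshold else "negative"


# Numbers taken from the quoted example.
posteriors = {"h1": 0.4, "h2": 0.3, "h3": 0.3}
predictions = {"h1": "positive", "h2": "negative", "h3": "negative"}

print(map_prediction(posteriors, predictions))       # positive
print(averaged_prediction(posteriors, predictions))  # negative
```

The MAP route keeps only h1 and so answers "positive", while averaging gives the positive class a total weight of 0.4 against 0.6 and so answers "negative". How such posteriors would be obtained in practice is a separate question that this sketch does not address.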