Talk:Prior probability

Jaynes statement
The statement: "For example, Edwin T. Jaynes has published an argument [a reference here would be useful] based on Lie groups that if one is so uncertain about the value of the aforementioned proportion p that one knows only that at least one voter will vote for Kerry and at least one will not, then the conditional probability distribution of p given one's state of ignorance is the uniform distribution on the interval [0, 1]." seems highly improbable, unless Jaynes posthumously thought that the electorate of the United States was infinite. --Henrygb 21:51, 19 Feb 2005 (UTC)

I think I know what's being referred to here. Jaynes wrote a paper, "Prior Probabilities," [IEEE Transactions on Systems Science and Cybernetics, SSC-4, Sept. 1968, 227-241], which I have reprinted in E. T. Jaynes: Papers on Probability, Statistics and Statistical Physics, Dordrecht, Holland: Reidel Publishing Company (1983), pp. 116-130. On p. 128 of my copy (corresponding to p. 239 of the IEEE paper, I presume) Jaynes, after deriving from a group-theoretic argument the prior $$\theta^{-1}(1-\theta)^{-1}$$, remarks: "The prior (60) thus accounts for the kind of inductive inferences noted in the case of the chemical, which we all make intuitively. However, once we have seen at least one success and one failure, then we know that the experiment is a true binary one, in the sense of physical possibility, and from that point on all posterior distributions (69) remain normalized, permitting definite inferences about $$\theta$$."

The reference to "the chemical" in this excerpt refers to Jaynes' example on the previous page, where he discusses a chemical dissolving or not dissolving in water, with the inference that it will do so reliably if it does so once; only when both cases are observed does one strongly think that the parameter might be on (0,1).

I infer from the passage that Jaynes would say that if we have one success and one failure, then for all other observations (excluding these two), the prior would be flat (after applying Bayes' theorem to these two observations using the prior displayed above).

Parenthetically, the Jeffreys prior, which many feel to be the right one in this case, is $$\theta^{-1/2}(1-\theta)^{-1/2}$$. --Billjefferys 18:50, 4 Apr 2005 (UTC)
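
The inference above can be checked with a short calculation. This is only a sketch: the symbols $$s$$ and $$f$$ (numbers of observed successes and failures) are notation introduced here, not taken from Jaynes' paper. Multiplying the prior $$\theta^{-1}(1-\theta)^{-1}$$ by the binomial likelihood gives

$$p(\theta \mid s, f) \;\propto\; \theta^{-1}(1-\theta)^{-1}\cdot\theta^{s}(1-\theta)^{f} \;=\; \theta^{s-1}(1-\theta)^{f-1}$$

which is the kernel of a Beta($$s$$, $$f$$) distribution and is normalizable exactly when $$s \ge 1$$ and $$f \ge 1$$. For $$s = f = 1$$ it reduces to

$$p(\theta \mid s=1, f=1) \;\propto\; \theta^{0}(1-\theta)^{0} \;=\; 1, \qquad 0 < \theta < 1,$$

that is, the flat distribution, which then acts as the prior for all further observations.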

I have expanded the section on uninformative priors and the references.--Bill Jefferys 18:53, 26 Apr 2005 (UTC)

I have replaced the example about voting with the actual example that Jaynes gave. The two are genuinely quite different: in Jaynes' example the variable over which the prior is defined is in itself an epistemic probability, whereas in the voting example it is a frequency of actual events. Jaynes was pretty hot on the distinction, even going as far as to call confusion between the two a logical fallacy. It must have seemed to that section's original author that one could freely translate between the two, but Jaynes gives plenty of good arguments as to why you can't, and in this case confusing the two made a nonsense of his argument. There are also some unattributed arguments in that paragraph, regarding criticism of Jaynes' prior, which I have flagged. I would be genuinely interested if anyone can track them down.

In the voting case the obvious appropriate prior is not the Jaynes/Haldane prior, nor the Jeffreys prior nor even the uniform prior, but simply to assume that each voter has a 50% probability of voting for each candidate, independently of other voters. This results in a prior proportional to $$p^N(1-p)^N$$, where $$N$$ is the population of the United States. This prior is heavily centred around $$p=0.5$$.

Nathaniel Virgo (talk) 10:24, 29 June 2010 (UTC)

Exponents reversed?
In the article, the Jeffrey's prior for a binomial proportion is given as $$p^{1/2}(1-p)^{1/2}$$. However, a number of other sources on the internet give a Beta(0.5,0.5) distribution as the prior. But this corresponds to $$p^{-1/2}(1-p)^{-1/2}$$. Similarly, my reading leads me to believe that the Jaynes' prior would be a Beta(2,2) distribution, corresponding to $$p^{1}(1-p)^{1}$$, rather than the negative exponent.

Is it standard to give a prior in inverted form without the constants, or is there some convention I am unaware of? If so, perhaps it would be good to include it in the page. As a novice, I am also puzzled by the introduction, as a prior, of for example $$p^{-1}(1-p)^{-1}$$, for which the integral over (0,1) doesn't even exist.--TheKro 12:50, 13 October 2006 (UTC)


 * You're right, the exponents should be $$-1/2$$ in the case of the Jeffreys prior (note, no apostrophe, it's not Jeffrey's). The Jaynes prior is correct; it is an improper prior.


 * I have corrected the article. Bill Jefferys 14:41, 13 October 2006 (UTC)


 * $$p^{-1}(1-p)^{-1}$$ was not original to Jaynes, but that was the form he used. In general the normalizing constant is not vital, as explained in the first part of the improper priors section, since it is implicit for proper priors and meaningless for improper ones.--Henrygb 09:13, 14 October 2006 (UTC)
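
For reference, the correspondence between Beta parameters and exponents discussed in this thread can be written out as follows. This is a summary sketch using the standard Beta density; the symbols $$\alpha$$, $$\beta$$ and $$B(\alpha,\beta)$$ are notation introduced here.

$$\mathrm{Beta}(\alpha,\beta):\quad f(p) \;=\; \frac{p^{\alpha-1}(1-p)^{\beta-1}}{B(\alpha,\beta)}, \qquad 0 < p < 1$$

$$\begin{aligned}
\alpha=\beta=\tfrac12 &: \quad f(p) \propto p^{-1/2}(1-p)^{-1/2} && \text{(Jeffreys prior, } B(\tfrac12,\tfrac12)=\pi\text{)}\\
\alpha=\beta=1 &: \quad f(p) \propto 1 && \text{(uniform prior)}\\
\alpha=\beta=2 &: \quad f(p) \propto p\,(1-p) && \text{(a proper Beta(2,2) density)}\\
\alpha=\beta\to 0 &: \quad f(p) \propto p^{-1}(1-p)^{-1} && \text{(improper: } B(\alpha,\beta)\to\infty\text{)}
\end{aligned}$$

The exponent is always the parameter minus one, which is why Beta(0.5, 0.5) corresponds to exponents of $$-1/2$$, and why $$p^{-1}(1-p)^{-1}$$ is the limiting case with both parameters going to zero rather than any proper Beta distribution.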

a priori
Hello all, I found this article after spending quite some time working on the article a priori (statistics). I'm thinking that article should probably be integrated into this one, what do people think? The general article on a priori and a priori (math modeling) are both in need of work, and I thought I would engage some editors from this article into the work. Really, the math modelling article should be integrated as well. jugander (t) 22:02, 14 October 2006 (UTC)


 * Yes, I agree; it would give a more complete view of the subject. Alfaisanomega (talk) 10:28, 7 December 2010 (UTC)

Haldane prior
"The Haldane prior has been criticized on the grounds that it yields an improper posterior distribution that puts 100% of the probability content at either p = 0 or at p = 1 if a finite sample of voters all favor the same candidate, even though mathematically the posterior probability is simply not defined and thus we cannot even speak of a probability content."

Shouldn't it be "puts 100% of the probability content NEAR either p = 0 or p = 1" because you get a continuous distribution and {0,1} has measure zero? —Preceding unsigned comment added by JumpDiscont (talk • contribs) 17:17, 12 October 2009 (UTC)


 * Such a posterior puts infinite measure on any set of the form (0, ε) no matter how small ε is, and finite measure on any set bounded away from 0. Neither "near" nor "at" really captures this. Michael Hardy (talk) 18:29, 12 October 2009 (UTC)


 * Thanks to whoever is contributing to this article, it has helped me understand a lot. I'm still confused on this point about the Haldane prior.  You can't integrate it from zero (because as you said it would be infinite) - so how can it make sense?  At least in the other distributions, I can integrate from zero to, say, .7, and I get the probability that p < .7.  What would be the probability that p < .7 in this case?  I think the paragraph above is trying to address it, but it's still baffling me. Maxsklar (talk) 18:38, 17 July 2010 (UTC)
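
A numerical sketch may help with the question above. It assumes the Haldane prior is taken as the unnormalized density 1/(p(1-p)); the function names `haldane_mass` and `beta_cdf_int` are made up for this illustration. The first function measures an interval under the prior, showing that the mass below 0.7 grows without bound as the lower cutoff shrinks, so "the probability that p < .7" has no value under the prior alone. The second evaluates the posterior after s successes and f failures, which is Beta(s, f) and is proper once both counts are at least 1.

```python
import math
from math import comb


def haldane_mass(a: float, b: float) -> float:
    """Unnormalized Haldane measure of [a, b], with 0 < a <= b < 1.

    Uses the antiderivative log(p / (1 - p)) of 1 / (p * (1 - p)).
    """
    def logit(p: float) -> float:
        return math.log(p / (1.0 - p))

    return logit(b) - logit(a)


def beta_cdf_int(x: float, s: int, f: int) -> float:
    """CDF at x of Beta(s, f) for positive integers s and f.

    Uses the identity I_x(s, f) = P(Binomial(s + f - 1, x) >= s).
    """
    n = s + f - 1
    return sum(comb(n, j) * x**j * (1.0 - x) ** (n - j) for j in range(s, n + 1))


# Under the prior alone, the mass of (eps, 0.7) keeps growing as eps -> 0.
for eps in (1e-2, 1e-4, 1e-8):
    print(eps, haldane_mass(eps, 0.7))

# After one success and one failure the posterior is Beta(1, 1), i.e. uniform,
# so the probability that p < 0.7 is well defined.
print(beta_cdf_int(0.7, 1, 1))
```

In this reading the Haldane prior only makes sense as a device for producing posteriors: it assigns relative weights to intervals bounded away from 0 and 1, but no probabilities until the data include at least one success and one failure.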

Acronym
I've found the APP acronym for a priori probability in some works, but I can't find a reference/source for this; sometimes the AAP is used instead. For example, looking for "a priori probability + app" in Google shows differences in usage. What do you think? Alfaisanomega (talk) 10:28, 7 December 2010 (UTC)

I work with Bayesian statistics all the time, but I don't recall ever coming across the acronym "APP." Given that priori and posteriori both start with "p," it seems like it would be a fairly confusing acronym. — Quantling (talk | contribs) 13:31, 7 December 2010 (UTC)

Diffuse Prior
The term 'diffuse prior' links to this page but does not appear in the article. I think this is the same idea as an uninformative prior, but because I'm not 100% sure, I do not want to edit the article. Would someone who knows more like to confirm or deny this? — Preceding unsigned comment added by Mgwalker (talk • contribs) 01:05, 18 October 2011 (UTC)

What is an "uncertain quantity"?
Seems like a weasel word; replace it with "unknown quantity"? Is it opposed to a "certain quantity"? The word uncertainty is used a lot in Bayesian statistics, but it's not always illuminating. Biker333 (talk) 11:14, 10 March 2013 (UTC)

Improper priors
This section states that the beta prior with parameters (1,1), which is uniform, is improper. This is wrong. The distribution has finite support and is finite everywhere, so it can be integrated. In any case it is a beta distribution, which can always be normalized. — Preceding unsigned comment added by 92.74.64.12 (talk) 06:12, 7 July 2016 (UTC)
 * Removed reference to beta(1,1) distribution. – Jt512 (talk) 20:12, 12 March 2017 (UTC)

The paragraph starts with: "If Bayes' theorem is written as ..."

It seems to me that the following formula holds only if $$\sum_j P(A_j) = 1$$ and the events $$A_1, A_2, \ldots$$ are mutually exclusive, because only then do we have $$P(B) = \sum_j P(B \& A_j)$$. Am I correct? If so, I think this clarification should be added to the paragraph. — Preceding unsigned comment added by 194.126.102.10 (talk) 09:22, 7 March 2014 (UTC)

The preceding comment was by me. Because no one commented, I added the change to the article. See also https://en.wikipedia.org/wiki/Bayes%27_theorem#Extended_form. — Preceding unsigned comment added by RMasta (talk • contribs) 07:47, 10 March 2014 (UTC)
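
The condition raised in this thread can be illustrated with a small sketch of the extended form of Bayes' theorem over a finite partition. The function name `posterior` and the example numbers are made up for illustration. The code checks that the priors sum to 1; mutual exclusivity of the events cannot be checked from the numbers alone and is simply assumed.

```python
def posterior(priors, likelihoods):
    """Return P(A_j | B) for each event A_j of a partition.

    priors[j] is P(A_j) and likelihoods[j] is P(B | A_j).
    The A_j are assumed mutually exclusive; exhaustiveness is
    checked by requiring the priors to sum to 1.
    """
    if len(priors) != len(likelihoods):
        raise ValueError("priors and likelihoods must have the same length")
    if abs(sum(priors) - 1.0) > 1e-9:
        raise ValueError("priors must sum to 1 for P(B) = sum_j P(B & A_j)")

    # P(B & A_j) = P(B | A_j) * P(A_j)
    joint = [p * l for p, l in zip(priors, likelihoods)]
    # Law of total probability: valid only because the A_j form a partition.
    evidence = sum(joint)
    if evidence == 0.0:
        raise ValueError("B has probability zero under these inputs")
    return [x / evidence for x in joint]


# Hypothetical numbers: three exclusive, exhaustive hypotheses.
print(posterior([0.5, 0.3, 0.2], [0.9, 0.5, 0.1]))
```

If the priors did not sum to 1, the denominator would no longer equal P(B), and the returned numbers would not be the conditional probabilities the formula is meant to give.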

Problems with first paragraph
The first paragraph said:


 * p is the probability distribution that would express one's uncertainty about p before some evidence is taken into account.

It's circular to say that p expresses one's uncertainty about p. In fact p expresses one's beliefs (which are typically uncertain) about some situation. So, I'll change this to


 * p is the probability distribution that would express one's beliefs about a situation before some evidence is taken into account.

Feel free to improve it, but let's not say p expresses our uncertainty about p.

John Baez (talk) 18:10, 25 March 2015 (UTC)

Improper posterior
The Improper prior section states: "[...] However, the posterior distribution need not be a proper distribution if the prior is improper. This is clear from the case where event B is independent of all of the Aj." I do not see how this makes the posterior improper since the constant term which multiplies the priors is still both in the numerator and denominator of the posterior. Can this be further elaborated on?
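
One way to spell out the quoted claim, as a sketch that assumes the article's formula is the extended form of Bayes' theorem and writes the improper prior as unnormalized weights $$w_j$$ (notation introduced here) with $$\sum_j w_j = \infty$$:

$$P(A_i \mid B) \;\propto\; P(B \mid A_i)\, w_i$$

If $$B$$ is independent of every $$A_j$$, then $$P(B \mid A_j) = P(B)$$ for all $$j$$, so

$$P(A_i \mid B) \;\propto\; P(B)\, w_i \;\propto\; w_i, \qquad \sum_j P(B)\, w_j \;=\; P(B) \sum_j w_j \;=\; \infty .$$

The constant does cancel, as the question notes, but what is left after cancelling is the prior itself. The posterior weights are the same as the prior weights, and since those cannot be normalized, the posterior is improper too.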

Link Mistake? Unconditional Probability linked to Marginal Probability
In the sentence

"Similarly, the prior probability of a random event or an uncertain proposition is the unconditional probability that is assigned before any relevant evidence is taken into account."

the link on "unconditional probability" goes to marginal probability.

My math is not strong, which is why I am reading this page, so if they are the same please excuse me. — Preceding unsigned comment added by Therealmarcuschiu (talk • contribs) 18:42, 20 April 2019 (UTC)

Abominable notation
It's depressing to realize that the benighted notation used throughout this article is widespread in the world. Maybe I'll work on this article further. Michael Hardy (talk) 15:18, 16 June 2019 (UTC)

Weakly informative prior
Could a more international example be found to explain weakly informative prior? I have no feeling at all for Fahrenheit. There must be some units where the US hasn't made up their own standard? 82.39.162.179 (talk) 11:04, 29 July 2020 (UTC)

Short description
The Short description was "Probability distribution that would express one's uncertainty before some evidence is taken into account", which is 104 characters long and so the article was added to. I have a "reasonable" understanding of stats, but found the article rather difficult to summarise in just a few words. In the end, I shortened the Short description to "Distribution of an uncertain quantity", which is 37 characters and so short enough. Is this new Short description "true" enough? — GhostInTheMachine talk to me 10:31, 24 February 2022 (UTC)
 * I would write instead "Probability distribution used in Bayesian inference". It's 51 characters, so short enough. We shouldn't try to give a definition, just context. Tercer (talk) 12:50, 24 February 2022 (UTC)

Proposed merge of Strong prior into Prior probability
Would it make more sense to merge in Strong prior and have an expanded article on prior probability altogether? Silvestertaylor (talk) 22:38, 14 July 2023 (UTC)
 * Indeed; a rather obvious case. ✅ Klbrain (talk) 11:51, 27 December 2023 (UTC)