Wikipedia:Reference desk/Archives/Mathematics/2009 October 25

= October 25 =

Hierarchical probability model
I had a homework question that I've already done, and I think I got it right, since the problem sort of tells you what the answer is going to be. But I'm not sure I understand one step in there, which is the crucial one.

I have a three-stage model with distributions $$Y | N \sim bin(N, p) \qquad N | \Lambda \sim Poisson(\Lambda) \qquad \Lambda \sim gamma(\alpha, \beta)$$ and I am supposed to find the marginal distribution of Y. Well, my first thought was to find the marginal of N using the conditional on N and the marginal of $$\Lambda$$. However, that led to a very messy sum and I wasn't sure what to do with it. So, instead I tried finding the conditional of $$Y | \Lambda$$. What I did seemed to work out but I'm not exactly sure why. I did $$f(y | \lambda) = \sum f(y | n) f(n | \lambda)$$.

So, my question is pretty simple. Is that equation true? Assuming it is, the rest of the problem seemed to work out. Thanks for any help. StatisticsMan (talk) 02:31, 25 October 2009 (UTC)
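As an aside for anyone wanting to check the algebra in the replies numerically: the three-stage model is straightforward to simulate. A minimal Python sketch, assuming the shape/scale parametrization of the gamma and made-up illustration values α = 3, β = 2, p = 0.4 (none of these come from the original question):

```python
import math
import random

def sample_y(alpha, beta, p, rng):
    """One draw of Y from the three-stage model."""
    lam = rng.gammavariate(alpha, beta)  # Lambda ~ gamma(shape alpha, scale beta)
    # N | Lambda ~ Poisson(lam), via Knuth's product-of-uniforms method
    n, prod, thresh = 0, rng.random(), math.exp(-lam)
    while prod > thresh:
        n += 1
        prod *= rng.random()
    # Y | N ~ bin(n, p): count successes in n Bernoulli(p) trials
    return sum(rng.random() < p for _ in range(n))

rng = random.Random(0)
alpha, beta, p = 3.0, 2.0, 0.4
draws = [sample_y(alpha, beta, p, rng) for _ in range(20000)]
mean_y = sum(draws) / len(draws)
print(mean_y)
```

The sample mean should land near p·E[Λ] = p·αβ = 2.4, which is one quick consistency check on whatever marginal distribution the algebra produces.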


 * I'm not an authority on probability, but I don't think that's true in general since it assumes that Y depends only on N and it's not otherwise affected by Λ. You would need to replace f(y|n) with f(y|n,λ).  It works out here since you sum over all the λ's, but doesn't it get you the same result as doing it by finding the marginal of N?
 * $$f(n) = \sum_{\lambda} f(n|\lambda)f(\lambda)$$
 * $$f(y) = \sum_n f(y|n)f(n) = \sum_n f(y|n)(\sum_\lambda f(n|\lambda)f(\lambda)) = \sum_n \sum_\lambda f(y|n)f(n|\lambda)f(\lambda)$$
 * Rckrone (talk) 17:03, 26 October 2009 (UTC)

You've said:
 * $$Y | N \sim \text{bin}(N, p) \qquad N | \Lambda \sim \text{Poisson}(\Lambda) \, $$

and that is enough to entail that
 * $$Y | \Lambda \sim \text{Poisson}(p\Lambda) \, $$

by doing a bit of algebra (you need to remember the power series for the natural exponential function). To be continued if I'm not too busy..... Michael Hardy (talk) 19:46, 26 October 2009 (UTC)

First use the law of total probability, then some algebra:

$$\begin{align} & \Pr(Y = y \mid \Lambda) \\[10pt] & = E(\Pr(Y = y \mid N)\mid \Lambda) = E\left( \binom{N}{y} p^y (1 - p)^{N - y} \mid \Lambda \right) \\[10pt] & = \left(\frac{p}{1-p}\right)^y E\left( \binom{N}{y} (1-p)^N \mid \Lambda \right) \\[10pt] & = \left(\frac{p}{1-p}\right)^y \sum_{n=0}^\infty \binom{n}{y} (1-p)^n \frac{\Lambda^n e^{-\Lambda}}{n!} \\[10pt] & = \left(\frac{p}{1-p}\right)^y \sum_{n=y}^\infty \binom{n}{y} (1-p)^n \frac{\Lambda^n e^{-\Lambda}}{n!}\quad\left(\text{since }\binom{n}{y} = 0\text{ if }n < y\right) \\[10pt] & = \left(\frac{p}{1-p}\right)^y \sum_{m=0}^\infty \binom{m+y}{y} (1-p)^{m+y} \frac{\Lambda^{m+y} e^{-\Lambda}}{(m+y)!}\quad \left(\text{by the substitution }m = n-y\right) \\[10pt] & = p^y \Lambda^y e^{-\Lambda} \sum_{m=0}^\infty \binom{m+y}{y} (1-p)^m \frac{\Lambda^m}{(m+y)!} \\[10pt] & = p^y \Lambda^y e^{-\Lambda} \sum_{m=0}^\infty \frac{\left((1-p)\Lambda\right)^m}{y!m!} \\[10pt] & = \frac{p^y \Lambda^y e^{-\Lambda}}{y!} \sum_{m=0}^\infty \frac{\left((1-p)\Lambda\right)^m}{m!} \\[10pt] & = \frac{p^y \Lambda^y e^{-\Lambda}}{y!} e^{(1-p)\Lambda} \\[10pt] & = \frac{(p\Lambda)^y e^{-p\Lambda}}{y!}. \end{align}$$

...and there you have the Poisson distribution with expected value pΛ. Michael Hardy (talk) 20:16, 26 October 2009 (UTC)

... next step. Now that we've established that, we need to use the distribution of Λ to find the marginal distribution of Y. So:
 * $$ Y \mid \Lambda \sim \text{Poisson}(p\Lambda), \, $$

$$\begin{align} \Pr(Y = y) & = E(\Pr(Y = y \mid \Lambda)) = E\left( \frac{(p\Lambda)^y e^{-p\Lambda}}{y!} \right) \\ & = \int_0^\infty \frac{(p\lambda)^y e^{-p\lambda}}{y!}\, f_\Lambda(\lambda)\, d\lambda, \end{align}$$

and then you need to put in the gamma density and evaluate the integral. If you know how to find the normalizing constant in the gamma density, plus a bit of algebra, then this one's pretty straightforward.

But you haven't yet told us which of the two commonplace parametrizations of the gamma distribution you're using. Michael Hardy (talk) 20:24, 26 October 2009 (UTC)
 * I'm still fairly sure that you can't find Y|Λ from only N|Λ and Y|N except in some exceptional cases. For example, suppose a bunch of fair coins are flipped and Y, N, and Λ each measure the flip of one of the coins (not necessarily distinct).  Suppose f(n|λ) is always 1/2, in other words N and Λ are measuring different coins, and f(y|n) is always 1/2, so Y and N are different coins.  You can't say from this information whether Y and Λ measure the same coin or not.  You would need Y|N,Λ instead of Y|N to say, but you don't have that. Rckrone (talk) 20:45, 26 October 2009 (UTC)
 * If you know the distribution of B given A, and that of C given A and B, then you can find the marginal distribution of C. The problem was expressed as (B given A), and (C given B).  But the word "hierarchical" was used.  I would take that word to imply that "C given B" was intended to mean (C given B and A). Michael Hardy (talk) 02:28, 27 October 2009 (UTC)
 * Actually here's an example I like better: Suppose a poll finds half of Democrats and half of Republicans approve of the war in Afghanistan, and that half the people who approve of the war and half the people who don't are satisfied with President Obama. That doesn't imply that only half of Democrats are satisfied with the president. Rckrone (talk) 23:36, 26 October 2009 (UTC)
 * As I mentioned before, the math above will get you to the correct result for Pr(Y=y) even though Y|Λ is not justified. You can get to the same place by using the formula for the marginal distribution twice and exchanging the order of summation:

$$\begin{align} &\Pr(Y=y) = \sum_n \Pr(Y=y|N=n)\Pr(N=n) \\ &= \sum_n \Pr(Y=y|N=n) \int_0^\infty f_{N|\Lambda}(n|\lambda) f_\Lambda(\lambda)d\lambda \\ &= \int_0^\infty \left(\sum_n \Pr(Y=y|N=n) f_{N|\Lambda}(n|\lambda)\right) f_\Lambda(\lambda)d\lambda \end{align}$$
 * and then the dirty work for the sum is the same as above. Rckrone (talk) 21:57, 26 October 2009 (UTC)


 * Thanks for the help! Rckrone, you have two different formulas, one above with a double sum, and one below with a sum and an integral.  Is the first one incorrect?  Shouldn't the two be the same?  Your work shows why that formula "works" even though it's not correct.
 * Michael Hardy, I am afraid I do not understand some of what you are doing. I do understand the basic idea, as if you use my original, but incorrect, formula, you can "show" that f(y|lambda) is Poisson(Lambda p).  I believe you are doing it a correct way, but it may be a bit advanced for me :)  For example, I do not get $$\Pr(Y = y \mid \Lambda) = E(\Pr(Y = y \mid N)\mid \Lambda)$$.  I have seen a formula for expectations of conditional expectations, $$EX = E(E(X | Y))$$.  But, I am not sure I have ever seen an expectation of a probability, and I do not understand why such a statement is true.  I do understand that the probability you wrote is a random variable since it has N and not n.  So, the expectation of it makes sense.  But, why can you insert that N in there and put the expectation on the outside?  It seems very different to me!  Thanks! StatisticsMan (talk) 01:15, 27 October 2009 (UTC)
 * Oh, also, I did not tell you which Gamma I was using because I did not know there were two :) My book uses the one with the pdf

$$f(x | \alpha, \beta) = \frac{1}{\Gamma(\alpha) \beta^\alpha} x^{\alpha - 1} e^{-x/\beta}, \qquad 0 < x < \infty, \qquad \alpha, \beta > 0$$
 * Thanks again. StatisticsMan (talk) 01:19, 27 October 2009 (UTC)

OK,
 * $$ \Pr(Y = y \mid N = n) \, $$

depends on n. Thus

$$\begin{align} \Pr(Y = y \mid N = 0), \\ \Pr(Y = y \mid N = 1), \\ \Pr(Y = y \mid N = 2), \\ \Pr(Y = y \mid N = 3), \\ \Pr(Y = y \mid N = 4), \dots \end{align}$$

are different numbers. So
 * $$ \Pr(Y = y \mid N) $$

is a random variable that is equal to

$$\begin{align} \Pr(Y = y \mid N = 0) & \text{ if }N = 0 \\ \Pr(Y = y \mid N = 1) & \text{ if }N = 1 \\ \Pr(Y = y \mid N = 2) & \text{ if }N = 2 \\ \text{etc.} \end{align}$$

And as I said, see law of total probability. The expected value of this last-mentioned random variable is equal to the marginal probability that Y = y, i.e. to Pr(Y = y). Michael Hardy (talk) 02:03, 27 October 2009 (UTC)

...or to put it another way, let's say N is either 0, 1, or 2. Then

$$\begin{align} \Pr(A) & = \Pr([A\text{ and }N=0]\text{ or }[A\text{ and }N=1]\text{ or }[A\text{ and }N=2]) \\[10pt] & = \Pr(A\text{ and }N=0) + \Pr(A\text{ and }N=1) + \Pr(A\text{ and }N=2) \\[10pt] & = \Pr(A \mid N = 0)\Pr(N=0) + \Pr(A \mid N = 1)\Pr(N=1) + \Pr(A \mid N = 2)\Pr(N=2) \\[10pt] & = E(\Pr(A \mid N)). \end{align}$$

Michael Hardy (talk) 02:22, 27 October 2009 (UTC)
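The identity Pr(A) = E(Pr(A | N)) spelled out above is easy to check mechanically on a toy example. All the numbers below are made up purely for illustration, and the Monte Carlo estimate is just a second, independent route to the same value:

```python
import random

p_n = {0: 0.2, 1: 0.5, 2: 0.3}           # made-up distribution of N
p_a_given_n = {0: 0.1, 1: 0.4, 2: 0.9}   # made-up Pr(A | N = n)

# Pr(A) as the expected value of the random variable Pr(A | N)
pr_a = sum(p_a_given_n[n] * p_n[n] for n in p_n)

# Monte Carlo cross-check: draw N from p_n, then A given N, and count
rng = random.Random(1)
def draw_a():
    u = rng.random()
    n = 0 if u < 0.2 else (1 if u < 0.7 else 2)
    return rng.random() < p_a_given_n[n]

est = sum(draw_a() for _ in range(100000)) / 100000
assert abs(est - pr_a) < 0.01
print(round(pr_a, 2))
```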


 * StatisticsMan: The sum version of the formula is for discrete variables and the integral version is for continuous variables. Other than that they're the same formula (see Marginal distribution).  In the first post I really wasn't sure what "Poisson" and "bin" and "gamma" were so I didn't know which variables were discrete and which were continuous.  I just used the sum version to show the generic case.  Then Michael Hardy kindly demonstrated what they look like, so I was more specific the second time with Λ being continuous and N discrete. Rckrone (talk) 02:23, 27 October 2009 (UTC)

OK, here's what you'll need to know about integrals:
 * $$\int_0^\infty x^{\alpha - 1} e^{-x/\beta} \,dx = \Gamma(\alpha) \beta^\alpha$$

...to be continued..... Michael Hardy (talk) 02:38, 27 October 2009 (UTC) Continuing: "what you'll need to know about integrals" means
 * $$\int_0^\infty x^{\text{something} - 1} e^{-x/\text{something else}} \,dx = \Gamma(\text{something}) (\text{something else})^\text{something}$$

and we will change the "something" and "something else" as we proceed.
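The integral identity itself is easy to sanity-check numerically. A Python sketch with arbitrary test values a = 2.5, b = 1.5, using a plain Riemann sum on (0, 60), the assumption being that the integrand's tail beyond 60 is negligible for these values:

```python
import math

a, b = 2.5, 1.5          # arbitrary "something" and "something else"
n_steps, hi = 200000, 60.0
h = hi / n_steps

# Riemann sum of x^(a-1) e^(-x/b) on (0, 60); the integrand vanishes
# at x = 0 (since a > 1) and is negligible past x = 60
approx = h * sum((i * h)**(a - 1) * math.exp(-(i * h) / b)
                 for i in range(1, n_steps))

exact = math.gamma(a) * b**a
assert abs(approx - exact) / exact < 1e-3
```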

We had
 * $$ \Pr(Y = y \mid \Lambda) = \frac{(p\Lambda)^y e^{-p\Lambda}}{y!}. $$

So now

$$\begin{align} \Pr(Y = y) & = E(\Pr(Y = y \mid \Lambda)) \\[10pt] & = E\left( \frac{(p\Lambda)^y e^{-p\Lambda}}{y!} \right) \\[10pt] & = \int_0^\infty \frac{(p\lambda)^y e^{-p\lambda}}{y!} f_\Lambda(\lambda)\, d\lambda \\[10pt] & = \int_0^\infty \frac{(p\lambda)^y e^{-p\lambda}}{y!} \frac{1}{\Gamma(\alpha)\beta^\alpha} \lambda^{\alpha -1} e^{-\lambda/\beta}   \, d\lambda \\[10pt] & = \frac{p^y}{y!\Gamma(\alpha)\beta^\alpha} \int_0^\infty \lambda^{y + \alpha - 1} e^{-(p + 1/\beta)\lambda} \, d\lambda \\[10pt] & = \frac{p^y}{y!\Gamma(\alpha)\beta^\alpha} \int_0^\infty \lambda^{y + \alpha - 1} e^{-\lambda/(\beta/(p\beta + 1))} \, d\lambda \\[10pt] & = \frac{p^y}{y!\Gamma(\alpha)\beta^\alpha} \cdot \Gamma(\text{something}) (\text{something else})^\text{something} \\[10pt] & = \frac{p^y}{y!\Gamma(\alpha)\beta^\alpha} \cdot \Gamma(y + \alpha) \left(\frac{\beta}{p\beta + 1}\right)^{y + \alpha} = \left(\frac{p\beta}{p\beta + 1}\right)^y \frac{\Gamma(y + \alpha)}{y!\Gamma(\alpha)}. \end{align}$$

...to be continued.... Michael Hardy (talk) 03:43, 27 October 2009 (UTC)

OK, that last expression should say:
 * $$ \left(\frac{p\beta}{p\beta + 1}\right)^y \left( \frac{1}{p\beta + 1} \right)^\alpha \frac{\Gamma(y + \alpha)}{y!\Gamma(\alpha)}. $$

Canceling the gammas yields:
 * $$ \left(\frac{p\beta}{p\beta + 1}\right)^y \left( \frac{1}{p\beta + 1} \right)^\alpha \frac{(y + \alpha - 1)(y + \alpha - 2)(y + \alpha - 3)\cdots(\alpha+1)\alpha}{y!}. $$

This is:

$$\begin{align} & {}\quad \left(\frac{p\beta}{p\beta + 1}\right)^y \left( \frac{1}{p\beta + 1} \right)^\alpha \binom{y + \alpha -1}{y} \\[10pt] & = (1 - s)^y s^\alpha \binom{y + \alpha -1}{y}, \end{align}$$

where $$s = 1/(p\beta + 1)$$.

...and there you have a negative binomial distribution. Michael Hardy (talk) 14:36, 27 October 2009 (UTC)
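The end-to-end claim, that mixing Poisson(pΛ) over the gamma prior gives this negative binomial pmf, can be checked numerically. A Python sketch using the corrected closed form with s = 1/(pβ + 1), arbitrary illustration values α = 3, β = 2, p = 0.4, and the assumption that truncating the integral at λ = 80 loses nothing visible:

```python
import math

alpha, beta, p = 3.0, 2.0, 0.4   # illustration values

def closed_form(y):
    """(pB/(pB+1))^y * (1/(pB+1))^a * Gamma(y+a) / (y! Gamma(a))"""
    s = 1.0 / (p * beta + 1.0)
    return ((1 - s)**y * s**alpha * math.gamma(y + alpha)
            / (math.factorial(y) * math.gamma(alpha)))

def by_integration(y, n_steps=200000, hi=80.0):
    """Integrate the Poisson(p*lam) pmf against the gamma(alpha, beta) density."""
    h = hi / n_steps
    fact_y = math.factorial(y)
    norm = math.gamma(alpha) * beta**alpha
    total = 0.0
    for i in range(1, n_steps):
        lam = i * h
        poisson = (p * lam)**y * math.exp(-p * lam) / fact_y
        gamma_pdf = lam**(alpha - 1) * math.exp(-lam / beta) / norm
        total += poisson * gamma_pdf
    return h * total

for y in range(5):
    assert abs(by_integration(y) - closed_form(y)) < 1e-6
```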


 * I think this all makes sense now. Sorry I didn't look at the link to total probability.  The reason is, as is sort of alluded to in the article, I know the total probability law as that special case of the real total probability law.  So, I thought I knew what it was, so I didn't look.  StatisticsMan (talk) 01:41, 28 October 2009 (UTC)

Pyramids
How many sides does a pyramid have? —Preceding unsigned comment added by 71.50.14.205 (talk) 05:37, 25 October 2009 (UTC)
 * Depends entirely on how many sides the base area has; see Pyramid (geometry). If you're thinking about the buildings, then pretty much all of them have a rectangular base. Also note that the question is somewhat ambiguous as it's not obvious whether the base itself should count as a "side". — JAO • T • C 06:30, 25 October 2009 (UTC)


 * "Side" isn't a good word to use about a 3D figure, as it can be interpreted as "face" or as "edge". The former is the likeliest meaning in the OP, which is unfortunate as in 2D the unambiguous meaning is "edge", as in "six-sided polygon".→86.160.55.126 (talk) 11:34, 25 October 2009 (UTC)
 * It might not be the most exact term, but I don't think there's really any ambiguity. "Side" is often used informally to refer to faces of a polyhedron (as in "a 6-sided die") and not to the edges.  It might seem inconsistent to call the edges of a polygon and the faces of a polyhedron the same thing since they have different dimension, but they also both have codimension 1 relative to the whole polytope, which makes them analogous in a lot of ways. Rckrone (talk) 17:31, 25 October 2009 (UTC)
 * For what it's worth, I initially thought the OP meant edges. So there may not be ambiguity, but there might be a possibility of confusion. -- Meni Rosenfeld (talk) 19:58, 25 October 2009 (UTC)
 * Well I guess if it's unclear to people then I would be wrong about that. Rckrone (talk) 05:38, 26 October 2009 (UTC)
 * Surely, sides of a square-base pyramid refers to the four horizontal sides of the square base (which, I agree, are often associated (or even confused) with the four sloping triangular faces). Mathematically, of course, the square-base pyramid has five faces and eight edges, but colloquially one speaks of four sides which one can walk round.    D b f i r s   10:50, 26 October 2009 (UTC)
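The counts in the last reply generalize to any n-gonal pyramid; a small Python sketch, counting the base as one of the faces:

```python
def pyramid_counts(n):
    """Faces, edges, vertices of a pyramid over an n-sided base."""
    faces = n + 1        # n triangular lateral faces + the base
    edges = 2 * n        # n base edges + n slant edges to the apex
    vertices = n + 1     # n base corners + the apex
    return faces, edges, vertices

# Euler's formula V - E + F = 2 holds for every convex polyhedron
for n in range(3, 12):
    f, e, v = pyramid_counts(n)
    assert v - e + f == 2

print(pyramid_counts(4))   # square-base pyramid -> (5, 8, 5)
```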

Invariant Reliability Index (ϐ)
I am looking for an example of calculating the invariant reliability index (ϐ) for a given function. I can't find anything related in Wikipedia search. —Preceding unsigned comment added by 216.36.86.205 (talk) 19:02, 25 October 2009 (UTC)

Er, after some further searching, it looks like what I'm after is called the first-order reliability method (FORM). Does anyone know where I can find an example of applying the method to a given function? Thanks. 216.36.86.205 (talk) 19:41, 25 October 2009 (UTC)