Talk:Mixture distribution

I have only seen these two terms used synonymously 205.155.65.236 06:30, 31 October 2007 (UTC)

Rename
Should we rename this to Mixture distribution so as to include the discrete case? —3mta3 (talk) 12:27, 24 June 2009 (UTC)

————————————————

I suggest renaming this to Mixture family, similarly to how we have the exponential family. Unlike all other “distribution” articles, this model cannot be completely described by a finite number of parameters; there always remains freedom in choosing the “mixed-in” densities pi(x), just like there is a similar freedom in the exponential family. …  st pasha  »  15:20, 16 January 2010 (UTC)

Simulation
I think the text should include a Simulation chapter. It's very easy to simulate such a variable: first, sample i from the discrete set {1, 2, ... n} with weights wi, then sample from the i-th distribution. Is there any source for this? Albmont (talk) 17:29, 1 July 2009 (UTC)
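The two-step procedure described above can be sketched as follows; the weights and normal components here are illustrative choices, not taken from the article:

```python
import random

random.seed(0)  # for reproducibility of this sketch

weights = [0.3, 0.5, 0.2]                            # w_i, summing to 1
components = [(0.0, 1.0), (5.0, 2.0), (10.0, 0.5)]   # (mean, sd) of each normal

def sample_mixture():
    # Step 1: pick a component index i with probability w_i.
    i = random.choices(range(len(weights)), weights=weights)[0]
    # Step 2: draw from the i-th component distribution.
    mu, sigma = components[i]
    return random.gauss(mu, sigma)

draws = [sample_mixture() for _ in range(10000)]
```

The sample mean of `draws` should be close to the mixture mean, here 0.3·0 + 0.5·5 + 0.2·10 = 4.5.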

Incorrect diagram
An IP editor placed this in the main text: "-this is wrong it shows the concept, but the "mixed normal pdf" should be underneath the 3 pdf's because each pdf integrates to 1 on its own." I have hidden both this and the diagram for now. Melcombe (talk) 15:55, 28 June 2010 (UTC)

Incorrect formula for variance (2nd moment; Section 'Moments')
There appears to be a mistake in the formula for the variance of a mixture distribution: The general formula is:

$$ \begin{align} \operatorname{E}[(X - \mu)^j] & = \sum_{i = 1}^n w_i \operatorname{E}[(X_i - \mu_i + \mu_i - \mu)^j] \\ & = \sum_{i=1}^n \sum_{k=0}^j \binom{j}{k} (\mu_i - \mu)^{j-k} w_i \operatorname{E}[(X_i- \mu_i)^k] \end{align} $$

For j=2 (variance), we get:

$$ \begin{align} \operatorname{E}[(X - \mu)^2] = \sigma^2 & = \sum_{i=1}^n \sum_{k=0}^2 \binom{2}{k} (\mu_i - \mu)^{2-k} w_i \operatorname{E}[(X_i- \mu_i)^k] \\ & = \sum_{i=1}^n \left( (\mu_i - \mu)^{2} w_i + 2 (\mu_i - \mu) w_i \cdot 0 + w_i \operatorname{E}[(X_i- \mu_i)^2] \right) \\ & = \sum_{i=1}^n w_i \left( (\mu_i - \mu)^{2} + \operatorname{E}[(X_i- \mu_i)^2] \right) \\ & = \sum_{i=1}^n w_i \left( (\mu_i - \mu)^{2} + \sigma_i^2 \right) \end{align} $$

This is different from the formula in the article:

$$ \operatorname{E}[(X - \mu)^2] = \sigma^2 = \sum_{i = 1}^n w_i (\mu_i^2 + \sigma_i^2) - \mu^2 .$$

I believe the mistake is that

$$ (\mu_i - \mu)^{2} \neq (\mu_i^2 - \mu^2) $$


 * I believe they are equivalent. I just edited the page (https://en.wikipedia.org/w/index.php?title=Mixture_distribution&diff=890512775&oldid=889037141) to reflect the two forms.


 * I dug into this because I was confused as well. I did the same thing you did (worked forward from the general form for central moments) and came to the same conclusion. But if you keep playing around with it, you can come away with the existing form:



$$ \begin{align} \sum_{i=1}^n w_i((\mu_i - \mu)^{2} + \sigma_i^2) & = \sum_{i=1}^n w_i(\mu_i^2 - 2\mu\mu_i + \mu^2 + \sigma_i^2) \\ & = \sum_{i=1}^n w_i\mu_i^2 - 2\mu\sum_{i=1}^n w_i\mu_i + \mu^2\sum_{i=1}^n w_i + \sum_{i=1}^n w_i\sigma_i^2 \\ & = \sum_{i=1}^n w_i(\mu_i^2 + \sigma_i^2) + \mu^2\sum_{i=1}^n w_i - 2\mu\sum_{i=1}^n w_i\mu_i \\ & = \sum_{i=1}^n w_i(\mu_i^2 + \sigma_i^2) + \mu^2 \cdot 1 - 2\mu \cdot \mu \\ & = \sum_{i=1}^n w_i(\mu_i^2 + \sigma_i^2) + \mu^2 - 2\mu^2 \\ & = \sum_{i=1}^n w_i(\mu_i^2 + \sigma_i^2) - \mu^2 \end{align} $$


 * Aslvrstn (talk) 20:27, 1 April 2019 (UTC)
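A quick numeric check that the two variance forms discussed above agree; the weights, means, and standard deviations here are arbitrary illustrative values, not from the article:

```python
# Illustrative mixture parameters (not from the article)
weights = [0.2, 0.5, 0.3]
means = [1.0, 4.0, -2.0]
sds = [1.5, 0.5, 2.0]

# Mixture mean: mu = sum_i w_i * mu_i
mu = sum(w * m for w, m in zip(weights, means))

# Central-moment form: sum_i w_i * ((mu_i - mu)^2 + sigma_i^2)
var_central = sum(w * ((m - mu) ** 2 + s ** 2)
                  for w, m, s in zip(weights, means, sds))

# Form from the article: sum_i w_i * (mu_i^2 + sigma_i^2) - mu^2
var_article = sum(w * (m ** 2 + s ** 2)
                  for w, m, s in zip(weights, means, sds)) - mu ** 2
```

Both expressions should evaluate to the same number, confirming the algebra.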

Edits
I altered the statement that the variance formula applied to normal distributions only. It is general.

Source: https://statisticalmodeling.wordpress.com/2011/06/16/the-variance-of-a-mixture/ — Preceding unsigned comment added by Svein Olav Nyberg (talk • contribs) 22:25, 17 November 2015 (UTC)

Finite, countable, uncountable mixtures
The term discrete is more concise than "finite and countable". Also, the term continuous feels better than uncountable here even though it's more restrictive. More accessible to someone with just a calculus background. Then we could make note that it generalizes in the context of measure theory. — Preceding unsigned comment added by 207.11.1.161 (talk) 08:44, 20 February 2016 (UTC)

Generating functions
There should be a new section about moment generating (and maybe characteristic) functions of mixture distributions. Kjetil B Halvorsen 15:10, 19 April 2017 (UTC) — Preceding unsigned comment added by Kjetil1001 (talk • contribs)
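By the law of total expectation, the identity such a section could state is immediate; writing the mixture as above with weights w_i and components X_i:

```latex
$$ M_X(t) = \operatorname{E}\!\left[e^{tX}\right] = \sum_{i=1}^n w_i \operatorname{E}\!\left[e^{tX_i}\right] = \sum_{i=1}^n w_i M_{X_i}(t), \qquad \varphi_X(t) = \sum_{i=1}^n w_i \varphi_{X_i}(t). $$
```

That is, the moment generating function (and likewise the characteristic function) of a finite mixture is the same weighted sum of the component functions.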

Clarification of components of ridgeline function
My understanding of the Ray and Lindsay paper is that Σ is the variance, as defined in section 2 of their paper (The ridgeline manifold): "the density of a multivariate normal distribution with mean µ and variance Σ". The page identifies it as the covariance. I'm not solid enough on the math to make the edit myself, but someone with more time/knowledge can perhaps check this. — Preceding unsigned comment added by Rontl (talk • contribs) 21:51, 2 July 2020 (UTC)

Add formula explicitly
Where it is written "A similar integral can be written for the cumulative distribution function" (section "Uncountable mixtures"), I suggest this formula be given explicitly.
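For reference, assuming the section writes the mixing density as w(a) over a parameter a (notation assumed here, mirroring the density integral), the analogous CDF integral would read:

```latex
$$ F(x) = \int F(x \mid a)\, w(a)\, da, $$
```

where F(x | a) is the cumulative distribution function of the component with parameter a.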