Talk:Variational Bayesian methods

This article feels like a mess
I'm not sure where to begin. I feel that I've wasted several hours trying to understand the material in this article, and that I would have saved hours had the article been more clearly written. For instance, it appears - and I'm still not 100% clear on this - that the notation $$E_{i \ne j} \log p(Z, X) $$ is to be understood as an expectation with respect to the variational distribution $$q(\cdot)$$, as opposed to an expectation with respect to, say, the full conditional of $$Z_i$$. So maybe better notation would be $$E_q [\log p(Z, X) | Z_i]$$ or something similar; we are apparently supposed to be integrating out everything EXCEPT $$Z_i$$ and doing so with respect to the variational distribution (the fact that we are supposed to be using the variational distribution to do this isn't mentioned anywhere). Maybe the notation used in this article is standard somewhere, but coming at this from a math stat/probability background I find it confusing.

Maybe these things are obvious to the authors of this article, but it still took quite a bit of effort to get. It also seems like much effort could be saved in calculations if the full conditional distributions of $$p(Z | X)$$ were derived first in the examples, since the full conditionals in the examples are well known. Then one would have $$q(Z_i) \propto \exp\left\{E_{i \ne j} \log p(Z_i | X, Z_{-i}) \right\}$$ since all the terms not associated with the full conditional can be absorbed into the normalizing constant. The examples would appear less daunting if shortcuts like this were used.
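
To spell out why the full conditional should be enough, here is a short sketch of the argument. It assumes the mean-field factorization $$q(Z) = \prod_j q_j(Z_j)$$; the symbols $$q_{-i}$$ (the product of all factors except $$q_i$$) and $$E_{q_{-i}}$$ (expectation over $$Z_{-i}$$ under $$q_{-i}$$) are notation introduced here for illustration, not taken from the article.

$$
\begin{aligned}
\log p(Z, X) &= \log p(Z_i \mid X, Z_{-i}) + \log p(Z_{-i}, X) \\
E_{q_{-i}}\left[\log p(Z, X)\right] &= E_{q_{-i}}\left[\log p(Z_i \mid X, Z_{-i})\right] + E_{q_{-i}}\left[\log p(Z_{-i}, X)\right] \\
&= E_{q_{-i}}\left[\log p(Z_i \mid X, Z_{-i})\right] + \text{const} \\
q_i(Z_i) &\propto \exp\left\{ E_{q_{-i}}\left[\log p(Z_i \mid X, Z_{-i})\right] \right\}
\end{aligned}
$$

The second term on the right of the first line does not depend on $$Z_i$$, so its expectation is a constant with respect to $$Z_i$$ and is absorbed into the normalizing constant of $$q_i$$.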

--68.101.66.185 (talk) 19:46, 11 August 2012 (UTC)

Equation Correctness
This edit added the claim that the equations in the section A more complex example are incorrect. I have replaced this line with a Dispute:about template. However, I am not familiar enough with the material to know if the equations are indeed incorrect, and if so, how to correct them. — Preceding unsigned comment added by 18.189.40.103 (talk) 17:09, 23 April 2018 (UTC)

Clearly a lot of work had gone into producing this section. It is still very possible that it contains errors of course, but the edit claiming that there were errors didn't even bother to say where or what the errors were, which makes the criticism difficult to confirm and in my opinion doubtful. I'd be in favour of simply deleting the unsubstantiated claim and the dispute tag until we have a more precisely defined dispute. 80.229.247.11 (talk) 10:18, 24 February 2021 (UTC)

I spent an hour comparing a large portion of the equations to those in Bishop 2006, pp. 474-479. I found no errors. I did not check the rest because checking them is significant work, and I think the comment about the equations being incorrect was not in good faith. I will remove the dispute tag. If anyone knows of a genuine problem, they can make a specific claim.

Before using Bishop, I first checked the 2011 Errata (uploaded in 2016) and found only one change in that section. The article did not use the equation that needed correction.

BrotherE (talk) 23:03, 29 June 2023 (UTC)

obnoxious
Using the _same_ symbol to refer _both_ to a random variable and to the argument to its density function is profoundly obnoxious and very bad in a number of ways. Maybe I'll come back to do some cleanup here later. Michael Hardy (talk) 21:12, 29 April 2019 (UTC)

Remove Proofs Section
The proofs section in this article doesn't seem to have a major connection to the topic. It refers to some specific papers about information geometry and there is no obvious connection to the previous sections. EitanPorat (talk) 23:39, 18 March 2023 (UTC)

Proof section - what is being proved?
The Mathematical derivation section reads clearly up to the subsection Proofs. Nothing stated up to this point needs any further proof, the math is self-contained. But then what is being proven in the Proofs section? At the very least, state what the proof is a proof of.

It appears that this section is not a Proofs section at all, but instead it is an instantiation of the algorithm. But it doesn't tie into what was just presented just before that. If this is how the algorithm is usually implemented, it needs to make a transition, and needs to motivate how the algorithm will work (how it will use this math). Right now this section just hits you and loses you. User:Ldc 2-Apr-2023 — Preceding unsigned comment added by 2601:646:9300:9E40:A02F:326:9D75:BBD8 (talk) 17:04, 2 April 2023 (UTC)


 * Totally agree; please remove this section. EitanPorat (talk) 01:55, 13 March 2024 (UTC)

A duality formula for variational inference
This part is not understandable as it is. What is the point of this section? I think either the importance of this theorem should be explained, or it should be removed from the article. At least it should be moved to "further discussion", as it does not appear to be fundamental enough to have its own section. 95.90.185.152 (talk) 13:39, 10 September 2023 (UTC)

This article is a mess
People are adding their own research and it convolutes everything. Only the significant material should be included.

Why is the proof for variational inference necessary? Is this a significant paper in the field?

No original research EitanPorat (talk) 01:54, 13 March 2024 (UTC)