Talk:Hierarchical Bayes model

improper = not normalizable
I changed "be improper and not normalizable" to "be improper (not normalizable)" because the two phrases are synonymous. Any concerns?--Dangauthier 13:24, 5 September 2007 (UTC)

Illustrating picture
In the current version, it seems that the model described by the plate picture doesn't correspond to anything in the text, and doesn't illustrate the example. Reason: in the example, there are as many v_i as x_i (namely n), so there should be only one plate in the graph. What is N? Question: isn't this a bit confusing for the reader? Do you plan to add a more complex example? Or am I wrong?--Dangauthier 13:24, 5 September 2007 (UTC)

Second paragraph in intro
If the data are $$x\,\!$$ and parameters $$\vartheta$$, isn't $$p(\vartheta)$$ a likelihood and $$p(x|\vartheta)$$ a probability, just the opposite of what is written? Or is this an accepted abuse of terminology in the Bayesian community? Van Parunak (talk) 12:44, 19 August 2008 (UTC)

 * No. In $$p(\theta|x) \propto p(x|\theta)p(\theta)$$, the prior is $$p(\theta)$$ and the posterior is $$p(\theta|x)$$. These are both probability density functions, functions of $$\theta$$. The function $$p(x|\theta)$$ is not a pdf but a likelihood: it is considered a function of its second argument, $$\theta$$, i.e. it shows the likelihood of values of the parameter $$\theta$$ given the data $$x$$. So all three are functions of $$\theta$$, which is as it should be.

 * This is completely standard Bayesian notation. Unfortunately $$p(\cdot)$$ is used for both pdfs and likelihoods, but this usage is so well established that there's little hope of changing it. --88.109.216.145 (talk) 17:05, 5 November 2009 (UTC)
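The distinction drawn above can be seen numerically. A minimal sketch (a toy coin-bias problem, not from the article): the prior and posterior are pdfs over $$\theta$$, while the likelihood is the same expression $$p(x|\theta)$$ read as a function of $$\theta$$ for fixed data, and it need not integrate to 1.

```python
import numpy as np

# Grid of parameter values theta (a coin's bias, purely illustrative).
theta = np.linspace(0.01, 0.99, 99)

# Prior p(theta): a pdf over theta (uniform here, for simplicity).
prior = np.ones_like(theta) / len(theta)

# Data x: say 7 heads out of 10 flips (assumed numbers).
heads, flips = 7, 10

# Likelihood p(x | theta): a function of theta for fixed x.
# It is NOT a pdf in theta -- it need not sum/integrate to 1 over theta.
likelihood = theta**heads * (1 - theta)**(flips - heads)

# Posterior p(theta | x) is proportional to p(x | theta) p(theta),
# normalized over theta, so it is again a pdf over theta.
posterior = likelihood * prior
posterior /= posterior.sum()

print(posterior.sum())   # normalized: sums to 1 up to float error
print(likelihood.sum())  # not normalized: no reason to sum to 1
```

All three arrays are indexed by $$\theta$$, matching the point made above that prior, likelihood, and posterior are all functions of $$\theta$$.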

"Rich" models
I would suggest changing the following sentence in the text.

"It is a powerful tool for expressing rich statistical models that more fully reflect a given problem than a simpler model could."

As a non-statistician I simply don't understand what is being expressed with this sentence, i.e. what does "fully reflect a given problem" mean? Why is it a powerful tool? etc. It seems to me that where a "simple" model describes the data optimally, there is no need to employ a different kind of model to describe it. In many ways Occam's razor would suggest that the law of parsimony should apply. Jimjamjak (talk) 12:19, 8 August 2011 (UTC)
 * Tried to fix this a bit; still not great. Simpler models are indeed better unless they do not, in fact, represent the situation very well. Essentially, hierarchical Bayes lets the modeler more directly express dependencies between the parts of the situation being modeled. —johndburger 02:38, 24 September 2011 (UTC)
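One way to see what "expressing dependencies" buys you is the classic two-level setup: group parameters drawn from a shared prior, observations drawn within each group. A hypothetical sketch (all numbers and the simple shrinkage formula are assumptions for illustration, not content from the article):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-level hierarchy: a shared prior ties the
# group-level parameters together, which a flat (non-hierarchical)
# model with independent groups cannot express.
mu0, tau = 0.0, 1.0      # hyperparameters of the shared prior (assumed)
n_groups, n_obs = 5, 20  # illustrative sizes
sigma = 0.5              # within-group observation noise

# Level 1: each group's parameter v_i is drawn from the common prior.
v = rng.normal(mu0, tau, size=n_groups)

# Level 2: each observation x_ij depends only on its group's v_i.
x = rng.normal(v[:, None], sigma, size=(n_groups, n_obs))

# Because the v_i share a prior, data from one group inform estimates
# for another ("borrowing strength" / partial pooling).
group_means = x.mean(axis=1)
pooled_mean = x.mean()

# Simple shrinkage estimate: a weighted compromise between each
# group's own mean and the pooled mean, with weight set by the
# prior and noise variances (normal-normal conjugate form).
w = tau**2 / (tau**2 + sigma**2 / n_obs)
shrunk = w * group_means + (1 - w) * pooled_mean
print(shrunk.round(2))
```

The shrinkage step is exactly the dependency a flat model misses: each group's estimate is pulled toward the pooled mean, with the amount of pooling governed by the shared hyperparameters rather than chosen by hand.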