Talk:Random effects model

Note of old merge
Article merged: See old talk-page here. Btyner (talk) 02:36, 21 June 2008 (UTC)

maybe it's a wrong hint to link random effect models only to hlm?!
Hello everybody. Maybe it's better to link random effects models to all the different classes of models where random effects are utilized, e.g. mixture models!

Yes, let's link them. (Jessmack (talk) 19:42, 9 December 2007 (UTC))

I like this page. It gives the estimation of the variance terms. Maybe linking this to regression models would be more appropriate. —Preceding unsigned comment added by Danioyuan (talk • contribs) 16:18, 20 December 2007 (UTC)

Hello. I think that the page is wrong when it states that:


 * $$ \frac{1}{n}E(SSB) = \frac{\sigma^2}{n} + \tau^2.$$

Notice that E(SSB) increases as m increases, whereas the right hand side doesn't depend on m at all.

In particular, when m = 1, the left hand side becomes zero.

I think that the correct version is as follows:


 * $$ \frac{1}{(m - 1)n}E(SSB) = \frac{\sigma^2}{n} + \tau^2.$$

Here is my reasoning:

1. The distribution of $$\overline{Y}_{i\bullet} - \mu$$ is $$N(0, \tau^2 + \sigma^2/n)$$.

2. The distribution of $$\overline{Y}_{\bullet\bullet} - \mu$$ is $$N(0, \tau^2/m + \sigma^2/(nm))$$.

3. We have
 * $$\sum_{i=1}^m (\overline{Y}_{i\bullet} - \mu)^2 = (\sum_{i=1}^m (\overline{Y}_{i\bullet} - \overline{Y}_{\bullet\bullet})^2) + m(\overline{Y}_{\bullet\bullet} - \mu)^2$$

4. Hence, the expectation of the LHS above is $$m(\tau^2 + \sigma^2/n)$$, and the expectation of the second term of the RHS is $$\tau^2 + \sigma^2/n$$. Therefore, the expectation of the first term of the RHS is $$(m - 1)(\tau^2 + \sigma^2/n)$$.

5. However, the first term of the RHS is just SSB/n.

I will now amend the article accordingly. Someone should check all of this as I'm not an expert. —Preceding unsigned comment added by Slumberjay (talk • contribs) 19:15, 25 November 2008 (UTC)
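
The expectation argued for above can be checked numerically. The following is a minimal Monte Carlo sketch, not taken from the article. It assumes the balanced one-way model $$Y_{ij} = \mu + U_i + W_{ij}$$ with $$U_i \sim N(0, \tau^2)$$ and $$W_{ij} \sim N(0, \sigma^2)$$, m groups with n observations each, and $$SSB = n\sum_{i=1}^m (\overline{Y}_{i\bullet} - \overline{Y}_{\bullet\bullet})^2$$. The function name and the parameter values are illustrative choices only.

```python
import numpy as np


def simulate_mean_ssb(m, n, tau2, sigma2, reps=20000, seed=0):
    """Monte Carlo estimate of E(SSB) for a balanced one-way random effects model."""
    rng = np.random.default_rng(seed)
    # Group effects U_i (one per group) and individual terms W_ij.
    u = rng.normal(0.0, np.sqrt(tau2), size=(reps, m, 1))
    w = rng.normal(0.0, np.sqrt(sigma2), size=(reps, m, n))
    y = u + w  # mu is set to 0; SSB does not depend on mu
    group_means = y.mean(axis=2)                           # shape (reps, m)
    grand_mean = group_means.mean(axis=1, keepdims=True)   # balanced design
    ssb = n * ((group_means - grand_mean) ** 2).sum(axis=1)
    return ssb.mean()


# Illustrative values (assumptions, not from the article).
m, n, tau2, sigma2 = 5, 4, 2.0, 3.0
mean_ssb = simulate_mean_ssb(m, n, tau2, sigma2)
target = sigma2 / n + tau2

print("E(SSB) / ((m - 1) n):", mean_ssb / ((m - 1) * n))
print("E(SSB) / n          :", mean_ssb / n)
print("sigma^2 / n + tau^2 :", target)
```

Under these assumptions, the first printed value should be close to the target, while the second should be roughly (m − 1) times larger, which is consistent with the (m − 1)n divisor proposed above.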

Meaning of symbol
Perhaps we should mention what the symbol $$\otimes$$ stands for in the Random effects estimation section. I'm not sure I remember enough about estimation in matrix notation to hazard a guess... - dcljr (talk) 22:54, 23 July 2009 (UTC)

Opposite conventions?
I find the assertion in this edit implausible. Does anyone know of any reason to think it's true? Michael Hardy (talk) 22:51, 20 July 2010 (UTC)
 * My adding the "expert" template partly relates to this. It would certainly help if some citations were given for each of the supposedly different meanings. Melcombe (talk) 17:14, 25 January 2011 (UTC)
 * Our conversation (between me and Michael Hardy) in 2007 about whether FE was a special case of RE or vice versa certainly points to there being a contradiction in definitions. Looking at the page as it stands today I would still reiterate that it is "wrong" as far as my (econometrics-originated) definition of FE/RE is concerned. See for example (Hayashi, 2000:p334) for a discussion of FE/RE which is consistent with my assertion that RE is a special case of FE, in the sense that FE is consistent whenever RE is consistent, but under certain assumptions, FE is consistent but RE is not. Torfason (talk) 19:27, 6 March 2013 (UTC)

Lead
I have added an "expert" template because the lead is now very poor. It doesn't say what the article thinks a random effects model is, but still talks about other possible meanings for the term. Melcombe (talk) 17:12, 25 January 2011 (UTC)

The mixed or not a mixed
The example in the motivation section is not a mixed model; it's a purely random effects model (well, at least according to how this term is used in econometrics). It would have been a fixed effects model if we had assumed Ui to be unknown constants that ought to be estimated, instead of just random variables; or maybe we could have assumed that β1, β2 are in fact β1i, β2i (gender and racial impacts differ across schools), which are again unknown non-random constants. It seems to me that the notation in the mixed model article ought to be fixed, whereas what was here was correct. //  st pasha  » 10:49, 25 February 2011 (UTC)
 * In the model being discussed ...

$$Y_{ij} = \mu + \beta_1 \mathrm{Sex}_{ij} + \beta_2 \mathrm{Race}_{ij} + \beta_3 \mathrm{ParentsEduc}_{ij} + U_i + W_{ij},\, $$
 * Sexij, Raceij  and ParentsEducij are fixed effects, while Ui and Wij are random effects. The model does not have  β1i, β2i in it. As it stands  β1, β2 are simply regression coefficients for the fixed effects, and the model is a mixed effects model. Any extension of the model to allow other terms to be treated as random effects should not be placed so early in the article as the following section on the decomposition of the sum of squares relates to the simple model initially stated. Melcombe (talk) 09:27, 25 February 2011 (UTC)


 * Hmm, this may be the manifestation of this remark... In econometrics, the variables Sexij, ..., ParentsEducij are just additional explanatory variables, while μ, ..., β1 are the main parameters of interest. At the same time Ui is the random effect (see e.g. p.567 in Greene's textbook). The entire model is called the random effects model. In fact, the term "mixed model" is not used in econometrics at all (at least there is no mention of it in the 77-volume Handbook of Econometrics).
 * I still don't understand the statistical definition of "mixed model" though: if you say that Sexij and Raceij are fixed effects, while Ui is a random effect, then would you say that the regression $$Y_{ij} = \mu + \beta_1 \mathrm{Sex}_{ij} + \beta_2 \mathrm{Race}_{ij} + W_{ij}$$ is a fixed effects model? Because it seems like a regular linear regression to me... //  st pasha  » 10:49, 25 February 2011 (UTC)
 * Yes, $$Y_{ij} = \mu + \beta_1 \mathrm{Sex}_{ij} + \beta_2 \mathrm{Race}_{ij} + W_{ij}$$ is (probably) a fixed effects model. Sex and Race are categorical variables, and you are (probably) trying to estimate a specific effect for sex or race that would be repeatable in another iteration of the experiment. (It is conceivable that you aren't trying to estimate repeatable effects of sex or race, i.e. you are essentially just treating these as blocking factors, in which case it would not be a fixed effects model, but I'm guessing not). As for fixed effects vs linear regression: if you're using ordinary least squares to estimate both models, then the fixed effect model is just a special case of linear regression.--140.247.119.83 (talk) 17:29, 9 April 2013 (UTC)


 * If the model were your extended version

$$Y_{ij} = \mu + \beta_{1,i} \mathrm{Sex}_{ij} + \beta_{2,i} \mathrm{Race}_{ij} + \beta_{3,i} \mathrm{ParentsEduc}_{ij} + U_i + W_{ij},\, $$
 * then β1i (etc.) are either random coefficients or random effects, while Sexij (etc.) are just fixed explanatory variables... it is the β1i which are random, depending on the random selection of the school (saying that better explanatory power is obtained by allowing coefficients to vary between schools, just as Ui allows the intercept in a simple linear model to vary between schools). Sexij (etc.) are fixed, conditional on the selection of the pupil within the selected school. The errors Ui  and Wij represent errors in the predictive model once the characteristics of those pupils and schools selected are known.


 * Going back to when you said "Fixed effects would have been if we assumed Ui to be the unknown constants that ought to be estimated, instead of just random variables" ... I think this is wrong ... even in a random effects model it is possible to obtain "best possible" estimates for individual values of μ+Ui. For the initial model in the article, these estimates are either (for a school not included in the sample) the "grand mean" of the sample, or (for a school in the sample) a weighted average of the grand mean and the school's sample mean, where the weights depend on estimates of the variances of the random components.


 * Melcombe (talk) 12:54, 25 February 2011 (UTC)
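
The weighted average described above can be sketched as follows. This is an illustration under stated assumptions rather than the article's own method: it assumes the balanced model without covariates, $$Y_{ij} = \mu + U_i + W_{ij}$$, with n pupils per school, and it treats the variance components $$\tau^2 = \mathrm{Var}(U_i)$$ and $$\sigma^2 = \mathrm{Var}(W_{ij})$$ as known (in practice they would be replaced by estimates). The function name and the numbers are hypothetical.

```python
def shrinkage_estimate(school_mean, grand_mean, n, tau2, sigma2):
    """Estimate mu + U_i for a sampled school as a weighted average.

    The weight on the school's own mean is tau2 / (tau2 + sigma2 / n):
    it approaches 1 when the school effect variance dominates or n is large,
    and 0 when there is no between-school variance.
    """
    weight = tau2 / (tau2 + sigma2 / n)
    return weight * school_mean + (1.0 - weight) * grand_mean


# Hypothetical numbers: school mean 70, grand mean 60, 4 pupils per school.
estimate = shrinkage_estimate(70.0, 60.0, n=4, tau2=2.0, sigma2=8.0)
print(estimate)  # weight is 2 / (2 + 8/4) = 0.5, so the estimate is 65.0

# With no between-school variance the estimate collapses to the grand mean.
no_school_effect = shrinkage_estimate(70.0, 60.0, n=4, tau2=0.0, sigma2=8.0)
print(no_school_effect)  # 60.0
```

For a school that is not in the sample there is no school mean to weight, so the estimate is simply the grand mean, as stated above.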

Contrast a fixed-effects and random-effects model
It would help me understand the difference between a random effects and fixed effects model (and also mixed, I guess) if the example was changed to show random effects and fixed effects models for contrast. dfrankow (talk) 19:03, 4 March 2011 (UTC)

I agree with @Dfrankow that this page could do a better job of explaining the conceptual and practical differences between fixed and random effects models. The Fixed effects model page repeats much of the confusing jargon and doesn't address his concerns either, so I proposed a change on Talk:Fixed effects model similar to the one I'm proposing here. I'll wait a while and get feedback before making changes.
 * First, I think it's important to emphasize the importance of categorical explanatory variables in this context. By traditional definitions, continuous explanatory variables are fixed effects, so it is only important to consider categorical variables and their interactions when deciding between a fixed and random/mixed effects model.  When we treat a categorical explanatory variable as a fixed effect, we assume that we have observed every category of interest.  When we treat it as a random effect, we assume that the categories follow a categorical distribution and we have only observed a small sample of all possible categories.
 * Second, the article uses the panel-data terminology subject-specific effects to refer to effects impacting groups of observations with the same value of a categorical explanatory variable. If multiple measurements are made on one subject it is correct to call this a subject-specific effect, but outside of panel/longitudinal data that's not generally true.  In the example of students' test scores, each observation comes from a different individual student, and the effect of the school on the student's score is more accurately described as school-specific or group-specific, not subject-specific.  --Maximillion Likelihood (talk) 00:13, 3 December 2014 (UTC)

text quality / important information missing
The text doesn't explain the most important issue: What is a random effect as opposed to a fixed effect? --Jazzman (talk) 12:02, 24 May 2016 (UTC)

Mixed up Fixed and Random effects?
In the introduction, it says that biostatisticians use fixed to refer to population averages and random to refer to specific subjects. I'm pretty certain this is backwards. — Preceding unsigned comment added by 98.16.130.122 (talk) 09:15, 1 November 2016 (UTC)


 * Reading one of these papers, the statement "Contrast this to the biostatistics definitions, as biostatisticians use 'fixed' and 'random' effects to respectively refer to the population-average and subject-specific effects" is not supported by the content: https://doi.org/10.2307%2F2529876.
 * They give this equation: y = Xa + Zb + e. a is population parameters and b is individual parameters. Later they say that b can be modeled as fixed or random. This does not imply that they use the term "random" to refer to specific subjects. They use the word random to mean the parameter is drawn from a random distribution, which is the same meaning as everyone else. And like many others, they use random effects to model the effect of individuals. There is no "contrast". 128.174.75.191 (talk) 17:04, 20 February 2024 (UTC)

Simple Example
Hi All, in this section we find this description: ''In this model Ui is the school-specific random effect: it measures the difference between the average score at school i and the average score in the entire country and it is "random" because the school has been randomly selected from a larger population of schools. The term, Wij is the individual-specific effect. That is, it is the deviation of the j-th pupil’s score from the average for the i-th school. Again this is regarded as random because of the random selection of pupils within the school, even though it is a fixed quantity for any given pupil.''

The effect of the ith school is not "random" because the sample on which the model is being fit is random! The model is trying to model the population, not the sample, and the model specifies the assumptions about the population not about the sample. No model in statistics is trying to model the sample--in fact that's the opposite of statistics, which is about making inferences from samples about populations. The term "random effects" has apparently caused a lot of discussion and argument (elsewhere), and it probably was not a great name choice. Nevertheless, I'm quite sure the description in this article is wrong. May I change it? Chafe66 (talk) 20:45, 14 March 2017 (UTC)

Original article and scientists
Hi everyone, first of all, thank you so much for making this article. It's super helpful. Does anyone know which article was the first to discuss this model? It would be great to cite the original paper and give the authors credit. — Preceding unsigned comment added by 64.206.141.45 (talk) 16:28, 19 January 2019 (UTC)
 * (David 1995) says that the first occurrence of the term "random effects" was in (Eisenhart 1947), and the first occurrence of "random effects model" was in (Scheffe 1956). (Searle 2006) says the first known occurrence of the model later known as the random effects model was in (Airy 1861). Dk657 (talk) 17:27, 29 June 2019 (UTC)

Confusing random and fixed effects
"Random effect models assist in controlling for unobserved heterogeneity when the heterogeneity is constant over time and not correlated with independent variables. This constant can be removed from longitudinal data through differencing, since taking a first difference will remove any time invariant components of the model."

This seems like a basic mistake. Fixed effects models are estimated through differencing (or the within transformation; see the FE page). But the RE estimator uses FGLS: see this or Wooldridge (10.4.1 - Estimation and Inference under the Basic Random Effects Assumptions).

Also, this article needs a section on the RE estimator.

Ulaniantho (talk) 22:49, 22 December 2021 (UTC)
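
The point about differencing can be illustrated with a small simulation. This is only a sketch under assumed settings, not an implementation of the FGLS random effects estimator: it assumes the panel model $$y_{it} = \beta x_{it} + c_i + e_{it}$$ with a time-invariant component $$c_i$$ that is deliberately correlated with $$x_{it}$$, and all names and parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
n_units, n_periods, beta = 2000, 5, 1.5

# Time-invariant unit component c_i, correlated with the regressor x_it.
c = rng.normal(size=(n_units, 1))
x = c + rng.normal(size=(n_units, n_periods))
e = rng.normal(size=(n_units, n_periods))
y = beta * x + c + e

# First differences within each unit: c_i cancels out of y_it - y_{i,t-1}.
dx = np.diff(x, axis=1).ravel()
dy = np.diff(y, axis=1).ravel()
beta_fd = (dx @ dy) / (dx @ dx)

# Pooled OLS slope that ignores c_i, for comparison.
xc = x.ravel() - x.mean()
yc = y.ravel() - y.mean()
beta_pooled = (xc @ yc) / (xc @ xc)

print("first-difference estimate:", beta_fd)
print("pooled OLS estimate      :", beta_pooled)
```

In this setup the first-difference estimate should land near the true slope of 1.5, while the pooled estimate is pushed upward because it absorbs the correlation between x and the omitted component. Differencing removes the time-invariant term whether or not it is correlated with the regressors, which is why it is usually associated with fixed effects estimation rather than with the random effects estimator.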

Needs a software section?
I don't know if we need one. 68.134.243.51 (talk) 01:27, 19 September 2022 (UTC)

Simple example equation
I am not sure that the presence of both Wij and εij makes sense in the equation of the simple example (in terms of notation, not mathematically speaking), because we would then consider their sum as the individual-specific random variable. Moreover, εij is not taken into account in the computation of the variance of Yij later in the explanation. So shouldn't εij be removed? 213.55.220.57 (talk) 07:27, 3 September 2023 (UTC)