Wikipedia:Reference desk/Archives/Mathematics/2016 October 28

= October 28 =

If ANOVA fails for a given population X (and/or Y), under what circumstances can you still cross-compare X and Y?

 * ANOVA appears to be Analysis_of_variance

(Assume that the data has been or will be de-identified in the process and HIPAA laws are not an issue)

Suppose we have patient data for six hospitals in two cities:

NYC:
 * Bellevue Hospital (forensics floor)
 * Zucker Hillside Hospital
 * Manhattan Psychiatric Center

Philadelphia:
 * PMHCC
 * Belmont
 * Jefferson Psychiatry

Is it possible to, say, cross-compare health care systems in NYC and Philadelphia, even if the variances of the patient subgroups across the hospitals are different enough that the ANOVA fails for, say, the NYC group or the Philadelphia group or both? (For example, PMHCC and the Bellevue Hospital forensics floor are for people who have been arrested or are in prison.)

Would the key issue here be that intergroup variance would have to significantly exceed intragroup variance given a certain statistical power? What test would I use in this case? 50.200.152.3 (talk) 01:57, 28 October 2016 (UTC)
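
For reference, the comparison of intergroup to intragroup variance mentioned above is what the one-way ANOVA F statistic measures. The symbols below are notation assumed for this sketch, not taken from the question: k groups, n_i observations in group i, N observations in total, group means \bar{y}_i, and grand mean \bar{y}.

```latex
F \;=\; \frac{\text{between-group mean square}}{\text{within-group mean square}}
  \;=\; \frac{\displaystyle\sum_{i=1}^{k} n_i\,(\bar{y}_i - \bar{y})^2 \,/\, (k-1)}
             {\displaystyle\sum_{i=1}^{k}\sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i)^2 \,/\, (N-k)}
```

The denominator pools the within-group variances into one estimate, which is why the test assumes the groups share a common variance.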


 * What test you would use depends on what you want to look for. If you can tell us some hypotheses, we might be able to suggest specific tests.
 * Other than that, all I can say is that there are plenty of statistics you could look at to compare NYC to Philly, even if the variances are very different. Statistical hypothesis testing is our general article; Analysis_of_variance has some comments on how some systems can be transformed to be more amenable to ANOVA, and Heteroscedasticity discusses both some relevant tests for that condition and some general techniques that can be applied in that case. SemanticMantis (talk) 20:52, 28 October 2016 (UTC)
 * Well, that doesn't quite answer my question: if ANOVA fails for a given population X (and/or Y), are there any circumstances where you can still cross-compare X and Y? 50.200.152.3 (talk) 21:12, 28 October 2016 (UTC)
 * That is, as a single aggregated metapopulation X against a single aggregated metapopulation Y (e.g. a t-test), even if metapopulation X fails its own ANOVA for the subpopulations within it. 50.200.152.3 (talk) 21:14, 28 October 2016 (UTC)
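
As a rough illustration of the aggregated comparison described above, here is a minimal sketch of Welch's unequal-variance t-test, which does not assume that the two metapopulations share a variance. The function name `welch_t` and the length-of-stay figures are hypothetical, made up for this example. Note that pooling each city into one sample ignores the hospital-level structure, so this only addresses unequal variance between the cities, not the distinct subpopulations within them.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(x, y):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    nx, ny = len(x), len(y)
    vx, vy = variance(x), variance(y)  # sample variances (n - 1 denominator)
    se2 = vx / nx + vy / ny            # squared standard error of the mean difference
    t = (mean(x) - mean(y)) / sqrt(se2)
    df = se2 ** 2 / ((vx / nx) ** 2 / (nx - 1) + (vy / ny) ** 2 / (ny - 1))
    return t, df


# Hypothetical lengths of stay in days, all hospitals in a city pooled together.
nyc = [12, 15, 9, 30, 45, 11, 14, 60]
philly = [8, 10, 7, 9, 12, 11, 10, 9]

t_stat, dof = welch_t(nyc, philly)
print(f"t = {t_stat:.3f}, df = {dof:.1f}")
```

The t statistic would then be compared against a t distribution with the computed degrees of freedom; the standard library has no t-distribution CDF, so the p-value step is left out here.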

Hypotheses
I haven't begun to look at the raw data yet, so it's hypothetical. BUT SAY:
 * NYC patients are hospitalized longer/shorter than Philly patients
 * NYC patients spend more/less money to see a given improvement in their depression / mental health scores than Philly patients

but NATURALLY the criminal/forensics patients within the city subgroups have a different kind of variance or distribution pattern (say) than the "civilian" patients, the state hospital patients, etc. Yanping Nora Soong (talk) 19:32, 1 November 2016 (UTC)


 * I don't believe you can use traditional ANOVA for this, because it partitions the *pooled* variance into within-groups and between-groups. There is an adjustment you can make when the groups to be compared have different variances, but if the variance is inconsistent within the groups to be compared (i.e. you have distinct subpopulations), then the key assumption of ANOVA is violated. Instead, you might look at a hierarchical linear model (HLM): in fact, the examples I remember (from waaaaay back in grad school) were estimating patient survival rates for disparate populations within a larger group (e.g. men vs women within a hospital). OldTimeNESter (talk) 14:48, 2 November 2016 (UTC)
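
One commonly used adjustment for groups with different variances is Welch's ANOVA; whether that is the adjustment meant above is an assumption. The sketch below computes Welch's F statistic and its degrees of freedom using only the standard library. The function name `welch_anova` and the hospital figures are hypothetical. It does not address the case of distinct subpopulations inside each group, for which the hierarchical model suggested above would be the more appropriate tool.

```python
from statistics import mean, variance


def welch_anova(groups):
    """Return Welch's F statistic with its numerator and denominator degrees of freedom."""
    k = len(groups)
    sizes = [len(g) for g in groups]
    means = [mean(g) for g in groups]
    # Weight each group by n_i / s_i^2, so noisier groups count for less.
    weights = [n / variance(g) for n, g in zip(sizes, groups)]
    total_w = sum(weights)
    weighted_mean = sum(w * m for w, m in zip(weights, means)) / total_w

    numerator = sum(w * (m - weighted_mean) ** 2 for w, m in zip(weights, means)) / (k - 1)
    lam = sum((1 - w / total_w) ** 2 / (n - 1) for w, n in zip(weights, sizes))
    denominator = 1 + 2 * (k - 2) / (k ** 2 - 1) * lam

    f_stat = numerator / denominator
    df1 = k - 1
    df2 = (k ** 2 - 1) / (3 * lam)
    return f_stat, df1, df2


# Hypothetical lengths of stay (days) for three hospitals in one city.
hospitals = [
    [30, 45, 60, 25, 90, 40],  # forensic unit: long stays, high variance
    [10, 12, 9, 14, 11, 13],
    [8, 9, 10, 7, 9, 8],
]
f_stat, df1, df2 = welch_anova(hospitals)
print(f"F = {f_stat:.3f}, df1 = {df1}, df2 = {df2:.1f}")
```

With exactly two groups this reduces to the square of Welch's t statistic, which gives a quick consistency check.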