Talk:Coefficient of variation

query by priyanka
The article mentions "if a group of temperatures are analyzed, the standard deviation does not depend on whether the Kelvin or Celsius scale is used since an object that changes its temperature by 1 K also changes its temperature by 1 C.". This implies that standard deviation does not have any units, which is incorrect. Standard deviation has the same units as that of the data. Please correct this line in the article. —Preceding unsigned comment added by 86.29.251.191 (talk) 23:00, 6 January 2011 (UTC)

Actually, the statement in the article is correct, and does not imply that the standard deviation is dimensionless. This is because the unit of the Celsius scale is the kelvin: the Celsius scale differs from the Kelvin scale only by an additive constant, $$T_\text{Celsius} + 273.15 = T_\text{Kelvin}$$, as explained by the last clause of the statement. AndreasWittenstein (talk) 16:57, 7 February 2011 (UTC)
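The point about the additive constant can be checked numerically. The following is a minimal sketch using made-up temperature readings (the values are an assumption for illustration, not data from the article): the standard deviation comes out the same on both scales, while the coefficient of variation does not.

```python
import statistics

# Hypothetical temperature readings in degrees Celsius (illustrative values only).
celsius = [0.0, 10.0, 20.0, 30.0, 40.0]
# Same readings on the Kelvin scale: only an additive shift.
kelvin = [t + 273.15 for t in celsius]

# Population standard deviation is unchanged by the shift.
sd_c = statistics.pstdev(celsius)
sd_k = statistics.pstdev(kelvin)

# The coefficient of variation divides by the mean, which does shift.
cv_c = sd_c / statistics.mean(celsius)
cv_k = sd_k / statistics.mean(kelvin)

print("SD (Celsius, Kelvin):", sd_c, sd_k)
print("CV (Celsius, Kelvin):", cv_c, cv_k)
```

The standard deviation still carries a unit (kelvins, or equivalently Celsius degrees); it is only the choice of zero point that does not affect it.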

comment by pboyd
The Wikipedia article on "Coefficient of Variation" appears to imply that it may be used to determine the relative magnitude of the variation of a single set of data. I'm not sure that I see that.

It seems that the coefficient of variation would not be very useful in assessing the magnitude of variation in 2 situations:

1) for any variable measured on an interval scale (arbitrary zero point) and

2) ratio scales where the observations are all significantly greater than zero (for example, scores on an examination where the average grade was, say, 85).

Am I missing something? Or, is there an alternative measure that does allow for the assessment of the magnitude of dispersion for a single data set?

PBoyd 14:57, 14 September 2006 (UTC)

If the article implies that the CV can, based on a single set of data, determine "relative magnitude", then yes, that's incorrect. Relative comparisons should only be used on multiple sets of data. Where in the article is this implied?

Also, the CV can be useful in ratio scales when comparing two sets of data. Suppose two instructors give exams with equal means, but CV1 = 5 while CV2 = 0.5. Then we can prove the variance of the 2nd instructor's exams is less than the first instructor's. The same reasoning can be applied to interval scales. TrueBlueMichigan 05:14, 1 January 2007 (UTC)
 * I think PBoyd's first point was entirely correct and important, so I've added it to the article with a reference to back it up. (Sorry, but I don't quite follow point (2).) Qwfp (talk) 15:58, 22 February 2008 (UTC)

OK. Point 2 was intended to address a situation where, say, one instructor's exam had a mean of 75 and a standard deviation of 10, and the second instructor's exam had a mean of 85 and a standard deviation of 10. Both could have identical ranges, IQRs, and MADs. That is, except for their positions on the 100-point grading scale, the two distributions are identical. Yet the first would have a CV of (10/75 =) 0.133 and the second would have a CV of (10/85 =) 0.118.

I would think that measures of the magnitude of dispersion need to be independent of the mean, rather than dependent upon it. —Preceding unsigned comment added by 76.119.112.183 (talk) 12:59, 17 April 2008 (UTC)
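The exam example can be reproduced with a short sketch. The scores below are hypothetical, chosen so that the first set has mean 75 and the second is the same set shifted up by 10 points; the dispersion figure of 10 is taken as a standard deviation. The helper name `cv` is an assumption for illustration.

```python
import statistics

def cv(data):
    """Coefficient of variation: population standard deviation over the mean."""
    return statistics.pstdev(data) / statistics.mean(data)

# Hypothetical scores: mean 75, population standard deviation 10.
exam_a = [60, 70, 75, 80, 90]
# Identical distribution shifted up by 10 points: mean 85, same spread.
exam_b = [score + 10 for score in exam_a]

print("SD:", statistics.pstdev(exam_a), statistics.pstdev(exam_b))
print("CV:", round(cv(exam_a), 3), round(cv(exam_b), 3))
```

The range, IQR and standard deviation are identical for the two sets, so the difference in CV comes entirely from the shift in the mean.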

Signal to noise ratio
Usually in engineering the signal-to-noise ratio is expressed in terms of power, not amplitude. So it should be $$(\mu/\sigma)^2$$, not $$\mu/\sigma$$. The link even leads to the Wikipedia page confirming this:

"In engineering, signal-to-noise ratio is a term for the power ratio between a signal (meaningful information) and the background noise:"

http://en.wikipedia.org/wiki/Signal_to_noise_ratio —Preceding unsigned comment added by 198.161.174.194 (talk) 15:00, 29 April 2009 (UTC)
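To make the two conventions concrete, here is a small sketch with assumed values for the mean and standard deviation (the function names and numbers are hypothetical, for illustration only). With CV defined as sigma/mu, the amplitude form is 1/CV and the power form is 1/CV^2.

```python
def snr_amplitude(mu, sigma):
    """Amplitude-ratio convention: mu / sigma, i.e. 1 / CV."""
    return mu / sigma

def snr_power(mu, sigma):
    """Power-ratio convention: (mu / sigma) ** 2, i.e. 1 / CV ** 2."""
    return (mu / sigma) ** 2

# Assumed example values, not taken from the article.
mu, sigma = 10.0, 2.0
cv = sigma / mu

print("amplitude SNR:", snr_amplitude(mu, sigma), "1/CV:", 1 / cv)
print("power SNR:", snr_power(mu, sigma), "1/CV^2:", 1 / cv ** 2)
```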

Plainspeak:

It seems that in an article like this, the introductory paragraph should include a definition IN LAYMAN'S TERMS. I came to the page just to clarify my understanding, and the page lacked a clear, SIMPLE explanation of 'coefficients of variation' in plain speech. 130.132.133.22 (talk)Callielo —Preceding undated comment added 23:02, 21 February 2010 (UTC).

David:

"when comparing between data sets with different units or widely different means, one should use the coefficient of variation for comparison instead of the standard deviation."

As far as I can understand, there is no benefit to using the CV over the standard deviation if the means are similar or equal. By the definition of the formula, the numbers will be in the same ratio to each other. When data sets use different measurement units it usually entails different means, but the CV formula adds informational value only by merit of the different means and not the different units. This same convention is used everywhere I have looked, but I haven't been able to find one example that justifies including 'different measurement units' as a criterion for using the CV over the SD. —Preceding unsigned comment added by 99.231.9.155 (talk) 15:13, 13 January 2011 (UTC)

Merging Articles
Statguy1: I noticed that this section is being considered for merging with another topic. I just want to add my 2 cents: it was helpful for me, today, to have CV as a separate topic. I found the current topic to be well-written and helpful. It answered my question in just a few seconds of reading. — Preceding unsigned comment added by Statguy1 (talk • contribs) 01:58, 28 July 2013 (UTC)

I agree with Statguy1. The other article is poorly written. I would suggest expanding this article with a discussion of Relative Standard Deviation (if there is any difference from CV except the name). One other suggestion is to include formulas for confidence limits of the sampling distribution. ANSK (talk) 18:29, 22 September 2013 (UTC)


 * I agree as well. I'm not sure of the exact policy for merging, but there seems to be a consensus among us three at least and I think someone with the time and expertise should do this. Pengortm (talk) 23:41, 5 October 2013 (UTC)


 * Now that this much time has passed without any disagreement, we can say consensus exists. Anybody can merge the articles now.  I would if I had time right now.  Andrew327 21:15, 14 July 2015 (UTC)

Contradiction regarding normal data?
The Definition says
 * The coefficient of variation should be computed only for data measured on a ratio scale, as these are measurements that can only take non-negative values.

But the following Estimation section gives an estimator for normally distributed data. So which is it? Dbooksta (talk) 21:22, 28 February 2014 (UTC)

Definition
The definition should mention $$\mu\neq 0$$, and IMO the denominator should be $$|\mu|$$. Nijdam (talk) 11:32, 21 April 2014 (UTC)


 * Agreed. I looked at a number of texts and didn't see this used but all the examples had means that were positive. I think the authors didn't consider when means were negative so didn't realize the absolute value should be used. Dger (talk) 17:43, 22 January 2015 (UTC)
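A minimal sketch of the definition as proposed in this thread, with the absolute value of the mean in the denominator and an explicit check for a zero mean. The function name is hypothetical, and the use of the population standard deviation is an assumption; the thread does not specify which estimator is meant.

```python
import statistics

def coefficient_of_variation(data):
    """CV with |mean| in the denominator; undefined when the mean is zero."""
    mu = statistics.mean(data)
    if mu == 0:
        raise ValueError("coefficient of variation is undefined when the mean is zero")
    # Population standard deviation is used here as an assumption.
    return statistics.pstdev(data) / abs(mu)

# Mirrored data sets give the same CV once the absolute value is used.
print(coefficient_of_variation([90, 100, 110]))
print(coefficient_of_variation([-90, -100, -110]))
```

Without the absolute value, the second data set would give a negative result even though its spread relative to the mean is the same.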

Definition in lead paragraph
I've put the definition in the lead paragraph: that's what I needed to read when I came to this page, and it's clearer that way. I think that repeating it in the next paragraph isn't really an issue. --Slashme (talk) 19:16, 7 June 2014 (UTC)

Unitized risk
The term "unitized risk" seems to be very uncommon, and used mainly in finance. It's hard to find good sources that use the term, but I've put in a link to an actuarial study guide. --Slashme (talk) 19:18, 7 June 2014 (UTC)

Example of Misuse
The example of misuse is misleading and should be corrected. The results are simply in different units, % degrees Cent. of one and % degrees Far. for the other. The ratio of 0.79 to 0.42 is simply the ratio of Cent. to Far. John Bollinger, CFA, CMT (talk) 15:07, 23 September 2015 (UTC)

Assessment comment
Substituted at 19:52, 1 May 2016 (UTC)

As a measure of economic inequality
Of the four axioms, Anonymity and Population independence are, strictly speaking, unsourced, but they must be true in order for the sourced statement that it fulfils the requirements for an inequality index to hold. PAR (talk) 22:20, 13 June 2016 (UTC)

Does expectation of sample coefficient of variation for normal distributions exist?
For a normal distribution, as I understand it, the sample mean and sample variance are independent. So $$E(s/\bar{x}) = E(s)\,E(1/\bar{x})$$, but $$\bar{x}$$ is normally distributed, and the expected value of the reciprocal of a normal random variable does not exist (https://stats.stackexchange.com/questions/70045/mean-and-variance-of-inverse-of-a-normal-rv).

So in the estimation section where the article states there is an unbiased estimator, isn't that wrong because there is no expectation? 205.175.118.156 (talk) 23:02, 6 July 2018 (UTC)

Adding "Hypothesis testing for the Coefficient of variation" section
It's worth adding a section on hypothesis testing. Potential source material as reference and for formulas: Not sure when I'll get to add it. If someone else can do it, that would be great. Tal Galili (talk) 10:26, 10 January 2021 (UTC)

Typo in example?
A data set of [90, 100, 110] has more variability. Its population standard deviation is 8.165 and its average is 100 (?)
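The figures can be checked directly. The sketch below uses Python's standard `statistics` module and also prints the sample standard deviation (dividing by n - 1), since confusing the two estimators is a common source of apparent typos; whether that is the concern here is an assumption.

```python
import statistics

data = [90, 100, 110]

mean = statistics.mean(data)         # 100
pop_sd = statistics.pstdev(data)     # population SD, divides by n; about 8.165
sample_sd = statistics.stdev(data)   # sample SD, divides by n - 1; 10.0

print("mean:", mean)
print("population SD:", round(pop_sd, 3))
print("sample SD:", sample_sd)
print("CV (population):", round(pop_sd / mean, 5))
```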