Talk:Standard deviation/Archive 2

Weighted calculation
n' is defined wrongly. It should be the square of the sum of the weights, divided by the sum of the squares of the weights; i.e., V1^2/V2 from http://en.wikipedia.org/wiki/Weighted_variance#Weighted_sample_variance —Preceding unsigned comment added by 174.1.36.106 (talk) 23:30, 1 March 2011 (UTC)
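To make the proposed definition concrete: with V1 = sum of weights and V2 = sum of squared weights, n′ = V1^2/V2 is the "effective sample size". A small sketch (the weights are made up for illustration):

```python
# Effective sample size n' = (sum of weights)^2 / (sum of squared weights)
ws = [0.5, 1.0, 1.0, 2.5]  # made-up example weights
n_eff = sum(ws) ** 2 / sum(w * w for w in ws)
print(n_eff)  # unequal weights give n' below the raw count

equal = [1.0] * 4
print(sum(equal) ** 2 / sum(w * w for w in equal))  # equals n when all weights are equal
```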

Rapid calculation
The correct calculation is here, and I don't think it's equivalent:

SQRT((s0*s2-s1^2)/(s0*(s0-1)))

http://syberad.com/calculator/WebHelp/charts/statistics/definitions/standard_deviation.htm

Anyone who can double-check this and fix the formula, that would be cool.

Simul (talk) 20:33, 18 March 2011 (UTC)
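Assuming s0, s1, and s2 denote the count, the running sum, and the running sum of squares (the usual convention for such formulas), the linked formula can be double-checked numerically against the textbook sample standard deviation:

```python
import math
import statistics

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
s0 = len(data)                      # number of samples
s1 = sum(data)                      # running sum
s2 = sum(x * x for x in data)       # running sum of squares
s = math.sqrt((s0 * s2 - s1 ** 2) / (s0 * (s0 - 1)))
print(s, statistics.stdev(data))    # the two values agree
```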

Combining Standard Deviation - a reference for citation
Found a source to cite for the discrete sampling portion of Combining Standard Deviations: http://www.burtonsys.com/climate/composite_standard_deviations.html I'm rusty on Wiki editing and don't know the right style for citing these things, so I'll leave it to someone else. Feel free to remove this comment when that is done.

Risce (talk) 14:47, 25 July 2011 (UTC)

With Sample Standard Deviation - please rephrase for clarity
"The reason for this correction is that s^2 is an unbiased estimator for the variance σ^2 of the underlying population, if that variance exists and the sample values are drawn independently with replacement. However, s is not an unbiased estimator for the standard deviation σ; it tends to overestimate the population standard deviation."

Can someone please clarify these two sentences - they read paradoxically. If s^2 is an unbiased estimate of σ^2, then how can s be a biased estimator of σ??

Jwsadler (talk) 15:50, 4 August 2011 (UTC)
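There is no paradox: the square root is a (strictly) concave function, so by Jensen's inequality E[s] ≤ sqrt(E[s^2]) = σ, with equality only if s never varies. A quick simulation, a sketch assuming a standard normal population (true σ = 1) and small samples, shows the direction of the bias:

```python
import random
import statistics

random.seed(0)
n, trials = 5, 20000
# average the sample standard deviation s over many samples from N(0, 1);
# since E[s^2] = 1 exactly, any shortfall of E[s] below 1 is the Jensen bias
mean_s = statistics.fmean(
    statistics.stdev([random.gauss(0.0, 1.0) for _ in range(n)])
    for _ in range(trials)
)
print(mean_s)  # noticeably below 1, even though E[s^2] = 1
```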


 * I have removed the claim that Jwsadler refers to. I have looked at the article that was used as a citation; it does not really say what the author's definition of sample standard deviation is - it might be different than the contemporary one. The paragraph in question read as follows:


 * However, s is not an unbiased estimator for the standard deviation σ; it tends to underestimate the population standard deviation. (reference to Gurland J and Tripathi RC. 1971. A simple approximation for unbiased estimation of the standard deviation. Amer. Stat. 25:30-32)


 * Anša (talk) 15:24, 13 September 2012 (UTC)

Adding link to example in "Maximum likelihood estimator" wikipedia-entry
Under the section "Continuous distribution, continuous parameter space" in the "Maximum likelihood" wikipedia entry, the calculation of the expectation of the maximum likelihood estimate of the standard deviation (for a normal distribution) is examplified. This illustrates nicely why one calls the "With sample standard deviation" a biased estimator. So maybe it would be a point to add in this "With sample standard deviation" hence clarifying why it is exactly N-1 that is chosen as the adjustment. At least, that's how I understood it. — Preceding unsigned comment added by 158.64.77.254 (talk) 15:47, 20 December 2011 (UTC)

Cauchy distribution
The remark on the Cauchy distribution, "the standard deviation of a random variable that follows a Cauchy distribution is undefined because its expected value μ is undefined" is wrong. In the Cauchy distribution, the expected value μ is easily estimated from the sample mean. The larger the sample, the more accurate the μ estimate. Not so for the sample variance! It is the estimate of the variance, σ^2=E[(x-μ)^2], that does not converge, and consequently the standard deviation (square root of variance) does not converge either.

You can calculate the sample variance, but it does not represent the distribution because the Cauchy distribution does not have a variance (the defining integral for variance does not converge). With a Cauchy distribution, the larger your sample, the larger the calculated variance, without limit! With distributions that do have a variance, the variances estimated from ever larger samples converge to that of the underlying distribution. Cauchy is a classic case of a "tail heavy" distribution.
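The divergence of the sample variance can be illustrated deterministically by evaluating the variance over evenly spaced standard Cauchy quantiles (quantile function tan(π(p − 1/2)), center of symmetry 0); as more of the tail is included, the computed variance grows without bound:

```python
import math

def cauchy_quantile_variance(n):
    # mean square of n evenly spaced standard Cauchy quantiles;
    # the standard Cauchy is symmetric, so its center is 0
    xs = [math.tan(math.pi * (i / (n + 1) - 0.5)) for i in range(1, n + 1)]
    return sum(x * x for x in xs) / n

for n in (99, 999, 9999):
    print(n, cauchy_quantile_variance(n))  # grows roughly linearly with n
```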

Also, the article propagates a major computing error in statistical software:

"Thus, the standard deviation is equal to the square root of (the average of the squares less the square of the average). See computational formula for the variance for a proof of this fact, and for an analogous result for the sample standard deviation."

The above is called the "computer's formula", and is in common use because it eliminates the need to make a pass through the data to compute the mean and the need to subtract the mean from each sample in a second pass through the data. However, a calculation problem arises when the mean is large compared to the standard deviation. The problem is that you are subtracting two numbers that are very nearly the same -- a major no-no in numerical methods! On your calculator, take the standard deviation of 1111, 1111.1111, and 1111.2222. The exact answer is 0.1111. Some calculators are better than others. There is a simple and accurate algorithm for managing this problem that requires two memory registers, just like the grossly defective computer's formula. Details upon request to richard1941@gmail.com. — Preceding unsigned comment added by 98.151.182.233 (talk) 04:45, 2 January 2012 (UTC)
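The two-register method alluded to is presumably something like Welford's one-pass update (an assumption; the commenter's exact algorithm is not given). A sketch contrasting it with the one-pass computational formula, using a larger offset so the cancellation is visible in double precision:

```python
import math

def naive_sd(xs):
    # one-pass "computer's formula": accumulates the sum and the sum of squares
    n, s1, s2 = len(xs), 0.0, 0.0
    for x in xs:
        s1 += x
        s2 += x * x
    v = (n * s2 - s1 * s1) / (n * (n - 1))
    return math.sqrt(v) if v > 0 else float("nan")  # cancellation can even make v negative

def welford_sd(xs):
    # Welford's one-pass update: also just two running "registers" (mean and m2)
    n, mean, m2 = 0, 0.0, 0.0
    for x in xs:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
    return math.sqrt(m2 / (n - 1))

data = [1e8 + d for d in (0.0, 0.1111, 0.2222)]  # exact sample sd is 0.1111
print(naive_sd(data))    # badly wrong (or nan): the subtraction destroys the precision
print(welford_sd(data))  # close to 0.1111
```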


 * The contribution "In the Cauchy distribution, the expected value μ is easily estimated from the sample mean. The larger the sample, the more accurate the μ estimate" in the above is wrong. It is well-known that if the distribution of a single sample value is Cauchy, the distribution of the sample mean of any number of such (independent) sample values has exactly the same Cauchy distribution ... the distribution of the sample mean does not become more concentrated about the population centre and hence saying "The larger the sample, the more accurate the μ estimate" is incorrect. Melcombe (talk) 23:03, 6 January 2012 (UTC)

Link to Mean
How about linking to an article about the "mean", or at least explaining it? The article assumes the reader already knows about it.

156.34.68.113 (talk) 17:30, 20 January 2012 (UTC)
 * There's a link to 'mean' in the second sentence of the article. Qwfp (talk) 17:33, 20 January 2012 (UTC)

Standard deviation of the mean should be Standard error
I have never seen the terminology "standard deviation of the mean" used in statistics literature. "Standard deviation of the mean" describes the standard error of a measurement and I think that the section talking about the standard deviation of the mean should be restructured to reflect that. This would avoid confusion for readers unfamiliar with the subject and would redirect them to the standard error page where they could get more information.

In Cntrl (talk) 04:19, 9 February 2012 (UTC)


 * The "Standard deviation of the mean" is a well-defined population-based quantity, while "Standard error of the mean" is a sample-based estimate of the "Standard deviation of the mean". See Standard error (statistics). Melcombe (talk) 01:29, 14 February 2012 (UTC)

Variance and standard deviation merger proposal
Any cons? Please discuss it in Talk:Variance. Fgnievinski (talk) 05:25, 10 April 2012 (UTC)

Generalizing from two numbers
The concept in the section of "Generalizing from two numbers" is incorrect. — Preceding unsigned comment added by 156.42.184.101 (talk) 18:16, 14 August 2012 (UTC)


 * Can you be more specific? FilipeS (talk) 10:56, 6 September 2012 (UTC)

This section is misleading. I'm certain there's something deeply wrong with it, however I'm having trouble working out the details. Regardless, the equation it yields is absolutely wrong; see lower on the page for correct equations. I'm cutting this section temporarily and I'll work on putting it back or outlining why it's so horribly wrong. Please don't reinstate the section if you're not willing to point out or correct the logical flaws. Torsionalmetric (talk) 22:05, 11 September 2012 (UTC)

The "generalizing" section is, in principle, a nice idea. However, the mathematics shown previously was horribly flawed. The identification of $$\sigma$$ as the mean as well as the definition in that section was erroneous.
 * It should be evident that it is not possible to extrapolate from the two-number case to a three-number case following that definition.
 * The generalization from the simple case to the complex case is complete nonsense, partially based on the above impossible-extrapolation. I can show this if needed.

If the generalizing section is still desired, it can be rewritten. I'll keep these pages watched; if someone really wants it rewritten, I'll take it on. But please, don't simply revert this edit. That section may have had noble intent, but the result was mathematical garbage. Torsionalmetric (talk) 22:49, 11 September 2012 (UTC)

Moving Calculation
I think the calculation for the moving weighted standard deviation is wrong. I think the correct formula is $$ Q_k = Q_{k-1} + \frac{w_k^2 W_{k-1}}{W_k^2}(x_k - A_{k-1})^2 + w_k(x_k-A_k)^2 $$ Gannektakazoink (talk) 01:48, 6 November 2012 (UTC) or to make the transition from unweighted to weighted perfectly obvious : $$ Q_k = Q_{k-1} + \frac{(w_k^2+W_{k-1})W_{k-1}}{W_k^2}(x_k-A_{k-1})^2 $$ Gannektakazoink (talk) 15:19, 6 November 2012 (UTC)
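For comparison, the incremental update usually attributed to West (1979) maintains the weighted mean A_k and the weighted sum of squared deviations Q_k as below; this is a sketch for cross-checking the recurrences proposed above against a direct two-pass computation, not a verdict on them:

```python
def weighted_mean_var(pairs):
    # West's incremental update: W = running weight sum, A = weighted mean,
    # Q = weighted sum of squared deviations about the running mean
    W = A = Q = 0.0
    for w, x in pairs:
        W += w
        delta = x - A
        A += (w / W) * delta
        Q += w * delta * (x - A)   # uses the mean both before and after the update
    return A, Q / W                # weighted mean, (biased) weighted variance

def direct(pairs):
    # two-pass reference computation
    W = sum(w for w, _ in pairs)
    A = sum(w * x for w, x in pairs) / W
    return A, sum(w * (x - A) ** 2 for w, x in pairs) / W

pairs = [(2.0, 1.0), (1.0, 4.0), (3.0, 2.5)]  # (weight, value), made-up data
print(weighted_mean_var(pairs))
print(direct(pairs))  # the two agree
```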

Wrong figure in the lead section
The figure in the lead section, the first figure, is wrong! The graph is touching the X axis, but in a normal distribution the graph never touches the X axis. Thanks. -- Abhijeet Safai (talk) 15:19, 16 November 2012 (UTC)

Standard deviation and sample standard deviation
Please check the definition of standard deviation and sample standard deviation.

In my understanding of measuring technology, the standard deviation always refers to the (mostly unknown) true value and has the number of samples $$N$$ in the denominator. The standard deviation is defined as:
 * $$ \sigma_x = \sqrt{\frac{1}{N} \sum_{i=1}^N (x_i - x_\mathrm{true})^2}. $$

The standard sample deviation (or whatever you may call it) is used if you do not know the true value (and use the mean value, instead):
 * $$ \sigma_{x,\mathrm{est.}}  = \sqrt{\frac{1}{N-1} \sum_{i=1}^N (x_i - \overline{x})^2}.$$

This term is an estimation of the standard deviation. The $$N-1$$ in the denominator (instead of $$N$$) accounts for the ignorance of the true value of x. --Michael Lenz (talk) 22:35, 29 November 2012 (UTC)
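The point that the mean-based sum of squares understates the scatter about the true value can be illustrated directly: the sample mean is the center that minimizes the sum of squared deviations, so using it instead of the true value can only shrink the sum. A sketch (the data and "true value" below are made up):

```python
data = [2.1, 1.7, 2.4, 2.0, 1.9]   # made-up measurements
true_value = 2.0                    # hypothetical known true value
mean = sum(data) / len(data)

ss_mean = sum((x - mean) ** 2 for x in data)
ss_true = sum((x - true_value) ** 2 for x in data)
print(ss_mean, ss_true)  # ss_mean is never larger than ss_true
```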

Yet Another Error
For the basic example, the equation in simplest form is 32/8, which equals 4, not 2. I fixed the problem but am surprised no one else saw it — Preceding unsigned comment added by 166.205.68.20 (talk) 03:59, 25 December 2012 (UTC)

The root of 32/8, wow, I'm stupid. Ignore; am now fixing my mistake — Preceding unsigned comment added by 166.205.68.20 (talk) 04:01, 25 December 2012 (UTC)

For the basic example given in the first section, shouldn't the average of the values be divided by 7 instead of 8? I have seen many times that the number of samples is reduced by one to get the variance. Shouldn't this be the case?

WeirdnSmart0309 (talk) 00:40, 14 October 2011 (UTC)

Not an Error
When you have the entire sample population, you use 'n' as the denominator. 'n-1' is only used if your data represents a SAMPLING from the entire population. Dreslough (talk) 10:18, 21 December 2012 (UTC)

Numerical Example
The values given in the numerical example are not clearly from a normal distribution, so of what use is the standard deviation of the normal distribution? — Preceding unsigned comment added by 162.136.192.1 (talk) 20:31, 4 February 2013 (UTC)


 * The standard deviation is a measure of the total distance of the data points in sample space from their center value (mean). This applies to any kind of distribution of data, not just a normal distribution. (You'll notice the similarity between the standard deviation formula and the n-dimensional Pythagorean distance formula.) True, it may be confusing to show a normal (bell) curve, but this is probably the most common use of standard deviation. — Loadmaster (talk) 00:40, 23 February 2013 (UTC)

Mathematical identity proof
The variance article presents a proof of the standard deviation equivalent formula in terms of E(X). Here is another proof, given in terms of the sample points $$x_i$$.

For the variance (i.e., the square of the standard deviation), assuming a finite population with equal probabilities at all points, we have:
 * $$\sigma^2 = \frac{1}{N}\sum_{i=1}^N(x_i-\overline{x})^2$$

Expanding this, we get:
 * $$\sigma^2 = \frac{1}{N}\left[ (x_1-\overline{x})^2 + (x_2-\overline{x})^2 + \cdots + (x_N-\overline{x})^2 \right]$$

Simplifying, we get:
 * $$\begin{align}

\sigma^2 & = \frac{1}{N}\left[ (x_1^2 - 2x_1\overline{x} + \overline{x}^2) + (x_2^2 - 2x_2\overline{x} +\overline{x}^2) + \cdots + (x_N^2 - 2x_N\overline{x} + \overline{x}^2) \right] \\ & = \frac{1}{N}\left[ x_1^2 + x_2^2 + \cdots + x_N^2 \,-\, 2x_1\overline{x} - 2x_2\overline{x} - \cdots - 2x_N\overline{x} \,+\, \overline{x}^2 + \cdots + \overline{x}^2 \right] \\ & = \frac{1}{N}\left[ x_1^2 + x_2^2 + \cdots + x_N^2 \,-\, 2\overline{x}(x_1 + x_2 + \cdots + x_N) \,+\, N\overline{x}^2 \right] \\ & = \frac{1}{N}\left[ x_1^2 + x_2^2 + \cdots + x_N^2 \right] - 2\overline{x}\frac{1}{N}(x_1 + x_2 + \cdots + x_N) + \overline{x}^2 \\ & = \frac{1}{N}\left( \sum_{i=1}^N x_i^2 \right) - 2\overline{x} \left(\frac{1}{N}\sum_{i=1}^N x_i\right) + \overline{x}^2 \\ & = \frac{1}{N}\left( \sum_{i=1}^N x_i^2 \right) - 2\overline{x}^2 + \overline{x}^2 \\ & = \frac{1}{N}\left( \sum_{i=1}^N x_i^2 \right) - \overline{x}^2 \end{align} $$

Expanding the last term, we get:

$$ \sigma^2 = \frac{1}{N}\sum_{i=1}^N x_i^2 \,-\, \left(\frac{1}{N} \sum_{i=1}^{N} x_i\right)^2 $$

So then for the standard deviation, we get:

$$ \sigma = \sqrt{\frac{1}{N}\sum_{i=1}^N(x_i-\overline{x})^2} = \sqrt{\frac{1}{N} \left(\sum_{i=1}^N x_i^2\right) - \overline{x}^2} = \sqrt{\frac{1}{N} \sum_{i=1}^N x_i^2 - \left(\frac{1}{N} \sum_{i=1}^{N} x_i\right)^2}. $$

— Loadmaster (talk) 18:12, 22 February 2013 (UTC)


 * I definitely think this article needs a link to variance near the top. We don't need long proofs, but definitely the formula above or the variance one
 * $$\operatorname{Var}(X)= \operatorname{E}\left[(X-\operatorname{E}(X))^2\right] = \operatorname{E}\left[X^2\right] - 2\operatorname{E}[X]\operatorname{E}[X] + (\operatorname{E}[X])^2 = \operatorname{E}\left[X^2 \right] - (\operatorname{E}[X])^2$$
 * should be in it somewhere near the top too. Dmcq (talk) 09:02, 17 May 2013 (UTC)


 * There's already an early link to variance, in sentence 1 of paragraph 2 of the lede:


 * The standard deviation of a random variable, statistical population, data set, or probability distribution is the square root of its variance.


 * I think this is probably early enough. As for the formula, I see your point, but on the other hand what I like about the current set-up is that it gives the lay reader a nice simple example early on instead of chasing him away with formulas that may be more than what he can handle. I'll put in a verbal formula at the start of the basic example -- see what you think. Duoduoduo (talk) 14:38, 17 May 2013 (UTC)
 * I think I must have skipped a couple of pages when looking by mistake. Yes the example should have some light explanation of things so it isn't just calculation, I put in a small bit saying why the n-1 was being used. Dmcq (talk) 15:54, 17 May 2013 (UTC)

Example of two sample populations
The figure looks very strange.

First, is it supposed to be a histogram (the vertical axis suggests this) plotted in such a strange way? Then why not use bars, as usual? And there is no description of the binning (it seems to be uniform, 10 units wide).

Second, a sample of size 1000 is very unlikely to have such a large deviation of the mean (which looks like ~95 for the red plot) from the expectation value (claimed to be 100).

Mikhail Ryazanov (talk) 01:19, 23 April 2013 (UTC)


 * The red population in the figure cannot have a mean of 100 given its shape. The mean would be to the left of 100. Either that or the plotted values should be shifted to the right to match the stated mean.
 * May 9, 2013 — Preceding unsigned comment added by :129.162.1.37 (talk) 15:48, 9 May 2013 (UTC)


 * I agree that it appears to be a histogram plotted in a strange uninterpretable way. But I disagree that "the red population in the figure cannot have a mean of 100 given its shape". The shape is of the sample, not the population. But I think the figure is too confusing to be worth keeping. Anyone object to my removing it? Duoduoduo (talk) 16:33, 9 May 2013 (UTC)


 * I agree to remove it but I think it's worth having a practical example in addition to the theoretical Gaussian examples in this article. 195.238.25.37 (talk) 11:21, 27 May 2013 (UTC)

Header Graphic
Doesn't actually add up to 100%, .1% short on both sides. Also, most other graphics I've seen read .2/2.2/13.6/34 rather than the shown. 130.86.42.36 (talk) 01:38, 13 June 2013 (UTC)
 * It doesn't add up to 100% because only a finite interval is shown, and the distribution function has an infinite domain. The remaining 0.1% would be in the rest of the distribution and in roundoff errors.
 * I just verified that the numbers on the graph are correct. The graph goes out to 4 standard deviations. Taken out to 5 decimal places, the areas under each standard deviation segment out to 6 standard deviations are as follows:

 * 1σ: 34.13447%   2σ: 13.59051%   3σ: 2.14002%   4σ: 0.13182%   5σ: 0.00314%   6σ: 0.00003%
 * Therefore, the values shown in the graph (34.1, 13.6, 2.1, 0.1) are correct when rounded to 1 decimal place. ~Amatulić (talk) 03:25, 13 June 2013 (UTC)
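These band areas can be reproduced from the standard normal CDF via the error function; a quick check:

```python
import math

def phi(z):
    # standard normal CDF via the error function
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def band(k):
    # area between (k-1) and k standard deviations above the mean
    return phi(float(k)) - phi(float(k - 1))

for k in range(1, 7):
    print(k, round(100 * band(k), 5))  # 34.13447, 13.59051, 2.14002, 0.13182, 0.00314, 0.00003
```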

Naming convention for standard deviation of sample
The textbook that I use for an experimental techniques class (Squires: Practical Physics) uses the term 'standard deviation of the sample' for the 1/n version. From a physicist's point of view, this sort of makes sense because it is a property of the sample that was taken from the population. (Physicists almost never deal with the entire population.) The 1/(n-1) version is then our best estimate of the variance of the population from which the sample was drawn. Again, it is then somewhat reasonable to call the 1/(n-1) version the standard deviation of the population. (Squires calls it the best estimate of the standard deviation of the distribution instead.)

I am not saying that we need to change the article in any way unless this is a common problem. Has anyone here had a similar experience with the confusion between the 1/n and the 1/(n-1) versions? And is it enough to mention, or will that cause more problems than it is worth?

TStein (talk) 16:36, 29 August 2013 (UTC)


 * Sometimes "standard deviation of the sample" refers to one version and sometimes to the other version. But from what I see in this article more specific terminology is used to avoid confusion: there are sub-sections entitled Uncorrected sample standard deviation, Corrected sample standard deviation, and Unbiased sample standard deviation. Is there somewhere else in the article where confusion occurs? Duoduoduo (talk) 17:12, 29 August 2013 (UTC)


 * The main place that is ambiguous is the last paragraph of the lead. You are right in that the main article avoids the ambiguity for the most part. TStein (talk) 20:18, 29 August 2013 (UTC)


 * I've clarified it there -- thanks for pointing it out. Duoduoduo (talk) 22:54, 29 August 2013 (UTC)

5 sigma
I was surprised to find that there is no article on 5 sigma in view of its importance in, for instance, the standard of confidence required for confirmation of the Higgs boson. This article briefly mentions it as a 1 in 2 million chance that the result was a statistical aberration. Most sources give it as 1 in 3.5 million. Can anyone clarify this concept? Dudley Miles (talk) 15:12, 30 September 2013 (UTC)
 * While I am very much interested in the solution of this problem, I wonder if either answer has any meaning in the real world. Being so deep in the tails, I would imagine that the 'correct' result is extremely sensitive to even very tiny variations from the normal distribution. TStein (talk) 21:29, 30 September 2013 (UTC)
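The two figures are most likely one-sided versus two-sided tail probabilities, which can be checked directly from the standard normal CDF:

```python
import math

# P(Z > 5) for a standard normal, via the complementary error function
one_sided = 0.5 * math.erfc(5.0 / math.sqrt(2.0))
two_sided = 2.0 * one_sided          # P(|Z| > 5)

print(1.0 / one_sided)  # about 3.5 million: the "1 in 3.5 million" figure
print(1.0 / two_sided)  # about 1.7 million: roughly the "1 in 2 million" figure
```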

Is the formula for the standard deviation correct under the "Basic examples" ?
Shouldn't you divide by 7 instead of divide by 8 ?

The total count is 8, but the formula divides by (N - 1).

MS Excel divides by (N - 1).

Other websites divide by (N - 1) as well: http://www.ltcconline.net/greenl/courses/201/descstat/mean.htm — Preceding unsigned comment added by 203.8.7.161 (talk) 04:46, 17 October 2013 (UTC)
 * Read the paragraph just below the formula. The N is when getting the standard deviation of the whole population. Excel is using N-1 because it would normally be used for finding the standard deviation of a sample. Using N-1 in the version for the sample gives a good estimate of the population standard deviation. Dmcq (talk) 05:11, 17 October 2013 (UTC)
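The distinction is concrete with the article's own example data; Python's statistics module exposes both conventions:

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]      # the article's basic example
print(statistics.pstdev(data))       # 2.0: divides by N (whole population)
print(statistics.stdev(data))        # about 2.138: divides by N-1 (sample estimate)
print(math.sqrt(32 / 7))             # same as stdev, since the sum of squares is 32
```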

Standard deviation of the median?
I do not understand why it is said that "standard deviation" is just for the mean. The formula to compute it can be used for some other stochastic variables, and, in particular, the median (instead of the "absolute deviation"):

$$\sigma = \sqrt{\operatorname{E}\left[(X - \operatorname{median})^2\right]}$$

So, it seems to me that it should be said that the standard deviation is a dispersion measure of any stochastic variable that "summarises" some data. — Preceding unsigned comment added by 78.229.106.132 (talk) 10:00, 3 October 2013 (UTC)
 * The hint is in the word 'standard'. Dmcq (talk) 05:16, 17 October 2013 (UTC)

Comparison standard deviations.svg
The vertical dashed line at x=100 corresponds to the maximum values of the example curves, and not to their mean values. This defect is quite clear in the pink curve. dlambert@ets.org — Preceding unsigned comment added by 144.81.85.9 (talk) 18:43, 23 September 2013 (UTC)

Agree with the above — Preceding unsigned comment added by 202.136.240.130 (talk) 09:54, 27 November 2013 (UTC)

I independently noticed the same issue. I say go ahead and fix/delete it. Ajnosek (talk) 19:42, 6 December 2013 (UTC)

Rapid calculation methods
For the weighted calculation, the article says, "And the standard deviation equations remain unchanged." Doesn't that create confusion? For sigma it works, going from:


 * $$\sigma = \frac{\sqrt{Ns_2-s_1^2} }{N}$$

to:


 * $$\sigma = \frac{\sqrt{s_0 s_2-s_1^2} }{s_0}$$

But s goes from:


 * $$s = \sqrt{\frac{Ns_2-s_1^2}{N(N-1)}}.$$

to:


 * $$s = \sqrt{\frac{s_0s_2-s_1^2}{s_0^2} \cdot \frac{N}{N-1}} = \sigma \sqrt{\frac{N}{N-1}}.$$

A casual reader might not realize that from how the article reads now. WildGardener (talk) 23:14, 6 December 2013 (UTC)
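The σ form with weighted power sums can at least be spot-checked numerically (made-up data; here s0, s1, s2 are the weighted sums of x^0, x^1, x^2, as in the formulas above):

```python
import math

pairs = [(2.0, 1.0), (4.0, 3.0), (7.0, 2.0)]   # (value, weight), made up
s0 = sum(w for _, w in pairs)                  # weighted sum of x^0
s1 = sum(w * x for x, w in pairs)              # weighted sum of x^1
s2 = sum(w * x * x for x, w in pairs)          # weighted sum of x^2
sigma = math.sqrt(s0 * s2 - s1 ** 2) / s0

mean = s1 / s0
direct = math.sqrt(sum(w * (x - mean) ** 2 for x, w in pairs) / s0)
print(sigma, direct)  # agree; the N/(N-1) factor for s is the separate question raised above
```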

"Difference of data point from the mean"?
The article write :
 * First compute the difference of each data point from the mean (how is this named?), and square the result of each:

$$ \begin{array}{lll} (2-5)^2 = (-3)^2 = 9 &&  (5-5)^2 = 0^2 = 0 \\    (4-5)^2 = (-1)^2 = 1  &&  (5-5)^2 = 0^2 = 0 \\    (4-5)^2 = (-1)^2 = 1  &&  (7-5)^2 = 2^2 = 4 \\    (4-5)^2 = (-1)^2 = 1  &&  (9-5)^2 = 4^2 = 16. \\    \end{array} $$

What is the real, conventional term for this concept? Naming each step properly is important for understanding. I'm a beginner in statistics and unable to clarify this myself. May someone more knowledgeable dig a bit to clarify and link to the proper article. Yug (talk)  11:00, 18 November 2013 (UTC)
 * deviation from average/mean ?
 * deviation score/score deviation from average/mean ?
 * deviation error from average/mean ?
 * variance from average/mean ?
 * ... (something else)
 * When you go to the shops do you give a name to every corner never mind every step? Dmcq (talk) 13:50, 18 November 2013 (UTC)
 * Probably worth looking at Errors and residuals in statistics. I'm not sure that, in practice, the difference from the mean is often referred to as a residual. The term 'residual' is more often used when you're fitting a statistical model such as a regression model—but you can think of estimating the mean as a 'zeroth order' statistical model, as done in the example in that article. Qwfp (talk) 16:08, 18 November 2013 (UTC)
 * In Bevington's statistics text I remember reading that these are called dispersions from the mean.Dave mathews86 (talk) 07:22, 4 February 2014 (UTC)

Rapid calculation methods - R/d2
In industrial engineering and quality control we sometimes use R (range) / d2(n) to easily estimate an unbiased standard deviation from a sample. The theory says it is very precise for n<10. I'm not a statistician, so please correct me. -- 04:36, 11 February 2014 (UTC) — Preceding unsigned comment added by 187.78.178.203 (talk)

Bad graph
Obviously it should be $$\mu + \sigma, \mu+ 2 \sigma$$ instead of $$1\sigma, 2\sigma$$ etc. — Preceding unsigned comment added by Boriaj (talk • contribs) 14:39, 9 May 2014 (UTC)


 * Yes, that's true!... FilipeS (talk) 10:16, 19 July 2014 (UTC)

Unclear language in "Corrected sample standard deviation" section
The first sentence, for example, seems to be defining several things and it's unclear what is the subject for the verb. — Preceding unsigned comment added by Bkfunk (talk • contribs) 20:37, 31 July 2014 (UTC)

General comment this and other main stats pages.
I am re-teaching myself stats through reading many of these articles.

I have seen a lot of "improvements" to the stats pages on basics like standard deviation over the last few years. I have valued the excellent graphics that have been developed. I have also valued the sections with a rigorous mathematical equation based discussion of relevant issues.

In the process of these improvements though, I have felt that the wording of the introductory sections to many of these articles has become more complicated in the language trying to ensure exactness in initial concept, including boundaries, nuances and exceptions to the basic concept. All in the one paragraph.

If you already understand the topic, then this precision of definition all makes sense. However most people are seeking a beginning understanding, and in this regard the precision adds too many things to keep track of in getting the basic concept. Reading these intro sections, often I now find it difficult to get the basic "gist" of it. My response has been: nope, don't get it, too hard, too complicated. Just can not get my head around it.

I have not edited the intro sections myself as I can see a lot of thought and care has gone into them to ensure they are correct in the full detail of the concept being introduced. However I do suggest consideration be given to a much lighter, simpler layman's-style description of basic concepts that may not be that precise, but does convey the gist and feel of the basic concept.

I present the issue as I have experienced it, but I do not have the skills to write what is needed. I think such intro paragraphs are better written by those who have a conceptual understanding but not a detailed technical mathematical understanding of the topic.

So first a layman's very basic gist of the concept as an intro paragraph.

Then a more refined, technically correct refinement of the concept, followed by various diagrammatic and more focused expansions on aspects of the concept. Some basic examples perhaps in the most basic form.

Then for those who want to technically use and apply these concepts and do the statistics in practice, the more rigorous and mathematical discussions of the topic and subtopics. For those already familiar with the concept the rigorous mathematical explanation of various aspects is an excellent and important part of Wikipedia in my view.

With this page, I think the coverage of combining samples does belong on this page. The same topic may also belong on the other page as well. Certainly I looked for it here. CitizenofEarth001 (talk) 11:01, 16 August 2014 (UTC)


 * I agree with Citizen, not only for statistics-related articles but many others on topics involving complex mathematical operations, there is a real lack of simple, high-level conceptual description of the concept in terms comprehensible to readers not already well-versed in the topics. To use a personal example: I really like fractals, I like to look at them and appreciate their visual appeal and self-similarity properties, and while I understand and appreciate that they are produced by mathematical calculations involving imaginary and complex numbers (concepts I generally grasp), I'm not familiar with the majority of the technical terms used in some of the more detailed aspects. I recall one time I was reading a fractal-related article that mentioned Misiurewicz points. Unfamiliar with that term, I followed the link hoping to learn enough to provide a frame of reference for the context in the previous page, but that entire article was so completely filled with either formulae or sentences containing so many other unfamiliar terms that it was hopeless for me unless I wanted to spend hours diving through one article after another to become a fractals expert! (That article has since been improved somewhat; there is a semi-decent lay explanation near the bottom.)
 * So anyway, I came to this page not knowing what a standard deviation is except that it relates to a set of values in some way. Now I know that a smaller standard deviation means that the values in the set tend to be closer together than a larger standard deviation, but I still don't conceptually understand what "one standard deviation" really means - I mean, if you tell me x and y are one standard deviation apart, I could refer to this page, and work out some math, and figure out the quantitative difference between x and y, but I still don't get it. What is the significance of the fact that they are one standard deviation apart? How does it relate to whatever else was being talked about immediately beforehand? It's still just a vague, abstract "thingy" to me.
 * Sorry for going on so long about this, I feel like I've come across as dumb, which is partly due to the fact I'm typing this on my phone, which is slow, cumbersome, and prone to errors, thus preventing me from fully expressing all the thoughts I'd like to. But the key point I'm trying to make is: yes, please continue to develop broad conceptual explanations on this and other articles. Thanks.
 *  D a n si m a n  ( talk | Contribs ) 18:01, 30 January 2015 (UTC)


 * These are very helpful comments. I agree that the art of writing an encyclopedia article for Wikipedia is to keep the article approachable for a beginner while also making it scrupulously accurate. That's an art well worth practicing, and I hope to devote some time to improving this article precisely along those lines. -- WeijiBaikeBianji (talk, how I edit) 21:50, 30 January 2015 (UTC)

SD per AMA and others
A few weeks ago an anon was desperate to convince everyone that the "SD" abbreviation shouldn't be used for standard deviation. Anyone who ever used PubMed to an appreciable extent knows that that's a misapprehension (no matter how well-intentioned). I just wanted to explain here that AMA style is one of various styles that uses "SD". The anon's change was reverted (a good thing), but no one took the time to explain to them why (a regrettable thing). So I just remembered to explain here. Quercus solaris (talk) 22:43, 16 March 2015 (UTC)

Neither n nor n-1
What about dividing the sum of the squared differences by something that is neither n nor n-1, e.g. n-2? GeoffreyT2000 (talk) 04:36, 30 April 2015 (UTC)

n vs. n-1
Is there a difference between dividing by n and dividing by n-1? GeoffreyT2000 (talk) 04:40, 30 April 2015 (UTC)

I have clarified this in the main article. It is to do with estimating the sd of a large population by calculating the sd of a smaller, randomly chosen subset of the full population. Division by n-1 gives a better estimate of the sd of the full population when the sample is small. The original text did not make sense. g4oep — Preceding unsigned comment added by 77.96.58.212 (talk) 16:12, 3 June 2015 (UTC)

Confidence interval of a sampled standard deviation
This section could be more useful, it seems. Shouldn't it conclude with a formula like $$\frac{s}{\sqrt{2N}}$$? That's an estimate I've seen elsewhere for the variance in the standard deviation. I was hoping to confirm the formula and see a better explanation on Wikipedia.

The section actually lists confidence intervals in paragraph form. No formula. No graph. Not even a table. This is a weird section. Spiel496 (talk) 21:51, 20 November 2015 (UTC)
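Spiel496's s/√(2N) rule of thumb can be sanity-checked by simulation (a sketch in Python, assuming a normal population; names are illustrative). The empirical spread of the sample standard deviation across many samples comes out close to σ/√(2N):

```python
import random
import statistics

random.seed(0)
sigma, N, trials = 1.0, 50, 20000
sample_sds = []
for _ in range(trials):
    sample = [random.gauss(0.0, sigma) for _ in range(N)]
    sample_sds.append(statistics.stdev(sample))  # n-1 denominator

se_empirical = statistics.stdev(sample_sds)  # observed spread of s across samples
se_rule_of_thumb = sigma / (2 * N) ** 0.5    # sigma / sqrt(2N) = 0.1
```

For N = 50 the rule of thumb gives 0.1, and the simulated spread agrees to within about a percent, supporting the approximation mentioned above (for normal data).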

Absolutely obtuse to a lay person
We have not done very well with our stated goal of making our articles accessible and understandable to the lay public with this article. After reading for around half an hour, following interminable links to jargon, and still being quite unclear as to exactly what Standard Deviation expresses ... I went to the "Math is Fun" site and within 5 minutes had a very clear understanding, including: what Standard Deviation means, how it is used, exactly how to calculate it, and even why squaring is involved in variance of which standard deviation is simply the square root. I often do minor edits to help Wikipedia and I donate money also, and I state this here only to show my strong belief in, and commitment to Wikipedia. So it is very disappointing to be forced off site to get simple clear answers. Thanks — Preceding unsigned comment added by 97.125.83.84 (talk) 18:35, 13 July 2016 (UTC)

Agreed. Pgpotvin (talk) 18:41, 21 July 2016 (UTC)

Weighted Standard Deviation
(1) There is mention of n′ at the end and n′ appears nowhere else in the section.

T3l (talk) 03:17, 15 November 2016 (UTC) The n' is used to describe sample variance in terms of population variance. I just fixed this by adding the missing equation. Please check.

(2) The presentation of the same concept is much clearer in the article Mean square weighted deviation, which writes:

$$s^2 = \frac{\sum_{i=1}^N w_i}{\left(\sum_{i=1}^N w_i\right)^2 - \sum_{i=1}^N w_i^2} \cdot \sum_{i=1}^N w_i (x_i - \overline{x}^{\,*})^2$$

where $$\overline{x}^{\,*}$$ is the weighted mean (see that article for details). Again the corresponding standard deviation is the square root of the variance. This is much simpler to grasp and compute than the s, A, W or Q described in this section.
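The formula quoted from Mean square weighted deviation translates into a few lines (a Python sketch; the function name is illustrative). A quick consistency check: with all weights equal it reduces to the ordinary n−1 sample variance.

```python
def weighted_sample_variance(xs, ws):
    """Unbiased weighted sample variance:
    s^2 = V1 / (V1^2 - V2) * sum(w_i * (x_i - xbar)^2),
    where V1 = sum(w_i), V2 = sum(w_i^2) and xbar is the weighted mean."""
    v1 = sum(ws)
    v2 = sum(w * w for w in ws)
    xbar = sum(w * x for w, x in zip(ws, xs)) / v1
    return v1 / (v1 * v1 - v2) * sum(w * (x - xbar) ** 2 for w, x in zip(ws, xs))

# With equal weights, V1/(V1^2 - V2) = n/(n^2 - n) = 1/(n - 1),
# so this reduces to the usual n-1 sample variance:
xs = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
equal = weighted_sample_variance(xs, [1.0] * len(xs))  # = 32/7
```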

Incorrect diagram under section "Interpretation and application" ?
The red and blue populations are supposed to have mean = 100 but from inspection the red one clearly appears to have a lower mean.
 * Those are samples from the populations, not the populations themselves. Dghu123 (talk) 16:26, 31 December 2016 (UTC)

Contradiction between Wikipedia pages?
There seems to be a direct contradiction between a statement in this article (in section "Basic Examples"):


 * Dividing by n − 1 rather than by n gives an unbiased estimate of the standard deviation of the larger parent population. This is known as Bessel's correction.[5]

And the article referenced on Bessel's correction (in section "Caveats").

'''There are three caveats to consider regarding Bessel's correction:


 * It does not yield an unbiased estimator of standard deviation.'''

Vector Shift (talk) 03:27, 14 March 2017 (UTC)

Also could someone explain the blue background areas and why the boldface I tried to use in the second quote works differently in the blue background area and the other part of the quote that for some reason, isn't in the blue background area?

Vector Shift (talk) 03:27, 14 March 2017 (UTC)


 * Bessel's correction gives an unbiased estimator of the variance, but not of the square root of the variance, i.e. the standard deviation.
 * The formatting didn't work the way you wanted as Mediawiki interprets a space at the beginning of a line of wikisource as requesting monospace font and a grey background. This is designed for quoting code - personally I think it's a misfeature. I've taken the start-of-line spaces out of your input above and replaced by colons, which is wiki syntax for indenting. Qwfp (talk) 06:48, 14 March 2017 (UTC)
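Qwfp's first point (unbiased for the variance, but not for its square root) is easy to see numerically. A simulation sketch, assuming a normal population with σ = 10 and small samples of n = 4 (names illustrative):

```python
import random
import statistics

random.seed(1)
sigma, n, trials = 10.0, 4, 20000
vars_, sds = [], []
for _ in range(trials):
    sample = [random.gauss(0.0, sigma) for _ in range(n)]
    vars_.append(statistics.variance(sample))  # n-1 denominator (Bessel)
    sds.append(statistics.stdev(sample))       # its square root

mean_var = statistics.fmean(vars_)  # averages near sigma^2 = 100 (unbiased)
mean_sd = statistics.fmean(sds)     # averages clearly below sigma = 10 (biased low)
```

The average of s² lands near 100, but the average of s comes out around 9.2, visibly below σ = 10: taking the square root reintroduces bias even though s² itself is unbiased.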

geometric visualization
I don't get the "geometric visualization" image. I mean, I understand how it works and see that it is correct, but I don't understand how it is supposed to help anyone understand what variance is. The last step is just a sort of algebraic trick, and I find it mathematically uncomfortable because it converts the quadratic-type object variance into the linear-type object distance. Is it just me? McKay (talk) 05:07, 17 March 2017 (UTC)
 * I agree (i.e it's not just you). Qwfp (talk) 07:31, 17 March 2017 (UTC)
 * I've now removed the image. Qwfp (talk) 19:30, 17 March 2017 (UTC)

metabolic rate requires harmonic mean, not arithmetic mean
Basal metabolic rate (BMR) "is the minimal rate of energy expenditure per unit time."

The Harmonic mean "is appropriate for situations when the average of rates is desired." See also the section Harmonic_mean

The example for standard deviation in this article should reference data which is counts rather than rates. For example, "number of popsicles consumed by various children" is not a rate, the distribution doesn't have significant outliers, and is therefore applicable for this version of standard deviation.

When the harmonic mean is used, the formula for standard deviation is more complicated; see this post. — Preceding unsigned comment added by 172.58.41.58 (talk) 00:46, 21 August 2017 (UTC)

Need to start with the basics
Odds are the person looking up 'standard deviation' on wikipedia isn't a statistician, pollster, mathematician, or logician. Odds are (s)he is not even a researcher or sociologist. (S)he is 'John Q. Public' who is reading an article in the paper, reading a report online, or watching a newscaster on TV when the results of a poll or study are included. John Q sees or hears reference to the margin of error or standard deviation and wants to find out what ME and SD mean. This article, therefore, needs to start off with a more basic definition of 'standard deviation' including when/why it is used.

Similarly, you provide 3 'basic examples' of SD. The first one contains the relatively esoteric mathematical symbols x with the subscript i, x with the subscript N, x with a line over it (I don't know how to type these in wiki) ...even the Σ (summation notation) may well be Greek to many (pun intended). The second example states "For a finite set of numbers, the population standard deviation is found by taking the square root of the average of the squared deviations of the values from their average value." -I have an advanced degree and I had to read through that 3 times to grasp what it was saying. I think the 3rd example is actually a good basic example that can illustrate SD well to all the John Q. Public's.

So, I recommend starting with a definition and example for the 'layman' and then getting more advanced. Niccast (talk) 03:42, 2 April 2018 (UTC)


 * Came here to say the same thing. I know some math and some stats, but not enough to use this article. How about if the first example used data everyone encounters every day instead of "the resting metabolic rate of the northern fulmar". The latter now introduces two unrelated concepts which might also be unfamiliar to readers. The summary from the article on Expected value currently says, "the expected value in rolling a six-sided dice is 3.5, because the average of all the numbers that come up in an extremely large number of rolls is close to 3.5." This is a good example. What is the standard deviation of a fair die roll? I'm going to try to figure out how to calculate it, and if I have time I will propose an edit here (because I'm not confident in my understanding of the material)...
 * Crag (talk) 18:13, 20 April 2018 (UTC)


 * For large sample sizes the sample standard deviation of rolls of a six-sided die will approach the population standard deviation, which is √(17.5÷6) = √(35/12) ≈ 1.7078.
 * —DIV (120.18.155.144 (talk) 12:26, 14 August 2018 (UTC))
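Crag's die-roll example works out cleanly; here is the calculation spelled out (a Python sketch of the population standard deviation of a fair six-sided die):

```python
faces = [1, 2, 3, 4, 5, 6]
mean = sum(faces) / len(faces)  # 3.5

# Population variance: mean of squared deviations from the mean.
# Squared deviations: 2.5^2, 1.5^2, 0.5^2, 0.5^2, 1.5^2, 2.5^2, summing to 17.5.
variance = sum((f - mean) ** 2 for f in faces) / len(faces)  # 17.5/6 = 35/12
sd = variance ** 0.5  # ≈ 1.7078
```

This confirms DIV's figure of √(35/12) ≈ 1.7078 for the population standard deviation of a fair die.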

Unbiased estimates
Given $$\hat\sigma \approx \sqrt{ \frac{1}{N - 1.5} \sum_{i=1}^N (x_i - \bar{x})^2 }$$ and $$s^2 = \frac{1}{N-1} \sum_{i=1}^N (x_i - \overline{x})^2$$, suppose that $$\sum_{i=1}^N (x_i - \overline{x})^2 = 10$$ and $$N=3$$. Would it therefore make sense to report in tandem the two unbiased estimates $$\hat\sigma \approx \sqrt{ \frac{10}{3 - 1.5} } = \sqrt{\frac{20}{3}} \approx 2.58198889747$$ and $$s^2 = \frac{10}{3 - 1} = 5$$ ? —DIV (120.18.155.144 (talk) 12:43, 14 August 2018 (UTC))
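DIV's numbers check out; spelled out as a sketch (variable names illustrative), the two estimators are computed separately from the same sum of squares, and notably the approximately-unbiased sd estimate is not the square root of the unbiased variance estimate:

```python
sum_sq = 10.0  # sum of squared deviations from the sample mean
N = 3

s_squared = sum_sq / (N - 1)             # unbiased variance estimate: 10/2 = 5.0
sigma_hat = (sum_sq / (N - 1.5)) ** 0.5  # approx. unbiased sd estimate: sqrt(20/3)
# sigma_hat ≈ 2.582, while sqrt(s_squared) ≈ 2.236 -- the two are reported
# separately precisely because sqrt() of an unbiased variance is biased low.
```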

Reference 14
The link for reference 14 no longer works. The following link contains the same article and is currently functional. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.302.7503&rep=rep1&type=pdf — Preceding unsigned comment added by 86.174.25.53 (talk) 18:24, 3 January 2019 (UTC)
 * That's blatantly violating JSTOR's terms of use given on its first page so can't be added to the article. Qwfp (talk) 19:29, 3 January 2019 (UTC)

Contradiction in article regarding 5 sigma
The section Standard deviation has this sentence: "A five-sigma level translates to one chance in 3.5 million that a random fluctuation would yield the result."

The cited source also says this.

But the table later in the article (and any mathematical calculation) shows that the part of the distribution outside of 5σ is 1 in 1.74 million, which is double the probability of 1 in 3.5 million.

Mathematically, the cited source is incorrect... but I find many other sources that say the same thing in the context of particle physics. It's as if particle physics is considering only one of the tails outside 5σ compared to the whole rest of the distribution including the other tail.

Someone shed some light on this. ~Anachronist (talk) 21:41, 14 February 2019 (UTC)
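Anachronist's reading (one-tailed versus two-tailed) can be verified directly from the standard normal tail, using `math.erfc` (a Python sketch):

```python
import math

def normal_tail(z):
    """One-sided upper-tail probability P(Z > z) for a standard normal,
    via P(Z > z) = erfc(z / sqrt(2)) / 2."""
    return 0.5 * math.erfc(z / math.sqrt(2))

one_tailed = normal_tail(5)      # ≈ 2.87e-7, i.e. about 1 in 3.5 million
two_tailed = 2 * normal_tail(5)  # ≈ 5.73e-7, i.e. about 1 in 1.74 million
```

So "1 in 3.5 million" is the one-tailed figure (the particle-physics convention of counting only fluctuations in one direction), while the table's "1 in 1.74 million" counts both tails; both are correct for their respective conventions.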

Verbal vs math notation
"STANDARD DEVIATION A measure of dispersion of a frequency distribution equal to the square root of the mean of the squares of the deviations from the arithmetic mean of the distribution." (Random House Dictionary of the English Language. 2d edition. 1966. Words communicate to many people better than math notation.) Patshannon+ (talk) 02:28, 20 May 2019 (UTC)PatrickDShannon@gmail.com
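The dictionary's verbal definition maps directly onto a few lines of code, which may be the clearest middle ground between words and notation (a Python sketch; the function name is illustrative):

```python
def standard_deviation(values):
    """Square root of the mean of the squared deviations
    from the arithmetic mean -- the Random House definition, verbatim."""
    mean = sum(values) / len(values)
    return (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5

# Example: mean is 5, squared deviations sum to 32, 32/8 = 4, sqrt(4) = 2.
result = standard_deviation([2, 4, 4, 4, 5, 5, 7, 9])
```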

Confusing statement
In the fourth paragraph there is a confusing statement "...is computed from the standard error of the mean (or alternatively from the product of the standard deviation of the population and the inverse of the square root of the sample size, which is the same thing) and is typically about twice the standard deviation..." This seems like an error - if SEM is calculated "...product of the standard deviation of the population and the inverse of the square root of the sample size..." how can the result be..."about twice the standard deviation..." --IcyEd (talk) 11:21, 4 September 2019 (UTC)

Please use proper colors in the drawings
The chosen colors are like a color vision test. Please use distinct and bright colors, especially because of green, yellow and red being too close. It's more important to be accessible than to pick colors by other factors. Bright yellow, bright green and red would make it more accessible. Thanks. I am referring to this picture in particular: Variance_visualisation.svg — Preceding unsigned comment added by 88.219.179.71 (talk) 11:35, 24 June 2015 (UTC)

File:Comparison_standard_deviations.svg is incorrect
The File:Comparison_standard_deviations.svg, in the Interpretation and application section, is incorrect per the talk page of the file on Wikimedia Commons: https://commons.wikimedia.org/wiki/File_talk:Comparison_standard_deviations.svg. Even someone with no background in math can see that the mean line, labeled "Average = 100," is obviously not the mean average of the red data. --Jack Autosafe (talk) 21:02, 20 November 2019 (UTC)

Sentence says the opposite of what it means
In the introduction, the second sentence of the second paragraph reads:

It is algebraically simpler, though in practice less robust, than the average absolute deviation.

I'm positive that this isn't correct, since the standard deviation is more complicated and more robust than the AAD. I think that whoever wrote this simply made a mistake, swapping the subject and object of the sentence. It should read:

It is algebraically more complex, though in practice more robust, than the average absolute deviation.

or

The average absolute deviation is algebraically simpler, though in practice less robust, than the standard deviation.

Thoughts? -Somebody without an account :) — Preceding unsigned comment added by 132.241.174.230 (talk) 22:02, 30 April 2019 (UTC)

Somebody with an account:

I'm positive that this isn't correct, since the standard deviation is more complicated...

Is it, though? I think most people would find squaring/square roots more complicated than subtracting two numbers and throwing away the negative sign. As to the other points, maybe they were referring to the effects of squaring with respect to outliers and leverage.

EntangledLoops (talk) 18:17, 5 June 2020 (UTC)
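EntangledLoops's point about squaring and outliers can be made concrete. A sketch comparing the population standard deviation with the average absolute deviation on data with and without a single outlier (function names are illustrative):

```python
def average_absolute_deviation(values):
    """Mean of absolute deviations from the arithmetic mean."""
    mean = sum(values) / len(values)
    return sum(abs(v - mean) for v in values) / len(values)

def population_sd(values):
    """Square root of the mean of squared deviations from the mean."""
    mean = sum(values) / len(values)
    return (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5

clean = [1.0, 2.0, 3.0, 4.0, 5.0]
with_outlier = [1.0, 2.0, 3.0, 4.0, 50.0]

# Squaring gives the outlier extra leverage: the SD inflates by a larger
# factor than the AAD does when the outlier is introduced.
ratio_sd = population_sd(with_outlier) / population_sd(clean)
ratio_aad = average_absolute_deviation(with_outlier) / average_absolute_deviation(clean)
```

On this data the SD grows by a somewhat larger factor than the AAD, illustrating the sense in which the AAD is the more outlier-resistant of the two, whichever one considers algebraically "simpler".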

Thousands separator?
What is going on with the formatting of the "Squared difference from mean" column of the "Sum of squares calculation for female fulmars" table under "Basic examples" / "Sample standard deviation of metabolic rate of northern fulmars"?

The formatting is rendering the numbers as if it was using a thousands separator, but the character used was a space instead of a comma.

Ah, I just looked it up. At https://docs.microsoft.com/en-us/globalization/locale/number-formatting#:~:text=The%20character%20used%20as%20the,thousands%20separator%20is%20a%20space. they say that the thousands separator is a space in Sweden. Does wikipedia's infrastructure allow region-specific rendering of the data, such that a US reader would see a comma as a thousands separator, a Swedish reader would see a space, and a German reader would see a period?

Jlkeegan (talk) 20:16, 24 August 2020 (UTC)


 * See MOS:DIGITS for the appropriate guidelines. The short version is that either commas or gaps may be used (gaps are not unique to Sweden; they're common in a lot of scientific contexts regardless of language).  And to answer the other part, no, Wikipedia has no way to customize number formats.  –Deacon Vorbis (carbon • videos) 20:42, 24 August 2020 (UTC)

missing minus signs
What is going on the all the missing minus signs (all over WikiP?) — Preceding unsigned comment added by 184.157.243.232 (talk) 13:31, 20 December 2020 (UTC)

Deleting the Northern Fulmars example that started the article after the lead section
I deleted the long Northern Fulmars example of working out the calculations for computing a sample standard deviation. While I appreciate that a lot of work went into writing and formatting it, so I feel bad for the editor who inserted it (unless it's the study's author!), the example is redundant and takes up a lot of space. It introduces a complicated formula before that formula is defined, and the sample sd before the population sd. It has two variables (male and female) instead of one. The numbers are large and long, so the arithmetic is hard.

The "grades of eight students" example that followed it does what the northern fulmars example does, and better, because it is easier to understand, shorter, and starts with the population standard deviation instead of the sample one. editeur24 (talk) 16:56, 22 December 2020 (UTC)

distribution
The normal distribution is also known as the bell curve. It is symmetrical, and its mean equals the median and mode of the data; about 95% of values lie within 2 standard deviations of the mean. — Preceding unsigned comment added by 41.188.194.225 (talk) 12:06, 10 March 2021 (UTC)

a nonlinear function, which does not commute with the expectation
I am worried that most readers do not understand this phrase and giving a link to the commutativity article does not help.

I was thinking about changing

"Taking square roots reintroduces bias (because the square root is a nonlinear function, which does not commute with the expectation), yielding the corrected sample standard deviation, denoted by s: "

to

"Taking square roots reintroduces bias (because the square root is a nonlinear function, which does not commute with the expectation i.e. often $$E[\sqrt{X}]\neq \sqrt{E[X]}$$), yielding the corrected sample standard deviation, denoted by s: "

I am not sure that my suggested change above helps either.

- irchans 14 Dec 2021  22:15 GMT  — Preceding unsigned comment added by Irchans (talk • contribs) 22:16, 14 December 2021 (UTC)
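Irchans's proposed inequality E[√X] ≠ √E[X] can be demonstrated with the smallest possible example, a two-point distribution (a sketch; the numbers are chosen purely for illustration):

```python
# X takes the values 0 and 4, each with probability 1/2.
outcomes = [0.0, 4.0]

e_x = sum(outcomes) / len(outcomes)                         # E[X] = 2
e_sqrt_x = sum(x ** 0.5 for x in outcomes) / len(outcomes)  # E[sqrt(X)] = (0+2)/2 = 1

# sqrt(E[X]) = sqrt(2) ≈ 1.414, but E[sqrt(X)] = 1: the square root does not
# commute with expectation. By Jensen's inequality (sqrt is concave),
# E[sqrt(X)] <= sqrt(E[X]) in general, which is why s underestimates sigma.
```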