Talk:Autocorrelation/Archive 1

Old stuff not previously in a section
There needs to be a layman's definition of this, with a real-world, applicable example.

I have been a little hesitant to edit these articles, partly because there are so many different notations and I don't know which to use.

The autocorrelation function can be written as:

$$R(\tau), \quad R_f(t), \quad \rho_f(t), \quad R_{ff}(\tau)$$

$$\hat r_x(l), \quad R_{xx}(k), \quad \rho_k$$

and time series can be written:

$$x_n, \quad x[n], \quad x(n)$$

Does Wikipedia have any standards for this?

I think in general, using more specific, accurate notation tends to muddle the understanding for someone first being introduced to a subject, which is also the type of person probably reading Wikipedia.

In other words,

$$R(\tau) = \int f(t)f(t+\tau) \, dt$$

is preferable to

$$R_f(\tau) = \lim_{T \to \infty} {1 \over 2T} \int_{-T}^T f^* (t)f(t+\tau) \, dt$$

How much detail should we use? Should we use the simplified version to introduce the topic, and then show the more detailed versions?

Also, this article started out in the world of probability, and I have converted it to signal processing. The two should both be included in the same article. - Omegatron 16:56, May 25, 2004 (UTC)

From the article:


 * The autocorrelation definition then becomes


 * $$R(j) = \sum_n x_n x_{n-j}$$


 * which is the definition of autocovariance.

I have some doubts about this claim. The autocorrelation is a function (of j) while the autocovariance is a number. I think what is true is that in this case the autocorrelation at j = 0 is the autocovariance.

Secondly, I have not seen definitions of the autocorrelation where the mean is subtracted. Can someone confirm whether this is a common practice? -- Pgabolde 16:29, 17 Dec 2004 (UTC)


 * The web and the autocovariance article both seem to think that autocovariance is a function.


 * I believe there are many variations on the autocorrelation formula, with different weightings, normalizations, etc. and they are all still called autocorrelation. This (kind of a "vertical offset"?) seems to be just another variation. - Omegatron 18:54, Dec 17, 2004 (UTC)

I can confirm that this is common practice in a wide variety of fields. To a mathematician the autocorrelation with the mean subtracted and divided by the variance is the standard definition (it has the useful property of being in the range [-1,1]). See, for example, Priestley's classic "Spectral Analysis and Time Series" (1982) London New York Academic Press.

Autocovariance is a function, not a number -- I do not know where the idea that it is a single number has come from. I can think of no field where that would be the case.

--Richard Clegg 14:46, 6 Feb 2005 (UTC)

This is rather tricky --- to me, what you have defined is autocovariance, not autocorrelation, although I'm aware these are sometimes used interchangeably and many people use the formula given on this page. Also, should we not formally define the autocorrelation in terms of expectation? For a process $$X_t$$ (either discrete or continuous),


 * $$ R(k) = \frac{E [(X_t - \mu)(X_{t+k} - \mu)]}{\sigma^2} $$

where μ is the mean E[X] and σ² is the variance. This is nicer mathematically since it is normalised to the range [-1,1]. It should also be noted that the autocovariance (indeed the mean and variance) are not necessarily defined unless the process is weakly stationary. Another reason for such an edit would be to bring this page into line with the definition for correlation.

Richard Clegg
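As a rough illustration of the normalised definition above, here is a minimal Python sketch of a sample estimate of R(k). The function name `autocorr_coeff` and the test data are made up for this example, and the choice of the biased estimator (dividing the lagged sum by n rather than n - k) is an assumption, not something fixed by the discussion.

```python
def autocorr_coeff(x, k):
    """Sample estimate of R(k) = E[(X_t - mu)(X_{t+k} - mu)] / sigma^2.

    Hypothetical helper: assumes 0 <= k < len(x) and a non-constant series.
    """
    n = len(x)
    mu = sum(x) / n                               # sample mean
    var = sum((v - mu) ** 2 for v in x) / n       # sample variance (divide by n)
    # Biased estimator: divide the lagged sum by n, not n - k.
    # By the Cauchy-Schwarz inequality this keeps |R(k)| <= 1.
    cov = sum((x[t] - mu) * (x[t + k] - mu) for t in range(n - k)) / n
    return cov / var


x = [1.0, 2.0, 3.0, 4.0, 5.0, 4.0, 3.0, 2.0]      # arbitrary illustrative data
values = [autocorr_coeff(x, k) for k in range(len(x))]
```

The lag-0 value is 1 by construction, and every other value stays within [-1,1], which is the property mentioned above.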

I have tried to reconcile the differing definitions of autocorrelation in different disciplines. I hope nobody thinks it is arrogant of me to put the mathematical definition first.

Richard Clegg


 * Is o a common symbol for convolution? I've never seen it before... - Omegatron 14:45, Feb 6, 2005 (UTC)
 * Maybe you meant a circle?


 * $$R_f(\tau) = f^*(-\tau) \circ f(\tau) = \int_{-\infty}^{\infty} f(t+\tau)f^*(t)\, dt$$


 * Not that I have seen that before, either... - Omegatron 14:51, Feb 6, 2005 (UTC)

I did mean the circle but wasn't sure how to get that in the non math environment -- would be grateful if you could change it (thanks). It is relatively commonly used -- although not as commonly as the * but the * was already used in the equation to designate complex conjugate hence, I hoped to avoid confusion.

--Richard Clegg 14:57, 6 Feb 2005 (UTC)


 * yeah, i figured. added. -Omegatron 15:43, Feb 6, 2005 (UTC)

Which autocorrelation are we talking about?
The article has two sections that talk about "autocorrelation" without specifying whether they refer to the statistics definition or the signal processing definition. --Smack (talk) 22:35, 27 May 2005 (UTC)


 * Are they not the same thing? - Omegatron 23:29, May 27, 2005 (UTC)


 * What bothers me is the use of the two different signal processing definitions without specifying which is which. Several of the "properties", for example the autocorrelation of white noise and the Wiener-Khinchin theorem, hold only for the second definition (the limit as T tends to infinity of 1 over T of the integral).  I find this confusing. --Assemblany 07:17, 28 March 2007 (UTC)
 * Can you be more specific about what problems you see, in what sections? Smack's remarks from 2 years ago don't have much relation to the current article. Dicklyon 16:28, 28 March 2007 (UTC)

Reducing the complexity using the DFT
It is possible to reduce the complexity of the autocorrelation from $$O(n^2)$$ to $$O(n \log n)$$ using the Wiener-Khinchin theorem. This theorem is mentioned in the article but this interesting property is not.

I was thinking about adding this sentence right after the description of the Wiener-Khinchin theorem:


 * which allows the discrete autocorrelation of zero-centered signals to be computed using a discrete Fourier transform, reducing the complexity from $$O(n^2)$$ to $$O(n \log n)$$:


 * $$R = \mathcal{F}^{-1}(|\mathcal{F}(x)|^2).\,$$
 * where $$\mathcal{F}^{-1}$$ is the inverse discrete Fourier transform, $$\mathcal{F}$$ is the discrete Fourier transform and $$|\cdot|$$ is the complex modulus.

But I don't much like the style. Does anyone have a suggestion? --Nova77 01:49, 17 December 2005 (UTC)

NOTE: The autocorrelation is normalized by the standard deviation while the FFT method is not. The normalization factor should be included, otherwise it's not the same computation. — Preceding unsigned comment added by 24.91.185.127 (talk) 17:20, 2 February 2013 (UTC)
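For anyone wanting to check the complexity claim in this thread, here is a minimal sketch comparing the direct sum with a Fourier-based computation. It uses only the Python standard library. The names `fft`, `autocorr_direct` and `autocorr_fft` are hypothetical helpers written for this illustration, not functions from any library. Two assumptions are worth stating: the Wiener-Khinchin route needs the squared modulus of the transform, and the signal must be zero-padded to at least twice its length so that the circular wrap-around of the DFT does not contaminate the result.

```python
import cmath


def fft(a, inverse=False):
    """Recursive radix-2 FFT; len(a) must be a power of two. The inverse is unscaled."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], inverse)
    odd = fft(a[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out


def autocorr_direct(x):
    """R(j) = sum_n x[n] * x[n - j] for j = 0..N-1, straight from the definition: O(N^2)."""
    n = len(x)
    return [sum(x[i] * x[i - j] for i in range(j, n)) for j in range(n)]


def autocorr_fft(x):
    """Same R(j) via the inverse FFT of the squared modulus of the FFT: O(N log N)."""
    n = len(x)
    size = 1
    while size < 2 * n:          # zero-pad to a power of two that is at least 2N
        size *= 2
    spectrum = fft([complex(v) for v in x] + [0j] * (size - n))
    power = [abs(c) ** 2 for c in spectrum]      # squared modulus, not the modulus
    r = fft(power, inverse=True)
    return [r[j].real / size for j in range(n)]  # apply the 1/size inverse scaling


x = [2.0, 3.0, 1.0]
direct = autocorr_direct(x)
fast = autocorr_fft(x)
```

Both routes give the raw, unnormalised sums. To obtain the normalised form raised in the note above, one would subtract the mean from `x` first and divide every value by the lag-0 value.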

Error?
Shouldn't the text below the first image of "The Blue Danube" say Fourier transform or something similar? It is certainly not the "original signal".


 * What makes you certain it's not the original signal? I'm not an audio engineer but my first guess would be that it is the original signal -- it's the right length.  I just ask because I have no particular reason to believe it's not the original signal.  It's certainly not the Fourier transform -- it's in the time domain, not the frequency domain, for a start.  --Richard Clegg 10:27, 31 May 2006 (UTC)


 * That autocorrelation function also makes no sense, which is why I removed the whole thing. If there's a relationship between those two figures, or between either figure and the Blue Danube music, it's certainly not apparent.  Neither looks like what it says it is. Dicklyon 16:11, 1 June 2006 (UTC)

Re-editing this article to make it more coherent
I think this article has got a bit "wooly" due to it being edited by people from different backgrounds. I know that a lot of different definitions of autocorrelation are used and different fields have different ideas about it but I think we're missing the commonality here and hence the article is very confusing. Does anyone have any suggestions as to how to make this more coherent? --Richard Clegg 09:48, 1 June 2006 (UTC)


 * I agree. I worked around to it when trying to make LTI system theory correct; lots of things got touched.  I'm still trying to sort out what the various definitions of autocorrelation and autocovariance are in different fields.  The article was originally written with the expectation based approach, which seems to be most common, but the formulas for computing an ACF from a given signal, or estimating the ACF of a process from a sample, are more of the integral or sum form; the statistician would say those are estimators, not definitions.  Sorry for the wool.  I hope someone can help with a rewrite that takes correctness into account better than the old one did. Dicklyon 16:07, 1 June 2006 (UTC)


 * Thanks. I think you have helped the article.  Perhaps I should try to write something which goes between the expectation based formula and the commonly used estimators.  The problem is I would do that from a stats perspective which is not necessarily helpful.  I imagine a lot of people who want to use the ACF are in engineering or science.  --Richard Clegg 09:58, 2 June 2006 (UTC)

Proposed conventions
I propose the following conventions to help distinguish the various definitions. These conventions are a summary of what I've seen in the engineering (optical) and signal processing fields, but I think they would also be good for the "mathematician's world". I'm posting here to stimulate constructive discussion, so that we can finally reach a common agreement.

For two stochastic variables X and Y
Correlation:
 * $$R(X,Y) = E (X Y)\,$$

Covariance:
 * $$\mathrm{cov}(X,Y) = E((X-\mu_{X})(Y-\mu_{Y})) = E(X Y) - \mu_{X} \mu_{Y}\,$$

Correlation coefficient
 * $$\rho_{X,Y} = \frac{\mathrm{cov}(X,Y)}{\sigma_{X} \sigma_{Y}} = \frac{E(XY)-E(X)E(Y)}{\sqrt{E(X^2)-E^2(X)}~\sqrt{E(Y^2)-E^2(Y)}}$$
 * See Correlation.

For a stochastic continuous process X(t)
Auto-correlation:


 * $$R_X(t_1,t_2) = R(X(t_1), X(t_2)) = E( X(t_1) X(t_2) )\,$$

Auto-covariance:



 * $$C_X(t_1,t_2) = \mathrm{cov}(X(t_1), X(t_2)) = E( (X(t_1)-\mu_{X}(t_1)) \cdot (X(t_2)-\mu_{X}(t_2)) ) \,$$

Degree of correlation (also known in optics as "degree of coherence"):



 * $$\rho_{X}(t_1,t_2) = \frac{\mathrm{cov}(X(t_1),X(t_2))}{\sigma_{X}(t_1) \sigma_{X}(t_2)} = \frac{ E(X(t_1)X(t_2)) - E(X(t_1)) \cdot E(X(t_2)) } { \sqrt{E(X^2(t_1)) - E^2(X(t_1))} ~ \sqrt{ E(X^2(t_2)) - E^2(X(t_2)) } }$$

If the process is second-order stationary
Auto-correlation:


 * $$R_X(\tau) = R_{X}(t, t - \tau) = E( X(t) X(t -\tau) )\,$$

Auto-covariance:


 * $$C_X(\tau) = \mathrm{cov}(X(t), X(t-\tau)) = R_{X}(\tau) - \mu_{X}^2\,$$

Degree of correlation (also known in optics as "degree of coherence"):



 * $$\rho_{X}(\tau) = \frac{\mathrm{cov}(X(t),X(t-\tau))}{\sigma_{X}^2} = \frac{ R_{X}(\tau) - \mu_{X}^2 } { \sigma_{X}^2 }$$

All the above are directly extensible to the discrete case.

I would also add a consideration. Expressing the formulas in terms of the mean E has the advantage that one can see the logical correspondence with a deterministic signal, for which all the above quantities can also be defined, except that the mean operation is done directly in the time domain without using the probability density function.


 * ~ TheNoise

A response
That looks like a fine coherent set of terminology. But let's look at exactly where the article stands and what you are proposing to change.

First, in the statistics section, the definitions are discrete-time and they differ in terms of what name goes with removal of means or not. I'm no expert on that field, but if those are how they use the terms and symbols, then we shouldn't prescribe something different for the sake of consistency with engineering; rather we should just describe things using the terms that the field uses, while pointing out the differences to avoid confusion. Otherwise, a stat guy will come along and change it all back to his way.

Second, in engineering, I've usually seen, and always preferred, the double-subscript notation to make the "auto-" and "cross-" symbologies mutually consistent. If you drop back to a single subscript, it is no longer recognizable as a special case of the same definition. But we are only weakly linked (in the lead and one other place) to cross-correlation, which should be played up and made consistent if possible.

I checked some books and found the "Cov" is usually capitalized (in my small sample), and it and "E" are set in roman type (in \mathrm{}).

I would be in favor of keeping equations as simple as possible, but not leaving out anything important. For example, define the mean and standard deviation in the context of the Expectation notation, and don't use the expanded form of the definitions here (and put more words around them to say what the equations say):


 * $$\mu_{X} = \mathrm{E}(X)\,$$
 * $$\mathrm{Cov}(X,Y) = \mathrm{E}((X-\mu_{X})(Y-\mu_{Y})) \,$$
 * $$ \sigma_X = \sqrt{\mathrm{Cov}(X, X)} $$
 * $$\rho_{X,Y} = \frac{\mathrm{Cov}(X,Y)}{\sigma_{X} \sigma_{Y}} $$


 * The problem we have here is that there is no consensus in the literature. To me an article about "autocorrelation" *must* be about what autocorrelation means in the real world.  Unfortunately, it is used inconsistently and this article has to reflect that.  If we were writing a paper, thesis or article, we would be free to simply write our definitions and use them consistently.  Here, I think we have by necessity to do something different and to explain what is likely to be meant by autocorrelation when someone sees it in the literature.  This means we must acknowledge that the word has a number of meanings.  For what it is worth I have seen cov and Cov but never COV.  I have seen E in roman type, in italics and in \mathbb (the latter being my preference when I write papers).
 * While I have sympathy with the idea of trying to make a consistent set of definitions, I think we must by necessity do something different. --Richard Clegg 18:21, 9 September 2006 (UTC)


 * I think I sort of half agree. Certainly we must reflect what the real world terminology is.  But we can do that best by adopting one or a few internally consistent sets of notation that we think are most common, and then mentioning the differences.  The biggest differences are the definitional differences between the stat and eng fields, and those sections need to each be made compatible with their fields; internally, though, they should be consistent.  As to cov, Cov, E, etc., we just need to pick a style.  The one I mentioned was based on a quick check of a few books, but I'm flexible if someone shows that some other conventions are more common (not just personally preferred). Dicklyon 19:24, 9 September 2006 (UTC)


 * I agree with Dicklyon here. Even if there are multiple real-world conventions, we should keep one and then mention the differences that can be encountered. There is one problem of definitions: the auto-covariance is sometimes defined as the degree of correlation (using the terminology of the proposed convention). But if we define both $$R_X$$ and $$\rho_X$$, we can distinguish the two entities and also say that $$R_X$$ is sometimes defined as $$\rho_X$$. IMHO that is clearer than defining $$R_X$$ in two different ways and also saying that one of the two is also known as the degree of correlation. Moreover, it seems to me that the big ambiguity is more verbal than symbolic. In fact the coefficient of correlation (usually indicated with $$\rho$$) is also called correlation (thus causing ambiguity), while the "raw" un-normalized correlation is usually denoted by $$R$$. We can at least agree on this convention of symbols and then explain in words that these entities are called by different names. Regarding the typographical convention, we should simply try to be coherent across related articles. For example, I prefer to use {} for the $$E\{\cdot\}\,$$ operator. This would make the formulas a bit more difficult to edit (we would have to write \{ and \}) but it will increase readability, IMHO. For the $$E\,$$ itself, I am indifferent between the roman character $$\mathrm{E}\,$$ and the bold one $$\mathbf{E}\,$$, but this formatting too would require more editing. For the covariance (cov) I simply used the convention of the correlation page (although I forgot to put it in roman font). ~ TheNoise 14:12, 10 September 2006 (UTC)


 * I am a graduate student in engineering and have been poring over a dozen texts that all seem to define autocorrelation a bit differently and, frankly, sloppily. So I am both somewhat pleased that there is so much discussion about the very topic here, and also dismayed that it has not been made more clear. In my opinion, the clearest treatment is in Bendat and Piersol's Random Data text. I think the definitions and notation are very similar to what was suggested above. A clear distinction is made between the autocorrelation and the correlation coefficient (the latter normalized by the variance), which is not at all clear in the current Wikipedia entry. I will note, however, that B&P define autocorrelation as an expected value, and therefore normalized by the length of the signal, while most signal processing texts (and indeed MATLAB) include no such normalization. This has been the cause of much confusion.--Vschmidt (talk) 14:59, 29 April 2008 (UTC)

Autocorrelation of a periodic function
There's a statement on the page that the autocorrelation of a periodic function with itself is again periodic with the same period. That can't be correct. The integral doesn't converge.


 * This is a problem with the variety of different definitions on this page. Using the definition in the statistics section it is an invalid question since the periodic function is not second-order stationary so you must use the two parameter ACF.  The signal processing definition assumes the signal has an integral which converges so that most (all) periodic signals would not be so integrable.  In spirit though, the assertion is correct.  --Richard Clegg 17:36, 5 October 2006 (UTC)


 * I think you got that a bit wrong. The statistics definition works fine, as the expected value of a finite product.  The integral definition is problematic, but it too can be OK if you allow interpretation of the integral in terms of delta functions.  Bottom line, if the autocorrelation exists, it is periodic.  You might prefer conditions that say it doesn't exist when the definitions don't lead to finite values though. Dicklyon 17:47, 5 October 2006 (UTC)


 * I see what you mean about stationarity, though. In some cases the expectation will be OK, like if the phase is random and the process is ergodic.  The integral method, however, averages over phases, so doesn't care about stationarity so strictly; but it leads to delta functions.  Isn't math wonderfully nasty? Dicklyon 17:52, 5 October 2006 (UTC)


 * The statistics definition with one parameter is only valid for a second order stationary process. If the process is not second order stationary then E[X(t)X(t+T)] will be a function of t as well as T.  I'm not sure what you mean by "if the phase is random" in the case of the autocorrelation of a periodic function.  --Richard Clegg 22:00, 5 October 2006 (UTC)


 * What I mean is that if the phase is a random variable, and each sample function from the process has a different phase, with a uniform distribution over all phases, then the periodic process is second-order stationary. That is, the expected value of the product of two points with a certain time difference includes an averaging over all phases, so doesn't depend on the two points, just their difference.  Right? Dicklyon 22:33, 5 October 2006 (UTC)


 * Hmm... I can see your point.  It would be "in the spirit" of ACF but would lead to some strange quirks in the mathematics.  I'm happier saying "undefined" but it seems like there are enough definitions of ACF out there that one will fit.  --Richard Clegg 23:18, 5 October 2006 (UTC)


 * Yes, and another one that would work is the limit as T goes to infinity of 1/T times the integral of the (expected) product of a segment of length T times a shift of itself. For an ergodic stationary process it will give the same result, and for the periodic signal it has no difficulty. Since the process is ergodic you can do it on one sample function instead of relying on an expected value. I haven't seen that as a definition per se, but maybe it is, somewhere. Dicklyon 00:56, 6 October 2006 (UTC)


 * Here's a good page that explains what I was just talking about, the ergodic hypothesis applied to ACF: Dicklyon 01:00, 6 October 2006 (UTC)


 * And here's one that defines the ACF of an ergodic process in the way I described above: Dicklyon 01:02, 6 October 2006 (UTC)


 * But you were talking about a non-stationary and hence non-ergodic process.--Richard Clegg 01:21, 6 October 2006 (UTC)


 * A periodic process can be stationary and ergodic, or not, depending on whether its phase is uniformly distributed over its period. But even if that's not the case, the limit of the integral will converge to something that corresponds to what is usually meant by the ACF of such a function.  That is, sin(t) is not stationary, but sin(t+phi) for an appropriately distributed random variable phi, which is a constant in any given sample function from the process, is stationary and ergodic, if I understand this stuff right; I took a course on ergodic processes from Bob Gray about 30 years ago, but it went in one ear and out both. Dicklyon 02:12, 6 October 2006 (UTC)


 * Ah... got you. So it is periodic but you are unsure at what point in its phase it begins.  Clever, that case had not occurred to me.  Such a process could be both stationary and ergodic, you are absolutely correct.  I hadn't appreciated what you meant about phase even though you raised it early on.  Hmm...  I wonder if we could include the sin(t+phi) as an illustrative example somehow? --Richard Clegg 08:13, 6 October 2006 (UTC)


 * I think it opens a can of worms that I'm not so sure about. Let's don't do it unless we can find a text with such a treatment, so that whatever we say is verifiable. Instead, I put the limit definition that works, even if it's not random phase. Dicklyon 15:02, 6 October 2006 (UTC)
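For reference, the random-phase case discussed in this thread can be worked out directly. This is only a sketch under stated assumptions: take $$X(t) = \sin(t+\varphi)$$ with $$\varphi$$ uniformly distributed on $$[0, 2\pi)$$ and fixed within each sample function. No source is cited for it here, so it is offered as a check on the discussion rather than as article text.

$$\mathrm{E}[X(t)] = \frac{1}{2\pi}\int_0^{2\pi} \sin(t+\varphi)\, d\varphi = 0$$

$$\mathrm{E}[X(t)X(t+\tau)] = \frac{1}{2\pi}\int_0^{2\pi} \tfrac{1}{2}\left[\cos\tau - \cos(2t+\tau+2\varphi)\right] d\varphi = \tfrac{1}{2}\cos\tau$$

The result depends only on the lag $$\tau$$, not on $$t$$, which is the second-order stationarity claimed above. The time-average (limit) definition applied to the single deterministic function $$\sin(t)$$ gives the same answer:

$$\lim_{T\to\infty}\frac{1}{T}\int_0^T \sin(t)\sin(t+\tau)\,dt = \lim_{T\to\infty}\frac{1}{T}\int_0^T \tfrac{1}{2}\left[\cos\tau - \cos(2t+\tau)\right] dt = \tfrac{1}{2}\cos\tau$$

Both are periodic in $$\tau$$ with the same period as the original function.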

Time series
According to this page,


 * A time series is a sequence of observations which are ordered in time (or space). ... There are two kinds of time series data:
 * 1. Continuous, where we have an observation at every instant of time, e.g. lie detectors, electrocardiograms. We denote this using observation X at time t, X(t).
 * 2. Discrete, where we have an observation at (usually regularly) spaced intervals. We denote this as Xt.

If I interpret this correctly, the statistical definition in our article does not apply to time series, since a sequence of observations is a set of definite data, not a random process, even if it is assumed to have been produced by a random process. So the expectation operator does not apply. Furthermore, we had "discrete time series and processes" which seems like an inappropriate and non-parallel way to divide things up. So I changed these things. Please react, preferably with citations that clear up exactly how this issue is usually treated in statistics. Dicklyon 15:51, 18 October 2006 (UTC)

Partial Autocorrelation
I believe this should be a section within this page. Canking 18:33, 25 November 2006 (UTC)


 * I just looked up what that is, and I think it rates a separate article. Dicklyon 05:02, 26 November 2006 (UTC)


 * Unfortunately I don't have the expertise to write the article, sorry. Perhaps somebody else will write it. Canking 10:55, 8 December 2006 (UTC)

Not user-friendly
Definitions not user friendly (not very accessible to the layman, in particular); it seems geared more for the intermediate to advanced mathematical student. AppleJuggler 03:17, 23 January 2007 (UTC)


 * What would a non-mathematical layman do with a definition of autocorrelation? If we produced a simpler definition that he could understand, but which was not actually correct, would that be an improvement? Dicklyon 06:24, 23 January 2007 (UTC)
 * Good point. AppleJuggler 07:06, 6 February 2007 (UTC)
 * Eaton and Eaton: "LabTutor" uses an easy-to-understand example of wind velocity measurements on a gusty day. If we measure two samples one millisecond apart we expect them to be the same, and indeed the autocovariance value (which this article equates with autocorrelation even though technically speaking it isn't) is close to 1. However, with a time delay of five minutes between the two measurements, there is no similarity, so the autocovariance is close to 0. The autocovariance function then plots the values of the autocovariance as a function of the time interval between the two measurements. The book goes on to show a sample autocovariance plot, elaborating that from this you can deduce the time scale of the wind. I think such an example would facilitate the layman's understanding of the term autocorrelation/autocovariance. I am not sure to what extent I am allowed to cite or even quote my source, so I am reluctant to insert this into the main article. Jonemo (talk) 12:15, 15 April 2008 (UTC)

Please, please make this article user friendly. I'm a 3rd year student and I'm having difficulty understanding this, partly because there is an excessive use of unnecessary 'tough sounding' terminology that hides the real meaning, and the structure needs improvement. Thanks a lot and great job guys! 13 feb 2008


 * Why don't you find a book or other source whose explanation you find to be more user friendly, and mention it here as an example of how you think it could be done better? Dicklyon (talk) 06:42, 13 February 2008 (UTC)

A tiny suggestion, but it would be nice if the description of discrete autocorrelation:
 * $$R_{xx}(j) = \sum_n x_n \overline{x}_{n-j} \ . $$

noted that the last subscript is understood to be modulo N (the length of the signal). I don't understand the topic enough to be confident to make the change myself. —Preceding unsigned comment added by 67.169.76.230 (talk) 10:15, 20 February 2010 (UTC)
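To make the modulo-N reading suggested above concrete, here is a minimal Python sketch of the circular (periodic) form of that sum. The helper name `circular_autocorr` is made up for this example. Whether the article intends the circular convention or the one where out-of-range samples are treated as zero is exactly the open question, so this shows only the circular interpretation.

```python
def circular_autocorr(x, j):
    """R_xx(j) = sum_n x[n] * conj(x[(n - j) mod N]), with the index wrapped modulo N."""
    n_total = len(x)
    # Python's % always returns a value in 0..N-1 for a positive modulus,
    # so (n - j) % n_total wraps negative indices around to the end of the signal.
    return sum(
        x[n] * complex(x[(n - j) % n_total]).conjugate()
        for n in range(n_total)
    )


x = [2, 3, 1]
r = [circular_autocorr(x, j) for j in range(len(x))]
```

With the wrap-around, lag 1 and lag 2 give the same value for a real signal of length 3, because lag 2 is the same as lag -1 modulo 3.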

infinite variance?
I'm a little confused by the edit summary just left. The very definition of gaussian white noise is that the marginal distribution of variates at a given time have a gaussian distribution. If they have a gaussian distribution, they have finite variance. How is it you go from a finite spectrum to an infinite marginal variance? Lunch 19:44, 19 May 2007 (UTC)


 * See these books. Your definition may be correct and consistent with infinite variance, if by "variates" you mean numbers that you can get by integrals of the process times some kernel.  But a sample is not an integral; for a continuous-time white noise process, the variance of "samples", if they could be defined, would be infinite, like the variance of the process, which is the variance per unit bandwidth times an infinite bandwidth.  If you define sampling as the limit, as the kernel width goes to zero, of the integral of a kernel (a distribution) times the process value, then that limit does not exist for a white-noise process.  So you can't sample it. Dicklyon 19:59, 19 May 2007 (UTC)

Stationary processes only?
Hi there, it seems to me that the definition in the "Statistics" section assumes that the process is stationary, seeing as mean and variance are not time-dependent. Then, confusingly the article continues saying that if the process is second-order-stationary (a weaker assumption!) some other property holds. I'm confused, thanks. 58.88.53.246 (talk) 14:17, 18 October 2008 (UTC)
 * Fixed, please review. 58.88.53.246 (talk) 13:29, 22 October 2008 (UTC)
 * Yes, better now. But a process can be non-stationary even if the mean and variance are constant. Still, you have edited it to what should have been there. Melcombe (talk) 13:55, 22 October 2008 (UTC)
 * What do you suggest? The article seems correct now, since mean+variance constant is the definition of second-order stationary, not stationary. 170.148.96.107 (talk) 04:25, 23 October 2008 (UTC)
 * Possibly no change was needed but I have changed things a little. To be precise, second-order stationary requires both "mean+variance constant" and that the correlation (defined via the general expression) satisfies R(t,s)=R(|t-s|). However, "mean+variance constant" alone does not imply R(t,s)=R(|t-s|), which was a strict interpretation of what was there before. Melcombe (talk) 08:48, 23 October 2008 (UTC)
 * Maybe the term correlation could be replaced by autocorrelation? Even if it is sometimes better to avoid repeating the same word, changing it here may be confusing. EtudiantEco (talk) 17:20, 23 October 2008 (UTC)
 * Fixed now. Melcombe (talk) 10:14, 27 October 2008 (UTC)

Symmetry of the autocorrelation
I'm a little bit confused by the statement that the autocorrelation is an even function, r(i)=r(-i)... I would think it holds only for stationary series... the symmetry of the correlation only implies that $$cov(x_t,x_{t-j})=cov(x_{t-j},x_{t})$$. Take a random walk with covariance (t-j)sigma^2: the cov between time 10 and 8 is 8 sigma^2 (and so is the cov between time 8 and 10) but it is not equal to the cov between time 10 and 12 (which is 10 sigma^2). Am I right? Well, actually, looking at Hamilton (1994), Time Series Analysis, this symmetry is stated only for stationary processes. EtudiantEco (talk) 17:20, 23 October 2008 (UTC)
 * Stationarity implies symmetry of the autocorrelation, but there can be other processes with symmetric autocorrelations. However, your non-stationary case does not provide an example, as the autocorrelations are 8/sqrt(8*10) and 10/sqrt(10*12). Examples can be constructed by starting with a stationary process and multiplying this by a non-random time-varying function, so that the mean and variance change with time but the autocorrelation does not. Melcombe (talk) 10:24, 27 October 2008 (UTC)
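The random-walk numbers in this thread can be checked with a short Python sketch. It assumes the usual model, a walk starting at zero with independent zero-mean steps of variance sigma^2, for which Cov(X_s, X_t) = min(s, t) sigma^2 because the two positions share exactly min(s, t) steps. The helper names `rw_cov` and `rw_corr` are made up for this example.

```python
import math


def rw_cov(s, t, sigma2=1.0):
    """Cov(X_s, X_t) for X_t = e_1 + ... + e_t with i.i.d. zero-mean steps of variance sigma2."""
    return min(s, t) * sigma2     # the two positions share min(s, t) steps


def rw_corr(s, t):
    """Correlation between X_s and X_t; the step variance cancels out."""
    return rw_cov(s, t) / math.sqrt(rw_cov(s, s) * rw_cov(t, t))


cov_10_8 = rw_cov(10, 8)      # lag -2 from time 10
cov_10_12 = rw_cov(10, 12)    # lag +2 from time 10
corr_10_8 = rw_corr(10, 8)
corr_10_12 = rw_corr(10, 12)
```

Under this model both the covariance and the correlation differ between lag -2 and lag +2 from the same time point, so the random walk does not have a symmetric one-parameter autocorrelation.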

Assumptions in the Data
I went to this page looking for a list of assumptions that the data should meet in order to use autocorrelation, such as stationarity. This is a useful Wiki page, but I think the page could be improved if there were a subsection that listed the assumptions, and how to respond if the assumptions are not met. What types of data are appropriate: only continuous variables? Can categorical or binary data be the subject of autocorrelation? Can binary data be analyzed this way only if they meet certain conditions, such as only if the distribution of ++, +-, -+, and -- falls within certain limits? Otherwise a link to a join-count page would be good. Thanks to the authors of this page for their time and effort. —Preceding unsigned comment added by Vstrom650 (talk • contribs) 14:39, 28 July 2010 (UTC)

Autocorrelation of white noise
Regarding the property stated : The autocorrelation of a continuous-time white noise signal will have a strong peak (represented by a Dirac delta function) at τ = 0 and will be absolutely 0 for all other τ.

Can white noise be characterized as square integrable as follows? White noise characterized in the limit as occupying spectral band approaching -inf ~ +inf, duration approaching inf, and (frequency invariant) spectral density approaching zero may exhibit square integrability, and in this case the autocorrelation approaches a delta fn of finite energy in the limit.

Could it even be said that in the limit, such noise has spectral density approaching identity with the Fourier transform squared of a delta fn of identical energy? So that the autocorrelation of such noise could be determined somewhat heuristically as the inverse transform of the frequency invariant spectral density of white noise? —Preceding unsigned comment added by Groovamos (talk • contribs) 04:46, 15 September 2010 (UTC)

Not random numbers
It seems to me (not an expert) that the figure caption which suggests the numbers are "random" should be changed. If the series were in fact random, there would be no autocorrelation. — Preceding unsigned comment added by MQBenedict (talk • contribs) 10:06, 18 June 2012 (UTC)


 * A random process can have lots of structure. You may be confusing what in layman's terms might be called "random" and "completely random", similar to the confusion between "noise" and "white noise". Melcombe (talk) 08:11, 28 June 2012 (UTC)

Efficient computation
I don't understand Manoguru's edit of 11:46, 8 February 2012, which added what in the current version of the article is this content:


 * For data expressed as a discrete sequence, it is frequently necessary to compute the autocorrelation with high computational efficiency. The brute force method based on the definition can be used. For example, to calculate the autocorrelation of $$x = (2,3,1)$$, we employ the usual multiplication method with right shifts:



       2  3  1
    ×  2  3  1
    __________
       2  3  1
          6  9  3
             4  6  2
    ________________
       2  9 14  9  2


 * Thus the required autocorrelation is (2,9,14,9,2). In this calculation we do not perform the carry-over operation during addition because the vector $$x$$ has been defined over a field of real numbers. Note that we can halve the number of operations required by exploiting the inherent symmetry of the autocorrelation.

First, the autocorrelation should be a function of the lag; is this intended to be the one-period autocorrelation? Second, no motivation for the procedure is given. Third, the value of the autocorrelation function for lag=1 should be a scalar number, not a set of five numbers. And fourth, an autocorrelation value should be a number between -1 and 1. What am I missing here? Duoduoduo (talk) 17:15, 9 January 2013 (UTC) Fifth, it doesn't make sense to calculate autocorrelations for a time series of just three data points. Is this intended to be $$x = (2,3,1,2,3,1,...)$$ instead of $$x = (2,3,1)$$? Sixth, no citation is given. Duoduoduo (talk) 14:07, 10 January 2013 (UTC)


 * I'm going to delete this passage unless I hear an objection here soon. Duoduoduo (talk) 14:07, 10 January 2013 (UTC)


 * Hi Duo, thanks for pointing out these issues with my edits. This was a highly pedagogical example. My main motivation in giving it was that, when I was a student, I always found it quite difficult to calculate autocorrelation (and, by extension, cross-correlation and convolution), and I wish that someone had told me this simple fact. My answers to your questions are as follows.
 * Ques.(2) Motivation is as given by the signal processing definition of autocorrelation $$R_{xx}(j) = \sum_n x_n\,\overline{x}_{n-j}.$$ So the example I gave was just a realization of this procedure, with explicit recognition that it is nothing but ordinary multiplication with right shifts. Each vertical addition gives a particular lag component of the autocorrelation.
 * Ques.(1 & 3) In the answer (2,9,14,9,2) the Rx(0)=14, Rx(1)=Rx(-1)=9, and Rx(2)=Rx(-2)=2.
 * Ques.(4) In the signal processing definition, the autocorrelation values are not normalized. But clearly this can easily be done.
 * Ques.(5) Perhaps the change to x=(...,0,0,2,3,1,0,0,...) might help? Indeed if x=(2,3,1,2,3,1,...) then we get circular autocorrelation.
 * Ques.(6) This is a well-known fact in the signal processing community. Just follow the definition.
 * I will revert the edits for now, with some more explanation. If you feel that you can add something to it, please feel free to do so. If you feel that it does not belong here, then do whatever you want with it.(Manoguru (talk) 16:21, 20 March 2013 (UTC))


 * I just checked out your page. The fact that an emeritus professor finds this confusing testifies to the fact that this fact is not as well known as it should be. (Manoguru (talk) 17:01, 20 March 2013 (UTC))
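
The shift-and-add calculation discussed in this thread can be checked numerically. The following is a minimal sketch in plain Python; the function names `autocorr_shift_add` and `autocorr_definition` are hypothetical and chosen only for this illustration. It assumes a real-valued sequence that is zero outside its listed samples, and it uses the unnormalized signal-processing definition $$R_{xx}(j) = \sum_n x_n x_{n-j}$$ quoted above.

```python
def autocorr_shift_add(x):
    """Unnormalized autocorrelation for lags -(N-1)..(N-1), by shift-and-add."""
    n = len(x)
    out = [0] * (2 * n - 1)
    # One row per multiplier, taken from the last element of x to the first;
    # each successive row is shifted one place further to the right.
    for shift, m in enumerate(reversed(x)):
        for j, v in enumerate(x):
            out[shift + j] += m * v  # column sums, with no carry-over
    return out


def autocorr_definition(x):
    """Same quantity from R(j) = sum_n x[n] * x[n - j], with x zero outside 0..N-1."""
    n = len(x)
    return [
        sum(x[k] * x[k - j] for k in range(n) if 0 <= k - j < n)
        for j in range(-(n - 1), n)
    ]


x = [2, 3, 1]
r = autocorr_shift_add(x)
print(r)                       # [2, 9, 14, 9, 2]
print(autocorr_definition(x))  # [2, 9, 14, 9, 2]

# Dividing by the lag-0 value gives coefficients between -1 and 1.
r0 = r[len(x) - 1]
print([v / r0 for v in r])
```

The middle entry is the lag-0 value, and the entries on either side are lags ±1 and ±2. The last line shows the normalization raised in Ques.(4).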

avoiding autocorrelation
Generalized least squares (GLS) regression can be applied to the data in order to avoid violating the OLS assumption of no autocorrelation.

03:25, 6 May 2013 (UTC)

Confidence bounds
Confidence bounds of an ACF plot have two purposes in the context of statistical inference: 1) testing the robustness of the calculated autocorrelation coefficients (ACs), and 2) finding which ARMA model fits the data best.

Explanation: Take a random sample from a certain unknown population. Ideally, this sample is representative of the population. In the ideal case, when we calculate a statistic on that sample (such as the mean, the variance or, as in this case, the ACs), the result equals the corresponding value for the population. Unfortunately, we have to be really lucky for the ideal case to hold. More often, our sample is not exactly representative of the population. However, since this sample is the only one we have, it gives the best estimate of the population ACs available to us. It is in this context that confidence bounds are useful. Take the example of the 95% confidence bounds, which are the most commonly used. Imagine that we could take 100 random samples from a population with no autocorrelation: on average, about 95 of those samples would give us an AC within the blue lines on the ACF plot at a given lag, and only about 5 would lie outside those lines. But we only have a single sample, resulting in our ACF plot. If from this single sample we get an AC value within the blue lines or just slightly outside them, the value is consistent with random chance: we cannot reject, at the 5% significance level, the hypothesis that the true AC is zero. This means the value is not really to be trusted. However, if the AC value lies far outside the blue lines, we can conclude that it is statistically significant and place more trust in it, since it is less likely to have occurred by chance. So, for 1), this helps us to establish the robustness of the ACs found.

As for 2), it is first necessary to explain what an ARMA model is. Simply put, it is a mathematical formula that lets us describe our sample or, ideally, our unknown population. Ideally, this formula is accurate enough to fit our sample perfectly or, even better, our unknown population. ARMA stands for autoregressive moving average model and is actually a set of many autoregressive models. Deciding which AR model fits our sample (or our population) best can be done using the ACF plot. The key is to find the highest lag at which the AC value lies far outside the blue lines. So, for example, if the AC value at lag 1 lies far outside the blue lines and the value at lag 2 lies within them, it is highly probable that our sample (or population) is best fitted by an AR1 process, which is a type of ARMA model. If, for example, the ACs up to lag 3 are well outside the blue lines but the AC at lag 4 is not, we have an AR3 process. If only the AC at lag 0 is outside the blue lines (data are always maximally correlated with themselves at lag 0, hence the AC value of 1), we have an AR0 process, meaning there is no autocorrelation at nonzero lags and our data cannot be properly fitted by an ARMA model.

03:25, 6 May 2013 (UTC)
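
The "blue lines" described above can be reproduced approximately with a short script. This is a minimal sketch under stated assumptions, not the method of any particular plotting package: it uses the common large-sample approximation ±1.96/√N for the 95% bounds, which holds when the series is uncorrelated noise, and the names `sample_acf` and `white_noise_bound` are hypothetical.

```python
import math
import random


def sample_acf(x, max_lag):
    """Sample autocorrelation coefficients r_0..r_max_lag (mean removed, r_0 = 1)."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)  # lag-0 sum of squares, used for normalization
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        for k in range(max_lag + 1)
    ]


def white_noise_bound(n, z=1.96):
    """Approximate bound z / sqrt(n); z = 1.96 corresponds to 95%."""
    return z / math.sqrt(n)


random.seed(0)  # fixed seed so the illustration is repeatable
noise = [random.gauss(0.0, 1.0) for _ in range(200)]
acf = sample_acf(noise, 20)
bound = white_noise_bound(len(noise))

# For uncorrelated noise, roughly 1 lag in 20 is expected to fall outside by chance.
outside = [k for k in range(1, 21) if abs(acf[k]) > bound]
print(f"bound = +/-{bound:.3f}, lags outside the bound: {outside}")
```

Because each lag is tested separately, an occasional value just outside the bound is expected even for pure noise, which is why values far outside carry more weight than marginal ones.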

hyphen
Autocorrelation but cross-correlation? Do we take a view on whether to use a hyphen or not and does consistency matter here? Markus Kuhn (talk) 16:53, 19 March 2020 (UTC)

A problem with equation 3
Concerns Equation 3: I think that the standard is different and equation (3) should be: $$R_{XX}(\tau) = E[X_t \overline{X}_{t-\tau}]$$ (minus instead of plus)

Grounds:

1. There is a problem with the current reference: the referenced book describes the real case, thus there is no conjugation: $$R_X(\tau) := E[X(t+\tau)X(t)]$$.

2. In all the books that I have checked, the conjugation applies to the delayed part (unlike (3)). The books that I checked are: a) Introduction to Spectral Analysis by Petre Stoica and Randolph Moses; b) Spectral Analysis for Physical Applications by Donald B. Percival and Andrew T. Walden (1993) [page 39]; c) Digital Spectral Analysis by S. Lawrence Marple Jr. [page 102]; d) Statistical Digital Signal Processing and Modeling by Monson H. Hayes (1996). — Preceding unsigned comment added by 83.28.239.168 (talk) 11:44, 18 January 2021 (UTC)

Response
I agree with you. If the autocorrelation were defined as $$R_{XX}(\tau) = E[X_t \overline{X}_{t+\tau}]$$, one would get the wrong sign in the exponent of the Fourier transform relating the power spectral density ($$S_{XX}$$) to the autocorrelation. I've corrected the issue by writing the autocorrelation as $$R_{XX}(\tau) = E[X_{t+\tau} \overline{X}_t]$$ in equation 3 and in the other places that were wrong. Martim Zurita (talk) 13:50, 19 October 2021 (UTC)
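
The effect of the convention can be seen with a single complex exponential. This is only a sketch: the test process $$X_t = A e^{i\omega_0 t}$$, with a zero-mean random complex amplitude $$A$$ of finite variance, and the transform convention $$S_{XX}(\omega) = \int R_{XX}(\tau) e^{-i\omega\tau} \, d\tau$$ are assumptions made here for illustration. With the conjugate on the undelayed factor:

$$R_{XX}(\tau) = E[X_{t+\tau} \overline{X}_t] = E[|A|^2]\, e^{i\omega_0 \tau}, \qquad S_{XX}(\omega) = 2\pi E[|A|^2]\, \delta(\omega - \omega_0)$$

With the opposite placement:

$$E[X_t \overline{X}_{t+\tau}] = E[|A|^2]\, e^{-i\omega_0 \tau}, \qquad \int E[|A|^2]\, e^{-i\omega_0 \tau} e^{-i\omega\tau} \, d\tau = 2\pi E[|A|^2]\, \delta(\omega + \omega_0)$$

Under these assumptions, the first convention places the spectral line at $$+\omega_0$$, where the signal's frequency actually is, while the second places it at $$-\omega_0$$ unless the sign in the exponent of the transform is flipped as well.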