Talk:Distance correlation

Problems with the article
I've gone back to the two cited articles by the original authors, and I have trouble reconciling some things here with what is there:

1. In "properties": "(ii) dcov_n = 0 if and only if every observation is the same." First, the same as what? dcov refers to the dcov of two different variables. And this quote seems unlikely to be true, since it would preclude two non-constant variables from having a sample dcov of zero even if the variables are independent.


 * (1) It is correct but unclear.

The 2007 paper (page 1244, before Remark 2) says that dvar_n(X) = 0 iff every sample observation is identical. Am I correct that "dcov_n" in the present quote should be changed to "dvar_n"?


 * It would be correct that way. Statement (ii) is awkward as written; dvar_n is better.
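The dvar_n property discussed in point 1 can be checked numerically. The following is a minimal sketch, assuming the usual double-centered distance matrix form of the sample statistic; the function name `dvar2` is only illustrative and does not come from the papers.

```python
import numpy as np

def dvar2(x):
    """Squared sample distance variance of x (hypothetical helper name)."""
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    # Pairwise Euclidean distances a[k, l] = |x_k - x_l|.
    a = np.sqrt(((x[:, None, :] - x[None, :, :]) ** 2).sum(axis=-1))
    # Double centering: subtract row and column means, add the grand mean.
    A = a - a.mean(axis=0, keepdims=True) - a.mean(axis=1, keepdims=True) + a.mean()
    return float((A * A).mean())

print(dvar2([2.0, 2.0, 2.0]))  # every observation identical
print(dvar2([2.0, 2.0, 5.0]))  # one observation differs
```

With every observation identical, all pairwise distances are zero, so the statistic is zero; as soon as one observation differs, it is positive.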

2. In the section "Definitions#Distance covariance", it says "distance covariance is not the same as the covariance of distances, cov(|X-Y|, |Y-Y’|)". Should this say "cov(|X-X’|, |Y-Y’|)"? As it is it's not symmetric.


 * (2) you are correct; cov(|X-X’|, |Y-Y’|)

3. Still in the section "Definitions#Distance covariance", it says


 * "The population value of distance covariance [1][2] is


 * dcov(X,Y) := E|X - X’||Y - Y’| + E|X - X’| E|Y - Y’| - E|X - X’||Y - Y”| - E|X - X”||Y - Y’|


 * where E denotes expected value, X’ is an independent and identically distributed copy of X, Y’ is an independent and identically distributed copy of Y, finally X” (Y”) has the same distribution as X (Y) and independent not only of X (Y) but also of Y and Y’ (X and X’)."

I have a couple problems with this:


 * (a) Should it say that " X” (Y”) is independent not only of X and X’ (Y and Y’) but also ...."?


 * (b) I can't see how this definition relates to the one in the original papers (2007 and 2009). E.g. the closest thing I can find in the 2007 paper is in regard to the sample dcov, which is given (p. 2776, top and eq. 2.18) as

$$\mathrm{dcov}_n^2 = \frac{1}{n^2}\sum_{k,l=1}^{n} |X_k - X_l|\,|Y_k - Y_l| + \frac{1}{n^2}\sum_{k,l=1}^{n} |X_k - X_l| \times \frac{1}{n^2}\sum_{k,l=1}^{n} |Y_k - Y_l| - 2\left[\frac{1}{n^3}\sum_{k=1}^{n}\sum_{l,m=1}^{n} |X_k - X_l|\,|Y_k - Y_m|\right].$$

This appears to me to translate into an expression for the population dcov^2 (not dcov):

$$\mathrm{dcov}^2 = E|X_k - X_l|\,|Y_k - Y_l| + E|X_k - X_l| \times E|Y_k - Y_l| - 2\,E|X_k - X_l|\,|Y_k - Y_m|.$$

(I assume we can translate notation as X_k and X_l becoming X and X’, and Y_k, Y_l and Y_m becoming Y, Y’, and Y”.)

So I don't even see any mention of X” in the original paper. Duoduoduo (talk) 22:23, 21 December 2010 (UTC)


 * You want to check the later paper on Brownian Distance Covariance; this result is proved in the second part. You are correct that the equality is stated for population distance covariance. Looks like this section requires clarification.


 * Not the original poster but I will make corrections and clarifications asap. Thanks for catching the error. (modified my reply) Mathstat (talk) 23:28, 21 December 2010 (UTC)
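The sample formula quoted in point 3(b) can be sketched in code as three sums over the pairwise distances. This is a minimal illustration, assuming Euclidean distances and equal-length samples; the names `pairwise_dist` and `dcov2_sums` are hypothetical. Note also that the two cross terms in the article's four-term population formula have the same expectation (swap the roles of the primed and double-primed copies), which would explain why the paper's version shows a single cross term with a factor of 2.

```python
import numpy as np

def pairwise_dist(z):
    """Matrix of Euclidean distances |z_k - z_l| (hypothetical helper)."""
    z = np.asarray(z, dtype=float)
    z = z.reshape(z.shape[0], -1)
    diff = z[:, None, :] - z[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def dcov2_sums(x, y):
    """Squared sample distance covariance written as S1 + S2 - 2*S3."""
    a = pairwise_dist(x)
    b = pairwise_dist(y)
    n = a.shape[0]
    # S1: (1/n^2) * sum over k,l of |X_k - X_l| |Y_k - Y_l|
    s1 = (a * b).sum() / n**2
    # S2: product of the two mean pairwise distances
    s2 = (a.sum() / n**2) * (b.sum() / n**2)
    # S3: (1/n^3) * sum over k of (sum_l |X_k - X_l|) * (sum_m |Y_k - Y_m|)
    s3 = (a.sum(axis=1) * b.sum(axis=1)).sum() / n**3
    return float(s1 + s2 - 2.0 * s3)

print(dcov2_sums([0.0, 1.0], [0.0, 1.0]))
```

The result is the squared statistic; taking its square root gives the quantity the papers call dCov.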

Notational confusion
@Mathstat: Thanks for trying to clean up this article's notation. Maybe I'm just confused, but I think the difficulty arises in that the original 2007 and 2009 papers use two different meanings for dCov. The 2007 paper says on p. 2772: "The distance covariance (dCov) between random vectors X and Y with finite first moments is the nonnegative number V(X, Y) defined by V^2(X, Y) = ...." Likewise, the 2009 paper says on pages 1236-7 "the distance covariance (dCov) statistic, derived in the next section, is the square root of V^2 ...."

But then for a while in the 2009 paper they use a different definition of dCov: on p. 1238 it says "This new notion Cov_U(X, Y) contains as distinct special cases distance covariance V^2(X, Y)...." Six lines later it says "A surprising result develops: the Brownian covariance is equal to the distance covariance" and later in that paragraph it says "we arrive at Cov_W(X, Y) = V^2(X, Y)." But then on p. 1241 it says "The distance covariance (dCov) between random vectors X and Y with finite first moments is the nonnegative number V(X, Y) defined by V^2(X, Y) = ...", which appears to have been cut and pasted from the above quote in the 2007 paper. On p. 1249 it says "the Brownian covariance of X and Y is defined by W^2(X, Y) = ...", but it appears to mean that it is defined as the square root of this. Then on p. 1250 it says "The surprising coincidence: W = V", implying that both dCov and Brownian covariance are the positive square roots of V^2 and W^2.

So I'm confused. I hope you're able to sort all this out so as to use a consistent notation in the Wikipedia article. Duoduoduo (talk) 18:27, 4 February 2011 (UTC)


 * Yes, as you noticed, the notation in this Wikipedia article was not quite consistent with the notation in the 2007 and 2009 papers, and these recent changes are mainly meant to make the notation consistent. Concerning other notational matters in the Brownian covariance part, in SR2009 pp. 1248-1249 the Brownian covariance is defined in (3.4) and (3.6). In (3.4) it is stated that Brownian covariance is defined by its square W^2(X, Y), which parallels the definition of distance covariance in both papers. In (3.6), "Brownian covariance is defined by ..." (equation 3.6, with W^2(X, Y)). When reading the two pages together it makes sense, but on p. 1249 it would be clearer if it said "Brownian covariance W is defined by ..." or "is defined as the square root of ..." as you wrote here. Your sentence that "The surprising coincidence: W = V" implies that both dCov and Brownian covariance are the positive square roots of V^2 and W^2 summarizes it well. Mathstat (talk) 19:28, 4 February 2011 (UTC)
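For reference, the convention that the two comments above settle on can be written compactly. This is only a restatement of what is quoted in this thread, not a new derivation; the calligraphic symbols stand for the papers' V and W.

```latex
\[
\operatorname{dCov}(X,Y) = \mathcal{V}(X,Y) = \sqrt{\mathcal{V}^2(X,Y)}, \qquad
\mathcal{W}(X,Y) = \sqrt{\mathcal{W}^2(X,Y)}, \qquad
\mathcal{W}(X,Y) = \mathcal{V}(X,Y).
\]
```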

Edits to Definitions, and miscellaneous
- Sorry I made major edits without posting here! I'll do so in the future. This article is great, and I'm just thinking of ways to improve it!

- I think Definitions needs to be edited for a more layperson audience (i.e., non-theoretical statisticians). Presumably, most readers are familiar with statistics, and want to know (1) the intuition behind distance covariance and (2) how to compute it. The current article has rather obscure notation (granted, taken from Szekely and Rizzo, 2009), but perhaps using "D" to denote the distance matrix and "R" to denote the re-centered distance matrices would be more reader-friendly. Also, the definition of dCov^2 in the equation below "One can show that this is equivalent to the following definition:" is stated without any intuition. This should be moved to a later section for readers who want more details about dcov (i.e., this equation is derived by starting with a norm difference between distributions). My edits (drbabinski) try to clean up the notation and make things more straightforward for a layperson reader (although much can be improved), without removing the previous definitions.

- The picture with the different data sets and a dcorr value is misleading. It is unclear how to interpret dcorr values, and any claim that one relationship has a larger dcorr than another should be interpreted carefully, taking into account the number of samples and variables. This differs from Pearson correlation, whose value is interpretable.

- Can the "Problems with the article" section below (in Talk) be archived? Are those problems resolved?

Drbabinski (talk) 18:37, 24 January 2018 (UTC)


 * I hope you don't mind me moving your thread to the right place, which in the WP environment is the bottom.
 * I do not have any objections to your intentions, but I consider this talk page not yet bloated enough to justify archiving. Purgy (talk) 07:26, 25 January 2018 (UTC)
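The "how to compute it" point raised in this thread, using the suggested "D" (distance matrix) and "R" (re-centered distance matrix) notation, can be sketched as below. This is an illustrative sketch only, assuming Euclidean distances and the standard definition of sample distance correlation; the function names `distance_matrix`, `recenter` and `dcor` are hypothetical and are not taken from the article or the papers.

```python
import numpy as np

def distance_matrix(z):
    """D: pairwise Euclidean distances between the observations in z."""
    z = np.asarray(z, dtype=float)
    z = z.reshape(z.shape[0], -1)
    return np.linalg.norm(z[:, None, :] - z[None, :, :], axis=-1)

def recenter(D):
    """R: D with row means and column means subtracted, grand mean added."""
    return D - D.mean(axis=0, keepdims=True) - D.mean(axis=1, keepdims=True) + D.mean()

def dcor(x, y):
    """Sample distance correlation; defined as 0 if either dVar is 0."""
    Rx = recenter(distance_matrix(x))
    Ry = recenter(distance_matrix(y))
    dcov2 = (Rx * Ry).mean()      # squared sample distance covariance
    dvar2_x = (Rx * Rx).mean()    # squared sample distance variance of x
    dvar2_y = (Ry * Ry).mean()    # squared sample distance variance of y
    denom = np.sqrt(dvar2_x * dvar2_y)
    if denom == 0.0:
        return 0.0
    # max() guards against tiny negative values from floating-point rounding.
    return float(np.sqrt(max(dcov2, 0.0) / denom))

x = np.array([0.0, 1.0, 3.0, 7.0])
print(dcor(x, 2.0 * x + 1.0))  # y is a linear function of x
print(dcor(x, np.ones(4)))     # y is constant
```

A sample paired with a linear transform of itself gives a value of 1 (up to rounding), and pairing with a constant sample gives 0 by the convention coded above.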