Talk:Wasserstein metric

Error in equation
These equations in

"...the tests satisfy the following properties"

$${\displaystyle {\begin{aligned}\int \gamma (x',x)\mathrm {d} x'&=\mu (x)\\\int \gamma (x,x')\mathrm {d} x'&=\nu (x)\end{aligned}}} $$

"That is, that the total mass..."

Should say:

$${\displaystyle {\begin{aligned}\int \gamma (x,x')\mathrm {d} x'&=\mu (x)\\\int \gamma (x',x)\mathrm {d} x'&=\nu (x)\end{aligned}}} $$

because the first equation describes the mass moved out of x, and the second equation describes the mass moved into x.
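
A small discrete check may make the two conventions easier to compare. This is only a sketch under stated assumptions: a hypothetical three-point space, with `gamma[i][j]` read as the mass moved from point `i` to point `j` (the numbers are invented for illustration). Under that reading, summing over the second index gives the mass leaving each point, and summing over the first index gives the mass arriving at each point.

```python
# Hypothetical transport plan on three points (illustrative numbers only):
# gamma[i][j] = mass moved from point i to point j.
gamma = [
    [0.2, 0.1, 0.0],
    [0.0, 0.3, 0.1],
    [0.0, 0.0, 0.3],
]

# Summing over the destination index j gives the mass leaving each source point,
# i.e. the discrete analogue of integrating gamma(x, x') over x'.
mu = [sum(row) for row in gamma]

# Summing over the source index i gives the mass arriving at each destination,
# i.e. the discrete analogue of integrating gamma(x', x) over x'.
nu = [sum(col) for col in zip(*gamma)]

print("mass out of each point (mu):", mu)
print("mass into each point  (nu):", nu)
```

With this index convention the row sums play the role of $$\mu$$ and the column sums the role of $$\nu$$; swapping the roles of the two arguments swaps which marginal is which.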

Earth mover's distance
It seems that this concept was reinvented by computer scientists in the late 1980s, who have been calling it "Earth mover's distance" (EMD) ever since. In 2001 it was pointed out that EMD is very similar, if not identical, to the Wasserstein metric. Perhaps the two articles should be merged?


 * Cedric Villani (the Fields Medalist), in his seminal 2009 work on optimal transport, says that Wasserstein distance is still the dominant usage, but earth mover's distance is rapidly catching up. In the CS community EMD is clearly the dominant one. This merits two articles in Wikipedia, one with a Computer Science focus, one with a Mathematics focus. An average CS person will not want to read the Math page and possibly vice versa. — Preceding unsigned comment added by 129.27.155.34 (talk) 11:59, 1 September 2011 (UTC)

Note however that the EMD is typically used for distributions over a discrete region M (e.g. the bins of a histogram, or the pixels of an image), so the "dirt moving plan" is a matrix rather than a 2D continuous distribution. Also, in many applications of the EMD, one must allow for "dirt" to be created or destroyed (at a specified penalty cost), rather than transported. Is that reason enough to keep the articles separate? --Jorge Stolfi (talk) 19:17, 28 October 2008 (UTC)


 * If that is the difference, then it is not clear from the EMD page. However continuous and discrete distributions are distinct enough to warrant separate articles. Wqwt (talk) 23:18, 14 February 2022 (UTC)
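
To illustrate the remark that in the discrete setting the "dirt moving plan" is a matrix, here is a minimal sketch. It assumes a very restricted case: histogram bins on a line with unit spacing, equal total mass, cost |i - j|, and no creation or destruction of dirt. In that one-dimensional case the monotone ("north-west corner") plan is optimal; a general ground distance would need a linear-programming solver instead. The function name `monotone_plan` is made up for this example.

```python
def monotone_plan(mu, nu, eps=1e-12):
    """Return plan[i][j] = mass moved from bin i to bin j (north-west corner rule).

    Assumes mu and nu are histograms on the same ordered bins with equal total mass.
    """
    mu, nu = list(mu), list(nu)  # work on copies; they hold the remaining mass
    plan = [[0.0] * len(nu) for _ in mu]
    i = j = 0
    while i < len(mu) and j < len(nu):
        moved = min(mu[i], nu[j])  # ship as much as both bins allow
        plan[i][j] += moved
        mu[i] -= moved
        nu[j] -= moved
        if mu[i] <= eps:  # source bin i is empty: move to the next source
            i += 1
        if nu[j] <= eps:  # destination bin j is full: move to the next destination
            j += 1
    return plan


mu = [0.5, 0.5, 0.0]
nu = [0.0, 0.5, 0.5]
plan = monotone_plan(mu, nu)

# Total cost of the plan with ground distance |i - j| between bins.
emd = sum(plan[i][j] * abs(i - j)
          for i in range(len(mu)) for j in range(len(nu)))
print(plan)
print("EMD:", emd)
```

Here every unit of mass shifts one bin to the right, so the plan matrix has its mass on the superdiagonal.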

The quote "Because of this analogy, the metric is known in computer science as the earth mover's distance." is certainly wrong. I doubt computer scientists called it this because of some existing analogy. — Preceding unsigned comment added by 64.118.25.227 (talk) 18:15, 8 September 2014 (UTC)


 * I agree with the reasoning of 129.27.155.34 and that it merits two separate articles. Additionally, the article on the example is much longer than the one on the principal subject, which I think is another reason for keeping both articles. Regards, Kmw2700 (talk) 12:28, 29 September 2015 (UTC)


 * I also agree with the reasoning of 129.27.155.34. However, with this reasoning the emphasis of *this* article must be changed to being more mature mathematically, reflecting the differing roles of the two articles. A suggestion: the second paragraph of the introduction about the EMD should be removed, since it covers only the p=1 case and even then is misleading, as the infimum might not be achieved. It should be replaced by a brief explanation of the association with EMD. In general this article should be more about mathematical properties, while pointing to the EMD article for information concerning computation and applications. SymplectoJim (talk) 09:58, 17 March 2019 (UTC)

Gaspard Monge
The Earth mover's distance article attributes the idea to Gaspard Monge in 1781. Is the attribution legitimate? --Jorge Stolfi (talk) 19:17, 28 October 2008 (UTC)

Yes, you can also check the page of Gaspard Monge for further references. --Joannes Vermorel (talk) 22:55, 8 February 2009 (UTC)

Wasserstein vs. Vasershtein
Rather obviously, the Cyrillic lettering is a transcription of the German "Wasserstein". Therefore, it is quite natural to write "Wasserstein" instead of the more formal "Vasershtein" transliteration in Latin script.--Trigamma 22:03, 15 January 2007 (UTC)


 * As I understand it, "Vasershtein" is a form of transliteration while "Wasserstein" is a form of transcription. The latter is not a way of converting the Cyrillic writing into Latin characters but an attempt to write the name, being of German origin, in Latin characters according to German rules. But I don't insist.--Trigamma 06:49, 16 January 2007 (UTC)


 * This is an interesting one. :-) The name is of Germanic origin, so somehow "Wasserstein" (or possibly "Waßerstein") seems to be the "correct" spelling. The issue is complicated by the fact that, although his name was of Germanic origin, Vasershtein/Wasserstein was himself Russian! So now we are in the situation of discussing someone whose name should be written in Cyrillic characters as a transliteration of the original German name. My original reference to transliteration in the article was to the problem of transliterating from the Russian, not into it; my point there was that a German would simply transliterate back to the original "Wasserstein", whereas another reader might transliterate the Cyrillic characters according to some correspondence between Russian and English phonemes to get "Vasershtein". It's a little bit of a tangled web: perhaps we can come up with some suitable explanation to include in the article?


 * By the way, thank you for your civil words. After witnessing several edit wars in recent days and weeks, this is a breath of fresh air. Sullivan.t.j 10:46, 16 January 2007 (UTC)


 * I am German myself. Maybe the problem is that in German the meaning of the word "transliteration" is more restricted than in English. It is only used when there is a one-to-one correspondence between the original and the transliteration, so that the original spelling can be reconstructed from the transliteration. In all other cases, the word "Transcription" is used in German. --Trigamma 16:43, 16 January 2007 (UTC)


 * I see. Interesting. At least according to my understanding, the correspondence "Ш"<-->"sh" would still count as transliteration in the English usage of the word, even though the number of characters is different in each string. Sullivan.t.j 17:12, 16 January 2007 (UTC)


 * The same in German, I think. But writing "Wasserstein" because of the German origin would certainly not be called a transliteration. At the very least, the Ш would have to be written as "sch". But I am a mathematician, not an expert in linguistics.--Trigamma 17:25, 16 January 2007 (UTC)


 * Coming back to the article. What about:
 * "The usage of "Wasserstein" can be attributed to the fact that the name "Vasershtein" is of Germanic origin. "Wasserstein" is the German spelling." --Trigamma 17:32, 16 January 2007 (UTC)

This discussion seems absurd to me, in German, There are one or several !! So I guess only the German spelling could be preserved. Would you write on Zermelo's page Ernst Zermelo (or Tsermelo) ?? All mathematicians use Wasserstein...


 * This discussion to me looks a little beside the point. If it is about how to correctly spell the Wasserstein distance in Wikipedia, then the answer is obvious: It should be Wasserstein because this is by far the dominant usage. If it is about how to correctly spell the person, the answer is also obvious: Just look up his spelling on his web page (http://www.math.psu.edu/vstein), he is a professor at Penn State University. Wikipedia correctly does a redirect between different spellings. — Preceding unsigned comment added by 129.27.155.34 (talk) 12:08, 1 September 2011 (UTC)

Since this is named after an individual who has had his own webpage, it is worth noting that in the United States he does not use Wasserstein or Vasershtein or even Vaseršteĭn but Vaserstein. But he did not name the metric. 09:53, 23 September 2018 (UTC)

Hutchinson Metric
Is it just me, or is the 1st Wasserstein metric precisely the same thing as the Hutchinson metric? (See M. F. Barnsley, "Fractals Everywhere") 217.44.217.128 20:48, 31 January 2007 (UTC)
 * I suppose you are speaking of the Wasserstein metric $$W_1$$. As mentioned in the text, this metric is only a special case of the whole family of Wasserstein metrics. As far as I know, Hutchinson uses this metric in his work on fractals. He himself calls it the "Monge-Kantorovitch metric" (according to my lecture notes).


 * As far as I know the name Wasserstein metric is more widely used for $$W_2$$.--Trigamma 19:18, 12 February 2007 (UTC)


 * That would explain why Barnsley uses that name in his new fractal geometry book. Thanks for the pointer. 217.44.217.128 18:52, 15 February 2007 (UTC)

Conversion to mathematics format
The text is atrocious in that it is left as LaTeX. It needs to be cleaned up and presented in a readable format. —The preceding unsigned comment was added by 71.132.145.120 (talk) 16:26, 19 March 2007 (UTC).

Lip - Changes by Doetoe
As far as I can see, the "corrections" made by Doetoe are wrong. I have removed them. --Trigamma 08:09, 6 June 2007 (UTC)

The norm called the "bounded Lipschitz norm" by Doetoe is better known as the $$C^{0,1}$$ norm. It is different from the optimal Lipschitz constant Lip(f). With Lip(f) replaced by the $$C^{0,1}$$ norm, one does not get the $$W_1$$ Wasserstein metric but the so-called "flat metric" (cf. Herbert Federer, Geometric Measure Theory, Springer Verlag, p. 367, paragraph 4.1.12).--Trigamma 08:23, 6 June 2007 (UTC)

Thanks, you're totally right. Doetoe 19:45, 30 August 2007 (UTC)

Changes by 144.92.237.192
First, please understand that the talk pages are here for a reason. Yes, Wikipedians make mistakes sometimes, but the place to sort them out is the talk page, not the article text itself. On a more mathematical note, the earth-mover's distance is indeed the L1 Wasserstein distance, also known as W1 and by many other names. It is not the L2 Wasserstein distance (a.k.a. W2) because the cost of moving a bit of earth is proportional to the distance moved, not the square of the distance moved. You are right that the distance in use on M might well be the Euclidean ($$\ell^2$$) distance (if M is a subset of $$\mathbb{R}^n$$), but that doesn't make the earth-mover's distance on measures on M L2-ish. Sullivan.t.j (talk) 01:17, 14 November 2009 (UTC)
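
The distinction between cost proportional to distance and cost proportional to squared distance can be seen in a tiny one-dimensional example. This is a sketch under stated assumptions: two samples of equal size on the real line, each point carrying equal weight, in which case matching the sorted samples gives the optimal transport for any p ≥ 1. The function name `wasserstein_1d` is invented for this illustration.

```python
def wasserstein_1d(xs, ys, p):
    """W_p between two equally weighted samples of the same size on the real line.

    In one dimension the optimal plan pairs the sorted samples, so
    W_p^p is the mean of |x_(i) - y_(i)|^p.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes samples of equal size")
    pairs = zip(sorted(xs), sorted(ys))
    return (sum(abs(x - y) ** p for x, y in pairs) / len(xs)) ** (1.0 / p)


# All mass sits at 0; half of it has to be moved to 2.
xs = [0.0, 0.0]
ys = [0.0, 2.0]

w1 = wasserstein_1d(xs, ys, p=1)  # average distance moved
w2 = wasserstein_1d(xs, ys, p=2)  # root mean square of the distance moved
print("W1 =", w1)
print("W2 =", w2)
```

The ground distance on the line is the same in both calls; only the exponent applied to it changes, and that is what separates W1 (the earth mover's cost) from W2.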

Similar to Cramér–von Mises criterion
It looks like this test uses the Wasserstein metric with p=2, but neither article mentions the connection. — Preceding unsigned comment added by 130.234.244.141 (talk) 09:10, 20 May 2017 (UTC)

Notations in "Intuition and connection to optimal transport" section
In this equation, $$\gamma$$ denotes both a function $$\gamma : M \times M \to [0,\infty)$$ (the transport plan) and a measure on $$M \times M$$.

$$ \iint c(x,y) \gamma(x,y) \, \mathrm{d} x \, \mathrm{d} y = \int c(x,y) \, \mathrm{d} \gamma(x,y) $$

I think the transport plan should be named something else (as it is in fact the Radon–Nikodym derivative of the measure $$\gamma$$) and the above equation more thoroughly explained. — Preceding unsigned comment added by Johann Laconte (talk • contribs) 15:37, 28 January 2022 (UTC)
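
One possible way to write this with separate symbols is sketched below. It assumes that $$M \subseteq \mathbb{R}^n$$ and that the measure $$\gamma$$ is absolutely continuous with respect to Lebesgue measure on $$M \times M$$; the symbol $$g$$ for the density is introduced here only for illustration and is not taken from the article.

$$ g = \frac{\mathrm{d} \gamma}{\mathrm{d}(x,y)}, \qquad \int_{M \times M} c(x,y) \, \mathrm{d} \gamma(x,y) = \iint_{M \times M} c(x,y) \, g(x,y) \, \mathrm{d} x \, \mathrm{d} y $$

The left-hand integral makes sense for any coupling $$\gamma$$, while the right-hand form is available only when such a density exists (for instance, it is not available when $$\gamma$$ is concentrated on the graph of a transport map).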

Normal distribution example
It would be interesting to know the joint probability distribution that minimises the objective function, but I guess it is the product of the two Gaussians? Or is there no analytical closed form? (not obvious to me from the cited paper) 178.197.207.95 (talk) 06:41, 26 April 2024 (UTC)


 * It is given in equation (2) of the cited reference. (It is Gaussian, but not of product form. Essentially, one has to maximise the correlation between the two components.) Hairer (talk) 08:58, 28 April 2024 (UTC)
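
A one-dimensional illustration of why the product coupling is not the minimiser, offered as a sketch and not as a transcription of equation (2) in the cited reference. It assumes two univariate Gaussians coupled by a jointly Gaussian law with correlation `rho`, for which the expected squared distance is (m1 - m2)² + s1² + s2² - 2·rho·s1·s2. The product of the two Gaussians corresponds to rho = 0, and the cost is smallest at rho = 1. The function name and the numbers are made up for the example.

```python
def expected_sq_cost(m1, s1, m2, s2, rho):
    """E[(X - Y)^2] for jointly Gaussian X ~ N(m1, s1^2), Y ~ N(m2, s2^2)
    with correlation rho between X and Y."""
    return (m1 - m2) ** 2 + s1 ** 2 + s2 ** 2 - 2.0 * rho * s1 * s2


# Illustrative parameters: N(0, 1) and N(1, 2^2).
m1, s1, m2, s2 = 0.0, 1.0, 1.0, 2.0

product_cost = expected_sq_cost(m1, s1, m2, s2, rho=0.0)  # independent coupling
optimal_cost = expected_sq_cost(m1, s1, m2, s2, rho=1.0)  # maximally correlated

print("product coupling cost:", product_cost)
print("rho = 1 coupling cost:", optimal_cost)
print("(m1 - m2)^2 + (s1 - s2)^2 =", (m1 - m2) ** 2 + (s1 - s2) ** 2)
```

At rho = 1 the cost reduces to (m1 - m2)² + (s1 - s2)², the squared 2-Wasserstein distance between one-dimensional Gaussians, so the independent coupling is strictly worse whenever both standard deviations are positive.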