Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Wasserstein Distance

149 views
Skip to first unread message

Harm Olthof

unread,
Jul 14, 2018, 3:32:59 PM7/14/18
to
Hello Group,
Does anyone know of a Tcl implementation for the Wasserstein Distance? (a similarity metric), aka "Earth Mover's Distance"?

Thanks,
Harm Olthof

Arjen Markus

unread,
Jul 16, 2018, 2:59:13 AM7/16/18
to
I know of none, but then I am not familiar (yet (*)) with the concept.

Regards,

Arjen

(*) I have printed the Wikipedia explanation ... intriguing, especially as it also refers to the Fokker-Planck equation, but that is besides the point. Might be a nice addition to the statistics package in Tcllib.

Arjen Markus

unread,
Jul 18, 2018, 6:47:33 AM7/18/18
to
I have found a paper or two on efficient algorithms to calculate it. Not entirely trivial, but not entirely impossible either by the looks of it.

Regards,

Arjen

Harm Olthof

unread,
Jul 22, 2018, 5:31:23 PM7/22/18
to
Thank you for looking into this, Arjen: it sounds promising. In the meantime I found that for normalized histograms, the calculation is actually almost trivial: take the sum of the absolute difference of the two cumulative distributions, as is explained here: https://math.stackexchange.com/questions/714476/how-do-you-compute-numerically-the-earth-movers-distance-emd

An implementation for the more general case would still be very interesting. If you see chance to make a start with what you found and get it in Tcllib that would be great. If I can be of any assistance, let me know.
Harm Olthof

Arjen Markus

unread,
Jul 23, 2018, 3:03:01 AM7/23/18
to
Ah, thanks - the general case seems to be solvable via a suitable reformulation as minimisation problem with linear constraints. That makes it much more feasible, though I am still struggling with understanding the details ;).

See for instance: https://arxiv.org/abs/1609.07092

Regards,

Arjen

Arjen Markus

unread,
Jul 24, 2018, 2:27:51 AM7/24/18
to
I realised yesterday that this can be used not only for probability densities but also for other patterns. Of course the various notes and papers do refer to that but not in terms that registered with me ;). It seems a practical solution to a problem that has puzzled me for many years. And the contours of how to go about it are beginning to form.

Regards,

Arjen

Arjen Markus

unread,
Jul 24, 2018, 3:02:11 PM7/24/18
to
Okay, the one-dimensional version is almost trivial indeed. I have put a simple implementation on the Wiki.

Regards,

Arjen

Harm Olthof

unread,
Jul 26, 2018, 3:30:55 AM7/26/18
to
Hello Arjen,
Thanks, that looks great! I will study your implementation and start using it :-)
Regards,
Harm Olthof
0 new messages