Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Chebyshev distance for K-means

577 views
Skip to first unread message

Sandro Saitta

unread,
May 1, 2007, 4:43:50 AM5/1/07
to
Hello,

I want to try k-means with the chebyshev distance. I know it is
possible to use euclidean and city block distances but what about
chebyshev? Is there a way to use this distance with the standard
k-means matlab function?

Thanks.
Sandro.

Peter Perkins

unread,
May 1, 2007, 6:22:11 AM5/1/07
to

K-means (the algorithm) requires two things:

1) a distance measure, and
2) a way to compute the centroid of a cluster so as to minimize the sum of
point-to-centroid distances.

There are lots of distance measures, but few have a simple computation for the
centroid. For example, the default distance used by KMEANS (the function) is
most definitely NOT "Euclidean", but rather "squared Euclidean", because for the
latter, the centroid is the arithmetic mean, while for the former, it is a
difficult computation requiring an iterative solution (the calculation has a
hyphenated name attached to it that escapes me).

That's why no Chebyshev distance in KMEANS.

That being said, there is a "coordinate-free" version of K-means (the algorithm)
that does not require a centroid calculation. However, it is not the standard
thing that most people want, and requires a distance matrix rather than raw
data, limiting its use for large datasets (which is what most people use K-means
for). Is that of interest to you?

It's certainly also possible to modify KMEANS to use Chebyshev distance and some
unsuitable centroid calculation, but you're on your own there.

- Peter Perkins
The MathWorks, Inc.

Sandro Saitta

unread,
May 1, 2007, 8:22:02 AM5/1/07
to
Thanks for your answer. I'm not sure to understand all your
explanations. The Manhattan, Euclidean (more precisely SqEuclidean)
and Chebyshev are three special cases of the Minkowski distance.

If p is the parameter of the Minkowski distance, then Euclidean
distance corresponds to p=2, Manhattan to p=1 and Chebyshev to
P=infinity.

I base myself on a paper which compares clustering with Euclidean,
Manhattan and Chenyshev distances for clustering using SOM (that why
I thought of comparing Chebyyshev with sqEuclidean). In a book by
Webb, Chebyshev is defined as:

d(x,y) = sum(abs(x_i - y_i))

Starting from this, I don't understand why it could not simply
replace the standard sqEuclidean distance.

Thanks in advance for your help.
Regards.

Peter Perkins

unread,
May 1, 2007, 11:20:57 AM5/1/07
to

What's the formula to compute the centroid? For example, the centroid for city
block distance is NOT the componentwise arithmetic mean, it's the componentwise
median. If you can compute an appropriate centroid easily for the Chebyshev
distance, then you're on your way. I haven't thought about this at all, and
there may an obvious answer for the centroid.

K-means partitions your data. The criterion for the partition is that the sum
of the point-to-centroid distances is minimized over all partitions. It makes
little or no sense to find a partition that minimizes that criterion if the
centroid itself is not the point that minimizes the point-to-centroid distances
within a cluster.

Sandro Saitta

unread,
May 2, 2007, 4:06:03 AM5/2/07
to
Thanks for your help. I didn't see the problem like this. Therefore,
it means that if I want to compare K-means with different distance
function I should choose measures such as Euclidean, Manhattan and
Correlation for example.

alex kullik

unread,
May 12, 2008, 9:54:03 AM5/12/08
to
Peter Perkins <Peter.Perki...@mathworks.com> wrote
in message <f174cj$t80$1...@fred.mathworks.com>...

I need this "coordinate-free" version of kmeans! Where can i
get it?

Ashraful Alam

unread,
Sep 16, 2013, 11:39:06 AM9/16/13
to
"Sandro Saitta" wrote in message <ef55...@webcrossing.raydaftYaTP>...

Ashraful Alam

unread,
Sep 16, 2013, 11:44:07 AM9/16/13
to
hello sir,
i am making my project on image segmentation using fcm clustering method. i have read all the discussion described above. i am intereted to know that " Can chebyshev distance measure be applied on fcm clustering method, if so then what will be the matlab funtion for fcm clsterin method?

thank you sir

yashasho...@gmail.com

unread,
May 14, 2018, 2:56:09 AM5/14/18
to
Hi ,
I still did not intuitively understand, why the centroid calculation should be depend on the distance measure used.Why it can still be the usual centroid calculation. Can you explain with a example ?
0 new messages