Clustering algorithm Mathematica 7

Jan Baetens

Nov 20, 2009, 6:50:12 AM

Hi all,

Currently, I'm using the built-in clustering algorithm of Mathematica 7,
though it isn't clear to me which algorithm it actually is, since this
is not mentioned in the extended help pages. Presumably it's standard
k-means clustering, but I'm not sure.

As such, I'd like to know whether anyone knows which implementation is
used in Mathematica 7 for data clustering.

Thanks,

Jan

--
ir. Jan Baetens

Ghent University
Department of Applied Mathematics, Biometrics and Process Control
Coupure Links 653
B-9000 Gent
Belgium

jan.b...@UGent.be
http://users.ugent.be/~jbaetens/

tel: ++32 (0)9 264 59 31
fax: ++32 (0)9 264 62 20

Darren Glosemeyer

Nov 21, 2009, 3:42:28 AM

Jan Baetens wrote:
> Hi all,
>
> Currently, I'm using the built-in clustering algorithm of Mathematica 7,
> though it isn't clear for me which algorithm it actually is since this
> is not mentioned in the extended help pages. Presumably, it's the normal
> K-means clustering but I'm not sure.
>
> As such, I'd like know whether someone knows which implementation is
> used in Mathematica 7 for data clustering.
>
> Thanks,
>
> Jan
>
>

The default is k-medoids. Agglomerative clustering is also available as a
method option. A brief discussion of the methods is included in the
documentation, which can be found by entering

tutorial/PartitioningDataIntoClusters

in the Documentation Center or online at

http://reference.wolfram.com/mathematica/tutorial/PartitioningDataIntoClusters.html
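
As a minimal sketch of switching between the two methods mentioned above: the sample data are made up for illustration, and the setting names "Optimize" (the k-medoids method) and "Agglomerate" are the ones used in the tutorial for the versions discussed in this thread, so they may differ in other releases.

```mathematica
(* Hypothetical one-dimensional sample data, chosen only for illustration *)
data = {1.0, 1.2, 0.9, 5.1, 5.3, 4.8, 9.7, 10.1};

(* Default method; per the reply above this is k-medoids *)
FindClusters[data, 3]

(* The same request with the method named explicitly *)
FindClusters[data, 3, Method -> "Optimize"]

(* Agglomerative (hierarchical) clustering instead *)
FindClusters[data, 3, Method -> "Agglomerate"]
```

Comparing the three results on your own data is a quick way to see whether the choice of method matters for your problem.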

Here are some references about the methods that you might also find useful:

L. Kaufman and P. J. Rousseeuw, Finding Groups in Data: An Introduction
to Cluster Analysis, New York: John Wiley & Sons, 1990.

P. J. Rousseeuw, "Silhouettes: A Graphical Aid to the Interpretation and
Validation of Cluster Analysis," J. Comput. Appl. Math., 20, 1987, 53–65.

R. Tibshirani, G. Walther, and T. Hastie, "Estimating the Number of
Clusters in a Dataset Via the Gap Statistic." Stanford Univ. Tech.
report. March 2000. (published Journal of the Royal Statistical Society,
B, 63, 2001, 411–423.)

Darren Glosemeyer
Wolfram Research

martin....@gmail.com

Dec 16, 2012, 1:06:21 AM

Hi,

I have started using the clustering methods in Mathematica 9 and would like to know whether the FindClusters function with the "Optimize" method still uses k-medoids in Version 9. Could you please specify which methods or publications the initial seed choice is based on? (The documentation states that the algorithm "starts by building a set of k representative objects".)
What lies behind the "KMeans" option in ClusteringComponents? Is it just an implementation of the plain k-means algorithm with a random choice of initial seeds, or is there some seed selection process as a first step?
Thanks very much,

Martin

Martin Lottner
Technische Universität München
Department of Physics
James-Franck-Str. 1
85748 Garching
Germany

Matthias Odisio

Dec 20, 2012, 3:20:45 AM

Hello,

On 12/16/12 12:06 AM, martin....@gmail.com wrote:
> Hi,
>
> I have started using the clustering methods in Mathematica 9 and
> I'd like to know whether the "FindClusters" function with the
> "optimization" method still uses k medoids in Version 9.

Yes it does.

> Could you please specify on which methods/publications the initial
> seed choice is based? (The documentation states that the algorithm
> "starts by building a set of k representative objects")
> What lies behind the "KMeans" option in ClusterComponents? Is this
> just an implementation of the naked k means algorithm with a
> random choice of initial seeds or is there some seed selection
> process involved as a first step?

I would think it's a random process. You can control it using the
"RandomSeed" (sub-) option:

FindClusters[l, n, Method -> {"Optimize", "RandomSeed" -> s}]
ClusteringComponents[l, n, "RandomSeed" -> s]
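
A minimal sketch of checking that a fixed seed gives repeatable results, assuming the "RandomSeed" sub-option behaves as described above in your version. The data, the number of clusters, and the seed value are arbitrary placeholders.

```mathematica
SeedRandom[1];                 (* makes the sample data itself reproducible *)
l = RandomReal[1, {50, 2}];    (* hypothetical data: 50 random points in the plane *)
n = 3;                         (* hypothetical number of clusters *)
s = 42;                        (* arbitrary seed value *)

(* Two runs with the same seed, using the sub-option syntax shown above *)
c1 = FindClusters[l, n, Method -> {"Optimize", "RandomSeed" -> s}];
c2 = FindClusters[l, n, Method -> {"Optimize", "RandomSeed" -> s}];

(* True means both runs produced the same partition *)
c1 === c2
```

Repeating the comparison with two different values of s should show how sensitive the partition is to the initial choice.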

Would the bottom half of this tutorial provide more helpful information?
http://reference.wolfram.com/mathematica/tutorial/PartitioningDataIntoClusters.html

Matthias Odisio
Wolfram Research