K-means test data


Vasia Kalavri

Jun 4, 2014, 8:13:48 AM
to stratosp...@googlegroups.com
Hello all,

I have a question about the data used for the k-means tests (KMeansData.class).
Is there a description of the datasets somewhere, or did you create them from scratch?
Basically, I'm looking for data to test a k-means Giraph implementation, so if you have any pointers, please let me know!

Thanks,
V.

Fabian Hueske

Jun 4, 2014, 8:18:25 AM
to stratosp...@googlegroups.com
Hi,

The k-means data set was created with the KMeansDataGenerator.
The generator is really simple: it chooses some points as centers and distributes data points around these centers with a Gaussian distribution.
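
For anyone who wants similar data outside of Stratosphere, the idea described above can be sketched roughly as follows. This is not the KMeansDataGenerator code itself, just a minimal illustration in Python. The function name generate_kmeans_data, its parameters, the 2-D points, and the uniform choice of centers are all assumptions made for the example.

```python
import random


def generate_kmeans_data(num_centers, points_per_center,
                         stddev=0.5, value_range=100.0, seed=42):
    """Hypothetical helper: Gaussian clusters around random 2-D centers.

    All names and defaults are illustrative assumptions, not taken from
    the real KMeansDataGenerator.
    """
    rng = random.Random(seed)  # fixed seed keeps test data reproducible

    # Step 1: choose some points as centers (assumed uniform in a square).
    centers = [
        (rng.uniform(-value_range, value_range),
         rng.uniform(-value_range, value_range))
        for _ in range(num_centers)
    ]

    # Step 2: distribute data points around each center with a
    # Gaussian distribution, independently per coordinate.
    points = []
    for cx, cy in centers:
        for _ in range(points_per_center):
            points.append((rng.gauss(cx, stddev), rng.gauss(cy, stddev)))

    # Shuffle so points from the same cluster are not adjacent in the output.
    rng.shuffle(points)
    return centers, points


if __name__ == "__main__":
    centers, points = generate_kmeans_data(num_centers=3, points_per_center=5)
    for x, y in points:
        print(f"{x:.2f} {y:.2f}")
```

Because the true centers are returned alongside the points, a k-means implementation can be checked by comparing the centers it finds against them.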

Fabian



Vasia Kalavri

Jun 4, 2014, 8:28:31 AM
to stratosp...@googlegroups.com
Thanks Fabian! That's very useful!

V.