SyntheticControlHClust

David Cabanillas

Feb 4, 2014, 9:43:04 AM

to jmotif-...@googlegroups.com

I'm trying to work with jmotif. I have been testing SyntheticControlHClust, but I have some problems.

1) The data in the example consists of 6 classes × 50 examples per class.

2) Why is this necessary?

if (classLabel.equalsIgnoreCase("2") || classLabel.equalsIgnoreCase("3")
    || classLabel.equalsIgnoreCase("6")) {
  continue;
}

3) And why this?

for (double[] series : e.getValue()) {
  skip++;
  // if (skip < 0) {
  if (skip < 15) {
    continue;
  }

4) I don't understand why preRes and tfidf have size = 9.

Thanks.

Pavel Senin

Feb 5, 2014, 3:09:33 PM

to jmotif-discuss

Hi David:

Thank you for your interest.

You are probably looking at the code where I was trying to figure out the best possible clustering example; for that I excluded some classes, and some of the series within the remaining classes.

The reason is that the SAX-VSM approach, while it works for clustering this dataset, suffers from a few issues. One is that the series are somewhat too short to generate "good" discriminating patterns that would help with clustering. Another is that the noise level is too high in some series, which also introduces some confusion for the algorithm.

Overall, from my experience with this dataset, the algorithm is too sensitive, so its performance is unstable. The tf*idf part aggressively lowers the weights of patterns that appear in multiple classes; so sometimes, when the noise is too high and a "good" discriminating word for one class appears in the others just by chance, it "cancels out" that pattern. I was experimenting with thresholds on the frequencies, but didn't finish the exploration.
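To make the "cancelling out" effect concrete, here is a minimal sketch. It is not the jmotif code: the class name TfIdfSketch, the method weight, and all counts are hypothetical, and it assumes the log-scaled form weight = log(1 + tf) * log(N / df), where tf is the count of a word in one class, df is the number of classes containing the word, and N is the number of classes.

```java
public class TfIdfSketch {

    // Assumed weighting: log(1 + tf) * log(N / df).
    // termFreq   - occurrences of the word in the class of interest
    // docFreq    - number of classes in which the word occurs at least once
    // numClasses - total number of classes (N)
    public static double weight(int termFreq, int docFreq, int numClasses) {
        if (termFreq == 0 || docFreq == 0) {
            return 0.0; // the word is absent, so it carries no weight
        }
        double tf = Math.log(1.0 + termFreq);
        double idf = Math.log((double) numClasses / docFreq);
        return tf * idf;
    }

    public static void main(String[] args) {
        int numClasses = 3; // hypothetical: three classes left after filtering

        // Clean case: the word occurs 20 times in one class and nowhere else.
        double clean = weight(20, 1, numClasses);

        // Noisy case: the same word also shows up by chance in the other
        // two classes, so df == N and the idf factor becomes log(1) = 0.
        double noisy = weight(20, 3, numClasses);

        System.out.println("weight, word unique to the class: " + clean);
        System.out.println("weight, word seen in every class: " + noisy);
    }
}
```

Under this assumed formula a single chance occurrence in each of the other classes is enough to drive the weight to zero, however frequent the word is in its "own" class. A frequency threshold of the kind mentioned above would ignore such rare occurrences when computing df.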
