SAXification using the jmotif.lib.jar on a hadoop cluster


vaske maskinsen

Sep 13, 2012, 9:55:56 AM
to jmotif-...@googlegroups.com
Hi everyone,

I am trying to SAXify time series using jmotif.lib.jar on a Hadoop cluster. Locally, the SAX conversion was straightforward and worked well. However, on the Hadoop cluster (I use Apache Pig to talk to it) the required classes are not found:
=> Error: java.lang.ClassNotFoundException: edu.hawaii.jmotif.datatype.TSException
I tried adding all available versions of the jar to my build path, but that makes no difference.
Any suggestions as to what the problem could be? Should I build the jar file myself? How can I do that?

All help is very welcome.
Greets,
Vaske
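
A ClassNotFoundException on the cluster can mean either that the class is missing from the jar or that the jar never reaches the task classpath. The IDE build path only affects local compilation; with Pig, jars are normally shipped via a REGISTER statement in the script. A minimal sketch for the first check, assuming a hypothetical helper class JarClassCheck (not part of jmotif) that looks up the class entry in a given jar:

```java
import java.io.IOException;
import java.util.jar.JarFile;

// Hypothetical diagnostic helper; not part of jmotif, Pig, or Hadoop.
final class JarClassCheck {

  // Convert a binary class name into the entry name used inside a jar.
  static String toEntryName(String className) {
    return className.replace('.', '/') + ".class";
  }

  // Return true if the jar at jarPath contains an entry for the given class.
  static boolean jarContains(String jarPath, String className) throws IOException {
    try (JarFile jar = new JarFile(jarPath)) {
      return jar.getJarEntry(toEntryName(className)) != null;
    }
  }

  public static void main(String[] args) throws IOException {
    if (args.length != 2) {
      System.err.println("usage: java JarClassCheck <jar file> <class name>");
      return;
    }
    System.out.println(jarContains(args[0], args[1]) ? "found" : "missing");
  }
}
```

Running it as `java JarClassCheck jmotif.lib.jar edu.hawaii.jmotif.datatype.TSException` shows whether the class is in the jar at all. If it is, the next thing to check is how the jar is handed to the cluster.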

Pavel Senin

Sep 13, 2012, 10:15:10 AM
to jmotif-...@googlegroups.com
Hi there:

I'm sorry, I didn't put this info online. The code depends on some Hackystat code that I used for convenience. That code is jarred in the lib/hackystat/ folder of the trunk; see it here: <http://code.google.com/p/jmotif/source/browse/#svn%2Ftrunk%2Flib%2Fhackystat>.

Sorry.
--
Mahalo, Pavel.

Pavel Senin

Sep 13, 2012, 10:20:49 AM
to jmotif-...@googlegroups.com
I would guess that it is better for you to build your own jar that packs everything together with the up-to-date code.
You will need to check out trunk, install the needed packages (Ant and JUnit), and run Ant with jar.build.xml: 'ant -f jar.build.xml'. If the code builds, you can then modify jar.build.xml to unjar the Hackystat code into a temp folder and jar it back together with jmotif.

I will commit that script to SVN a bit later, if you can wait.
--
Mahalo, Pavel.
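
The repackaging idea (unjar the dependencies, then jar them back together with the jmotif classes) can also be illustrated in plain Java. This is only a sketch under assumptions, not the Ant script from trunk: JarMerger and its methods are hypothetical names, META-INF entries (manifests, signatures) are simply skipped, and when two jars contain the same entry the first one wins.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.jar.JarEntry;
import java.util.jar.JarInputStream;
import java.util.jar.JarOutputStream;

// Hypothetical helper; not part of jmotif or its build scripts.
final class JarMerger {

  // Keep an entry unless it lives under META-INF/ or an earlier jar already provided it.
  static boolean keepEntry(String name, Set<String> seen) {
    return !name.startsWith("META-INF/") && seen.add(name);
  }

  // Copy the entries of all input jars into one output jar; returns the number of entries written.
  static int merge(List<Path> inputs, Path output) throws IOException {
    Set<String> seen = new HashSet<>();
    byte[] buffer = new byte[8192];
    int copied = 0;
    try (OutputStream out = Files.newOutputStream(output);
         JarOutputStream jarOut = new JarOutputStream(out)) {
      for (Path input : inputs) {
        try (InputStream in = Files.newInputStream(input);
             JarInputStream jarIn = new JarInputStream(in)) {
          JarEntry entry;
          while ((entry = jarIn.getNextJarEntry()) != null) {
            if (!keepEntry(entry.getName(), seen)) {
              continue;
            }
            jarOut.putNextEntry(new JarEntry(entry.getName()));
            int n;
            // read() returns -1 at the end of the current entry
            while ((n = jarIn.read(buffer)) != -1) {
              jarOut.write(buffer, 0, n);
            }
            jarOut.closeEntry();
            copied++;
          }
        }
      }
    }
    return copied;
  }

  public static void main(String[] args) throws IOException {
    if (args.length < 2) {
      System.err.println("usage: java JarMerger <output jar> <input jar>...");
      return;
    }
    List<Path> inputs = new ArrayList<>();
    for (int i = 1; i < args.length; i++) {
      inputs.add(Paths.get(args[i]));
    }
    System.out.println(merge(inputs, Paths.get(args[0])) + " entries written");
  }
}
```

Skipping META-INF keeps the sketch short, but it also drops service descriptors and any Main-Class attribute, so a real build script is the better tool for anything beyond a quick experiment.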

Pavel Senin

Sep 13, 2012, 3:15:16 PM
to jmotif-...@googlegroups.com
Well, you need more jars to build jmotif:

<!-- The compile classpath -->
<path id="compile.classpath">
  <fileset file="${env.JUNIT_HOME}/junit-${junit.version}.jar" />
  <fileset file="${env.FINDBUGS_HOME}/lib/annotations.jar" />
  <fileset file="${env.WEKA_HOME}/weka.jar" />
  <fileset dir="${lib.dir}/hackystat" />
</path>
 
You'll need the FindBugs, Weka and Hackystat libs to compile it.

I've put the script into trunk, so you can build the standalone lib:

===
psenin@t135:~/workspace/jmotif$ ant -f jar.standalone.build.xml 
Buildfile: /home/psenin/workspace/jmotif/jar.standalone.build.xml

clean:
   [delete] Deleting directory /home/psenin/workspace/jmotif/build

compile:
    [mkdir] Created dir: /home/psenin/workspace/jmotif/build/classes
    [javac] Compiling 76 source files to /home/psenin/workspace/jmotif/build/classes

jar:
    [mkdir] Created dir: /home/psenin/workspace/jmotif/tmp
    [unjar] Expanding: /home/psenin/tools/weka-3-7-6/weka.jar into /home/psenin/workspace/jmotif/tmp
    [unjar] Expanding: /home/psenin/workspace/jmotif/lib/hackystat/hackystatlogger.lib.jar into /home/psenin/workspace/jmotif/tmp
    [unjar] Expanding: /home/psenin/workspace/jmotif/lib/hackystat/stacktrace.lib.jar into /home/psenin/workspace/jmotif/tmp
    [unjar] Expanding: /home/psenin/workspace/jmotif/lib/hackystat/tstamp.lib.jar into /home/psenin/workspace/jmotif/tmp
    [unjar] Expanding: /home/psenin/workspace/jmotif/lib/hackystat/time.lib.jar into /home/psenin/workspace/jmotif/tmp
    [unjar] Expanding: /home/psenin/workspace/jmotif/lib/hackystat/hackystatuserhome.lib.jar into /home/psenin/workspace/jmotif/tmp
     [copy] Copying 76 files to /home/psenin/workspace/jmotif/tmp
      [jar] Building jar: /home/psenin/workspace/jmotif/jmotif.standalone.jar
   [delete] Deleting directory /home/psenin/workspace/jmotif/tmp

BUILD SUCCESSFUL
Total time: 10 seconds
psenin@t135:~/workspace/jmotif$ ls -la jmotif.standalone.jar
-rw-rw-r-- 1 psenin psenin 6734084 Sep 13 21:13 jmotif.standalone.jar
===
--
Mahalo, Pavel.

vaske maskinsen

Sep 14, 2012, 4:53:38 AM
to jmotif-...@googlegroups.com
Thank you very much Pavel,

That came really fast, and I think I can manage to build the jar myself now whenever I need to.
Thanks again for so much help!

Greets,
Vaske

Pavel Senin

Sep 14, 2012, 5:13:17 AM
to jmotif-...@googlegroups.com
You are welcome.

Please let me know if things don't work as expected, or if you experience any issues with the algorithm implementation.
In fact, there might be some issues with package naming if you have used the old library. We are currently cleaning up the API and refactoring some code.
--
Mahalo, Pavel.

vaske maskinsen

Oct 24, 2012, 5:51:03 AM
to jmotif-...@googlegroups.com
Hi Pavel,

Somehow I can't manage to build the standalone jar. I tried it in IntelliJ IDEA (my standard IDE) and in Eclipse. In IDEA I managed to resolve the dependencies but got some strange Ant errors, so I thought it was an IDEA-specific problem with Ant.
In Eclipse (which I'm not very used to), I collected and set up all the needed jars, but there seems to be an issue with Weka. I have Weka 3.6.8 installed. Is this the right version, and if not, which one do I need?
Error:

    [javac] jmotif\src\edu\hawaii\jmotif\sax\SAXFactory.java:1100: cannot find symbol
    [javac] symbol  : method subList(int,int)
    [javac] location: class weka.core.Instances
    [javac]     List<Instance> tmpList = data.subList(start, end);
    [javac]                                  ^
    [javac] jmotif\src\edu\hawaii\jmotif\sax\SAXFactory.java:1116: cannot find symbol
    [javac] symbol  : method size()
    [javac] location: class weka.core.Instances
    [javac]     double[] vals = new double[tsData.size()];
    [javac]                                      ^
    [javac] jmotif\src\edu\hawaii\jmotif\sax\SAXFactory.java:1117: cannot find symbol
    [javac] symbol  : method size()
    [javac] location: class weka.core.Instances
    [javac]     for (int i = 0; i < tsData.size(); i++) {
    [javac]                               ^
    [javac] jmotif\src\edu\hawaii\jmotif\sax\SAXFactory.java:1118: cannot find symbol
    [javac] symbol  : method get(int)
    [javac] location: class weka.core.Instances
    [javac]       vals[i] = tsData.get(i).value(dataAttribute.index());
    [javac]                       ^
    [javac] jmotif\src\edu\hawaii\jmotif\sax\SAXFactory.java:1166: cannot find symbol
    [javac] symbol  : method size()
    [javac] location: class weka.core.Instances
    [javac]     while ((currPosition + window) < tsData.size()) {
    [javac]                                            ^
    [javac] jmotif\src\edu\hawaii\jmotif\timeseries\TestTSUtils.java:319: cannot find symbol
    [javac] symbol  : method size()
    [javac] location: class weka.core.Instances
    [javac]     for (int i = 0; i < data.size(); i++) {
    [javac]                             ^
    [javac] jmotif\src\edu\hawaii\jmotif\timeseries\TestTSUtils.java:320: cannot find symbol
    [javac] symbol  : method get(int)
    [javac] location: class weka.core.Instances
    [javac]       ts.add(new TPoint(data.get(i).value(dataAttribute.index()), Integer.valueOf(i).longValue()));
    [javac]                             ^
    [javac] 7 errors

Thanks in advance,
Vaske

PS: Could you perhaps add the standalone jar to the download section? It might be useful for others as well.


Pavel Senin

Oct 24, 2012, 6:47:33 AM
to jmotif-...@googlegroups.com
Hi there:

Well, most likely you need WEKA 3-7-3, which has worked for me for years now:

============== the build

psenin@piedras:/media/Stock/workspace-school/jmotif$ ant -f jar.build.xml 
Buildfile: /media/Stock/workspace-school/jmotif/jar.build.xml

compile:
    [mkdir] Created dir: /media/Stock/workspace-school/jmotif/build/classes
    [javac] Compiling 76 source files to /media/Stock/workspace-school/jmotif/build/classes

jar-sax:
    [mkdir] Created dir: /media/Stock/workspace-school/jmotif/tmp-lib
     [copy] Copying 76 files to /media/Stock/workspace-school/jmotif/tmp-lib
      [jar] Building jar: /media/Stock/workspace-school/jmotif/jmotif.lib.jar
   [delete] Deleting directory /media/Stock/workspace-school/jmotif/tmp-lib

jar:

BUILD SUCCESSFUL
Total time: 2 seconds

=============== WEKA library version:
psenin@piedras:/media/Stock/workspace-school/jmotif$ env | grep weka
WEKA_HOME=/media/Stock/recovery/tools/weka-3-7-3

****************

I just downloaded the latest DEV build, 3-7-7 (<http://prdownloads.sourceforge.net/weka/weka-3-7-7.zip>), and it works too:

psenin@piedras:/media/Stock/workspace-school/jmotif$ export WEKA_HOME=/media/Stock/recovery/tools/weka-3-7-7
psenin@piedras:/media/Stock/workspace-school/jmotif$ ant -f jar.build.xml 
Buildfile: /media/Stock/workspace-school/jmotif/jar.build.xml

compile:

jar-sax:
    [mkdir] Created dir: /media/Stock/workspace-school/jmotif/tmp-lib
     [copy] Copying 76 files to /media/Stock/workspace-school/jmotif/tmp-lib
      [jar] Building jar: /media/Stock/workspace-school/jmotif/jmotif.lib.jar
   [delete] Deleting directory /media/Stock/workspace-school/jmotif/tmp-lib

jar:

BUILD SUCCESSFUL
Total time: 1 second
******************
--
Mahalo, Pavel.
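
The compiler errors reported with Weka 3.6.8 all concern list-style methods (size(), get(int), subList(int,int)) that the jmotif sources call on weka.core.Instances, while the builds against 3-7-3 and 3-7-7 succeed. A small reflection probe can check what a given weka.jar exposes before running the whole Ant build. This is a sketch; ApiProbe is a hypothetical helper, not part of jmotif or Weka.

```java
// Hypothetical helper; run it with the weka.jar under test on the classpath.
final class ApiProbe {

  // Return true if the named class can be loaded and has a public method
  // with the given name and parameter types.
  static boolean hasPublicMethod(String className, String methodName, Class<?>... paramTypes) {
    try {
      Class<?> cls = Class.forName(className);
      cls.getMethod(methodName, paramTypes);
      return true;
    } catch (ClassNotFoundException | NoSuchMethodException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    String instances = "weka.core.Instances";
    System.out.println("size():           " + hasPublicMethod(instances, "size"));
    System.out.println("get(int):         " + hasPublicMethod(instances, "get", int.class));
    System.out.println("subList(int,int): "
        + hasPublicMethod(instances, "subList", int.class, int.class));
  }
}
```

For example, `java -cp weka.jar:. ApiProbe` (use `;` instead of `:` on Windows) prints one line per method. If any line says false, the build errors quoted earlier in the thread are to be expected with that jar.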

Pavel Senin

Oct 24, 2012, 6:53:45 AM
to jmotif-...@googlegroups.com
I uploaded the jar compiled with WEKA 3-7-7.
--
Mahalo, Pavel.

vaske maskinsen

Oct 24, 2012, 7:31:36 AM
to jmotif-...@googlegroups.com
Thank you Pavel,

I can build with that version too! I only tried older versions before...

Greets,
Felix

Pavel Senin

Oct 24, 2012, 7:38:31 AM
to jmotif-...@googlegroups.com
Excellent! Let me know if you run into any problems.
--
Mahalo, Pavel.

Raj Patil

Dec 2, 2012, 11:16:51 AM
to jmotif-...@googlegroups.com
Hi Vaske,

This is an interesting idea that I have been thinking about as well. Could you share any progress you have made so far, and let us know whether you have published any findings or code?

regards,
raj

Josh Patterson

Dec 2, 2012, 3:12:23 PM
to jmotif-...@googlegroups.com

If you check out the openPDC project, we have some older open source code there that uses MapReduce not only to sort a lot of time series data but also to apply a sliding window technique, decompose each window into SAX, and classify the result with a 1NN classifier, based on work by Dr. Keogh at UCR:


http://openpdc.codeplex.com/SourceControl/changeset/view/81593#721424


General time series on Hadoop:


http://www.cloudera.com/blog/2011/03/simple-moving-average-secondary-sort-and-mapreduce-part-1/

http://www.cloudera.com/blog/2011/03/simple-moving-average-secondary-sort-and-mapreduce-part-2/

http://www.cloudera.com/blog/2011/04/simple-moving-average-secondary-sort-and-mapreduce-part-3/

--
Twitter: @jpatanooga
Principal Solution Architect @ Cloudera
hadoop: http://www.cloudera.com
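
For readers new to the technique Josh mentions, here is a rough sketch of sliding-window SAX in plain Java: z-normalize each window, reduce it with piecewise aggregate approximation (PAA), and map each segment mean to a letter. Several assumptions apply. SaxSketch is a hypothetical class (neither jmotif nor openPDC API), the alphabet size is fixed at 4 with the usual breakpoints rounded to -0.67, 0 and 0.67, the window length must be divisible by the number of segments, and the population standard deviation is used. The 1NN classification and the MapReduce plumbing are left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of SAX; not the jmotif or openPDC implementation.
final class SaxSketch {

  // Breakpoints splitting the standard normal distribution into 4 equiprobable regions (rounded).
  private static final double[] CUTS_4 = {-0.67, 0.0, 0.67};

  // Rescale a series to zero mean and unit (population) standard deviation.
  static double[] zNormalize(double[] series) {
    double mean = 0.0;
    for (double v : series) {
      mean += v;
    }
    mean /= series.length;
    double var = 0.0;
    for (double v : series) {
      var += (v - mean) * (v - mean);
    }
    double sd = Math.sqrt(var / series.length);
    double[] out = new double[series.length];
    for (int i = 0; i < series.length; i++) {
      // A (nearly) constant series is mapped to all zeros to avoid dividing by zero.
      out[i] = sd < 1e-9 ? 0.0 : (series[i] - mean) / sd;
    }
    return out;
  }

  // Piecewise aggregate approximation: the mean of each of `segments` equal-width chunks.
  static double[] paa(double[] series, int segments) {
    if (segments <= 0 || series.length % segments != 0) {
      throw new IllegalArgumentException("series length must be a multiple of segments");
    }
    int width = series.length / segments;
    double[] out = new double[segments];
    for (int s = 0; s < segments; s++) {
      double sum = 0.0;
      for (int i = s * width; i < (s + 1) * width; i++) {
        sum += series[i];
      }
      out[s] = sum / width;
    }
    return out;
  }

  // Convert one window into a SAX word over the alphabet a..d.
  static String toWord(double[] window, int segments) {
    StringBuilder word = new StringBuilder();
    for (double v : paa(zNormalize(window), segments)) {
      int idx = 0;
      while (idx < CUTS_4.length && v > CUTS_4[idx]) {
        idx++;
      }
      word.append((char) ('a' + idx));
    }
    return word.toString();
  }

  // Slide a window over the series one point at a time and emit one word per position.
  static List<String> slidingWords(double[] series, int window, int segments) {
    List<String> words = new ArrayList<>();
    for (int start = 0; start + window <= series.length; start++) {
      words.add(toWord(Arrays.copyOfRange(series, start, start + window), segments));
    }
    return words;
  }

  public static void main(String[] args) {
    double[] series = {1, 2, 3, 4, 5, 6, 7, 8};
    System.out.println(toWord(series, 4));
    System.out.println(slidingWords(series, 4, 2));
  }
}
```

With the series 1..8 and four segments, `toWord` returns `abcd`. In a MapReduce setting, something like `slidingWords` would run inside the map or reduce step once the points of one series have been grouped and sorted by time.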

vaske maskinsen

May 7, 2013, 7:02:58 AM
to jmotif-...@googlegroups.com
Hi Raj,

Sorry, some time has passed...
I have successfully done the SAXification on Hadoop. It involves some changes to jMotif, and of course you have to write some additional code for execution on Hadoop. As Josh advised, take a look at openPDC; it should be a good starting point.
As soon as the work is published (in progress), I'll post a link here.

Greets,
Vaske