Hi,
We are currently using IBM BigInsights 2.1.2, which ships Hadoop 2.2.0 with MR1 and HBase 0.96.0. I understand this combination may not be supported by hRaven, but I still gave it a try.
I cloned the latest code and followed the directions in the README.
1) I added "mapred.job.tracker.history.completed.location" to mapred-site.xml, and I did see the MR job log files stored in HDFS at the location I specified.
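For reference, the property I added to mapred-site.xml looks like this (the value is the done-directory path from my setup):

```xml
<property>
  <name>mapred.job.tracker.history.completed.location</name>
  <value>/hadoop/mapred/history/done</value>
</property>
```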
2) I ran "bin/create_schema.rb", changing the compression from "LZO" to "SNAPPY"; the schema was created without any errors.
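The only change I made in the script was the compression option on the column families, roughly like this (a sketch; the table and family names here are illustrative, not copied from the script):

```ruby
# HBase shell DDL as in bin/create_schema.rb, with every occurrence of
#   COMPRESSION => 'LZO'
# replaced by SNAPPY, e.g.:
create 'job_history', {NAME => 'i', COMPRESSION => 'SNAPPY'}
```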
3) I then tried to run jobFilePreprocessor.sh, but got the following error:
$ ./jobFilePreprocessor.sh /opt/ibm/biginsights/hadoop-conf/ /hadoop/mapred/history/done /user/test/hraven/historyraw test-cluster 100 1024
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/ibm/biginsights/IHC/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/yzhang/hraven/hraven-0.9.16-SNAPSHOT/lib/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
14/07/18 10:00:48 INFO etl.JobFilePreprocessor: output=/user/yzhang/hraven/historyraw
14/07/18 10:00:48 INFO etl.JobFilePreprocessor: input=/hadoop/mapred/history/done
14/07/18 10:00:48 INFO etl.JobFilePreprocessor: forceAllFiles: false
14/07/18 10:00:48 INFO etl.JobFilePreprocessor: cluster=p2-bibigin101
14/07/18 10:00:48 INFO etl.JobFilePreprocessor: maxFileSize=1024
Exception in thread "main" java.lang.NoClassDefFoundError: org.apache.hadoop.hbase.filter.WritableByteArrayComparable
at java.lang.J9VMInternals.verifyImpl(Native Method)
at java.lang.J9VMInternals.verify(J9VMInternals.java:72)
at java.lang.J9VMInternals.initialize(J9VMInternals.java:134)
at com.twitter.hraven.etl.JobFilePreprocessor.run(JobFilePreprocessor.java:289)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at com.twitter.hraven.etl.JobFilePreprocessor.main(JobFilePreprocessor.java:452)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
at java.lang.reflect.Method.invoke(Method.java:611)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.filter.WritableByteArrayComparable
at java.net.URLClassLoader.findClass(URLClassLoader.java:434)
at java.lang.ClassLoader.loadClass(ClassLoader.java:660)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:358)
at java.lang.ClassLoader.loadClass(ClassLoader.java:626)
... 12 more
The error looks strange to me. As far as I know, the class "org.apache.hadoop.hbase.filter.WritableByteArrayComparable" no longer exists in HBase 0.96.0 (it was renamed to ByteArrayComparable). What I don't understand is why hRaven needs this class.
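To double-check, I ran a quick class-loading test outside of hRaven (a minimal sketch; it only reports what is visible on whatever classpath it is run with):

```java
// Tries to load the old HBase comparator class by name. Against HBase 0.96
// client jars this fails, because the class was renamed to ByteArrayComparable.
public class CheckComparable {
    public static void main(String[] args) {
        try {
            Class.forName("org.apache.hadoop.hbase.filter.WritableByteArrayComparable");
            System.out.println("class found (pre-0.96 HBase on the classpath)");
        } catch (ClassNotFoundException e) {
            System.out.println("class not found (HBase 0.96+, or no HBase jars on the classpath)");
        }
    }
}
```

Run with the same classpath jobFilePreprocessor.sh uses to see which HBase jars the tool actually picks up.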
My questions are:
1) Will hRaven work with Hadoop 2.2.0 (MR1) + HBase 0.96.0?
2) What could cause the error above? Is my configuration incorrect? If so, where?
Thanks
Yong