Titan 0.5.0: Gremlin & Hadoop 2.5 JobCounter.MB_MILLIS_MAPS Exception


at...@infotrellis.com

Aug 23, 2014, 11:01:52 AM
to aureliu...@googlegroups.com
I have two Hadoop clusters (Hadoop 2.5.0 and Hadoop 2.4.1). I tried Titan 0.5.0's titan-hadoop-2 functionality as per the documentation. Using gremlin, I can load the graph from a file in HDFS and query it successfully (i.e. the MapReduce jobs finish successfully). However, just after the successful job completion message, I get the following exception when the job statistics are displayed.

I suspect this is due to a version mismatch between the Hadoop cluster (2.5.0 and/or 2.4.1) and the Hadoop jars (2.2.0) used by gremlin. The behaviour is the same regardless of which cluster the jobs run on (Hadoop 2.5.0 or 2.4.1); i.e. I get the same exception complaining about JobCounter.MB_MILLIS_MAPS.

When I run titan-hadoop-2 against its own local Hadoop (i.e. without providing any Hadoop config), everything seems to work fine.

I believe the gremlin scripts are causing the problem here, as they blindly load all jars from the lib directory, including Titan's own Hadoop 2.2.0 jars:
CP=`abs_path`/../conf
CP=$CP:$(find -L `abs_path`/../lib/ -name '*.jar' | tr '\n' ':')
CP=$CP:$(find -L `abs_path`/../ext/ -name '*.jar' | tr '\n' ':')
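One way the script could avoid this is to skip the bundled hadoop-* jars whenever a real installation is configured. This is only a sketch with hypothetical names (`build_cp`, `HADOOP_PREFIX` layout), not the actual gremlin.sh:

```shell
# Sketch: build a classpath that prefers the cluster's Hadoop jars over
# Titan's bundled 2.2.0 jars. Function and variable names are illustrative.
build_cp() {
    libdir=$1    # Titan's lib directory
    if [ -n "$HADOOP_PREFIX" ]; then
        # Exclude the bundled hadoop-* jars and pull in the cluster's instead.
        find -L "$libdir" -name '*.jar' ! -name 'hadoop-*.jar' | tr '\n' ':'
        find -L "$HADOOP_PREFIX" -name 'hadoop-*.jar' | tr '\n' ':'
    else
        # No cluster configured: fall back to everything in lib/, as today.
        find -L "$libdir" -name '*.jar' | tr '\n' ':'
    fi
}
```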


Your feedback on this would be much appreciated.
Thanks!

EXCEPTION
java.lang.RuntimeException: No enum constant org.apache.hadoop.mapreduce.JobCounter.MB_MILLIS_MAPS
        at com.thinkaurelius.titan.hadoop.tinkerpop.gremlin.ResultHookClosure.call(ResultHookClosure.java:44)
        at groovy.lang.Closure.call(Closure.java:428)
        at sun.reflect.GeneratedMethodAccessor22.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.codehaus.groovy.runtime.callsite.PogoMetaMethodSite$PogoCachedMethodSite.invoke(PogoMetaMethodSite.java:231)
        at org.codehaus.groovy.runtime.callsite.PogoMetaMethodSite.call(PogoMetaMethodSite.java:64)
        at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:116)
        at org.codehaus.groovy.tools.shell.Groovysh.setLastResult(Groovysh.groovy:324)
        at org.codehaus.groovy.tools.shell.Groovysh.this$3$setLastResult(Groovysh.groovy)
        at sun.reflect.GeneratedMethodAccessor21.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:90)
        at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:233)
        at groovy.lang.MetaClassImpl.setProperty(MetaClassImpl.java:2416)
        at groovy.lang.MetaClassImpl.setProperty(MetaClassImpl.java:3347)
        at org.codehaus.groovy.tools.shell.Shell.setProperty(Shell.groovy)
        at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.setGroovyObjectProperty(ScriptBytecodeAdapter.java:528)
        at org.codehaus.groovy.tools.shell.Groovysh.execute(Groovysh.groovy:152)
        at org.codehaus.groovy.tools.shell.Shell.leftShift(Shell.groovy:114)
        at org.codehaus.groovy.tools.shell.Shell$leftShift$0.call(Unknown Source)
        at org.codehaus.groovy.tools.shell.ShellRunner.work(ShellRunner.groovy:88)
        at org.codehaus.groovy.tools.shell.InteractiveShellRunner.super$2$work(InteractiveShellRunner.groovy)
        at sun.reflect.GeneratedMethodAccessor49.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:90)
        at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:233)
        at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1079)
        at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuperN(ScriptBytecodeAdapter.java:128)
        at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuper0(ScriptBytecodeAdapter.java:148)
        at org.codehaus.groovy.tools.shell.InteractiveShellRunner.work(InteractiveShellRunner.groovy:100)
        at sun.reflect.GeneratedMethodAccessor48.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.codehaus.groovy.runtime.callsite.PogoMetaMethodSite$PogoCachedMethodSiteNoUnwrapNoCoerce.invoke(PogoMetaMethodSite.java:272)
        at org.codehaus.groovy.runtime.callsite.PogoMetaMethodSite.callCurrent(PogoMetaMethodSite.java:52)
        at org.codehaus.groovy.runtime.callsite.AbstractCallSite.callCurrent(AbstractCallSite.java:137)
        at org.codehaus.groovy.tools.shell.ShellRunner.run(ShellRunner.groovy:57)
        at org.codehaus.groovy.tools.shell.InteractiveShellRunner.super$2$run(InteractiveShellRunner.groovy)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:90)
        at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:233)
        at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1079)
        at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuperN(ScriptBytecodeAdapter.java:128)
        at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuper0(ScriptBytecodeAdapter.java:148)
        at org.codehaus.groovy.tools.shell.InteractiveShellRunner.run(InteractiveShellRunner.groovy:66)
        at com.thinkaurelius.titan.hadoop.tinkerpop.gremlin.Console.<init>(Console.java:61)
        at com.thinkaurelius.titan.hadoop.tinkerpop.gremlin.Console.<init>(Console.java:68)
        at com.thinkaurelius.titan.hadoop.tinkerpop.gremlin.Console.main(Console.java:73)

Caused by: java.lang.IllegalArgumentException: No enum constant org.apache.hadoop.mapreduce.JobCounter.MB_MILLIS_MAPS
        at java.lang.Enum.valueOf(Enum.java:236)
        at org.apache.hadoop.mapreduce.counters.FrameworkCounterGroup.valueOf(FrameworkCounterGroup.java:148)
        at org.apache.hadoop.mapreduce.counters.FrameworkCounterGroup.findCounter(FrameworkCounterGroup.java:182)
        at org.apache.hadoop.mapreduce.counters.AbstractCounters.findCounter(AbstractCounters.java:154)
        at org.apache.hadoop.mapreduce.TypeConverter.fromYarn(TypeConverter.java:240)
        at org.apache.hadoop.mapred.ClientServiceDelegate.getJobCounters(ClientServiceDelegate.java:370)
        at org.apache.hadoop.mapred.YARNRunner.getJobCounters(YARNRunner.java:511)
        at org.apache.hadoop.mapreduce.Job$7.run(Job.java:756)
        at org.apache.hadoop.mapreduce.Job$7.run(Job.java:753)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
        at org.apache.hadoop.mapreduce.Job.getCounters(Job.java:753)
        at org.apache.hadoop.mapreduce.Job.monitorAndPrintJob(Job.java:1361)
        at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1289)
        at com.thinkaurelius.titan.hadoop.compat.h2.Hadoop2Compiler.run(Hadoop2Compiler.java:299)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
        at com.thinkaurelius.titan.hadoop.HadoopPipeline.submit(HadoopPipeline.java:1092)
        at com.thinkaurelius.titan.hadoop.HadoopPipeline.submit(HadoopPipeline.java:1075)
        at com.thinkaurelius.titan.hadoop.tinkerpop.gremlin.ResultHookClosure.call(ResultHookClosure.java:39)



GREMLIN CLASSPATH EXTRACT (Hadoop jars)
/lib/hadoop-annotations-2.2.0.jar
/lib/hadoop-auth-2.2.0.jar
/lib/hadoop-client-2.2.0.jar
/lib/hadoop-common-2.2.0.jar
/lib/hadoop-hdfs-2.2.0.jar
/lib/hadoop-mapreduce-client-app-2.2.0.jar
/lib/hadoop-mapreduce-client-common-2.2.0.jar
/lib/hadoop-mapreduce-client-core-2.2.0.jar
/lib/hadoop-mapreduce-client-jobclient-2.2.0.jar
/lib/hadoop-mapreduce-client-shuffle-2.2.0.jar
/lib/hadoop-yarn-api-2.2.0.jar
/lib/hadoop-yarn-client-2.2.0.jar
/lib/hadoop-yarn-common-2.2.0.jar
/lib/hadoop-yarn-server-common-2.2.0.jar
/lib/hadoop-yarn-server-nodemanager-2.2.0.jar

at...@infotrellis.com

Aug 23, 2014, 11:34:57 AM
to aureliu...@googlegroups.com
So I tried a quick-and-dirty test: I replaced the Hadoop 2.2.0 jars in Titan's lib directory with Hadoop 2.5.0 jars (as shown below). The exception is gone and I now get the job counters for my query properly (see output below).

I guess the gremlin scripts need to be smarter: if HADOOP_PREFIX is set, they should use the Hadoop jar files from there instead of Titan's default Hadoop jars.


JAR FILES REPLACED
./hadoop-mapreduce-client-core-2.2.0.jar.del
./hadoop-yarn-api-2.2.0.jar.del
./hadoop-yarn-server-common-2.2.0.jar.del
./hadoop-mapreduce-client-jobclient-2.2.0.jar.del
./hadoop-annotations-2.2.0.jar.del
./hadoop-mapreduce-client-shuffle-2.2.0.jar.del
./hadoop-mapreduce-client-common-2.2.0.jar.del
./hadoop-yarn-common-2.2.0.jar.del
./hadoop-auth-2.2.0.jar.del
./hadoop-common-2.2.0.jar.del
./hadoop-yarn-server-nodemanager-2.2.0.jar.del
./hadoop-yarn-client-2.2.0.jar.del
./hadoop-hdfs-2.2.0.jar.del
./hadoop-mapreduce-client-app-2.2.0.jar.del

ADD HADOOP 2.5.0 JARS
cd titan-0.5.0-hadoop2/lib
ln -s /hadoop-2.5.0/mapreduce/hadoop-mapreduce-client-core-2.5.0.jar .
ln -s /hadoop-2.5.0/yarn/hadoop-yarn-api-2.5.0.jar . 
ln -s /hadoop-2.5.0/yarn/hadoop-yarn-server-common-2.5.0.jar .
ln -s /hadoop-2.5.0/mapreduce/hadoop-mapreduce-client-jobclient-2.5.0.jar .
ln -s /hadoop-2.5.0/common/lib/hadoop-annotations-2.5.0.jar .
ln -s /hadoop-2.5.0/mapreduce/hadoop-mapreduce-client-shuffle-2.5.0.jar .
ln -s /hadoop-2.5.0/mapreduce/hadoop-mapreduce-client-common-2.5.0.jar .
ln -s /hadoop-2.5.0/yarn/hadoop-yarn-common-2.5.0.jar .
ln -s /hadoop-2.5.0/common/lib/hadoop-auth-2.5.0.jar . 
ln -s /hadoop-2.5.0/common/hadoop-common-2.5.0.jar .
ln -s /hadoop-2.5.0/yarn/hadoop-yarn-server-nodemanager-2.5.0.jar . 
ln -s /hadoop-2.5.0/yarn/hadoop-yarn-client-2.5.0.jar .
ln -s /hadoop-2.5.0/hdfs/hadoop-hdfs-2.5.0.jar .
ln -s /hadoop-2.5.0/mapreduce/hadoop-mapreduce-client-app-2.5.0.jar .
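The manual rename-and-symlink steps above could be scripted. A rough sketch, assuming the same layout (bundled jars named `hadoop-*-2.2.0.jar`, replacements somewhere under a 2.5.0 install; `swap_jars` and `HADOOP_DIR` are made-up names):

```shell
# Sketch: rename Titan's bundled Hadoop 2.2.0 jars to *.del (as done above)
# and symlink the matching 2.5.0 jars from the cluster's install in their place.
swap_jars() {
    libdir=$1       # e.g. titan-0.5.0-hadoop2/lib
    hadoopdir=$2    # e.g. /hadoop-2.5.0
    for jar in "$libdir"/hadoop-*-2.2.0.jar; do
        [ -e "$jar" ] || continue
        mv "$jar" "$jar.del"                      # keep the original, renamed
        base=$(basename "$jar" -2.2.0.jar)        # e.g. hadoop-common
        # locate the matching 2.5.0 jar anywhere under the Hadoop install
        new=$(find "$hadoopdir" -name "$base-2.5.0.jar" | head -n 1)
        [ -n "$new" ] && ln -s "$new" "$libdir/"
    done
}
```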


QUERY OUTPUT
g = HadoopFactory.open('titan-graphson.properties')
g.V.map

gremlin> g.V.map
11:31:25 WARN  com.thinkaurelius.titan.hadoop.compat.h2.Hadoop2Compiler  - Path tracking is enabled for this Titan/Hadoop job (space and time expensive)
11:31:25 WARN  com.thinkaurelius.titan.hadoop.compat.h2.Hadoop2Compiler  - State tracking is enabled for this Titan/Hadoop job (full deletes not possible)
11:31:26 WARN  org.apache.hadoop.mapreduce.JobSubmitter  - Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
11:31:28 INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat  - Total input paths to process : 1
11:31:28 INFO  org.apache.hadoop.mapreduce.JobSubmitter  - number of splits:1
11:31:28 INFO  org.apache.hadoop.mapreduce.JobSubmitter  - Submitting tokens for job: job_1408777709136_0007
11:31:29 INFO  org.apache.hadoop.mapreduce.Job  - The url to track the job: http://masternode2:8088/proxy/application_1408777709136_0007/
11:31:29 INFO  org.apache.hadoop.mapreduce.Job  - Running job: job_1408777709136_0007
11:31:38 INFO  org.apache.hadoop.mapreduce.Job  - Job job_1408777709136_0007 running in uber mode : false
11:31:38 INFO  org.apache.hadoop.mapreduce.Job  -  map 0% reduce 0%
11:31:50 INFO  org.apache.hadoop.mapreduce.Job  -  map 100% reduce 0%
11:31:50 INFO  org.apache.hadoop.mapreduce.Job  - Job job_1408777709136_0007 completed successfully
11:31:50 INFO  org.apache.hadoop.mapreduce.Job  - Counters: 33
File System Counters
FILE: Number of bytes read=0
FILE: Number of bytes written=181566
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=2154
HDFS: Number of bytes written=2604
HDFS: Number of read operations=5
HDFS: Number of large read operations=0
HDFS: Number of write operations=3
Job Counters 
Launched map tasks=1
Data-local map tasks=1
Total time spent by all maps in occupied slots (ms)=9588
Total time spent by all reduces in occupied slots (ms)=0
Total time spent by all map tasks (ms)=9588
Total vcore-seconds taken by all map tasks=9588
Total megabyte-seconds taken by all map tasks=9818112
Map-Reduce Framework
Map input records=12
Map output records=0
Input split bytes=126
Spilled Records=0
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=165
CPU time spent (ms)=6720
Physical memory (bytes) snapshot=214536192
Virtual memory (bytes) snapshot=714326016
Total committed heap usage (bytes)=134217728
com.thinkaurelius.titan.hadoop.mapreduce.transform.PropertyMapMap$Counters
VERTICES_PROCESSED=12
com.thinkaurelius.titan.hadoop.mapreduce.transform.VerticesMap$Counters
EDGES_PROCESSED=0
VERTICES_PROCESSED=12
File Input Format Counters 
Bytes Read=2028
File Output Format Counters 
Bytes Written=0
==>0 {_id=[0], name=[saturn], type=[titan]}
==>1 {_id=[1], name=[jupiter], type=[god]}
==>2 {_id=[2], name=[neptune], type=[god]}
==>3 {_id=[3], name=[pluto], type=[god]}
==>4 {_id=[4], name=[sky], type=[location]}
==>5 {_id=[5], name=[sea], type=[location]}
==>6 {_id=[6], name=[tartarus], type=[location]}
==>7 {_id=[7], name=[hercules], type=[demigod]}
==>8 {_id=[8], name=[alcmene], type=[human]}
==>9 {_id=[9], name=[nemean], type=[monster]}
==>10 {_id=[10], name=[hydra], type=[monster]}
==>11 {_id=[11], name=[cerberus], type=[monster]}

Guy Taylor

Aug 23, 2014, 11:36:53 AM
to aureliu...@googlegroups.com
This is a versioning mismatch, likely caused by the libs in the Titan /lib dir. 

I spent a couple of hours on this yesterday, and then got the correct libraries loading and it worked much better.
--
You received this message because you are subscribed to the Google Groups "Aurelius" group.
To unsubscribe from this group and stop receiving emails from it, send an email to aureliusgraph...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/aureliusgraphs/e53ec2d7-5c1d-40f4-8727-8a93dc4aff28%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


--
Guy Taylor
Systems Choreographer
--

at...@infotrellis.com

Aug 23, 2014, 12:18:36 PM
to aureliu...@googlegroups.com
Thanks for your reply. I came to the same conclusion after spending a few hours on this.

Two things stood out for me:
  1. The error message did not contain any indication pointing to the version-mismatch diagnosis.
    I understand this message comes from the Hadoop code, so we may be stuck with it.

  2. Enhancing the gremlin startup script is low-hanging fruit for fixing this problem.
    HADOOP_PREFIX is already in use, so why not take advantage of it here?


Dan LaRocque

Aug 25, 2014, 1:38:17 AM
to aureliu...@googlegroups.com
Hi,

Thanks for following up.  I think this may be a variation on https://issues.apache.org/jira/browse/MAPREDUCE-5831.

I've seen a couple of oddities on installations with mismatched client/cluster Hadoop versions.  Besides counter linkage errors, I've also seen the http link to the job tracker start with http://http://.  They must have shifted the bit of code that prepends the protocol around between minors in the 2.x series.  Both issues disappear when I've matched the client Hadoop version to the cluster's.

As Vinod mentioned in that bug, cross-version MR/YARN wire compatibility isn't really stable yet.  This matches my experience.

Dropping the cluster's jars into the client often works, but it's really moving the problem around.  It rules out wire compat issues between client and cluster since they're all the same code, but now we have the possibility of ClassNotFoundException/NoSuchMethodError if an ABI-breaking change across Hadoop minor versions touches code referenced by Titan-Hadoop or Titan-HBase's classfiles.  If changing minors produces a Hadoop linkage error in Titan, then I would like to add some defensive reflection around the affected type to work on either side of the ABI change.  Still, linkage errors are a better problem than wire protocol incompatibility, since the latter can't be fixed by making Titan smarter about how it uses Hadoop.

It's theoretically possible to avoid both linkage and wire compat problems by recompiling Titan (and whatever other bits of the stack under/above it that are exposed to Hadoop APIs) against the specific version installed on your cluster.  This is a niche approach and a tremendous pain; I don't think most people want to go down that road.

That's a good point about gremlin.sh & $HADOOP_PREFIX, though we would need to modify gremlin.sh so that it falls back gracefully to the Hadoop jars packaged with Titan when $HADOOP_PREFIX is unset.  The zipfile supports a self-contained, trivial Hadoop MR environment out of the box using the LocalJobRunner and LocalFileSystem, so it's possible to play with Titan-Hadoop without an actual cluster in place.  I also wonder if we need to check $HADOOP_HOME for MRv1.  Some details to check, but it seems feasible.
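The graceful fallback described here might look roughly like this (a sketch only; `pick_hadoop_jars` and the lookup order are assumptions, not the real gremlin.sh):

```shell
# Sketch: prefer $HADOOP_PREFIX, then the MRv1-era $HADOOP_HOME, and only
# then Titan's bundled Hadoop jars (self-contained local-mode setup).
pick_hadoop_jars() {
    bundled=$1   # Titan's lib directory with the packaged 2.2.0 jars
    if [ -n "$HADOOP_PREFIX" ] && [ -d "$HADOOP_PREFIX" ]; then
        find -L "$HADOOP_PREFIX" -name 'hadoop-*.jar' | tr '\n' ':'
    elif [ -n "$HADOOP_HOME" ] && [ -d "$HADOOP_HOME" ]; then
        find -L "$HADOOP_HOME" -name 'hadoop-*.jar' | tr '\n' ':'
    else
        # no cluster configured: use the bundled jars for local mode
        find -L "$bundled" -name 'hadoop-*.jar' | tr '\n' ':'
    fi
}
```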

thanks,
Dan