Titan-0.5.0-M3 + Apache Spark 1.0.0 : java.lang.ClassNotFoundException

Matt Chamberlin

Aug 1, 2014, 4:25:20 PM
to aureliu...@googlegroups.com
Hi all,

I'm attempting to use Titan-Hadoop (previously Faunus) to load Titan data (as FaunusVertex objects) from Cassandra into a Spark HadoopRDD using TitanCassandraInputFormat.

I had this working when using Faunus-0.4.4, but after moving to Titan-0.5.0, I'm running into the following error at runtime:

Exception in thread "main" java.lang.IllegalArgumentException: Could not instantiate implementation: com.thinkaurelius.titan.hadoop.formats.titan.input.current.TitanHadoopSetupImpl
at com.thinkaurelius.titan.util.system.ConfigurationUtil.instantiate(ConfigurationUtil.java:55)
at com.thinkaurelius.titan.hadoop.formats.titan.TitanInputFormat.setConf(TitanInputFormat.java:45)
at com.thinkaurelius.titan.hadoop.formats.titan.cassandra.TitanCassandraInputFormat.setConf(TitanCassandraInputFormat.java:49)
at org.apache.spark.rdd.NewHadoopRDD.getPartitions(NewHadoopRDD.scala:86)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:204)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:202)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:202)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1097)
at org.apache.spark.rdd.RDD.foreach(RDD.scala:703)
at SparkTitanTest$.main(SparkTitanTest.scala:32)
at SparkTitanTest.main(SparkTitanTest.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:303)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:55)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at com.thinkaurelius.titan.util.system.ConfigurationUtil.instantiate(ConfigurationUtil.java:44)
... 18 more
Caused by: com.esotericsoftware.kryo.KryoException: Unable to find class: com.thinkaurelius.titan.diskstorage.util.time.Timestamps$2
at com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:138)
at com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:115)
at com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:610)
at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:721)
at com.thinkaurelius.titan.graphdb.database.serialize.kryo.KryoSerializer.readClassAndObject(KryoSerializer.java:77)
at com.thinkaurelius.titan.graphdb.database.serialize.StandardSerializer.readClassAndObject(StandardSerializer.java:85)
at com.thinkaurelius.titan.diskstorage.configuration.backend.KCVSConfiguration.staticBuffer2Object(KCVSConfiguration.java:244)
at com.thinkaurelius.titan.diskstorage.configuration.backend.KCVSConfiguration.toMap(KCVSConfiguration.java:180)
at com.thinkaurelius.titan.diskstorage.configuration.backend.KCVSConfiguration.asReadConfiguration(KCVSConfiguration.java:187)
at com.thinkaurelius.titan.graphdb.configuration.GraphDatabaseConfiguration.<init>(GraphDatabaseConfiguration.java:1272)
at com.thinkaurelius.titan.core.TitanFactory.open(TitanFactory.java:92)
at com.thinkaurelius.titan.core.TitanFactory.open(TitanFactory.java:81)
at com.thinkaurelius.titan.hadoop.formats.titan.input.current.TitanHadoopSetupImpl.<init>(TitanHadoopSetupImpl.java:42)
... 23 more
Caused by: java.lang.ClassNotFoundException: com.thinkaurelius.titan.diskstorage.util.time.Timestamps$2
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:270)
at com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:136)
... 35 more

I've checked my fat jar that I submit to Spark to make sure that the Timestamps$2 class exists and have made sure I'm using Kryo 2.21 (Spark 1.0.0 dependency) and the same version of ASM as described in this post.
I also tried forcing Spark to use Kryo 2.22 (titan-hadoop dependency) to no avail.

Right now I'm using:

titan-0.5.0-M3-hadoop2
spark-1.0.0-hadoop2

I know there have been a couple of other posts regarding similar use-cases for Titan + Spark so I'm hoping someone can give me insight into this problem. 

Thanks.

Matthias Broecheler

Aug 13, 2014, 8:22:16 AM
to aureliu...@googlegroups.com

Hi Matt,
We haven't tried using Titan 0.5 with Spark. Were you able to sort out the classpath issue?
Thanks
Matthias

--
You received this message because you are subscribed to the Google Groups "Aurelius" group.
To unsubscribe from this group and stop receiving emails from it, send an email to aureliusgraph...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/aureliusgraphs/54b792d4-7682-4751-a9ff-434df2f649bd%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

j...@senzari.com

Aug 15, 2014, 2:33:45 PM
to aureliu...@googlegroups.com
Not sure whether you have figured this out yet, but here are some thoughts.
One possible cause of your problem is that Kryo and Spark use different ClassLoaders.
Kryo's default class loader cannot resolve the class URL in a distributed Spark environment. The ClassLoader used by Kryo needs to be set to the one used by Spark.
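
To illustrate the point above with nothing but the JDK, the sketch below resolves a class by name against two different loaders. `LoaderVisibilityDemo` and `resolve` are hypothetical names made up for this illustration, not part of Titan, Kryo, or Spark, and the parent of the system class loader merely stands in for a loader that was set up in isolation from the application jar.

```java
import java.util.Optional;

class LoaderVisibilityDemo {

    // Try to resolve a class by name against one specific loader,
    // without initializing the class. Empty means "this loader cannot see it".
    static Optional<Class<?>> resolve(String className, ClassLoader loader) {
        try {
            return Optional.of(Class.forName(className, false, loader));
        } catch (ClassNotFoundException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // An application-level class: this demo class itself.
        String appClass = LoaderVisibilityDemo.class.getName();

        // Loader the running thread is expected to use for application classes.
        ClassLoader context = Thread.currentThread().getContextClassLoader();

        // Assumption for illustration: the system loader's parent plays the role
        // of a loader that never saw the application jar.
        ClassLoader isolated = ClassLoader.getSystemClassLoader().getParent();

        System.out.println("via context loader:  " + resolve(appClass, context).isPresent());
        System.out.println("via isolated loader: " + resolve(appClass, isolated).isPresent());
    }
}
```

Run from an ordinary `java` launch, the application class is normally visible through the context loader but not through the isolated one. That is the same shape as the failure in the trace above: `Class.forName` is asked to find a class through a loader that never saw the jar containing it.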

Khanderao

Aug 25, 2014, 6:47:52 PM
to aureliu...@googlegroups.com
Matthias,

Same issue for me. Is there any way to use Java serialization for now? Spark has a way to change serializers; does Titan have something similar?

14/08/25 22:24:55 INFO GraphDatabaseConfiguration: Enabled partitioning
14/08/25 22:24:55 ERROR JobScheduler: Error running job streaming job 1409005486000 ms.0

com.esotericsoftware.kryo.KryoException: Unable to find class: com.thinkaurelius.titan.diskstorage.util.time.Timestamps$2
    at com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:138)
    at com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:115)
    at com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:610)
    at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:721)
    at com.thinkaurelius.titan.graphdb.database.serialize.kryo.KryoSerializer.readClassAndObject(KryoSerializer.java:77)
    at com.thinkaurelius.titan.graphdb.database.serialize.StandardSerializer.readClassAndObject(StandardSerializer.java:85)


Matthias Broecheler

Aug 26, 2014, 9:05:43 PM
to aureliu...@googlegroups.com
We cannot really swap out the serializer since it must be coherent across the entire graph for the entire lifetime of the database.
Also, I don't see how that would help - it would still be necessary to find the class, wouldn't it?



--
Matthias Broecheler
http://www.matthiasb.com

Mike Luu

Jan 21, 2015, 5:28:15 PM
to aureliu...@googlegroups.com
Did you ever figure this out? I'm running into the same problem.

Thanks

Matthias Broecheler

Jan 22, 2015, 3:14:10 PM
to aureliu...@googlegroups.com
We are in the process of removing the kryo dependency from Titan. That will be a major undertaking and hopefully take care of these issues.

Matt Chamberlin

Jan 22, 2015, 8:50:45 PM
to aureliu...@googlegroups.com
Thanks for the update Matthias.

I ended up abandoning my attempts at resolving the issue and my Spark/Titan-0.5.x dreams (for now) in anticipation of TinkerPop 3 support in Titan-0.9+ and the prospect of a Spark-based GraphComputer and/or GraphX integration.
I'm excited Spark/Titan is on your radar though! Keep up the great work!

Khanderao

Mar 1, 2015, 6:42:22 PM
to aureliu...@googlegroups.com
Matthias,

Is this resolved?

Unlike Matt, it is too late for me to abandon this now.

K

Dan LaRocque

Mar 4, 2015, 6:50:20 AM
to aureliu...@googlegroups.com
Hi,

I'm considering shading Kryo in Titan 0.5.5 as a fix for this.

I reproduced the Timestamps$2 ClassNotFoundException.  A trivial Spark job that wraps 0.5.4's or titan05's TitanCassandraInputFormat in an RDD and counts vertices triggers it.  I loaded Graph of the Gods beforehand.

Kryo defaults to using the classloader that loaded itself.  Under Spark, that classloader can't resolve Timestamps$2.  Calling <kryoinstance>.setClassLoader(Thread.currentThread().getContextClassLoader()) within Titan right after constructing a Kryo instance makes the Timestamps$2 CNFE disappear when executing within Spark.  Based on that, I think Titan is using Spark's own Kryo classes, which were loaded prior to and in isolation from application classes.

However, adding the setClassLoader call mentioned above just leads to a "KryoException: Encountered unregistered class ID: 10" later.  I seem to recall that exception being caused by a mismatch between the Kryo version used to write Titan's data and the version used to read it at runtime.  Titan 0.5.x uses Kryo 2.22 (and that's definitely what I used to load graph of the gods), whereas Spark 1.2.1 uses Kryo 2.21.  So, a Kryo version mismatch is plausible insofar as Titan really is using Spark's Kryo classes at runtime.  This issue seems to say the same thing, although it was about Storm rather than Spark: https://github.com/thinkaurelius/titan/issues/615.
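
One way to check the suspicion that Titan ends up with Spark's Kryo classes is to print where a class was actually loaded from. A minimal JDK-only sketch follows. `ClassOriginDemo` and `describeOrigin` are hypothetical names for illustration only. Inside a Spark job one would presumably pass `com.esotericsoftware.kryo.Kryo.class` (which needs Kryo on the classpath) instead of the demo classes used here.

```java
import java.security.CodeSource;

class ClassOriginDemo {

    // Report the jar or directory a class came from, plus the loader that defined it.
    // Classes defined by the bootstrap loader usually have no code source.
    static String describeOrigin(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        String location = (source == null || source.getLocation() == null)
                ? "<bootstrap or unknown>"
                : source.getLocation().toString();
        return type.getName() + " loaded from " + location
                + " by " + type.getClassLoader();
    }

    public static void main(String[] args) {
        // An application class and a JDK class, for comparison.
        System.out.println(describeOrigin(ClassOriginDemo.class));
        System.out.println(describeOrigin(String.class));
    }
}
```

If the location printed for the Kryo class were the Spark assembly jar rather than the application's fat jar, that would be consistent with the version mismatch described above.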

I briefly tried building Spark 1.2.1 with Kryo 2.22, but that died in compilation.  I then tried shading Titan's copy of Kryo 2.22 and packing it into titan-core.  This fixed the "unregistered class ID" exception.  It also apparently obviated the <kryoinstance>.setClassLoader call.  Now my test Spark app succeeds and prints the expected GotG vertex count.

I'm going to run some tests and think about this.  I have this in a branch, but I'm hesitant to merge and push because this is a fairly disruptive change to include in a 0.5.x bugfix release.  It would amount to implementing https://github.com/thinkaurelius/titan/issues/821, which we recently said we would postpone until titan09 (https://github.com/thinkaurelius/titan/issues/821#issuecomment-70897394).

thanks,
Dan

Etienne Couritas

Aug 6, 2015, 1:05:21 PM
to Aurelius
Hello

Of course I have the same problem.

I'm interested in the setClassLoader method. Is it possible to call it from outside Titan, perhaps before Titan is initialised?
Spark can be used with Java serialisation; maybe that could be a workaround.

By the way, what do you mean by shading?

Etienne Couritas

Aug 14, 2015, 11:40:25 AM
to Aurelius
I have good news.
By building the master version of Titan (0.5.5-SNAPSHOT) and setting the cassandra-all version to 2.1.5, it became possible to load data from Titan inside Spark.