Query using SparkGraphComputer in JanusGraph giving error


Himanshu Gupta

Oct 24, 2017, 9:56:51 AM
to JanusGraph users
Hi,

I am using JanusGraph 0.1.1 with Cassandra 3.0.14 as the backend. Can anyone help me traverse the graph using Spark with Cassandra as the backend? Can you specify the configurations?

Currently I'm using the following configuration file, read-cassandra.properties:
--------------------------------------------------------------------------------
#
# Hadoop Graph Configuration
#
gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
gremlin.hadoop.graphInputFormat=org.janusgraph.hadoop.formats.cassandra.CassandraInputFormat
gremlin.hadoop.graphOutputFormat=org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoOutputFormat

gremlin.hadoop.jarsInDistributedCache=true
gremlin.hadoop.inputLocation=none
gremlin.hadoop.outputLocation=output

#
# JanusGraph Cassandra InputFormat configuration
#
janusgraphmr.ioformat.conf.storage.backend=cassandra
janusgraphmr.ioformat.conf.storage.hostname=localhost
janusgraphmr.ioformat.conf.storage.port=9160
janusgraphmr.ioformat.conf.storage.cassandra.keyspace=janusgraph
janusgraphmr.ioformat.cf-name=edgestore
storage.backend=cassandra
#
# Apache Cassandra InputFormat configuration
#
cassandra.input.partitioner.class=org.apache.cassandra.dht.Murmur3Partitioner
spark.driver.host=localhost
#
# SparkGraphComputer Configuration
#
spark.master=local[4]
spark.serializer=org.apache.spark.serializer.KryoSerializer


--------------------------------------------------------------------------------

I perform the following operations:

gremlin> graph = GraphFactory.open("conf/hadoop-graph/read-cassandra.properties")
==>hadoopgraph[cassandrainputformat->gryooutputformat]

gremlin> g = graph.traversal().withComputer(SparkGraphComputer)
==>graphtraversalsource[hadoopgraph[cassandrainputformat->gryooutputformat], sparkgraphcomputer]

gremlin> g.V().count()
java.io.IOException: Could not get input splits

What could be the reason for this error? How do I resolve it?

Ted Wilmes

Oct 24, 2017, 4:39:12 PM
to JanusGraph users
Hello,
The record reader that comes with JanusGraph 0.1.1 is not compatible with Cassandra 3. Can you try using the latest JanusGraph 0.2.0 release and the read-cassandra-3.properties configuration? Here's the relevant issue, fixed in 0.2.0: https://github.com/JanusGraph/janusgraph/issues/172
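
The relevant difference in read-cassandra-3.properties is the Cassandra 3 input format class:

gremlin.hadoop.graphInputFormat=org.janusgraph.hadoop.formats.cassandra.Cassandra3InputFormat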

Thanks,
Ted

Himanshu Gupta

Oct 26, 2017, 1:58:59 AM
to JanusGraph users
Hi Ted Wilmes,

Even using the latest JanusGraph 0.2.0 and read-cassandra-3.properties, I'm still facing the same error.

Actually, I want to use the SparkGraphComputer from the Java API.

I did the following in my Java code:
---------------------------------------------------------------------------------------------
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
import org.apache.tinkerpop.gremlin.spark.process.computer.SparkGraphComputer;
import org.apache.tinkerpop.gremlin.structure.Graph;
import org.apache.tinkerpop.gremlin.structure.util.GraphFactory;

public class Sample {

    public static void main(String[] args) {
        Graph graph = GraphFactory.open("read-cassandra-3.properties");
        GraphTraversalSource g = graph.traversal().withComputer(SparkGraphComputer.class);
        System.out.println(g.V().count().next());
    }
}
---------------------------------------------------------------------------------------------

The error I'm getting is:


java.lang.IllegalStateException: java.lang.ExceptionInInitializerError
at org.apache.tinkerpop.gremlin.process.computer.traversal.step.map.VertexProgramStep.processNextStart(VertexProgramStep.java:88)
at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143)
at org.apache.tinkerpop.gremlin.process.traversal.step.util.ExpandableStepIterator.next(ExpandableStepIterator.java:50)
at org.apache.tinkerpop.gremlin.process.computer.traversal.step.map.ComputerResultStep.processNextStart(ComputerResultStep.java:68)
at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.next(AbstractStep.java:128)
at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.next(AbstractStep.java:38)
at org.apache.tinkerpop.gremlin.process.traversal.Traversal.fill(Traversal.java:177)
at org.apache.tinkerpop.gremlin.process.traversal.Traversal.toList(Traversal.java:115)
at com.apm.main.Sample.main(Sample.java:18)
Caused by: java.util.concurrent.ExecutionException: java.lang.ExceptionInInitializerError
at java.util.concurrent.FutureTask.report(Unknown Source)
at java.util.concurrent.FutureTask.get(Unknown Source)
at org.apache.tinkerpop.gremlin.process.computer.traversal.step.map.VertexProgramStep.processNextStart(VertexProgramStep.java:68)
... 8 more
Caused by: java.lang.ExceptionInInitializerError
at org.apache.spark.SparkContext.withScope(SparkContext.scala:714)
at org.apache.spark.SparkContext.newAPIHadoopRDD(SparkContext.scala:1129)
at org.apache.spark.api.java.JavaSparkContext.newAPIHadoopRDD(JavaSparkContext.scala:507)
at org.apache.tinkerpop.gremlin.spark.structure.io.InputFormatRDD.readGraphRDD(InputFormatRDD.java:42)
at org.apache.tinkerpop.gremlin.spark.process.computer.SparkGraphComputer.lambda$submitWithExecutor$0(SparkGraphComputer.java:215)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Caused by: com.fasterxml.jackson.databind.JsonMappingException: Jackson version is too old 2.4.4
at com.fasterxml.jackson.module.scala.JacksonModule$class.setupModule(JacksonModule.scala:56)
at com.fasterxml.jackson.module.scala.DefaultScalaModule.setupModule(DefaultScalaModule.scala:19)
at com.fasterxml.jackson.databind.ObjectMapper.registerModule(ObjectMapper.java:549)
at org.apache.spark.rdd.RDDOperationScope$.<init>(RDDOperationScope.scala:81)
at org.apache.spark.rdd.RDDOperationScope$.<clinit>(RDDOperationScope.scala)
... 9 more

Ankur Goel

Oct 26, 2017, 5:44:12 AM
to JanusGraph users
Please upgrade your Jackson dependency version.
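
If you build with Maven, one minimal sketch is to force a single Jackson version via dependencyManagement (2.6.5 is only an illustration; the right version is whatever the jackson-module-scala pulled in by spark-gremlin was built against):

<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>com.fasterxml.jackson.core</groupId>
      <artifactId>jackson-databind</artifactId>
      <version>2.6.5</version>
    </dependency>
    <dependency>
      <groupId>com.fasterxml.jackson.module</groupId>
      <artifactId>jackson-module-scala_2.10</artifactId>
      <version>2.6.5</version>
    </dependency>
  </dependencies>
</dependencyManagement>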

~

Ted Wilmes

Oct 26, 2017, 8:21:13 AM
to Ankur Goel, JanusGraph users
Hello,
Good suggestion on the Jackson version, Ankur. Himanshu, you can also try running it from the Gremlin Console to confirm it works.

Thanks,
Ted


Himanshu Gupta

Oct 26, 2017, 8:46:28 AM
to JanusGraph users
Hi all,

I tried with all the latest versions, but I still get this error:
Caused by: com.fasterxml.jackson.databind.JsonMappingException: Incompatible Jackson version: 2.9.2
at com.fasterxml.jackson.module.scala.JacksonModule$class.setupModule(JacksonModule.scala:64)
at com.fasterxml.jackson.module.scala.DefaultScalaModule.setupModule(DefaultScalaModule.scala:19)
at com.fasterxml.jackson.databind.ObjectMapper.registerModule(ObjectMapper.java:751)
at org.apache.spark.rdd.RDDOperationScope$.<init>(RDDOperationScope.scala:81)
at org.apache.spark.rdd.RDDOperationScope$.<clinit>(RDDOperationScope.scala)
... 9 more

Ted, I did the same thing in the JanusGraph 0.2.0 Gremlin console and got the same error I was getting in the previous version: "java.io.IOException: Could not get input splits".

As per https://github.com/JanusGraph/janusgraph/releases/, JanusGraph 0.2.0 supports Gremlin 3.2.6, and I created a Maven project, so the dependencies came from there. I'm sharing my pom.xml:

<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.apm.myapp</groupId>
  <artifactId>myapp_spark</artifactId>
  <version>0.0.1-SNAPSHOT</version>
  <name>myapp-Spark</name>
  <dependencies>
    <dependency>
      <groupId>org.apache.tinkerpop</groupId>
      <artifactId>gremlin-core</artifactId>
      <version>3.2.6</version>
    </dependency>
    <dependency>
      <groupId>org.apache.tinkerpop</groupId>
      <artifactId>hadoop-gremlin</artifactId>
      <version>3.2.6</version>
    </dependency>
    <dependency>
      <groupId>org.apache.tinkerpop</groupId>
      <artifactId>spark-gremlin</artifactId>
      <version>3.2.6</version>
    </dependency>
    <dependency>
      <groupId>org.janusgraph</groupId>
      <artifactId>janusgraph-core</artifactId>
      <version>0.2.0</version>
    </dependency>
    <dependency>
      <groupId>org.janusgraph</groupId>
      <artifactId>janusgraph-cassandra</artifactId>
      <version>0.2.0</version>
    </dependency>
    <dependency>
      <groupId>org.janusgraph</groupId>
      <artifactId>janusgraph-solr</artifactId>
      <version>0.2.0</version>
    </dependency>
    <dependency>
      <groupId>log4j</groupId>
      <artifactId>log4j</artifactId>
      <version>1.2.17</version>
    </dependency>
    <dependency>
      <groupId>org.janusgraph</groupId>
      <artifactId>janusgraph-hadoop-core</artifactId>
      <version>0.2.0</version>
      <exclusions>
        <exclusion>
          <groupId>org.janusgraph</groupId>
          <artifactId>janusgraph-hbase-core</artifactId>
        </exclusion>
      </exclusions>
    </dependency>
  </dependencies>
</project>
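
To see where the conflicting Jackson versions come from, I can inspect the dependency tree:

mvn dependency:tree -Dverbose -Dincludes=com.fasterxml.jackson.core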

Himanshu Gupta

Oct 27, 2017, 8:50:50 AM
to JanusGraph users
Do you think I missed something?

Ted Wilmes

Oct 27, 2017, 8:53:02 AM
to JanusGraph users
Hi Himanshu,
Maybe I missed it, but can you post the full IOException you're getting when you can't get the input splits?

Thanks,
Ted

Himanshu Gupta

Oct 30, 2017, 1:42:05 AM
to JanusGraph users
Hi Ted,

Here is the full IOException:

          \,,,/
         (o o)
-----oOOo-(3)-oOOo-----
plugin activated: janusgraph.imports
plugin activated: tinkerpop.server
plugin activated: tinkerpop.utilities
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/ubuntu/software/janusgraph-0.2.0-hadoop2/lib/slf4j-log4j12-1.7.12.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/ubuntu/software/janusgraph-0.2.0-hadoop2/lib/logback-classic-1.1.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
05:16:45 WARN  org.apache.hadoop.util.NativeCodeLoader  - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
plugin activated: tinkerpop.hadoop
plugin activated: tinkerpop.spark
plugin activated: tinkerpop.tinkergraph
gremlin>  graph = GraphFactory.open("conf/hadoop-graph/read-cassandra-3.properties");
==>hadoopgraph[cassandra3inputformat->gryooutputformat]
gremlin> g = graph.traversal().withComputer(SparkGraphComputer);
==>graphtraversalsource[hadoopgraph[cassandra3inputformat->gryooutputformat], sparkgraphcomputer]
gremlin> g.V().count()
05:19:00 WARN  org.apache.spark.util.Utils  - Your hostname, ubuntuamaster resolves to a loopback address: 127.0.0.1; using 10.0.0.10 instead (on interface eth0)
05:19:00 WARN  org.apache.spark.util.Utils  - Set SPARK_LOCAL_IP if you need to bind to another address
java.io.IOException: Could not get input splits
Type ':help' or ':h' for help.
Display stack trace? [yN]y
java.lang.IllegalStateException: java.io.IOException: Could not get input splits
        at org.apache.tinkerpop.gremlin.process.computer.traversal.step.map.VertexProgramStep.processNextStart(VertexProgramStep.java:88)
        at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143)
        at org.apache.tinkerpop.gremlin.process.traversal.step.util.ExpandableStepIterator.next(ExpandableStepIterator.java:50)
        at org.apache.tinkerpop.gremlin.process.computer.traversal.step.map.ComputerResultStep.processNextStart(ComputerResultStep.java:68)
        at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143)
        at org.apache.tinkerpop.gremlin.process.traversal.util.DefaultTraversal.hasNext(DefaultTraversal.java:192)
        at org.codehaus.groovy.vmplugin.v7.IndyInterface.selectMethod(IndyInterface.java:232)
        at org.apache.tinkerpop.gremlin.console.Console$_closure3.doCall(Console.groovy:234)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:93)
        at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
        at org.codehaus.groovy.runtime.metaclass.ClosureMetaClass.invokeMethod(ClosureMetaClass.java:294)
        at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1022)
        at org.codehaus.groovy.tools.shell.Groovysh.setLastResult(Groovysh.groovy:447)
        at org.codehaus.groovy.vmplugin.v7.IndyInterface.selectMethod(IndyInterface.java:232)
        at org.codehaus.groovy.tools.shell.Groovysh.execute(Groovysh.groovy:191)
        at org.apache.tinkerpop.gremlin.console.GremlinGroovysh.super$3$execute(GremlinGroovysh.groovy)
        at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:93)
        at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
        at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1213)
        at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuperN(ScriptBytecodeAdapter.java:132)
        at org.apache.tinkerpop.gremlin.console.GremlinGroovysh.execute(GremlinGroovysh.groovy:72)
        at org.codehaus.groovy.tools.shell.Shell.leftShift(Shell.groovy:122)
        at org.codehaus.groovy.tools.shell.ShellRunner.work(ShellRunner.groovy:95)
        at org.codehaus.groovy.tools.shell.InteractiveShellRunner.super$2$work(InteractiveShellRunner.groovy)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:93)
        at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
        at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1213)
        at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuperN(ScriptBytecodeAdapter.java:132)
        at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuper0(ScriptBytecodeAdapter.java:152)
        at org.codehaus.groovy.tools.shell.InteractiveShellRunner.work(InteractiveShellRunner.groovy:124)
        at org.codehaus.groovy.tools.shell.ShellRunner.run(ShellRunner.groovy:59)
        at org.codehaus.groovy.tools.shell.InteractiveShellRunner.super$2$run(InteractiveShellRunner.groovy)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:93)
        at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
        at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1213)
        at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuperN(ScriptBytecodeAdapter.java:132)
        at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuper0(ScriptBytecodeAdapter.java:152)
        at org.codehaus.groovy.tools.shell.InteractiveShellRunner.run(InteractiveShellRunner.groovy:83)
        at org.codehaus.groovy.vmplugin.v7.IndyInterface.selectMethod(IndyInterface.java:232)
        at org.apache.tinkerpop.gremlin.console.Console.<init>(Console.groovy:166)
        at org.codehaus.groovy.vmplugin.v7.IndyInterface.selectMethod(IndyInterface.java:232)
        at org.apache.tinkerpop.gremlin.console.Console.main(Console.groovy:478)
Caused by: java.util.concurrent.ExecutionException: java.io.IOException: Could not get input splits
        at java.util.concurrent.FutureTask.report(FutureTask.java:122)
        at java.util.concurrent.FutureTask.get(FutureTask.java:192)
        at org.apache.tinkerpop.gremlin.process.computer.traversal.step.map.VertexProgramStep.processNextStart(VertexProgramStep.java:68)
        ... 56 more
Caused by: java.io.IOException: Could not get input splits
        at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSplits(AbstractColumnFamilyInputFormat.java:203)
        at org.janusgraph.hadoop.formats.cassandra.CassandraBinaryInputFormat.getSplits(CassandraBinaryInputFormat.java:62)
        at org.janusgraph.hadoop.formats.util.GiraphInputFormat.getSplits(GiraphInputFormat.java:62)
        at org.apache.spark.rdd.NewHadoopRDD.getPartitions(NewHadoopRDD.scala:120)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
        at scala.Option.getOrElse(Option.scala:120)
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
        at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
        at scala.Option.getOrElse(Option.scala:120)
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
        at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
        at scala.Option.getOrElse(Option.scala:120)
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
        at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
        at scala.Option.getOrElse(Option.scala:120)
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
        at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
        at scala.Option.getOrElse(Option.scala:120)
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
        at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
        at scala.Option.getOrElse(Option.scala:120)
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
        at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
        at scala.Option.getOrElse(Option.scala:120)
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1952)
        at org.apache.spark.rdd.RDD$$anonfun$fold$1.apply(RDD.scala:1088)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
        at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
        at org.apache.spark.rdd.RDD.fold(RDD.scala:1082)
        at org.apache.spark.api.java.JavaRDDLike$class.fold(JavaRDDLike.scala:399)
        at org.apache.spark.api.java.AbstractJavaRDDLike.fold(JavaRDDLike.scala:46)
        at org.apache.tinkerpop.gremlin.spark.process.computer.traversal.strategy.optimization.interceptor.SparkStarBarrierInterceptor.apply(SparkStarBarrierInterceptor.java:101)
        at org.apache.tinkerpop.gremlin.spark.process.computer.traversal.strategy.optimization.interceptor.SparkStarBarrierInterceptor.apply(SparkStarBarrierInterceptor.java:64)
        at org.apache.tinkerpop.gremlin.spark.process.computer.SparkGraphComputer.lambda$submitWithExecutor$0(SparkGraphComputer.java:260)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.ExecutionException: java.io.IOException: failed connecting to all endpoints 127.0.0.1
        at java.util.concurrent.FutureTask.report(FutureTask.java:122)
        at java.util.concurrent.FutureTask.get(FutureTask.java:192)
        at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSplits(AbstractColumnFamilyInputFormat.java:199)
        ... 52 more
Caused by: java.io.IOException: failed connecting to all endpoints 127.0.0.1
        at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSubSplits(AbstractColumnFamilyInputFormat.java:317)
        at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.access$200(AbstractColumnFamilyInputFormat.java:61)
        at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat$SplitCallable.call(AbstractColumnFamilyInputFormat.java:236)
        at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat$SplitCallable.call(AbstractColumnFamilyInputFormat.java:221)
        ... 4 more

Himanshu Gupta

Oct 31, 2017, 8:55:18 AM
to JanusGraph users
Hi Ted, 

Is this still an issue in JanusGraph 0.2.0?

Thanks.

Ted Wilmes

Oct 31, 2017, 9:07:50 AM
to Himanshu Gupta, JanusGraph users
Hi Himanshu,
This jumps out at me: 

Caused by: java.io.IOException: failed connecting to all endpoints 127.0.0.1

Can you check your connectivity to Cassandra with cqlsh?
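
For example (9042 and 9160 are the defaults; the input splits are fetched over Thrift on 9160, so that port matters here too):

cqlsh 127.0.0.1 9042
nodetool statusthrift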

--Ted


Himanshu Gupta

Nov 3, 2017, 3:13:03 AM
to JanusGraph users
Hi Ted,

Yes, I'm able to connect with cqlsh, and I can even see the janusgraph keyspace in the list. The problem is that with the normal Cassandra configuration I can run all my queries, but when I try the Hadoop read-cassandra-3.properties it throws this exception.

In the JanusGraph documentation I saw them bulk-loading a graph into Cassandra and getting output.



Ankur Goel

Dec 4, 2017, 9:37:28 AM
to JanusGraph users
Himanshu,

Were you able to solve this?

~

Himanshu Gupta

Dec 13, 2017, 6:37:02 AM
to JanusGraph users
No, Ankur.

Ankur Goel

Dec 13, 2017, 6:44:58 AM
to Himanshu Gupta, JanusGraph users
Check if Thrift is enabled on Cassandra; the graph computer uses Thrift for communication.
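
Assuming a stock install, either of these should turn it on:

nodetool enablethrift          # at runtime
start_rpc: true                # persistently, in cassandra.yaml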

~


Himanshu Gupta

Jan 8, 2018, 4:38:51 AM
to JanusGraph users
Yes, Ankur, I'm using Cassandra Thrift only. Did you find any solution?



Ankur Goel

Jan 8, 2018, 5:17:27 AM
to Himanshu Gupta, JanusGraph users
Working fine at my end.

~

To unsubscribe from this group and all its topics, send an email to janusgraph-users+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/janusgraph-users/7e6b2213-9c4e-4b70-bcc4-e1408dcd42cf%40googlegroups.com.

Debasish Kanhar

Mar 7, 2018, 1:45:15 PM
to JanusGraph users
@Himanshu @Ankur, were you able to solve this? I'm still facing the same error, and I don't know any way around it. For now my workaround is to save the subgraph I want to run OLAP on to the file system as GraphSON/Kryo, and then read it back with a .properties file whose input path points at the saved graph, e.g.:
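
Roughly like this (subgraph.kryo is just whatever file the subgraph was written to):

gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
gremlin.hadoop.graphInputFormat=org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoInputFormat
gremlin.hadoop.inputLocation=output/subgraph.kryo
gremlin.hadoop.outputLocation=output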

Himanshu Gupta

Mar 20, 2018, 2:07:20 AM
to JanusGraph users
No, Debasish, I'm still facing the same issue. I tried different Cassandra versions but still hit the same error. What could be the reason?

Ankur Goel

Mar 20, 2018, 4:59:07 AM
to Himanshu Gupta, JanusGraph users
Share your configuration.



Himanshu Gupta

Mar 20, 2018, 9:17:25 AM
to JanusGraph users
#
# Hadoop Graph Configuration
#
gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
gremlin.hadoop.graphInputFormat=org.janusgraph.hadoop.formats.cassandra.Cassandra3InputFormat
gremlin.hadoop.graphOutputFormat=org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoOutputFormat

gremlin.hadoop.jarsInDistributedCache=true
gremlin.hadoop.inputLocation=none
gremlin.hadoop.outputLocation=output

#
# JanusGraph Cassandra InputFormat configuration
#
janusgraphmr.ioformat.conf.storage.backend=cassandrathrift
janusgraphmr.ioformat.conf.storage.hostname=****.****.****.****
janusgraphmr.ioformat.conf.storage.port=9160
janusgraphmr.ioformat.conf.storage.cassandra.keyspace=testdb

#
# Apache Cassandra InputFormat configuration
#
cassandra.input.partitioner.class=org.apache.cassandra.dht.Murmur3Partitioner

#
# SparkGraphComputer Configuration
#
spark.master=local[4]
spark.serializer=org.apache.spark.serializer.KryoSerializer

gremlin> graph = GraphFactory.open("conf/hadoop-graph/read-cassandra-3.properties");
==>hadoopgraph[cassandra3inputformat->gryooutputformat]
gremlin> g = graph.traversal().withComputer(SparkGraphComputer);
==>graphtraversalsource[hadoopgraph[cassandra3inputformat->gryooutputformat], sparkgraphcomputer]
gremlin> g.V().count()

Debasish Kanhar

Mar 22, 2018, 1:35:24 PM
to JanusGraph users
Hi,

So what I did was enable Thrift manually across all the Cassandra nodes, and that seems to be working.

As far as I remember, Cassandra 3 doesn't start the Thrift server by default unless configured to, hence this step.

Misha Brukman

Mar 22, 2018, 9:51:39 PM
to Debasish Kanhar, JanusGraph users
Can you use the "cql" backend instead of "cassandrathrift"?
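
That is, in the InputFormat section of the properties file, something along these lines (9042 is the CQL native port; I'm not sure whether the 0.2.0 OLAP input format can use it end to end):

janusgraphmr.ioformat.conf.storage.backend=cql
janusgraphmr.ioformat.conf.storage.port=9042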


Ankur Goel

Mar 23, 2018, 5:25:08 AM
to Himanshu Gupta, JanusGraph users

Add this:

janusgraphmr.ioformat.conf.storage.backend=cassandra
spark.serializer=org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoSerializer

Enable Thrift.

Java program:

Graph graph = GraphFactory.open("conf/hadoop-graph/read-cassandra-3.properties");
GraphTraversalSource g = graph.traversal().withComputer(SparkGraphComputer.class);
System.out.println(g.V().count().next());

Try it; it should work.

~AnkurG

ste...@indisputable.io

May 30, 2018, 7:21:40 AM
to JanusGraph users
I'm experiencing the same issue here. Any tips on how to solve this?  I'm currently just trying to test this locally.

I'm using JanusGraph 0.2.0.
Cassandra version is 3.11.2
Cassandra runs in a Docker container started with this command (so Thrift should be running):
docker run -d -p 7001:7001 -p 7199:7199 -p 9042:9042 -p 9160:9160 -v /<test_dir>/cass:/var/lib/cassandra -e CASSANDRA_START_RPC=true --name cass cassandra
Just for sanity's sake I also executed:
docker exec -it cass nodetool enablethrift

/conf/hadoop-graph/read-cassandra-3.properties is the default:

#
# Hadoop Graph Configuration
#
gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
gremlin.hadoop.graphInputFormat=org.janusgraph.hadoop.formats.cassandra.Cassandra3InputFormat
gremlin.hadoop.graphOutputFormat=org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoOutputFormat

gremlin.hadoop.jarsInDistributedCache=true
gremlin.hadoop.inputLocation=none
gremlin.hadoop.outputLocation=output

#
# JanusGraph Cassandra InputFormat configuration
#
janusgraphmr.ioformat.conf.storage.backend=cassandra
janusgraphmr.ioformat.conf.storage.hostname=localhost
janusgraphmr.ioformat.conf.storage.port=9160
janusgraphmr.ioformat.conf.storage.cassandra.keyspace=janusgraph

#
# Apache Cassandra InputFormat configuration
#
cassandra.input.partitioner.class=org.apache.cassandra.dht.Murmur3Partitioner

#
# SparkGraphComputer Configuration
#
spark.master=local[*]
spark.serializer=org.apache.spark.serializer.KryoSerializer


Here is the output from the Gremlin console:

plugin activated: janusgraph.imports
plugin activated: tinkerpop.server
plugin activated: tinkerpop.utilities
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/<test_dir>/janusgraph-0.2.0-hadoop2/lib/slf4j-log4j12-1.7.12.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/<test_dir>/janusgraph-0.2.0-hadoop2/lib/logback-classic-1.1.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
22:38:25 WARN  org.apache.hadoop.util.NativeCodeLoader  - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
plugin activated: tinkerpop.hadoop
plugin activated: tinkerpop.spark
plugin activated: tinkerpop.tinkergraph
gremlin> graph = JanusGraphFactory.open('conf/janusgraph-cassandra.properties')
==>standardjanusgraph[cassandrathrift:[127.0.0.1]]
gremlin> g = graph.traversal()
==>graphtraversalsource[standardjanusgraph[cassandrathrift:[127.0.0.1]], standard]
gremlin> g.V().count()
22:42:22 WARN  org.janusgraph.graphdb.transaction.StandardJanusGraphTx  - Query requires iterating over all vertices [()]. For better performance, use indexes
==>443136
gremlin> graph = GraphFactory.open('conf/hadoop-graph/read-cassandra-3.properties')
==>hadoopgraph[cassandra3inputformat->gryooutputformat]
gremlin> g = graph.traversal().withComputer(SparkGraphComputer)
==>graphtraversalsource[hadoopgraph[cassandra3inputformat->gryooutputformat], sparkgraphcomputer]
gremlin> g.V().count()
java.io.IOException: Could not get input splits
Type ':help' or ':h' for help.
Display stack trace? [yN]y
java.lang.IllegalStateException: java.io.IOException: Could not get input splits
        at org.apache.tinkerpop.gremlin.process.computer.traversal.step.map.VertexProgramStep.processNextStart(VertexProgramStep.java:88)
        at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143)
        at org.apache.tinkerpop.gremlin.process.traversal.step.util.ExpandableStepIterator.next(ExpandableStepIterator.java:50)
        at org.apache.tinkerpop.gremlin.process.computer.traversal.step.map.ComputerResultStep.processNextStart(ComputerResultStep.java:68)
        at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143)
        at org.apache.tinkerpop.gremlin.process.traversal.util.DefaultTraversal.hasNext(DefaultTraversal.java:192)
        at org.apache.tinkerpop.gremlin.console.Console$_closure3.doCall(Console.groovy:234)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:93)
        at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
        at org.codehaus.groovy.runtime.metaclass.ClosureMetaClass.invokeMethod(ClosureMetaClass.java:294)
        at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1022)
        at org.codehaus.groovy.tools.shell.Groovysh.setLastResult(Groovysh.groovy:447)
        at org.codehaus.groovy.tools.shell.Groovysh.execute(Groovysh.groovy:191)
        ... 54 more
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.concurrent.ExecutionException: java.io.IOException: failed connecting to all endpoints 172.17.0.2
        at java.util.concurrent.FutureTask.report(FutureTask.java:122)
        at java.util.concurrent.FutureTask.get(FutureTask.java:192)
        at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSplits(AbstractColumnFamilyInputFormat.java:199)
        ... 52 more
Caused by: java.io.IOException: failed connecting to all endpoints 172.17.0.2
        at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSubSplits(AbstractColumnFamilyInputFormat.java:317)
        at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.access$200(AbstractColumnFamilyInputFormat.java:61)
        at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat$SplitCallable.call(AbstractColumnFamilyInputFormat.java:236)
        at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat$SplitCallable.call(AbstractColumnFamilyInputFormat.java:221)
        ... 4 more
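
One thing I notice in the trace: the endpoint it fails to connect to, 172.17.0.2, is the container's internal Docker address, which Cassandra advertises when the input format asks for the token ring to compute splits. If that address isn't reachable from the host, a possible (untested) workaround might be host networking, or advertising an address the Spark driver can reach:

docker run -d --network host ... cassandra
# or, in cassandra.yaml:
broadcast_rpc_address: <address reachable from the Spark driver>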

