Unable to do OLAP using Spark on JG0.2.0


Debasish Kanhar

Jul 9, 2018, 11:00:40 AM
to JanusGraph users
Hi,

I've been using JanusGraph in a few of our projects for the last few months, but only recently have I started scaling up the dataset and doing some real analytics on it.

Earlier we had a dataset with around 100k nodes and 500k edges. All our use cases were point queries, which required us to hit a particular vertex and then do something like centrality on it. Since those queries are comparatively inexpensive, indexes were doing the job for us until now.

But now I've started scaling the dataset to a much larger size. My new dataset has around 1.5 million nodes and 13 million edges. The total insert took around 5 hours against standalone Cassandra and around 15 hours against clustered Cassandra.

Now, when I do a simple query like cycle detection on this dataset, the query takes so long that it becomes impractical. The requirement is: "Find cycles of all lengths for a vertex, limited to 20 results." Hence the query would be something like:

g.V(156090).as('a').repeat(out().simplePath()).emit(loops().is(gt(1))).out().where(eq('a')).path().dedup().by(unfold().order().by(id).dedup().fold()).limit(20)

The above query takes anywhere from 15-20 minutes up to 3-4 hours to finish, depending on the vertex id I select.

Now, if doing this for 1 vertex takes 1 hour, then doing the same for 1.5 million nodes seems pointless.

This is the place where I want to bring Spark into the mix.

I followed https://stackoverflow.com/questions/40105047/setup-and-configuration-of-titan-for-a-spark-cluster-and-cassandra/40180104#40180104 to set up a Spark cluster along with a Hadoop cluster. We have 2 Spark clusters running:
  1. Cluster 1: <191 - Master, 178-Slave, Spark 2.2.0, Cassandra node: 178>
  2. Cluster 2: <81-Master, 82,83,84- Slave, Spark 1.6.1, Cassandra node: 82,83,84>
The reason for 2 separate Spark clusters is that we wanted to test JanusGraph 0.2.0 as well as JanusGraph 0.3.0 (cloned from master).

Now, irrespective of which Spark & Cassandra I'm connecting to, when I specify
spark.master=local[1]

all my OLAP works, but when I point it at a real, up-and-running Spark cluster, all hell breaks loose!

spark.master=spark://IP_191:7077                  # Doesn't work from JG 0.3.0 (Spark here is 2.2.0)
spark.master=spark://IP_81:7077                   # Doesn't work from JG 0.2.0 (Spark here is 1.6.1)

Over the last few days of exploration, I finally learned that even though TinkerPop > 3.2 doesn't need HDFS as intermediate storage, we still need to add HADOOP_CONF_DIR to our console CLASSPATH.

Case 1: Connecting JG 0.2.0 to Spark 1.6.1 cluster at master (81)
I add the following line to my gremlin.sh and then start the console:

CP="$CP":/root/IGA/hadoop/etc/hadoop                  # Hadoop version 2.6.5

Also, I checked in the UI that my Hadoop instance is up and running.
When I start the console now, it fails with:

Exception in thread "main" org.apache.tinkerpop.gremlin.groovy.plugin.PluginInitializationException: No FileSystem for scheme: hdfs
        at org.apache.tinkerpop.gremlin.hadoop.groovy.plugin.HadoopGremlinPlugin.afterPluginTo(HadoopGremlinPlugin.java:91)
        at org.apache.tinkerpop.gremlin.groovy.plugin.AbstractGremlinPlugin.pluginTo(AbstractGremlinPlugin.java:86)
        at org.codehaus.groovy.vmplugin.v7.IndyInterface.selectMethod(IndyInterface.java:232)
        at org.apache.tinkerpop.gremlin.console.PluggedIn.activate(PluggedIn.groovy:58)
        at org.apache.tinkerpop.gremlin.console.Console$_closure19.doCall(Console.groovy:146)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:93)
        at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
        at org.codehaus.groovy.runtime.metaclass.ClosureMetaClass.invokeMethod(ClosureMetaClass.java:294)
        at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1022)
        at groovy.lang.Closure.call(Closure.java:414)
        at groovy.lang.Closure.call(Closure.java:430)
        at org.codehaus.groovy.runtime.DefaultGroovyMethods.each(DefaultGroovyMethods.java:2040)
        at org.codehaus.groovy.runtime.DefaultGroovyMethods.each(DefaultGroovyMethods.java:2025)
        at org.codehaus.groovy.runtime.dgm$158.doMethodInvoke(Unknown Source)
        at org.codehaus.groovy.vmplugin.v7.IndyInterface.selectMethod(IndyInterface.java:232)
        at org.apache.tinkerpop.gremlin.console.Console.<init>(Console.groovy:133)
        at org.codehaus.groovy.vmplugin.v7.IndyInterface.selectMethod(IndyInterface.java:232)
        at org.apache.tinkerpop.gremlin.console.Console.main(Console.groovy:478)
Caused by: java.io.IOException: No FileSystem for scheme: hdfs
        at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2644)
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2651)
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:92)
        at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2687)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2669)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:371)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:170)
        at org.apache.tinkerpop.gremlin.hadoop.groovy.plugin.HadoopGremlinPlugin.afterPluginTo(HadoopGremlinPlugin.java:84)
        ... 21 more


Looking at JG#674, it looks like this is a case of missing Hadoop jars. I quickly ran "ls" on janusgraph-0.2.0/lib, and the following result confirms that a lot of Hadoop jars are missing compared to JanusGraph 0.3.0:

[root@igatest191 janusgraph-0.2.0-hadoop2]# ll lib/|grep hadoop
-rw-rw-r-- 1 root root   180736 Jul 18  2017 avro-mapred-1.7.7-hadoop2.jar
-rw-rw-r-- 1 root root    17385 Jul 18  2017 hadoop-annotations-2.7.2.jar
-rw-rw-r-- 1 root root    70685 Jul 18  2017 hadoop-auth-2.7.2.jar
-rw-rw-r-- 1 root root  3443040 Jul 18  2017 hadoop-common-2.7.2.jar
-rw-rw-r-- 1 root root   134652 Aug 28  2017 hadoop-gremlin-3.2.6.jar
-rw-rw-r-- 1 root root   104992 Oct 11  2017 janusgraph-hadoop-0.2.0.jar

Against janusgraph0.3.0:

[root@igatest191 janusgraph-0.3.0-SNAPSHOT-hadoop2]# ll lib/ | grep hadoop
-rw-r--r-- 1 root root   180736 Mar 13 00:29 avro-mapred-1.7.7-hadoop2.jar
-rw-r--r-- 1 root root    17385 Mar 13 01:01 hadoop-annotations-2.7.2.jar
-rw-r--r-- 1 root root    70685 Mar 13 00:30 hadoop-auth-2.7.2.jar
-rw-r--r-- 1 root root     2545 Mar 13 01:01 hadoop-client-2.7.2.jar
-rw-r--r-- 1 root root  3443040 Mar 13 00:30 hadoop-common-2.7.2.jar
-rw-r--r-- 1 root root   127410 Jul  3 01:18 hadoop-gremlin-3.3.3.jar
-rw-r--r-- 1 root root  8268375 Mar 13 01:01 hadoop-hdfs-2.7.2.jar
-rw-r--r-- 1 root root   516614 Mar 13 01:01 hadoop-mapreduce-client-app-2.7.2.jar
-rw-r--r-- 1 root root   753123 Mar 13 01:01 hadoop-mapreduce-client-common-2.7.2.jar
-rw-r--r-- 1 root root  1531485 Mar 13 06:07 hadoop-mapreduce-client-core-2.7.2.jar
-rw-r--r-- 1 root root    38213 Mar 13 01:01 hadoop-mapreduce-client-jobclient-2.7.2.jar
-rw-r--r-- 1 root root    48268 Mar 13 01:01 hadoop-mapreduce-client-shuffle-2.7.2.jar
-rw-r--r-- 1 root root  2015575 Mar 13 06:07 hadoop-yarn-api-2.7.2.jar
-rw-r--r-- 1 root root   142639 Mar 13 01:01 hadoop-yarn-client-2.7.2.jar
-rw-r--r-- 1 root root  1653294 Mar 13 06:07 hadoop-yarn-common-2.7.2.jar
-rw-r--r-- 1 root root   364376 Mar 13 01:01 hadoop-yarn-server-common-2.7.2.jar
-rw-r--r-- 1 root root   106202 Jul  4 00:44 janusgraph-hadoop-0.3.0-SNAPSHOT.jar
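
If the missing-jar diagnosis from JG#674 is right, one workaround I'm considering (just a sketch, not verified on my side; jar locations follow the standard Hadoop 2.7.2 binary layout under $HADOOP_HOME, and the JanusGraph path is a placeholder) would be to copy the HDFS client jars that 0.3.0 ships but 0.2.0 lacks into the 0.2.0 lib/ directory, so that the hdfs FileSystem implementation is on the console classpath:

# Sketch only: copy the Hadoop client jars into the 0.2.0 distribution's lib/
cp $HADOOP_HOME/share/hadoop/hdfs/hadoop-hdfs-2.7.2.jar \
   $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.7.2.jar \
   /path/to/janusgraph-0.2.0-hadoop2/lib/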

Now, instead of waiting for the stable 0.3.0 release, I went ahead and cloned master to get the latest build of JanusGraph, but even that isn't working for me.

I added the same CLASSPATH entries to the janusgraph-0.3.0 gremlin.sh script, and ran "hdfs" to confirm that my console is pointing to the right Hadoop cluster.
gremlin> hdfs
==>storage[DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_1971276907_1, ugi=root (auth:SIMPLE)]]]


Now, I use the following properties file to load a HadoopGraph from within the Gremlin console:
#
# Hadoop Graph Configuration
#
gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
gremlin.hadoop.graphReader=org.janusgraph.hadoop.formats.cassandra.Cassandra3InputFormat
gremlin.hadoop.graphWriter=org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoOutputFormat
gremlin.hadoop.graphInputFormat=org.janusgraph.hadoop.formats.cassandra.Cassandra3InputFormat
gremlin.hadoop.graphOutputFormat=org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoOutputFormat

gremlin.hadoop.jarsInDistributedCache=true
gremlin.hadoop.inputLocation=none
gremlin.hadoop.outputLocation=output


#
# JanusGraph Cassandra InputFormat configuration
#
janusgraphmr.ioformat.conf.storage.backend=cassandrathrift
storage.backend=cassandrathrift
janusgraphmr.ioformat.conf.storage.hostname=XXX_IP_178
janusgraphmr.ioformat.conf.storage.port=9160
janusgraphmr.ioformat.conf.storage.cassandra.keyspace=keyspace_test

#
# Apache Cassandra InputFormat configuration
#
cassandra.input.partitioner.class=org.apache.cassandra.dht.Murmur3Partitioner
cassandra.input.keyspace=keyspace_test
cassandra.input.predicate=0c00020b0001000000000b000200000000020003000800047fffffff0000
cassandra.input.columnfamily=edgestore
cassandra.range.batch.size=2147483647
cassandra.thrift.framed.size_mb=1024

#
# SparkGraphComputer Configuration
#
spark.master=spark://XXX_IP_191:7077
spark.executor.memory=6g
spark.serializer=org.apache.spark.serializer.KryoSerializer
spark.kryo.registrator=org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoRegistrator
gremlin.spark.persistContext=true
spark.driver.extraClassPath=/opt/janusgraph-0.3.0-SNAPSHOT-hadoop2/*
spark.executor.extraClassPath=/opt/janusgraph-0.3.0-SNAPSHOT-hadoop2/*

# Default Graph Computer
gremlin.hadoop.defaultGraphComputer=org.apache.tinkerpop.gremlin.spark.process.computer.SparkGraphComputer

####################################
# Hadoop Cluster configuration     #
####################################
fs.defaultFS=hdfs://XXX_IP_81:54310


I loaded the above properties file and did the following, but the count query fails:
gremlin> graph = GraphFactory.open("conf/testGraph-spark-cassandra3.properties")
==>hadoopgraph[cassandra3inputformat->gryooutputformat]
gremlin> g = graph.traversal().withComputer(SparkGraphComputer)
==>graphtraversalsource[hadoopgraph[cassandra3inputformat->gryooutputformat], sparkgraphcomputer]
gremlin> g.V().count()
[Stage 0:>                                                         (0 + 0) / 62]


It then fails with the following error:
java.lang.NoClassDefFoundError: Could not initialize class com.datastax.driver.core.Cluster
        at org.janusgraph.hadoop.formats.cassandra.CqlBridgeRecordReader.initialize(CqlBridgeRecordReader.java:127)
        at org.janusgraph.hadoop.formats.util.GiraphRecordReader.initialize(GiraphRecordReader.java:60)
        at org.apache.spark.rdd.NewHadoopRDD$$anon$1.liftedTree1$1(NewHadoopRDD.scala:182)
        at org.apache.spark.rdd.NewHadoopRDD$$anon$1.<init>(NewHadoopRDD.scala:179)
        at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:134)
        at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:69)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)


I googled a bit, and it looks like we need to install an external connector to connect to Cassandra from Spark. I did that by running the following from my Gremlin console for JanusGraph 0.3:
:install com.datastax.spark spark-cassandra-connector_2.11 2.0.8

This installed the connector jars in the /ext directory. I then restarted the console and re-ran the OLAP job, but I'm still facing the same error. I can also see the jars being added.
The following is a screenshot of the Environment tab from my job UI (port 4040):


So if that is the scenario, what am I missing? The following things have been tried, either from the Gremlin console or from a standalone Java class:
  1. Create a Spark context to connect to the Spark cluster, create an RDD and collect. (Works)
  2. Create a Spark context, and use any of the TP/JG libs like StarVertex, TinkerGraph and execute that against the Spark cluster. (Works)
  3. Create a Spark context, and within the Spark job, connect to JanusGraph using the Cassandra backend and count the vertices. (Works; see the sketch below.)
  4. Create a Spark context, and within that Spark job, connect to a HadoopGraph for doing OLAP. (Doesn't work)
It looks like it's just the HadoopGraph scenario which isn't working. So, any suggestions? Right now I'm at a complete dead end, and we are also approaching the deadline to make this proof of concept work.
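
For reference, item 3 is roughly the following minimal sketch from the Gremlin console (the hostname and keyspace are placeholders for our actual values):

// Plain OLTP connection, no HadoopGraph/Spark involved
graph = JanusGraphFactory.build().set('storage.backend', 'cassandrathrift').set('storage.hostname', 'IP_178').set('storage.cassandra.keyspace', 'keyspace_test').open()
g = graph.traversal()
g.V().count()    // this finishes fine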

Any help will be really appreciated. And if any extra information is needed, please let me know so that I can post the same.


Debasish Kanhar

Jul 9, 2018, 11:12:43 AM
to JanusGraph users
Update,

I tried doing OLAP after adding the following jars as plugins to the Gremlin console, but that also didn't work:

gremlin> :install com.datastax.cassandra cassandra-driver-core 3.1.0
==>Loaded: [com.datastax.cassandra, cassandra-driver-core, 3.1.0]
gremlin> :install com.datastax.cassandra cassandra-driver-mapping 3.1.0
==>Loaded: [com.datastax.cassandra, cassandra-driver-mapping, 3.1.0]
gremlin> :q

So, which jars am I missing exactly?

Jason Plurad

Jul 9, 2018, 12:38:18 PM
to JanusGraph users
You should clone the 0.2 branch and build 0.2.1-SNAPSHOT, which aligns with Spark 1.6.1. Or just grab the 0.2.1-SNAPSHOT build from here: https://ibm.box.com/janusgraph-release-candidates . I tried it out and it worked for me with a standalone Spark master and worker.
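
If you go the build route, something along these lines should do it (a sketch only; the janusgraph-release profile is what assembles the distribution zip, and the exact output location under janusgraph-dist may differ):

git clone https://github.com/JanusGraph/janusgraph.git
cd janusgraph
git checkout 0.2
mvn clean install -DskipTests=true -Pjanusgraph-release -Dgpg.skip=true
# look for the distribution zip under janusgraph-dist/ after the build completes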

Doing piecemeal grabs between the streams will drive you mad because the version conflicts between 0.2 and master/0.3 branches are significant.

I've opened up Issue 1159 to address the problems on master.

HadoopMarc

Jul 9, 2018, 2:34:55 PM
to JanusGraph users
Hi Debasish,

Regarding your original query:
  • the emit() condition loops().is(gt(1)) seems contradictory with following a simplePath().
  • the repeat loop has no end condition. Depending on your graph structure this may imply a combinatorial explosion. Why not build it up slowly with end conditions times(3), times(5), etc.?
Cheers,    Marc

On Monday, July 9, 2018 at 17:00:40 UTC+2, Debasish Kanhar wrote:

Debasish Kanhar

Jul 9, 2018, 4:13:53 PM
to JanusGraph users
Hi Jason,

Thanks for the link to the 0.2.1 release. I downloaded it and am trying it against Spark 1.6.1 with 1 master and 1 worker.

I'm still getting a few errors when running against the cluster, whereas locally I'm able to get OLAP working.

I just wanted to confirm your setup and whether I'm doing it correctly or not.

I use the following set of machines (listed here for easy reference):
IP_191 (Spark master, Janusgraph 0.2.1)
IP_178 (Spark worker, Cassandra, Elasticsearch)

I just changed spark.master in conf/read-cassandra-3.properties from local to my existing Spark cluster, and I'm unable to finish any Spark jobs. Even g.V().count() fails.

My updated properties file:
# Hadoop Graph Configuration
gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
gremlin.hadoop.graphInputFormat=org.janusgraph.hadoop.formats.cassandra.Cassandra3InputFormat
gremlin.hadoop.graphOutputFormat=org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoOutputFormat

gremlin.hadoop.jarsInDistributedCache=true
gremlin.hadoop.inputLocation=none
gremlin.hadoop.outputLocation=output
# JanusGraph Cassandra InputFormat configuration
janusgraphmr.ioformat.conf.storage.backend=cassandra
janusgraphmr.ioformat.conf.storage.hostname=IP_178
janusgraphmr.ioformat.conf.storage.port=9160
janusgraphmr.ioformat.conf.storage.cassandra.keyspace=janusgraph
# Apache Cassandra InputFormat configuration
cassandra.input.partitioner.class=org.apache.cassandra.dht.Murmur3Partitioner
# SparkGraphComputer Configuration
spark.master=spark://IP_191:7077
spark.executor.memory=6g
spark.serializer=org.apache.spark.serializer.KryoSerializer


I checked the executor logs and I'm finding the following errors, which take me back to an error I was getting a few months ago while running Spark jobs.

18/07/09 15:58:39 WARN KryoShimServiceLoader: KryoShimService implementations org.janusgraph.hadoop.serialize.JanusGraphKryoShimService@780a8fa9 and org.apache.tinkerpop.gremlin.hadoop.structure.io.HadoopPoolShimService@239362de are tied with priority value 0.  Preferring org.janusgraph.hadoop.serialize.JanusGraphKryoShimService to the other because it has a lexicographically greater classname.  Consider setting the system property "gremlin.io.kryoShimService" instead of relying on priority tie-breaking.
18/07/09 15:58:39 INFO KryoShimServiceLoader: Set KryoShimService provider to org.janusgraph.hadoop.serialize.JanusGraphKryoShimService@780a8fa9 (class org.janusgraph.hadoop.serialize.JanusGraphKryoShimService) because its priority value (0) is the highest available
18/07/09 15:58:39 INFO KryoShimServiceLoader: Configuring KryoShimService provider org.janusgraph.hadoop.serialize.JanusGraphKryoShimService@780a8fa9 with user-provided configuration
18/07/09 15:58:39 INFO ConnectionPoolMBeanManager: Registering mbean: com.netflix.MonitoredResources:type=ASTYANAX,name=ClusterJanusGraphConnectionPool,ServiceType=connectionpool
18/07/09 15:58:39 INFO CountingConnectionPoolMonitor: AddHost: igatest178.rtp.raleigh.ibm.com
18/07/09 15:58:39 INFO ConnectionPoolMBeanManager: Registering mbean: com.netflix.MonitoredResources:type=ASTYANAX,name=KeyspaceJanusGraphConnectionPool,ServiceType=connectionpool
18/07/09 15:58:39 INFO CountingConnectionPoolMonitor: AddHost: igatest178.rtp.raleigh.ibm.com
18/07/09 15:58:39 INFO CountingConnectionPoolMonitor: AddHost: 9.37.25.178
18/07/09 15:58:39 INFO CountingConnectionPoolMonitor: RemoveHost: igatest178.rtp.raleigh.ibm.com
18/07/09 15:58:40 INFO GraphDatabaseConfiguration: Generated unique-instance-id=092519b218996-igatest178-rtp-raleigh-ibm-com1
18/07/09 15:58:40 INFO ConnectionPoolMBeanManager: Registering mbean: com.netflix.MonitoredResources:type=ASTYANAX,name=ClusterJanusGraphConnectionPool,ServiceType=connectionpool
18/07/09 15:58:40 INFO CountingConnectionPoolMonitor: AddHost: igatest178.rtp.raleigh.ibm.com
18/07/09 15:58:40 INFO ConnectionPoolMBeanManager: Registering mbean: com.netflix.MonitoredResources:type=ASTYANAX,name=KeyspaceJanusGraphConnectionPool,ServiceType=connectionpool
18/07/09 15:58:40 INFO CountingConnectionPoolMonitor: AddHost: igatest178.rtp.raleigh.ibm.com
18/07/09 15:58:40 INFO CountingConnectionPoolMonitor: AddHost: 9.37.25.178
18/07/09 15:58:40 INFO CountingConnectionPoolMonitor: RemoveHost: igatest178.rtp.raleigh.ibm.com
18/07/09 15:58:40 INFO Backend: Configuring index [search]
18/07/09 15:58:40 ERROR SparkUncaughtExceptionHandler: Uncaught exception in thread Thread[pool-31-thread-1,5,main]
java.lang.NoSuchMethodError: org.apache.http.util.Asserts.check(ZLjava/lang/String;Ljava/lang/Object;)V
        at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor.execute(AbstractMultiworkerIOReactor.java:313)
        at org.apache.http.impl.nio.conn.PoolingNHttpClientConnectionManager.execute(PoolingNHttpClientConnectionManager.java:192)
        at org.apache.http.impl.nio.client.CloseableHttpAsyncClientBase$1.run(CloseableHttpAsyncClientBase.java:64)
        at java.lang.Thread.run(Thread.java:748)
18/07/09 15:58:40 INFO DiskBlockManager: Shutdown hook called
18/07/09 15:58:41 INFO ShutdownHookManager: Shutdown hook called
18/07/09 15:58:41 INFO ShutdownHookManager: Deleting directory /tmp/spark-3d63c27f-40c0-4806-92c5-1f0adf6dbb64/executor-447841e0-3085-42c4-ac7b-7d7b19168c9b/spark-877b4167-e114-46a9-9900-a80b7ed083a1
18/07/09 15:58:41 ERROR Executor: Exception in task 0.2 in stage 0.0 (TID 2)
java.lang.IllegalArgumentException: Could not instantiate implementation: org.janusgraph.hadoop.formats.util.input.current.JanusGraphHadoopSetupImpl
        at org.janusgraph.util.system.ConfigurationUtil.instantiate(ConfigurationUtil.java:69)
        at org.janusgraph.hadoop.formats.util.GiraphInputFormat.lambda$static$0(GiraphInputFormat.java:46)
        at org.janusgraph.hadoop.formats.util.GiraphInputFormat$RefCountedCloseable.acquire(GiraphInputFormat.java:100)
        at org.janusgraph.hadoop.formats.util.GiraphRecordReader.<init>(GiraphRecordReader.java:47)
        at org.janusgraph.hadoop.formats.util.GiraphInputFormat.createRecordReader(GiraphInputFormat.java:67)
        at org.apache.spark.rdd.NewHadoopRDD$$anon$1.<init>(NewHadoopRDD.scala:156)
        at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:129)
        at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:64)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
        at org.apache.spark.scheduler.Task.run(Task.scala:89)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.janusgraph.util.system.ConfigurationUtil.instantiate(ConfigurationUtil.java:58)
        ... 33 more
Caused by: java.lang.IllegalArgumentException: Could not instantiate implementation: org.janusgraph.diskstorage.es.ElasticSearchIndex
        at org.janusgraph.util.system.ConfigurationUtil.instantiate(ConfigurationUtil.java:69)
        at org.janusgraph.diskstorage.Backend.getImplementationClass(Backend.java:477)
        at org.janusgraph.diskstorage.Backend.getIndexes(Backend.java:464)
        at org.janusgraph.diskstorage.Backend.<init>(Backend.java:149)
        at org.janusgraph.graphdb.configuration.GraphDatabaseConfiguration.getBackend(GraphDatabaseConfiguration.java:1925)
        at org.janusgraph.graphdb.database.StandardJanusGraph.<init>(StandardJanusGraph.java:139)
        at org.janusgraph.core.JanusGraphFactory.open(JanusGraphFactory.java:164)
        at org.janusgraph.core.JanusGraphFactory.open(JanusGraphFactory.java:133)
        at org.janusgraph.core.JanusGraphFactory.open(JanusGraphFactory.java:123)
        at org.janusgraph.hadoop.formats.util.input.current.JanusGraphHadoopSetupImpl.<init>(JanusGraphHadoopSetupImpl.java:52)
        ... 38 more
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.janusgraph.util.system.ConfigurationUtil.instantiate(ConfigurationUtil.java:58)
        ... 47 more
Caused by: java.lang.NoSuchMethodError: org.apache.http.util.Asserts.check(ZLjava/lang/String;Ljava/lang/Object;)V
        at org.apache.http.impl.nio.client.CloseableHttpAsyncClientBase.ensureRunning(CloseableHttpAsyncClientBase.java:90)
        at org.apache.http.impl.nio.client.InternalHttpAsyncClient.execute(InternalHttpAsyncClient.java:123)
        at org.elasticsearch.client.RestClient.performRequestAsync(RestClient.java:344)
        at org.elasticsearch.client.RestClient.performRequestAsync(RestClient.java:326)
        at org.elasticsearch.client.RestClient.performRequest(RestClient.java:219)
        at org.elasticsearch.client.RestClient.performRequest(RestClient.java:192)
        at org.elasticsearch.client.RestClient.performRequest(RestClient.java:154)
        at org.janusgraph.diskstorage.es.rest.RestElasticSearchClient.getMajorVersion(RestElasticSearchClient.java:118)
        at org.janusgraph.diskstorage.es.rest.RestElasticSearchClient.<init>(RestElasticSearchClient.java:101)
        at org.janusgraph.diskstorage.es.ElasticSearchSetup$1.connect(ElasticSearchSetup.java:78)
        at org.janusgraph.diskstorage.es.ElasticSearchIndex.interfaceConfiguration(ElasticSearchIndex.java:303)
        at org.janusgraph.diskstorage.es.ElasticSearchIndex.<init>(ElasticSearchIndex.java:215)
        ... 52 more


But I see that all the jars under lib are distributed when I look at my Spark job UI (on port 4040) under the Environment tab.

Also note that my HDFS is configured and points to a live Hadoop cluster too.

So, did you try a similar setup and were you able to make this work? Did I miss anything?

Thanks

Debasish Kanhar

Jul 9, 2018, 4:17:35 PM
to JanusGraph users
Hi HadoopMarc,

I agree that a repeat without a limit can lead to an unexpected explosion, but per our use case we are interested in cycles without knowing their length in advance.

Hence we are applying the limit at the end, which restricts the computation of cycles to a particular number of results. A lower limit should therefore help me avoid the complex computations, and the memory explosions, of long paths.

Correct me if my understanding is wrong :-)

Debasish Kanhar

Jul 9, 2018, 4:22:38 PM
to JanusGraph users
Just wondering, Jason, can this be due to the Elasticsearch version? We are using 5.6.2, and we have also confirmed that the ES and Cassandra nodes are running.

Debasish Kanhar

Jul 9, 2018, 4:30:36 PM
to JanusGraph users
Also, the error remains unresolved even after adding the following properties:

cassandra.input.partitioner.class=org.apache.cassandra.dht.Murmur3Partitioner
cassandra.input.keyspace=janusgraph
cassandra.input.predicate=0c00020b0001000000000b000200000000020003000800047fffffff0000
cassandra.input.columnfamily=edgestore
cassandra.range.batch.size=2147483647
cassandra.thrift.framed.size_mb=1024

        at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.

Antriksh Shah

Jul 10, 2018, 12:12:02 AM
to JanusGraph users
Hey,

I believe there is a version mismatch in your jars. My hunch is that the spark-yarn jar is most likely creating this issue.


Debasish Kanhar

Jul 10, 2018, 1:35:26 AM
to Antriksh Shah, JanusGraph users
Hi Antriksh,

Thanks for the input. I'll have a look at the spark-yarn jar based on your links when I'm in the office.

But I remember following the first link; also, point 2 is that we don't use YARN. We just use a standalone Spark cluster.

So ideally we shouldn't be having conflicts because of spark-yarn, right?

Also, we are distributing just those jars which are present under the lib/ directory, as those are the only ones added to the classpath in the console. So my question immediately becomes: are those the only jars that need to be distributed? And if yes, why should the jars that come bundled with JanusGraph create version conflicts, since they are expected to have been tested together, right?


Evgeniy Ignatiev

Jul 10, 2018, 2:48:26 AM
to janusgra...@googlegroups.com

Hello.

In case you are distributing all the jars in JanusGraph's /lib directory, it might be worth considering removing the HBase-related ones, as your workload runs on Cassandra. I remember that we faced numerous issues, some of which looked like yours, with classes from the HBase shaded jars replacing classes we needed. In our case we ended up stripping the conflicting classes from the hbase-shaded-server jar.
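
As a quick experiment (just a sketch; the exact jar names in your lib/ may differ), you could move them aside before starting the console, so they are neither on the classpath nor distributed to the executors:

cd /path/to/janusgraph-0.2.1-SNAPSHOT-hadoop2
mkdir -p lib-disabled
mv lib/hbase-*.jar lib-disabled/    # keep them around in case something else needs them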

Best regards,
Evgeniy Ignatiev.


marc.de...@gmail.com

Jul 10, 2018, 3:20:33 AM
to JanusGraph users
Hi Debasish,

The limit(20) in your query is not effective in limiting the number of traversals because of the order().by() part in your query. If you do:
g.V(156090).as('a').repeat(out().simplePath()).emit().times(3).out().count()
g.V(156090).as('a').repeat(out().simplePath()).emit().times(5).out().count()
g.V(156090).as('a').repeat(out().simplePath()).emit().times(7).out().count()
...
g.V(156090).as('a').repeat(out().simplePath()).emit().times(21).out().count()


you will see what is happening in your graph.

Cheers,    Marc

On Monday, July 9, 2018 at 22:17:35 UTC+2, Debasish Kanhar wrote:

Debasish Kanhar

Jul 10, 2018, 7:44:11 AM
to JanusGraph users
@Yevgeniy: Thanks for the suggestion. I tried removing the HBase and YARN jars and re-running the whole process, but I'm still hitting the same error.

All, I guess this is a strange issue on my end. What information would be helpful in resolving it? Would a copy of the CLASSPATH entries help, or anything else?


Evgeniy Ignatiev

Jul 10, 2018, 7:58:07 AM
to janusgra...@googlegroups.com

Maybe -verbose:class can be helpful, something like this:

spark.executor.extraJavaOptions=-verbose:class

Then the respective container logs should print, for each class, where it was loaded from.
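
Once the executors have run with that option, you could grep the container stdout for the class from your NoSuchMethodError to see which jar actually supplied it, for example (the work directory path is taken from your earlier logs and may differ on your machines):

grep 'org.apache.http.util.Asserts' /root/iga/spark/work/app-*/*/stdout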

Best regards,
Evgeniy Ignatiev

Debasish Kanhar

Jul 10, 2018, 10:42:01 AM
to JanusGraph users
Hi @Evgeniy: How will that parameter (-verbose:class) help me?

Under my Spark application UI (port 4040) I see the following under Executors:

Now, when I check the logs under "stderr", the following is my log entry:

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
18/07/10 10:29:42 INFO CoarseGrainedExecutorBackend: Registered signal handlers for [TERM, HUP, INT]
18/07/10 10:29:42 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
18/07/10 10:29:43 INFO SecurityManager: Changing view acls to: root
18/07/10 10:29:43 INFO SecurityManager: Changing modify acls to: root
18/07/10 10:29:43 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
18/07/10 10:29:43 INFO SecurityManager: Changing view acls to: root
18/07/10 10:29:43 INFO SecurityManager: Changing modify acls to: root
18/07/10 10:29:43 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
18/07/10 10:29:44 INFO Slf4jLogger: Slf4jLogger started
18/07/10 10:29:44 INFO Remoting: Starting remoting
18/07/10 10:29:44 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkExecuto...@9.37.25.178:46158]
18/07/10 10:29:44 INFO Utils: Successfully started service 'sparkExecutorActorSystem' on port 46158.
18/07/10 10:29:44 INFO DiskBlockManager: Created local directory at /tmp/spark-3d63c27f-40c0-4806-92c5-1f0adf6dbb64/executor-e941fc5b-cffc-4ffd-89fb-8ab6abfac4be/blockmgr-21626bd0-215f-4e2b-a554-5afe59a3305c
18/07/10 10:29:44 INFO MemoryStore: MemoryStore started with capacity 4.1 GB
18/07/10 10:29:44 INFO CoarseGrainedExecutorBackend: Connecting to driver: spark://CoarseGrain...@9.37.25.191:34208
18/07/10 10:29:44 INFO WorkerWatcher: Connecting to worker spark://Wor...@9.37.25.178:44136
18/07/10 10:29:44 INFO CoarseGrainedExecutorBackend: Successfully registered with driver
18/07/10 10:29:44 INFO Executor: Starting executor ID 12 on host igatest178.rtp.raleigh.ibm.com
18/07/10 10:29:45 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 41006.
18/07/10 10:29:45 INFO NettyBlockTransferService: Server created on 41006
18/07/10 10:29:45 INFO BlockManagerMaster: Trying to register BlockManager
18/07/10 10:29:45 INFO BlockManagerMaster: Registered BlockManager

Whereas when I check stdout I see the following:

un$2$$anonfun$apply$1 from file:/root/iga/spark/lib/spark-assembly-1.6.1-hadoop2.6.0.jar]
[Loaded akka.remote.transport.ThrottlerProvider from file:/root/iga/spark/lib/spark-assembly-1.6.1-hadoop2.6.0.jar]
[Loaded akka.remote.EndpointManager$$anonfun$9$$anonfun$10 from file:/root/iga/spark/lib/spark-assembly-1.6.1-hadoop2.6.0.jar]
[Loaded akka.remote.EndpointManager$$anonfun$9$$anonfun$11 from file:/root/iga/spark/lib/spark-assembly-1.6.1-hadoop2.6.0.jar]
[Loaded akka.remote.transport.SchemeAugmenter from file:/root/iga/spark/lib/spark-assembly-1.6.1-hadoop2.6.0.jar]
[Loaded akka.remote.transport.AbstractTransportAdapter from file:/root/iga/spark/lib/spark-assembly-1.6.1-hadoop2.6.0.jar]
[Loaded akka.remote.transport.ActorTransportAdapter from file:/root/iga/spark/lib/spark-assembly-1.6.1-hadoop2.6.0.jar]
[Loaded akka.remote.transport.AkkaProtocolTransport from file:/root/iga/spark/lib/spark-assembly-1.6.1-hadoop2.6.0.jar]
[Loaded akka.remote.transport.AkkaProtocolSettings from file:/root/iga/spark/lib/spark-assembly-1.6.1-hadoop2.6.0.jar]
[Loaded akka.remote.transport.AkkaProtocolSettings$$anonfun$7 from file:/root/iga/spark/lib/spark-assembly-1.6.1-hadoop2.6.0.jar]
[Loaded akka.remote.transport.AkkaProtocolSettings$$anonfun$6 from file:/root/iga/spark/lib/spark-assembly-1.6.1-hadoop2.6.0.jar]
[Loaded akka.remote.transport.AkkaPduProtobufCodec$ from file:/root/iga/spark/lib/spark-assembly-1.6.1-hadoop2.6.0.jar]
[Loaded akka.util.ByteString from file:/root/iga/spark/lib/spark-assembly-1.6.1-hadoop2.6.0.jar]
[Loaded akka.util.CompactByteString from file:/root/iga/spark/lib/spark-assembly-1.6.1-hadoop2.6.0.jar]
[Loaded akka.util.ByteString$ByteString1C from file:/root/iga/spark/lib/spark-assembly-1.6.1-hadoop2.6.0.jar]
[Loaded akka.remote.transport.AkkaPduCodec$AkkaPdu from file:/root/iga/spark/lib/spark-assembly-1.6.1-hadoop2.6.0.jar]
[Loaded akka.remote.transport.PduCodecException from file:/root/iga/spark/lib/spark-assembly-1.6.1-hadoop2.6.0.jar]
[Loaded com.google.protobuf.InvalidProtocolBufferException from file:/root/iga/spark/lib/spark-assembly-1.6.1-hadoop2.6.0.jar]
[Loaded akka.remote.transport.AkkaPduCodec$class from file:/root/iga/spark/lib/spark-assembly-1.6.1-hadoop2.6.0.jar]
[Loaded com.google.protobuf.Internal$EnumLite from file:/root/iga/spark/lib/spark-assembly-1.6.1-hadoop2.6.0.jar]
[Loaded com.google.protobuf.ProtocolMessageEnum from file:/root/iga/spark/lib/spark-assembly-1.6.1-hadoop2.6.0.jar]
[Loaded akka.remote.WireFormats$CommandType from file:/root/iga/spark/lib/spark-assembly-1.6.1-hadoop2.6.0.jar]
[Loaded com.google.protobuf.Internal$EnumLiteMap from file:/root/iga/spark/lib/spark-assembly-1.6.1-hadoop2.6.0.jar]
[Loaded akka.remote.WireFormats$CommandType$1 from file:/root/iga/spark/lib/spark-assembly-1.6.1-hadoop2.6.0.jar]
[Loaded akka.remote.WireFormats$AkkaControlMessageOrBuilder from file:/root/iga/spark/lib/spark-assembly-1.6.1-hadoop2.6.0.jar]
........
[Loaded java.lang.management.GarbageCollectorMXBean from /usr/java/jdk1.8.0_172-amd64/jre/lib/rt.jar]
[Loaded com.sun.management.GarbageCollectorMXBean from /usr/java/jdk1.8.0_172-amd64/jre/lib/rt.jar]
[Loaded sun.management.GarbageCollectorImpl from /usr/java/jdk1.8.0_172-amd64/jre/lib/rt.jar]
[Loaded scala.collection.convert.DecorateAsScala$$anonfun$asScalaBufferConverter$1 from file:/root/iga/spark/lib/spark-assembly-1.6.1-hadoop2.6.0.jar]
[Loaded scala.collection.convert.Wrappers$MutableBufferWrapper from file:/root/iga/spark/lib/spark-assembly-1.6.1-hadoop2.6.0.jar]
[Loaded org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor$Executor$$computeTotalGcTime$1 from file:/root/iga/spark/lib/spark-assembly-1.6.1-hadoop2.6.0.jar]
[Loaded org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor$Executor$$reportHeartBeat$1 from file:/root/iga/spark/lib/spark-assembly-1.6.1-hadoop2.6.0.jar]
[Loaded org.apache.spark.Heartbeat from file:/root/iga/spark/lib/spark-assembly-1.6.1-hadoop2.6.0.jar]
[Loaded org.apache.spark.HeartbeatResponse from file:/root/iga/spark/lib/spark-assembly-1.6.1-hadoop2.6.0.jar]
[Loaded sun.reflect.GeneratedSerializationConstructorAccessor19 from __JVM_DefineClass__]
[Loaded sun.reflect.GeneratedSerializationConstructorAccessor20 from __JVM_DefineClass__]
[Loaded sun.reflect.GeneratedMethodAccessor1 from __JVM_DefineClass__]
[Loaded io.netty.handler.timeout.IdleStateEvent from file:/root/iga/spark/lib/spark-assembly-1.6.1-hadoop2.6.0.jar]
[Loaded io.netty.handler.timeout.IdleState from file:/root/iga/spark/lib/spark-assembly-1.6.1-hadoop2.6.0.jar]
[Loaded io.netty.buffer.CompositeByteBuf$Component from file:/root/iga/spark/lib/spark-assembly-1.6.1-hadoop2.6.0.jar]
[Loaded io.netty.buffer.PooledSlicedByteBuf$1 from file:/root/iga/spark/lib/spark-assembly-1.6.1-hadoop2.6.0.jar]

It looks like all the jars are getting distributed. But when I run the OLAP job, the same error resurfaces with the same stack trace and not much clarity on which jar is breaking things.

The following is my stacktrace taken from stderr console logs:

....
18/07/10 10:36:09 INFO Executor: Adding file:/root/iga/spark/work/app-20180710102451-0008/12/./gprof-0.3.1-groovy-2.4.jar to class loader
18/07/10 10:36:09 INFO Executor: Fetching http://9.37.25.191:46800/jars/janusgraph-bigtable-0.2.1-SNAPSHOT.jar with timestamp 1531233365400
18/07/10 10:36:09 INFO Utils: Fetching http://9.37.25.191:46800/jars/janusgraph-bigtable-0.2.1-SNAPSHOT.jar to /tmp/spark-3d63c27f-40c0-4806-92c5-1f0adf6dbb64/executor-e941fc5b-cffc-4ffd-89fb-8ab6abfac4be/spark-320b3dfc-7390-40ac-a468-d4712ed683a0/fetchFileTemp1798086954382390480.tmp
18/07/10 10:36:09 INFO Utils: Copying /tmp/spark-3d63c27f-40c0-4806-92c5-1f0adf6dbb64/executor-e941fc5b-cffc-4ffd-89fb-8ab6abfac4be/spark-320b3dfc-7390-40ac-a468-d4712ed683a0/11139597031531233365400_cache to /root/iga/spark/work/app-20180710102451-0008/12/./janusgraph-bigtable-0.2.1-SNAPSHOT.jar
18/07/10 10:36:09 INFO Executor: Adding file:/root/iga/spark/work/app-20180710102451-0008/12/./janusgraph-bigtable-0.2.1-SNAPSHOT.jar to class loader
18/07/10 10:36:09 INFO Executor: Fetching http://9.37.25.191:46800/jars/asm-tree-5.0.3.jar with timestamp 1531233365356
18/07/10 10:36:09 INFO Utils: Fetching http://9.37.25.191:46800/jars/asm-tree-5.0.3.jar to /tmp/spark-3d63c27f-40c0-4806-92c5-1f0adf6dbb64/executor-e941fc5b-cffc-4ffd-89fb-8ab6abfac4be/spark-320b3dfc-7390-40ac-a468-d4712ed683a0/fetchFileTemp3178611410848478637.tmp
18/07/10 10:36:09 INFO Utils: Copying /tmp/spark-3d63c27f-40c0-4806-92c5-1f0adf6dbb64/executor-e941fc5b-cffc-4ffd-89fb-8ab6abfac4be/spark-320b3dfc-7390-40ac-a468-d4712ed683a0/-13912099031531233365356_cache to /root/iga/spark/work/app-20180710102451-0008/12/./asm-tree-5.0.3.jar
18/07/10 10:36:09 INFO Executor: Adding file:/root/iga/spark/work/app-20180710102451-0008/12/./asm-tree-5.0.3.jar to class loader
18/07/10 10:36:09 INFO Executor: Fetching http://9.37.25.191:46800/jars/reflectasm-1.07-shaded.jar with timestamp 1531233365769
18/07/10 10:36:09 INFO Utils: Fetching http://9.37.25.191:46800/jars/reflectasm-1.07-shaded.jar to /tmp/spark-3d63c27f-40c0-4806-92c5-1f0adf6dbb64/executor-e941fc5b-cffc-4ffd-89fb-8ab6abfac4be/spark-320b3dfc-7390-40ac-a468-d4712ed683a0/fetchFileTemp652955947451734099.tmp
18/07/10 10:36:09 INFO Utils: Copying /tmp/spark-3d63c27f-40c0-4806-92c5-1f0adf6dbb64/executor-e941fc5b-cffc-4ffd-89fb-8ab6abfac4be/spark-320b3dfc-7390-40ac-a468-d4712ed683a0/-3020041891531233365769_cache to /root/iga/spark/work/app-20180710102451-0008/12/./reflectasm-1.07-shaded.jar
18/07/10 10:36:09 INFO Executor: Adding file:/root/iga/spark/work/app-20180710102451-0008/12/./reflectasm-1.07-shaded.jar to class loader
18/07/10 10:36:09 INFO Executor: Fetching http://9.37.25.191:46800/jars/gremlin-groovy-3.2.9.jar with timestamp 1531233364902
18/07/10 10:36:09 INFO Utils: Fetching http://9.37.25.191:46800/jars/gremlin-groovy-3.2.9.jar to /tmp/spark-3d63c27f-40c0-4806-92c5-1f0adf6dbb64/executor-e941fc5b-cffc-4ffd-89fb-8ab6abfac4be/spark-320b3dfc-7390-40ac-a468-d4712ed683a0/fetchFileTemp4566401577670539208.tmp
18/07/10 10:36:09 INFO Utils: Copying /tmp/spark-3d63c27f-40c0-4806-92c5-1f0adf6dbb64/executor-e941fc5b-cffc-4ffd-89fb-8ab6abfac4be/spark-320b3dfc-7390-40ac-a468-d4712ed683a0/13389009961531233364902_cache to /root/iga/spark/work/app-20180710102451-0008/12/./gremlin-groovy-3.2.9.jar
18/07/10 10:36:09 INFO Executor: Adding file:/root/iga/spark/work/app-20180710102451-0008/12/./gremlin-groovy-3.2.9.jar to class loader
18/07/10 10:36:09 INFO Executor: Fetching http://9.37.25.191:46800/jars/curator-recipes-2.7.1.jar with timestamp 1531233365502
18/07/10 10:36:09 INFO Utils: Fetching http://9.37.25.191:46800/jars/curator-recipes-2.7.1.jar to /tmp/spark-3d63c27f-40c0-4806-92c5-1f0adf6dbb64/executor-e941fc5b-cffc-4ffd-89fb-8ab6abfac4be/spark-320b3dfc-7390-40ac-a468-d4712ed683a0/fetchFileTemp3829276583644954266.tmp
18/07/10 10:36:09 INFO Utils: Copying /tmp/spark-3d63c27f-40c0-4806-92c5-1f0adf6dbb64/executor-e941fc5b-cffc-4ffd-89fb-8ab6abfac4be/spark-320b3dfc-7390-40ac-a468-d4712ed683a0/-7230829691531233365502_cache to /root/iga/spark/work/app-20180710102451-0008/12/./curator-recipes-2.7.1.jar
18/07/10 10:36:09 INFO Executor: Adding file:/root/iga/spark/work/app-20180710102451-0008/12/./curator-recipes-2.7.1.jar to class loader
18/07/10 10:36:09 INFO Executor: Fetching http://9.37.25.191:46800/jars/jffi-1.2.10-native.jar with timestamp 1531233365352
18/07/10 10:36:09 INFO Utils: Fetching http://9.37.25.191:46800/jars/jffi-1.2.10-native.jar to /tmp/spark-3d63c27f-40c0-4806-92c5-1f0adf6dbb64/executor-e941fc5b-cffc-4ffd-89fb-8ab6abfac4be/spark-320b3dfc-7390-40ac-a468-d4712ed683a0/fetchFileTemp4961870570726750823.tmp
18/07/10 10:36:09 INFO Utils: Copying /tmp/spark-3d63c27f-40c0-4806-92c5-1f0adf6dbb64/executor-e941fc5b-cffc-4ffd-89fb-8ab6abfac4be/spark-320b3dfc-7390-40ac-a468-d4712ed683a0/-16579481121531233365352_cache to /root/iga/spark/work/app-20180710102451-0008/12/./jffi-1.2.10-native.jar
18/07/10 10:36:09 INFO Executor: Adding file:/root/iga/spark/work/app-20180710102451-0008/12/./jffi-1.2.10-native.jar to class loader
18/07/10 10:36:09 INFO Executor: Fetching http://9.37.25.191:46800/jars/jackson-annotations-2.6.6.jar with timestamp 1531233366064
18/07/10 10:36:09 INFO Utils: Fetching http://9.37.25.191:46800/jars/jackson-annotations-2.6.6.jar to /tmp/spark-3d63c27f-40c0-4806-92c5-1f0adf6dbb64/executor-e941fc5b-cffc-4ffd-89fb-8ab6abfac4be/spark-320b3dfc-7390-40ac-a468-d4712ed683a0/fetchFileTemp8453303469542242344.tmp
18/07/10 10:36:09 INFO Utils: Copying /tmp/spark-3d63c27f-40c0-4806-92c5-1f0adf6dbb64/executor-e941fc5b-cffc-4ffd-89fb-8ab6abfac4be/spark-320b3dfc-7390-40ac-a468-d4712ed683a0/5393579911531233366064_cache to /root/iga/spark/work/app-20180710102451-0008/12/./jackson-annotations-2.6.6.jar
18/07/10 10:36:09 INFO Executor: Adding file:/root/iga/spark/work/app-20180710102451-0008/12/./jackson-annotations-2.6.6.jar to class loader
18/07/10 10:36:09 INFO Executor: Fetching http://9.37.25.191:46800/jars/jnr-ffi-2.0.7.jar with timestamp 1531233365343
18/07/10 10:36:09 INFO Utils: Fetching http://9.37.25.191:46800/jars/jnr-ffi-2.0.7.jar to /tmp/spark-3d63c27f-40c0-4806-92c5-1f0adf6dbb64/executor-e941fc5b-cffc-4ffd-89fb-8ab6abfac4be/spark-320b3dfc-7390-40ac-a468-d4712ed683a0/fetchFileTemp4722225128737775624.tmp
18/07/10 10:36:09 INFO Utils: Copying /tmp/spark-3d63c27f-40c0-4806-92c5-1f0adf6dbb64/executor-e941fc5b-cffc-4ffd-89fb-8ab6abfac4be/spark-320b3dfc-7390-40ac-a468-d4712ed683a0/-541596141531233365343_cache to /root/iga/spark/work/app-20180710102451-0008/12/./jnr-ffi-2.0.7.jar
18/07/10 10:36:09 INFO Executor: Adding file:/root/iga/spark/work/app-20180710102451-0008/12/./jnr-ffi-2.0.7.jar to class loader
18/07/10 10:36:09 INFO Executor: Fetching http://9.37.25.191:46800/jars/hadoop-gremlin-3.2.9.jar with timestamp 1531233365405
18/07/10 10:36:09 INFO Utils: Fetching http://9.37.25.191:46800/jars/hadoop-gremlin-3.2.9.jar to /tmp/spark-3d63c27f-40c0-4806-92c5-1f0adf6dbb64/executor-e941fc5b-cffc-4ffd-89fb-8ab6abfac4be/spark-320b3dfc-7390-40ac-a468-d4712ed683a0/fetchFileTemp8555211978762033949.tmp
18/07/10 10:36:09 INFO Utils: Copying /tmp/spark-3d63c27f-40c0-4806-92c5-1f0adf6dbb64/executor-e941fc5b-cffc-4ffd-89fb-8ab6abfac4be/spark-320b3dfc-7390-40ac-a468-d4712ed683a0/18320624931531233365405_cache to /root/iga/spark/work/app-20180710102451-0008/12/./hadoop-gremlin-3.2.9.jar
18/07/10 10:36:09 INFO Executor: Adding file:/root/iga/spark/work/app-20180710102451-0008/12/./hadoop-gremlin-3.2.9.jar to class loader
18/07/10 10:36:09 INFO Executor: Fetching http://9.37.25.191:46800/jars/api-asn1-api-1.0.0-M20.jar with timestamp 1531233365472
18/07/10 10:36:09 INFO Utils: Fetching http://9.37.25.191:46800/jars/api-asn1-api-1.0.0-M20.jar to /tmp/spark-3d63c27f-40c0-4806-92c5-1f0adf6dbb64/executor-e941fc5b-cffc-4ffd-89fb-8ab6abfac4be/spark-320b3dfc-7390-40ac-a468-d4712ed683a0/fetchFileTemp2969082886565617830.tmp
18/07/10 10:36:09 INFO Utils: Copying /tmp/spark-3d63c27f-40c0-4806-92c5-1f0adf6dbb64/executor-e941fc5b-cffc-4ffd-89fb-8ab6abfac4be/spark-320b3dfc-7390-40ac-a468-d4712ed683a0/3440678851531233365472_cache to /root/iga/spark/work/app-20180710102451-0008/12/./api-asn1-api-1.0.0-M20.jar
18/07/10 10:36:09 INFO Executor: Adding file:/root/iga/spark/work/app-20180710102451-0008/12/./api-asn1-api-1.0.0-M20.jar to class loader
18/07/10 10:36:09 INFO Executor: Fetching http://9.37.25.191:46800/jars/scalap-2.10.0.jar with timestamp 1531233365875
18/07/10 10:36:09 INFO Utils: Fetching http://9.37.25.191:46800/jars/scalap-2.10.0.jar to /tmp/spark-3d63c27f-40c0-4806-92c5-1f0adf6dbb64/executor-e941fc5b-cffc-4ffd-89fb-8ab6abfac4be/spark-320b3dfc-7390-40ac-a468-d4712ed683a0/fetchFileTemp3612607226499135748.tmp
18/07/10 10:36:09 INFO Utils: Copying /tmp/spark-3d63c27f-40c0-4806-92c5-1f0adf6dbb64/executor-e941fc5b-cffc-4ffd-89fb-8ab6abfac4be/spark-320b3dfc-7390-40ac-a468-d4712ed683a0/7026913041531233365875_cache to /root/iga/spark/work/app-20180710102451-0008/12/./scalap-2.10.0.jar
18/07/10 10:36:09 INFO Executor: Adding file:/root/iga/spark/work/app-20180710102451-0008/12/./scalap-2.10.0.jar to class loader
18/07/10 10:36:09 INFO Executor: Fetching http://9.37.25.191:46800/jars/javax.inject-1.jar with timestamp 1531233365634
18/07/10 10:36:09 INFO Utils: Fetching http://9.37.25.191:46800/jars/javax.inject-1.jar to /tmp/spark-3d63c27f-40c0-4806-92c5-1f0adf6dbb64/executor-e941fc5b-cffc-4ffd-89fb-8ab6abfac4be/spark-320b3dfc-7390-40ac-a468-d4712ed683a0/fetchFileTemp6935115257515615123.tmp
18/07/10 10:36:09 INFO Utils: Copying /tmp/spark-3d63c27f-40c0-4806-92c5-1f0adf6dbb64/executor-e941fc5b-cffc-4ffd-89fb-8ab6abfac4be/spark-320b3dfc-7390-40ac-a468-d4712ed683a0/8006927991531233365634_cache to /root/iga/spark/work/app-20180710102451-0008/12/./javax.inject-1.jar
18/07/10 10:36:09 INFO Executor: Adding file:/root/iga/spark/work/app-20180710102451-0008/12/./javax.inject-1.jar to class loader
18/07/10 10:36:09 INFO Executor: Fetching http://9.37.25.191:46800/jars/stringtemplate-3.2.jar with timestamp 1531233366150
18/07/10 10:36:09 INFO Utils: Fetching http://9.37.25.191:46800/jars/stringtemplate-3.2.jar to /tmp/spark-3d63c27f-40c0-4806-92c5-1f0adf6dbb64/executor-e941fc5b-cffc-4ffd-89fb-8ab6abfac4be/spark-320b3dfc-7390-40ac-a468-d4712ed683a0/fetchFileTemp7022502220342512103.tmp
18/07/10 10:36:09 INFO Utils: Copying /tmp/spark-3d63c27f-40c0-4806-92c5-1f0adf6dbb64/executor-e941fc5b-cffc-4ffd-89fb-8ab6abfac4be/spark-320b3dfc-7390-40ac-a468-d4712ed683a0/13139938311531233366150_cache to /root/iga/spark/work/app-20180710102451-0008/12/./stringtemplate-3.2.jar
18/07/10 10:36:09 INFO Executor: Adding file:/root/iga/spark/work/app-20180710102451-0008/12/./stringtemplate-3.2.jar to class loader
18/07/10 10:36:09 INFO Executor: Fetching http://9.37.25.191:46800/jars/jackson-xc-1.9.13.jar with timestamp 1531233365631
18/07/10 10:36:09 INFO Utils: Fetching http://9.37.25.191:46800/jars/jackson-xc-1.9.13.jar to /tmp/spark-3d63c27f-40c0-4806-92c5-1f0adf6dbb64/executor-e941fc5b-cffc-4ffd-89fb-8ab6abfac4be/spark-320b3dfc-7390-40ac-a468-d4712ed683a0/fetchFileTemp4904532842453846347.tmp
18/07/10 10:36:09 INFO Utils: Copying /tmp/spark-3d63c27f-40c0-4806-92c5-1f0adf6dbb64/executor-e941fc5b-cffc-4ffd-89fb-8ab6abfac4be/spark-320b3dfc-7390-40ac-a468-d4712ed683a0/17485049641531233365631_cache to /root/iga/spark/work/app-20180710102451-0008/12/./jackson-xc-1.9.13.jar
18/07/10 10:36:09 INFO Executor: Adding file:/root/iga/spark/work/app-20180710102451-0008/12/./jackson-xc-1.9.13.jar to class loader
18/07/10 10:36:09 INFO Executor: Fetching http://9.37.25.191:46800/jars/logback-core-1.1.2.jar with timestamp 1531233365091
18/07/10 10:36:09 INFO Utils: Fetching http://9.37.25.191:46800/jars/logback-core-1.1.2.jar to /tmp/spark-3d63c27f-40c0-4806-92c5-1f0adf6dbb64/executor-e941fc5b-cffc-4ffd-89fb-8ab6abfac4be/spark-320b3dfc-7390-40ac-a468-d4712ed683a0/fetchFileTemp8946259640623836470.tmp
18/07/10 10:36:09 INFO Utils: Copying /tmp/spark-3d63c27f-40c0-4806-92c5-1f0adf6dbb64/executor-e941fc5b-cffc-4ffd-89fb-8ab6abfac4be/spark-320b3dfc-7390-40ac-a468-d4712ed683a0/-6333571541531233365091_cache to /root/iga/spark/work/app-20180710102451-0008/12/./logback-core-1.1.2.jar
18/07/10 10:36:09 INFO Executor: Adding file:/root/iga/spark/work/app-20180710102451-0008/12/./logback-core-1.1.2.jar to class loader
18/07/10 10:36:09 INFO Executor: Fetching http://9.37.25.191:46800/jars/hadoop-mapreduce-client-core-2.7.2.jar with timestamp 1531233365668
18/07/10 10:36:09 INFO Utils: Fetching http://9.37.25.191:46800/jars/hadoop-mapreduce-client-core-2.7.2.jar to /tmp/spark-3d63c27f-40c0-4806-92c5-1f0adf6dbb64/executor-e941fc5b-cffc-4ffd-89fb-8ab6abfac4be/spark-320b3dfc-7390-40ac-a468-d4712ed683a0/fetchFileTemp9220899167328081835.tmp
18/07/10 10:36:09 INFO Utils: Copying /tmp/spark-3d63c27f-40c0-4806-92c5-1f0adf6dbb64/executor-e941fc5b-cffc-4ffd-89fb-8ab6abfac4be/spark-320b3dfc-7390-40ac-a468-d4712ed683a0/14936750261531233365668_cache to /root/iga/spark/work/app-20180710102451-0008/12/./hadoop-mapreduce-client-core-2.7.2.jar
18/07/10 10:36:09 INFO Executor: Adding file:/root/iga/spark/work/app-20180710102451-0008/12/./hadoop-mapreduce-client-core-2.7.2.jar to class loader
18/07/10 10:36:09 INFO Executor: Fetching http://9.37.25.191:46800/jars/avro-1.7.4.jar with timestamp 1531233365445
18/07/10 10:36:09 INFO Utils: Fetching http://9.37.25.191:46800/jars/avro-1.7.4.jar to /tmp/spark-3d63c27f-40c0-4806-92c5-1f0adf6dbb64/executor-e941fc5b-cffc-4ffd-89fb-8ab6abfac4be/spark-320b3dfc-7390-40ac-a468-d4712ed683a0/fetchFileTemp5407678074856930474.tmp
18/07/10 10:36:09 INFO Utils: Copying /tmp/spark-3d63c27f-40c0-4806-92c5-1f0adf6dbb64/executor-e941fc5b-cffc-4ffd-89fb-8ab6abfac4be/spark-320b3dfc-7390-40ac-a468-d4712ed683a0/-7332377371531233365445_cache to /root/iga/spark/work/app-20180710102451-0008/12/./avro-1.7.4.jar
18/07/10 10:36:09 INFO Executor: Adding file:/root/iga/spark/work/app-20180710102451-0008/12/./avro-1.7.4.jar to class loader
18/07/10 10:36:09 INFO Executor: Fetching http://9.37.25.191:46800/jars/hadoop-yarn-common-2.7.2.jar with timestamp 1531233365599
18/07/10 10:36:09 INFO Utils: Fetching http://9.37.25.191:46800/jars/hadoop-yarn-common-2.7.2.jar to /tmp/spark-3d63c27f-40c0-4806-92c5-1f0adf6dbb64/executor-e941fc5b-cffc-4ffd-89fb-8ab6abfac4be/spark-320b3dfc-7390-40ac-a468-d4712ed683a0/fetchFileTemp537768874850129973.tmp
18/07/10 10:36:09 INFO Utils: Copying /tmp/spark-3d63c27f-40c0-4806-92c5-1f0adf6dbb64/executor-e941fc5b-cffc-4ffd-89fb-8ab6abfac4be/spark-320b3dfc-7390-40ac-a468-d4712ed683a0/-9932547681531233365599_cache to /root/iga/spark/work/app-20180710102451-0008/12/./hadoop-yarn-common-2.7.2.jar
18/07/10 10:36:09 INFO Executor: Adding file:/root/iga/spark/work/app-20180710102451-0008/12/./hadoop-yarn-common-2.7.2.jar to class loader
18/07/10 10:36:09 INFO Executor: Fetching http://9.37.25.191:46800/jars/akka-actor_2.10-2.3.11.jar with timestamp 1531233365852
18/07/10 10:36:09 INFO Utils: Fetching http://9.37.25.191:46800/jars/akka-actor_2.10-2.3.11.jar to /tmp/spark-3d63c27f-40c0-4806-92c5-1f0adf6dbb64/executor-e941fc5b-cffc-4ffd-89fb-8ab6abfac4be/spark-320b3dfc-7390-40ac-a468-d4712ed683a0/fetchFileTemp4735888887348543122.tmp
18/07/10 10:36:09 INFO Utils: Copying /tmp/spark-3d63c27f-40c0-4806-92c5-1f0adf6dbb64/executor-e941fc5b-cffc-4ffd-89fb-8ab6abfac4be/spark-320b3dfc-7390-40ac-a468-d4712ed683a0/-7320050521531233365852_cache to /root/iga/spark/work/app-20180710102451-0008/12/./akka-actor_2.10-2.3.11.jar
18/07/10 10:36:09 INFO Executor: Adding file:/root/iga/spark/work/app-20180710102451-0008/12/./akka-actor_2.10-2.3.11.jar to class loader
18/07/10 10:36:09 INFO Executor: Fetching http://9.37.25.191:46800/jars/xml-apis-1.3.04.jar with timestamp 1531233365056
18/07/10 10:36:09 INFO Utils: Fetching http://9.37.25.191:46800/jars/xml-apis-1.3.04.jar to /tmp/spark-3d63c27f-40c0-4806-92c5-1f0adf6dbb64/executor-e941fc5b-cffc-4ffd-89fb-8ab6abfac4be/spark-320b3dfc-7390-40ac-a468-d4712ed683a0/fetchFileTemp773109755983689449.tmp
18/07/10 10:36:09 INFO Utils: Copying /tmp/spark-3d63c27f-40c0-4806-92c5-1f0adf6dbb64/executor-e941fc5b-cffc-4ffd-89fb-8ab6abfac4be/spark-320b3dfc-7390-40ac-a468-d4712ed683a0/-16782858861531233365056_cache to /root/iga/spark/work/app-20180710102451-0008/12/./xml-apis-1.3.04.jar
18/07/10 10:36:09 INFO Executor: Adding file:/root/iga/spark/work/app-20180710102451-0008/12/./xml-apis-1.3.04.jar to class loader
18/07/10 10:36:09 INFO Executor: Fetching http://9.37.25.191:46800/jars/metrics-core-3.0.1.jar with timestamp 1531233365010
18/07/10 10:36:09 INFO Utils: Fetching http://9.37.25.191:46800/jars/metrics-core-3.0.1.jar to /tmp/spark-3d63c27f-40c0-4806-92c5-1f0adf6dbb64/executor-e941fc5b-cffc-4ffd-89fb-8ab6abfac4be/spark-320b3dfc-7390-40ac-a468-d4712ed683a0/fetchFileTemp4323868342655893285.tmp
18/07/10 10:36:09 INFO Utils: Copying /tmp/spark-3d63c27f-40c0-4806-92c5-1f0adf6dbb64/executor-e941fc5b-cffc-4ffd-89fb-8ab6abfac4be/spark-320b3dfc-7390-40ac-a468-d4712ed683a0/-15206701381531233365010_cache to /root/iga/spark/work/app-20180710102451-0008/12/./metrics-core-3.0.1.jar
18/07/10 10:36:09 INFO Executor: Adding file:/root/iga/spark/work/app-20180710102451-0008/12/./metrics-core-3.0.1.jar to class loader
18/07/10 10:36:09 INFO Executor: Fetching http://9.37.25.191:46800/jars/groovy-sql-2.4.15-indy.jar with timestamp 1531233366243
18/07/10 10:36:09 INFO Utils: Fetching http://9.37.25.191:46800/jars/groovy-sql-2.4.15-indy.jar to /tmp/spark-3d63c27f-40c0-4806-92c5-1f0adf6dbb64/executor-e941fc5b-cffc-4ffd-89fb-8ab6abfac4be/spark-320b3dfc-7390-40ac-a468-d4712ed683a0/fetchFileTemp4349886102726648442.tmp
18/07/10 10:36:09 INFO Utils: Copying /tmp/spark-3d63c27f-40c0-4806-92c5-1f0adf6dbb64/executor-e941fc5b-cffc-4ffd-89fb-8ab6abfac4be/spark-320b3dfc-7390-40ac-a468-d4712ed683a0/-9190421111531233366243_cache to /root/iga/spark/work/app-20180710102451-0008/12/./groovy-sql-2.4.15-indy.jar
18/07/10 10:36:09 INFO Executor: Adding file:/root/iga/spark/work/app-20180710102451-0008/12/./groovy-sql-2.4.15-indy.jar to class loader
18/07/10 10:36:09 INFO Executor: Fetching http://9.37.25.191:46800/jars/jsch-0.1.42.jar with timestamp 1531233366147
18/07/10 10:36:09 INFO Utils: Fetching http://9.37.25.191:46800/jars/jsch-0.1.42.jar to /tmp/spark-3d63c27f-40c0-4806-92c5-1f0adf6dbb64/executor-e941fc5b-cffc-4ffd-89fb-8ab6abfac4be/spark-320b3dfc-7390-40ac-a468-d4712ed683a0/fetchFileTemp2735571341701023301.tmp
18/07/10 10:36:09 INFO Utils: Copying /tmp/spark-3d63c27f-40c0-4806-92c5-1f0adf6dbb64/executor-e941fc5b-cffc-4ffd-89fb-8ab6abfac4be/spark-320b3dfc-7390-40ac-a468-d4712ed683a0/-15217933761531233366147_cache to /root/iga/spark/work/app-20180710102451-0008/12/./jsch-0.1.42.jar
18/07/10 10:36:09 INFO Executor: Adding file:/root/iga/spark/work/app-20180710102451-0008/12/./jsch-0.1.42.jar to class loader
18/07/10 10:36:09 INFO MapOutputTrackerWorker: Updating epoch to 12 and clearing cache
18/07/10 10:36:10 INFO TorrentBroadcast: Started reading broadcast variable 13
18/07/10 10:36:10 INFO MemoryStore: Block broadcast_13_piece0 stored as bytes in memory (estimated size 19.1 KB, free 19.1 KB)
18/07/10 10:36:10 INFO TorrentBroadcast: Reading broadcast variable 13 took 160 ms
18/07/10 10:36:10 INFO MemoryStore: Block broadcast_13 stored as values in memory (estimated size 60.8 KB, free 79.9 KB)
18/07/10 10:36:16 WARN KryoShimServiceLoader: KryoShimService implementations org.apache.tinkerpop.gremlin.hadoop.structure.io.HadoopPoolShimService@1d6414c5 and org.janusgraph.hadoop.serialize.JanusGraphKryoShimService@68039742 are tied with priority value 0.  Preferring org.janusgraph.hadoop.serialize.JanusGraphKryoShimService to the other because it has a lexicographically greater classname.  Consider setting the system property "gremlin.io.kryoShimService" instead of relying on priority tie-breaking.
18/07/10 10:36:16 INFO KryoShimServiceLoader: Set KryoShimService provider to org.janusgraph.hadoop.serialize.JanusGraphKryoShimService@68039742 (class org.janusgraph.hadoop.serialize.JanusGraphKryoShimService) because its priority value (0) is the highest available
18/07/10 10:36:16 INFO NewHadoopRDD: Input split: ColumnFamilySplit((5070934897061886230, '5586080062392498421] @[igatest178.rtp.raleigh.ibm.com])
18/07/10 10:36:16 INFO TorrentBroadcast: Started reading broadcast variable 10
18/07/10 10:36:16 INFO MemoryStore: Block broadcast_10_piece0 stored as bytes in memory (estimated size 23.2 KB, free 103.1 KB)
18/07/10 10:36:16 INFO TorrentBroadcast: Reading broadcast variable 10 took 15 ms
18/07/10 10:36:17 INFO MemoryStore: Block broadcast_10 stored as values in memory (estimated size 324.2 KB, free 427.3 KB)
18/07/10 10:36:17 WARN KryoShimServiceLoader: KryoShimService implementations org.apache.tinkerpop.gremlin.hadoop.structure.io.HadoopPoolShimService@315b242b and org.janusgraph.hadoop.serialize.JanusGraphKryoShimService@6961e898 are tied with priority value 0.  Preferring org.janusgraph.hadoop.serialize.JanusGraphKryoShimService to the other because it has a lexicographically greater classname.  Consider setting the system property "gremlin.io.kryoShimService" instead of relying on priority tie-breaking.
18/07/10 10:36:17 INFO KryoShimServiceLoader: Set KryoShimService provider to org.janusgraph.hadoop.serialize.JanusGraphKryoShimService@6961e898 (class org.janusgraph.hadoop.serialize.JanusGraphKryoShimService) because its priority value (0) is the highest available
18/07/10 10:36:17 INFO KryoShimServiceLoader: Configuring KryoShimService provider org.janusgraph.hadoop.serialize.JanusGraphKryoShimService@6961e898 with user-provided configuration
18/07/10 10:36:17 INFO CassandraThriftStoreManager: Closed Thrift connection pooler.
18/07/10 10:36:17 INFO GraphDatabaseConfiguration: Generated unique-instance-id=092519b210349-igatest178-rtp-raleigh-ibm-com1
18/07/10 10:36:17 INFO Backend: Configuring index [search]
18/07/10 10:36:18 ERROR SparkUncaughtExceptionHandler: Uncaught exception in thread Thread[pool-16-thread-1,5,main]

java.lang.NoSuchMethodError: org.apache.http.util.Asserts.check(ZLjava/lang/String;Ljava/lang/Object;)V
        at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor.execute(AbstractMultiworkerIOReactor.java:313)
        at org.apache.http.impl.nio.conn.PoolingNHttpClientConnectionManager.execute(PoolingNHttpClientConnectionManager.java:192)
        at org.apache.http.impl.nio.client.CloseableHttpAsyncClientBase$1.run(CloseableHttpAsyncClientBase.java:64)
        at java.lang.Thread.run(Thread.java:748)
18/07/10 10:36:18 INFO DiskBlockManager: Shutdown hook called
18/07/10 10:36:18 INFO ShutdownHookManager: Shutdown hook called
18/07/10 10:36:18 INFO ShutdownHookManager: Deleting directory /tmp/spark-3d63c27f-40c0-4806-92c5-1f0adf6dbb64/executor-e941fc5b-cffc-4ffd-89fb-8ab6abfac4be/spark-320b3dfc-7390-40ac-a468-d4712ed683a0


Maybe you can help me understand what this trace means.

Also, the following is the trace I get from the console:

gremlin> g.V().limit(10)
10:36:18 ERROR org.apache.spark.scheduler.TaskSchedulerImpl  - Lost executor 12 on igatest178.rtp.raleigh.ibm.com: Remote RPC client disassociated. Likely due to containers exceeding thresholds, or network issues. Check driver logs for WARN messages.
10:36:18 WARN  org.apache.spark.scheduler.TaskSetManager  - Lost task 0.0 in stage 4.0 (TID 12, igatest178.rtp.raleigh.ibm.com): ExecutorLostFailure (executor 12 exited caused by one of the running tasks) Reason: Remote RPC client disassociated. Likely due to containers exceeding thresholds, or network issues. Check driver logs for WARN messages.
10:36:34 ERROR org.apache.spark.scheduler.TaskSchedulerImpl  - Lost executor 13 on igatest178.rtp.raleigh.ibm.com: Remote RPC client disassociated. Likely due to containers exceeding thresholds, or network issues. Check driver logs for WARN messages.
10:36:34 WARN  org.apache.spark.scheduler.TaskSetManager  - Lost task 0.1 in stage 4.0 (TID 13, igatest178.rtp.raleigh.ibm.com): ExecutorLostFailure (executor 13 exited caused by one of the running tasks) Reason: Remote RPC client disassociated. Likely due to containers exceeding thresholds, or network issues. Check driver logs for WARN messages.
10:36:51 ERROR org.apache.spark.scheduler.TaskSchedulerImpl  - Lost executor 14 on igatest178.rtp.raleigh.ibm.com: Remote RPC client disassociated. Likely due to containers exceeding thresholds, or network issues. Check driver logs for WARN messages.
10:36:51 WARN  org.apache.spark.scheduler.TaskSetManager  - Lost task 0.2 in stage 4.0 (TID 14, igatest178.rtp.raleigh.ibm.com): ExecutorLostFailure (executor 14 exited caused by one of the running tasks) Reason: Remote RPC client disassociated. Likely due to containers exceeding thresholds, or network issues. Check driver logs for WARN messages.
10:37:09 ERROR org.apache.spark.scheduler.TaskSchedulerImpl  - Lost executor 15 on igatest178.rtp.raleigh.ibm.com: Remote RPC client disassociated. Likely due to containers exceeding thresholds, or network issues. Check driver logs for WARN messages.
10:37:09 WARN  org.apache.spark.scheduler.TaskSetManager  - Lost task 0.3 in stage 4.0 (TID 15, igatest178.rtp.raleigh.ibm.com): ExecutorLostFailure (executor 15 exited caused by one of the running tasks) Reason: Remote RPC client disassociated. Likely due to containers exceeding thresholds, or network issues. Check driver logs for WARN messages.
10:37:09 ERROR org.apache.spark.scheduler.TaskSetManager  - Task 0 in stage 4.0 failed 4 times; aborting job
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 4.0 failed 4 times, most recent failure: Lost task 0.3 in stage 4.0 (TID 15, igatest178.rtp.raleigh.ibm.com): ExecutorLostFailure (executor 15 exited caused by one of the running tasks) Reason: Remote RPC client disassociated. Likely due to containers exceeding thresholds, or network issues. Check driver logs for WARN messages.
Driver stacktrace:
Type ':help' or ':h' for help.
Display stack trace? [yN]y
java.lang.IllegalStateException: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 4.0 failed 4 times, most recent failure: Lost task 0.3 in stage 4.0 (TID 15, igatest178.rtp.raleigh.ibm.com): ExecutorLostFailure (executor 15 exited caused by one of the running tasks) Reason: Remote RPC client disassociated. Likely due to containers exceeding thresholds, or network issues. Check driver logs for WARN messages.
Driver stacktrace:
        at org.apache.tinkerpop.gremlin.process.computer.traversal.step.map.VertexProgramStep.processNextStart(VertexProgramStep.java:88)
        at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143)
        at org.apache.tinkerpop.gremlin.process.traversal.step.util.ExpandableStepIterator.next(ExpandableStepIterator.java:50)
        at org.apache.tinkerpop.gremlin.process.computer.traversal.step.map.ComputerResultStep.processNextStart(ComputerResultStep.java:68)
        at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143)
        at org.apache.tinkerpop.gremlin.process.traversal.util.DefaultTraversal.hasNext(DefaultTraversal.java:192)
        at org.apache.tinkerpop.gremlin.console.Console$_closure3.doCall(Console.groovy:237)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:98)
        at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
        at org.codehaus.groovy.runtime.metaclass.ClosureMetaClass.invokeMethod(ClosureMetaClass.java:264)
        at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1034)
        at org.codehaus.groovy.tools.shell.Groovysh.setLastResult(Groovysh.groovy:460)
        at org.codehaus.groovy.vmplugin.v7.IndyInterface.selectMethod(IndyInterface.java:236)
        at org.codehaus.groovy.tools.shell.Groovysh.execute(Groovysh.groovy:196)
        at org.apache.tinkerpop.gremlin.console.GremlinGroovysh.super$3$execute(GremlinGroovysh.groovy)
        at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:98)
        at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
        at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1225)
        at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuperN(ScriptBytecodeAdapter.java:145)
        at org.apache.tinkerpop.gremlin.console.GremlinGroovysh.execute(GremlinGroovysh.groovy:72)
        at org.codehaus.groovy.tools.shell.Shell.leftShift(Shell.groovy:122)
        at org.codehaus.groovy.tools.shell.ShellRunner.work(ShellRunner.groovy:95)
        at org.codehaus.groovy.tools.shell.InteractiveShellRunner.super$2$work(InteractiveShellRunner.groovy)
        at sun.reflect.GeneratedMethodAccessor57.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:98)
        at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
        at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1225)
        at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuperN(ScriptBytecodeAdapter.java:145)
        at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuper0(ScriptBytecodeAdapter.java:165)
        at org.codehaus.groovy.tools.shell.InteractiveShellRunner.work(InteractiveShellRunner.groovy:130)
        at org.codehaus.groovy.tools.shell.ShellRunner.run(ShellRunner.groovy:59)
        at org.codehaus.groovy.tools.shell.InteractiveShellRunner.super$2$run(InteractiveShellRunner.groovy)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:98)
        at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
        at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1225)
        at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuperN(ScriptBytecodeAdapter.java:145)
        at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuper0(ScriptBytecodeAdapter.java:165)
        at org.codehaus.groovy.tools.shell.InteractiveShellRunner.run(InteractiveShellRunner.groovy:89)
        at org.codehaus.groovy.vmplugin.v7.IndyInterface.selectMethod(IndyInterface.java:236)
        at org.apache.tinkerpop.gremlin.console.Console.<init>(Console.groovy:169)
        at org.codehaus.groovy.vmplugin.v7.IndyInterface.selectMethod(IndyInterface.java:236)
        at org.apache.tinkerpop.gremlin.console.Console.main(Console.groovy:481)
Caused by: java.util.concurrent.ExecutionException: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 4.0 failed 4 times, most recent failure: Lost task 0.3 in stage 4.0 (TID 15, igatest178.rtp.raleigh.ibm.com): ExecutorLostFailure (executor 15 exited caused by one of the running tasks) Reason: Remote RPC client disassociated. Likely due to containers exceeding thresholds, or network issues. Check driver logs for WARN messages.
Driver stacktrace:
        at java.util.concurrent.FutureTask.report(FutureTask.java:122)
        at java.util.concurrent.FutureTask.get(FutureTask.java:192)
        at org.apache.tinkerpop.gremlin.process.computer.traversal.step.map.VertexProgramStep.processNextStart(VertexProgramStep.java:68)
        ... 54 more
Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 4.0 failed 4 times, most recent failure: Lost task 0.3 in stage 4.0 (TID 15, igatest178.rtp.raleigh.ibm.com): ExecutorLostFailure (executor 15 exited caused by one of the running tasks) Reason: Remote RPC client disassociated. Likely due to containers exceeding thresholds, or network issues. Check driver logs for WARN messages.
Driver stacktrace:
        at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1431)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1419)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1418)
        at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
        at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1418)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
        at scala.Option.foreach(Option.scala:236)
        at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:799)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1640)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1599)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1588)
        at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
        at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:620)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1832)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1845)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1858)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1929)
        at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1.apply(RDD.scala:920)
        at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1.apply(RDD.scala:918)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
        at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
        at org.apache.spark.rdd.RDD.foreachPartition(RDD.scala:918)
        at org.apache.spark.api.java.JavaRDDLike$class.foreachPartition(JavaRDDLike.scala:225)
        at org.apache.spark.api.java.AbstractJavaRDDLike.foreachPartition(JavaRDDLike.scala:46)
        at org.apache.tinkerpop.gremlin.spark.process.computer.SparkExecutor.executeVertexProgramIteration(SparkExecutor.java:179)
        at org.apache.tinkerpop.gremlin.spark.process.computer.SparkGraphComputer.lambda$submitWithExecutor$0(SparkGraphComputer.java:279)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
gremlin>



Debasish Kanhar

unread,
Jul 10, 2018, 10:43:01 AM7/10/18
to JanusGraph users
So, this brings me back to square one. I know I'm hitting an error; I just don't know what's causing it!

Jason Plurad

unread,
Jul 10, 2018, 2:26:33 PM7/10/18
to JanusGraph users
What Cassandra version are you using? Which storage.backend are you using when initializing the keyspace?

When I was trying it yesterday, I didn't have Elasticsearch involved. The error you've shown is a conflict with Apache HttpComponents httpcore between Spark and Elasticsearch.
java.lang.NoSuchMethodError: org.apache.http.util.Asserts.check(ZLjava/lang/String;Ljava/lang/Object;)V
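
(As a quick sanity check, something along these lines from the Gremlin Console, or any Groovy snippet run with the classpath in question, shows which jar that class is actually loaded from; this is only a diagnostic sketch, not part of the fix:)

// prints the location of the jar that provides org.apache.http.util.Asserts
org.apache.http.util.Asserts.class.getProtectionDomain().getCodeSource().getLocation()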

I was able to get the JanusGraph 0.2.1 release to work with a Spark 1.6.1 standalone cluster like this:

// start janusgraph pre-packaged distribution (Cassandra 2.1.20 and Elasticsearch 6.0.1)
bin/janusgraph.sh start

// start spark master and worker
sbin/start-master.sh -h 127.0.0.1 -p 7077 --webui-port 7070
sbin/start-slave.sh -h 127.0.0.1 --webui-port 7071 -c 2 -m 4G spark://127.0.0.1:7077

// initialize the graph of the gods (no changes to the default config)
graph = JanusGraphFactory.open("conf/janusgraph-cassandra-es.properties")
GraphOfTheGodsFactory.load(graph)
graph.tx().commit()
graph.close()

// update master and add classpath to read-cassandra.properties
// this prepends the JanusGraph jars to the executor's classpath
spark.master=spark://127.0.0.1:7077
spark.executor.extraClassPath=/usr/lib/janusgraph-0.2.1-hadoop2/lib/*

// run the traversal with spark
graph = GraphFactory.open("conf/hadoop-graph/read-cassandra.properties")
g = graph.traversal().withComputer(SparkGraphComputer)
g.V().valueMap(true).toList()

I also tested with a standalone Cassandra 3.11.0 and Elasticsearch 5.6.1 with pretty much the same steps, except modifying and using read-cassandra-3.properties instead.

Jason Plurad

unread,
Jul 10, 2018, 2:28:42 PM7/10/18
to JanusGraph users
Also note I was running everything on the same machine. If you have your Spark master and workers on separate machines, you'd need to make sure that the JanusGraph jars referenced in the extraClassPath property are accessible on the same path on those machines.
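
For example, something like this from the machine that already has the JanusGraph distribution installed (the worker host and path are placeholders for your environment):

// copy the JanusGraph lib directory to a worker at the same absolute path
scp -r /usr/lib/janusgraph-0.2.1-hadoop2/lib spark-worker-host:/usr/lib/janusgraph-0.2.1-hadoop2/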

Debasish Kanhar

unread,
Jul 10, 2018, 2:50:22 PM7/10/18
to Jason Plurad, JanusGraph users
Hi Jason, 

Thanks for letting me know your configuration. We are using Elasticsearch 5.6.2. Could the change in version number from 5.6.1 to 5.6.2 be the reason I'm getting a conflict between ES and Spark?

One other thing: I do set gremlin.hadoop.jarsInDistributedCache=true. That distributes the jars to all the workers, right?

In that case I shouldn't need to specify extraClassPath, since all jars on the CLASSPATH get distributed to the workers even if a worker is on a different machine. Please correct me if I'm wrong here, or should I set it anyway just as a safety measure?

I will change the Elasticsearch version to match yours and get back on this thread.

Jason Plurad

unread,
Jul 10, 2018, 3:10:29 PM7/10/18
to JanusGraph users
The classpath loading order is causing problems. The ES version from 5.6.1 to 5.6.2 does not matter.

gremlin.hadoop.jarsInDistributedCache=true is set in read-cassandra.properties by default, but again, that is only for distributing the jars from one machine to another. If you don't override the default classpath, either through spark.executor.extraClassPath in the graph properties or SPARK_CLASSPATH in spark-env.sh, Spark's jars will be loaded before JanusGraph's jars, which causes the mismatch.

If you have Spark on a different set of machines, you'll need to copy the JanusGraph jar files over to those machines so you can explicitly set the classpath order. Then you'll need to set one of spark.executor.extraClassPath or SPARK_CLASSPATH.
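
As a rough sketch (the lib path is a placeholder for wherever the JanusGraph jars live on each worker):

# in the HadoopGraph properties file used to open the graph
spark.executor.extraClassPath=/usr/lib/janusgraph-0.2.1-hadoop2/lib/*

# or, alternatively, in $SPARK_HOME/conf/spark-env.sh on every worker
export SPARK_CLASSPATH=/usr/lib/janusgraph-0.2.1-hadoop2/lib/*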




Debasish Kanhar

unread,
Jul 11, 2018, 3:28:15 AM7/11/18
to JanusGraph users
Hi Jason,

I added the jars to the classpath as suggested, and it seems like I've made progress, but I'm still unable to run any OLAP queries. The error I'm getting now looks debuggable, but I'd like your input on whether this behaviour is expected.

What is happening is that the HadoopGraph is trying to connect to Elasticsearch at localhost, irrespective of the hostname I provide in the properties file. I use the following properties file:

#
# Hadoop Graph Configuration
#
gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
gremlin.hadoop.graphReader=org.janusgraph.hadoop.formats.cassandra.Cassandra3InputFormat
gremlin.hadoop.graphWriter=org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoOutputFormat

gremlin.hadoop.jarsInDistributedCache=true
gremlin.hadoop.inputLocation=none
gremlin.hadoop.outputLocation=output

#
# JanusGraph Cassandra InputFormat configuration
#
janusgraphmr.ioformat.conf.storage.backend=cassandra
janusgraphmr.ioformat.conf.storage.hostname=igatest178.rtp.raleigh.ibm.com
janusgraphmr.ioformat.conf.storage.port=9160
janusgraphmr.ioformat.conf.storage.cassandra.keyspace=igaGraph

# index.search.backend=elasticsearch
# index.search.hostname=igatest178.rtp.raleigh.ibm.com
# index.search.port=9200

#
# Apache Cassandra InputFormat configuration
#
cassandra.input.partitioner.class=org.apache.cassandra.dht.Murmur3Partitioner

#
# SparkGraphComputer Configuration
#
gremlin.spark.persistContext=true
spark.executor.extraClassPath=/root/lib/janusgraoh-0.2.1/*
spark.serializer=org.apache.spark.serializer.KryoSerializer


I just changed my keyspace, added extraClassPath, and changed the Cassandra hostname to my standalone Cassandra, but I'm getting the following error:

03:17:15.692 [Executor task launch worker-0] INFO  c.n.a.c.i.CountingConnectionPoolMonitor - AddHost: igatest178.rtp.raleigh.ibm.com
03:17:15.696 [Executor task launch worker-0] DEBUG o.j.d.c.a.AstyanaxStoreManager - Found keyspace igaGraph
03:17:15.696 [Executor task launch worker-0] DEBUG o.j.d.c.a.AstyanaxStoreManager - Custom RetryBackoffStrategy com.netflix.astyanax.connectionpool.impl.FixedRetryBackoffStrategy@3d91d247
03:17:15.698 [Executor task launch worker-0] INFO  c.n.a.c.i.ConnectionPoolMBeanManager - Registering mbean: com.netflix.MonitoredResources:type=ASTYANAX,name=KeyspaceJanusGraphConnectionPool,ServiceType=connectionpool
03:17:15.698 [Executor task launch worker-0] INFO  c.n.a.c.i.CountingConnectionPoolMonitor - AddHost: igatest178.rtp.raleigh.ibm.com
03:17:15.702 [Executor task launch worker-0] INFO  c.n.a.c.i.CountingConnectionPoolMonitor - AddHost: 9.37.25.178
03:17:15.702 [Executor task launch worker-0] INFO  c.n.a.c.i.CountingConnectionPoolMonitor - RemoveHost: igatest178.rtp.raleigh.ibm.com
03:17:15.704 [Executor task launch worker-0] INFO  org.janusgraph.diskstorage.Backend - Configuring index [search]
03:17:15.705 [Executor task launch worker-0] DEBUG o.j.d.es.ElasticSearchSetup - Configuring RestClient
03:17:15.705 [Executor task launch worker-0] DEBUG o.j.d.es.ElasticSearchSetup - Configured remote host: 127.0.0.1 : 9200
03:17:15.707 [Executor task launch worker-0] DEBUG o.a.h.impl.nio.client.MainClientExec - [exchange: 9] start execution
03:17:15.708 [Executor task launch worker-0] DEBUG o.a.h.c.protocol.RequestAddCookies - CookieSpec selected: default
03:17:15.708 [Executor task launch worker-0] DEBUG o.a.h.c.protocol.RequestAuthCache - Re-using cached 'basic' auth scheme for http://127.0.0.1:9200
03:17:15.708 [Executor task launch worker-0] DEBUG o.a.h.c.protocol.RequestAuthCache - No credentials for preemptive authentication
03:17:15.708 [Executor task launch worker-0] DEBUG o.a.h.i.n.c.InternalHttpAsyncClient - [exchange: 9] Request connection for {}->http://127.0.0.1:9200
03:17:15.708 [Executor task launch worker-0] DEBUG o.a.h.i.n.c.PoolingNHttpClientConnectionManager - Connection request: [route: {}->http://127.0.0.1:9200][total kept alive: 0; route allocated: 0 of 5; total allocated: 0 of 10]
03:17:15.709 [pool-91-thread-1] DEBUG o.a.h.i.n.c.PoolingNHttpClientConnectionManager - Connection request failed
java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[na:1.8.0_172]
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) ~[na:1.8.0_172]
        at org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processEvent(DefaultConnectingIOReactor.java:171) [httpcore-nio-4.4.5.jar:4.4.5]
        at org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processEvents(DefaultConnectingIOReactor.java:145) [httpcore-nio-4.4.5.jar:4.4.5]
        at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor.execute(AbstractMultiworkerIOReactor.java:348) [httpcore-nio-4.4.5.jar:4.4.5]
        at org.apache.http.impl.nio.conn.PoolingNHttpClientConnectionManager.execute(PoolingNHttpClientConnectionManager.java:192) [httpasyncclient-4.1.2.jar:4.1.2]
        at org.apache.http.impl.nio.client.CloseableHttpAsyncClientBase$1.run(CloseableHttpAsyncClientBase.java:64) [httpasyncclient-4.1.2.jar:4.1.2]
        at java.lang.Thread.run(Thread.java:748) [na:1.8.0_172]
03:17:15.709 [pool-91-thread-1] DEBUG o.a.h.i.n.c.InternalHttpAsyncClient - [exchange: 9] connection request failed
03:17:15.710 [pool-91-thread-1] DEBUG org.elasticsearch.client.RestClient - request [GET http://127.0.0.1:9200/] failed
java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[na:1.8.0_172]
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) ~[na:1.8.0_172]
        at org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processEvent(DefaultConnectingIOReactor.java:171) [httpcore-nio-4.4.5.jar:4.4.5]
        at org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processEvents(DefaultConnectingIOReactor.java:145) [httpcore-nio-4.4.5.jar:4.4.5]
        at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor.execute(AbstractMultiworkerIOReactor.java:348) [httpcore-nio-4.4.5.jar:4.4.5]
        at org.apache.http.impl.nio.conn.PoolingNHttpClientConnectionManager.execute(PoolingNHttpClientConnectionManager.java:192) [httpasyncclient-4.1.2.jar:4.1.2]
        at org.apache.http.impl.nio.client.CloseableHttpAsyncClientBase$1.run(CloseableHttpAsyncClientBase.java:64) [httpasyncclient-4.1.2.jar:4.1.2]
        at java.lang.Thread.run(Thread.java:748) [na:1.8.0_172]
03:17:15.710 [pool-91-thread-1] DEBUG org.elasticsearch.client.RestClient - added host [http://127.0.0.1:9200] to blacklist
03:17:15.711 [Executor task launch worker-0] WARN  o.j.d.e.rest.RestElasticSearchClient - Unable to determine Elasticsearch server version. Default to FIVE.
java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[na:1.8.0_172]
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) ~[na:1.8.0_172]
        at org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processEvent(DefaultConnectingIOReactor.java:171) ~[httpcore-nio-4.4.5.jar:4.4.5]
        at org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processEvents(DefaultConnectingIOReactor.java:145) ~[httpcore-nio-4.4.5.jar:4.4.5]
        at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor.execute(AbstractMultiworkerIOReactor.java:348) ~[httpcore-nio-4.4.5.jar:4.4.5]
        at org.apache.http.impl.nio.conn.PoolingNHttpClientConnectionManager.execute(PoolingNHttpClientConnectionManager.java:192) ~[httpasyncclient-4.1.2.jar:4.1.2]
        at org.apache.http.impl.nio.client.CloseableHttpAsyncClientBase$1.run(CloseableHttpAsyncClientBase.java:64) ~[httpasyncclient-4.1.2.jar:4.1.2]
        at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_172]
03:17:15.711 [Executor task launch worker-0] DEBUG o.j.d.es.ElasticSearchIndex - Configured ES query nb result by query to 50
03:17:15.711 [Executor task launch worker-0] DEBUG o.a.h.impl.nio.client.MainClientExec - [exchange: 10] start execution
03:17:15.712 [Executor task launch worker-0] DEBUG o.a.h.c.protocol.RequestAddCookies - CookieSpec selected: default
03:17:15.712 [Executor task launch worker-0] DEBUG o.a.h.c.protocol.RequestAuthCache - Re-using cached 'basic' auth scheme for http://127.0.0.1:9200
03:17:15.712 [Executor task launch worker-0] DEBUG o.a.h.c.protocol.RequestAuthCache - No credentials for preemptive authentication
03:17:15.712 [Executor task launch worker-0] DEBUG o.a.h.i.n.c.InternalHttpAsyncClient - [exchange: 10] Request connection for {}->http://127.0.0.1:9200
03:17:15.712 [Executor task launch worker-0] DEBUG o.a.h.i.n.c.PoolingNHttpClientConnectionManager - Connection request: [route: {}->http://127.0.0.1:9200][total kept alive: 0; route allocated: 0 of 5; total allocated: 0 of 10]
03:17:15.713 [pool-91-thread-1] DEBUG o.a.h.i.n.c.PoolingNHttpClientConnectionManager - Connection request failed
java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[na:1.8.0_172]
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) ~[na:1.8.0_172]
        at org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processEvent(DefaultConnectingIOReactor.java:171) [httpcore-nio-4.4.5.jar:4.4.5]
        at org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processEvents(DefaultConnectingIOReactor.java:145) [httpcore-nio-4.4.5.jar:4.4.5]
        at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor.execute(AbstractMultiworkerIOReactor.java:348) [httpcore-nio-4.4.5.jar:4.4.5]
        at org.apache.http.impl.nio.conn.PoolingNHttpClientConnectionManager.execute(PoolingNHttpClientConnectionManager.java:192) [httpasyncclient-4.1.2.jar:4.1.2]
        at org.apache.http.impl.nio.client.CloseableHttpAsyncClientBase$1.run(CloseableHttpAsyncClientBase.java:64) [httpasyncclient-4.1.2.jar:4.1.2]
        at java.lang.Thread.run(Thread.java:748) [na:1.8.0_172]
03:17:15.713 [pool-91-thread-1] DEBUG o.a.h.i.n.c.InternalHttpAsyncClient - [exchange: 10] connection request failed
03:17:15.714 [pool-91-thread-1] DEBUG org.elasticsearch.client.RestClient - request [GET http://127.0.0.1:9200/_cluster/health?timeout=30s&wait_for_status=yellow] failed
java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[na:1.8.0_172]
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) ~[na:1.8.0_172]
        at org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processEvent(DefaultConnectingIOReactor.java:171) [httpcore-nio-4.4.5.jar:4.4.5]
        at org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processEvents(DefaultConnectingIOReactor.java:145) [httpcore-nio-4.4.5.jar:4.4.5]
        at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor.execute(AbstractMultiworkerIOReactor.java:348) [httpcore-nio-4.4.5.jar:4.4.5]
        at org.apache.http.impl.nio.conn.PoolingNHttpClientConnectionManager.execute(PoolingNHttpClientConnectionManager.java:192) [httpasyncclient-4.1.2.jar:4.1.2]
        at org.apache.http.impl.nio.client.CloseableHttpAsyncClientBase$1.run(CloseableHttpAsyncClientBase.java:64) [httpasyncclient-4.1.2.jar:4.1.2]
        at java.lang.Thread.run(Thread.java:748) [na:1.8.0_172]
03:17:15.714 [pool-91-thread-1] DEBUG org.elasticsearch.client.RestClient - updated host [http://127.0.0.1:9200] already in blacklist
03:17:15.716 [Executor task launch worker-0] ERROR org.apache.spark.executor.Executor - Exception in task 0.2 in stage 0.0 (TID 4)

java.lang.IllegalArgumentException: Could not instantiate implementation: org.janusgraph.hadoop.formats.util.input.current.JanusGraphHadoopSetupImpl
        at org.janusgraph.util.system.ConfigurationUtil.instantiate(ConfigurationUtil.java:69) ~[janusgraph-core-0.2.1-SNAPSHOT.jar:na]
        at org.janusgraph.hadoop.formats.util.GiraphInputFormat.lambda$static$0(GiraphInputFormat.java:46) ~[janusgraph-hadoop-0.2.1-SNAPSHOT.jar:na]
        at org.janusgraph.hadoop.formats.util.GiraphInputFormat$RefCountedCloseable.acquire(GiraphInputFormat.java:100) ~[janusgraph-hadoop-0.2.1-SNAPSHOT.jar:na]
        at org.janusgraph.hadoop.formats.util.GiraphRecordReader.<init>(GiraphRecordReader.java:47) ~[janusgraph-hadoop-0.2.1-SNAPSHOT.jar:na]
        at org.janusgraph.hadoop.formats.util.GiraphInputFormat.createRecordReader(GiraphInputFormat.java:67) ~[janusgraph-hadoop-0.2.1-SNAPSHOT.jar:na]
        at org.apache.spark.rdd.NewHadoopRDD$$anon$1.<init>(NewHadoopRDD.scala:156) ~[spark-core_2.10-1.6.1.jar:1.6.1]
        at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:129) ~[spark-core_2.10-1.6.1.jar:1.6.1]
        at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:64) ~[spark-core_2.10-1.6.1.jar:1.6.1]
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306) ~[spark-core_2.10-1.6.1.jar:1.6.1]
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:270) ~[spark-core_2.10-1.6.1.jar:1.6.1]
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) ~[spark-core_2.10-1.6.1.jar:1.6.1]
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306) ~[spark-core_2.10-1.6.1.jar:1.6.1]
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:270) ~[spark-core_2.10-1.6.1.jar:1.6.1]
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) ~[spark-core_2.10-1.6.1.jar:1.6.1]
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306) ~[spark-core_2.10-1.6.1.jar:1.6.1]
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:270) ~[spark-core_2.10-1.6.1.jar:1.6.1]
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) ~[spark-core_2.10-1.6.1.jar:1.6.1]
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306) ~[spark-core_2.10-1.6.1.jar:1.6.1]
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:270) ~[spark-core_2.10-1.6.1.jar:1.6.1]
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) ~[spark-core_2.10-1.6.1.jar:1.6.1]
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306) ~[spark-core_2.10-1.6.1.jar:1.6.1]
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:270) ~[spark-core_2.10-1.6.1.jar:1.6.1]
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) ~[spark-core_2.10-1.6.1.jar:1.6.1]
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306) ~[spark-core_2.10-1.6.1.jar:1.6.1]
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:270) ~[spark-core_2.10-1.6.1.jar:1.6.1]
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) ~[spark-core_2.10-1.6.1.jar:1.6.1]
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306) ~[spark-core_2.10-1.6.1.jar:1.6.1]
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:270) ~[spark-core_2.10-1.6.1.jar:1.6.1]
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66) ~[spark-core_2.10-1.6.1.jar:1.6.1]
        at org.apache.spark.scheduler.Task.run(Task.scala:89) ~[spark-core_2.10-1.6.1.jar:1.6.1]
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214) ~[spark-core_2.10-1.6.1.jar:1.6.1]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_172]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_172]
        at java.lang.Thread.run(Thread.java:748) [na:1.8.0_172]
Caused by: java.lang.reflect.InvocationTargetException: null
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[na:1.8.0_172]
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[na:1.8.0_172]
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[na:1.8.0_172]
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423) ~[na:1.8.0_172]
        at org.janusgraph.util.system.ConfigurationUtil.instantiate(ConfigurationUtil.java:58) ~[janusgraph-core-0.2.1-SNAPSHOT.jar:na]
        ... 33 common frames omitted
Caused by: java.lang.IllegalArgumentException: Could not instantiate implementation: org.janusgraph.diskstorage.es.ElasticSearchIndex
        at org.janusgraph.util.system.ConfigurationUtil.instantiate(ConfigurationUtil.java:69) ~[janusgraph-core-0.2.1-SNAPSHOT.jar:na]
        at org.janusgraph.diskstorage.Backend.getImplementationClass(Backend.java:477) ~[janusgraph-core-0.2.1-SNAPSHOT.jar:na]
        at org.janusgraph.diskstorage.Backend.getIndexes(Backend.java:464) ~[janusgraph-core-0.2.1-SNAPSHOT.jar:na]
        at org.janusgraph.diskstorage.Backend.<init>(Backend.java:149) ~[janusgraph-core-0.2.1-SNAPSHOT.jar:na]
        at org.janusgraph.graphdb.configuration.GraphDatabaseConfiguration.getBackend(GraphDatabaseConfiguration.java:1925) ~[janusgraph-core-0.2.1-SNAPSHOT.jar:na]
        at org.janusgraph.graphdb.database.StandardJanusGraph.<init>(StandardJanusGraph.java:139) ~[janusgraph-core-0.2.1-SNAPSHOT.jar:na]
        at org.janusgraph.core.JanusGraphFactory.open(JanusGraphFactory.java:164) ~[janusgraph-core-0.2.1-SNAPSHOT.jar:na]
        at org.janusgraph.core.JanusGraphFactory.open(JanusGraphFactory.java:133) ~[janusgraph-core-0.2.1-SNAPSHOT.jar:na]
        at org.janusgraph.core.JanusGraphFactory.open(JanusGraphFactory.java:123) ~[janusgraph-core-0.2.1-SNAPSHOT.jar:na]
        at org.janusgraph.hadoop.formats.util.input.current.JanusGraphHadoopSetupImpl.<init>(JanusGraphHadoopSetupImpl.java:52) ~[janusgraph-hadoop-0.2.1-SNAPSHOT.jar:na]
        ... 38 common frames omitted
Caused by: java.lang.reflect.InvocationTargetException: null
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[na:1.8.0_172]
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[na:1.8.0_172]
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[na:1.8.0_172]
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423) ~[na:1.8.0_172]
        at org.janusgraph.util.system.ConfigurationUtil.instantiate(ConfigurationUtil.java:58) ~[janusgraph-core-0.2.1-SNAPSHOT.jar:na]
        ... 47 common frames omitted
Caused by: org.janusgraph.diskstorage.PermanentBackendException: Connection refused
        at org.janusgraph.diskstorage.es.ElasticSearchIndex.<init>(ElasticSearchIndex.java:243) ~[janusgraph-es-0.2.1-SNAPSHOT.jar:na]
        ... 52 common frames omitted
Caused by: java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[na:1.8.0_172]
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) ~[na:1.8.0_172]
        at org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processEvent(DefaultConnectingIOReactor.java:171) ~[httpcore-nio-4.4.5.jar:4.4.5]
        at org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processEvents(DefaultConnectingIOReactor.java:145) ~[httpcore-nio-4.4.5.jar:4.4.5]
        at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor.execute(AbstractMultiworkerIOReactor.java:348) ~[httpcore-nio-4.4.5.jar:4.4.5]
        at org.apache.http.impl.nio.conn.PoolingNHttpClientConnectionManager.execute(PoolingNHttpClientConnectionManager.java:192) ~[httpasyncclient-4.1.2.jar:4.1.2]
        at org.apache.http.impl.nio.client.CloseableHttpAsyncClientBase$1.run(CloseableHttpAsyncClientBase.java:64) ~[httpasyncclient-4.1.2.jar:4.1.2]
        ... 1 common frames omitted
03:17:15.728 [dispatcher-event-loop-2] INFO  o.a.s.e.CoarseGrainedExecutorBackend - Got assigned task 5

Looks like the HadoopGraph is trying to connect to Elasticsearch on localhost. But why?

So, I added the following properties to my file and tried running again, hoping that it would connect to the right Elasticsearch this time.
index.search.backend=elasticsearch
index.search.hostname=igatest178.rtp.raleigh.ibm.com
index.search.port=9200


But the error is the same as above. 

So is it a known bug that the HadoopGraph can only connect to a localhost Elasticsearch?

Antriksh Shah

unread,
Jul 11, 2018, 4:30:10 AM7/11/18
to JanusGraph users
Hey, could you also add the following to your properties file and check whether it works?
janusgraphmr.ioformat.conf.index.search.backend=elasticsearch
janusgraphmr.ioformat.conf.index.search.hostname=IP
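
The janusgraphmr.ioformat.conf.* prefix is what the Cassandra input format hands to the JanusGraph instance it opens on the executors, so index settings outside that prefix are ignored there. In your properties file that would look roughly like this (the hostname being whatever your ES node is reachable at from the Spark workers):

janusgraphmr.ioformat.conf.index.search.backend=elasticsearch
janusgraphmr.ioformat.conf.index.search.hostname=igatest178.rtp.raleigh.ibm.com
janusgraphmr.ioformat.conf.index.search.port=9200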

Debasish Kanhar

unread,
Jul 11, 2018, 7:17:22 AM7/11/18
to JanusGraph users
Hi Antriksh,

Thanks for that set of configurations. I FINALLY GOT OLAP WITH SPARK WORKING AFTER 6 MONTHS!! :-D :-D This is the best I've felt since I started working with Janus.

Anyway, I also changed http.host in my elasticsearch.yml to expose the ES node on localhost as well as its own container hostname, and was able to get OLAP working that way too.
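
The change was roughly along these lines in elasticsearch.yml (a sketch from memory; exact syntax can differ between ES versions):

# bind the HTTP endpoint on loopback as well as the container's own hostname
http.host: ["127.0.0.1", "igatest178.rtp.raleigh.ibm.com"]
http.port: 9200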

Also, I was able to get OLAP working even when my Spark executors don't have Cassandra installed on them. That was a new discovery for me, as I had thought the Spark workers needed Cassandra and ES installed on the same machine as the worker.

Thanks to everyone who helped me along the way. The next step is to wait for the 0.3.0 release so that we can move to a recent version of Spark as well. :-)

Antriksh Shah

unread,
Jul 13, 2018, 2:31:49 AM7/13/18
to JanusGraph users
I feel you! It took me a ton of time to get it working. I had made a mental resolution to write up the issues I faced and the steps taken to resolve them, but haven't completed it yet. If you get some time, do share the issues you ran into and how you resolved them :)

Jason Plurad

unread,
Jul 13, 2018, 12:28:56 PM7/13/18
to JanusGraph users
I'm planning to make some doc updates with https://github.com/JanusGraph/janusgraph/issues/1159

Debasish, Antriksh, Hadoop Marc: please let me know if you'd like to help polish this off. Thanks!