CqlBridgeRecordReader

I've been stuck at this point for a long time now. Any help is really appreciated.
Graph graph = JanusGraphFactory.build()
        .set("storage.backend", "cassandra")
        .set("storage.hostname", "9.30.xxx.222, 9.30.xx.29, 9.30.xxx.218")
        .set("storage.port", "9160")
        .set("index.search.backend", "elasticsearch")
        .set("index.search.hostname", "127.0.0.1")
        .open();
GraphTraversalSource g = graph.traversal();
System.out.println("Before Pushing " + g.V().count().profile().next());
String read_cassandra_properties = "/opt/IGA/JanusGraph0.2.0/conf/hadoop-graph/read-cassandra3-cluster.properties";
Graph graph_computer = GraphFactory.open(read_cassandra_properties);

read-cassandra3-cluster.properties is as follows:
gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
gremlin.hadoop.graphInputFormat=org.janusgraph.hadoop.formats.cassandra.Cassandra3InputFormat
gremlin.hadoop.graphOutputFormat=org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoOutputFormat
gremlin.hadoop.jarsInDistributedCache=true
gremlin.hadoop.inputLocation=none
gremlin.hadoop.outputLocation=output
janusgraphmr.ioformat.conf.storage.backend=cassandra
janusgraphmr.ioformat.conf.storage.hostname=9.30.xxx.222, 9.30.xx.29, 9.30.xxx.218
janusgraphmr.ioformat.conf.storage.port=9160
janusgraphmr.ioformat.conf.storage.cassandra.keyspace=janusgraph
cassandra.input.partitioner.class=org.apache.cassandra.dht.Murmur3Partitioner
spark.master=local[4]
spark.serializer=org.apache.spark.serializer.KryoSerializer
gremlin.hadoop.defaultGraphComputer=org.apache.tinkerpop.gremlin.spark.process.computer.SparkGraphComputer
Once the graph is loaded, I do g = graph_computer.traversal().withComputer();
I then try to use g to profile my count step, but the following code fails:
System.out.println(g.V().count().profile().next());
I get the following error:
java.lang.IllegalStateException: java.io.IOException: Could not get input splits
And
Caused by: java.io.IOException: failed connecting to all endpoints 9.30.253.222
Please have a look at this link for the complete stack trace:
Traceback.
And, I thought the error was because I couldn't connect to the Cassandra IPs, but as mentioned I was able to connect to the Cassandra cluster a few lines earlier, so this error seems rather strange.
Thanks in advance!

By default, JanusGraph uses the Astyanax library to connect to Cassandra clusters. On EC2 and Rackspace, it has been reported that Astyanax was unable to establish a connection to the cluster. In those cases, changing the backend to storage.backend=cassandrathrift solved the problem.
HTH, Marc
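Marc's suggestion corresponds to a one-line change in the OLAP properties file; a sketch, assuming the rest of read-cassandra3-cluster.properties stays exactly as posted above:

```properties
# Use the Thrift-based storage adapter instead of Astyanax for the InputFormat
janusgraphmr.ioformat.conf.storage.backend=cassandrathrift
```

The matching OLTP change would be set("storage.backend", "cassandrathrift") in the JanusGraphFactory builder, if Astyanax fails to connect there as well.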
g.V().count();
result = graph.compute().program(PageRankVertexProgram.build().create()).submit().get();
g = result.graph().traversal();
g.V().valueMap()
java.lang.NoClassDefFoundError: io/netty/channel/epoll/EpollDatagramChannel$DatagramSocketAddress

## Hadoop Graph Configuration
#
gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
gremlin.hadoop.graphInputFormat=org.janusgraph.hadoop.formats.cassandra.CassandraInputFormat
gremlin.hadoop.graphOutputFormat=org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoOutputFormat
gremlin.hadoop.jarsInDistributedCache=true
gremlin.hadoop.inputLocation=none
gremlin.hadoop.outputLocation=output
## JanusGraph Cassandra InputFormat configuration
#
janusgraphmr.ioformat.conf.storage.backend=cassandrathrift
# As mentioned, I've local Cassandra in same VM as Spark Master and JG.
janusgraphmr.ioformat.conf.storage.hostname=127.0.0.1
janusgraphmr.ioformat.conf.storage.port=9160
janusgraphmr.ioformat.conf.storage.cassandra.keyspace=panamaDev
## Apache Cassandra InputFormat configuration
#
cassandra.input.partitioner.class=org.apache.cassandra.dht.Murmur3Partitioner
## SparkGraphComputer Configuration
#
spark.master=yarn-client
spark.driver.host=9.30.100.218
#spark.executor.memory=1536m
spark.executor.memory=6g
spark.serializer=org.apache.spark.serializer.KryoSerializer
spark.yarn.dist.archives=/opt/JanusGraph/0.2.0/lib.zip
spark.yarn.dist.files=/opt/JanusGraph/0.2.0/lib/janusgraph-hbase-0.2.0.jar
spark.driver.extraLibraryPath=/home/hadoop/hadoop/lib/native
spark.executor.extraLibraryPath=/home/hadoop/hadoop/lib/native
gremlin.spark.persistContext=true
# Default Graph Computer
gremlin.hadoop.defaultGraphComputer=org.apache.tinkerpop.gremlin.spark.process.computer.SparkGraphComputer
gremlin> graph = GraphFactory.open("/opt/resources/janusgraph-connections/testGraph-OLAP-yarn-cassandra-local.properties")
==>hadoopgraph[cassandrainputformat->gryooutputformat]
gremlin> g = graph.traversal().withComputer()
==>graphtraversalsource[hadoopgraph[cassandrainputformat->gryooutputformat], graphcomputer]
gremlin> g.V().count()
11:58:17 ERROR org.apache.spark.SparkContext - Error initializing SparkContext.
org.apache.spark.SparkException: Unable to load YARN support
    at org.apache.spark.deploy.SparkHadoopUtil$.liftedTree1$1(SparkHadoopUtil.scala:399)
    at org.apache.spark.deploy.SparkHadoopUtil$.yarn$lzycompute(SparkHadoopUtil.scala:394)
    at org.apache.spark.deploy.SparkHadoopUtil$.yarn(SparkHadoopUtil.scala:394)
    at org.apache.spark.deploy.SparkHadoopUtil$.get(SparkHadoopUtil.scala:411)
    at org.apache.spark.util.Utils$.getSparkOrYarnConfig(Utils.scala:2118)
    at org.apache.spark.storage.BlockManager.<init>(BlockManager.scala:105)
    at org.apache.spark.SparkEnv$.create(SparkEnv.scala:365)
    at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:193)
    at org.apache.spark.SparkContext.createSparkEnv(SparkContext.scala:288)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:457)
    at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2281)
    at org.apache.spark.SparkContext.getOrCreate(SparkContext.scala)
    at org.apache.tinkerpop.gremlin.spark.structure.Spark.create(Spark.java:52)
    at org.apache.tinkerpop.gremlin.spark.structure.Spark.create(Spark.java:60)
    at org.apache.tinkerpop.gremlin.spark.process.computer.SparkGraphComputer.lambda$submitWithExecutor$0(SparkGraphComputer.java:193)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassNotFoundException: org.apache.spark.deploy.yarn.YarnSparkHadoopUtil
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:348)
    at org.apache.spark.util.Utils$.classForName(Utils.scala:174)
    at org.apache.spark.deploy.SparkHadoopUtil$.liftedTree1$1(SparkHadoopUtil.scala:395)
    ... 18 more
org.apache.spark.SparkException: Unable to load YARN support
Type ':help' or ':h' for help.
Display stack trace? [yN]