Errors with GiraphGraphComputer


vnma...@gmail.com

Jun 29, 2016, 2:04:13 PM
to Gremlin-users
Hello,

I am having some issues with running the GiraphGraphComputer in the gremlin console. I have been able to successfully install the tinkerpop.giraph plugin and activate it, and the HADOOP_GREMLIN_LIBS variable is set to the ext/giraph-gremlin/lib directory.

I am running the Gremlin Console on a VM. I can read the actual graph successfully using SparkGraphComputer, but when I try to use Giraph, I get the following output:


plugin activated: tinkerpop.giraph
gremlin> graph = GraphFactory.open('conf/hadoop-graph/hadoop-script.properties')
==>hadoopgraph[scriptinputformat->graphsonoutputformat]
gremlin> g = graph.traversal(computer(GiraphGraphComputer))
==>graphtraversalsource[hadoopgraph[scriptinputformat->graphsonoutputformat], giraphgraphcomputer]
gremlin> g.V().count()
17:54:13 WARN  org.apache.hadoop.util.NativeCodeLoader  - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17:54:14 INFO  org.apache.tinkerpop.gremlin.hadoop.process.computer.giraph.GiraphGraphComputer  - HadoopGremlin(Giraph): TraversalVertexProgram[GraphStep([],vertex), CountGlobalStep, ComputerResultStep]
java.lang.IllegalStateException: checkLocalJobRunnerConfiguration: When using LocalJobRunner, must have only one worker since only 1 task at a time!


Also, my properties file looks like this:

gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
gremlin.hadoop.graphInputFormat=org.apache.tinkerpop.gremlin.hadoop.structure.io.script.ScriptInputFormat
gremlin.hadoop.graphOutputFormat=org.apache.tinkerpop.gremlin.hadoop.structure.io.graphson.GraphSONOutputFormat
gremlin.hadoop.jarsInDistributedCache=true

gremlin.hadoop.inputLocation=data/mygraph.txt
gremlin.hadoop.scriptInputFormat.script=data/script-input-tinkerpop.groovy
gremlin.hadoop.outputLocation=output

#####################################
# GiraphGraphComputer Configuration #
#####################################
giraph.minWorkers=1
giraph.maxWorkers=2
giraph.useOutOfCoreGraph=true
giraph.useOutOfCoreMessages=true
mapred.map.child.java.opts=-Xmx1024m
mapred.reduce.child.java.opts=-Xmx1024m
giraph.numInputThreads=4
giraph.numComputeThreads=4
# giraph.maxPartitionsInMemory=1
# giraph.userPartitionCount=2

####################################
# SparkGraphComputer Configuration #
####################################
spark.master=local[4]
# spark.master=yarn-client
spark.executor.memory=1g
spark.serializer=org.apache.spark.serializer.KryoSerializer
# spark.kryo.registrationRequired=true
# spark.storage.memoryFraction=0.2
spark.eventLog.enabled=true
spark.eventLog.dir=tmp/spark-event-logs
# spark.ui.killEnabled=true

Has anyone else been having similar issues or figured out how to run the GiraphGraphComputer?

Jason Plurad

Jun 29, 2016, 4:46:52 PM
to Gremlin-users
Try adding this property

giraph.SplitMasterWorker=false

I took notes here: https://github.com/pluradj/ambari-vagrant/blob/tp3/ubuntu14.4/tp3/GiraphGraphComputer.md

-- Jason
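
Editor's sketch (not from the original posters): the error text mentions the worker count, while the suggestion above is giraph.SplitMasterWorker=false. Which of the two a given setup needs is not settled in this thread, so the hypothetical helper below only reports both settings from a properties file. The function names are made up for illustration.

```shell
#!/bin/sh
# Sketch only: report Giraph settings that the messages above associate
# with the LocalJobRunner failure. Function names are hypothetical.

# Print the value of the last uncommented "key=value" line for key $2 in file $1.
get_prop() {
    awk -v key="$2" '
        index($0, key "=") == 1 { val = substr($0, length(key) + 2); found = 1 }
        END { if (found) print val }
    ' "$1"
}

# Print one line per suspicious setting; return non-zero if any was found.
check_local_giraph() {
    props=$1
    rc=0
    max=$(get_prop "$props" giraph.maxWorkers)
    split=$(get_prop "$props" giraph.SplitMasterWorker)
    if [ -n "$max" ] && [ "$max" != "1" ]; then
        echo "giraph.maxWorkers=$max (the error message asks for one worker)"
        rc=1
    fi
    if [ "$split" != "false" ]; then
        echo "giraph.SplitMasterWorker is not set to false"
        rc=1
    fi
    return "$rc"
}
```

For example, `check_local_giraph conf/hadoop-graph/hadoop-script.properties` run against the file posted above would print both lines, since that file sets giraph.maxWorkers=2 and does not set giraph.SplitMasterWorker.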

vnma...@gmail.com

Jun 29, 2016, 5:49:28 PM
to Gremlin-users
Hi Jason,

Thanks for your reply. I added that to the properties file, and that issue no longer appears. Unfortunately, now I am encountering a connection issue with Zookeeper.

Here is the (first) error that shows up when I try to run the same commands as above:

21:39:16 WARN  org.apache.giraph.zk.ZooKeeperManager  - onlineZooKeeperServers: Got ConnectException
java.net.ConnectException: Connection refused
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.net.Socket.connect(Socket.java:589)
        at org.apache.giraph.zk.ZooKeeperManager.onlineZooKeeperServers(ZooKeeperManager.java:701)
        at org.apache.giraph.graph.GraphTaskManager.startZooKeeperManager(GraphTaskManager.java:357)
        at org.apache.giraph.graph.GraphTaskManager.setup(GraphTaskManager.java:188)
        at org.apache.giraph.graph.GraphMapper.setup(GraphMapper.java:60)
        at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:90)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
        at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:223)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor


Does anyone know of any configurations or properties to deal with ZooKeeper?

Jason Plurad

Jun 30, 2016, 9:34:52 AM
to Gremlin-users
Giraph will start ZooKeeper on its own on port 22181, if you don't specify your own. Have you successfully run any of the standalone Giraph samples to confirm it works with your setup?

If you stand up your own ZooKeeper, you specify it in the properties file like this:

# Use external ZooKeeper instead of local ZooKeeper (optional)
giraph.zkList=192.168.0.1:2181



-- Jason

Marko Rodriguez

Jun 30, 2016, 9:39:02 AM
to gremli...@googlegroups.com
Hi,

Note that if you let Giraph stand up ZooKeeper “on the fly” for you, you will get 1 or 2 “connection refused” exceptions before a valid connection is made, because Giraph tries to connect before ZooKeeper is fully loaded. This is a known “issue” discussed in the Giraph documentation. It's not really an issue, as it just throws an exception and retries, but unfortunately it does fill your logs with exception messages.

HTH,
Marko.

HadoopMarc

Aug 20, 2017, 9:50:18 AM
to Gremlin-users
Hi gremlin on Giraph testers,

I also found that to run the GiraphGraphComputer example from the reference docs locally (that is, without external or pseudo-distributed Hadoop services), you need to add the lib jars to gremlin-console's classpath (working from gremlin-console's root):

export CLASSPATH=$PWD/lib/*
bin/gremlin.sh


This seems to be a bug: GiraphGraphComputer attempts to start ZooKeeper from its ZooKeeper directory using the classpath from gremlin.sh, which contains relative paths. Adding the same artifacts to the classpath with absolute paths, as above, works around it.

If nobody reports back that it works without the classpath workaround, I'll file a ticket.

Cheers,   Marc
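
Editor's sketch (not from the original poster): since the workaround above depends on the classpath entries being absolute, a small helper can resolve the console root and check that lib exists before exporting. console_classpath is a hypothetical name, not part of gremlin.sh, and this assumes gremlin.sh picks up an exported CLASSPATH as reported above.

```shell
#!/bin/sh
# Sketch: print an absolute Java classpath wildcard for a console install.
# console_classpath is a hypothetical helper, not part of gremlin.sh.
console_classpath() {
    # Resolve the given directory to an absolute path.
    root=$(cd "$1" >/dev/null 2>&1 && pwd) || return 1
    [ -d "$root/lib" ] || { echo "no lib directory under $root" >&2; return 1; }
    # The trailing /* stays literal; the JVM expands it to all jars in lib.
    printf '%s\n' "$root/lib/*"
}

# Usage, from gremlin-console's root as in the message above:
#   CLASSPATH=$(console_classpath "$PWD") && export CLASSPATH
#   bin/gremlin.sh
```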


HadoopMarc

Aug 20, 2017, 10:39:51 AM
to Gremlin-users
Hi all,

While retracing the changes I made, I found a step I omitted above (obvious in hindsight, but the hundreds of lines of log output give no sensible cue). When using GiraphGraphComputer locally, HDFS falls back to the local file system, so the corresponding line in conf/hadoop/hadoop-gryo.properties should read:

gremlin.hadoop.inputLocation=data/tinkerpop-modern.kryo

Marc


HadoopMarc

Aug 23, 2017, 3:23:41 PM
to Gremlin-users
OK, I wrapped up this entire discussion in the ticket below:
https://issues.apache.org/jira/browse/TINKERPOP-1757

Cheers,    Marc
