Does Titan support graph compute on Mesos?


lexk

May 17, 2016, 3:33:33 AM
to Aurelius

Titan 1.1 + HBase 1.0.1 + Hadoop 2.7.0


Hi, I was using `SparkGraphComputer` to compute a simple vertex count.


```
config_file = '/path/to/titan-hbase-spark.properties'
graph = GraphFactory.open(config_file)
g = graph.traversal(computer(SparkGraphComputer))
g.V().count()
```


When I submit the job to Spark standalone with `spark.master=spark://spark-master:7077` in `titan-hbase-spark.properties`, the job runs successfully and returns the vertex count.


But when I submit the job to Mesos with `spark.master=mesos://mesos-master:5050` in `titan-hbase-spark.properties`, it fails: tasks keep restarting and getting lost. It seems the executor couldn't connect to the driver, and Titan's jars weren't copied to the Mesos slave.


The following log was found in the Mesos slave's `stderr`:

```
16/05/16 11:39:26 INFO Utils: Successfully started service 'sparkExecutorActorSystem' on port 43535.
org.apache.spark.rpc.RpcTimeoutException: Futures timed out after [120 seconds]. This timeout is controlled by spark.rpc.lookupTimeout
        at org.apache.spark.rpc.RpcTimeout.org$apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcTimeout.scala:48)
        at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:63)
        at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
        at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:33)
        at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:76)
        at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:97)
        at org.apache.spark.rpc.RpcEnv.setupEndpointRef(RpcEnv.scala:106)
        at org.apache.spark.util.RpcUtils$.makeDriverRef(RpcUtils.scala:36)
        at org.apache.spark.SparkEnv$.registerOrLookupEndpoint$1(SparkEnv.scala:324)
        at org.apache.spark.SparkEnv$.create(SparkEnv.scala:336)
        at org.apache.spark.SparkEnv$.createExecutorEnv(SparkEnv.scala:217)
        at org.apache.spark.executor.MesosExecutorBackend.registered(MesosExecutorBackend.scala:75)
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [120 seconds]
        at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
        at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
        at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
        at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
        at scala.concurrent.Await$.result(package.scala:107)
        at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
        ... 7 more
```
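For reference, the only intended difference between the two runs is the master URL in `titan-hbase-spark.properties`. A sketch of the relevant lines; the Mesos-specific executor keys below are assumptions taken from Spark's Mesos documentation, not from this setup, and all paths are illustrative:

```
# Spark standalone master (this run succeeded):
# spark.master=spark://spark-master:7077

# Mesos master (this run failed as described):
spark.master=mesos://mesos-master:5050

# On Mesos, executors also need to locate a Spark distribution; Spark's
# Mesos docs suggest one of these (assumed, adjust to your cluster):
# spark.executor.uri=hdfs://namenode/spark/spark-1.5.2-bin-hadoop2.6.tgz
# spark.mesos.executor.home=/opt/spark
```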

HadoopMarc

May 17, 2016, 3:41:25 PM
to Aurelius
Hi Lex,

Nice to hear about Mesos; I hope my own Spark applications will one day be portable to Mesos as well. What distributed filesystem do you use with it?

Regarding your question, does Mesos have any configuration files that need to be on your classpath?

I only have experience getting spark-yarn to work with Titan; there, the Titan app/console/server needs the spark-yarn jars and the directory with the spark-yarn configuration files on its classpath. You would get this kind of error message if Titan on spark-yarn were missing the resource manager config.

Cheers,     Marc

On Tuesday, May 17, 2016 at 09:33:33 UTC+2, lexk wrote:

lexk

May 17, 2016, 10:37:21 PM
to Aurelius
Hi Marc,

Thanks for your reply, and for your work on Spark applications ^ ^.

Titan's backend is HBase 1.0.1, with files stored in HDFS (Hadoop 2.7.0), and `gremlin.hadoop.outputLocation` is set to an HDFS path.

For Mesos, I only exported `MESOS_NATIVE_JAVA_LIBRARY`, and I found `mesos-0.21.1-shaded-protobuf.jar` in `${TITAN_HOME}/lib`.

Looking forward to your reply.

Cheers,    lexk

On Wednesday, May 18, 2016 at 3:41:25 AM UTC+8, HadoopMarc wrote:

lexk

May 18, 2016, 4:26:42 AM
to Aurelius

Hi Marc,



I finally found out that `${HADOOP_HOME}` wasn't exported; after adding it, the Titan job runs successfully on Mesos. Thank you very much!
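For anyone hitting the same problem, a sketch of the environment setup before starting the Gremlin console; all install paths here are hypothetical, adjust them to your machines:

```shell
#!/bin/sh
# Hypothetical install locations -- adjust to your installation.
export HADOOP_HOME=/usr/local/hadoop              # the missing export in my case
export HADOOP_CONF_DIR="$HADOOP_HOME/etc/hadoop"
export MESOS_NATIVE_JAVA_LIBRARY=/usr/local/lib/libmesos.so
# Put the Hadoop configuration directory on the classpath so the
# Titan/Gremlin console can find the cluster settings:
export CLASSPATH="$HADOOP_CONF_DIR:$CLASSPATH"
echo "HADOOP_CONF_DIR=$HADOOP_CONF_DIR"
```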


By the way, is there any configuration that helps reduce compute time? It took about 20 minutes to count the vertices in the graph (about 1.3M vertices).


As far as I know, I can set `spark.driver.memory`, `spark.executor.memory`, and `spark.cores.max` in my config file (`titan-hbase-spark.properties`) to allocate more CPU and memory. I set sufficient cores, but found only one active task running. Does the number of running tasks depend on the number of HBase table regions?
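As a sketch, the resource settings mentioned above would look like this in `titan-hbase-spark.properties` (the values are illustrative only, not recommendations):

```
spark.driver.memory=2g
spark.executor.memory=4g
spark.cores.max=8
```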


The most time-consuming stages:

```
foreachPartition at SparkExecutor.java:134       8.7 min
mapPartitionsToPair at SparkExecutor.java:170    almost 9 min
```


P.S. English is not my native language, so there may be many grammar mistakes; if there is anything you can't understand, please tell me.



On Wednesday, May 18, 2016 at 3:41:25 AM UTC+8, HadoopMarc wrote:
Hi Lex,

lexk

May 18, 2016, 11:28:56 PM
to Aurelius

I also tried increasing the HBase table's region count from 1 to 100, and loaded more test data (12M vertices). It took only 4 minutes to return the result.
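For context: HBase's `TableInputFormat`, which Hadoop/Spark jobs read HBase through, creates one input split (and hence one Spark task) per region, so a single-region table yields a single task. A sketch of creating a pre-split table in the HBase shell; the table and column-family names are hypothetical:

```
hbase> create 'mytable', 'e', {NUMREGIONS => 100, SPLITALGO => 'HexStringSplit'}
```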


On Wednesday, May 18, 2016 at 4:26:42 PM UTC+8, lexk wrote: