Titan 1.1 + HBase 1.0.1 + Hadoop 2.7.0
Hi, I was using `SparkGraphComputer` to compute a simple vertex count.
```
config_file = '/path/to/titan-hbase-spark.properties'
graph = GraphFactory.open(config_file)
g = graph.traversal(computer(SparkGraphComputer))
g.V().count()
```
When I submit the job to Spark with `spark.master=spark://spark-master:7077` in `titan-hbase-spark.properties`, it runs on Spark successfully and returns the vertex count.
But when I submit the job to Mesos with `spark.master=mesos://mesos-master:5050` in `titan-hbase-spark.properties`, it fails: tasks keep restarting and getting lost. It seems that the executors cannot connect to the driver, and Titan's jars are not copied to the Mesos slaves.
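For reference, the relevant parts of my properties file look roughly like this (host names are placeholders for my cluster; the `gremlin.*` and `titanmr.*` keys follow Titan's Hadoop/Spark documentation):

```properties
# HadoopGraph over Titan's HBase input format (placeholder hosts)
gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
gremlin.hadoop.graphInputFormat=com.thinkaurelius.titan.hadoop.formats.hbase.HBaseInputFormat
gremlin.hadoop.graphOutputFormat=org.apache.hadoop.mapreduce.lib.output.NullOutputFormat
titanmr.ioformat.conf.storage.backend=hbase
titanmr.ioformat.conf.storage.hostname=zookeeper-host

# This is the only line I change between the two runs:
spark.master=mesos://mesos-master:5050
# spark.master=spark://spark-master:7077
spark.serializer=org.apache.spark.serializer.KryoSerializer
```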
The following log was found in the Mesos slave's `stderr`:
```
16/05/16 11:39:26 INFO Utils: Successfully started service 'sparkExecutorActorSystem' on port 43535.
org.apache.spark.rpc.RpcTimeoutException: Futures timed out after [120 seconds]. This timeout is controlled by spark.rpc.lookupTimeout
at org.apache.spark.rpc.RpcTimeout.org$apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcTimeout.scala:48)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:63)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:33)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:76)
at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:97)
at org.apache.spark.rpc.RpcEnv.setupEndpointRef(RpcEnv.scala:106)
at org.apache.spark.util.RpcUtils$.makeDriverRef(RpcUtils.scala:36)
at org.apache.spark.SparkEnv$.registerOrLookupEndpoint$1(SparkEnv.scala:324)
at org.apache.spark.SparkEnv$.create(SparkEnv.scala:336)
at org.apache.spark.SparkEnv$.createExecutorEnv(SparkEnv.scala:217)
at org.apache.spark.executor.MesosExecutorBackend.registered(MesosExecutorBackend.scala:75)
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [120 seconds]
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
at scala.concurrent.Await$.result(package.scala:107)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
... 7 more
```
Hi Marc,
I finally found out that `${HADOOP_HOME}` was not exported; after adding it, the Titan job ran successfully on Mesos. Thank you very much!
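For anyone hitting the same issue, the fix was along these lines (the paths are examples; point them at your actual Hadoop installation):

```shell
# Example paths; adjust to your installation.
export HADOOP_HOME=/usr/local/hadoop
export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop
# These must be exported in the environment that launches the
# Gremlin console / Spark driver, and available on the Mesos slaves too.
```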
By the way, is there any configuration that helps reduce the compute time? It took about 20 minutes to count the vertices in the graph (about 1.3M vertices).
As far as I know, I can set `spark.driver.memory`, `spark.executor.memory`, and `spark.cores.max` in my config file (`titan-hbase-spark.properties`) to add CPU and memory. I set sufficient cores but found only one active task running. Does the number of running tasks depend on the number of regions of the HBase table?
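The resource settings go in the same properties file; the values below are hypothetical and need to be tuned to the cluster:

```properties
# Hypothetical values; tune to your cluster size.
spark.driver.memory=2g
spark.executor.memory=4g
spark.cores.max=16
```

Note that these only raise the resources available; they do not increase parallelism if the input itself is a single partition.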
The most time-consuming stages were:

- `foreachPartition at SparkExecutor.java:134`: 8.7 min
- `mapPartitionsToPair at SparkExecutor.java:170`: almost 9 min
P.S. English is not my native language, so there may be grammar mistakes; if anything is unclear, please tell me.
Hi Lex,
I also tried increasing the HBase table's region count from 1 to 100 and loaded more test data (12M vertices). It took only 4 minutes to return the result.
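If Titan creates the table itself, the initial region count can be set at creation time. Assuming the `storage.hbase.region-count` option of Titan's HBase backend (check your Titan version's configuration reference), something like:

```properties
# Assumed option from Titan's HBase backend: initial number of regions
# when Titan creates its table (only takes effect at table-creation time).
storage.hbase.region-count=100
```

For an existing table, the regions have to be split on the HBase side instead.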