TitanHadoop Import Failed Job (ID block allocation)


Elaine Wu

Dec 15, 2015, 3:19:08 PM
to Aurelius
Hi,

We are testing the import of a fairly large graph (1.5 billion edges, 270 million vertices) into our HBase (1.1.2) backend using Titan Hadoop (0.5.4). We first ran the job on very small graphs, which worked fine: the MR job finished and the data was written into HBase properly. However, with the bigger dataset, the job fails during the map phase of the second MR job:


Executed [Job 1/2: EdgeCopyMapReduce.Map > EdgeCopyMapReduce.Reduce > IdentityMap.Map] successfully

Executing [Job 2/2: TitanGraphOutputMapReduce.VertexMap > TitanGraphOutputMapReduce.Reduce > TitanGraphOutputMapReduce.EdgeMap] 

Our HBase runs on top of HDFS, but on a cluster separate from the one where we run our MR jobs (Hadoop 2.6.0).

The error we run into is the following (the same error repeats for various partitions/namespaces):

Error: java.io.IOException: ID block allocation on partition(7)-namespace(3) timed out in 120.0 s
        at com.thinkaurelius.titan.hadoop.formats.util.TitanGraphOutputMapReduce$VertexMap.map(TitanGraphOutputMapReduce.java:153)
        at com.thinkaurelius.titan.hadoop.formats.util.TitanGraphOutputMapReduce$VertexMap.map(TitanGraphOutputMapReduce.java:105)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
        at org.apache.hadoop.mapreduce.lib.chain.Chain.runMapper(Chain.java:389)
        at org.apache.hadoop.mapreduce.lib.chain.ChainMapper.run(ChainMapper.java:149)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: com.thinkaurelius.titan.core.TitanException: ID block allocation on partition(7)-namespace(3) timed out in 120.0 s
        at com.thinkaurelius.titan.graphdb.database.idassigner.StandardIDPool.waitForIDRenewer(StandardIDPool.java:131)
        at com.thinkaurelius.titan.graphdb.database.idassigner.StandardIDPool.nextBlock(StandardIDPool.java:148)
        at com.thinkaurelius.titan.graphdb.database.idassigner.StandardIDPool.nextID(StandardIDPool.java:193)
        at com.thinkaurelius.titan.graphdb.database.idassigner.VertexIDAssigner.assignID(VertexIDAssigner.java:330)
        at com.thinkaurelius.titan.graphdb.database.idassigner.VertexIDAssigner.assignID(VertexIDAssigner.java:185)
        at com.thinkaurelius.titan.graphdb.database.idassigner.VertexIDAssigner.assignID(VertexIDAssigner.java:151)
        at com.thinkaurelius.titan.graphdb.database.StandardTitanGraph.assignID(StandardTitanGraph.java:385)
        at com.thinkaurelius.titan.graphdb.transaction.StandardTitanTx.addPropertyInternal(StandardTitanTx.java:744)
        at com.thinkaurelius.titan.graphdb.transaction.StandardTitanTx.setProperty(StandardTitanTx.java:778)
        at com.thinkaurelius.titan.graphdb.transaction.StandardTitanTx.addProperty(StandardTitanTx.java:696)
        at com.thinkaurelius.titan.graphdb.transaction.StandardTitanTx.addVertex(StandardTitanTx.java:494)
        at com.thinkaurelius.titan.graphdb.transaction.StandardTitanTx.addVertexWithLabel(StandardTitanTx.java:516)
        at com.thinkaurelius.titan.graphdb.blueprints.TitanBlueprintsGraph.addVertexWithLabel(TitanBlueprintsGraph.java:215)
        at com.thinkaurelius.titan.hadoop.formats.util.TitanGraphOutputMapReduce$VertexMap.getTitanVertex(TitanGraphOutputMapReduce.java:216)
        at com.thinkaurelius.titan.hadoop.formats.util.TitanGraphOutputMapReduce$VertexMap.getCreateOrDeleteVertex(TitanGraphOutputMapReduce.java:186)
        at com.thinkaurelius.titan.hadoop.formats.util.TitanGraphOutputMapReduce$VertexMap.map(TitanGraphOutputMapReduce.java:132)
        ... 11 more
Caused by: java.util.concurrent.TimeoutException
        at java.util.concurrent.FutureTask.get(FutureTask.java:205)
        at com.thinkaurelius.titan.graphdb.database.idassigner.StandardIDPool.waitForIDRenewer(StandardIDPool.java:121)
        ... 26 more

We guessed that it has something to do with the ids.block-size parameter, which we attempted to set in our script properties file, but we still ran into the same problem. Does anyone have an idea why this might be happening, or which config parameters we need to set (or might have set incorrectly)?

Any help is greatly appreciated, thanks!

David

Dec 15, 2015, 4:57:09 PM
to Aurelius
Hi,


>> which we attempted to set in our script properties file
What does your config look like?

Judging from the "timed out in 120.0 s" message, you are running with the default ids.renew-timeout. Have you tried increasing that setting?

Larger id blocks lead to fewer acquisition failures. The 0.5.4 manual says:

       Set ids.block-size to the number of vertices you expect to add per Titan instance per hour.

In a bulk load, all of that happens at once, so a very large initial id block size might help. Some people have used ids.block-size=20000000 or larger, but the right value also depends on your cluster config, such as the number of mappers.
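
For example, a plain Titan properties file for the output graph might contain something like the following sketch (the values are illustrative, not a recommendation; as far as I can tell, ids.renew-timeout is in milliseconds):

# illustrative values for a large bulk load; tune to your own cluster
# reserve 20M ids per block so tasks renew blocks far less often
ids.block-size=20000000
# wait up to 10 minutes for an id block before failing
# (the default appears to be 120000 ms, i.e. the "120.0 s" in the error)
ids.renew-timeout=600000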

Elaine Wu

Dec 16, 2015, 10:54:58 AM
to Aurelius
Hi David,

That's the same number we set our block size to, and we also set the renew-timeout property, but we suspect the settings are not actually taking effect, since we noticed the 120 s default in the error too. What is the proper way to set these properties? We have also tried prefixing them with titan.hadoop, which we saw in some other properties files on GitHub.

Our config file looks something like this:

titan.hadoop.output.format=com.thinkaurelius.titan.hadoop.formats.hbase.TitanHBaseOutputFormat
titan.hadoop.output.conf.storage.backend=hbase
titan.hadoop.output.conf.storage.hostname=our hostname
titan.hadoop.output.conf.storage.port=2181
titan.hadoop.output.conf.storage.batch-loading=true
titan.hadoop.output.infer-schema=false

#titan.hadoop.graph.output.titan.ids.block-size=20000000
#titan.hadoop.graph.output.titan.ids.renew-timeout=3600000

ids.block-size=20000000
ids.renew-timeout=3600000

# controls size of transaction
mapred.task.timeout=5400000
mapred.min.split.size=134217728
mapred.reduce.tasks=20
mapred.job.reuse.jvm.num.tasks=-1

mapreduce.map.memory.mb=8096
mapreduce.map.java.opts=-Xmx8072m
mapreduce.reduce.memory.mb=8096
mapreduce.reduce.java.opts=-Xmx8072m
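
If the id options need the same passthrough prefix as the storage options above, we would guess the lines should read as follows, but we have not confirmed that this spelling is correct:

# unverified guess: pass the id settings through to the output graph
titan.hadoop.output.conf.ids.block-size=20000000
titan.hadoop.output.conf.ids.renew-timeout=3600000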


Thanks for the help! 