Could not commit transaction due to exception during persistence


Parimi Rohit

Sep 19, 2013, 11:17:32 PM
to aureliu...@googlegroups.com
Hi All,

I am a new Titan user, and my use case is to update a graph from MapReduce. The following is a sketch of my Map class, where I am getting an exception (the configuration details are placeholders):


import java.io.IOException;

import org.apache.commons.configuration.BaseConfiguration;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

import com.thinkaurelius.titan.core.TitanFactory;
import com.thinkaurelius.titan.core.TitanGraph;

public class Map extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, Text> {

    private TitanGraph graph;

    @Override
    public void configure(JobConf job) {
        // Open the Titan db once per map task. The storage settings
        // here are placeholders for my actual HBase configuration.
        BaseConfiguration conf = new BaseConfiguration();
        conf.setProperty("storage.backend", "hbase");
        conf.setProperty("storage.hostname", job.get("titan.hbase.hostname"));
        graph = TitanFactory.open(conf);
    }

    public void map(LongWritable key, Text value,
                    OutputCollector<Text, Text> output, Reporter reporter)
            throws IOException {
        // Compute some value from the graph for the current user,
        // then set the computed value as a property for that user.
    }

    @Override
    public void close() throws IOException {
        graph.commit();    // persist this task's mutations
        graph.shutdown();
    }
}

With the above Map class, I am getting the following exception in one of the map tasks:

com.thinkaurelius.titan.core.TitanException: Could not commit transaction due to exception during persistence
at com.thinkaurelius.titan.graphdb.transaction.StandardTitanTx.commit(StandardTitanTx.java:848)
at com.thinkaurelius.titan.graphdb.blueprints.TitanBlueprintsGraph.commit(TitanBlueprintsGraph.java:64)
at adsorption.parallel.Parallel$SumCommonItemsMap.map(Parallel.java:337)
at adsorption.parallel.Parallel$SumCommonItemsMap.map(Parallel.java:1)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: com.thinkaurelius.titan.core.TitanException: Permanent exception during backend operation
at com.thinkaurelius.titan.diskstorage.util.BackendOperation.execute(BackendOperation.java:64)
at com.thinkaurelius.titan.graphdb.database.StandardTitanGraph.save(StandardTitanGraph.java:277)
at com.thinkaurelius.titan.graphdb.transaction.StandardTitanTx.commit(StandardTitanTx.java:839)
... 11 more
Caused by: com.thinkaurelius.titan.diskstorage.locking.PermanentLockingException: Updated state: lock acquired but value has changed since read [LockClaim [backer=com.thinkaurelius.titan.diskstorage.locking.consistentkey.ConsistentKeyLockStore@1473ec7, key=0x1-100-105-115-116-83-117-237, col=0x132, expectedValue=null]]
at com.thinkaurelius.titan.diskstorage.locking.consistentkey.ConsistentKeyLockTransaction.verifyAllLockClaims(ConsistentKeyLockTransaction.java:427)
at com.thinkaurelius.titan.diskstorage.locking.consistentkey.ConsistentKeyLockStore.mutate(ConsistentKeyLockStore.java:118)
at com.thinkaurelius.titan.diskstorage.BackendTransaction.mutateVertexIndex(BackendTransaction.java:111)
at com.thinkaurelius.titan.graphdb.database.IndexSerializer.addProperty(IndexSerializer.java:83)
at com.thinkaurelius.titan.graphdb.database.StandardTitanGraph.persist(StandardTitanGraph.java:306)
at com.thinkaurelius.titan.graphdb.database.StandardTitanGraph.access$000(StandardTitanGraph.java:45)
at com.thinkaurelius.titan.graphdb.database.StandardTitanGraph$2.call(StandardTitanGraph.java:262)
at com.thinkaurelius.titan.graphdb.database.StandardTitanGraph$2.call(StandardTitanGraph.java:203)
at com.thinkaurelius.titan.diskstorage.util.BackendOperation.execute(BackendOperation.java:61)
... 13 more

When I ran it another time, I got the same exception with a different stack trace:

Caused by: com.thinkaurelius.titan.diskstorage.locking.PermanentLockingException: Lock could not be acquired because it is held by a remote transaction [LockClaim [backer=com.thinkaurelius.titan.diskstorage.locking.consistentkey.ConsistentKeyLockStore@32f71, key=0x1-100-105-115-116-83-117-237, col=0x132, expectedValue=null]]
at com.thinkaurelius.titan.diskstorage.locking.consistentkey.ConsistentKeyLockTransaction.verifyAllLockClaims(ConsistentKeyLockTransaction.java:396)
at com.thinkaurelius.titan.diskstorage.locking.consistentkey.ConsistentKeyLockStore.mutate(ConsistentKeyLockStore.java:118)
at com.thinkaurelius.titan.diskstorage.BackendTransaction.mutateVertexIndex(BackendTransaction.java:111)
at com.thinkaurelius.titan.graphdb.database.IndexSerializer.addProperty(IndexSerializer.java:83)
at com.thinkaurelius.titan.graphdb.database.StandardTitanGraph.persist(StandardTitanGraph.java:306)
at com.thinkaurelius.titan.graphdb.database.StandardTitanGraph.access$000(StandardTitanGraph.java:45)
at com.thinkaurelius.titan.graphdb.database.StandardTitanGraph$2.call(StandardTitanGraph.java:262)
at com.thinkaurelius.titan.graphdb.database.StandardTitanGraph$2.call(StandardTitanGraph.java:203)
at com.thinkaurelius.titan.diskstorage.util.BackendOperation.execute(BackendOperation.java:61)
... 12 more

I am just wondering whether I am using commit correctly, or whether I have to commit after updating each user. Also, is it possible to update the same vertex from two different mappers (by update, I mean adding edges to that vertex)? Is there any documentation on updating the graph in a distributed environment? I am using HBase as the backend.

Any help is much appreciated.

Thanks,
Rohit


Matthias Broecheler

Sep 21, 2013, 4:20:12 PM
to aureliu...@googlegroups.com
Hi Parimi,

This exception is most often caused by not defining the schema up front in a MapReduce job against a Titan graph. If all mappers attempt to create the same type at (virtually) the same time, you are likely to get a locking exception like the ones you posted.

See other posts on this mailing list for similar problems. Easiest solution: define the schema up front (just once).
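For example, a minimal sketch of defining the types once, in a single thread, before the job launches (the property/label names and the configuration object here are illustrative):

    // Run once, before launching the MapReduce job, so that no mapper
    // ever has to create a type. "conf" stands in for your storage config.
    TitanGraph g = TitanFactory.open(conf);
    g.makeType().name("userId").dataType(String.class)
            .indexed(Vertex.class).unique(Direction.OUT).makePropertyKey();
    g.makeType().name("clicked").makeEdgeLabel();
    g.commit();
    g.shutdown();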
HTH,
Matthias


--
Matthias Broecheler
http://www.matthiasb.com

Parimi Rohit

Sep 23, 2013, 3:06:33 PM
to aureliu...@googlegroups.com
Thanks, Matthias. I created the schema while loading the data using BatchGraph, and I don't get the exception in my MapReduce program anymore.
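Roughly, the wrapping looks like this (a sketch against the Blueprints BatchGraph API; the ID type, buffer size, and property values are illustrative):

    // Buffer mutations and commit them in chunks of 1000.
    BatchGraph<TitanGraph> bgraph =
            new BatchGraph<TitanGraph>(graph, VertexIDType.STRING, 1000);
    Vertex v = bgraph.addVertex("user-42");   // keyed by an external ID
    v.setProperty("MyID", "user-42");
    bgraph.shutdown();                        // flushes the remaining buffer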

Also, I read in the bulk loading section of the Titan documentation that Faunus can be used to load the data. Can anyone please point me to the relevant class in Faunus, or to documentation that explains bulk loading through Faunus?

Thanks,
Rohit

David

Sep 24, 2013, 8:55:56 AM
to aureliu...@googlegroups.com
I'm sure there are other (perhaps better) links, but here is one:


See Marko's opening post.

Parimi Rohit

Sep 27, 2013, 7:27:13 PM
to aureliu...@googlegroups.com
Thanks for the link, David.

Rohit

Parimi Rohit

Nov 4, 2013, 7:09:00 PM
to aureliu...@googlegroups.com
Hi Matthias,

Things worked fine for a month, but now I am getting this exception again.


To summarize, I am trying to bulk-load data into Titan using MapReduce, and, as suggested in the reply above, I created my schema in a single thread. It looks like the following:

                        g.makeType().name("MyID").dataType(String.class).indexed(Vertex.class).unique(Direction.OUT).makePropertyKey();
g.makeType().name("type").dataType(String.class).unique(Direction.OUT).makePropertyKey();
/*Set the edge labels and properties*/
g.makeType().name("labelDist_inj").dataType(FullDouble.class).unique(Direction.OUT).makePropertyKey();
g.makeType().name("labelDist_est").dataType(FullDouble.class).unique(Direction.OUT).makePropertyKey();
g.makeType().name("click_injLabel").makeEdgeLabel();
g.makeType().name("click_estLabel").makeEdgeLabel();

With the above schema, the loading succeeded. However, I recently found that the following query retrieves more than one vertex, even though the property used in the query is unique in my dataset:


                         Iterator<Vertex> it = g.getVertices("MyID", MyID).iterator();

Based on the suggestions by Daniel in this thread (https://groups.google.com/forum/#!topic/aureliusgraphs/RE1JLvL4IwU), I changed the schema to the following:

                        g.makeType().name("MyID").dataType(String.class).indexed(Vertex.class).unique(Direction.BOTH).makePropertyKey();

Now when I try to bulk-load data into the Titan graph, I get the "Could not commit transaction due to exception during persistence" exception. I am using Titan 0.3.2. (I already posted this question in the thread https://groups.google.com/forum/#!topic/aureliusgraphs/RE1JLvL4IwU, but I am re-posting it here since it is relevant to the previous question.)

2013-11-04 15:21:17,358 ERROR com.thinkaurelius.titan.diskstorage.locking.consistentkey.ConsistentKeyLockTransaction: Lock expired: LockClaim [backer=com.thinkaurelius.titan.diskstorage.locking.consistentkey.ConsistentKeyLockStore@35aec48e, key=0x1-51-55-48-50-48-95-117-115-101-242, col=0x64-0-0-0-0-0-0-138, expectedValue=null] (txn=com.thinkaurelius.titan.diskstorage.locking.consistentkey.ConsistentKeyLockTransaction@5ec7f78a)
2013-11-04 15:21:40,219 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
2013-11-04 15:21:40,367 WARN org.apache.hadoop.mapred.Child: Error running child
com.thinkaurelius.titan.core.TitanException: Could not commit transaction due to exception during persistence
at com.thinkaurelius.titan.graphdb.transaction.StandardTitanTx.commit(StandardTitanTx.java:848)
at com.thinkaurelius.titan.graphdb.blueprints.TitanBlueprintsGraph.commit(TitanBlueprintsGraph.java:64)
at graphCreator.CreateGraph_Parallel$VertexLoader.close(CreateGraph_Parallel.java:297)
at org.apache.hadoop.mapred.lib.DelegatingMapper.close(DelegatingMapper.java:61)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: com.thinkaurelius.titan.core.TitanException: Permanent exception during backend operation
at com.thinkaurelius.titan.diskstorage.util.BackendOperation.execute(BackendOperation.java:64)
at com.thinkaurelius.titan.graphdb.database.StandardTitanGraph.save(StandardTitanGraph.java:277)
at com.thinkaurelius.titan.graphdb.transaction.StandardTitanTx.commit(StandardTitanTx.java:839)
... 11 more
Caused by: com.thinkaurelius.titan.diskstorage.locking.PermanentLockingException: Lock could not be acquired because it is held by a remote transaction [LockClaim [backer=com.thinkaurelius.titan.diskstorage.locking.consistentkey.ConsistentKeyLockStore@35aec48e, key=0x1-49-48-48-48-48-49-95-117-115-101-242, col=0x64-0-0-0-0-0-0-138, expectedValue=null]]
at com.thinkaurelius.titan.diskstorage.locking.consistentkey.ConsistentKeyLockTransaction.verifyAllLockClaims(ConsistentKeyLockTransaction.java:396)
at com.thinkaurelius.titan.diskstorage.locking.consistentkey.ConsistentKeyLockStore.mutate(ConsistentKeyLockStore.java:118)
at com.thinkaurelius.titan.diskstorage.BackendTransaction.mutateVertexIndex(BackendTransaction.java:111)
at com.thinkaurelius.titan.graphdb.database.IndexSerializer.addProperty(IndexSerializer.java:83)
at com.thinkaurelius.titan.graphdb.database.StandardTitanGraph.persist(StandardTitanGraph.java:306)
at com.thinkaurelius.titan.graphdb.database.StandardTitanGraph.access$000(StandardTitanGraph.java:45)
at com.thinkaurelius.titan.graphdb.database.StandardTitanGraph$2.call(StandardTitanGraph.java:270)
at com.thinkaurelius.titan.graphdb.database.StandardTitanGraph$2.call(StandardTitanGraph.java:203)
at com.thinkaurelius.titan.diskstorage.util.BackendOperation.execute(BackendOperation.java:61)
... 13 more


Thanks,
Rohit

Matthias Broecheler

Nov 4, 2013, 11:57:55 PM
to aureliu...@googlegroups.com
Hi Rohit,

This is probably because you are loading in parallel and thereby creating uniqueness conflicts. The reason you got duplicates in the first place is that there are duplicates in the dataset. By adding uniqueness in the IN direction (your Direction.BOTH change), you are saying that the attribute value should be unique across your entire graph, so Titan ensures that there is at most one vertex for any given MyID. However, you are still loading multiple identical MyID values, and when this happens in parallel you get a locking exception: the later thread fails because the earlier one acquired the lock.
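If you cannot fully de-duplicate the input, one common workaround is to look the vertex up before creating it, along these lines (a sketch; the helper is illustrative, and a race between two tasks can still surface as a lock exception that the caller must retry):

    // Get-or-create: reuse an existing vertex for this MyID if one exists.
    Vertex getOrCreate(TitanGraph g, String myId) {
        Iterator<Vertex> it = g.getVertices("MyID", myId).iterator();
        if (it.hasNext()) {
            return it.next();
        }
        Vertex v = g.addVertex(null);
        v.setProperty("MyID", myId);  // may still conflict if two tasks race
        return v;
    }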

HTH,
Matthias



Parimi Rohit

Nov 5, 2013, 1:45:54 PM
to aureliu...@googlegroups.com
Hi Matthias,

Thanks for your reply. Yes, I was trying to load the vertices and edges in parallel using MapReduce. I went back and checked my dataset and confirmed that there are no duplicates. However, the problem was solved when I issued a commit after processing about every 10,000 vertices; there was no locking exception.
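In case it is useful to others, the change amounts to something like this inside the mapper (a sketch; the counter and batch size are illustrative):

    private static final int BATCH_SIZE = 10000;
    private long processed = 0;

    // Called once per processed record; commits in chunks so that each
    // transaction stays small instead of covering the whole map task.
    private void maybeCommit(TitanGraph graph) {
        if (++processed % BATCH_SIZE == 0) {
            graph.commit();
        }
    }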

This got me thinking that the problem may lie in the MapReduce framework itself. I know that if a task is slow, the JobTracker assigns the same input to another TaskTracker (speculative execution). If that happens, two TaskTrackers are working on the same data, and maybe that is what causes the locking exception. Just a thought.

Thanks,
Rohit

Matthias Broecheler

Nov 6, 2013, 10:36:32 PM
to aureliu...@googlegroups.com
Yes, I believe that would be a reasonable explanation. Also, committing reasonably sized chunks of data is recommended best practice, since it keeps the likelihood of transactional failure due to write exceptions low.