Parallel write


Shridhar B

Mar 18, 2015, 10:18:11 AM
to aureliu...@googlegroups.com
Hi,

How can I achieve parallel writes in Titan? Is this possible?
I know Cassandra can write data in parallel, which implies higher write throughput.

I tried to do this by having multiple threads running in parallel to insert data, and I got a lock exception:

StandardTitanGraph [ERROR] Could not commit transaction [80] due to exception
com.thinkaurelius.titan.diskstorage.locking.PermanentLockingException: Local lock contention


If we cannot do this, then how do we work in a distributed environment with Titan?

Thanks,
Shridhar B

Stephen Mallette

Mar 18, 2015, 11:52:44 AM
to aureliu...@googlegroups.com
You need to have some method to retry your transaction if you get a locking exception. If retrying doesn't alleviate the problem, then you might be keeping your transactions open for too long; consider committing earlier to free the locks. If that doesn't help, then you might have a problem with your schema, and you may need to think about how to eliminate locking entirely.
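A minimal sketch of the retry idea above, in plain Java so it runs without Titan on the classpath. `RetryingWriter`, `withRetry`, and `LockContentionException` are hypothetical names; the exception is only a stand-in for whatever locking failure your commit actually throws, and the backoff values are arbitrary.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.Supplier;

public class RetryingWriter {

    /** Hypothetical stand-in for the locking failure raised on commit. */
    public static class LockContentionException extends RuntimeException {
        public LockContentionException(String message) {
            super(message);
        }
    }

    /**
     * Runs one unit of work and retries it when it fails with lock contention.
     * The work should redo the whole transaction (reads, writes, commit),
     * not only the commit step.
     */
    public static <T> T withRetry(Supplier<T> work, int maxAttempts, long baseDelayMillis) {
        if (maxAttempts < 1 || baseDelayMillis < 0) {
            throw new IllegalArgumentException("maxAttempts >= 1 and baseDelayMillis >= 0 required");
        }
        for (int attempt = 1; ; attempt++) {
            try {
                return work.get();
            } catch (LockContentionException e) {
                if (attempt >= maxAttempts) {
                    throw e; // give up and surface the last failure
                }
                // Linear backoff plus jitter so threads do not retry in lockstep.
                long jitter = ThreadLocalRandom.current().nextLong(baseDelayMillis + 1);
                try {
                    Thread.sleep(baseDelayMillis * attempt + jitter);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting to retry", ie);
                }
            }
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        String result = withRetry(() -> {
            // Simulated commit that loses the lock twice, then succeeds.
            if (++calls[0] < 3) {
                throw new LockContentionException("Local lock contention");
            }
            return "committed";
        }, 5, 10);
        System.out.println(result + " after " + calls[0] + " attempts");
        // prints "committed after 3 attempts"
    }
}
```

In real code the lambda would wrap the graph writes and the commit, and the catch clause would name the exception type you actually observe.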


Shridhar B

Mar 19, 2015, 2:31:50 AM
to aureliu...@googlegroups.com
My main concern here is: how do I write data to TitanDB in parallel?

Stephen Mallette

Mar 19, 2015, 6:36:38 AM
to aureliu...@googlegroups.com
In your original email you said that you "tried to do this by having multiple threads running in parallel to insert data, and I got a lock exception". Multiple threads writing data seems like a perfectly fine way to do what you want. You just need to sort out the locking exceptions when you do that, and I gave you the ways you can deal with that. I think you have all the information you need.


Shridhar B

Mar 31, 2015, 7:42:00 AM
to aureliu...@googlegroups.com

If this can be solved by setting a lock-wait time, then only one thread writes at a time rather than several, and the rest wait for another thread to release the lock. In that case, what performance do we gain?
I need multiple threads to write at the same time, that is, in parallel, so that I can improve performance, the same way Cassandra does.

- Shridhar 

Stephen Mallette

Mar 31, 2015, 7:51:34 AM
to aureliu...@googlegroups.com
I might not be following your line of questioning. If you can control the consistency of your unique properties without locking, or you don't need consistency there, then by all means turn it off. Locking is expensive and creates bottlenecks for those parallel writes. If unique consistency is important to you and you can't ensure it through some means as part of your data load, then you will need to use a locking system somewhere. If you use the one with Titan, then follow the advice I supplied. If I'm not answering your question, then you probably need to completely rephrase it, or someone else will need to try to answer it.

Shridhar B

Apr 1, 2015, 2:47:01 AM
to aureliu...@googlegroups.com

OK, let me rephrase this in a straightforward way: I want to perform concurrent writes to Titan DB. Is there any documentation or sample code that I can use?

Stephen Mallette

Apr 1, 2015, 7:51:34 AM
to aureliu...@googlegroups.com
Your first post says:

"I tried to do this by having multiple threads running in parallel to insert data"

I take that to mean you already have the code to do this.  You have two choices:

1. Remove locks and manage uniqueness consistency yourself: http://s3.thinkaurelius.com/docs/titan/0.5.4/eventual-consistency.html OR
2. Keep locking enabled and add transaction retry to your code (and consider the other suggestions I made in this thread)

However, if you'd still like an example:

gremlin> g = TitanFactory.open('conf/titan-cassandra.properties')
==>titangraph[cassandrathrift:[127.0.0.1]]
gremlin> mgmt = g.getManagementSystem()
==>com.thinkaurelius.titan.graphdb.database.management.ManagementSystem@59429fac
gremlin> mgmt.makePropertyKey("val").dataType(Integer.class).make()
==>val
gremlin> mgmt.commit()
==>null
gremlin> import java.util.concurrent.*
==>import org.apache.hadoop.hdfs.*
==>import org.apache.hadoop.conf.*
==>import org.apache.hadoop.fs.*
...
==>import com.tinkerpop.pipes.util.iterators.*
==>import com.tinkerpop.pipes.util.structures.*
==>import org.apache.commons.configuration.*
==>import java.util.concurrent.*
gremlin> service = Executors.newFixedThreadPool(4)
==>java.util.concurrent.ThreadPoolExecutor@56da52a7[Running, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 0]
gremlin> (0..<10000).each{i-> service.submit({g.addVertex([val:i]);g.commit()})}; service.shutdown();service.awaitTermination(10, TimeUnit.MINUTES)
==>true
gremlin> g.V.count()
==>10000

Note there are no locking exceptions because my schema does not have locking enabled on "val". I know that my entries to val are unique because I'm controlling them via a counter; I thus manage uniqueness myself.
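The same pattern can be sketched in plain Java, assuming nothing about Titan itself: the concurrent set below is only a stand-in for the graph, and `CounterLoad` and `load` are hypothetical names. The sketch swaps the console loop index for a shared `AtomicInteger`, which is one way to hand out values that are unique by construction when tasks are created on several threads.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class CounterLoad {

    /**
     * Submits `total` insert tasks to a pool of `threads` workers. Each task
     * draws its value from a shared atomic counter, so no two tasks can write
     * the same value and the store needs no uniqueness lock.
     * Returns the set of values written (a stand-in for the graph).
     */
    public static Set<Integer> load(int total, int threads) {
        AtomicInteger counter = new AtomicInteger();
        Set<Integer> written = ConcurrentHashMap.newKeySet();
        ExecutorService service = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < total; i++) {
            service.submit(() -> {
                int val = counter.getAndIncrement(); // unique by construction
                // Placeholder for: add a vertex with property val, then commit.
                written.add(val);
            });
        }
        service.shutdown();
        try {
            if (!service.awaitTermination(10, TimeUnit.MINUTES)) {
                throw new IllegalStateException("load did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while loading", e);
        }
        return written;
    }

    public static void main(String[] args) {
        Set<Integer> written = load(10_000, 4);
        System.out.println(written.size()); // prints 10000
    }
}
```

Because uniqueness is guaranteed before the write happens, the workers never need to coordinate with each other through the store.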


