Concurrent reads on single transaction?

Alan47

Sep 2, 2015, 3:59:00 AM
to OrientDB
Hello everyone,


Detailed explanation of the situation
We are currently migrating a project that previously used Titan to OrientDB, and we are facing some issues regarding concurrency. In Titan, several threads could perform read operations on a single transaction without any problems. The OrientDB manual states that each thread should have its own OrientGraph instance. This is hard for us to manage, as we are using OrientDB as the backend for a web application. Such an environment runs a lot of threads, some of them long-running (e.g. threads managed by Tomcat), others short-lived (e.g. spawned by collection.parallelStream()). Any solution involving a ThreadLocal is problematic because there is no clean way to clean it up again and close the OrientGraph instance. Also, opening transactions in OrientDB seems to have a considerable performance overhead, which means that we have to keep a transaction open and perform several reads on it to minimize the transaction creation overhead. But this is only possible if we can be sure that it is safe to have multiple concurrent reads on a single OrientGraph instance. Is that the case?
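The ThreadLocal cleanup problem described above can be reproduced without any database. The following is a minimal sketch, not OrientDB code: FakeGraph is a hypothetical stand-in for a per-thread graph handle, and the counters exist only to make the leak visible. The worker threads used by parallelStream() belong to the common ForkJoinPool, so the calling code has no point at which it could close the handles those threads created.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical stand-in for a per-thread graph handle; NOT an OrientDB class.
final class FakeGraph implements AutoCloseable {
    static final AtomicInteger OPENED = new AtomicInteger();
    static final AtomicInteger CLOSED = new AtomicInteger();

    FakeGraph() { OPENED.incrementAndGet(); }

    // Placeholder for a read operation.
    int read(int id) { return id * 2; }

    @Override public void close() { CLOSED.incrementAndGet(); }
}

final class ThreadLocalLeakDemo {
    // One handle per thread, created lazily on first use.
    static final ThreadLocal<FakeGraph> GRAPH = ThreadLocal.withInitial(FakeGraph::new);

    static List<Integer> readAll(List<Integer> ids) {
        // Pool threads pick up the work; each one creates its own handle,
        // and nothing in this method is able to close those handles afterwards.
        return ids.parallelStream()
                  .map(id -> GRAPH.get().read(id))
                  .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> ids = IntStream.range(0, 1000).boxed().collect(Collectors.toList());
        readAll(ids);
        // The number of opened handles depends on how many threads took part;
        // the number of closed handles stays at zero.
        System.out.println("opened=" + FakeGraph.OPENED.get()
                + " closed=" + FakeGraph.CLOSED.get());
    }
}
```

How many handles are opened depends on the machine and the pool size, which is exactly what makes this pattern hard to manage in a web application.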

TL;DR (Short version)
Is it safe to perform concurrent reads on an OrientGraph instance, provided that no write operations occur on that instance?


Thanks,


Alan

Alan47

Sep 13, 2015, 5:08:25 AM
to OrientDB
I am still looking for an answer to this question...

Jan Plaček

Sep 13, 2015, 11:23:49 AM
to OrientDB
I am a newbie so I unfortunately can't answer your question, but I would like to ask why you have to perform reads in a transaction. Do you use the REPEATABLE READS isolation level?

On Wednesday, September 2, 2015 at 9:59:00 UTC+2, Alan47 wrote:

Alan47

Sep 15, 2015, 11:39:15 AM
to OrientDB
Hi,

thanks for the reply, and sorry for the late answer, these are busy times for me...

@Topic: To the best of my current knowledge about databases, if you don't use transactions on both read and write operations, then write operations cannot perform the required locking and synchronization. I assume that OrientDB is no different in that respect. I also assume that "repeatable reads" is the standard isolation level in OrientDB, as in all other ACID databases that I am aware of.

But the question is actually a much simpler one. The OrientDB graph documentation states (in several places) that each thread must use its own OrientGraph instance, obtained by calling OrientGraphFactory#getTx() or #getNoTx(), respectively. I only want to know if it is safe to share an OrientGraph among multiple threads for read-only purposes, i.e. will it violate some assumptions in the caching implementation, for example? This is not documented anywhere, and I'm afraid only an OrientDB developer (or someone with really deep insight into the code) can answer this.

Jan Plaček

Sep 15, 2015, 12:32:52 PM
to OrientDB
Actually, by default OrientDB uses the READ COMMITTED isolation level, and it's the only available option for remote mode.
REPEATABLE READS can be used with memory or plocal modes, at the price of higher memory consumption.

You stated that there won't be any writes on that graph instance, so transactions are not necessary unless you use REPEATABLE READS (the reads always run outside of the TX scope). You can therefore use a non-transactional instance without any performance overhead caused by TX creation.

About sharing the DB among threads: as you said, that's a question for a developer. However, even if it's possible now, my concern would be whether it will still be possible in the future ... I think the developers are counting on the fact that one instance is used in one thread.
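One way to stay within the "one instance per thread" rule without a ThreadLocal is to scope the handle to the unit of work rather than to the thread. The sketch below only illustrates that shape: ReadHandle and withHandle are hypothetical names, not OrientDB classes. In OrientDB the factory role would presumably be played by OrientGraphFactory handing out non-transactional instances; whether acquiring one per task is cheap enough in practice is an assumption that needs measuring.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.stream.Collectors;

// Hypothetical stand-in for a non-transactional graph handle; NOT an OrientDB class.
final class ReadHandle implements AutoCloseable {
    static final AtomicInteger OPEN = new AtomicInteger(); // handles currently open

    ReadHandle() { OPEN.incrementAndGet(); }

    // Placeholder for a read operation.
    String readName(int id) { return "v" + id; }

    @Override public void close() { OPEN.decrementAndGet(); }
}

final class PerTaskHandleDemo {
    // Runs one unit of work with its own handle and always releases it,
    // no matter which thread happens to execute the work.
    static <T> T withHandle(Supplier<ReadHandle> factory, Function<ReadHandle, T> work) {
        try (ReadHandle handle = factory.get()) {
            return work.apply(handle);
        }
    }

    static List<String> readNames(List<Integer> ids) {
        return ids.parallelStream()
                  .map(id -> withHandle(ReadHandle::new, h -> h.readName(id)))
                  .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(readNames(Arrays.asList(1, 2, 3)));
        System.out.println("still open: " + ReadHandle.OPEN.get());
    }
}
```

No handle is ever shared between threads and none outlives its task, so the question of concurrent access to a single instance does not arise. The trade-off is one acquire/release per task, which only pays off if acquisition is cheap.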

On Tuesday, September 15, 2015 at 17:39:15 UTC+2, Alan47 wrote:

Alan47

Sep 15, 2015, 3:44:14 PM
to OrientDB
Hi,
thanks for the response, much appreciated! The fact that OrientDB executes reads exclusively outside of transactions is new to me, and rather surprising. Is this documented anywhere?

We are running OrientDB in plocal mode exclusively, and according to a doc page I read today the default isolation level is serializable, which is more than sufficient.

If reads do indeed not require a transaction, i.e. an OrientGraphNoTx is enough, then it would be interesting to know how that class is intended to behave under concurrent access, as the documentation refers to "one transaction per thread". That would imply that if there is no transaction, we are free to execute parallel reads on the same graph instance, but that's just my interpretation.

You are absolutely correct about the risk of future problems when relying on undocumented behaviour. This is exactly why I am asking this question. In fact, our application is running fine right now with concurrent reads on the same graph.

Jan Plaček

Sep 15, 2015, 5:16:42 PM
to orient-...@googlegroups.com

Check the isolation part.

I might not have been precise about "reads". The COMMANDS are executed outside of the TX scope.
The point is that when a record is "fetched" from the DB, you get the last committed version of that record every time this "fetch" is requested, regardless of whether there is a transaction.
However, as far as my observations go, accessing the same record via the Graph API multiple times (Graph.getVertex(id)) in a single transaction does not lead to refetching/querying the DB for that record again.
(Actually I am not sure about that, but I am sure that changes made to this record within a single TX are preserved, which leads me to the conclusion that you're working with a locally cached copy of that record.)
So by using this (imho) very limited Graph API exclusively you basically get REPEATABLE READS behavior, but it isn't the real thing, because it does not hold for other retrieval methods (queries/commands).
This is at least my understanding of the documentation and my observations.
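To make the distinction above concrete, here is a toy model of the observed behavior. It is not OrientDB's implementation: ToyStore, ToyTx, getById and query are made-up names, and the model simply assumes that lookups by id go through a transaction-local cache while queries always read the latest committed value.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Toy model only: a shared store holding the last committed value per record id.
final class ToyStore {
    final Map<Integer, String> committed = new ConcurrentHashMap<>();

    void commit(int id, String value) { committed.put(id, value); }
}

// Toy model of a transaction that keeps local copies of records it has loaded.
final class ToyTx {
    private final ToyStore store;
    private final Map<Integer, String> localCache = new HashMap<>();

    ToyTx(ToyStore store) { this.store = store; }

    // Lookup by id: the first call loads the record, later calls reuse the local copy.
    String getById(int id) {
        return localCache.computeIfAbsent(id, store.committed::get);
    }

    // Query/command path: bypasses the local cache and sees the last committed value.
    String query(int id) {
        return store.committed.get(id);
    }
}

final class CacheVsQueryDemo {
    public static void main(String[] args) {
        ToyStore store = new ToyStore();
        store.commit(1, "old");

        ToyTx tx = new ToyTx(store);
        System.out.println(tx.getById(1)); // prints "old"

        store.commit(1, "new");            // another writer commits meanwhile

        System.out.println(tx.getById(1)); // prints "old" (local copy)
        System.out.println(tx.query(1));   // prints "new" (last committed value)
    }
}
```

Within the same toy transaction the two retrieval paths disagree after the concurrent commit, which is why the behavior only looks like REPEATABLE READS as long as every access goes through the cached path.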



On Tuesday, September 15, 2015 at 21:44:14 UTC+2, Alan47 wrote: