Recommendations for implementing locking with Tarantool

Ciprian D

Jul 14, 2018, 10:49:28 AM
to Tarantool discussion group (English)
Hello everyone!

I am looking to implement locking support (as in binary semaphores, not I/O locking) on top of Tarantool in the context of multi-master deployments.

In the case of single-node or single-master deployments, this should be straightforward, as atomicity is guaranteed by the single-threaded implementation (other fibers will wait their turn while a lock is being acquired/released). There is also a nice off-the-shelf solution available: https://github.com/dreadatour/tarantool-locksmith
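
For illustration, here is roughly what a single-node lock could look like. This is a minimal sketch only; the space name ('locks'), the TTL field, and the helper names are my own placeholders rather than anything taken from tarantool-locksmith:

local fiber = require('fiber')

-- Lock registry: {key, expires_at}. Safe on one node because memtx
-- operations do not yield, so nothing else runs between :get() and :replace().
box.schema.space.create('locks', {if_not_exists = true})
box.space.locks:create_index('primary', {parts = {1, 'string'}, if_not_exists = true})

local function acquire(key, ttl)
    local now = fiber.time()
    local row = box.space.locks:get(key)
    if row ~= nil and row[2] > now then
        return false  -- lock is held and has not expired yet
    end
    box.space.locks:replace({key, now + ttl})
    return true
end

local function release(key)
    box.space.locks:delete(key)
end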

However, in a multi-master architecture the concept does not carry over as easily: in the event of a race condition, the lock-specific operations are not necessarily commutative with respect to the follow-up operations performed while an object lock is being acquired/released (https://tarantool.io/en/doc/2.0/book/replication/repl_duplicates/#commutative-changes). A naive lock implementation could therefore be subject to (a) false lock assessments under race conditions, as well as (b) driving the replication process out of sync when Tarantool cannot reconcile data on competing master nodes.
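
To make the failure modes concrete: if a naive acquire did a plain insert, two masters racing for the same key would each succeed locally, and each replicated insert would then hit a duplicate-key error on the peer and stop the applier; with a replace instead, replication keeps flowing but both callers believe they hold the lock. A hypothetical timeline (the key name is invented):

-- on master A: box.space.locks:insert({'job-42', fiber.time() + 30})  -- succeeds locally
-- on master B: box.space.locks:insert({'job-42', fiber.time() + 30})  -- also succeeds locally
-- A->B and B->A replication both carry an insert for 'job-42';
-- each applier fails with ER_TUPLE_FOUND and replication stops (case b).
-- With replace instead of insert there is no error, but A and B both
-- "hold" the lock at the same time (case a): the operations do not commute.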

Any advice on how to proper architect/model locks on multi-master Tarantool deployments is appreciated, thank you!

Konstantin Nazarov

Jul 14, 2018, 1:32:24 PM
to tara...@googlegroups.com, Ciprian D
Truly fault-tolerant distributed locks require Raft, Paxos, or another consistent distributed log replication algorithm.
Trying to do that with asynchronous master-master replication is likely not feasible without tuning your expectations.

Fortunately, if you are fine with a certain level of unavailability, you may design the system around other high-level tools like Consul. Consul itself is not performant enough to serve as a provider of fine-grained real-time locks, but it can reliably coordinate multiple instances of Tarantool. Using it you may elect a leader among Tarantool nodes in multiple regions/servers with a high level of certainty, and mark the rest as read-only. Clients will have to talk to Consul first to find the current leader of the pack, and then ask how long that node has been the leader. If it was elected less than, say, 10 seconds ago, clients must wait for old locks to expire before taking any new ones.
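
Roughly, the Consul side of this could look like the sketch below, driven from Tarantool's built-in HTTP client. The address, KV key, TTL and lock-delay values are placeholders, and error handling is omitted:

local http = require('http.client').new()
local json = require('json')

local CONSUL = 'http://127.0.0.1:8500'

-- Create a session with a TTL; any lock acquired with it is released
-- automatically when the session expires or is invalidated.
local function create_session(ttl_sec)
    local resp = http:put(CONSUL .. '/v1/session/create',
                          json.encode({TTL = ttl_sec .. 's', LockDelay = '10s'}))
    return json.decode(resp.body).ID
end

-- Try to become the leader by acquiring a well-known key.
local function try_acquire_leadership(session_id, node_name)
    local resp = http:put(CONSUL .. '/v1/kv/tarantool/leader?acquire=' .. session_id,
                          node_name)
    return resp.body:find('true') ~= nil
end

-- A node that fails to acquire the key should stay read-only, e.g.:
-- box.cfg{read_only = not try_acquire_leadership(create_session(15), box.info.uuid)}

The LockDelay above mirrors the "wait before taking any locks" idea: after the previous session is invalidated, Consul will refuse to hand the key to another session for that interval.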

Ciprian D

Jul 16, 2018, 6:27:36 PM
to Tarantool discussion group (English)
Hi Konstantin,

Thanks for the response; this is really insightful.

Accepting the obvious ("tuning your expectations") is key to a proper architecture. I am still in the experimentation stage and have a few options on the table. I'd like to employ some of the indirect sharding produced by load balancing, in such a fashion that nodes are responsible for managing locks pertaining to resources "assigned" directly to them; competing nodes would have to request lock acquisition/release via an API call executed against the node that governs the resource in question. The locks themselves would be stored in temporary spaces so they would not fall into the replication scope. In the event of a node failure (regardless of the culprit's nature), its formerly assigned resources can be rebalanced across the cluster.
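
A rough sketch of what the owning-node side could look like; the space name, field layout and function names are placeholders, and the routing/rebalancing logic is left out:

local fiber = require('fiber')

-- Temporary memtx spaces are not written to the WAL and are not replicated,
-- so these locks never leave the node that owns the resource.
box.schema.space.create('local_locks', {temporary = true, if_not_exists = true})
box.space.local_locks:create_index('primary', {parts = {1, 'string'}, if_not_exists = true})

-- Global so that competing nodes can invoke it over the wire.
function acquire_lock(key, ttl)
    local row = box.space.local_locks:get(key)
    if row ~= nil and row[2] > fiber.time() then
        return false
    end
    box.space.local_locks:replace({key, fiber.time() + ttl})
    return true
end

function release_lock(key)
    box.space.local_locks:delete(key)
    return true
end

-- From a competing node, the call is routed to whichever node owns the resource:
-- local owner = require('net.box').connect('owner-host:3301')
-- local ok = owner:call('acquire_lock', {'resource-123', 30})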

Thanks again!