State of Distributed Lock


Drew

Jan 6, 2012, 7:07:17 PM
to Hazelcast
Hi Everyone,

We are evaluating Hazelcast and ZooKeeper for managing distributed
locks. What's the state of distributed locks? Are they production
ready? Is anyone using them? Any gotchas or bugs to look out for?

Thanks,

Drew

Talip Ozturk

Jan 10, 2012, 3:18:25 AM
to haze...@googlegroups.com
Yes, they are production ready. Transaction support also relies on
locks, and transactions are used in many places.

You don't have to use the lock() API for locking. Here is another way:

ConcurrentMap lockMap = Hazelcast.getMap("locks");
if (lockMap.putIfAbsent(lockKey, thisMember) == null) {
    // you got the lock
    try {
        // ... critical section ...
    } finally {
        lockMap.remove(lockKey);
    }
}

What is good about this?
1. You can persist the locks by using a MapStore for this map.
2. You can set a TTL for the locks, so locks are auto-released after the TTL.
3. You can call lockMap.get(lockKey) to get the lock owner (if needed).
What is bad about this?
1. You won't have tryLock(timeout) support.
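The putIfAbsent pattern above can be sketched with a plain ConcurrentHashMap standing in for the cluster-wide map, so it runs without a Hazelcast cluster (with Hazelcast you would get the map via Hazelcast.getMap("locks") instead; the class, key, and member names here are made up for illustration):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class PutIfAbsentLock {

    // Only the first putIfAbsent for a key returns null, so only one
    // caller per key "wins" the lock.
    public static boolean tryAcquire(ConcurrentMap<String, String> lockMap,
                                     String key, String member) {
        return lockMap.putIfAbsent(key, member) == null;
    }

    public static void release(ConcurrentMap<String, String> lockMap, String key) {
        lockMap.remove(key);
    }

    public static void main(String[] args) {
        ConcurrentMap<String, String> lockMap = new ConcurrentHashMap<>();
        if (tryAcquire(lockMap, "orders-table", "node-1")) {
            try {
                // critical section: only one member gets here per key
            } finally {
                release(lockMap, "orders-table");
            }
        }
    }
}
```

With Hazelcast the same code works against the distributed map, and the TTL point above means an entry (and thus the lock) can expire automatically if the owner never releases it.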

-talip

> --
> You received this message because you are subscribed to the Google Groups "Hazelcast" group.
> To post to this group, send email to haze...@googlegroups.com.
> To unsubscribe from this group, send email to hazelcast+...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/hazelcast?hl=en.
>

Drew

Jan 11, 2012, 3:28:53 PM
to Hazelcast
Thanks, Talip, for the response. What we are trying to do is to
control writes to Cassandra, so that only one thread (in a multi-threaded,
multi-server environment) can write to a specific table. It is crucial
to make sure that even if there is a network issue between Hazelcast
instances, there will only be one thread writing. In addition, if that
thread/process dies, the lock should be released so other threads/
processes can write to the table.

Considering that use case, would you recommend using the Lock API or
the method that you described?

Talip Ozturk

Jan 12, 2012, 5:28:51 AM
to haze...@googlegroups.com
On Wed, Jan 11, 2012 at 10:28 PM, Drew <dr...@venarc.com> wrote:
> Thanks, Talip, for the response. What we are trying to do is to
> control writes to Cassandra, so that only one thread (in a multi-threaded,
> multi-server environment) can write to a specific table. It is crucial
> to make sure that even if there is a network issue between Hazelcast
> instances, there will only be one thread writing.

Say you have two nodes of Hazelcast (or any other clustered lock
manager), and say there is a network problem between the two. Each node
will keep maintaining the locks independently, which will allow two
different processes to acquire the 'write-lock'. In a network
partitioning (split-brain) scenario, lock consistency cannot be
guaranteed when the lock manager is clustered.

> In addition, if that thread/process dies, the lock should be
> released so other threads/processes can write to the table.

Hazelcast will detect the death of the lock owner process and release
the locks owned by that node, but it cannot detect the user's dead
threads yet.

> Considering that use case, would you recommend using the Lock API or
> the method that you described?

No, I wouldn't, unless you relax the network partitioning requirement
or handle it somehow.

-talip

Zack Radick

Jan 12, 2012, 1:40:04 PM
to haze...@googlegroups.com
Drew,
In order to handle similar constraints, I had to write my own partition-aware monitoring (using MembershipListeners) and configure a cluster majority size in my application.  It is a bit clunky, but it does a reasonably good job of preventing members who are not part of the majority cluster from continuing operations in an unsafe state.  You could probably do something similar.

You also might want to vote on the enhancement to add cluster majority configuration to HC:
http://code.google.com/p/hazelcast/issues/detail?id=725 

Cheers,
--Zack

Drew

Jan 12, 2012, 2:19:19 PM
to Hazelcast
@Talip:
I can relax the requirement for now since we don't have that many
servers (in hopes of getting ticket 725 in the future).

With that assumption, is it better to use the Map approach or the Lock
approach? Which one is a more stable code path?


@Zack: Thanks, Zack, is your code open source? As for the ticket, I
voted for it a while back and I've been following it ;)



On Jan 12, 10:40 am, Zack Radick <zrad...@conducivetech.com> wrote:
> Drew,
> In order to handle similar constraints, I had to write my own partition-aware
> monitoring (using MembershipListeners) and configure a cluster majority
> size in my application.  It is a bit clunky, but it does a reasonably good
> job of preventing members who are not part of the majority cluster from
> continuing operations in an unsafe state.  You could probably do something
> similar.
>
> You also might want to vote on the enhancement to add cluster majority
> configuration to HC: http://code.google.com/p/hazelcast/issues/detail?id=725
>
> Cheers,
> --Zack

Zack Radick

Jan 12, 2012, 4:44:19 PM
to haze...@googlegroups.com
Drew,
Unfortunately it is not open source, but the basic idea is that I have a manager-style class that encapsulates all of the Hazelcast usage. Within that class I have a configured "cluster majority" and have registered a membership listener on the Hazelcast cluster.  Before I allow work to be processed, I ensure that the local member can still see a majority of the cluster members.  This does not guarantee that two machines can never both think they can do something for a given key during a partition event (typically only the work that is in-flight is at risk), but I have made my work idempotent to prevent problems as a result.
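The majority check described above reduces to comparing the locally visible member count against a configured cluster size. A minimal sketch of that core logic (class and method names are my own; with Hazelcast you would feed it the member count obtained from the cluster inside a membership listener callback):

```java
// Guards work processing behind a strict-majority check, as a defense
// against operating on the minority side of a network partition.
public class MajorityGuard {
    private final int configuredClusterSize;

    public MajorityGuard(int configuredClusterSize) {
        this.configuredClusterSize = configuredClusterSize;
    }

    // Strict majority: more than half of the configured members must be visible.
    // At most one partition can satisfy this, so two sides can never both pass.
    public boolean hasMajority(int visibleMembers) {
        return visibleMembers * 2 > configuredClusterSize;
    }

    // Call before processing work; refuse when we may be in a minority partition.
    public void checkSafeToProceed(int visibleMembers) {
        if (!hasMajority(visibleMembers)) {
            throw new IllegalStateException("not in majority partition, refusing work");
        }
    }
}
```

Note the strict inequality: in a 4-node cluster split 2/2, neither side has a majority, so neither side proceeds, which is exactly the safety property wanted here.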
Cheers,
--Zack

Talip Ozturk

Jan 12, 2012, 5:50:20 PM
to haze...@googlegroups.com
> I can relax the requirement for now since we don't have that many
> servers, (in hopes of getting ticket 725 in future).
>
> With that assumption, is it better to use the Map approach or the Lock
> approach? Which one is a more stable code path?

I would go with the Lock approach, because locks are auto-released when
the lock owner process dies. With the map.putIfAbsent() approach, you
will have to listen to membership events and remove (release) the locks
owned by the dead member yourself.
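That cleanup for the map approach can be sketched as follows, with a plain Map standing in for the distributed lock map (with Hazelcast you would call something like this from a membership listener when a member leaves; all names here are illustrative):

```java
import java.util.Iterator;
import java.util.Map;

public class LockCleanup {

    // Release every lock entry whose value is the member that just died,
    // so surviving members can acquire those locks again.
    public static int releaseLocksOf(Map<String, String> lockMap, String deadMember) {
        int released = 0;
        Iterator<Map.Entry<String, String>> it = lockMap.entrySet().iterator();
        while (it.hasNext()) {
            if (deadMember.equals(it.next().getValue())) {
                it.remove();
                released++;
            }
        }
        return released;
    }
}
```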

So definitely go with locks. But even here you have two options: global
locks and a locks-map.
1. Global Locks
Lock lock = Hazelcast.getLock(keyToLock);
lock.lock();
try {
    // ... critical section ...
} finally {
    lock.unlock();
}

This approach is fine when you have only tens of locks, because each
global lock instance is managed cluster-wide, and you will have to call
lock.destroy() to terminate it (so it can be garbage collected).

2. Locks-Map
IMap lockMap = Hazelcast.getMap("locks");
lockMap.lock(keyToLock);
try {
    // ... critical section ...
} finally {
    lockMap.unlock(keyToLock);
}
You use this map only for locks. Locks created here are very cheap and
are garbage collected automatically when there is no lock owner and
no one waiting on the lock. You can have millions of locks with this
approach; it is very light and there is nothing for you to maintain.
This is my favorite.
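The locks-map idea, plus the tryLock(timeout) support the putIfAbsent approach lacks, can be sketched with a per-key ReentrantLock registry in plain java.util.concurrent (a single-JVM stand-in only; Hazelcast's IMap exposes lock(key)/unlock(key) on the map itself, and the class name here is made up):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class KeyLocks {
    // One lock per key, created lazily; cheap, like the locks-map approach.
    private final ConcurrentMap<String, ReentrantLock> locks = new ConcurrentHashMap<>();

    private ReentrantLock lockFor(String key) {
        return locks.computeIfAbsent(key, k -> new ReentrantLock());
    }

    // Wait up to timeoutMillis for the key's lock; false if not acquired.
    public boolean tryLock(String key, long timeoutMillis) {
        try {
            return lockFor(key).tryLock(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public void unlock(String key) {
        lockFor(key).unlock();
    }
}
```

Usage mirrors the locks-map snippet: tryLock the key, do the work in a try block, unlock in finally. Unlike the cluster-wide version, of course, this only coordinates threads within one process.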

-talip

shane muffat

Jun 20, 2012, 2:49:00 PM
to haze...@googlegroups.com
Semi-old thread, but very important.  How can locking be considered production ready if it cannot ensure lock consistency?  I didn't realize this until reading this thread, and I think I have to look elsewhere to prevent multiple nodes from obtaining the same lock.

Fuad Malikov

Jun 21, 2012, 2:43:39 AM
to haze...@googlegroups.com
What's the relation between lock consistency and being production ready?

Locks are consistent unless you have a split in your network. At that point, Hazelcast selects availability instead of consistency and keeps going. You may want to have a membership listener, and whenever a node leaves the cluster, you can stop your own application to preserve consistency.

-fuad


shane muffat

Jun 21, 2012, 11:45:36 AM
to haze...@googlegroups.com
Yes, you are right... it's just a limitation.  ZooKeeper handles this by enforcing majority rule using a quorum algorithm.  If I care about the order of my tasks coming in, then I must have lock consistency across the nodes.  I think we'll end up using Hazelcast for caching and other maps, and ZooKeeper to ensure lock consistency, and possibly for our queues as well.

It would be great if Hazelcast implemented something like ZooKeeper's approach for this kind of issue.

Thanks for the reply.



On Thursday, June 21, 2012 2:43:39 AM UTC-4, Fuad Malikov wrote:
> What's the relation between lock consistency and being production ready?
>
> Locks are consistent unless you have a split in your network. At that point, Hazelcast selects availability instead of consistency and keeps going. You may want to have a membership listener, and whenever a node leaves the cluster, you can stop your own application to preserve consistency.
>
> -fuad