IMap - when to lock and not to lock?

Thomas Larsson

unread,
Oct 28, 2013, 2:06:24 PM10/28/13
to haze...@googlegroups.com
Hello.

I have recently inherited some code that uses hazelcast and I am trying to clean it up a bit.
I encountered one piece of code which looks a bit like the one below (I have simplified it a bit).

public class Cache
{
   private final IMap<Key, Value> map;

   public Cache(IMap<Key, Value> map) { this.map = map; }

   public void put(Key key, Value value)
   {
      map.lock(key);
      try
      {
         map.put(key, value);
      }
      finally
      {
         map.unlock(key);
      }
   }

   public Value get(Key key)
   {
      return map.get(key);
   }

   public int size()
   {
      return map.size();
   }

   public void clear()
   {
      map.clear();
   }
}

I don't understand the purpose of locking around the put operation. My feeling is that it serves no purpose, but it is just that, a feeling.
Why would they lock only around the put operation and not, for example, around the clear operation, if locking should be used here at all?
I mean, even with locking, client code that performs a put and immediately afterwards a get cannot rely on getting back the value it just put, since another thread could have modified the map in between.

Can anyone give me their thoughts?
If necessary I could probably supply more context to the code.

Best Regards
Thomas

Ali Gürbüz

unread,
Oct 28, 2013, 3:45:00 PM10/28/13
to haze...@googlegroups.com
You are totally right: if you don't use the lock for the other methods too, it is meaningless.
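To make the point concrete: a lock around a single put buys nothing, because the put is already atomic on its own; a lock only pays off around a compound read-modify-write. The sketch below is plain JDK, not Hazelcast (a ConcurrentHashMap plus per-key ReentrantLocks stands in for IMap and its lock/unlock methods; all names are illustrative):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

// Sketch: a per-key lock makes a get-then-put sequence atomic.
// Locking a lone put would buy nothing, since the put itself is atomic.
public class PerKeyLockMap {
    private final Map<String, Integer> map = new ConcurrentHashMap<>();
    private final Map<String, ReentrantLock> locks = new ConcurrentHashMap<>();

    private ReentrantLock lockFor(String key) {
        return locks.computeIfAbsent(key, k -> new ReentrantLock());
    }

    // Compound read-modify-write: this is where the lock earns its keep.
    public void increment(String key) {
        ReentrantLock lock = lockFor(key);
        lock.lock();
        try {
            Integer v = map.get(key);
            map.put(key, v == null ? 1 : v + 1);
        } finally {
            lock.unlock();
        }
    }

    public Integer get(String key) { return map.get(key); }

    public static void main(String[] args) throws InterruptedException {
        PerKeyLockMap m = new PerKeyLockMap();
        Thread[] ts = new Thread[4];
        for (int i = 0; i < ts.length; i++) {
            ts[i] = new Thread(() -> { for (int j = 0; j < 1000; j++) m.increment("hits"); });
            ts[i].start();
        }
        for (Thread t : ts) t.join();
        System.out.println(m.get("hits")); // 4000: no lost updates
    }
}
```

Without the lock, two threads could both read the same old value and one increment would be lost; that is the kind of sequence IMap.lock(key) is meant to protect.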


--
You received this message because you are subscribed to the Google Groups "Hazelcast" group.
To unsubscribe from this group and stop receiving emails from it, send an email to hazelcast+...@googlegroups.com.
To post to this group, send email to haze...@googlegroups.com.
Visit this group at http://groups.google.com/group/hazelcast.
For more options, visit https://groups.google.com/groups/opt_out.

Iso

unread,
Oct 29, 2013, 3:44:14 AM10/29/13
to haze...@googlegroups.com
I think there should be a lock around clear as well.

No lock for the get operation is the normal way, as I understand it.
Since Hazelcast returns a clone of the serialized data, it's thread-safe.
And your class is a "cache", so I assume it is read-mostly and used intensively.
Even with a lock you "might" get an old value back (due to network spikes and other network or timing issues; no library can prevent this from happening).
And if this is a read-mostly map with, for example, 1000 get calls per second, then a lock for every get is very expensive.
As I understand it, every lock operation is a broadcast to every cluster node. If you have 20 nodes, the lock will be sent, serialized, and deserialized every time on every node (20 internal broadcasts * 1000 gets). That means potentially "1000" times more overhead (it depends on your data, of course).

Peter Veentjer

unread,
Oct 29, 2013, 7:19:54 AM10/29/13
to haze...@googlegroups.com
On Tue, Oct 29, 2013 at 9:44 AM, Iso <linmi...@gmx.net> wrote:
I think there should be a lock around clear as well.

What would be the purpose? The clear method is thread-safe.
 

No lock for get operation is the normal way as I understand it.

Correct
 
Since Hazelcast returns a clone of the serialized data, it's thread-safe.

Correct.

Each get returns its own clone of the data, so unless that clone is shared between threads, it is isolated and therefore thread-safe.
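That clone-per-get isolation is easy to picture with a plain-JDK analogue (a sketch, not Hazelcast code; the copy-on-put/copy-on-get map below mimics the effect of serializing and deserializing the value on each access):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch (plain JDK, no Hazelcast): a map that hands out copies,
// mimicking IMap's deserialize-a-fresh-clone-per-get behavior.
public class CopyOnGetMap {
    private final Map<String, int[]> store = new HashMap<>();

    public void put(String key, int[] value) {
        store.put(key, value.clone()); // store a private copy
    }

    public int[] get(String key) {
        int[] v = store.get(key);
        return v == null ? null : v.clone(); // each get returns a fresh copy
    }

    public static void main(String[] args) {
        CopyOnGetMap m = new CopyOnGetMap();
        m.put("a", new int[]{1, 2});
        int[] copy = m.get("a");
        copy[0] = 99; // mutating the returned copy...
        System.out.println(m.get("a")[0]); // ...leaves the stored value untouched: prints 1
    }
}
```

Because no two callers ever touch the same instance, no locking is needed on the read path.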
 
And your class is a "cache" so I assume it would be read mostly and used intensively.
Even with a lock you "might" get an old value back (due to network spikes and other network or timing issues; no library can prevent this from happening).

Even with a normal ConcurrentHashMap, the value that is returned could be stale as soon as it is received. This is not related to networking or split-brain.

Hazelcast provides sequential consistency, unless there are network problems.
 
And if this is a read-mostly map with, for example, 1000 get calls per second, then a lock for every get is very expensive.
As I understand it, every lock operation is a broadcast to every cluster node. If you have 20 nodes, the lock will be sent, serialized, and deserialized every time on every node (20 internal broadcasts * 1000 gets). That means potentially "1000" times more overhead (it depends on your data, of course).

The lock operation doesn't need to be sent to every node. The lock is part of a partition, which is owned by a single member, so lock/unlock operations are always sent to the member owning that partition.
 


Iso

unread,
Oct 29, 2013, 8:21:20 AM10/29/13
to haze...@googlegroups.com
Thanks for information. :)