To be clear, we changed our code to use a Distributed Lock because we were seeing issues with the IMap-based lock. Substituting the Distributed Lock for the IMap-based lock was the only code change we made, and it was enough to fix the issue for us (or at least mask the root cause). What led us to try a Distributed Lock in lieu of an IMap-based lock is issue 223 (not issue 233). We know that fix directly targets a problem with MultiMap, but it appeared to touch components that are shared with Map.
I'll attempt to describe below the scenario we think we're seeing. We don't know if the scenario is accurate in terms of the exact sequence of events. Obviously if the lock isn't respected, the events could occur in a different order. The issue is always reproducible in our system.
- Thread 1 acquires lock "foo". Key "foo" is not yet an entry in the IMap, so it puts a value in for "foo", and then releases the lock.
- Threads 2, 3, and 4 each try to acquire lock "foo".
- Thread 2 acquires lock "foo", sees there's a value for "foo", registers an EntryListener, and then releases the lock.
- Thread 3 acquires lock "foo", sees there's a value for "foo", adds a listener to thread 2, and then releases the lock.
- Thread 4 acquires lock "foo", removes "foo" from the map, and then unlocks "foo".
We haven't found evidence that any thread successfully acquires lock "foo" while another thread holds it, though we suspect that is what happens, and we're still investigating. Specifically, we suspect that thread 4 acquires lock "foo" while thread 3 still holds it. If so, thread 4 can remove "foo" from the map before thread 3 registers its listener on thread 2, and thread 3 then never gets a callback. We rely entirely on lock "foo" to synchronize these operations.
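To make the expected behaviour concrete, here is a rough, simplified model of the pattern in plain Java. It is not our production code and does not use Hazelcast at all: a local `HashMap` and a single `ReentrantLock` stand in for the IMap and the lock on key "foo", and all class and method names (`LockScenarioSketch`, `registerIfPresent`, `removeAndNotify`, and so on) are made up for illustration. Listener registration is also reduced to adding a name to a list. The point is only the invariant we depend on: if the lock is respected, every thread that registers while "foo" is present is notified when "foo" is removed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.locks.ReentrantLock;

class LockScenarioSketch {
    // Assumption: these local objects model the IMap and the per-key lock on "foo".
    private final Map<String, String> map = new HashMap<>();
    private final ReentrantLock fooLock = new ReentrantLock();
    private final List<String> listeners = new ArrayList<>(); // guarded by fooLock

    final List<String> registered = new CopyOnWriteArrayList<>();
    final List<String> notified = new CopyOnWriteArrayList<>();

    // Thread 1: put a value for "foo" only if none exists yet.
    void putIfMissing(String value) {
        fooLock.lock();
        try {
            if (!map.containsKey("foo")) {
                map.put("foo", value);
            }
        } finally {
            fooLock.unlock();
        }
    }

    // Threads 2 and 3: register a listener only if "foo" is still present.
    void registerIfPresent(String who) {
        fooLock.lock();
        try {
            if (map.containsKey("foo")) {
                listeners.add(who);
                registered.add(who);
            }
        } finally {
            fooLock.unlock();
        }
    }

    // Thread 4: remove "foo" and notify every listener registered so far.
    void removeAndNotify() {
        fooLock.lock();
        try {
            if (map.remove("foo") != null) {
                notified.addAll(listeners);
                listeners.clear();
            }
        } finally {
            fooLock.unlock();
        }
    }

    // Runs thread 1 to completion, then lets threads 2, 3, and 4 race for the lock.
    static LockScenarioSketch runOnce() {
        LockScenarioSketch s = new LockScenarioSketch();
        s.putIfMissing("value");
        Thread t2 = new Thread(() -> s.registerIfPresent("thread-2"));
        Thread t3 = new Thread(() -> s.registerIfPresent("thread-3"));
        Thread t4 = new Thread(s::removeAndNotify);
        t2.start();
        t3.start();
        t4.start();
        try {
            t2.join();
            t3.join();
            t4.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return s;
    }

    public static void main(String[] args) {
        LockScenarioSketch s = runOnce();
        System.out.println("registered=" + s.registered + " notified=" + s.notified);
    }
}
```

With a working lock, `registered` and `notified` always end up equal, whichever thread wins the race. What we appear to see with the IMap-based lock corresponds to a thread showing up in `registered` but not in `notified`.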
Finally, we haven't tried running with the fix for issue 223, so we have no reason to believe that the patch fixes the underlying issue; it may not be relevant. Is it worth trying the fix for 223, or are you convinced it has no relevance to our use case?