Hazelcast server stopping with OOM every week


pkga...@gmail.com

Mar 23, 2015, 6:11:36 AM
to haze...@googlegroups.com
Hi All,

I am running a single-node Hazelcast 3.4.1 server and connecting to it through Java clients from my other Java code. Please suggest if I am missing something based on the following information, and let me know if you need more details to find the root cause.

My use case is very simple:

1. Connect to the server from the client.
2. Create a map with String keys and serialized XML strings as values.
3. This map is accessed by the mappers and reducers to read and modify map entries. Entries are added with put, read with get and modified with replace.
4. On completion (a rough sketch of the whole flow follows this list):
       Delete the map using
              instance.getMap(mapName).destroy(); where instance is the client-side HazelcastInstance returned by HazelcastClient
       Close the client connection using
              instance.shutdown();
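
Here is a minimal sketch of that flow, assuming a local single-node member; the address, the map name and the XML payloads are placeholders rather than the real job data:

    import com.hazelcast.client.HazelcastClient;
    import com.hazelcast.client.config.ClientConfig;
    import com.hazelcast.core.HazelcastInstance;
    import com.hazelcast.core.IMap;

    public class MapJobFlow {
        public static void main(String[] args) {
            // 1. Connect to the single-node server from the client
            //    (127.0.0.1:5701 is an assumed address).
            ClientConfig config = new ClientConfig();
            config.getNetworkConfig().addAddress("127.0.0.1:5701");
            HazelcastInstance instance = HazelcastClient.newHazelcastClient(config);

            // 2. Map with String keys and serialized XML strings as values
            //    ("jobMap" is a placeholder name).
            IMap<String, String> map = instance.getMap("jobMap");

            // 3. Mappers and reducers add, read and modify entries.
            map.put("record-1", "<record>...</record>");
            String xml = map.get("record-1");
            System.out.println(xml);
            map.replace("record-1", "<record>modified</record>");

            // 4. On completion, destroy the map and shut down the client.
            instance.getMap("jobMap").destroy();
            instance.shutdown();
        }
    }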

We ran the following test cases to check for memory leakage:

1. Shut down Hadoop, restarted the Hazelcast server, ran garbage collection (GC) using mancenter and created a heap dump. The summary below is from JVisualVM while analyzing the HPROF file:

     Total bytes: 49,040,741

    Total classes: 2,685

    Total instances: 235,562

    Classloaders: 14

    GC roots: 1,505

    Number of objects pending for finalization: 0

2. Restarted Hadoop, ran a job and created a heap dump:

     Total bytes: 96,102,606

    Total classes: 2,929

    Total instances: 393,400

    Classloaders: 15

    GC roots: 1,529

    Number of objects pending for finalization: 0

3. Shut down Hadoop and created a heap dump:

    Total bytes: 136,635,654

    Total classes: 2,929

    Total instances: 1,432,769

    Classloaders: 15

    GC roots: 1,529

    Number of objects pending for finalization: 0

4. Ran GC and created a heap dump:

    Total bytes: 16,213,645

    Total classes: 2,929

    Total instances: 137,577

    Classloaders: 15

    GC roots: 1,529

    Number of objects pending for finalization: 0

The first test is to see what objects are in the heap when the Hazelcast server has run no jobs, so there should be no maps or entries. Analyzing the heap confirms this.

The second test is to run a job and check the objects in the heap. Two classes are of particular interest there: com.hazelcast.spi.DefaultObjectNamespace and com.hazelcast.concurrent.lock.LockStoreImpl.

The third test is to see whether Hadoop is holding on to these objects, so I shut down Hadoop and created a heap dump. Those objects are still in the heap.

The fourth test is to see whether running GC clears them from the heap. Analyzing that dump shows GC cleared some of the objects, but the majority are still in memory.
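
In case anyone wants to reproduce this, the GC run and the heap dumps can also be taken with standard JDK command-line tools instead of mancenter and JVisualVM; a sketch, where <pid> stands for the Hazelcast server's process id and the dump file name is arbitrary:

    # force a full GC on the member JVM
    jcmd <pid> GC.run
    # dump the heap for offline analysis in JVisualVM
    jmap -dump:format=b,file=hazelcast-member.hprof <pid>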

Thanks,

Praveen Gautam


karv...@gmail.com

Mar 24, 2015, 4:06:40 AM
to haze...@googlegroups.com, pkga...@gmail.com
I work with the original poster and have been analyzing this issue along with him. It looks like in the com.hazelcast.concurrent.lock.LockStoreContainer class, when clearLockStore is called, the corresponding entry in the lockStores map is not removed.

I have attached the heap dump from a small test case here. Notice how instances of com.hazelcast.spi.DefaultObjectNamespace and com.hazelcast.concurrent.lock.LockStoreImpl hang around even after the map has been destroyed.
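
A test along roughly these lines shows it (this is only a sketch, not the exact attached test case; the map name is a placeholder, and the explicit lock/unlock is just there to make sure a lock store gets created and used for the map):

    import com.hazelcast.client.HazelcastClient;
    import com.hazelcast.core.HazelcastInstance;
    import com.hazelcast.core.IMap;

    public class LockStoreLeakSketch {
        public static void main(String[] args) {
            HazelcastInstance client = HazelcastClient.newHazelcastClient();

            // Putting and locking entries makes the member create and use a
            // LockStoreImpl for this map's object namespace.
            IMap<String, String> map = client.getMap("leak-test");
            for (int i = 0; i < 100; i++) {
                String key = "key-" + i;
                map.put(key, "<value/>");
                map.lock(key);
                map.unlock(key);
            }

            // Destroy the map and shut down the client. In our heap dumps the
            // DefaultObjectNamespace and LockStoreImpl instances for the map
            // were still present after this point.
            map.destroy();
            client.shutdown();
        }
    }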

Is there a reason not to remove the lock store for a map from the LockStoreContainer when the map is destroyed?

Please advise. Thanks,

Arvind.