ConcurrentHashMap array growing to millions of entries in Hazelcast 2.0.1, leading to memory leak


Hari Poludasu

Jan 22, 2013, 11:55:17 PM1/22/13
to haze...@googlegroups.com
Hi,
 
On my production node, sometimes on PLs, the java.util.concurrent.ConcurrentHashMap array grows to 2 million entries,
leading to a memory leak situation.
Is this a known bug, similar to the memory leak issue solved in version 1.9.3?
 
 
 
Regards,
Hari.

Hari Poludasu

Jan 23, 2013, 9:48:17 PM1/23/13
to haze...@googlegroups.com
I have attached MAT heap-analysis screenshots of both the memory-leak and non-memory-leak (normal behaviour) cases on the same PL.
This issue also looks close to the ConcurrentMapManager issue reported at http://code.google.com/p/hazelcast/issues/detail?id=230
HeapDump_memoryleak.jpg
HeapDump_no_memoryleak.jpg

Hari Poludasu

Jan 29, 2013, 5:28:22 AM1/29/13
to haze...@googlegroups.com
We are using MultiMap<String, String> in our applications, adding entries with put(key, value) and
removing entries with remove(key) and remove(key, value).
 
We have a 10-blade setup. Occasionally, on one PL, memory keeps growing; analysis of the heap dump, as shown in the post above, shows that the
ConcurrentHashMap under CMap is growing huge. It looks like remove operations are not happening on this node.
Is it a known bug in 2.0.1?
 
This is happening at the customer site only; in our local environment we are not able to reproduce it.
If it is a known bug, we can make a new delivery with a later version of Hazelcast.
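For readers unfamiliar with the MultiMap operations described above, here is a stdlib-only sketch of their semantics using a local ConcurrentHashMap of sets. This is purely illustrative (the class name MultiMapSketch is made up); Hazelcast's distributed MultiMap exposes the same three call shapes but with very different internals:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Local emulation of the three MultiMap operations used in the application.
public class MultiMapSketch {
    private final ConcurrentHashMap<String, Set<String>> map = new ConcurrentHashMap<>();

    // put(key, value): associate one more value with the key
    boolean put(String key, String value) {
        return map.computeIfAbsent(key, k -> ConcurrentHashMap.newKeySet()).add(value);
    }

    // remove(key): drop the key and all of its values
    Set<String> remove(String key) {
        return map.remove(key);
    }

    // remove(key, value): drop a single (key, value) pair
    boolean remove(String key, String value) {
        Set<String> values = map.get(key);
        return values != null && values.remove(value);
    }

    public static void main(String[] args) {
        MultiMapSketch mm = new MultiMapSketch();
        mm.put("host", "10.0.0.1");
        mm.put("host", "10.0.0.2");
        System.out.println(mm.remove("host", "10.0.0.1")); // true
        System.out.println(mm.remove("host"));             // remaining values for "host"
    }
}
```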


Talip Ozturk

Jan 29, 2013, 11:10:53 AM1/29/13
to haze...@googlegroups.com
Can you create a simple test to reproduce the issue? Use your own objects
to see whether it is really removing or not.

If you can reproduce it, please post a test that we can also run.

-talip

Hari Poludasu

Jan 30, 2013, 10:57:50 PM1/30/13
to haze...@googlegroups.com
Hi Talip,
 
We tried reproducing the issue with simple test code, which performs put, get, and remove operations on a MultiMap in a two-node cluster, as shown:
 
import java.util.concurrent.atomic.AtomicLong;

import com.hazelcast.config.Config;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.core.MultiMap;

public class MultiMapStressTest {
    // counters and constant declared so the snippet is self-contained
    static final int STATS_SECONDS = 10;
    static final AtomicLong gets = new AtomicLong();
    static final AtomicLong puts = new AtomicLong();
    static final AtomicLong removes = new AtomicLong();

    public static void main(String[] args) throws InterruptedException {
        HazelcastInstance theirHazelcast = Hazelcast.newHazelcastInstance(new Config());
        // use the instance we just created, not the static default instance
        MultiMap<String, String> map = theirHazelcast.getMultiMap("default");
        Runtime rt = Runtime.getRuntime();
        while (true) {
            int cnt = 0;
            while (cnt < 10000) {
                String key = String.valueOf((int) (Math.random() * 10000));
                int operation = (int) (Math.random() * 100);
                if (operation < 40) {
                    map.get(key);
                    gets.incrementAndGet();
                } else if (operation < 80) {
                    map.put(key, "localhost");
                    puts.incrementAndGet();
                } else {
                    map.remove(key);
                    removes.incrementAndGet();
                }
                cnt++;
            }
            long putCount = puts.getAndSet(0);
            long getCount = gets.getAndSet(0);
            long removeCount = removes.getAndSet(0);
            System.out.println("TOTAL: " + (removeCount + putCount + getCount) / STATS_SECONDS);
            System.out.println("PUTS: " + putCount / STATS_SECONDS);
            System.out.println("GETS: " + getCount / STATS_SECONDS);
            System.out.println("REMOVES: " + removeCount / STATS_SECONDS);
            System.out.println("Total in KB " + rt.totalMemory() / 1024
                    + " used: " + (rt.totalMemory() - rt.freeMemory()) / 1024);
            Thread.sleep(STATS_SECONDS * 1000);
        }
    }
}
 
The test has been running for the last 24 hours, but the heap is normal; no memory leak is happening.
Even at the customer site, it happens only about once a month.
 
Here are my findings from the heap dumps of three scenarios (memory leak at the customer site, normal behaviour at the customer site, and the test code).
I have attached screenshots of the three dumps; the difference lies in the number of entries stored in java.util.concurrent.ConcurrentHashMap$HashEntry[], which is managed by CMap.
 
In the Normal_PL4 and TestCode screenshots the size of this array is 16384, but in the MemoryLeak_PL7 screenshot
the value is 2097152, which is 128 times 16384.
 
It looks like after the cluster has been running for some time, some thread on the PL may be blocked and not removing entries from this array, so once the
chunk of 16384 is filled, the array grows by another chunk.
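A quick sanity check on that arithmetic (my own interpretation, assuming the array is ConcurrentHashMap's power-of-two hash table): the table grows by doubling, so the jump from 16384 to 2097152 slots corresponds to 7 consecutive doublings (2^7 = 128) rather than 128 additive chunks:

```java
public class ResizeMath {
    public static void main(String[] args) {
        int normal = 16384;   // array size in the Normal_PL4 and TestCode dumps
        int leaked = 2097152; // array size in the MemoryLeak_PL7 dump
        int doublings = 0;
        for (int size = normal; size < leaked; size <<= 1) {
            doublings++;
        }
        System.out.println(leaked / normal); // 128
        System.out.println(doublings);       // 7, since 128 == 2^7
    }
}
```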
 
Could you explain what might have triggered the uncontrolled increase of this ConcurrentHashMap array size in CMap?
 
Regards,
Hari

MemoryLeak_PL7.jpg
Normal_PL4.jpg
TestCode.jpg

Enes Akar

Feb 19, 2013, 5:21:17 AM2/19/13
to haze...@googlegroups.com
Hi Hari,

(sorry for the late answer)
I have tried your test code. I also do not experience a memory leak.

Looking at your heap dump (the leaked one from your site), it seems there are more than 2 million entries.
So probably, somehow, you cannot remove from the multimap.

What I am suspicious about is whether you implement equals() in the objects you remove from the multimap.
Hazelcast looks at equals() when removing objects.

Also, I recommend you upgrade to version 2.5, as it has many bug fixes and improvements that may also resolve such issues.
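The equals() point can be seen even with plain java.util.concurrent, no Hazelcast needed (EqualsRemoveDemo and its value classes are made up for illustration): a conditional remove(key, value) only succeeds when the stored value equals() the argument, so a value class that does not override equals() never matches a freshly constructed copy:

```java
import java.util.concurrent.ConcurrentHashMap;

public class EqualsRemoveDemo {
    // Value type WITHOUT equals(): two instances with the same content
    // are compared by identity, so remove(key, value) misses.
    static class RawValue {
        final String s;
        RawValue(String s) { this.s = s; }
    }

    // Value type WITH equals()/hashCode(): remove(key, value) works.
    static class GoodValue {
        final String s;
        GoodValue(String s) { this.s = s; }
        @Override public boolean equals(Object o) {
            return o instanceof GoodValue && ((GoodValue) o).s.equals(s);
        }
        @Override public int hashCode() { return s.hashCode(); }
    }

    public static void main(String[] args) {
        ConcurrentHashMap<String, RawValue> raw = new ConcurrentHashMap<>();
        raw.put("k", new RawValue("v"));
        boolean removedRaw = raw.remove("k", new RawValue("v"));

        ConcurrentHashMap<String, GoodValue> good = new ConcurrentHashMap<>();
        good.put("k", new GoodValue("v"));
        boolean removedGood = good.remove("k", new GoodValue("v"));

        System.out.println(removedRaw + " " + removedGood); // false true
    }
}
```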







--
Enes Akar
Hazelcast | Open source in-memory data grid
Mobile: +90.505.394.1668