Eviction and Garbage Collection: OutOfMemoryError, GC overhead limit exceeded, etc.


Anatoly

Feb 18, 2014, 4:43:00 PM
to haze...@googlegroups.com
When entries get evicted (25% at the max-size of elements), I can still see that most of the memory stays in OldGen, and after several evictions I get an "OutOfMemoryError".

I tried several collectors: ConcurrentMarkSweep, UseParallelGC, etc.

A simulation: 7.5K entries in a single map. The heap is 10GB, and eviction is 25% when the map reaches 1 * 1000 * 1000 (1 million) entries.
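For context, a Hazelcast 3.x map configured along these lines would match that setup (the map name and LRU policy here are illustrative assumptions; only the max-size and eviction-percentage values come from the description above):

```xml
<hazelcast>
    <map name="test-map">
        <!-- illustrative policy; the post does not say which one is used -->
        <eviction-policy>LRU</eviction-policy>
        <!-- evict once the map holds 1 million entries on this node -->
        <max-size policy="PER_NODE">1000000</max-size>
        <!-- evict 25% of entries when max-size is reached -->
        <eviction-percentage>25</eviction-percentage>
    </map>
</hazelcast>
```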

Anything simple I can be overlooking?

Thank you,
/Anatoly

Ahmet Mircik

Feb 19, 2014, 3:51:57 AM
to haze...@googlegroups.com
Hi, 

Which version are you using?


--
You received this message because you are subscribed to the Google Groups "Hazelcast" group.

Anatoly

Feb 19, 2014, 7:05:07 AM
to haze...@googlegroups.com
3.1.5

Ahmet Mircik

Feb 19, 2014, 7:39:38 AM
to haze...@googlegroups.com
Can you verify this with the latest snapshot?

<dependency>
    <groupId>com.hazelcast</groupId>
    <artifactId>hazelcast</artifactId>
    <version>3.2-SNAPSHOT</version>
</dependency>

<repository>
    <id>sonatype-snapshots</id>
    <name>Sonatype Snapshot Repository</name>
    <url>https://oss.sonatype.org/content/repositories/snapshots</url>
    <releases>
        <enabled>false</enabled>
    </releases>
    <snapshots>
        <enabled>true</enabled>
    </snapshots>
</repository>


Peter Veentjer

Feb 19, 2014, 8:56:35 AM
to haze...@googlegroups.com
@Ahmet

You are advising the 3.2-SNAPSHOT; do you think the bug is solved in that release?

@Anatoly:

Can you make a heap dump so we can have a look at it?


Ahmet Mircik

Feb 19, 2014, 9:08:24 AM
to haze...@googlegroups.com
@Peter AFAIR some eviction-related fixes were sent, but this issue is probably different.
Just a quick check to see whether we are still hitting it or not.


Anatoly

Feb 19, 2014, 11:12:50 AM
to haze...@googlegroups.com
Is there a way to build/download a mancenter (war) for 3.2-SNAPSHOT, or to override the "3.1.5" one so it works with a 3.2-SNAPSHOT node?

I am running with a 3.2-SNAPSHOT node, but I would like to correlate the mancenter view with visualgc.

Thank you,
/Anatoly

Ahmet Mircik

Feb 19, 2014, 11:30:10 AM
to haze...@googlegroups.com
Actually we released an RC yesterday.

<dependency>
    <groupId>com.hazelcast</groupId>
    <artifactId>hazelcast</artifactId>
    <version>3.2-RC1</version>
</dependency>

And this is the management center war for it.


Just use these.




Anatoly

Feb 19, 2014, 12:48:32 PM
to haze...@googlegroups.com
The problem is still there. Here are the heap stats (the dump itself is 10GB, so I am not sure Google Groups will like it). Below are the health monitor notifications.

Thank you,
/Anatoly

P.S. This run is with G1, but the same behavior is observed with the parallel GC and concurrent mark-and-sweep.


VM version is 24.45-b08

using thread-local object allocation.
Garbage-First (G1) GC with 8 thread(s)

Heap Configuration:
   MinHeapFreeRatio = 40
   MaxHeapFreeRatio = 70
   MaxHeapSize      = 11811160064 (11264.0MB)
   NewSize          = 1363144 (1.2999954223632812MB)
   MaxNewSize       = 17592186044415 MB
   OldSize          = 5452592 (5.1999969482421875MB)
   NewRatio         = 2
   SurvivorRatio    = 8
   PermSize         = 20971520 (20.0MB)
   MaxPermSize      = 85983232 (82.0MB)
   G1HeapRegionSize = 1048576 (1.0MB)

Heap Usage:
G1 Heap:
   regions  = 11264
   capacity = 11811160064 (11264.0MB)
   used     = 9561041744 (9118.11994934082MB)
   free     = 2250118320 (2145.8800506591797MB)
   80.94921830025586% used
G1 Young Generation:
Eden Space:
   regions  = 0
   capacity = 10485760 (10.0MB)
   used     = 0 (0.0MB)
   free     = 10485760 (10.0MB)
   0.0% used
Survivor Space:
   regions  = 39
   capacity = 40894464 (39.0MB)
   used     = 40894464 (39.0MB)
   free     = 0 (0.0MB)
   100.0% used
G1 Old Generation:
   regions  = 9129
   capacity = 11759779840 (11215.0MB)
   used     = 11759124168 (11214.374702453613MB)
   free     = 655672 (0.6252975463867188MB)
   99.99442445344283% used
Perm Generation:
   capacity = 26214400 (25.0MB)
   used     = 25426904 (24.248985290527344MB)
   free     = 787496 (0.7510147094726562MB)
   96.99594116210938% used


[79879]: Class JavaLaunchHelper is implemented in both /Library/Java/JavaVirtualMachines/jdk1.7.0_45.jdk/Contents/Home/bin/java and /Library/Java/JavaVirtualMachines/jdk1.7.0_45.jdk/Contents/Home/jre/lib/libinstrument.dylib. One of the two will be used. Which one is undefined.
Feb 19, 2014 11:56:43 AM com.hazelcast.util.HealthMonitor

INFO: [x.x.x.x]:5702 [dev] [3.2-RC1] memory.used=8.4G, memory.free=2.4G, memory.total=10.8G, memory.max=11.0G, memory.used/total=77.49%, memory.used/max=75.92%, load.process=28.00%, load.system=37.00%, load.systemAverage=355.00%, thread.count=52, thread.peakCount=52, event.q.size=0, executor.q.async.size=0, executor.q.client.size=0, executor.q.operation.size=0, executor.q.query.size=0, executor.q.scheduled.size=0, executor.q.io.size=0, executor.q.system.size=0, executor.q.operation.size=0, executor.q.response.size=0, operations.remote.size=0, operations.running.size=1, proxy.count=6, clientEndpoint.count=0, connection.active.count=0, connection.count=0

Feb 19, 2014 11:57:13 AM com.hazelcast.util.HealthMonitor
INFO: [x.x.x.x]:5702 [dev] [3.2-RC1] memory.used=9.3G, memory.free=1.5G, memory.total=10.8G, memory.max=11.0G, memory.used/total=86.51%, memory.used/max=84.77%, load.process=28.00%, load.system=32.00%, load.systemAverage=430.00%, thread.count=52, thread.peakCount=52, event.q.size=0, executor.q.async.size=0, executor.q.client.size=0, executor.q.operation.size=0, executor.q.query.size=0, executor.q.scheduled.size=0, executor.q.io.size=0, executor.q.system.size=0, executor.q.operation.size=0, executor.q.response.size=0, operations.remote.size=0, operations.running.size=0, proxy.count=6, clientEndpoint.count=0, connection.active.count=0, connection.count=0

Feb 19, 2014 11:57:43 AM com.hazelcast.util.HealthMonitor
INFO: [x.x.x.x]:5702 [dev] [3.2-RC1] memory.used=7.9G, memory.free=2.9G, memory.total=10.8G, memory.max=11.0G, memory.used/total=73.44%, memory.used/max=71.96%, load.process=23.00%, load.system=29.00%, load.systemAverage=386.00%, thread.count=52, thread.peakCount=52, event.q.size=0, executor.q.async.size=0, executor.q.client.size=0, executor.q.operation.size=0, executor.q.query.size=0, executor.q.scheduled.size=0, executor.q.io.size=0, executor.q.system.size=0, executor.q.operation.size=0, executor.q.response.size=0, operations.remote.size=0, operations.running.size=0, proxy.count=6, clientEndpoint.count=0, connection.active.count=0, connection.count=0

Feb 19, 2014 11:58:13 AM com.hazelcast.util.HealthMonitor
INFO: [x.x.x.x]:5702 [dev] [3.2-RC1] memory.used=8.5G, memory.free=2.3G, memory.total=10.8G, memory.max=11.0G, memory.used/total=79.03%, memory.used/max=77.43%, load.process=20.00%, load.system=25.00%, load.systemAverage=327.00%, thread.count=51, thread.peakCount=52, event.q.size=0, executor.q.async.size=0, executor.q.client.size=0, executor.q.operation.size=0, executor.q.query.size=0, executor.q.scheduled.size=0, executor.q.io.size=0, executor.q.system.size=0, executor.q.operation.size=0, executor.q.response.size=0, operations.remote.size=0, operations.running.size=0, proxy.count=6, clientEndpoint.count=0, connection.active.count=0, connection.count=0

Feb 19, 2014 11:58:43 AM com.hazelcast.util.HealthMonitor
INFO: [x.x.x.x]:5702 [dev] [3.2-RC1] memory.used=9.1G, memory.free=1.6G, memory.total=10.8G, memory.max=11.0G, memory.used/total=84.73%, memory.used/max=83.02%, load.process=22.00%, load.system=31.00%, load.systemAverage=271.00%, thread.count=51, thread.peakCount=52, event.q.size=0, executor.q.async.size=0, executor.q.client.size=0, executor.q.operation.size=0, executor.q.query.size=0, executor.q.scheduled.size=0, executor.q.io.size=0, executor.q.system.size=0, executor.q.operation.size=0, executor.q.response.size=0, operations.remote.size=0, operations.running.size=0, proxy.count=6, clientEndpoint.count=0, connection.active.count=0, connection.count=0

Feb 19, 2014 11:59:13 AM com.hazelcast.util.HealthMonitor
INFO: [x.x.x.x]:5702 [dev] [3.2-RC1] memory.used=9.7G, memory.free=1.1G, memory.total=10.8G, memory.max=11.0G, memory.used/total=89.62%, memory.used/max=87.81%, load.process=33.00%, load.system=39.00%, load.systemAverage=203.00%, thread.count=51, thread.peakCount=52, event.q.size=0, executor.q.async.size=0, executor.q.client.size=0, executor.q.operation.size=0, executor.q.query.size=0, executor.q.scheduled.size=0, executor.q.io.size=0, executor.q.system.size=0, executor.q.operation.size=0, executor.q.response.size=0, operations.remote.size=0, operations.running.size=1, proxy.count=6, clientEndpoint.count=0, connection.active.count=0, connection.count=0

Feb 19, 2014 11:59:43 AM com.hazelcast.util.HealthMonitor
INFO: [x.x.x.x]:5702 [dev] [3.2-RC1] memory.used=9.9G, memory.free=934.8M, memory.total=10.8G, memory.max=11.0G, memory.used/total=91.53%, memory.used/max=89.69%, load.process=39.00%, load.system=46.00%, load.systemAverage=251.00%, thread.count=53, thread.peakCount=53, event.q.size=0, executor.q.async.size=0, executor.q.client.size=0, executor.q.operation.size=0, executor.q.query.size=0, executor.q.scheduled.size=0, executor.q.io.size=0, executor.q.system.size=0, executor.q.operation.size=0, executor.q.response.size=0, operations.remote.size=0, operations.running.size=0, proxy.count=6, clientEndpoint.count=0, connection.active.count=0, connection.count=0

Feb 19, 2014 12:00:13 PM com.hazelcast.util.HealthMonitor
INFO: [x.x.x.x]:5702 [dev] [3.2-RC1] memory.used=10.2G, memory.free=687.8M, memory.total=10.9G, memory.max=11.0G, memory.used/total=93.82%, memory.used/max=92.74%, load.process=23.00%, load.system=32.00%, load.systemAverage=277.00%, thread.count=54, thread.peakCount=54, event.q.size=0, executor.q.async.size=0, executor.q.client.size=0, executor.q.operation.size=0, executor.q.query.size=0, executor.q.scheduled.size=0, executor.q.io.size=0, executor.q.system.size=0, executor.q.operation.size=0, executor.q.response.size=0, operations.remote.size=0, operations.running.size=0, proxy.count=6, clientEndpoint.count=0, connection.active.count=0, connection.count=0

Ahmet Mircik

Feb 28, 2014, 8:19:59 AM
to haze...@googlegroups.com
Hi,

Currently, if you don't access any entries after populating the map, there is a strong possibility that you will get an OOM: the heap keeps growing and the eviction process cannot keep up with it. This is probably your case.



Anatoly

Feb 28, 2014, 9:13:56 AM
to haze...@googlegroups.com
@Ahmet,

   Can you elaborate on why "eviction process can not handle this"?

Thank you,
/Anatoly

Ahmet Mircik

Feb 28, 2014, 9:48:19 AM
to haze...@googlegroups.com
Say you have 1 node, 1 map, 271 partitions, a map max size of 1K, an eviction percentage of 25, and you are making 10K puts.
If you don't access any entries after populating the map, the current eviction logic evicts 1 entry per partition on each run, and eviction runs every second. That may not compensate for a high load. Entry access is what drives the LRU/LFU ordering.
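A quick back-of-the-envelope on those numbers shows why eviction falls behind (the 10K puts/sec load is an illustrative rate taken from the example above, not a measurement):

```java
// Back-of-the-envelope eviction throughput for the scenario above.
// 271 partitions, 1 entry per partition per run, 1 run per second are the
// figures from the explanation in this thread; 10K puts/sec is the assumed load.
public class EvictionRate {

    static int evictionsPerSecond(int partitions, int entriesPerPartitionPerRun, int runsPerSecond) {
        return partitions * entriesPerPartitionPerRun * runsPerSecond;
    }

    public static void main(String[] args) {
        int evicted = evictionsPerSecond(271, 1, 1); // 271 entries evicted per second
        int puts = 10_000;                           // assumed sustained put rate
        // With ~9729 net new entries per second, the heap keeps growing
        // until the JVM runs out of memory.
        System.out.println("evicted/sec = " + evicted);
        System.out.println("net growth/sec = " + (puts - evicted));
    }
}
```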

For more details, please check out the MapEvictTask class in this file:


