Eviction and Garbage Collection: OutOfMemoryError, GC overhead limit exceeded, etc.


Anatoly

Feb 18, 2014, 4:43:00 PM
to haze...@googlegroups.com
When entries get evicted (25% at the max-size of elements), I can still see that most of the memory stays in OldGen, and after several evictions I get an "OutOfMemoryError".

I tried several collectors: ConcurrentMarkSweep, UseParallelGC, etc.

A simulation: 7.5K entries in a single map. The heap is 10GB, and eviction is 25% when the map reaches 1 * 1000 * 1000 (1 million) entries.
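For context, a Hazelcast 3.x map configured along these lines would match that setup (the map name and LRU policy here are illustrative assumptions; only the max-size and eviction-percentage values come from the description above):

```xml
<hazelcast>
    <map name="test-map">
        <!-- illustrative policy; the post does not say which one is used -->
        <eviction-policy>LRU</eviction-policy>
        <!-- evict once the map holds 1 million entries on this node -->
        <max-size policy="PER_NODE">1000000</max-size>
        <!-- evict 25% of entries when max-size is reached -->
        <eviction-percentage>25</eviction-percentage>
    </map>
</hazelcast>
```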

Anything simple I can be overlooking?

Thank you,
/Anatoly

Ahmet Mircik

Feb 19, 2014, 3:51:57 AM
to haze...@googlegroups.com
Hi, 

Which version are you using?


--
You received this message because you are subscribed to the Google Groups "Hazelcast" group.

Anatoly

Feb 19, 2014, 7:05:07 AM
to haze...@googlegroups.com
3.1.5

Ahmet Mircik

Feb 19, 2014, 7:39:38 AM
to haze...@googlegroups.com
Can you verify this with the latest snapshot?

<dependency>
    <groupId>com.hazelcast</groupId>
    <artifactId>hazelcast</artifactId>
    <version>3.2-SNAPSHOT</version>
</dependency>

<repository>
    <id>sonatype-snapshots</id>
    <name>Sonatype Snapshot Repository</name>
    <url>https://oss.sonatype.org/content/repositories/snapshots</url>
    <releases>
        <enabled>false</enabled>
    </releases>
    <snapshots>
        <enabled>true</enabled>
    </snapshots>
</repository>


Peter Veentjer

Feb 19, 2014, 8:56:35 AM
to haze...@googlegroups.com
@Ahmet

You are advising the 3.2-SNAPSHOT; do you think the bug is solved in that release?

@Anatoly:

Can you make a heap dump so we can have a look at it?


Ahmet Mircik

Feb 19, 2014, 9:08:24 AM
to haze...@googlegroups.com
@Peter AFAIR some eviction-related fixes were sent, but this issue is probably different.
Just a quick check to see whether we are still hitting it or not.


Anatoly

Feb 19, 2014, 11:12:50 AM
to haze...@googlegroups.com
Is there a way to build/download a mancenter (war) for 3.2-SNAPSHOT, or to override the "3.1.5" one so it works with a 3.2-SNAPSHOT node?

I am running with a 3.2-SNAPSHOT node, but I would like to correlate the mancenter view with visualgc.

Thank you,
/Anatoly

Ahmet Mircik

Feb 19, 2014, 11:30:10 AM
to haze...@googlegroups.com
Actually we released an RC yesterday.

<dependency>
    <groupId>com.hazelcast</groupId>
    <artifactId>hazelcast</artifactId>
    <version>3.2-RC1</version>
</dependency>

And this is the management center war for it.


Just use these.




Anatoly

Feb 19, 2014, 12:48:32 PM
to haze...@googlegroups.com
The problem is still there. Here are the heap stats (the dump itself is 10GB, so I am not sure Google Groups will like it). Below are the health monitor notifications.

Thank you,
/Anatoly

P.S. This run is with G1, but the same behavior is observed with the parallel GC and concurrent mark-and-sweep.


VM version is 24.45-b08

using thread-local object allocation.
Garbage-First (G1) GC with 8 thread(s)

Heap Configuration:
   MinHeapFreeRatio = 40
   MaxHeapFreeRatio = 70
   MaxHeapSize      = 11811160064 (11264.0MB)
   NewSize          = 1363144 (1.2999954223632812MB)
   MaxNewSize       = 17592186044415 MB
   OldSize          = 5452592 (5.1999969482421875MB)
   NewRatio         = 2
   SurvivorRatio    = 8
   PermSize         = 20971520 (20.0MB)
   MaxPermSize      = 85983232 (82.0MB)
   G1HeapRegionSize = 1048576 (1.0MB)

Heap Usage:
G1 Heap:
   regions  = 11264
   capacity = 11811160064 (11264.0MB)
   used     = 9561041744 (9118.11994934082MB)
   free     = 2250118320 (2145.8800506591797MB)
   80.94921830025586% used
G1 Young Generation:
Eden Space:
   regions  = 0
   capacity = 10485760 (10.0MB)
   used     = 0 (0.0MB)
   free     = 10485760 (10.0MB)
   0.0% used
Survivor Space:
   regions  = 39
   capacity = 40894464 (39.0MB)
   used     = 40894464 (39.0MB)
   free     = 0 (0.0MB)
   100.0% used
G1 Old Generation:
   regions  = 9129
   capacity = 11759779840 (11215.0MB)
   used     = 11759124168 (11214.374702453613MB)
   free     = 655672 (0.6252975463867188MB)
   99.99442445344283% used
Perm Generation:
   capacity = 26214400 (25.0MB)
   used     = 25426904 (24.248985290527344MB)
   free     = 787496 (0.7510147094726562MB)
   96.99594116210938% used


[79879]: Class JavaLaunchHelper is implemented in both /Library/Java/JavaVirtualMachines/jdk1.7.0_45.jdk/Contents/Home/bin/java and /Library/Java/JavaVirtualMachines/jdk1.7.0_45.jdk/Contents/Home/jre/lib/libinstrument.dylib. One of the two will be used. Which one is undefined.
Feb 19, 2014 11:56:43 AM com.hazelcast.util.HealthMonitor

INFO: [x.x.x.x]:5702 [dev] [3.2-RC1] memory.used=8.4G, memory.free=2.4G, memory.total=10.8G, memory.max=11.0G, memory.used/total=77.49%, memory.used/max=75.92%, load.process=28.00%, load.system=37.00%, load.systemAverage=355.00%, thread.count=52, thread.peakCount=52, event.q.size=0, executor.q.async.size=0, executor.q.client.size=0, executor.q.operation.size=0, executor.q.query.size=0, executor.q.scheduled.size=0, executor.q.io.size=0, executor.q.system.size=0, executor.q.operation.size=0, executor.q.response.size=0, operations.remote.size=0, operations.running.size=1, proxy.count=6, clientEndpoint.count=0, connection.active.count=0, connection.count=0

Feb 19, 2014 11:57:13 AM com.hazelcast.util.HealthMonitor
INFO: [x.x.x.x]:5702 [dev] [3.2-RC1] memory.used=9.3G, memory.free=1.5G, memory.total=10.8G, memory.max=11.0G, memory.used/total=86.51%, memory.used/max=84.77%, load.process=28.00%, load.system=32.00%, load.systemAverage=430.00%, thread.count=52, thread.peakCount=52, event.q.size=0, executor.q.async.size=0, executor.q.client.size=0, executor.q.operation.size=0, executor.q.query.size=0, executor.q.scheduled.size=0, executor.q.io.size=0, executor.q.system.size=0, executor.q.operation.size=0, executor.q.response.size=0, operations.remote.size=0, operations.running.size=0, proxy.count=6, clientEndpoint.count=0, connection.active.count=0, connection.count=0

Feb 19, 2014 11:57:43 AM com.hazelcast.util.HealthMonitor
INFO: [x.x.x.x]:5702 [dev] [3.2-RC1] memory.used=7.9G, memory.free=2.9G, memory.total=10.8G, memory.max=11.0G, memory.used/total=73.44%, memory.used/max=71.96%, load.process=23.00%, load.system=29.00%, load.systemAverage=386.00%, thread.count=52, thread.peakCount=52, event.q.size=0, executor.q.async.size=0, executor.q.client.size=0, executor.q.operation.size=0, executor.q.query.size=0, executor.q.scheduled.size=0, executor.q.io.size=0, executor.q.system.size=0, executor.q.operation.size=0, executor.q.response.size=0, operations.remote.size=0, operations.running.size=0, proxy.count=6, clientEndpoint.count=0, connection.active.count=0, connection.count=0

Feb 19, 2014 11:58:13 AM com.hazelcast.util.HealthMonitor
INFO: [x.x.x.x]:5702 [dev] [3.2-RC1] memory.used=8.5G, memory.free=2.3G, memory.total=10.8G, memory.max=11.0G, memory.used/total=79.03%, memory.used/max=77.43%, load.process=20.00%, load.system=25.00%, load.systemAverage=327.00%, thread.count=51, thread.peakCount=52, event.q.size=0, executor.q.async.size=0, executor.q.client.size=0, executor.q.operation.size=0, executor.q.query.size=0, executor.q.scheduled.size=0, executor.q.io.size=0, executor.q.system.size=0, executor.q.operation.size=0, executor.q.response.size=0, operations.remote.size=0, operations.running.size=0, proxy.count=6, clientEndpoint.count=0, connection.active.count=0, connection.count=0

Feb 19, 2014 11:58:43 AM com.hazelcast.util.HealthMonitor
INFO: [x.x.x.x]:5702 [dev] [3.2-RC1] memory.used=9.1G, memory.free=1.6G, memory.total=10.8G, memory.max=11.0G, memory.used/total=84.73%, memory.used/max=83.02%, load.process=22.00%, load.system=31.00%, load.systemAverage=271.00%, thread.count=51, thread.peakCount=52, event.q.size=0, executor.q.async.size=0, executor.q.client.size=0, executor.q.operation.size=0, executor.q.query.size=0, executor.q.scheduled.size=0, executor.q.io.size=0, executor.q.system.size=0, executor.q.operation.size=0, executor.q.response.size=0, operations.remote.size=0, operations.running.size=0, proxy.count=6, clientEndpoint.count=0, connection.active.count=0, connection.count=0

Feb 19, 2014 11:59:13 AM com.hazelcast.util.HealthMonitor
INFO: [x.x.x.x]:5702 [dev] [3.2-RC1] memory.used=9.7G, memory.free=1.1G, memory.total=10.8G, memory.max=11.0G, memory.used/total=89.62%, memory.used/max=87.81%, load.process=33.00%, load.system=39.00%, load.systemAverage=203.00%, thread.count=51, thread.peakCount=52, event.q.size=0, executor.q.async.size=0, executor.q.client.size=0, executor.q.operation.size=0, executor.q.query.size=0, executor.q.scheduled.size=0, executor.q.io.size=0, executor.q.system.size=0, executor.q.operation.size=0, executor.q.response.size=0, operations.remote.size=0, operations.running.size=1, proxy.count=6, clientEndpoint.count=0, connection.active.count=0, connection.count=0

Feb 19, 2014 11:59:43 AM com.hazelcast.util.HealthMonitor
INFO: [x.x.x.x]:5702 [dev] [3.2-RC1] memory.used=9.9G, memory.free=934.8M, memory.total=10.8G, memory.max=11.0G, memory.used/total=91.53%, memory.used/max=89.69%, load.process=39.00%, load.system=46.00%, load.systemAverage=251.00%, thread.count=53, thread.peakCount=53, event.q.size=0, executor.q.async.size=0, executor.q.client.size=0, executor.q.operation.size=0, executor.q.query.size=0, executor.q.scheduled.size=0, executor.q.io.size=0, executor.q.system.size=0, executor.q.operation.size=0, executor.q.response.size=0, operations.remote.size=0, operations.running.size=0, proxy.count=6, clientEndpoint.count=0, connection.active.count=0, connection.count=0

Feb 19, 2014 12:00:13 PM com.hazelcast.util.HealthMonitor
INFO: [x.x.x.x]:5702 [dev] [3.2-RC1] memory.used=10.2G, memory.free=687.8M, memory.total=10.9G, memory.max=11.0G, memory.used/total=93.82%, memory.used/max=92.74%, load.process=23.00%, load.system=32.00%, load.systemAverage=277.00%, thread.count=54, thread.peakCount=54, event.q.size=0, executor.q.async.size=0, executor.q.client.size=0, executor.q.operation.size=0, executor.q.query.size=0, executor.q.scheduled.size=0, executor.q.io.size=0, executor.q.system.size=0, executor.q.operation.size=0, executor.q.response.size=0, operations.remote.size=0, operations.running.size=0, proxy.count=6, clientEndpoint.count=0, connection.active.count=0, connection.count=0

Ahmet Mircik

Feb 28, 2014, 8:19:59 AM
to haze...@googlegroups.com
Hi,

Currently, if you don't access any entries after populating the map, there is a strong possibility that you will get an OOM: the heap keeps growing and the eviction process cannot keep up with it. This is probably your case.



Anatoly

Feb 28, 2014, 9:13:56 AM
to haze...@googlegroups.com
@Ahmet,

   Can you elaborate on why "eviction process can not handle this"?

Thank you,
/Anatoly

Ahmet Mircik

Feb 28, 2014, 9:48:19 AM
to haze...@googlegroups.com
Say you have 1 node, 1 map, 271 partitions, a map max size of 1K, an eviction percentage of 25, and you are making 10K puts.
If you don't access any entries after populating the map, the current eviction logic evicts 1 entry per partition on each run, and eviction runs every second. That may not compensate for a high load. Entry access is what drives the LRU/LFU ordering.
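A quick back-of-the-envelope on those numbers shows why eviction falls behind (the 10K puts/sec load is an illustrative rate taken from the example above, not a measurement):

```java
// Back-of-the-envelope eviction throughput for the scenario above.
// 271 partitions, 1 entry per partition per run, 1 run per second are the
// figures from the explanation in this thread; 10K puts/sec is the assumed load.
public class EvictionRate {

    static int evictionsPerSecond(int partitions, int entriesPerPartitionPerRun, int runsPerSecond) {
        return partitions * entriesPerPartitionPerRun * runsPerSecond;
    }

    public static void main(String[] args) {
        int evicted = evictionsPerSecond(271, 1, 1); // 271 entries evicted per second
        int puts = 10_000;                           // assumed sustained put rate
        // With ~9729 net new entries per second, the heap keeps growing
        // until the JVM runs out of memory.
        System.out.println("evicted/sec = " + evicted);
        System.out.println("net growth/sec = " + (puts - evicted));
    }
}
```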

For more details, please check out the MapEvictTask class in this file:


