WAS Out Of Memory, Heap Size .......

srini...@yahoo.com

unread,

May 25, 2004, 12:35:17 PM5/25/04

to

Hi All,

I have an application (production) On WAS 4.0.2 running on IBM AIX. It didn't have any problem for a year. Now I see a lot of "java.lang.OutOfMemoryError" errors.

I enabled the Verbose GC , analyzed the Allocation Failures and noticed that the application server is running out of memory when it needs more than ~40MB. The Heapsize for the application server is 512MB (-Xmx and -Xms). Surprisingly it works fine sometimes. Which made me think that the issue could be a "Heap fragmentation" issue.

I also enabled the heap profiling but failed to get the heapdump to the file that I created. I downloaded the HAT(Heap Analysis Tool) to see if there are any memory leaks. But the tool needs the binary format of the heap dump.

I am also enclosing the successful and unsuccessful Verbose GC log.

I'd be grateful to you, if you could let me know

1) Can I increase the heap size to 1024M? If so, are there any consequences?
2) How to trigger the WAS heap dump when needed?
3) If I can change the JDK for a better garbage collection. As you know we get the IBM JVM 1.3.1 with the WAS 4.0.2.
4) If we should migrate to WAS 5.1?
5) If I am off track?

I apologize, if I am not clear. Please let me know, if I need to post any other information.

Thanks in advance,
Srinivas.

The following is the Verbose GC log for the recreation of the problem that we have.

The Successful Garbage Collection :

<AF[22]: Allocation Failure. need 46511728 bytes, 2055340 ms since last AF>
<AF[22]: managing allocation failure, action=1 (89027200/254700928) (13401048/13
at com.toyota.owk.framework.broker.document.AdobeProcessManager$AdobeTim
eOutTimer.run(AdobeProcessManager.java:113)

<AF[22]: Allocation Failure. need 46511728 bytes, 2055340 ms since last AF>
<AF[22]: managing allocation failure, action=1 (89027200/254700928) (13401048/13
405312)>
<GC(26): GC cycle started Sun May 23 18:30:14 2004
<GC(26): freed 96076480 bytes, 74% free (198504728/268106240), in 3933 ms>
<GC(26): mark: 112 ms, sweep: 3234 ms, compact: 587 ms>
<GC(26): refs: soft 0 (age >= 32), weak 0, final 87, phantom 0>
<GC(26): moved 115393 objects, 52220456 bytes, reason=1, used 32 more bytes>
<AF[22]: managing allocation failure, action=3 (198504728/268106240)>
<GC(26): need to expand mark bits for 268368384-byte heap>
<GC(26): expanded mark bits by 4096 to 4194304 bytes>
<GC(26): need to expand alloc bits for 268368384-byte heap>
<GC(26): expanded alloc bits by 4096 to 4194304 bytes>
<GC(26): expanded heap fully by 262144 to 268368384 bytes, 74% free>
<AF[22]: managing allocation failure, action=4 (198766872/268368384)>
<AF[22]: clearing all remaining soft refs>
<GC(27): GC cycle started Sun May 23 18:30:14 2004
<GC(27): freed 46643824 bytes, 91% free (245410696/268368384), in 168 ms>
<GC(27): mark: 96 ms, sweep: 72 ms, compact: 0 ms>
<GC(27): refs: soft 9 (age >= 32), weak 0, final 0, phantom 0>
<GC(28): GC cycle started Sun May 23 18:30:14 2004
<GC(28): freed 64 bytes, 91% free (245410760/268368384), in 159 ms>
<GC(28): mark: 95 ms, sweep: 64 ms, compact: 0 ms>
<GC(28): refs: soft 0 (age >= 32), weak 0, final 0, phantom 0>
<AF[22]: completed in 4309 ms>

The Unsuccessful Garbage Collection

+227951 <AF[23]: Allocation Failure. need 46511936 bytes, 2570785 ms since last
AF>
+227952 <AF[23]: managing allocation failure, action=1 (120881744/254949968) (13
418336/13418416)>
+227953 <GC(29): GC cycle started Sun May 23 19:13:07 2004
+227954 <GC(29): freed 64441840 bytes, 74% free (198741920/268368384), in 2219 m
s>
+227955 <GC(29): mark: 101 ms, sweep: 1673 ms, compact: 445 ms>
+227956 <GC(29): refs: soft 0 (age >= 6), weak 0, final 77, phantom 0>
+227957 <GC(29): moved 107994 objects, 5137272 bytes, reason=1, used 40 more b
ytes>
+227958 <AF[23]: managing allocation failure, action=3 (198741920/268368384)>
+227959 <AF[23]: managing allocation failure, action=4 (198741920/268368384)>
+227960 <AF[23]: clearing all remaining soft refs>
+227961 <GC(30): GC cycle started Sun May 23 19:13:07 2004
+227962 <GC(30): freed 26560 bytes, 74% free (198768480/268368384), in 160 ms>
+227963 <GC(30): mark: 94 ms, sweep: 66 ms, compact: 0 ms>
+227964 <GC(30): refs: soft 9 (age >= 6), weak 0, final 0, phantom 0>
+227965 <GC(31): GC cycle started Sun May 23 19:13:08 2004
+227966 <GC(31): freed 1784 bytes, 74% free (198770264/268368384), in 412 ms>
+227967 <GC(31): mark: 95 ms, sweep: 65 ms, compact: 252 ms>
+227968 <GC(31): refs: soft 0 (age >= 32), weak 0, final 0, phantom 0>
+227969 <GC(31): moved 4728 objects, 315864 bytes, reason=1>
+227970 <AF[23]: managing allocation failure, action=6 (198770264/268368384)>
+227971 <AF[23]: totally out of heap space>
+227972 <AF[23]: completed in 2805 ms>

The Next Trial: Success!!

+228397 <AF[24]: Allocation Failure. need 46511976 bytes, 500396 ms since last A
F>
+228398 <AF[24]: managing allocation failure, action=2 (162426736/268368384)>
+228399 <GC(32): GC cycle started Sun May 23 19:21:30 2004
+228400 <GC(32): freed 82930848 bytes, 91% free (245357584/268368384), in 2157 m
s>
+228401 <GC(32): mark: 100 ms, sweep: 1618 ms, compact: 439 ms>
+228402 <GC(32): refs: soft 0 (age >= 32), weak 0, final 36, phantom 0>
+228403 <GC(32): moved 106166 objects, 5030432 bytes, reason=1, used 32 more b
ytes>
+228404 <AF[24]: completed in 2157 ms>

Another Trial: Failure

+228840 <AF[25]: Allocation Failure. need 46511936 bytes, 3652996 ms since last
AF>
+228841 <AF[25]: managing allocation failure, action=1 (119467040/254949968) (13
418336/13418416)>
+228842 <GC(33): GC cycle started Sun May 23 20:22:25 2004
+228843 <GC(33): freed 65786720 bytes, 74% free (198672096/268368384), in 2242 m
s>
+228844 <GC(33): mark: 102 ms, sweep: 1690 ms, compact: 450 ms>
+228845 <GC(33): refs: soft 0 (age >= 32), weak 0, final 90, phantom 0>
+228846 <GC(33): moved 107637 objects, 4968144 bytes, reason=1, used 184 more
bytes>
+228847 <AF[25]: managing allocation failure, action=3 (198672096/268368384)>
+228848 <AF[25]: managing allocation failure, action=4 (198672096/268368384)>
+228849 <AF[25]: clearing all remaining soft refs>
+228850 <GC(34): GC cycle started Sun May 23 20:22:26 2004
+228851 <GC(34): freed 12864 bytes, 74% free (198684960/268368384), in 171 ms>
+228852 <GC(34): mark: 105 ms, sweep: 66 ms, compact: 0 ms>
+228853 <GC(34): refs: soft 9 (age >= 32), weak 0, final 0, phantom 0>
+228854 <GC(35): GC cycle started Sun May 23 20:22:26 2004
+228855 <GC(35): freed 584 bytes, 74% free (198685544/268368384), in 407 ms>
+228856 <GC(35): mark: 97 ms, sweep: 66 ms, compact: 244 ms>
+228857 <GC(35): refs: soft 0 (age >= 32), weak 0, final 0, phantom 0>
+228858 <GC(35): moved 59 objects, 2928 bytes, reason=1>
+228859 <AF[25]: managing allocation failure, action=6 (198685544/268368384)>
+228860 <AF[25]: totally out of heap space>
+228861 <AF[25]: completed in 2833 ms>

Next Trial: Success

<AF[26]: Allocation Failure. need 46511480 bytes, 867829 ms since last AF>
<AF[26]: managing allocation failure, action=2 (161879528/268368384)>
<GC(36): GC cycle started Sun May 23 20:36:56 2004
<GC(36): freed 81138288 bytes, 90% free (243017816/268368384), in 1763 ms>
<GC(36): mark: 100 ms, sweep: 1663 ms, compact: 0 ms>
<GC(36): refs: soft 0 (age >= 32), weak 0, final 42, phantom 0>
<AF[26]: completed in 1765 ms>

Failure :

+229717 <AF[27]: Allocation Failure. need 46511952 bytes, 800224 ms since last A
F>
+229718 <AF[27]: managing allocation failure, action=1 (122001576/254949968) (13
418336/13418416)>
+229719 <GC(37): GC cycle started Sun May 23 20:50:18 2004
+229720 <GC(37): freed 63370960 bytes, 74% free (198790872/268368384), in 2185 m
s>
+229721 <GC(37): mark: 101 ms, sweep: 1649 ms, compact: 435 ms>
+229722 <GC(37): refs: soft 0 (age >= 32), weak 0, final 41, phantom 0>
+229723 <GC(37): moved 105128 objects, 5047152 bytes, reason=1, used 104 more
bytes>
+229724 <AF[27]: managing allocation failure, action=3 (198790872/268368384)>
+229725 <AF[27]: managing allocation failure, action=4 (198790872/268368384)>
+229726 <AF[27]: clearing all remaining soft refs>
+229727 <GC(38): GC cycle started Sun May 23 20:50:18 2004
+229728 <GC(38): freed 29152 bytes, 74% free (198820024/268368384), in 164 ms>
+229729 <GC(38): mark: 98 ms, sweep: 66 ms, compact: 0 ms>
+229730 <GC(38): refs: soft 9 (age >= 32), weak 0, final 0, phantom 0>
+229731 <GC(39): GC cycle started Sun May 23 20:50:19 2004
+229732 <GC(39): freed 2592 bytes, 74% free (198822616/268368384), in 421 ms>
+229733 <GC(39): mark: 96 ms, sweep: 64 ms, compact: 261 ms>
+229734 <GC(39): refs: soft 0 (age >= 32), weak 0, final 0, phantom 0>
+229735 <GC(39): moved 9830 objects, 510056 bytes, reason=1>
+229736 <AF[27]: managing allocation failure, action=6 (198822616/268368384)>
+229737 <AF[27]: totally out of heap space>
+229738 <AF[27]: completed in 2783 ms>

Success:

<AF[28]: Allocation Failure. need 46511824 bytes, 951781 ms since last AF>
<AF[28]: managing allocation failure, action=2 (165643864/268368384)>
<GC(40): GC cycle started Sun May 23 21:06:12 2004
<GC(40): freed 77349104 bytes, 90% free (242992968/268368384), in 1622 ms>
<GC(40): mark: 114 ms, sweep: 1508 ms, compact: 0 ms>
<GC(40): refs: soft 0 (age >= 32), weak 0, final 34, phantom 0>
<AF[28]: completed in 1622 ms>

Ben_

unread,

May 25, 2004, 3:39:06 PM5/25/04

to

> 1) Can I increase the heap size to 1024M? If so, are there any
consequences?

Technically possible, but you have to weight this with the fact that it will
take much more time to GC, and having a larger heap may only make problems
appear later, if there is an implementation problem in the code. As you
already identified I also believe it's a fragmentation issue (there may be
40 Mb heap free but not contiguously and GC cycles and heap compaction did
not help). I understand from the exception stacktrace that you're processing
PDFs on the fly (AdobeProcessManager). It may be legitimate to have to
allocate 40 Mb heap but I would have a look at the code to see if there are
means to allocate smaller chunks of contiguous heap (arrays require
contiguous memory) or use different object (vectors instead of arrays, for
example).

Additionaly, the sample you posted shows GC cycles up to four seconds. You
should ensure there is enough physical memory and the memory is not swapped
(it dramatically impacts GC). Also, I would expect that double the memory
would be more than double the time to GC, but I may be wrong on this.

> 2) How to trigger the WAS heap dump when needed?

I never tried.

> 3) If I can change the JDK for a better garbage collection. As you know we
get the IBM JVM 1.3.1 with the WAS 4.0.2.

You can't, but anyway, it don't see in what it would help in this case.

> 4) If we should migrate to WAS 5.1?

To have JDK 1.4 and J2EE 1.3 ? Again, I don't see in what it would help.

> 5) If I am off track?

You already made a good job at tracking the problem. I would try to
determine if it is legitimate / good design that the application requires 40
MB heap at once. It may be a coincidence, but GC log shows it always needs
46511xxx bytes. It's intriguing, but could simply be because you make the
exact same operation.

Hope this helps already.

Glen Burson

unread,

May 26, 2004, 4:17:27 AM5/26/04

to

Hi Srinivas,

Firstly I'd have a look at the application to see why it's requesting 40mb
of memory - it's far too much for the GC to be efficient. It should be split
up into much smaller chunks. I can recommend a really good document around
this subject: -

http://www-106.ibm.com/developerworks/java/jdk/diagnosis/

In answer to your questions: -

1. Yes you can increase the heap to 1024M, but it probably won't fix your
problem and may actually make the application performance worse if you
haven't enough real memory to support it.
2. To trigger a heap dump, add this to your startupServer.sh (at the start);

IBM_JAVA_MMAP_JAVA_HEAP=true
export IBM_JAVA_MMAP_JAVA_HEAP

Then a "kill -3 <process id>" will trigger a heap dump. This will also cause
a heap dump when you get an OutOfMemory exception.

3. Don't think you can upgrade the JVM, although there is a patch to fix a
known memory leak in the JVM ORB - see PQ69054.

4. I wouldn't migrate just to fix this problem without fully understanding
it. If the problem is an application one, you will have the same problem
regardless of the WebSphere version.

Cheers,

Glen.

"srinu...@yahoo.co" <srini...@yahoo.com> wrote in message
news:1105871298.1085502916651.JavaMail.wasadmin@swg3ws006...

paul_...@uk.ibm.com

unread,

May 27, 2004, 4:47:54 AM5/27/04

to

There's some new guidance that says setting -Xms and -Xmx equal can cause heap fragmentation. That's because the initial server startup will be free to splat objects all over the complete heap, and some of them are eternal, leaving fixed blocks of memory active. So the recommendation is to have -Xms lower than -Xmx. That forces garbage collection during the early life of the server, so these objects get allocated in a smaller area of memory. I'd try -Xms 128M -Xmx 512M, and see if that helps.

Looking at your verbose:gc output, there's something funny going on. It says you have a bit over 256M, not 512M. (all those number pairs in brackets are free and max memory). Your final out-of-memory error appears to be occurring at 256Mb. You're also taking a very long time for the sweep process of a moderately sized heap - I'd expect less than a second. Are you paging?

I suspect something in your configuration is wrong - maybe you're specifying -Xmx in two different places. With 256Mb heap and WebSphere in there, I'm not surprised you sometimes can't get 40 Mb contiguous.

Ben_

unread,

Jun 20, 2004, 5:50:37 AM6/20/04

to

Hello,

> There's some new guidance that says setting -Xms and -Xmx equal can cause
> heap fragmentation.

I'm very interested in references on this, because it's what we're used to
do and also encountered problems with large heap block allocations.

Do you have information to share ?

Thanks.

Glen Burson

unread,

Jun 21, 2004, 5:38:33 AM6/21/04

to

Hi Ben,

I think the problem is that there are typically a lot of JNI allocations in
the heap that are pinned and are allocated for the life of the application.
Most of these are allocated during startup. If the min and max heap sizes
are the same, these pinned allocations are scattered throughout the heap.
Whereas if the min heap size is quite low, most of these allocations will be
closer together at the start of the heap, leaving the bulk of the heap (when
it's expanded) more free of pinned memory.

I did read this somewhere but can't remember which article...

Cheers,
Glen.

"Ben_" <re...@newsgroup.com> wrote in message
news:cb3mla$9pok$1...@news.boulder.ibm.com...

gius...@web-systems.it

unread,

Jun 23, 2004, 5:16:12 AM6/23/04

to

I had an Out of Memory error using the object CachedXPathAPI instead
of XPathAPI (that solved the problem)...,but my application is slower
because the class XPathAPI is a static class...wherever it doesn't
cause the out of memory error

I'm using was 5.0.2 could it be a garbage collection problem?????

p.s. i tried to increase the heap, but i had the same problem

diazdiez

unread,

Jun 24, 2004, 5:06:16 AM6/24/04

to

Hi,
An application server can record "totally out of heap space" or "insufficient heap space to satisfy allocation request" when the Java™ virtual machine (JVM™) heap space is fragmented and there is not enough contiguous space to allocate an object to the heap.
At this point I think that upgrading JDK service release can help you with this issue.
There is an apar (PQ82989)with the following description:
ERROR DESCRIPTION:
OutOfMemoryError occurs for a large size of an allocation
request,even though there are enough free space in the heap
PROBLEM SUMMARY:
One of the reasons for Heap fragmentation are dosed objects.
Because of heap fragementation, request for allocation of
big objects was failing.
PROBLEM CONCLUSION:
We dose objects on each mark phase but only clear dosed
bits on completion of a compaction phase of GC.Feature 5646
addresses this by only dosing on a compaction on JDK141.
We have backported this to 131 as part of this defect.
This problem is solved in SDK 20031230 or later.
In the following link is available the service release 7 for jdk 1.3.1:
http://www-1.ibm.com/support/docview.wss?rs=180&context=SSEQTP&q1=service+release+7&uid=swg24006479&loc=en_US&cs=utf-8&lang=en+en

Also it clould be interested the following technote:
http://www-1.ibm.com/support/docview.wss?rs=180&context=SSEQTP&q1=allocation+jvm&uid=swg21164885&loc=en_US&cs=utf-8&lang=en+en

I hope this help you.

diazdiez

unread,

Jun 24, 2004, 5:09:04 AM6/24/04

to

This technote also could be useful:
http://www-1.ibm.com/support/docview.wss?rs=180&context=SSEQTP&q1=allocation+jvm&q2=swprofiler&uid=swg21162314&loc=en_US&cs=utf-8&lang=en

Glen Burson

unread,

Jun 25, 2004, 5:45:08 AM6/25/04

to

Thanks for that, very useful information...

Cheers,
Glen.

"diazdiez" <diaz...@mixmail.com> wrote in message
news:1261326538.1088067976359.JavaMail.wasadmin@swg3ws006...

Hi,
An application server can record "totally out of heap space" or

"insufficient heap space to satisfy allocation request" when the JavaT
virtual machine (JVMT) heap space is fragmented and there is not enough

Frankie Li

unread,

Jun 25, 2004, 9:52:01 PM6/25/04

to

"Glen Burson" <gbu...@salmon.ltd.uk> wrote in message news:<cbgo2h$7fek$1...@news.boulder.ibm.com>...

> Thanks for that, very useful information...
>
> Cheers,
> Glen.
>
> "diazdiez" <diaz...@mixmail.com> wrote in message
> news:1261326538.1088067976359.JavaMail.wasadmin@swg3ws006...

> Also it clould be interested the following technote:
> http://www-1.ibm.com/support/docview.wss?rs=180&context=SSEQTP&q1=allocation
> +jvm&uid=swg21164885&loc=en_US&cs=utf-8&lang=en+en
>

Yeah, this is an interesting technote. Thanks. I notice from the
technote that min thread and max thread is recommended to be the same
value so as to avoid fregmentation when the thread pool growth.

Is this the practice for most of the WAS user outside?

Steen Jensen

unread,

Sep 28, 2004, 5:19:06 AM9/28/04

to

Hey,

this looks like the kind of tool I need to track down a leak in my
application, but.....

where on earth does it output ?
When I start my appserver with the startServer.bat file it logs alot of
stuff there, but when the bat file is done, when does the output end up ?

/Steen Jensen

"diazdiez" <diaz...@mixmail.com> skrev i en meddelelse
news:529600731.1088068143957.JavaMail.wasadmin@swg3ws006...

Ken Hygh

unread,

Sep 28, 2004, 6:00:14 AM9/28/04

to

Steen Jensen wrote:

....\WebSphere\AppServer\logs\<servername>\xxx.log

Ken