I have an application (production) On WAS 4.0.2 running on IBM AIX. It didn't have any problem for a year. Now I see a lot of "java.lang.OutOfMemoryError" errors.
I enabled the Verbose GC , analyzed the Allocation Failures and noticed that the application server is running out of memory when it needs more than ~40MB. The Heapsize for the application server is 512MB (-Xmx and -Xms). Surprisingly it works fine sometimes. Which made me think that the issue could be a "Heap fragmentation" issue.
I also enabled the heap profiling but failed to get the heapdump to the file that I created. I downloaded the HAT(Heap Analysis Tool) to see if there are any memory leaks. But the tool needs the binary format of the heap dump.
I am also enclosing the successful and unsuccessful Verbose GC log.
I'd be grateful to you, if you could let me know
1) Can I increase the heap size to 1024M? If so, are there any consequences?
2) How to trigger the WAS heap dump when needed?
3) If I can change the JDK for a better garbage collection. As you know we get the IBM JVM 1.3.1 with the WAS 4.0.2.
4) If we should migrate to WAS 5.1?
5) If I am off track?
I apologize, if I am not clear. Please let me know, if I need to post any other information.
Thanks in advance,
Srinivas.
The following is the Verbose GC log for the recreation of the problem that we have.
The Successful Garbage Collection :
<AF[22]: Allocation Failure. need 46511728 bytes, 2055340 ms since last AF>
<AF[22]: managing allocation failure, action=1 (89027200/254700928) (13401048/13
at com.toyota.owk.framework.broker.document.AdobeProcessManager$AdobeTim
eOutTimer.run(AdobeProcessManager.java:113)
<AF[22]: Allocation Failure. need 46511728 bytes, 2055340 ms since last AF>
<AF[22]: managing allocation failure, action=1 (89027200/254700928) (13401048/13
405312)>
<GC(26): GC cycle started Sun May 23 18:30:14 2004
<GC(26): freed 96076480 bytes, 74% free (198504728/268106240), in 3933 ms>
<GC(26): mark: 112 ms, sweep: 3234 ms, compact: 587 ms>
<GC(26): refs: soft 0 (age >= 32), weak 0, final 87, phantom 0>
<GC(26): moved 115393 objects, 52220456 bytes, reason=1, used 32 more bytes>
<AF[22]: managing allocation failure, action=3 (198504728/268106240)>
<GC(26): need to expand mark bits for 268368384-byte heap>
<GC(26): expanded mark bits by 4096 to 4194304 bytes>
<GC(26): need to expand alloc bits for 268368384-byte heap>
<GC(26): expanded alloc bits by 4096 to 4194304 bytes>
<GC(26): expanded heap fully by 262144 to 268368384 bytes, 74% free>
<AF[22]: managing allocation failure, action=4 (198766872/268368384)>
<AF[22]: clearing all remaining soft refs>
<GC(27): GC cycle started Sun May 23 18:30:14 2004
<GC(27): freed 46643824 bytes, 91% free (245410696/268368384), in 168 ms>
<GC(27): mark: 96 ms, sweep: 72 ms, compact: 0 ms>
<GC(27): refs: soft 9 (age >= 32), weak 0, final 0, phantom 0>
<GC(28): GC cycle started Sun May 23 18:30:14 2004
<GC(28): freed 64 bytes, 91% free (245410760/268368384), in 159 ms>
<GC(28): mark: 95 ms, sweep: 64 ms, compact: 0 ms>
<GC(28): refs: soft 0 (age >= 32), weak 0, final 0, phantom 0>
<AF[22]: completed in 4309 ms>
The Unsuccessful Garbage Collection
+227951 <AF[23]: Allocation Failure. need 46511936 bytes, 2570785 ms since last
AF>
+227952 <AF[23]: managing allocation failure, action=1 (120881744/254949968) (13
418336/13418416)>
+227953 <GC(29): GC cycle started Sun May 23 19:13:07 2004
+227954 <GC(29): freed 64441840 bytes, 74% free (198741920/268368384), in 2219 m
s>
+227955 <GC(29): mark: 101 ms, sweep: 1673 ms, compact: 445 ms>
+227956 <GC(29): refs: soft 0 (age >= 6), weak 0, final 77, phantom 0>
+227957 <GC(29): moved 107994 objects, 5137272 bytes, reason=1, used 40 more b
ytes>
+227958 <AF[23]: managing allocation failure, action=3 (198741920/268368384)>
+227959 <AF[23]: managing allocation failure, action=4 (198741920/268368384)>
+227960 <AF[23]: clearing all remaining soft refs>
+227961 <GC(30): GC cycle started Sun May 23 19:13:07 2004
+227962 <GC(30): freed 26560 bytes, 74% free (198768480/268368384), in 160 ms>
+227963 <GC(30): mark: 94 ms, sweep: 66 ms, compact: 0 ms>
+227964 <GC(30): refs: soft 9 (age >= 6), weak 0, final 0, phantom 0>
+227965 <GC(31): GC cycle started Sun May 23 19:13:08 2004
+227966 <GC(31): freed 1784 bytes, 74% free (198770264/268368384), in 412 ms>
+227967 <GC(31): mark: 95 ms, sweep: 65 ms, compact: 252 ms>
+227968 <GC(31): refs: soft 0 (age >= 32), weak 0, final 0, phantom 0>
+227969 <GC(31): moved 4728 objects, 315864 bytes, reason=1>
+227970 <AF[23]: managing allocation failure, action=6 (198770264/268368384)>
+227971 <AF[23]: totally out of heap space>
+227972 <AF[23]: completed in 2805 ms>
The Next Trial: Success!!
+228397 <AF[24]: Allocation Failure. need 46511976 bytes, 500396 ms since last A
F>
+228398 <AF[24]: managing allocation failure, action=2 (162426736/268368384)>
+228399 <GC(32): GC cycle started Sun May 23 19:21:30 2004
+228400 <GC(32): freed 82930848 bytes, 91% free (245357584/268368384), in 2157 m
s>
+228401 <GC(32): mark: 100 ms, sweep: 1618 ms, compact: 439 ms>
+228402 <GC(32): refs: soft 0 (age >= 32), weak 0, final 36, phantom 0>
+228403 <GC(32): moved 106166 objects, 5030432 bytes, reason=1, used 32 more b
ytes>
+228404 <AF[24]: completed in 2157 ms>
Another Trial: Failure
+228840 <AF[25]: Allocation Failure. need 46511936 bytes, 3652996 ms since last
AF>
+228841 <AF[25]: managing allocation failure, action=1 (119467040/254949968) (13
418336/13418416)>
+228842 <GC(33): GC cycle started Sun May 23 20:22:25 2004
+228843 <GC(33): freed 65786720 bytes, 74% free (198672096/268368384), in 2242 m
s>
+228844 <GC(33): mark: 102 ms, sweep: 1690 ms, compact: 450 ms>
+228845 <GC(33): refs: soft 0 (age >= 32), weak 0, final 90, phantom 0>
+228846 <GC(33): moved 107637 objects, 4968144 bytes, reason=1, used 184 more
bytes>
+228847 <AF[25]: managing allocation failure, action=3 (198672096/268368384)>
+228848 <AF[25]: managing allocation failure, action=4 (198672096/268368384)>
+228849 <AF[25]: clearing all remaining soft refs>
+228850 <GC(34): GC cycle started Sun May 23 20:22:26 2004
+228851 <GC(34): freed 12864 bytes, 74% free (198684960/268368384), in 171 ms>
+228852 <GC(34): mark: 105 ms, sweep: 66 ms, compact: 0 ms>
+228853 <GC(34): refs: soft 9 (age >= 32), weak 0, final 0, phantom 0>
+228854 <GC(35): GC cycle started Sun May 23 20:22:26 2004
+228855 <GC(35): freed 584 bytes, 74% free (198685544/268368384), in 407 ms>
+228856 <GC(35): mark: 97 ms, sweep: 66 ms, compact: 244 ms>
+228857 <GC(35): refs: soft 0 (age >= 32), weak 0, final 0, phantom 0>
+228858 <GC(35): moved 59 objects, 2928 bytes, reason=1>
+228859 <AF[25]: managing allocation failure, action=6 (198685544/268368384)>
+228860 <AF[25]: totally out of heap space>
+228861 <AF[25]: completed in 2833 ms>
Next Trial: Success
<AF[26]: Allocation Failure. need 46511480 bytes, 867829 ms since last AF>
<AF[26]: managing allocation failure, action=2 (161879528/268368384)>
<GC(36): GC cycle started Sun May 23 20:36:56 2004
<GC(36): freed 81138288 bytes, 90% free (243017816/268368384), in 1763 ms>
<GC(36): mark: 100 ms, sweep: 1663 ms, compact: 0 ms>
<GC(36): refs: soft 0 (age >= 32), weak 0, final 42, phantom 0>
<AF[26]: completed in 1765 ms>
Failure :
+229717 <AF[27]: Allocation Failure. need 46511952 bytes, 800224 ms since last A
F>
+229718 <AF[27]: managing allocation failure, action=1 (122001576/254949968) (13
418336/13418416)>
+229719 <GC(37): GC cycle started Sun May 23 20:50:18 2004
+229720 <GC(37): freed 63370960 bytes, 74% free (198790872/268368384), in 2185 m
s>
+229721 <GC(37): mark: 101 ms, sweep: 1649 ms, compact: 435 ms>
+229722 <GC(37): refs: soft 0 (age >= 32), weak 0, final 41, phantom 0>
+229723 <GC(37): moved 105128 objects, 5047152 bytes, reason=1, used 104 more
bytes>
+229724 <AF[27]: managing allocation failure, action=3 (198790872/268368384)>
+229725 <AF[27]: managing allocation failure, action=4 (198790872/268368384)>
+229726 <AF[27]: clearing all remaining soft refs>
+229727 <GC(38): GC cycle started Sun May 23 20:50:18 2004
+229728 <GC(38): freed 29152 bytes, 74% free (198820024/268368384), in 164 ms>
+229729 <GC(38): mark: 98 ms, sweep: 66 ms, compact: 0 ms>
+229730 <GC(38): refs: soft 9 (age >= 32), weak 0, final 0, phantom 0>
+229731 <GC(39): GC cycle started Sun May 23 20:50:19 2004
+229732 <GC(39): freed 2592 bytes, 74% free (198822616/268368384), in 421 ms>
+229733 <GC(39): mark: 96 ms, sweep: 64 ms, compact: 261 ms>
+229734 <GC(39): refs: soft 0 (age >= 32), weak 0, final 0, phantom 0>
+229735 <GC(39): moved 9830 objects, 510056 bytes, reason=1>
+229736 <AF[27]: managing allocation failure, action=6 (198822616/268368384)>
+229737 <AF[27]: totally out of heap space>
+229738 <AF[27]: completed in 2783 ms>
Success:
<AF[28]: Allocation Failure. need 46511824 bytes, 951781 ms since last AF>
<AF[28]: managing allocation failure, action=2 (165643864/268368384)>
<GC(40): GC cycle started Sun May 23 21:06:12 2004
<GC(40): freed 77349104 bytes, 90% free (242992968/268368384), in 1622 ms>
<GC(40): mark: 114 ms, sweep: 1508 ms, compact: 0 ms>
<GC(40): refs: soft 0 (age >= 32), weak 0, final 34, phantom 0>
<AF[28]: completed in 1622 ms>
Additionaly, the sample you posted shows GC cycles up to four seconds. You
should ensure there is enough physical memory and the memory is not swapped
(it dramatically impacts GC). Also, I would expect that double the memory
would be more than double the time to GC, but I may be wrong on this.
> 2) How to trigger the WAS heap dump when needed?
I never tried.
> 3) If I can change the JDK for a better garbage collection. As you know we
get the IBM JVM 1.3.1 with the WAS 4.0.2.
You can't, but anyway, it don't see in what it would help in this case.
> 4) If we should migrate to WAS 5.1?
To have JDK 1.4 and J2EE 1.3 ? Again, I don't see in what it would help.
> 5) If I am off track?
You already made a good job at tracking the problem. I would try to
determine if it is legitimate / good design that the application requires 40
MB heap at once. It may be a coincidence, but GC log shows it always needs
46511xxx bytes. It's intriguing, but could simply be because you make the
exact same operation.
Hope this helps already.
Firstly I'd have a look at the application to see why it's requesting 40mb
of memory - it's far too much for the GC to be efficient. It should be split
up into much smaller chunks. I can recommend a really good document around
this subject: -
http://www-106.ibm.com/developerworks/java/jdk/diagnosis/
In answer to your questions: -
1. Yes you can increase the heap to 1024M, but it probably won't fix your
problem and may actually make the application performance worse if you
haven't enough real memory to support it.
2. To trigger a heap dump, add this to your startupServer.sh (at the start);
IBM_JAVA_MMAP_JAVA_HEAP=true
export IBM_JAVA_MMAP_JAVA_HEAP
Then a "kill -3 <process id>" will trigger a heap dump. This will also cause
a heap dump when you get an OutOfMemory exception.
3. Don't think you can upgrade the JVM, although there is a patch to fix a
known memory leak in the JVM ORB - see PQ69054.
4. I wouldn't migrate just to fix this problem without fully understanding
it. If the problem is an application one, you will have the same problem
regardless of the WebSphere version.
Cheers,
Glen.
"srinu...@yahoo.co" <srini...@yahoo.com> wrote in message
news:1105871298.1085502916651.JavaMail.wasadmin@swg3ws006...
> There's some new guidance that says setting -Xms and -Xmx equal can cause
> heap fragmentation.
I'm very interested in references on this, because it's what we're used to
do and also encountered problems with large heap block allocations.
Do you have information to share ?
Thanks.
I think the problem is that there are typically a lot of JNI allocations in
the heap that are pinned and are allocated for the life of the application.
Most of these are allocated during startup. If the min and max heap sizes
are the same, these pinned allocations are scattered throughout the heap.
Whereas if the min heap size is quite low, most of these allocations will be
closer together at the start of the heap, leaving the bulk of the heap (when
it's expanded) more free of pinned memory.
I did read this somewhere but can't remember which article...
Cheers,
Glen.
"Ben_" <re...@newsgroup.com> wrote in message
news:cb3mla$9pok$1...@news.boulder.ibm.com...
I'm using was 5.0.2 could it be a garbage collection problem?????
p.s. i tried to increase the heap, but i had the same problem
Also it clould be interested the following technote:
http://www-1.ibm.com/support/docview.wss?rs=180&context=SSEQTP&q1=allocation+jvm&uid=swg21164885&loc=en_US&cs=utf-8&lang=en+en
I hope this help you.
Cheers,
Glen.
"diazdiez" <diaz...@mixmail.com> wrote in message
news:1261326538.1088067976359.JavaMail.wasadmin@swg3ws006...
Hi,
An application server can record "totally out of heap space" or
"insufficient heap space to satisfy allocation request" when the JavaT
virtual machine (JVMT) heap space is fragmented and there is not enough
Is this the practice for most of the WAS user outside?
this looks like the kind of tool I need to track down a leak in my
application, but.....
where on earth does it output ?
When I start my appserver with the startServer.bat file it logs alot of
stuff there, but when the bat file is done, when does the output end up ?
/Steen Jensen
"diazdiez" <diaz...@mixmail.com> skrev i en meddelelse
news:529600731.1088068143957.JavaMail.wasadmin@swg3ws006...
....\WebSphere\AppServer\logs\<servername>\xxx.log
Ken