
strange memory leak


p595pimp

Sep 25, 2007, 5:33:10 AM
we have been experiencing a strange memory problem with a new p550
server that is going to host a new JDE EnterpriseOne implementation.
So far, oracle & JDE are installed, and users are able to log in,
etc... everything seems to be fine EXCEPT...
the machine shipped with 8GB memory, as specified by the JDE/Oracle
requirements... all of the necessary AIX & application prereq.
filesets & patches are installed (java, c-compiler, etc...)..
everything looks good except that the machine is hitting 80% memory
usage with only 12 users logged in!!! when we go live in a couple of
weeks, there will be approx. 80 users.
I understand that we will probably need to add more memory (we're
thinking of going to 24GB), but still... the system should not be
hitting 80% of 8GB with only 12 users logged in. Also, we're
averaging 29% paging.
I don't know a lot about oracle/jde, and there are a lot of processes
running from both... i'm suspecting the problem may be within the app
side... but the jde implementation guys are saying they don't know
what's wrong.
here is some output below... does anybody have any idea what might be
causing this, or have any suggestions?
Thanks!
-P

(if anybody wants to see the process list let me know)
c-dev:/ # bootinfo -r
8126464

c-prod:/ #lsattr -El sys0 -a realmem
realmem 8126464 Amount of usable physical memory in Kbytes False

c-dev:/ # lsattr -El sys0 -a realmem
realmem 8126464 Amount of usable physical memory in Kbytes False

c-prod:/ #lsps -s
Total Paging Space Percent Used
8192MB 29%

c-dev:/ # lsps -s
Total Paging Space Percent Used
8192MB 11%

LOGGED ERROR REPORTS:

c-prod:/ #errpt
IDENTIFIER TIMESTAMP T C RESOURCE_NAME DESCRIPTION
F7FA22C9 0820162407 I O SYSJ2 UNABLE TO ALLOCATE SPACE IN FILE SYSTEM
2F3E09A4 0725162607 I H hdisk0 REPAIR ACTION
2F3E09A4 0725162607 I H scsi1 REPAIR ACTION
2F3E09A4 0725162607 I H scsi0 REPAIR ACTION
2F3E09A4 0725162607 I H sisscsia0 REPAIR ACTION
2F3E09A4 0725162607 I H scsi1 REPAIR ACTION
2F3E09A4 0725162607 I H scsi0 REPAIR ACTION
2F3E09A4 0725162607 I H ent1 REPAIR ACTION
2F3E09A4 0725162607 I H ent0 REPAIR ACTION
2F3E09A4 0725162607 I H cd0 REPAIR ACTION
2F3E09A4 0725162607 I H fcs1 REPAIR ACTION
2F3E09A4 0725162607 I H fscsi1 REPAIR ACTION
2F3E09A4 0725162607 I H fcnet1 REPAIR ACTION
2F3E09A4 0725162607 I H fcs0 REPAIR ACTION
2F3E09A4 0725162607 I H fscsi0 REPAIR ACTION
2F3E09A4 0725162607 I H fcnet0 REPAIR ACTION
2F3E09A4 0725162607 I H scsi3 REPAIR ACTION
2F3E09A4 0725162607 I H ses1 REPAIR ACTION
2F3E09A4 0725162607 I H hdisk1 REPAIR ACTION
2F3E09A4 0725162607 I H scsi2 REPAIR ACTION
2F3E09A4 0725162607 I H sisscsia1 REPAIR ACTION
2F3E09A4 0725162607 I H scsi3 REPAIR ACTION
2F3E09A4 0725162607 I H scsi2 REPAIR ACTION
2F3E09A4 0725162607 I H fcs3 REPAIR ACTION
2F3E09A4 0725162607 I H fscsi3 REPAIR ACTION
2F3E09A4 0725162607 I H fcnet3 REPAIR ACTION
2F3E09A4 0725162607 I H fcs2 REPAIR ACTION
2F3E09A4 0725162607 I H fscsi2 REPAIR ACTION
2F3E09A4 0725162607 I H fcnet2 REPAIR ACTION
2F3E09A4 0725162607 I H usbhc1 REPAIR ACTION
2F3E09A4 0725162607 I H usbhc0 REPAIR ACTION
2F3E09A4 0725162607 I H ent3 REPAIR ACTION
2F3E09A4 0725162607 I H ent2 REPAIR ACTION
2F3E09A4 0725162607 I H sysplanar0 REPAIR ACTION
2F3E09A4 0725162607 I H sys0 REPAIR ACTION

c-dev:/ # errpt
IDENTIFIER TIMESTAMP T C RESOURCE_NAME DESCRIPTION
E142C6D4 0801133707 T H sysplanar0 EEH temporary error for adapter
BE0A03E5 0801133707 P H sysplanar0 ENVIRONMENTAL PROBLEM
2F3E09A4 0725162307 I H sisscsia1 REPAIR ACTION
2F3E09A4 0725162307 I H scsi3 REPAIR ACTION
2F3E09A4 0725162307 I H scsi2 REPAIR ACTION
2F3E09A4 0725162307 I H lai0 REPAIR ACTION
2F3E09A4 0725162307 I H fcs2 REPAIR ACTION
2F3E09A4 0725162307 I H fscsi2 REPAIR ACTION
2F3E09A4 0725162307 I H fcnet2 REPAIR ACTION
2F3E09A4 0725162307 I H fcs1 REPAIR ACTION
2F3E09A4 0725162307 I H fscsi1 REPAIR ACTION
2F3E09A4 0725162307 I H fcnet1 REPAIR ACTION
2F3E09A4 0725162307 I H usbhc1 REPAIR ACTION
2F3E09A4 0725162307 I H usbhc0 REPAIR ACTION
2F3E09A4 0725162307 I H ent3 REPAIR ACTION
2F3E09A4 0725162307 I H ent2 REPAIR ACTION
2F3E09A4 0725162307 I H sysplanar0 REPAIR ACTION
2F3E09A4 0725162307 I H sys0 REPAIR ACTION

TOPAS OUTPUT:

dev:/ # topas
Topas Monitor for host:    c-dev                EVENTS/QUEUES    FILE/TTY
Tue Sep  4 15:26:26 2007   Interval:  2         Cswitch     210  Readch  1100.8K0
                                                Syscall    1646  Writech   25828
Kernel    1.0   |#                           |  Reads        57  Rawin         0
User      0.3   |#                           |  Writes        6  Ttyout      819
Wait      0.0   |#                           |  Forks         0  Igets         0
Idle     98.7   |############################|  Execs         1  Namei       252
                                                Runqueue    0.5  Dirblk        0
Network  KBPS   I-Pack  O-Pack   KB-In  KB-Out  Waitqueue   0.0
lo0       2.4      9.0     9.0     1.2     1.2
en4       1.1      3.5     2.0     0.2     0.9  PAGING           MEMORY
                                                Faults      585  Real,MB    7936
Disk    Busy%     KBPS     TPS KB-Read KB-Writ  Steals        0  % Comp     76.1
hdisk1    0.0      0.0     0.0     0.0     0.0  PgspIn        0  % Noncomp  24.3
dac0      0.0     24.0     1.5     0.0    24.0  PgspOut       0  % Client   24.3
hdisk0    0.0      0.0     0.0     0.0     0.0  PageIn        0
dac1      0.0      0.0     0.0     0.0     0.0  PageOut       5  PAGING SPACE
                                                Sios          5  Size,MB    8192
Name            PID  CPU%  PgSp Owner                            % Used      9.4
oracle       897050   0.2  10.0 oracle          NFS (calls/sec)  % Free     90.5
jdenet_k    1896702   0.2  24.6 jde812          ServerV2      0
topas        671958   0.2   3.8 oracle          ClientV2      0  Press:
topas       2064542   0.2   2.3 root            ServerV3      0   "h" for help
topas        499842   0.2   3.8 root            ClientV3      0   "q" to quit


prod:/ # topas
Topas Monitor for host:    c-prod               EVENTS/QUEUES    FILE/TTY
Tue Sep  4 15:26:57 2007   Interval:  2         Cswitch     399  Readch    67900
                                                Syscall     693  Writech   26560
Kernel    0.2   |#                           |  Reads        21  Rawin         0
User      0.2   |#                           |  Writes        5  Ttyout      144
Wait      0.0   |#                           |  Forks         0  Igets         0
Idle     99.6   |############################|  Execs         0  Namei        62
                                                Runqueue    0.0  Dirblk        0
Network  KBPS   I-Pack  O-Pack   KB-In  KB-Out  Waitqueue   0.0
lo0       1.4      7.0     7.0     0.7     0.7
en4       0.2      0.5     0.5     0.0     0.2  PAGING           MEMORY
                                                Faults        6  Real,MB    7936
Disk    Busy%     KBPS     TPS KB-Read KB-Writ  Steals        0  % Comp     58.3
hdisk1    0.0      0.0     0.0     0.0     0.0  PgspIn        0  % Noncomp  41.9
dac0      0.0      0.0     0.0     0.0     0.0  PgspOut       0  % Client   41.9
dac1      0.0     30.0     3.0     0.0    30.0  PageIn        0
hdisk0    0.0      0.0     0.0     0.0     0.0  PageOut       6  PAGING SPACE
                                                Sios          6  Size,MB    8192
Name            PID  CPU%  PgSp Owner                            % Used     28.6
topas       2056332   0.1   2.0 root            NFS (calls/sec)  % Free     71.3
emagent     1425542   0.1   9.5 oracle          ServerV2      0
java         704612   0.0  17.1 root            ClientV2      0  Press:
java         520196   0.0 341.5 root            ServerV3      0   "h" for help
oracle       397542   0.0   8.3 oracle          ClientV3      0   "q" to quit


**DIAGNOSTICS (c-prod$smitty diag, etc...) were run, in normal mode on
all resources and in Problem Determination Mode, on mem0, on both
machines (prod & dev), and no problems were found

Jim....@cibc.com

Sep 25, 2007, 7:28:26 AM
On Sep 25, 5:33 am, p595pimp <christiancolb...@gmail.com> wrote:
> we have been experiencing a strange memory problem with a new p550
> server that is going to host a new JDE EnterpriseOne implementation.
> So far, oracle & JDE are installed, and users are able to log in,
> etc... everything seems to be fine EXCEPT...
> the machine shipped with 8GB memory, as specified by the JDE/Oracle
> requirements... all of the neccessary AIX & application prereq.
> filesets & patches are installed (java, c-compiler, etc...)..
> everything looks good except that the machine is hitting 80% memory
> usage with only 12 users logged in!!! when we go live in a couple of
> weeks, there will be aprox. 80 users.
> I understand that we will probably need to add more memory (we're
> thinking of going to 24GB), but still... the system should not be
> hitting 80% of 8GB with only 12 users logged in. Also, we're
> averaging 29% paging.
> I don't know alot about oracle/jde, and there are a lot of processes
> running from both... i'm suspecting the problem may be within the app
> side... but the jde implementation guys are saying they don't know
> whats wrong. so.
> here is some output below... does anybody have any idea what might be
> causing this, or have any suggestions?
> Thanks!
> -P
>

P: The 29% you quote isn't paging but rather the amount of the disk
space in your paging spaces that's occupied.
This isn't at all a problem. I have happy systems with much larger
numbers. I don't see where you're getting the
80% number from but if it is virtual memory used then that's not a
problem either. In AIX virtual memory is
supposed to be full or nearly so all the time.
Your topas output is folded over so I can't read it. The numbers to be
looking for firstly are the "paging space paging"
values. These show up on vmstat as the pi and po columns and should be
zero or nearly so all the time. Only if these are
well into the double digits consistently might you have a memory
problem. Even then it wouldn't necessarily be
a memory leak. You should suspect a leak if and only if the real
memory allocated to all running processes sums to
significantly and consistently less than total installed RAM.
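
The pi/po check described above can be sketched in a few lines of Python. This is only an illustration: the function names are made up here, the 17-column row layout is taken from the `vmstat` output pasted elsewhere in this thread, and the threshold of 10 pages and the "at least half of the samples" rule are assumptions standing in for "well into the double digits consistently".

```python
def paging_space_activity(vmstat_text):
    """Return (pi, po) pairs from the data rows of AIX `vmstat` interval output."""
    samples = []
    for line in vmstat_text.splitlines():
        fields = line.split()
        # Assumed layout: r b avm fre re pi po fr sr cy in sy cs us sy id wa
        # Header rows are skipped because they are not purely numeric.
        if len(fields) == 17 and all(f.isdigit() for f in fields):
            samples.append((int(fields[5]), int(fields[6])))
    return samples


def looks_memory_constrained(samples, threshold=10, fraction=0.5):
    """True if pi or po reaches `threshold` in at least `fraction` of the samples."""
    if not samples:
        return False
    busy = sum(1 for pi, po in samples if pi >= threshold or po >= threshold)
    return busy / len(samples) >= fraction


# Two rows copied from the vmstat output posted in this thread.
sample = """\
kthr memory page faults cpu
r b avm fre re pi po fr sr cy in sy cs us sy id wa
0 0 1579948 9578 0 0 0 0 0 0 4 370 375 0 0 99 0
1 2 1741988 8903 0 1 3 247 691 0 128 3575 434 1 1 98 0
"""
samples = paging_space_activity(sample)
print(samples)
print(looks_memory_constrained(samples))
```

Feeding it a longer capture taken under real load (for example the output of `vmstat 1 20` saved to a file) gives a more meaningful answer than two rows.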

HTH

Jim Lane

dickd

Sep 25, 2007, 12:04:21 PM
I agree with Jim. This is not something to be concerned with (yet).
Every file you touch in AIX is cached in real memory.
If you need access to the file again, it will be very quick.
If memory demands increase, and your applications need more memory
these filesystem pages will be quickly harvested for the new storage
requests.

This is normal operation.

My primary tool to decide whether I have a memory leak is "svmon".
It requires root privilege to use it ... but honestly, I don't see why,
so I always change the permissions so the "little people" can use it.

chmod 6555 /usr/sbin/svmon

The first tool I reach for is "vmstat". Watch the paging columns.
They are a quick and easy way to decide whether you have a paging
problem.

Note: I have seen customers observe this AIX behaviour, and decide
to go out and buy extra real memory, only to have that fill too.
It will always fill on AIX to the extent that there are additional
filesystem blocks that have been touched.

This behaviour is controllable in the "tunables" area of AIX,
but you really REALLY have to have a good reason to change it.
A good reason might be ... substantially less memory than 8GB.

The other tuning exercise that might cause this is to over-tune
your database and other applications to provide memory caches
that are substantially larger than available real memory.

Of course you're going to fill them ... and then the system will
start paging, and additional real memory will help.
But so will tuning-down the memory cache behaviour of the
applications.

Mark Taylor

Sep 26, 2007, 4:21:50 AM
First and foremost, protect your computational memory,

http://www-941.ibm.com/collaboration/wiki/download/attachments/436/VMM+Tuning+Tip+-+Proctecting+Comp+Memory.pdf?version=1

then paste some of the following if the problem still occurs when the
system is under load.

vmstat -v
vmo -a | egrep "lru|max"
vmstat 1 20


Rgds
Mark Taylor

p595pimp

Sep 27, 2007, 6:56:06 AM
here's the output from the vmstat/vmo ... if all the svmon output is
showing is the VIRTUAL mem. (and that's supposed to be really high)..
how funny that AIX doesn't have a simple command to just show how much
PHYSICAL memory is being used by the system at a given time!! i've
read to subtract numclient from numperm to do this for JFS (which,
btw, i did in this env., and it = 0 (zero)%, which can't be right??),
but what about JFS2.. is that the same, or...? what's the real deal?
is there really a simple cmd. that i'm missing here, or do you just
have to do the math on the paging rates, etc...?

thanks,
-P

VMSTAT/VMO OUTPUT:
prod:/ #vmstat -v
2031616 memory pages
1953348 lruable pages
5694 free pages
1 memory pools
429088 pinned pages
80.0 maxpin percentage
20.0 minperm percentage
80.0 maxperm percentage
41.8 numperm percentage
817527 file pages
0.0 compressed percentage
0 compressed pages
41.8 numclient percentage
80.0 maxclient percentage
817527 client pages
0 remote pageouts scheduled
168 pending disk I/Os blocked with no pbuf
943966 paging space I/Os blocked with no psbuf
2740 filesystem I/Os blocked with no fsbuf
0 client filesystem I/Os blocked with no fsbuf
38691 external pager filesystem I/Os blocked with no fsbuf
0 Virtualized Partition Memory Page Faults
0.00 Time resolving virtualized partition memory page faults
prod:/ #
prod:/ #vmo -a |egrep "lru|max"
lru_file_repage = 1
lru_poll_interval = 10
lrubucket = 131072
maxclient% = 80
maxfree = 1088
maxperm = 1562678
maxperm% = 80
maxpin = 1639292
maxpin% = 80
npsrpgmax = 131072
npsscrubmax = 131072
strict_maxclient = 1
strict_maxperm = 0
prod:/ #


prod:/ #vmstat 1 20

System configuration: lcpu=8 mem=7936MB

kthr memory page faults cpu
----- ----------- ------------------------ ------------ -----------
r b avm fre re pi po fr sr cy in sy cs us sy id wa
0 0 1579948 9578 0 0 0 0 0 0 4 370 375 0 0 99 0
0 0 1579948 9578 0 0 0 0 0 0 7 1694 400 0 0 99 0
0 0 1579948 9578 0 0 0 0 0 0 4 284 388 0 0 99 0
0 0 1579948 9576 0 0 0 0 0 0 5 425 412 0 0 99 0
0 0 1579948 9576 0 0 0 0 0 0 5 689 379 0 0 99 0
0 0 1579948 9576 0 0 0 0 0 0 17 506 432 0 0 99 0
0 0 1579948 9576 0 0 0 0 0 0 1 234 371 0 0 99 0
0 0 1579949 9575 0 0 0 0 0 0 10 756 393 0 0 99 0
0 0 1579948 9576 0 0 0 0 0 0 2 486 397 0 0 99 0
0 0 1579948 9576 0 0 0 0 0 0 3 293 393 1 1 98 0
0 0 1580378 9058 0 0 0 0 0 0 13 1080 393 1 1 98 0
0 0 1580000 9434 0 0 0 0 0 0 3 75151 491 6 4 90 0
0 0 1580000 9434 0 0 0 0 0 0 1 254 401 0 0 99 0
0 0 1580002 9430 0 0 0 0 0 0 9 828 441 0 0 99 0
0 0 1580002 9430 0 0 0 0 0 0 2 207 365 0 0 99 0
0 0 1580002 9430 0 0 0 0 0 0 1 188 352 0 0 99 0
1 0 1580002 9430 0 0 0 0 0 0 7 3904 438 0 1 98 0
0 0 1580002 9430 0 0 0 0 0 0 1 283 369 0 0 99 0
0 0 1580002 9430 0 0 0 0 0 0 1 298 383 0 0 99 0
0 0 1580002 9430 0 0 0 0 0 0 5 712 384 0 0 99 0
prod:/ #

On Sep 26, 1:21 am, Mark Taylor <m...@talk21.com> wrote:
> First and foremost, protect your computational memory,
>

> http://www-941.ibm.com/collaboration/wiki/download/attachments/436/VM...

Mark Taylor

Sep 27, 2007, 8:42:15 AM
ok, set lru_file_repage to 0, then i would reboot if possible to start
from a clean sheet (you don't have to, it's just easier) .. then see if
you still have the issue. Read the doc in the link i pasted above,
which discusses why to do so.

>> AIX doesn't have a simple command to just show how much
>> PHYSICAL memory is being used by the system at a given time

nmon "m" sub command gives you some decent output .. but vmstat is
pretty good .. i.e. vmstat and vmstat -v or svmon .. depends which you
prefer, i use vmstat.

Rgds
Mark Taylor


p595pimp

Sep 27, 2007, 7:30:25 PM
okay, you said:
> >> AIX doesn't have a simple command to just show how much
> >> PHYSICAL memory is being used by the system at a given time
>
> nmon "m" sub command gives you some decent output .. but vmstat is
> pretty good .. i.e. vmstat and vmstat -v or svmon .. depends which you
> prefer, i use vmstat.

BUT, vmstat only shows info about VIRTUAL memory...

dev:/ # vmstat


System configuration: lcpu=8 mem=7936MB
kthr memory page faults cpu
----- ----------- ------------------------ ------------ -----------
r b avm fre re pi po fr sr cy in sy cs us sy id wa

1 2 1741988 8903 0 1 3 247 691 0 128 3575 434 1 1 98 0

which ALWAYS makes it look like you have very little free memory,
because, according to the man pages for vmstat:
avm
    Active virtual pages.
fre
    Size of the free list. Note: A large portion of real memory is
    utilized as a cache for file system data. It is not unusual for
    the size of the free list to remain small.

[[ another quick question... (but this is a bit off the point, so I'm
more concerned w/the other questions)... why, according to 'vmstat -vs',
are page-ins an order of magnitude higher than page-outs???
(yet, the "paging space page ins" are much LESS than the "paging
space page outs"). jeesh!!! ]]

(eg... dev:/ # vmstat -vs
2258986214 total address trans. faults
1201141520 page ins
108631385 page outs
9680361 paging space page ins
15899702 paging space page outs


SIMILARLY, svmon deals w/virtual memory, and also indicates very
little "free memory"
dev:/ # svmon
               size      inuse       free        pin    virtual
memory      2031616    1970646      60970     468738    1738330
pg space    2097152     256811

so... i guess what i'm wondering is, is this because AIX assigns ALL
of the phys. mem. to paging space and v.mem... or... ? and is there
really not a SIMPLE way, that doesn't involve doing math on the various
paging stats and values, to figure out, at a given time, how much
PHYSICAL (real) mem. is being used?

in other words.. i understand how to look at all the VIRTUAL memory
stuff... but is there a way to separate that from the PHYS mem stuff,
and just say... "Hey, this system with 8GB of RAM is currently using
X GB (or X %) of that RAM under this load, so maybe we should add X
more GB of RAM, or look for a memory leak" ?!??

Thanks,
-P

Thomas Braunbeck

Sep 28, 2007, 2:01:45 AM
p595pimp wrote:

>
> in other words.. i understand how to look at all the VIRTUAL memory
> stuff... but is there a way to separate that from the PHYS mem stuff,
> and just say... "Hey, this system with 8GB of RAM is currently using
> X GB (or X %) of that RAM under this load, so maybe we should add X
> more GB of RAM, or look for a memory leak" ?!??

svmon -G, for an example and more details see
http://publib.boulder.ibm.com/infocenter/pseries/v5r3/index.jsp?topic=/com.ibm.aix.prftungd/doc/prftungd/amount_mem_use.htm

Paul Landay

Sep 28, 2007, 6:09:58 AM
to p595pimp
p595pimp wrote:
:

> [[ another quick question... (but this is a bit off the point, so I'm
> more concerned w/the other questions)... why, according to 'vmstat -vs',
> are page-in's an order of magnitude higher than page-out's???
> (yet, the "paging space page in's" are much LESS than the "paging
> space page outs"). jeesh!!! ]]
>
> (eg... dev:/ # vmstat -vs
> 2258986214 total address trans. faults
> 1201141520 page ins
> 108631385 page outs
> 9680361 paging space page ins
> 15899702 paging space page outs
:

Just a guess, but if the page was paged-in, then was not changed
(e.g. a disk cache of read-only files or a shared lib) then it
would not need to be paged-out again, but if the memory was
needed that real memory page would get re-used (stolen), so
eventually that same page might get paged-in again without an
intermediate page-out (i.e. no 1-to-1 of in/out).

When looking for memory leaks, at least for long-running
user-space processes (WAS, Database), I look at the SZ column
of 'ps -lef'. If the system has reached 'steady-state'
for a given workload and the SZ column continues to grow
for a process then I consider that a clue to a memory leak
in that process.
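
The SZ check above can be written down as a tiny Python helper. It is a sketch under assumptions: the SZ readings are taken by hand (or by a cron job) from repeated `ps -lef` runs for one PID once the workload is steady, the sample values below are invented for illustration, and "never shrinks and ends higher" is just one possible reading of "continues to grow".

```python
def sz_keeps_growing(sz_samples, min_samples=5):
    """True if SZ never decreases across the samples and ends above where it started."""
    if len(sz_samples) < min_samples:
        return False  # too few observations to call it a trend
    never_shrinks = all(b >= a for a, b in zip(sz_samples, sz_samples[1:]))
    return never_shrinks and sz_samples[-1] > sz_samples[0]


# Hypothetical SZ readings for two processes, sampled at a fixed interval.
suspect = [41200, 41200, 41850, 42600, 43900]
steady = [41200, 41900, 41500, 41600, 41300]

print(sz_keeps_growing(suspect))
print(sz_keeps_growing(steady))
```

A positive result is only a clue, as the post says; a process that is still warming up its caches will look the same for a while.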

Paul Landay

Mark Taylor

Sep 28, 2007, 6:23:45 AM
>> according to 'vmstat -vs', are page-in's an order of magnitude higher than page-out's???
>> (yet, the "paging space page in's" are much LESS than the "paging

page ins/page outs count both file pages and paging space pages
paging space page ins/page outs count just paging space pages

http://publib.boulder.ibm.com/infocenter/pseries/v5r3/index.jsp?topic=/com.ibm.aix.prftungd/doc/prftungd/vmstat_s.htm
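
As a worked example of that distinction, the counters from the `vmstat -vs` output on dev posted earlier in the thread can be split into file I/O and paging space I/O. The function name is made up for this sketch; the only assumption is the one stated above, that the "page ins"/"page outs" totals include the paging space counts.

```python
def split_paging_counters(page_ins, page_outs, pgsp_ins, pgsp_outs):
    """Separate file-page I/O from paging-space I/O in `vmstat -s` counters."""
    return {
        "file_page_ins": page_ins - pgsp_ins,
        "file_page_outs": page_outs - pgsp_outs,
        "pgsp_page_ins": pgsp_ins,
        "pgsp_page_outs": pgsp_outs,
    }


# Counters copied from the `vmstat -vs` output on dev.
counters = split_paging_counters(
    page_ins=1201141520,
    page_outs=108631385,
    pgsp_ins=9680361,
    pgsp_outs=15899702,
)
for name, value in counters.items():
    print(f"{name:15s} {value:>12d}")
```

Seen this way, almost all of the page ins are file reads, which is why the total dwarfs the page outs while the paging space counters tell a different story.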

>> really not a SIMPLE way, that doesnt involve doing math on the various
>>paging stats and values, to figure out, at a given time, how much
>>PHYSICAL (real) mem. is being used?

nmon with the "m" sub command (or topas)

>> which ALWAYS makes it look like you have very little free memory,

Yup, by default (and your config as you have not changed it) AIX will
allow up to 80% of realmem to be used as filesystem cache .. so, in
your case your working (computational) memory is:
AVM == 1580002 pages * 4096 bytes == 6471688192 bytes
6471688192/1024/1024/1024 == 6.02GB

Your filesystem cache is also taking up 40% of realmem (because you
have not set lru_file_repage to 0) == 3.2GB

6.02 + 3.2 = 9.22GB

You only have 8GB, so you have approx 1.2GB on paging space
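
The same back-of-the-envelope estimate can be redone in Python from the exact page counts posted for prod (avm from `vmstat 1 20`, file pages and memory pages from `vmstat -v`). This is a rough sketch, not a measurement: it assumes 4 KB pages, and because avm also counts pages that currently sit on paging space, the "overflow" figure is only an indication of how much demand exceeds RAM.

```python
PAGE_SIZE = 4096  # bytes per page, assuming the standard 4 KB page size


def pages_to_gib(pages):
    """Convert a count of 4 KB pages to GiB."""
    return pages * PAGE_SIZE / 1024 ** 3


avm_gib = pages_to_gib(1580002)        # active virtual (computational) pages
file_gib = pages_to_gib(817527)        # file pages (filesystem cache)
real_gib = pages_to_gib(2031616)       # memory pages (installed real memory)
overflow_gib = avm_gib + file_gib - real_gib

print(f"computational   {avm_gib:5.2f} GiB")
print(f"file cache      {file_gib:5.2f} GiB")
print(f"real memory     {real_gib:5.2f} GiB")
print(f"demand over RAM {overflow_gib:5.2f} GiB")
```

The result lands in the same ballpark as the estimate above; the small difference comes from using 40% and 8GB as round numbers.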

>> in other words.. i understand how to look at all the VIRTUAL memory
>> stuff... but is there a way to separate that from the PHYS mem stuff,

The "fre" column in vmstat will show you how many real memory pages
you have free on the freelist i.e. realmem - fre == how much realmem
is being used.
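
For anyone who wants that subtraction as a single number, here is a small Python sketch that reads three lines of `vmstat -v` output and reports real memory in use, split into file cache and everything else. The function names are invented for this example, the input is the prod output posted earlier in the thread, and treating "in use minus file pages" as computational memory is a simplification.

```python
def parse_vmstat_v(text):
    """Pick the page counts needed for a real-memory summary out of `vmstat -v` output."""
    wanted = ("memory pages", "free pages", "file pages")
    stats = {}
    for line in text.splitlines():
        parts = line.split(None, 1)  # "<value> <label>"
        if len(parts) == 2 and parts[1].strip() in wanted:
            stats[parts[1].strip()] = int(parts[0])
    return stats


def real_memory_report(stats):
    """Percentages of real memory in use, as file cache, and as everything else."""
    total = stats["memory pages"]
    in_use = total - stats["free pages"]  # realmem - fre
    file_pages = stats["file pages"]
    return {
        "in_use_pct": round(100 * in_use / total, 1),
        "file_cache_pct": round(100 * file_pages / total, 1),
        "computational_pct": round(100 * (in_use - file_pages) / total, 1),
    }


# Excerpt of the `vmstat -v` output from prod.
vmstat_v = """\
2031616 memory pages
1953348 lruable pages
5694 free pages
1 memory pools
429088 pinned pages
817527 file pages
0 compressed pages
817527 client pages
"""
report = real_memory_report(parse_vmstat_v(vmstat_v))
print(report)
```

The interesting figure for sizing is the computational percentage, not the in-use percentage, since the file cache portion is given back when applications need it.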

Once again, if you set lru_file_repage to 0 then the VMM will favour
stealing file pages over working (computational) pages, and this will
stop you paging to paging space where possible ...

You may benefit from reading the docs ..

http://publib.boulder.ibm.com/infocenter/pseries/v5r3/index.jsp?topic=/com.ibm.aix.prftungd/doc/prftungd/real_memory_mngment.htm

Rgds
Mark Taylor


Mike

Sep 29, 2007, 7:15:35 AM
Funny story that is related to your customers buying memory. I won't
tell you what multinational company I work for but let's call it the
Institute of Black Magic. I once was pulled into an after-hours call by
one of the many managers we have. Mr Customer was complaining that memory
was full and they wanted to double it. pi and po were 0. Paging space was 0.
vmstat was 99% idle. fr and sr on vmstat were 0. I almost laughed until
I realized they were quite serious. It took almost 4 hours to convince
the manager he was wasting the customer's money, and then a 2 page email
describing memory management in AIX, to make them realize they were
full of shit.

I miss the old days when the only thing between you and your server was
a keyboard not the BS we have today.

hetric...@gmail.com

Oct 2, 2007, 6:04:45 PM

HAHAHahahha!!! thats got to be a pretty common scenario, given the
incredibly obtuse and convoluted methods AIX provides for answering
such a SIMPLE question: "how much memory is the system using? does it
need a memory upgrade?"
to make matters worse in this particular case, the pSeries boxes in
question are the first unix boxes this co. has EVER had... never even
had linux or anything... NO experience w/any kind of unix at all...
strictly a windows shop up until now, and they only have a couple of
windows admins, who, of course, are used to seeing very simple memory
usage statistics, a la Windows Task Manager -> Performance tab.
trying to explain to these guys (and even worse, their managers!)
that simply checking memory usage on an AIX server requires a PhD in
CS (my brain still hurts from trying to understand it all myself, and
I've been doing unix (Solaris) for years) is neither simple nor
convincing.
bottom line seems to be that there IS NO simple way to check mem.
usage, without having a firm grasp of the concepts of AIX memory
management internals (paging space.. the diff. b/w page-ins,
page-outs, paging space page-ins, paging space page-outs, virtual memory,
page faults, pins, steals, working/consistent/user pages, frames,
buffers, filesystem caches, computational memory (i.e. process memory
(data, stack, and heap)) VS. kernel memory and shared memory, etc.
etc... ) as well as understanding the cryptic output of svmon, vmstat,
ps, and some tools you end up having to install from perf.tools, and
of course the multiplicity of various flags and arguments to those
commands, which tend to have man-page explanations such as:
"%MEM
Calculated as the sum of the number of working segment and code
segment pages in memory times 4 (that is, the RSS value), divided by
the size of the real memory of the machine in KB, times 100, rounded
to the nearest full percentage point. "

so... yeah. trying to explain all that shit to somebody who's used to
just hitting "ctrl-alt-delete"-> Task Mgr. to get an answer, in about
2 seconds, that they don't have to calculate or RESEARCH... is...
well... IT SUCKS!!! seriously.

oh well.. if it was easy, i guess we'd all be out of a job. so,
thanks for "keeping it real", 'International Black Magic'!!!!!

hahahaaaa
-P

David J Dachtera

Oct 2, 2007, 10:46:41 PM

I'm accustomed to something even more straight-forward (very small Alpha,
hobbyist machine):

$ show memory
System Memory Resources on 2-OCT-2007 21:57:38.86

Physical Memory Usage (pages): Total Free In Use Modified
Main Memory (256.00MB) 32768 23658 8340 770

Extended File Cache (Time of last reset: 24-SEP-2007 22:07:52.18)
Allocated (MBytes) 28.37 Maximum size (MBytes) 128.00
Free (MBytes) 0.71 Minimum size (MBytes) 3.12
In use (MBytes) 27.66 Percentage Read I/Os 56%
Read hit rate 66% Write hit rate 0%
Read I/O count 8643 Write I/O count 6605
Read hit count 5776 Write hit count 0
Reads bypassing cache 14 Writes bypassing cache 0
Files cached open 251 Files cached closed 164
Vols in Full XFC mode 0 Vols in VIOC Compatible mode 5
Vols in No Caching mode 0 Vols in Perm. No Caching mode 0

Granularity Hint Regions (pages): Total Free In Use Released
Execlet code region 1024 0 772 252
Execlet data region 256 0 256 0
S0/S1 Executive data region 260 0 260 0
Resident image code region 1024 0 471 553

Slot Usage (slots): Total Free Resident Swapped
Process Entry Slots 62 43 19 0
Balance Set Slots 60 43 17 0

Dynamic Memory Usage: Total Free In Use Largest
Nonpaged Dynamic Memory (MB) 1.97 0.84 1.13 0.75
Paged Dynamic Memory (MB) 1.14 0.63 0.50 0.63
Lock Manager Dyn Memory (KB) 496.00 320.50 175.50

Buffer Object Usage (pages): In Use Peak
32-bit System Space Windows (S0/S1) 0 0
64-bit System Space Windows (S2) 0 0
Physical pages locked by buffer objects 0 0

Memory Reservations (pages): Group Reserved In Use Type
Total (0 bytes reserved) 0 0

Write Bitmap (WBM) Memory Summary
Local bitmap count: 0 Local bitmap memory usage (bytes) 0.00
Master bitmap count: 0 Master bitmap memory usage (bytes) 0.00

Swap File Usage (8KB pages): Index Free Size
DISK$ALPHASYS:[SYS0.SYSEXE]SWAPFILE.SYS
1 488 488

Paging File Usage (8KB pages): Index Free Size
DISK$ALPHASYS:[SYS0.SYSEXE]PAGEFILE.SYS
254 10744 10744
Total committed paging file usage: 2204

Of the physical pages in use, 2895 pages are permanently allocated to OpenVMS.

Apologies if that wraps badly.

Wonder if there are any third-party products for taming the UN*X beast...

--
David J Dachtera
dba DJE Systems
http://www.djesys.com/

Unofficial OpenVMS Marketing Home Page
http://www.djesys.com/vms/market/

Unofficial Affordable OpenVMS Home Page:
http://www.djesys.com/vms/soho/

Unofficial OpenVMS-IA32 Home Page:
http://www.djesys.com/vms/ia32/

Unofficial OpenVMS Hobbyist Support Page:
http://www.djesys.com/vms/support/

Richard D. Latham

Oct 2, 2007, 10:49:32 PM
hetric...@gmail.com writes:

<snip>

> bottom line seems to be that there IS NO simple way to check mem.
> usage, without having a firm grasp of the concepts of AIX memory
> management internals (paging space.. the diff. b/w page-ins, page-
> outs, paging space page-ins, paging-space page-outs, virtual memory,
> page faults, pins, steals, working/consistent/user pages, frames,
> buffers, filesytem caches, computational memory (i.e. process memory
> (data, stack, and heap) VS. kernel memory and shared memory, etc.
> etc... ) as well as understanding the cryptic output of svmon, vmstat,
> ps, and some tools you end up having to install from perf.tools, and
> of course the multiplicity of various flags and arguments to those
> commands, which tend to have man-page explanations such as:
> "%MEM
> Calculated as the sum of the number of working segment and code
> segment pages in memory times 4 (that is, the RSS value), divided by
> the size of the real memory of the machine in KB, times 100, rounded
> to the nearest full percentage point. "
>
> so... yeah. trying to explain all that shit to somebody who's used to
> just hitting "ctrl-alt-delete"-> Task Mgr. to get an answer, in about
> 2 seconds, that they don't have to calculate or RESEARCH... is...
> well... IT SUCKS!!! seriously.
>
> oh well.. if it was easy, i guess we'd all be out of a job. so,
> thanks for "keeping it real", 'International Black Magic'!!!!!
>

For people at that level of skill, tell them to execute the command
"lsps -a".

If the number(s) are lower than, say, 5%, they've got enough RAM.
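
That rule of thumb is easy to automate. The sketch below is an assumption-laden illustration: it parses the two-line summary format of `lsps -s` shown earlier in this thread rather than the per-device listing of `lsps -a`, the function names are invented, and the 5% cut-off is simply the "say, 5%" figure from above.

```python
import re


def paging_space_percent_used(lsps_s_text):
    """Return (size_mb, percent_used) from `lsps -s` style output."""
    match = re.search(r"(\d+)MB\s+(\d+)%", lsps_s_text)
    if match is None:
        raise ValueError("no 'NNNNMB NN%' line found")
    return int(match.group(1)), int(match.group(2))


def enough_ram_by_rule_of_thumb(percent_used, limit=5):
    """Apply the 'paging space used below ~5%' rule."""
    return percent_used < limit


# Output copied from `lsps -s` on prod.
prod = """\
Total Paging Space   Percent Used
      8192MB               29%
"""
size_mb, used = paging_space_percent_used(prod)
print(size_mb, used, enough_ram_by_rule_of_thumb(used))
```

By this rule prod (29%) would be flagged for a closer look, even though the pi/po columns discussed earlier in the thread are the better indicator of whether the box is actually suffering.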


--
#include <disclaimer.std> /* I don't speak for IBM ... */
/* Heck, I don't even speak for myself */
/* Don't believe me ? Ask my wife :-) */
Richard D. Latham lat...@us.ibm.com

Mark Taylor

Oct 3, 2007, 4:48:13 AM
>> and they only have a couple of windows admins

So this company did its research then, and put in place the training
for AIX so they could maximise the return on their investment ..
lol .. :)

I expect there is a tab in WSM for the Windows bods ..

It is fairly simple to see how much memory is being used .. topas or
nmon will paint a very nice picture for them .. but you are right,
many AIX admins (ones that have been on many courses and profess to be
experts) still struggle with VMM theory and how to tune it / stop it
paging .. AIX should really now be shipped with lru_file_repage set to
0 which would cut out a lot of this argument .. we have been waiting
for this switch for a long long time .. yet, still it is shipped /
installed as a file server ..

Rgds
Mark Taylor

Dieter Stumpner

Oct 3, 2007, 3:08:01 PM
p595pimp wrote:
> we have been experiencing a strange memory problem with a new p550
> server that is going to host a new JDE EnterpriseOne implementation.
> So far, oracle & JDE are installed, and users are able to log in,
> etc... everything seems to be fine EXCEPT...
> the machine shipped with 8GB memory, as specified by the JDE/Oracle
> requirements... all of the neccessary AIX & application prereq.
> filesets & patches are installed (java, c-compiler, etc...)..
> everything looks good except that the machine is hitting 80% memory
> usage with only 12 users logged in!!! when we go live in a couple of
> weeks, there will be aprox. 80 users.
> I understand that we will probably need to add more memory (we're
> thinking of going to 24GB), but still... the system should not be
> hitting 80% of 8GB with only 12 users logged in. Also, we're
> averaging 29% paging.
> I don't know alot about oracle/jde, and there are a lot of processes
> running from both... i'm suspecting the problem may be within the app
> side... but the jde implementation guys are saying they don't know
> whats wrong. so.
> here is some output below... does anybody have any idea what might be
> causing this, or have any suggestions?
> Thanks!
> -P
[SNIP] some vmm-stats

Hi!

I think you are searching in the wrong place for your problem! As far as I
know, Oracle (I think you mean the DB part) isn't a "normal" process. When
you start Oracle, *you* tell it how much memory it will use, independent of
the number of users. I don't know JDE, but it sounds like it uses a
Java VM. Then it is the same problem: *you* define the amount of memory the
VM can use.
First, don't worry about 29% paging space used (read all the posts for why),
then use the Oracle Enterprise Manager(?) to check your DB buffers and
check jconsole to see whether you have given the Java VM enough memory.

with best regards
Dieter Stumpner

patrice

Oct 5, 2007, 8:48:10 PM

Hi,
When paging space fills up, it means that the system is sending
computational memory frames out to paging space.
Computational pages are program pages. If the system has to send
frames to paging space to make room in memory, it takes time to write
them to disk.
Worse, if it needs to page the computational pages back in because a
program like Oracle needs them to continue its processing,
overall system and application performance suffers.
A very interesting vmo parameter called v_pinshm can be set to 1
(default value is 0).
This AIX parameter allows shared memory segments to be pinned in real
memory, which keeps them from being paged out to paging
space.
In general, when it is not set on an Oracle database server, Oracle
shared memory segments are sent out to paging space if room is
needed.
Given the memory counters you have (% computational < RAM), you
should not have such a large percentage of paging space used.
Try running svmon -Pgt 10 to see which processes are eating paging
space.
They are likely either Oracle processes with high shared memory
values or Java processes.
If you are running AIX 5.3, as seems to be the case, try the
following:
vmo -o lru_file_repage=0 (avoids sending computational pages to
paging space when numperm is between minperm and maxperm: this
is your case)
vmo -o lru_poll_interval=10
vmo -o v_pinshm=1 (allows shared memory segments to be pinned in
memory).
Set the Oracle lock_sga parameter to true in the instance init file.
Check that the total memory of your Oracle instances and JVMs is not
bigger than 80% of your RAM.
Stop/restart your databases and JVMs.
Everything should be OK after that.
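
The 80% check in the list above is simple arithmetic, sketched here in Python. The real memory figure (7936 MB) comes from the vmstat output in this thread; the SGA and JVM heap sizes are purely hypothetical placeholders, since the actual Oracle and JDE settings were never posted, and the function name is invented for the example.

```python
def fits_memory_budget(real_mem_mb, consumers_mb, limit_fraction=0.80):
    """Check that configured memory consumers stay within a fraction of real memory."""
    budget_mb = real_mem_mb * limit_fraction
    total_mb = sum(consumers_mb.values())
    return total_mb <= budget_mb, total_mb, budget_mb


# Hypothetical settings: replace with the real SGA size and JVM -Xmx values.
configured = {
    "oracle_sga": 4096,
    "jvm_heap_1": 1024,
    "jvm_heap_2": 1024,
}
ok, total_mb, budget_mb = fits_memory_budget(7936, configured)
print(f"configured {total_mb} MB, budget {budget_mb:.1f} MB, fits: {ok}")
```

If the check fails, either the caches need tuning down or the box needs more RAM, which is the same trade-off described earlier in the thread.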
Hope this helps and regards.
Patrice
/* I speak ONLY for myself. */
/* My views and my opinions do not in any way represent those of my
actual or previous employer, even if they like them and especially if
they don't... */
