also... i know IBM says it meets HW req's, but do you think its just
too weak of a setup to run 5.2??? i mean, this little POS only has one
333mhz proc and 256mbram!!! sales dept. wants this guy to just upgrade
to a new pSeries! anybody have ideas or similar problems?
BTW... heres the results of 'vmstat -v' and 'vmstat 1 20':
# vmstat 1 20
System Configuration: lcpu=1 mem=256MB
kthr     memory               page                 faults            cpu
----- ------------ -------------------------- ---------------- ---------------
 r  b    avm   fre  re  pi  po   fr    sr  cy   in     sy   cs  us  sy  id  wa
 2  3  52014   390   0   2   2   45   134   0  607   5700  932   5   5  77  12
 0  0  52018   384   0   1   0    0     0   0  194   2078   71   0   0  98   2
 0  0  52018   384   0   0   0    0     0   0  191   2063   67   0   1  99   0
 0  0  52018   384   0   0   0    0     0   0  198   2436   76   2   1  94   3
 0  0  52018   384   0   0   0    0     0   0  193   2064   68   0   0  99   0
 0  0  52018   383   0   0   0    0     0   0  195   2159   75   1   1  96   2
 0  0  52018   383   0   0   0    0     0   0  194   2064   62   0   1  99   0
 0  0  52398   123   0   0  42  120   799   0  226   2851   99   2   3  95   0
 0  0  52419   126   0   0   8   24   141   0  221   2879  121   2   4  93   1
 0  0  52018   718   0   0  64  192  1567   0  224   3827   98   3   8  89   0
 0  0  52391   346   0   0   1    0     0   0  196   2851   75   0   7  93   0
 2  0  52463   266   0   2   0    0     0   0  232   2773  154   4   1  92   3
 0  1  53062   126   0   0  82  464  3780   0  288  23574  325  14  53  14  19
 2  0  52519   668   0   0   0    0     0   0  226  13942  225   9  38  45   8
 0  0  52519   668   0   0   0    0     0   0  192   2172   73   0   1  99   0
 0  0  52924   271   0   0   1    8    73   0  212   8963  141   6  17  74   3
 4  0  52521   674   0   0   0    0     0   0  233  14434  249  15  35  44   7
 1  0  52521   674   0   0   0    0     0   0  197   2226   94   1   1  98   0
 0  0  52521   674   0   0   0    0     0   0  196   2436   74   0   2  96   2
 3  0  52596   599   0   0   0    0     0   0  206   5324  104   5   6  88   1
# vmstat -v
65536 memory pages
59212 lruable pages
191 free pages
1 memory pools
10049 pinned pages
80.1 maxpin percentage
20.0 minperm percentage
80.0 maxperm percentage
61.2 numperm percentage
36268 file pages
0.0 compressed percentage
0 compressed pages
0.0 numclient percentage
80.0 maxclient percentage
0 client pages
0 remote pageouts scheduled
0 pending disk I/Os blocked with no pbuf
64573 paging space I/Os blocked with no psbuf
412862 filesystem I/Os blocked with no fsbuf
0 client filesystem I/Os blocked with no fsbuf
0 external pager filesystem I/Os blocked with no fsbuf
What command did you use to measure the slowdown?
What application does the machine run?
> perf. capacity. other than the os upgrade, another disk was added, and
> a datavg was separated out from the (previously solitary) rootvg. no
> other apps were added, and there is no additional workload. i have run
Is the problem constant, or does it only pop up a couple times a day, at
certain times of the hour or day?
> BTW... heres the results of 'vmstat -v' and 'vmstat 1 20':
> # vmstat 1 20
This tells us that your CPU and disk are *extremely* bored.
So, next thing I would look at is network traffic and configuration, and
also, memory usage.
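A rough way to back that reading up with numbers: the Python sketch below averages the idle and I/O-wait columns and counts how many samples show page-outs. It is only an illustration. The five rows are copied from the `vmstat 1 20` output in this thread, the function names are made up for the example, and the second `sy` column is renamed `sys` here so the two don't collide.

```python
# A few rows copied from the `vmstat 1 20` output quoted in this thread.
SAMPLE = """\
2 3 52014 390 0 2 2 45 134 0 607 5700 932 5 5 77 12
0 0 52018 384 0 1 0 0 0 0 194 2078 71 0 0 98 2
0 0 52398 123 0 0 42 120 799 0 226 2851 99 2 3 95 0
0 1 53062 126 0 0 82 464 3780 0 288 23574 325 14 53 14 19
3 0 52596 599 0 0 0 0 0 0 206 5324 104 5 6 88 1
"""

# vmstat prints "sy" twice (system calls, then system CPU %); the CPU one
# is called "sys" here purely so the dict keys stay unique.
COLUMNS = "r b avm fre re pi po fr sr cy in sy cs us sys id wa".split()


def parse_vmstat(text):
    """Turn whitespace-separated vmstat data rows into dicts keyed by column."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Skip headers, separators and blank lines.
        if len(fields) != len(COLUMNS) or not all(f.isdigit() for f in fields):
            continue
        rows.append(dict(zip(COLUMNS, map(int, fields))))
    return rows


def summarize(rows):
    """Return mean idle %, mean I/O-wait %, and how many samples paged out."""
    n = len(rows)
    return {
        "mean_idle": sum(r["id"] for r in rows) / n,
        "mean_wait": sum(r["wa"] for r in rows) / n,
        "pageout_samples": sum(1 for r in rows if r["po"] > 0),
    }


if __name__ == "__main__":
    print(summarize(parse_vmstat(SAMPLE)))
```

Feeding it a longer capture (for example a file written by `vmstat 5 ...`) works the same way, as long as the column layout matches.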
Can you also do these commands and post output:
# ps -efl|grep <application name>
# lsps -a
# svmon -G
# netstat -v en0|grep Media
(and for any other Ethernet interfaces that are enabled)
# time dd if=/dev/rhdisk0 of=/dev/null bs=64k count=10000
# time dd if=/dev/rhdisk1 of=/dev/null bs=64k count=10000
# time dd if=/dev/rhdisk2 of=/dev/null bs=64k count=10000
(and for any other drives in the system, assuming it's a few.
Each dd will take about 45 seconds to run.)
# netstat -i 1
(and cancel it after running for 10 seconds)
# netstat -in
-Dan
# lsps -a
Page Space  Physical Volume   Volume Group    Size   %Used  Active  Auto  Type
hd6         hdisk1            rootvg         256MB      41     yes   yes    lv
# svmon -G
ksh: svmon: not found.
# netstat -v en0|grep Media
Media Speed Selected: 100 Mbps Full Duplex
Media Speed Running: 100 Mbps Full Duplex
# time dd if=/dev/rhdisk0 of=/dev/null bs=64k count=10000
10000+0 records in.
10000+0 records out.
real 0m22.23s
user 0m0.17s
sys 0m0.84s
# time dd if=/dev/rhdisk1 of=/dev/null bs=64k count=10000
10000+0 records in.
10000+0 records out.
real 0m25.70s
user 0m0.15s
sys 0m1.02s
# time dd if=/dev/rhdisk2 of=/dev/null bs=64k count=10000
10000+0 records in.
10000+0 records out.
real 0m28.45s
user 0m0.05s
sys 0m0.81s
# netstat -i 1
input (en0) output input (Total) output
packets errs packets errs colls packets errs packets errs colls
3328612 0 3720965 0 0 3586536 0 3979085 0 0
11 0 10 0 0 11 0 10 0 0
35 0 36 0 0 37 0 38 0 0
18 0 16 0 0 18 0 16 0 0
30 0 21 0 0 30 0 21 0 0
26 0 26 0 0 26 0 26 0 0
10 0 5 0 0 10 0 5 0 0
28 0 25 0 0 28 0 25 0 0
13 0 12 0 0 15 0 14 0 0
33 0 33 0 0 33 0 33 0 0
15 0 14 0 0 15 0 14 0 0
40 0 43 0 0 40 0 43 0 0
17 0 16 0 0 17 0 16 0 0
14 0 13 0 0 18 0 17 0 0
^C# netstat -in
Name  Mtu    Network     Address              Ipkts Ierrs    Opkts Oerrs  Coll
en0   1500   link#2      0.6.29.dc.90.73    3329042     0  3721358     0     0
en0   1500   192.168.10  192.168.10.5       3329042     0  3721358     0     0
lo0   16896  link#1                          257936     0   258132     0     0
lo0   16896  127         127.0.0.1           257936     0   258132     0     0
lo0   16896  ::1                             257936     0   258132     0     0
#
:)
> # lsps -a
> Page Space  Physical Volume   Volume Group    Size   %Used  Active  Auto  Type
> hd6         hdisk1            rootvg         256MB      41     yes   yes    lv
41% used? Hmm. That suggests you might be seeing paging activity, which
is a big performance killer if the system has to go that far into paging
space. I'm not sure, though, because AIX may reserve paging space without
actually paging in/out until it is running low on memory.
The exact culprit isn't clear yet, nor why. We need to investigate further
with svmon.
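To put that 41% into megabytes, here is a minimal Python sketch. It assumes the `lsps -a` data row has the same eight whitespace-separated fields as the one posted above; the function name is just for illustration.

```python
def paging_space_used_mb(lsps_row):
    """Estimate MB in use from one data row of `lsps -a` output.

    Assumes the layout seen in this thread:
    name, physical volume, volume group, size (e.g. "256MB"), %used, ...
    """
    fields = lsps_row.split()
    size_mb = int(fields[3].rstrip("MB"))  # "256MB" -> 256
    pct_used = int(fields[4])
    return size_mb * pct_used / 100


if __name__ == "__main__":
    row = "hd6 hdisk1 rootvg 256MB 41 yes yes lv"
    print(f"{paging_space_used_mb(row):.1f} MB of paging space in use")
```

So roughly 105 MB of paging space is allocated on a box with 256 MB of RAM, which is why it is worth finding out who owns it.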
> # svmon -G
> ksh: svmon: not found.
This is a must-have tool, especially for debugging memory related
issues.
To install it, you will need to install perfagent.tools and
bos.perf.tools.
If you already have these installed, it should be /usr/sbin/svmon.
Once you have it, look at 'svmon' output and also 'svmon -U|more'
output.
You can post 'svmon' output here because it is short.
svmon -P has a lot of information, but it tells you breakdown of memory
usage including what is using how much of paging space.
It will probably be too large to post 'svmon -P' output here. You could
make it temporarily available on a web site, e-mail the output to me, or
analyze the results yourself.
svmon -P is great because it lets you figure out if you should blame the
OS or blame the application for excessive memory usage. :)
> # netstat -v en0|grep Media
> Media Speed Selected: 100 Mbps Full Duplex
> Media Speed Running: 100 Mbps Full Duplex
Confirmed that the network connection appears to be set up correctly.
> # time dd if=/dev/rhdisk0 of=/dev/null bs=64k count=10000
The dd tests show roughly 22-28 MB/sec across the three disks (625 MB read
in 22 to 28 seconds each), which is healthy.
I also wanted to rule out a failing disk, which in certain circumstances
can cause a performance hit due to SCSI bad block sparing when the disk is
on its way out. That does not appear to be a problem here.
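For reference, the throughput follows directly from the dd parameters: bs=64k times count=10000 is 625 MiB, divided by the 'real' time. A small Python sketch of that arithmetic, using the timings posted above (the function name is only illustrative):

```python
def dd_throughput_mib_s(block_kib, count, real_seconds):
    """MiB/s for a dd run that copied `count` blocks of `block_kib` KiB each."""
    total_mib = block_kib * count / 1024
    return total_mib / real_seconds


# 'real' times from the three dd runs posted in this thread.
RUNS = {"hdisk0": 22.23, "hdisk1": 25.70, "hdisk2": 28.45}

if __name__ == "__main__":
    for disk, secs in RUNS.items():
        print(f"{disk}: {dd_throughput_mib_s(64, 10000, secs):.1f} MiB/s")
```

Keep in mind this only measures sequential reads from the raw device, so it says nothing about random I/O or filesystem behaviour.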
> # netstat -i 1
> input (en0) output input (Total) output
> packets errs packets errs colls packets errs packets errs colls
> 3328612 0 3720965 0 0 3586536 0 3979085 0 0
> 11 0 10 0 0 11 0 10 0 0
> 35 0 36 0 0 37 0 38 0 0
> 18 0 16 0 0 18 0 16 0 0
> 30 0 21 0 0 30 0 21 0 0
Very quiet network activity, and no errors or collisions. Very good.
Also still need to know if the problem is constant or occasional?
What application does the machine run?
Can you post output of ps -efl | grep <application> ?
That would offer insight into memory sizing by the application.
It also helps figure out if its memory use is excessive (for example,
due to an application bug like a memory leak).
-Dan (the Man)
What kind of kernel are you running now?
$ bootinfo -y # from memory
>... now the machine is suffering SERIOUS performance
> problems.... it has basically slowed down to about 1/5 of its previous
> perf. capacity.
Track the machine over a day to get some useful data
$ nmon .
and take a look during times with no problems and times with problems
> other than the os upgrade, another disk was added, and
> a datavg was separated out from the (previously solitary) rootvg.
On the same SCSI controller? Any error messages about SCSI in the
error log?
And what kind of filesystem are you using on this new VG:
raw, jfs or jfs2?
$ lsvg -l datavg
> no
> other apps were added, and there is no additional workload. i have run
> perf.tools and nmon, but cannot seem to find anything extraordinarily
> obvious going on.
then do a simple
$ vmstat 5 10000 > /tmp/vmstat.out
to get a system performance overview
> does ANYONE know what could be up with this piece?
> how else can i get to the root of the problem and figure out whats
> going on?
>
> also... i know IBM says it meets HW req's, but do you think its just
> too weak of a setup to run 5.2??? i mean, this little POS only has one
> 333mhz proc and 256mbram!!! sales dept. wants this guy to just upgrade
> to a new pSeries! anybody have ideas or similar problems?
>
> # vmstat -v
...
> 64573 paging space I/Os blocked with no psbuf
> 412862 filesystem I/Os blocked with no fsbuf
You might read http://www.tek-tips.com/faqs.cfm?fid=5685
From your vmstat output, you're low on memory.
Just insert 256MB more.
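If you want to keep an eye on those "blocked with no ...buf" counters over time, a sketch like the following could pull the non-zero ones out of saved `vmstat -v` output. It is an illustration in Python, not an AIX tool; the sample text is copied from this thread and the function name is made up. As far as I know these counters are cumulative, so what matters is how fast they grow between two snapshots.

```python
import re

# The tail of the `vmstat -v` output posted in this thread.
VMSTAT_V = """\
0 pending disk I/Os blocked with no pbuf
64573 paging space I/Os blocked with no psbuf
412862 filesystem I/Os blocked with no fsbuf
0 client filesystem I/Os blocked with no fsbuf
"""

BLOCKED_LINE = re.compile(r"\s*(\d+)\s+(.*blocked with no \w+)\s*$")


def blocked_io_counters(text):
    """Return {description: count} for every non-zero 'blocked with no' line."""
    result = {}
    for line in text.splitlines():
        match = BLOCKED_LINE.match(line)
        if match and int(match.group(1)) > 0:
            result[match.group(2)] = int(match.group(1))
    return result


if __name__ == "__main__":
    for name, count in blocked_io_counters(VMSTAT_V).items():
        print(f"{count:>8}  {name}")
```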
hth
Hajo
i'm not sure about the actual app(s) running on the machine, so i cant
grep for it (theres tons of stuff on ps -ef), but i can find out
tomorrow.
also... can i do that time dd command on other resources, like mem0 to
test I/O times to individual pieces of memory? that would be cool.
all your thoughts/ideas are greatly appreciated. i'm new to aix and
it seems there are very few aix ninjas around. again, thanks.
Hmm, that really does sound like the system is memory and/or I/O-starved
(e.g. due to excessive paging *OR* due to application pounding the disk
so much or so hard).
My current hypothesis is that you were already low on memory in AIX
4.3.3 but borderline, and then may have gotten 'pushed over the limit'
with the upgrade to 5.2. I won't know if that's the case until I can see
ps -efl and svmon -P output, though.
*IF* this is indeed related to memory, you should be able to buy
inexpensive but good quality Kingston memory.
They're MUCH cheaper than IBM; IBM really stiffs you with memory and
hard drive costs. Lots of AIX admins use Kingston memory if they don't
have the budget for IBM's memory. http://www.kingston.com/
There's also eBay, too.
I would say 512 MB is probably a comfortable baseline for AIX systems.
I don't remember; did you say what type/model this server was?
If you do not know, you can see it next to the serial number on the
front of the machine. Format is:
xxxx-yyy
bb-ccccc
xxxx = type (7025, 7046, 7026, 9076, etc)
yyy = model (H70, B50, F80, 6H1, etc)
bb = serial number, country code part
ccccc = rest of serial number, unique number for a given country code
Or you can get it from the machine:
# lscfg -vp | grep Model: | grep IBM
It will probably be the first one listed. Will be "IBM,xxxx-yyy" for the
type and model.
You will need to know the t/m to order the correct RAM.
> also... perf.tools is not installed, and the box is like 200 miles
> away... is there any way i could get the proper toolsets installed
> without physical access to the machine?
Sure. It's a piece of cake, especially if you have a copy of the AIX 5.2
CDs where you are.
If you do, just mount it on a Windows or Linux system (or whatever you
have handy). They're just ISO9660 CDs, nothing fancy.
Once CD #1 is mounted, go to this directory on the CD:
usr/sys/inst.images
Then upload (ftp, scp, whatever) the file 'bos.perf' to your remote box.
It's about 2.5 MB.
Put it in /tmp on the remote AIX box.
Unmount CD #1, mount CD #2, go to usr/sys/inst.images.
Upload the file 'perfagent.tools' to the remote box. It's about 1.8 MB.
Put it in /tmp on the remote AIX box.
Unmount CD #2 and put the CDs away.
Hop on the remote AIX box.
Do these commands to install it:
# inutoc /tmp
# installp -aXgd/tmp bos.perf perfagent.tools
I would also strongly recommend applying AIX 5.2 patches if you
haven't already done so. Especially for stuff that touches deep in the
kernel like the performance monitoring tools.
If you want to free up some space after installing the packages, just do:
# rm /tmp/.toc /tmp/bos.perf /tmp/perfagent.tools
If you do not have the AIX 5.2 CDs locally, then I would strongly
suggest you have a copy of the CD set sent to you from one of your
sites. It makes it much easier to deal with things like this remotely.
I can describe how to do patching in a separate message if you're unsure
how to do it. It's pretty easy. Hard part is waiting to download maybe
600 MB of patches. :)
> i'm not sure about the actual app(s) running on the machine, so i cant
> grep for it (theres tons of stuff on ps -ef), but i can find out
> tomorrow.
Ok. Sounds good. And make sure you do ps -efl once you figure out which
app it is, 'cause that gives more details -- especially on memory usage,
that ps -ef doesn't show.
If you're just not sure what app, just post the whole output of ps -efl
without the grep.
It'll probably be easy to tell what the major app(s) are, based on
each process's command line arguments, CPU usage, or memory usage.
> also... can i do that time dd command on other resources, like mem0 to
> test I/O times to individual pieces of memory? that would be cool.
I'm afraid that dd is only really useful for disk or tape based stuff.
I use dd as a quick rough-idea benchmarking tool to see if there's a
problem or not. If it looks solid, I don't usually whip out the really
fancy disk benchmarking tools like iozone or bonnie. (They're complex,
and take much longer to run.)
However, there are other add-on tools to exercise other resources, I
believe?
For instance, with my IBM tape drives, I've got a tool called 'tapeutil'
which has a built-in benchmarking mode.
> all your thoughts/ideas are greatly appreciated. i'm new to aix and
> it seems there are very few aix ninjas around. again, thanks.
Not a problem. Just keep asking questions 'til you understand or have an
answer. :)
Oh, I have another suggestion. If you can, post the output of 'errpt'.
It will print one-line summaries of every single logged error.
Sometimes, you can see a clear pattern of problems that leads up to a
significant event.
(To review each entry in detail, do 'errpt -a|more'. To just see the
list, 'errpt'.)
-Dan
the columns are a bit screwy :) ... but it looks like you are paging
out.. maxperm is set to default ..
these are also of concern .. so you probably want to up your fsbufs
64573 paging space I/Os blocked with no psbuf
412862 filesystem I/Os blocked with no fsbuf
avm is 52596 *4096 == 215433216 == 205MB
Your numperm is 61.2% so you are going to be paging out .. you need to
stop it paging by reducing the level of persistent pages that you
cache in realmem .. you do this by lowering maxperm .. i would go with
the following as your avm is about ~ 80% of your realmem ..
maxperm%=10
minperm%=5
i.e.
vmo -p -o maxperm%=10 -o minperm%=5 -o maxclient%=10
And see how you get on ..
check out man ioo and man vmo for tuning tips.
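The arithmetic behind those numbers, as a small Python sketch. The page counts are copied from the vmstat output earlier in the thread. Note that treating numperm as file pages divided by lruable pages is an inference from the fact that the posted figures line up, not something taken from the documentation.

```python
PAGE_SIZE = 4096  # bytes per page, as used in the calculation above


def pages_to_mib(pages):
    """Convert a count of 4 KB pages to MiB."""
    return pages * PAGE_SIZE / (1024 * 1024)


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


if __name__ == "__main__":
    avm_pages = 52596      # last 'avm' sample from vmstat 1 20
    memory_pages = 65536   # 'memory pages' from vmstat -v (256 MB)
    file_pages = 36268     # 'file pages' from vmstat -v
    lruable_pages = 59212  # 'lruable pages' from vmstat -v

    print(f"avm: {pages_to_mib(avm_pages):.0f} MiB, "
          f"{percent(avm_pages, memory_pages):.0f}% of real memory")
    print(f"file pages / lruable pages: "
          f"{percent(file_pages, lruable_pages):.1f}%")
```

With active virtual memory at about 80% of RAM and file pages at about 61% of lruable pages, the two cannot both fit, which is the case for lowering maxperm.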
HTH
Mark Taylor
Heh, I'm quite familiar with these two, too, since that's the other
platforms that my team manages at work.
> AIX is something of a strange beast, isn't it? the more i learn about
> it, the more i see how different it really is from other *nix's.
:)
It's certainly *ahem* different and has its unique personality. :)
The differences are what bother the average experienced UNIX admin (but
new to AIX) the most. I know, I've seen it with all of my co-workers,
and people elsewhere.
If you don't let yourself get psyched out by the differences, you'll do
just fine. I say this based on long experience in training UNIX admins
on AIX system management. Look at it as an interesting and new
adventure, with new food dishes to try.
It took a few months of daily work with AIX and problem-solving various
things before I got used to things. On my first day of work almost 10
years ago, I showed up at 9am. By 9:05am, I had the root password, a
short list of starter projects, and an order to "make yourself an
account; you've got root, so login and figure out how to do that, and
enjoy." and my immediate boss walked off to a meeting elsewhere in the
building. Just like that.
Even worse (for me), the account had to be made in DCE rather than
locally on the machine, so learning curve was even more steep! But got
it done. [And now, DCE or Kerberos related stuff... cake.]
In my earliest AIX days, I was using SMIT a lot. I didn't just use it
ONLY as a menu system. I also used it as a teacher, because with many
SMIT menu options, you can type in parameters but instead of executing
it, you ask SMIT to tell you the exact command(s) it would have had
executed on your behalf. I learned AIX commands fast that way, and it
also makes it very easy to find out how to script anything SMIT can do.
The AIX experts here mostly use AIX commands to manage the box and very
rarely SMIT. The less experienced AIX admins stick to SMIT. Doesn't
matter; both get the job done, at everybody's comfort level. Or the
AIX experts make scripts for everybody to run. It's all good. (Reminds
me a lot of the AS/400 here.)
That's a flexibility that I like a lot about AIX. SMIT also exists as a
text mode or as a GUI application, too. (I always stick to the text mode
version, though I've heard the GUI app in v5 is **VERY** nice.)
AIX scales pretty well, similar to Solaris -- has run on everything
ranging from laptops to supercomputers. (I had all of these at work.)
We also have had everything for Solaris, too -- laptops to the high end
stuff.
As for AIX's differences... well, despite how odd it looks, it's
actually one of the more UNIX-compliant boxes from a standards
perspective than most UNIX or UNIX-like platforms these days. IBM spent
a lot of money in standards work and also ensuring AIX matched.
Why? I dunno, I'm guessing standards-compliance may have been a required
checklist item for some of IBM's bigger AIX customers? Probably
particularly true for certain customers like the U.S. government's
procurement (purchasing) requirements.
So, most open source stuff builds out of box on AIX or ports very easily
with minor tweaks here-and-there, thanks to the standards support. One
of my projects was to build about 45 OSS packages on AIX 4.3.3 and on
AIX 5.2. Something like 70% to 80% built out of the box.
Almost all of the rest required only 1 to 3 minor tweaks -- header file
inclusion, compile option, occasional syscall parameter list change.
Only three packages required real time and attention to port. (I
submitted patches to upstream maintainers, to make it easier for myself
and my co-workers for future versions, too.)
What makes AIX look different to an admin is the fact that it's got a
heritage related to the AS/400 (OS was OS/400) -- now called iSeries,
and the mainframes (various OSes) -- now called zSeries. AIX was not
directly related to either's code base, but the IBM people who
originally designed AIX had a strong AS/400 and mainframe background.
You want to know how much of an influence it was? Well, I got my own
AS/400 (type 9406 model 170; ~110 lb *full* size tower case) for home
the other day. I haven't run OS/400 in about 10 years and was really
rusty and had forgotten everything.
So I booted up V5R2 (what we UNIX-heads would call "Version 5.2", but
what IBM-heads would call "Version 5, Release 2") via a D-mode IPL
(AS/400-speak for 'booting from the CD'), and set off to do various
stuff.
The funny thing? I recognized a lot of things from my AIX use! SMIT-ish
menu system and prompting, PTFs (bug fixes), paging, diag (AIX) vs DST
or SST, design of menu options and screen placement, dumps and analysis,
call-home, and many other things.
I believe all these made it into AIX, thanks to people influenced by
AS/400 and mainframes, instead of the other way around. Some stuff tried
first in AIX also makes it into OS/400 eventually, too. (IBM tries hard
to make sure their products generally get 'best-of-breed' features
after seeing them well executed in their other product families.)
-Dan
speaking of laptops: I have this comatose ThinkPad 860 on my desk,
it refuses to boot from CD (4.1.5 I guess). Does it need a special
version of 4.1.5 ?
> What makes AIX look different to an admin is the fact that it's got a
> heritage related to the AS/400 (OS was OS/400) -- now called iSeries,
> and the mainframes (various OSes) -- now called zSeries. AIX was not
> directly related to either's code base, but the IBM people who
> originally designed AIX had a strong AS/400 and mainframe background.
>
Well, smitty somehow reminds me of the mainframe ISPF menu system,
at least the F1/F3/F10 and probably also the F8 (print image) key
remind me of the help/exit/cancel/printscreen menu items.
And F6 (showing the commands) somehow corresponds to invoking
the TSO command line :-)
>> the problem is constant, slow around the clock, and of course worse
>> during peak usertime, but even when i ssh into the thing at night its
>> ridiculously slow even to type commands!
> Hmm, that really does sound like the system is memory and/or I/O-starved
> (e.g. due to excessive paging *OR* due to application pounding the disk
> so much or so hard).
The data that was supplied so far doesn't seem to clearly indicate this.
"Ridiculously slow even to type commands" seems to point to a network
problem. I'd check and doublecheck e.g. duplex settings, not only on
the connection the AIX box is on, but also later in the path.
--
Jurjen Oskam
I am sorry to say that we have not had the TP 860 in a *long* time. :(
So I do not remember any special instructions or tips. Sorry. :(
I hope someone else here either has a working TP 860 or remembers, and
can answer your question.
You can run AIX 4.2.1 on the 860 if you use a special hack:
http://www.tecnopolis.ca/aixtp/tphack860.html
(You will need both AIX 4.2.1 and AIX 4.1.5 CDs to make it work.)
I don't think you can run anything newer than AIX 4.2.1 on the 860.
>> What makes AIX look different to an admin is the fact that it's got a
>> heritage related to the AS/400 (OS was OS/400) -- now called iSeries,
>> and the mainframes (various OSes) -- now called zSeries. AIX was not
>> directly related to either's code base, but the IBM people who
>> originally designed AIX had a strong AS/400 and mainframe background.
>
> Well, smitty somehow reminds me of the mainframe ISPF menu system,
> at least the F1/F3/F10 and probably also the F8 (print image) key
> remind me of the help/exit/cancel/printscreen menu items.
> And F6 (showing the commands) somehow corresponds to invoking
> the TSO command line :-)
:-)
It is similar for OS/400, too. I have not quite found a direct
equivalent of AIX's F6, but for most other things, it's very similar.
Consistent key function binding, CLI or menu based approach as desired,
helper screens like entering <cmd> at CLI then pressing F4 to be
prompted for various parameters. But by default, all menu screens
prompts for parameters, similar to SMIT. Also has defaults filled in,
and you can change anything allowed.
These days, I now only run TSO via the Hercules emulator for MVS 3.8j.
MVS 3.8j is *very* old, but at least both are free. :)
http://www.jaymoseley.com/hercules/
http://www.cbttape.org/mvs38.htm
Runs on Linux and Windows.
-Dan
Well... I have seen such situations as described, in the past, when a
system is massively oversubscribed with memory and has only a very small
amount of physical memory left after all the heavy swapping.
I have seen this with our large news servers once in a long while. :)
It's so bad that it takes a few minutes for the system to free some
memory, after heavy paging in/out, to run executables. Heavy paging also
leads to severe I/O consumption, too. Not much available I/O to quickly
load things; has to wait in a long queue.
That is true, that network could still be broken in some area upstream.
But so far, various data he posted suggested a borderline setup with 256
MB and apps possibly eating a lot.
No hard data yet; svmon will probably clarify if it's a memory situation
or network, upstream. He's going to try and get perfagent.tools loaded
shortly.
errpt will also often indicate if a system is running low on VMM. He
will also check and post that shortly, too.
You're right, not enough information yet. But it should be coming soon.
-Dan
speaking of MVS (or S/390 for that matter):
A long time ago there were these P/390 plug-in cards for PS/2 boxes,
which gave you a mini-mainframe within your PeeCee, e.g. for development purposes.
Are there many of them still around,
and how useful are they, just in case of the remote chance to find some on eBay?
# inutoc /tmp
# installp -aXgd/tmp bos.perf perfagent.tools
+-----------------------------------------------------------------------------+
Pre-installation Verification...
+-----------------------------------------------------------------------------+
Verifying selections...done
Verifying requisites...done
Results...
WARNINGS
--------
Problems described in this section are not likely to be the source of any
immediate or serious failures, but further actions may be necessary or
desired.
Already Installed
-----------------
The number of selected filesets that are either already installed
or effectively installed through superseding filesets is 3. See
the summaries at the end of this installation for details.
NOTE: Base level filesets may be reinstalled using the "Force"
option (-F flag), or they may be removed, using the deinstall or
"Remove Software Products" facility (-u flag), and then reinstalled.
<< End of Warning Section >>
SUCCESSES
---------
Filesets listed in this section passed pre-installation verification
and will be installed.
Selected Filesets
-----------------
bos.perf.diag_tool  5.2.0.0   # Performance Diagnostic Tool
bos.perf.proctools  5.2.0.0   # Proc Filesystem Tools
bos.perf.tools      5.2.0.0   # Base Performance Tools
perfagent.tools     5.2.0.0   # Local Performance Analysis &...
<< End of Success Section >>
FILESET STATISTICS
------------------
7 Selected to be installed, of which:
4 Passed pre-installation verification
3 Already installed (directly or via superseding filesets)
----
4 Total to be installed
+-----------------------------------------------------------------------------+
Installing Software...
+-----------------------------------------------------------------------------+
installp: APPLYING software for:
bos.perf.tools 5.2.0.0
bos.perf.proctools 5.2.0.0
bos.perf.diag_tool 5.2.0.0
. . . . . << Copyright notice for bos.perf >> . . . . . . .
Licensed Materials - Property of IBM
5765E6200
(C) Copyright International Business Machines Corp. 1993, 2002.
(C) Copyright BULL 1993, 2002.
All rights reserved.
US Government Users Restricted Rights - Use, duplication or disclosure
restricted by GSA ADP Schedule Contract with IBM Corp.
Licensed Materials - Property of IBM
5765E6200
(C) Copyright International Business Machines Corp. 1993, 2002.
All rights reserved.
US Government Users Restricted Rights - Use, duplication or disclosure
restricted by GSA ADP Schedule Contract with IBM Corp.
. . . . . << End of copyright notice for bos.perf >>. . . .
Filesets processed: 3 of 4 (Total time: 12 secs).
installp: APPLYING software for:
perfagent.tools 5.2.0.0
. . . . . << Copyright notice for perfagent.tools >> . . . . . . .
Licensed Materials - Property of IBM
5765E6200
(C) Copyright International Business Machines Corp. 1993, 2002.
(C) Copyright Regents of the University of California 1982, 1986,
1987.
(C) Copyright BULL 1993, 2002.
All rights reserved.
US Government Users Restricted Rights - Use, duplication or disclosure
restricted by GSA ADP Schedule Contract with IBM Corp.
. . . . . << End of copyright notice for perfagent.tools >>. . . .
Finished processing all filesets. (Total time: 20 secs).
+-----------------------------------------------------------------------------+
Summaries:
+-----------------------------------------------------------------------------+
Pre-installation Failure/Warning Summary
----------------------------------------
Name                  Level    Pre-installation Failure/Warning
-------------------------------------------------------------------------------
bos.perf.tune         5.2.0.0  Already superseded by 5.2.0.50
bos.perf.perfstat     5.2.0.0  Already superseded by 5.2.0.50
bos.perf.libperfstat  5.2.0.0  Already superseded by 5.2.0.50
Installation Summary
--------------------
Name                  Level    Part  Event  Result
-------------------------------------------------------------------------------
bos.perf.tools        5.2.0.0  USR   APPLY  SUCCESS
bos.perf.proctools    5.2.0.0  USR   APPLY  SUCCESS
bos.perf.diag_tool    5.2.0.0  USR   APPLY  SUCCESS
bos.perf.diag_tool    5.2.0.0  ROOT  APPLY  SUCCESS
perfagent.tools       5.2.0.0  USR   APPLY  SUCCESS
perfagent.tools       5.2.0.0  ROOT  APPLY  SUCCESS
# svmon -G
exec(): 0509-036 Cannot load program svmon because of the following
errors:
0509-130 Symbol resolution failed for svmon_back because:
0509-136 Symbol get_itos (number 64) is not exported from
dependent module /unix.
0509-192 Examine .loader section symbols with the
'dump -Tv' command.
# svmon
exec(): 0509-036 Cannot load program svmon because of the following
errors:
0509-130 Symbol resolution failed for svmon_back because:
0509-136 Symbol get_itos (number 64) is not exported from
dependent module /unix.
0509-192 Examine .loader section symbols with the
'dump -Tv' command.
#
> It's so bad that it takes a few minutes for the system to free some
> memory, after heavy paging in/out, to run executables. Heavy paging also
> leads to severe I/O consumption, too. Not much available I/O to quickly
> load things; has to wait in a long queue.
Yes, I've seen this happen as well. "cannot fork: no swap space" is the
dreaded message. :)
However, when you eventually get a shell, typing in the commands is not slow
at all. Until you hit Enter that is: if your command results in forking a
process, that'll take quite a while and could easily fail.
Perhaps I misinterpreted "even typing in the commands is slow".
--
Jurjen Oskam
> Confirmed that network connection appears to be set up and seen ok.
>> # netstat -i 1
>> input (en0) output input (Total) output
>> packets errs packets errs colls packets errs packets errs colls
>> 3328612 0 3720965 0 0 3586536 0 3979085 0 0
>> 11 0 10 0 0 11 0 10 0 0
>> 35 0 36 0 0 37 0 38 0 0
>> 18 0 16 0 0 18 0 16 0 0
>> 30 0 21 0 0 30 0 21 0 0
> Very quiet network activity, and no errors or collisions. Very good.
Well, with the NIC in full duplex, there should be no collisions by
definition right?-) But there could still be errors - although at that
low a pps rate, it could still be that there wasn't any attempt at
simultaneous access to the link by both ends.
rick jones
FWIW, some boilerplate I trot-out from time to time regarding duplex:
How Autoneg is supposed to work:
When both sides of the link are set to autoneg, they will "negotiate"
the duplex setting and select full duplex if both sides can do
full-duplex.
If one side is hardcoded and not using autoneg, the autoneg process
will "fail" and the side trying to autoneg is required by spec to use
half-duplex mode.
If one side is using half-duplex, and the other is using full-duplex,
sorrow and woe is the usual result.
So, the following table shows what will happen given various settings
on each side:
          Auto        Half        Full
Auto      Happiness   Lucky       Sorrow
Half      Lucky       Happiness   Sorrow
Full      Sorrow      Sorrow      Happiness
Happiness means that there is a good shot of everything going well.
Lucky means that things will likely go well, but not because you did
anything correctly :) Sorrow means that there _will_ be a duplex
mis-match.
When there is a duplex mismatch, on the side running half-duplex you
will see various errors and probably a number of late collisions. On
the side running full-duplex you will see things like FCS errors.
Note that those errors are not necessarily conclusive, they are simply
indicators.
Further, it is important to keep in mind that a "clean" ping (or the
like - eg "linkloop") test result is inconclusive here - a duplex
mismatch causes lost traffic _only_ when both sides of the link try to
speak at the same time. A typical ping test, being synchronous, one at
a time request/response, never tries to have both sides talking at the
same time.
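For anyone who would rather have the table above in a form a script can check, here is a minimal Python sketch. It assumes both ends are capable of full duplex, and the names `OUTCOME`, `effective_duplex` and `link_outcome` are made up for illustration; they are not part of any real tool.

```python
# Outcome table from the boilerplate above, keyed by (one side, other side).
OUTCOME = {
    ("auto", "auto"): "Happiness",
    ("auto", "half"): "Lucky",
    ("auto", "full"): "Sorrow",
    ("half", "auto"): "Lucky",
    ("half", "half"): "Happiness",
    ("half", "full"): "Sorrow",
    ("full", "auto"): "Sorrow",
    ("full", "half"): "Sorrow",
    ("full", "full"): "Happiness",
}


def effective_duplex(own, peer):
    """Duplex mode a side ends up running, per the rules described above."""
    if own != "auto":
        return own  # hardcoded side simply uses its configured mode
    # An autonegotiating side gets full duplex only if the peer also
    # autonegotiates (assumption: both can do full); otherwise it must
    # fall back to half duplex.
    return "full" if peer == "auto" else "half"


def link_outcome(side_a, side_b):
    """Return (table label, True if the two sides end up mismatched)."""
    a, b = side_a.lower(), side_b.lower()
    mismatch = effective_duplex(a, b) != effective_duplex(b, a)
    return OUTCOME[(a, b)], mismatch


print(link_outcome("Auto", "Full"))
```

The "Sorrow" cells are exactly the combinations where the two sides end up in different duplex modes, which is why "Lucky" still works: the autonegotiating side falls back to half and happens to match.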
--
denial, anger, bargaining, depression, acceptance, rebirth...
where do you want to be today?
these opinions are mine, all mine; HP might not want them anyway... :)
feel free to post, OR email to rick.jones2 in hp.com but NOT BOTH...
if so, what do you make of this:
# svmon -G
exec(): 0509-036 Cannot load program svmon because of the following
errors:
0509-130 Symbol resolution failed for svmon_back because:
0509-136 Symbol get_itos (number 64) is not exported from
dependent module /unix.
0509-192 Examine .loader section symbols with the
'dump -Tv' command.
??????????????
Maybe he's earning his living?
> Pre-installation Failure/Warning Summary
> ----------------------------------------
> Name                    Level     Pre-installation Failure/Warning
> -------------------------------------------------------------------------------
> bos.perf.tune           5.2.0.0   Already superseded by 5.2.0.50
> bos.perf.perfstat       5.2.0.0   Already superseded by 5.2.0.50
> bos.perf.libperfstat    5.2.0.0   Already superseded by 5.2.0.50
>
> [...]
>
> if so, what do you make of this:
> # svmon -G
> exec(): 0509-036 Cannot load program svmon because of the following
> errors:
> 0509-130 Symbol resolution failed for svmon_back because:
> 0509-136 Symbol get_itos (number 64) is not exported from
> dependent module /unix.
> 0509-192 Examine .loader section symbols with the
> 'dump -Tv' command.
I think something went south with your update. How did you
perform the 4.3 to 5.2 upgrade? Did you install an ML after
the upgrade? If so you should use the bos.perf from that ML.
Do a 'instfix -i | grep ML' to see where you're at. Use the
lslpp command to verify you have no packages in BROKEN state.
Regards,
Frank
You probably need to apply patches to the box. It sounds like some sort of
mismatch between the installed kernel and the installed perfagent tools.
This is one of the major reasons why you often need patching to keep
both in sync, since the perfagent tools -- by their very nature -- go
rather deep into kernel data structures.
You may want to do a backup of rootvg. One way to do that is:
# mksysb -e -i -m -X /somefilesystem/mksysb-060106
What that does is back up rootvg to a file named 'mksysb-060106' in
/somefilesystem (which should be a large filesystem on datavg). Or back
it up to tape by changing /somefilesystem/mksysb-060106 to /dev/rmt0.
Or burn it to a CD or DVD... or even just make the ISO image file and
ftp/scp it to another system to burn to CD or DVD, by doing 'smitty mkcd'.
Doing a backup of rootvg, especially for a less experienced AIX admin
working with a production system, is highly recommended. If things
really, really, really go bad, you can always boot off CD #1, a built CD
or DVD, or the network (NIM) and do a mksysb restore.
Think of AIX's mksysb files as the equivalent of Solaris' Flash Archives
(which are just cpio files).
Now, assuming that stuff is out of the way, on to the patching itself.
You'll need to determine your current maintenance level (ML, equivalent
to Solaris's MU or Update x) in order to determine which patch tarball
to fetch.
Do this:
# instfix -i|grep ML
The highest numbered ML will be whatever ML you are at now. For example:
# instfix -i|grep ML
All filesets for 5.2.0.0_AIX_ML were found.
All filesets for 5200-01_AIX_ML were found.
All filesets for 5200-02_AIX_ML were found.
All filesets for 5200-03_AIX_ML were found.
All filesets for 5200-04_AIX_ML were found.
All filesets for 5200-05_AIX_ML were found.
This means that I have ML5. The latest available ML for download is ML7.
So I would get the ML5-to-ML7 upgrade:
ftp://ftp.software.ibm.com/aix/fixes/52/ml/
Then cd to the '520507' subdirectory at the FTP site, and fetch the
520507_v1.tar.gz file -- about 350 MB. I am only using my own system and
ML5-to-ML7 as an example here. Yours will probably differ.
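If you want to script the "highest numbered ML" check rather than eyeball it, something like the following Python sketch would do. It only assumes the line format shown in the 'instfix -i|grep ML' output above; the function name `highest_complete_ml` is hypothetical, and the base-level line (5.2.0.0_AIX_ML) is treated as ML 0.

```python
import re

# Matches only complete MLs; lines starting with "Not all filesets" fail
# because re.match anchors at the start of the line.
ML_LINE = re.compile(r"All filesets for (\S+)_AIX_ML were found\.")


def highest_complete_ml(instfix_output):
    """Return the highest ML number whose filesets were all found, or None."""
    best = None
    for line in instfix_output.splitlines():
        match = ML_LINE.match(line.strip())
        if not match:
            continue
        name = match.group(1)          # e.g. "5200-05" or "5.2.0.0"
        ml = int(name.split("-")[1]) if "-" in name else 0
        if best is None or ml > best:
            best = ml
    return best


sample = """\
All filesets for 5.2.0.0_AIX_ML were found.
All filesets for 5200-01_AIX_ML were found.
All filesets for 5200-02_AIX_ML were found.
All filesets for 5200-03_AIX_ML were found.
All filesets for 5200-04_AIX_ML were found.
All filesets for 5200-05_AIX_ML were found.
"""
print(highest_complete_ml(sample))
```

On a box where the later MLs report "Not all filesets ... were found", the function returns 0, which is a hint that the system is sitting between maintenance levels.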
There's a second tarball but you probably don't have to fetch it; that
usually has locale-related stuff. Not usually a need for people in the
U.S., but often a must-have elsewhere.
(You can get other ML tarballs instead of the ML5-to-ML7 stuff,
depending on what ML version you have.)
Then you'd put it on the remote AIX server. Put it in a directory with
at least 800 MB of space. If you do not have any existing filesystems
with that much space, then what you can do is make a temporary
filesystem.
# lsvg datavg
Look at the PP size. Let's say it's 128 MB. We want 1 GB, so that's 8 PPs.
(128 MB per PP x 8 PPs = 1024 MB to carve out.)
Also make sure that 'FREE PPs' says there are at least 1024 megabytes.
We also want only one copy (unmirrored) since this is a quick-and-dirty
temporary filesystem, by saying '-c1'. We also don't want it to mount
automatically at boot time by saying '-A no'.
# mklv -c1 -y templv datavg 8
# crfs -v jfs -d /dev/templv -m /temp -A no
# mount /temp
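The PP arithmetic above, as a small Python sketch in case your PP size isn't a convenient 128 MB. The helper names `pps_needed` and `fits` are made up for illustration; the PP size and free PP count are whatever 'lsvg datavg' reports on your system.

```python
import math


def pps_needed(target_mb, pp_size_mb):
    """Number of physical partitions needed to hold target_mb, rounded up."""
    return math.ceil(target_mb / pp_size_mb)


def fits(free_pps, target_mb, pp_size_mb):
    """True if the volume group has enough free PPs for the new LV."""
    return free_pps >= pps_needed(target_mb, pp_size_mb)


# The example from the text: 1 GB with 128 MB PPs.
print(pps_needed(1024, 128))
# The bare minimum of 800 MB still rounds up to whole PPs.
print(pps_needed(800, 128))
```

The result of `pps_needed` is the last argument you'd hand to mklv, and `fits` is the 'FREE PPs' sanity check.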
Then upload the tarball to /temp. (Or hop on the system and download from
IBM's FTP site directly to the system if you have a good internet
connection for that machine.)
Time to prepare the patches (known as PTFs in IBM-speak -- program
temporary fixes):
# cd /temp
# gzip -dc 520507_v1.tar.gz | tar xf -
# inutoc .
# lslpp -l|grep APPLIED
If you have anything showing up as APPLIED, then you may want to commit
them now. To do that:
# smitty install
Pick 'Software Maintenance and Utilities'
Pick 'Commit Applied Software Updates (Remove Saved Files)'
Now you're ready to actually do the patching. Whenever you want to do it
-- say, during a maintenance window or whenever is ideal for you:
# smitty update_all
Enter '/temp' when prompted for input device/directory for software
Press the down arrow key to 'COMMIT software updates?' line and then
press tab once to change from 'yes' to 'no'
Press the down arrow key to 'SAVE replaced files?' line and then press
tab once to change from 'no' to 'yes'
(What changing the two options above buys you is that you'll be able to
back them out later on if it introduces problems.)
Then press the Enter key to start the patching.
When done, press esc then 0 keys to exit SMIT if no errors reported.
Let's assume rootvg has two drives: hdisk0 and hdisk1. We now need to
double check to make sure they now have an updated ramdisk image (for
booting; kinda like Linux's initrd) on both drives, and also make sure
the boot list is appropriate.
# bosboot -a -d /dev/hdisk0
# bosboot -a -d /dev/hdisk1
# bootlist -m normal hdisk0 hdisk1
(This will prefer a boot off hdisk0 during a normal boot; if hdisk0 is
not there due to a failed drive, it will then automatically try hdisk1.
It's the equivalent of Solaris' OBP 'boot-device' setting.)
Then reboot.
# shutdown -Fr now
It is best to do the patching + reboot via a serial console connection
(which hopefully you have in place) or if not, via a physical terminal
or monitor/keyboard hooked up as its console.
You will not be able to monitor the boot without a console connection of
some kind, and more importantly, you will need that to recover from any
problems that might pop up.
So it is very important to do this stuff from the console connection.
You can do this stuff remotely if you have a serial console wired up to
its DB-9 port and a terminal server or modem at the remote site.
If the patching + reboot went ok, you can then blow away /temp:
# rmfs /temp
That will give the space back to datavg for future use.
If all went well, svmon *will* work post-patching and reboot.
-Dan
:-)
Nice write-up. Not of particular use to myself, but sure could've used
it for others in the past.
> Further, it is important to keep in mind that a "clean" ping (or the
> like - eg "linkloop") test result is inconclusive here - a duplex
> mismatch causes lost traffic _only_ when both sides of the link try to
> speak at the same time. A typical ping test, being synchronous, one at
> a time request/response, never tries to have both sides talking at the
> same time.
That's true. I'd forgotten to ask him to also do this:
# entstat en0
and post the output. That'd show various AIX network interface-specific
counters for the en0 interface. (Assuming en0 was his primary network
interface and enabled, of course.)
traceroute between him and the remote system might also indicate if
there's any obvious and glaringly bad bottleneck in the path somewhere.
E.g. a hop in-between consistently has a 4000ms delay.
-Dan
If it's an unmanaged switch (e.g. a cheap Netgear or Linksys switch),
then it's almost certainly set to auto-negotiation.
If it's a managed switch (the much more expensive stuff that costs mondo
big bucks), there should be a CLI or web interface to show port info for
a specific port.
E.g. with a Cisco Catalyst switch running CatOS, you'd do 'show port
x/y' where x = module number, y = port number on the module.
Or just 'show port' to figure out which port based on port description
labels then 'show port x/y' to view details.
On the AIX system, do this:
# lsattr -El ent0 -a media_speed
(Yes, that is ent0, and NOT en0)
That's assuming en0/ent0 is your main active network interface.
(-a selects a specific attribute to display. Leave out -a media_speed if
you want to see all the information for ent0.
ent0 is the physical network adapter.
en0 is the TCP/IP config part of ent0.)
It will probably say either 100_Full_Duplex or Autonegotiation. It is
very important that BOTH ends (the host AND the switch) have the exact
same setting.
If host is 100/full, switch is auto(neg), then either set the host to
autoneg or set the switch to 100/full.
If both are 100/full, great.
If both are auto, great.
If one is something, and the other is something else, not so great. ;)
At this point, I would still keep an open mind as to the culprit. It
appears not to be CPU-bound. A network issue is still a possibility.
It's probably more likely memory related... but we haven't yet ruled out
all of the network possibilities and don't have detailed memory
information yet.
-Dan
> :-)
> Nice write-up. Not of particular use to myself, but sure could've used
> it for others in the past.
Feel free to pass it along with suitable attribution.
>> Further, it is important to keep in mind that a "clean" ping (or the
>> like - eg "linkloop") test result is inconclusive here - a duplex
>> mismatch causes lost traffic _only_ when both sides of the link try to
>> speak at the same time. A typical ping test, being synchronous, one at
>> a time request/response, never tries to have both sides talking at the
>> same time.
> That's true. I'd forgotten to ask him to also do this:
> # entstat en0
> and post the output. That'd show various AIX network interface-specific
> counters for the en0 interface. (Assuming en0 was his primary network
> interface and enabled, of course.)
Cool, now I know the AIX equivalent to lanadmin and ethtool :) I just
hope I can remember it the next time I need such knowledge.
rick jones
--
portable adj, code that compiles under more than one compiler
# lscfg -vp|grep Model: | grep IBM
Model: IBM,7044-170
Model: IBM, Racer/Whip, Rev-id 4.2
Model: IBM, Racer/Whip, Open-PIC, 00
Model: IBM,SPH05195
Model: IBM, Python, Rev-id 3.0
Model: IBM, Python, Rev-id 3.0
# ps -efl |grep udt
240001 A rfuser1 20874 21702 0 60 20 4305 5496 70185244 07:23:16 pts/5 0:02 /usr/ud33/bin/udt
240001 A rfuser1 26958 53934 0 60 20 a37 5716 701a9644 10:43:09 pts/14 0:00 /usr/ud33/bin/udt
200001 A root 35900 49850 2 61 20 b4d 232 30a10fec 11:30:53 pts/8 0:00 grep udt
240001 A jhatch 36704 35462 0 60 20 7919 5728 70198444 08:35:33 pts/6 0:30 /usr/ud52/bin/udt
240001 A rfuser5 39946 38918 0 60 20 3846 5828 7017d044 19:42:19 pts/9 0:14 /usr/ud33/bin/udt
240001 A cf1 41384 43830 2 61 20 67 5452 7024ac44 11:01:23 pts/2 0:00 /usr/ud33/bin/udt
240001 A kkennedy 45042 1 1 60 20 76db 5488 31a253d8 10:54:44 - 0:00 /usr/ud52/bin/udt PHANTOM ED30PROCX 011503
240001 A rfuser5 48740 43412 0 60 20 4882 5776 701a4a44 08:43:10 pts/11 0:03 /usr/ud33/bin/udt
240001 A cf4 51244 40844 0 60 20 32cb 5460 70243844 09:07:59 pts/3 0:00 /usr/ud33/bin/udt
240001 A cf2 51818 45534 0 60 20 680a 5464 70178644 09:07:41 pts/0 0:01 /usr/ud33/bin/udt
240001 A rfuser1 54346 19046 0 60 20 3e9f 5384 70158444 10:34:44 pts/10 0:00 /usr/ud33/bin/udt
# ps -ef|grep udt
rfuser1 20874 21702 0 07:23:16 pts/5 0:02 /usr/ud33/bin/udt
rfuser1 26958 53934 0 10:43:09 pts/14 0:00 /usr/ud33/bin/udt
jhatch 36704 35462 0 08:35:33 pts/6 0:30 /usr/ud52/bin/udt
rfuser5 39946 38918 0 19:42:19 pts/9 0:14 /usr/ud33/bin/udt
cf1 41384 43830 0 11:01:23 pts/2 0:00 /usr/ud33/bin/udt
kkennedy 45042 1 0 10:54:44 - 0:00 /usr/ud52/bin/udt PHANTOM ED30PROCX 011503
rfuser5 48740 43412 0 08:43:10 pts/11 0:03 /usr/ud33/bin/udt
cf4 51244 40844 0 09:07:59 pts/3 0:00 /usr/ud33/bin/udt
cf2 51818 45534 0 09:07:41 pts/0 0:01 /usr/ud33/bin/udt
rfuser1 54346 19046 0 10:34:44 pts/10 0:00 /usr/ud33/bin/udt
# errpt
IDENTIFIER TIMESTAMP T C RESOURCE_NAME DESCRIPTION
F89FB899 0105150006 P O dumpcheck The copy directory is too small.
F89FB899 0103150006 P O dumpcheck The copy directory is too small.
A6DF45AA 0101030906 I O RMCdaemon The daemon is started.
1BA7DF4E 0101030906 P S SRC SOFTWARE PROGRAM ERROR
BA431EB7 0101030906 P S SRC SOFTWARE PROGRAM ERROR
BA431EB7 0101030906 P S SRC SOFTWARE PROGRAM ERROR
2BFA76F6 0101030506 T S SYSPROC SYSTEM SHUTDOWN BY USER
9DBCFDEE 0101030806 T O errdemon ERROR LOGGING TURNED ON
192AC071 0101030506 T O errdemon ERROR LOGGING TURNED OFF
A6DF45AA 1225030905 I O RMCdaemon The daemon is started.
1BA7DF4E 1225030905 P S SRC SOFTWARE PROGRAM ERROR
BA431EB7 1225030905 P S SRC SOFTWARE PROGRAM ERROR
BA431EB7 1225030905 P S SRC SOFTWARE PROGRAM ERROR
2BFA76F6 1225030505 T S SYSPROC SYSTEM SHUTDOWN BY USER
9DBCFDEE 1225030805 T O errdemon ERROR LOGGING TURNED ON
192AC071 1225030505 T O errdemon ERROR LOGGING TURNED OFF
F89FB899 1223150005 P O dumpcheck The copy directory is too small.
A6DF45AA 1221100705 I O RMCdaemon The daemon is started.
1BA7DF4E 1221100605 P S SRC SOFTWARE PROGRAM ERROR
BA431EB7 1221100605 P S SRC SOFTWARE PROGRAM ERROR
BA431EB7 1221100605 P S SRC SOFTWARE PROGRAM ERROR
2BFA76F6 1221100305 T S SYSPROC SYSTEM SHUTDOWN BY USER
9DBCFDEE 1221100605 T O errdemon ERROR LOGGING TURNED ON
192AC071 1221100105 T O errdemon ERROR LOGGING TURNED OFF
BA431EB7 1221094705 P S SRC SOFTWARE PROGRAM ERROR
369D049B 1220053505 I O SYSPFS UNABLE TO ALLOCATE SPACE IN FILE SYSTEM
369D049B 1219123705 I O SYSPFS UNABLE TO ALLOCATE SPACE IN FILE SYSTEM
A6DF45AA 1218030905 I O RMCdaemon The daemon is started.
1BA7DF4E 1218030905 P S SRC SOFTWARE PROGRAM ERROR
BA431EB7 1218030905 P S SRC SOFTWARE PROGRAM ERROR
BA431EB7 1218030905 P S SRC SOFTWARE PROGRAM ERROR
2BFA76F6 1218030505 T S SYSPROC SYSTEM SHUTDOWN BY USER
9DBCFDEE 1218030805 T O errdemon ERROR LOGGING TURNED ON
192AC071 1218030505 T O errdemon ERROR LOGGING TURNED OFF
A6DF45AA 1213125705 I O RMCdaemon The daemon is started.
1BA7DF4E 1213125605 P S SRC SOFTWARE PROGRAM ERROR
BA431EB7 1213125605 P S SRC SOFTWARE PROGRAM ERROR
BA431EB7 1213125605 P S SRC SOFTWARE PROGRAM ERROR
2BFA76F6 1213125305 T S SYSPROC SYSTEM SHUTDOWN BY USER
9DBCFDEE 1213125605 T O errdemon ERROR LOGGING TURNED ON
192AC071 1213125205 T O errdemon ERROR LOGGING TURNED OFF
E18E984F 1213112805 P S SRC SOFTWARE PROGRAM ERROR
1BA7DF4E 1213112805 P S SRC SOFTWARE PROGRAM ERROR
E18E984F 1213112805 P S SRC SOFTWARE PROGRAM ERROR
E18E984F 1213112805 P S SRC SOFTWARE PROGRAM ERROR
BA431EB7 1213112805 P S SRC SOFTWARE PROGRAM ERROR
E18E984F 1213112805 P S SRC SOFTWARE PROGRAM ERROR
BA431EB7 1213112805 P S SRC SOFTWARE PROGRAM ERROR
369D049B 1213112705 I O SYSPFS UNABLE TO ALLOCATE SPACE IN FILE SYSTEM
F89FB899 1212150005 P O dumpcheck The copy directory is too small.
F89FB899 1211150005 P O dumpcheck The copy directory is too small.
A6DF45AA 1211031005 I O RMCdaemon The daemon is started.
1BA7DF4E 1211030905 P S SRC SOFTWARE PROGRAM ERROR
BA431EB7 1211030905 P S SRC SOFTWARE PROGRAM ERROR
BA431EB7 1211030905 P S SRC SOFTWARE PROGRAM ERROR
2BFA76F6 1211030505 T S SYSPROC SYSTEM SHUTDOWN BY USER
9DBCFDEE 1211030905 T O errdemon ERROR LOGGING TURNED ON
192AC071 1211030505 T O errdemon ERROR LOGGING TURNED OFF
C5C09FFA 1210181305 P S SYSVMM SOFTWARE PROGRAM ABNORMALLY TERMINATED
C5C09FFA 1210181305 P S SYSVMM SOFTWARE PROGRAM ABNORMALLY TERMINATED
F89FB899 1210150005 P O dumpcheck The copy directory is too small.
F89FB899 1209150005 P O dumpcheck The copy directory is too small.
F89FB899 1208150005 P O dumpcheck The copy directory is too small.
F89FB899 1207150005 P O dumpcheck The copy directory is too small.
#
I'll let you know as soon as I get svmon going (i just got the 5.2
disks, and i'll probably ftp the filesets as soon as i get back from
lunch!).
Seriously though, thank you for the guidance and insight... your
comments about AIX being weird at first for Unix admins of other
flavors is ON the MONEY! how long have you been doing this stuff,
anyway? are you east coast or west? (i worked in silicon valley for a
few years, but am now back on the east coast (which apparently really
DOES mean wearing a tie to work and working on IBM gear... what a
cliche... west coast = shorts, t-shirt, solaris/linux and sunshine...
east coast = jacket, tie, IBM and rain!))
i have noticed the open-source application friendliness (or "L" for
"Linux Affinity", as IBM marketing types like to call it) that AIX has
going for it... which is cool. it works out very well for IBM that
most linux packages compile/run so easily on AIX, doesn't it?
also, do you know of any good AIX or pSeries websites/books i might
check out? (i'm not that impressed with rootvg.net so far).
any other useful commands or tools i should try?
Thanks so much, man. I promise to spread the knowledge to others and
help them when i can.. i think i'll be hanging out in this group for
awhile
anyway, i'll let you know when i get svmon going!
--C
Dan,
okay, i got the disks and ftp'd the filesets to the box.. no problem. i
installed them and it looked like it went fine... then, the moment of
truth, i go to run svmon and BAM!!! the bane of my existence...
weird AIX error messages!!! check it out:
# inutoc /tmp
# installp -aXgd/tmp bos.perf perfagent.tools
<< End of Warning Section >>
<< End of Success Section >>
+-----------------------------------------------------------------------------+
Installing Software...
+-----------------------------------------------------------------------------+
+-----------------------------------------------------------------------------+
Summaries:
+-----------------------------------------------------------------------------+
Pre-installation Failure/Warning Summary
----------------------------------------
Name                    Level     Pre-installation Failure/Warning
-------------------------------------------------------------------------------
bos.perf.tune           5.2.0.0   Already superseded by 5.2.0.50
bos.perf.perfstat       5.2.0.0   Already superseded by 5.2.0.50
bos.perf.libperfstat    5.2.0.0   Already superseded by 5.2.0.50
Installation Summary
--------------------
Name                    Level     Part    Event    Result
-------------------------------------------------------------------------------
bos.perf.tools          5.2.0.0   USR     APPLY    SUCCESS
bos.perf.proctools      5.2.0.0   USR     APPLY    SUCCESS
bos.perf.diag_tool      5.2.0.0   USR     APPLY    SUCCESS
bos.perf.diag_tool      5.2.0.0   ROOT    APPLY    SUCCESS
perfagent.tools         5.2.0.0   USR     APPLY    SUCCESS
perfagent.tools         5.2.0.0   ROOT    APPLY    SUCCESS
# svmon -G
exec(): 0509-036 Cannot load program svmon because of the following
errors:
0509-130 Symbol resolution failed for svmon_back because:
0509-136 Symbol get_itos (number 64) is not exported from
dependent module /unix.
0509-192 Examine .loader section symbols with the
'dump -Tv' command.
# svmon
exec(): 0509-036 Cannot load program svmon because of the following
errors:
0509-130 Symbol resolution failed for svmon_back because:
0509-136 Symbol get_itos (number 64) is not exported from
dependent module /unix.
0509-192 Examine .loader section symbols with the
'dump -Tv' command.
#wtf?
# inutoc /tmp
# installp -aXgd/tmp bos.perf perfagent.tools
<< End of Warning Section >>
<< End of Success Section >>
+-----------------------------------------------------------------------------+
Installing Software...
+-----------------------------------------------------------------------------+
+-----------------------------------------------------------------------------+
Summaries:
+-----------------------------------------------------------------------------+
#wtf??
# instfix -i | grep ML
All filesets for 5.2.0.0_AIX_ML were found.
Not all filesets for 5200-01_AIX_ML were found.
Not all filesets for 5200-02_AIX_ML were found.
Not all filesets for 5200-03_AIX_ML were found.
Not all filesets for 5200-04_AIX_ML were found.
Not all filesets for 5200-05_AIX_ML were found.
Not all filesets for 5200-06_AIX_ML were found.
#
# lslpp -E all
====================================================================
Installed License Agreements
====================================================================
The installed software listed below contains license agreements
which have been accepted.
--------------------------------------------------------------------
--------------------------------------------------------------------
Fileset: bos.rte
Product ID:
Description:
Agreement File: /usr/swlag/en_US/BOS.la
Date: Sun Aug 28 13:31:36 EST 2005
Machine ID: 000C328F4C00
Fileset: devices.pci.14107802.ucode
Product ID: 5765-E6200
Description: PCI-X Dual Channel Ultra320 SCSI RAID Adapter Microcode
Agreement File: /usr/swlag/%L/devices.pci.14107802.ucode.la
Date: Sun Aug 28 13:31:36 EST 2005
Machine ID: 000C328F4C00
Fileset: devices.pci.14106602.ucode
Product ID: 5765-E6200
Description: PCI-X Dual Channel SCSI Adapter Microcode
Agreement File: /usr/swlag/%L/devices.pci.14106602.ucode.la
Date: Sun Aug 28 13:31:36 EST 2005
Machine ID: 000C328F4C00
Fileset: rpm.rte
Product ID:
Description: RPM Package Manager
Agreement File: /usr/swlag/%L/rpm.rte.la
Date: Sun Aug 28 13:31:36 EST 2005
Machine ID: 000C328F4C00
Fileset: bos.net.ncs
Product ID:
Description: Network Computing System 1.5.1
Agreement File: /usr/swlag/%L/NCS.la
Date: Sun Aug 28 13:31:36 EST 2005
Machine ID: 000C328F4C00
Fileset: ifor_ls.base.cli
Product ID:
Description: License Use Management Runtime Code
Agreement File: /usr/swlag/%L/LUM.la
Date: Sun Aug 28 13:31:36 EST 2005
Machine ID: 000C328F4C00
Fileset: Java14.sdk
Product ID: 5648-C9802
Description: Java SDK 32-bit
Agreement File: /usr/swlag/%L/Java14.la
Date: Sun Aug 28 13:31:36 EST 2005
Machine ID: 000C328F4C00
File: cdrecord
Product ID: cdrecord
Description:
Agreement File: /usr/swlag/en_US/as_is.txt
Date: Sun Aug 28 13:31:36 EST 2005
Machine ID: 000C328F4C00
File: mkisofs
Product ID: mkisofs
Description:
Agreement File: /usr/swlag/en_US/as_is.txt
Date: Sun Aug 28 13:31:36 EST 2005
Machine ID: 000C328F4C00
> if so, what do you make of this:
> # svmon -G
> exec(): 0509-036 Cannot load program svmon because of the following
The box was installed using product media and then patches and/or
maintenance levels were installed. After that, you installed svmon
using the fileset on the original product media. Result: svmon old,
rest of system new.
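To make the "svmon old, rest of system new" point concrete, here is a rough Python sketch that compares fileset levels as numbers. The levels are the ones from the installp summary posted earlier (the three "Already superseded by 5.2.0.50" filesets are taken to be installed at 5.2.0.50); the helper names `parse_level` and `stale_filesets` are invented for illustration.

```python
def parse_level(level):
    """Turn a dotted level like '5.2.0.50' into a tuple of ints.

    Comparing tuples avoids the string-comparison trap where
    '5.2.0.6' would sort after '5.2.0.50'.
    """
    return tuple(int(part) for part in level.split("."))


def stale_filesets(installed, baseline):
    """Names of filesets whose installed level is older than the baseline."""
    base = parse_level(baseline)
    return sorted(
        name for name, level in installed.items() if parse_level(level) < base
    )


installed = {
    "bos.perf.tune": "5.2.0.50",
    "bos.perf.perfstat": "5.2.0.50",
    "bos.perf.libperfstat": "5.2.0.50",
    "bos.perf.tools": "5.2.0.0",     # freshly installed from product media
    "perfagent.tools": "5.2.0.0",    # freshly installed from product media
}
print(stale_filesets(installed, "5.2.0.50"))
```

The two filesets it flags are the ones that came straight off the original media, which fits the symptom: the tools are at base level while the rest of bos.perf has already been updated.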
--
Jurjen Oskam
> It will probably say either 100_Full_Duplex or Autonegotiation. It is
> very important that BOTH ends (the host AND the switch) has the exact
> same setting matched up.
As a side note: we experience problems when both the host and the switchport
are set to autonegotiate. Throughput is then measured in kilobytes per
second. This is with regular IBM NICs and regular Cisco switches.
When everything is set to 100/Full, it works just fine.
Just a heads up.
--
Jurjen Oskam
You're right, of course.
Christian: you'll need to fetch the updates for perfagent.tools and
bos.perf.tools and install that.
One way to do that is to get the ML6-to-ML7 upgrade tar file, unpack it,
cd to /temp (or wherever you unpack it) then do:
# installp -aXgd. perfagent.tools
See if that helps. If not, then you will also have to do:
# installp -aXgd. bos.perf.tools
I don't recall if IBM makes individual filesets directly downloadable
for AIX v5 like they did with AIX v4... so you might have to grab that
big tarball just to install some stuff from it.
-Dan
Since he has the original filesets from the 5.2 media
installed, why not use fix central to get the current
fileset? Go here:
http://www-912.ibm.com/eserver/support/fixes/search.jsp?system=2&release=5.2
and put the fileset names in the search string field.
Also i'd suggest after the install you check your system
with 'instfix -i | grep ML' again. There's a chance your
perf. problems are related to an update that wasn't done
cleanly. Bring the system up to a consistent and, if
possible, the newest ML.
Regards,
Frank
P.S: Please do not constantly open new threads. This will
only drive people away, since it's harder to follow. If
you need more timely replies i'd suggest to open up a call
with IBM. Thanks.
> I don't recall if IBM makes individual filesets directly downloadable
> for AIX v5 like they did with AIX v4... so you might have to grab that
Oh, they do: google for "aix fixes", and the first hit refers to the
page on the IBM site where you can download all sorts of fixes, including
single filesets. It's quite nice: if you supply the output of "lslpp" of
the target system, you get all dependencies tailored to your system.
--
Jurjen Oskam
It's quite nice for one-off individual lookups, but hard to automate via
FTP as they are not made available there by default for v5 fixes.
I have a cron job that runs nightly which fetches latest copies of
patches to a local patch repository.
Our machines are pointed to the local patch repository for getting
fixes, for our various OS platforms. Works great for AIX with v4 for
many years now.
AIX v5 has been problematic in that respect, other than for the ML tarballs.
-Dan
You should probably read up on suma and its download-only
feature. I think this might be just the right thing
for you.
Regards,
Frank
SUMA information:
http://www-128.ibm.com/developerworks/eserver/library/es-updateaix.html
Ahh! Very nice. I don't have any AIX 5.3 systems, so wasn't aware of it.
Does look great. Hmm. Guess I now have a good reason to order 5.3 for
the local patch repository server. :-)
I see that it can download patches for older OS releases (e.g. 5.2) so
that will be useful.
-Dan
I don't recall precisely, but there was a way for 5.2 systems.
Either suma - which is actually "just" a set of perl scripts -
would run without a hitch or you had to use the lslpp output
from the 5.2 box and feed it to suma on the 5.3 box.
Regards,
Frank
check out the suma command .. it's been backported to 5.2
ref: http://www14.software.ibm.com/webapp/set2/sas/f/suma/enabling.html
HTH
Mark Taylor
Nice! I thought I'd have to buy 5.3 to get SUMA, but then found out it's
already in 5.2 ML5 and later (as you pointed out) -- thanks!!!
(And people with 5.2 ML4 or earlier, or 5.1, can download the
required packages from the URL above.)
-Dan
check out the suma command .. it's been backported to 5.2
tried to post this yesterday ... you can enable suma on 5.2 and 5.1
sorry to interrupt...
it's highly possible you are already at ML5 but only installed the base
level of the new filesets...
+-----------------------------------------------------------------------------+
Summaries:
+-----------------------------------------------------------------------------+
Pre-installation Failure/Warning Summary
----------------------------------------
Name                    Level     Pre-installation Failure/Warning
-------------------------------------------------------------------------------
bos.perf.tune           5.2.0.0   Already superseded by 5.2.0.50
bos.perf.perfstat       5.2.0.0   Already superseded by 5.2.0.50
bos.perf.libperfstat    5.2.0.0   Already superseded by 5.2.0.50
Installation Summary
--------------------
Name                    Level     Part    Event    Result
-------------------------------------------------------------------------------
bos.perf.tools          5.2.0.0   USR     APPLY    SUCCESS
bos.perf.proctools      5.2.0.0   USR     APPLY    SUCCESS
bos.perf.diag_tool      5.2.0.0   USR     APPLY    SUCCESS
bos.perf.diag_tool      5.2.0.0   ROOT    APPLY    SUCCESS
perfagent.tools         5.2.0.0   USR     APPLY    SUCCESS
perfagent.tools         5.2.0.0   ROOT    APPLY    SUCCESS
maybe you can try running lppchk?
you should also update the newly installed filesets to ML5, if that
can be done.
Could you please check your newsreader/server setting,
or move to another NNTP-provider? Your posts are getting
duplicated, which makes the reading of the thread very
hard. Not to mention indexing at the archives ...
Please do *not* repost if your post does not show up
immediately, that's not how NNTP works ;-)
Thanks,
Frank