A few days ago we noticed some strange behavior on our p690 system (running
5.2 with all the patches as of end of July 2004 applied). The command
ls -l on a JFS2 file system was hanging indefinitely. Has anyone noticed this
behavior before? We resorted to restoring a mksysb from the past.
Another question relates to failed autofs mounts from a 'crappy' NFS server.
We had to mount an automounted FS from an unstable NFS file server. When the
server failed (and was restarted) the previously automounted FSs could not be
unmounted, nor could the same FS be mounted afresh. We tried everything to
force unmounts from the failing NFS server, but to no avail.
Any hint, especially on the first item, would be greatly appreciated.
thanks!
Michael
> A few days ago we've noticed a strange behavior on our p690 system (running
> 5.2 with all the patches as of end of july 2004 applied). Command
> ls -l on a JFS2 file system was hanging indefinitely.
Should that happen again, try doing the same command again on another terminal,
but this time trace it with the truss command. Then you'll know exactly where
it hangs.
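In case it helps, a minimal sketch of that approach (/work is a hypothetical
mount point standing in for the affected JFS2 filesystem):

```sh
# Reproduce the hang from a second terminal, tracing system calls;
# -o writes the trace to a file instead of the terminal.
truss -o /tmp/ls.truss ls -l /work
# In yet another terminal, watch where the trace stops:
tail -f /tmp/ls.truss
```

The last syscall in the trace (e.g. a stat on one particular inode) tells you
what ls was waiting on when it hung.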
You said you restored an mksysb when this happened earlier. Was the filesystem
that caused the hang wiped and recreated? (And I don't mean rm -rf wiped, I
mean crfs wiped.) If not, the cause *might* be a corrupted filesystem. Restoring
the mksysb might not have fixed anything in that case. I'd unmount the
filesystem and fsck it thoroughly.
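A sketch of that sequence, with hypothetical names (/work for the mount point,
/dev/worklv for its logical volume):

```sh
umount /work
fsck -y /dev/worklv    # -y answers yes to all repair prompts
fsck -y /dev/worklv    # run again until it reports the filesystem clean
mount /work
```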
> Another question relates to failed autofs mounts from a 'crappy' NFS server.
> We had to mount an automounted FS from an unstable NFS file server. When the
> server failed (and was restarted) the previously automounted FSs could not be
What mount options were used? In the case of NFS, some options can result in
unkillable hung processes when the server is unreachable/unavailable.
--
Jurjen Oskam
"I often reflect that if "privileges" had been called "responsibilities" or
"duties", I would have saved thousands of hours explaining to people why
they were only gonna get them over my dead body." - Lee K. Gleason, VMS sysadmin
do a dd of both the filesystem and the jfs2log device and open a call with IBM.
we've had 'a few' jfs2-related problems by now, so this is becoming
routine.
afterwards, umount & fsck & logform & fsck and it should start behaving
reasonably again.
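Spelled out, that sequence might look like this (all names are placeholders:
/work is the mount point, /dev/worklv the JFS2 logical volume, /dev/loglv00
its jfs2log device):

```sh
# Capture images for IBM support *before* repairing anything:
dd if=/dev/worklv of=/tmp/worklv.img bs=256k
dd if=/dev/loglv00 of=/tmp/loglv00.img bs=256k
# Then repair:
umount /work
fsck -y /dev/worklv
logform /dev/loglv00   # reinitializes the jfs2log; asks for confirmation
fsck -y /dev/worklv
mount /work
```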
On Tue, 31 Aug 2004, Florian Heigl wrote:
f| Date: Tue, 31 Aug 2004 01:47:38 +0200
f| From: Florian Heigl <floria...@m4f.net>
f| Newsgroups: comp.unix.aix
f| Subject: Re: AIX 5.2 ls command hanging on JFS2 FS
f|
f| Michael E. Thomadakis wrote:
f| > Hello all.
f| >
f| > A few days ago we've noticed a strange behavior on our p690 system (running
f| > 5.2 with all the patches as of end of july 2004 applied). Command
f| > ls -l on a JFS2 file system was hanging indefinitely. Has anyone noticed this
f|
f| do a dd of both the filesystem and the jfs2log device and open a call with IBM.
f| we've had 'a few' jfs2-related problems by now, so this is becoming
f| routine.
f|
f| afterwards, umount & fsck & logform & fsck and it should start behaving
f| reasonably again.
f|
So you think that an fsck could correct the JFS2 problem? I know that we need to
do this from time to time to correct minor inconsistencies.
Thanks,
Michael
On Mon, 30 Aug 2004, Jurjen Oskam wrote:
j| Date: 30 Aug 2004 19:12:52 GMT
j| From: Jurjen Oskam <jos...@quadpro.stupendous.org>
j| Newsgroups: comp.unix.aix
j| Subject: Re: AIX 5.2 ls command hanging on JFS2 FS
j|
j| On 2004-08-30, Michael E. Thomadakis <mi...@hellas.tamu.edu> wrote:
j|
j| > A few days ago we've noticed a strange behavior on our p690 system (running
j| > 5.2 with all the patches as of end of july 2004 applied). Command
j| > ls -l on a JFS2 file system was hanging indefinitely.
j|
j| Should that happen again, try doing the same command again on another terminal,
j| but this time trace it with the truss command. Then you'll know exactly where
j| it hangs.
OK, I will do this.
j|
j| You said you restored an mksysb when this happened earlier. Was the filesystem
j| that caused the hang wiped and recreated? (And I don't mean rm -rf wiped, I
j| mean crfs wiped.) If not, the cause *might* be a corrupted filesystem. Restoring
j| the mksysb might not have fixed anything in that case. I'd unmount the
j| filesystem and fsck it thoroughly.
That FS is a JFS2 that is not part of rootvg. I've also suspected that the
JFS2 FS needed fsck.
j|
j| > Another question relates to failed autofs mounts from a 'crappy' NFS server.
j| > We had to mount an automounted FS from an unstable NFS file server. When the
j| > server failed (and was restarted) the previously automounted FSs could not be
j|
j| What mount options were used? In the case of NFS, some options can result in
j| unkillable hung processes when the server is unreachable/unavailable.
I always use intr,bg plus explicit wsize= and rsize= settings.
I want to mention that after the crappy NFS server had to be restarted, the
entire autofs tree would not mount or umount anything. I was trying to force
umounts, but the umount command was getting stuck.
Any idea on how to umount FSs from a dead NFS server, when losing data is OK?
Thanks
Michael
j|
> j| What mount options were used? In the case of NFS, some options can result in
> j| unkillable hung processes when the server is unreachable/unavailable.
>
> I always use intr,bg plus explicit wsize= and rsize= settings.
I very rarely use NFS, so this might not be too grounded in reality, but the
manpage for mount says that "hard" is the default. That would mean that any
outstanding I/O requests are tried and retried until the process responsible
for the request is interrupted (which is possible because you used "intr").
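For completeness, one way such a mount might look (server:/export and
/mnt/data are placeholders); with soft,intr a dead server eventually returns
an I/O error to the application instead of hanging it forever:

```sh
mount -o soft,intr,bg,rsize=32768,wsize=32768 server:/export /mnt/data
```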
> Any idea on how to umount FSs from a dead NFS server, when losing data is OK.
Try to determine if there are any processes with open files on the affected
filesystems (with fuser -c and/or fuser -d). Then kill those processes, and
try to umount the filesystems when all processes are gone. As an alternative,
look into the "soft" mount option for NFS.
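A possible cleanup sequence along those lines (/mnt/data is a placeholder for
the dead NFS mount; whether the forced unmount is honored depends on your AIX
level):

```sh
fuser -c /mnt/data      # list processes using the mounted filesystem
fuser -kc /mnt/data     # kill them (data loss accepted)
umount /mnt/data
umount -f /mnt/data     # forced unmount, if supported at your level
```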
--
Jurjen Oskam
Yes. JFS2 has been nothing but trouble for us on the Regattas.
I've had hangs like yours that caused _all_ filesystems (even JFS1)
to become corrupted... My understanding then was that log updates got
registered, but not data, which led to thousands of NULL'ed files,
and a restore from mksysb was the only way out.
But that was on early 5.1. IBM was then advising we shouldn't run JFS2
on the 32-bit kernel. Later we moved to the 64-bit kernel on 5.1D. We tried
JFS2 on a few filesystems, which again caused problems with excessive
page scan activity. Then IBM advised us that JFS2 wasn't good for our
kind of workload (HPC center).
So, I've given up on JFS2.
-jf
On Sat, 11 Sep 2004, Jan-Frode Myklebust wrote:
j| Date: 11 Sep 2004 16:41:29 GMT
j| From: Jan-Frode Myklebust <janf...@parallab.uib.no>
j| Newsgroups: comp.unix.aix
j| Subject: Re: AIX 5.2 ls command hanging on JFS2 FS
j|
j| On 2004-08-30, Michael E. Thomadakis <mi...@hellas.tamu.edu> wrote:
j| >
j| > A few days ago we've noticed a strange behavior on our p690 system (running
j| > 5.2 with all the patches as of end of july 2004 applied). Command
j| > ls -l on a JFS2 file system was hanging indefinitely. Has anyone noticed this
j| > behavior before?
j|
j| Yes. JFS2 has been nothing but trouble for us on the Regattas.
j| I've had hangs like yours that caused _all_ filesystems (even JFS1)
j| to become corrupted... My understanding then was that log updates got
j| registered, but not data, which led to thousands of NULL'ed files,
j| and a restore from mksysb was the only way out.
We are running AIX 5.2 and try to keep it as up-to-date as possible. In
mid-summer we applied the (then) patches to bring it to ML04. Then one of
our JFS2 filesystems started hanging. We reverted back to ML0 and gradually to
ML02. We never had problems with JFS2 before applying the patches for ML04.
There are some APARs available now for JFS2 hung file systems. IY59082 is the
latest one for AIX 5.2. In mid-August they had released another one for the
same problem.
After applying these latest patches, the JFS2 FS seems 'OK'.
j|
j| But that was on early 5.1. IBM was then advising we shouldn't run JFS2
j| on the 32-bit kernel. Later we moved to the 64-bit kernel on 5.1D. We tried
j| JFS2 on a few filesystems, which again caused problems with excessive
j| page scan activity. Then IBM advised us that JFS2 wasn't good for our
j| kind of workload (HPC center).
j|
j| So, I've given up on JFS2.
We run the 64-bit kernel and we are also an HPC shop. We have spent quite some
time tuning the VM, I/O and other parameters.
I only hope that the JFS2 patch will hold.
j|
j|
j| -jf
j|
-MT
Could you please tell me your settings, and the amount of memory?
And maybe the rationale behind the settings?
We have 3 nodes. Two with 64 GB, and one with 192 GB memory. All
running with:
vmtune -R 64 -f 3840 -F 5888 -p 5 -P 10 -t 10 -y 1 -h 0
When a lot of memory is used for page buffers, and suddenly a large-memory
application wants to grab it, we get into a long period
of page scan activity. It seems the routine for freeing up memory
is highly inefficient. Have you seen this? Found any magic vmtune
to fix it? It's not a big problem, since we're mostly running GPFS,
but Gaussian running on a local JFS filesystem seems quite good at
triggering it..
Also, one of the nodes has a local JFS filesystem on which we're running Tivoli
Space Manager (HSM), and we export it to the two other nodes via NFS.
When writing large files to this filesystem over NFS, we get very high
system load on the NFS server. Most CPUs run in system mode, killing
the performance of user applications. I think this is again related to
the page scans, but any hints on NFS tuning would be greatly appreciated.
> I only hope that the JFS2 patch will hold.
Good luck!
-jf
What we have
------------
A 32 PE Regatta using Power4 at 1.3GHz, with 64GB of main memory and ~1TB in
one I/O drawer. 1/2 TB is dedicated to a JFS2 'work' file system where jobs
read and write their usually huge data files (10-30GB is not uncommon) as they
run. We get a lot of I/O activity on this system. (We've recently attached it
to a partitioned SGI TP9500 RAID with **much** pain, as documentation and
support are non-existent. We had some issues there that I could discuss in a
separate email due to the amount of detail involved.) All other file systems
are JFS only because quotas are not supported on JFS2 until AIX 5.3.
To tune this part of the system, AIX gives you a number of options, but the
best starting point is to understand the specific needs of the workload in
your environment. Use the various monitoring tools to see what resources are
requested, their request levels, and who makes the requests and when. You will
be surprised to see how often common-sense assumptions are violated in
mixed-workload environments. filemon has been very useful to us (along with
vmstat, nmon, etc.); xmperf is alright as a visualization tool too. Here we run
EVERYTHING: compute-intensive apps with very large memory footprints all the
way to I/O-intensive jobs (Gaussian, BLAST, etc.) We use MPI and OpenMP.
We are using STRICT upper bounds for the percentages of JFS, JFS2 and NFS cache
pages. Otherwise, these bounds would be advisory and not enforceable. We do not
want computation pages to get evicted by FS caching, but stale computation
pages are better re-used for FS data. The actual numbers vary by workload
mix and need monitoring and experimentation. You need to know the statistics
of the I/O and paging requests of your apps: this will guide you in laying out
the appropriate geometry in the logical volumes (stripe width/size, page
read-ahead, release-behind, random I/O clustering, etc.)
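The strict caps described above map onto vmo settings like these (the
percentages are the values from the listing further down, not recommendations;
-p makes them persist across reboots):

```sh
vmo -p -o minperm%=2 -o maxperm%=40 -o strict_maxperm=1
vmo -p -o maxclient%=39 -o strict_maxclient=1
```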
Pay attention to the logical-to-physical storage mapping for the file systems.
Make sure that you use as many physical disks as possible for I/O that
is serviced concurrently. The logical volume to physical
disk mapping is determined after you know the I/O workload.
We are using 'deferred allocation' (VM is allocated when a page needs to be
paged out), and for this reason we have a total of ~50GB of VM backing space
in 4 paging spaces. Note that we are using 4 VM pools (one per 8 CPUs or 1
MCM), 2 page-sets per VM pool and an LRU bucket of 128K pages. These reduce
the contention and overhead of the threads which do VM management. Tune the
wake interval of syncd. Here we have set it to 20 seconds. Too frequent
wakeups mean that dirty pages are written out too often, thus missing the
opportunity to coalesce write-backs. Too long a period means that a lot of
dirty pages accumulate over time, and when you need a free page a lot of
flushing will take place.
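On AIX the syncd interval is simply the daemon's command-line argument, and
syncd is started from /sbin/rc.boot, so the change is made there (a sketch;
verify the exact line on your system, the change takes effect at next boot):

```sh
grep syncd /sbin/rc.boot
# typically something like:
#   nohup /usr/sbin/syncd 60 > /dev/null 2>&1 &
# change 60 to 20 for a 20-second wake interval
```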
For NFS, you can control the number and priority of the server threads doing
the NFS I/O for the clients. You need to determine how much of your system
should be an NFS server and how much a compute one. There are resource
controls in AIX that allow you to put tasks in resource usage classes, but we
haven't used this feature enough to know how much it buys you.
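As a sketch, the relevant knobs are nfso tunables on the server (the values
below are ours, from the listing further down, not recommendations):

```sh
nfso -o nfs_max_threads=3891          # cap on nfsd kernel threads
nfso -o nfs_server_base_priority=0    # 0 = floating priority for nfsd
```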
Networking is another area which many times receives less attention than it
deserves. Our Regatta has 2 GigE adapters that connect to a public network and
another SMP server, respectively. The 2nd server also hosts NFS-exported file
systems and a tape library. We are using Jumbo Frames (MTU=9000) and we have
enabled all the nice h/w offloading features that the cards allow us to use:
checksum, TCP/UDP and IP h/w assist. On the TCP/UDP and BSD socket side we
have enabled all the recent enhancements: window scaling, large windows, ECN,
SACKs, etc. These propagate to all apps, including NFS. (Note that if you are
mounting from a Linux machine you need to have reserved ports enabled.)
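Assuming a typical GigE adapter pair (ent0/en0 are placeholders), the settings
above roughly correspond to:

```sh
chdev -l ent0 -a jumbo_frames=yes   # adapter-level jumbo frames
chdev -l en0 -a mtu=9000            # interface MTU (interface must be down)
no -p -o rfc1323=1 -o sack=1 -o tcp_ecn=1
no -p -o tcp_sendspace=65536 -o tcp_recvspace=65536
nfso -o nfs_use_reserved_ports=1    # for the Linux-client case
```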
Also, devices and device drivers have tunable parameters, such as command
queue depth, buffer sizes, etc.
In general, you should avoid allowing the system to carry out unnecessary work
or to do work redundantly. Re-paging is one of the killers of performance. If
you know that you have a random workload, enable the release-behind option
to avoid caching unneeded stuff in memory.
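JFS2 exposes release-behind as mount options (rbr for reads, rbw for writes,
rbrw for both); a hedged example with /work as a placeholder mount point:

```sh
mount -o rbrw /work   # free cache pages right after they are used
```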
We are using AIX 5.2, so I am going to list the values of our tunables in the
format produced by these utilities. Please contact me if something needs
specific clarification.
VM tuning
---------
# vmo -L
NAME CUR DEF BOOT MIN MAX UNIT TYPE
DEPENDENCIES
--------------------------------------------------------------------------------
memory_frames 16M 16M 4KB pages S
--------------------------------------------------------------------------------
pinnable_frames 14702K 14702K 4KB pages S
--------------------------------------------------------------------------------
maxfree 36607 128 36607 16 200K 4KB pages D
minfree
memory_frames
--------------------------------------------------------------------------------
minfree 3839 36599 3839 8 200K 4KB pages D
maxfree
memory_frames
--------------------------------------------------------------------------------
minperm% 2 20 2 1 100 % memory D
maxperm%
--------------------------------------------------------------------------------
minperm 313077 313077 S
--------------------------------------------------------------------------------
maxperm% 40 80 40 1 100 % memory D
minperm%
maxclient%
--------------------------------------------------------------------------------
maxperm 6114K 6114K S
--------------------------------------------------------------------------------
strict_maxperm 1 0 1 0 1 boolean D
--------------------------------------------------------------------------------
maxpin% 98 80 98 1 99 % memory D
pinnable_frames
memory_frames
--------------------------------------------------------------------------------
maxpin 16056K 16056K S
--------------------------------------------------------------------------------
maxclient% 39 80 39 1 100 % memory D
maxperm%
--------------------------------------------------------------------------------
lrubucket 128K 128K 128K 64K 4KB pages D
--------------------------------------------------------------------------------
defps 1 1 1 0 1 boolean D
--------------------------------------------------------------------------------
nokilluid 1 0 1 0 4G-1 uid D
--------------------------------------------------------------------------------
numpsblks 12M 12M 4KB pages S
--------------------------------------------------------------------------------
npskill 96K 96K 111104 1 12M-1 4KB pages D
--------------------------------------------------------------------------------
npswarn 384K 384K 434K 0 12M-1 4KB pages D
--------------------------------------------------------------------------------
v_pinshm 1 0 1 0 1 boolean D
--------------------------------------------------------------------------------
pta_balance_threshold n/a 50 50 0 99 % pta segment R
--------------------------------------------------------------------------------
pagecoloring n/a 0 0 0 1 boolean B
--------------------------------------------------------------------------------
framesets 2 2 2 1 10 B
--------------------------------------------------------------------------------
mempools 4 1 4 1 32 B
--------------------------------------------------------------------------------
lgpg_size 16M 0 16M 0 256M bytes B
lgpg_regions
--------------------------------------------------------------------------------
lgpg_regions 64 0 64 0 B
lgpg_size
--------------------------------------------------------------------------------
num_spec_dataseg 0 0 0 0 B
--------------------------------------------------------------------------------
spec_dataseg_int 512 512 512 0 B
--------------------------------------------------------------------------------
memory_affinity 1 1 1 0 1 boolean B
--------------------------------------------------------------------------------
htabscale -1 -1 -1 -4 0 B
--------------------------------------------------------------------------------
force_relalias_lite 0 0 0 0 1 boolean D
--------------------------------------------------------------------------------
relalias_percentage 0 0 0 0 32K-1 D
--------------------------------------------------------------------------------
data_stagger_interval 161 161 161 0 4K-1 4KB pages D
lgpg_size
--------------------------------------------------------------------------------
large_page_heap_size 0 0 0 0 8E-1 bytes B
lgpg_size
--------------------------------------------------------------------------------
kernel_heap_psize 4K 4K 4K 4K 16M bytes B
lgpg_size
--------------------------------------------------------------------------------
soft_min_lgpgs_vmpool 0 0 0 0 90 % D
lgpg_size
--------------------------------------------------------------------------------
vmm_fork_policy 1 0 1 0 1 boolean D
--------------------------------------------------------------------------------
low_ps_handling 1 1 1 1 2 D
--------------------------------------------------------------------------------
mbuf_heap_psize 4K 4K 4K 4K 16M bytes B
--------------------------------------------------------------------------------
strict_maxclient 1 1 1 0 1 D
--------------------------------------------------------------------------------
cpu_scale_memp 8 8 8 1 64 B
--------------------------------------------------------------------------------
I/O Tuning
==========
# ioo -L
NAME CUR DEF BOOT MIN MAX UNIT
TYPE
DEPENDENCIES
--------------------------------------------------------------------------------
minpgahead 2 2 2 0 4K 4KB pages D
maxpgahead
--------------------------------------------------------------------------------
maxpgahead 128 8 128 0 4K 4KB pages D
minpgahead
--------------------------------------------------------------------------------
pd_npages 64K 64K 64K 1 512K 4KB pages D
--------------------------------------------------------------------------------
maxrandwrt 128 0 128 0 512K 4KB pages D
--------------------------------------------------------------------------------
numclust 32 1 32 0 2G-1 16KB/cluster D
--------------------------------------------------------------------------------
numfsbufs 2K 196 2K 1 2G-1 M
--------------------------------------------------------------------------------
sync_release_ilock 1 0 1 0 1 boolean D
--------------------------------------------------------------------------------
lvm_bufcnt 64 9 64 1 64 128KB/buffer D
--------------------------------------------------------------------------------
j2_minPageReadAhead 2 2 2 0 64K 4KB pages D
--------------------------------------------------------------------------------
j2_maxPageReadAhead 128 128 128 0 64K 4KB pages D
--------------------------------------------------------------------------------
j2_nBufferPerPagerDevice 4K 512 4K 0 2G-1 M
--------------------------------------------------------------------------------
j2_nPagesPerWriteBehindCluster
128 32 128 0 64K D
--------------------------------------------------------------------------------
j2_maxRandomWrite 256 0 256 0 64K 4KB pages D
--------------------------------------------------------------------------------
j2_nRandomCluster 16 0 16 0 64K 16KB clusters D
--------------------------------------------------------------------------------
jfs_clread_enabled 0 0 0 0 1 boolean D
--------------------------------------------------------------------------------
jfs_use_read_lock 1 1 1 0 1 boolean D
--------------------------------------------------------------------------------
hd_pvs_opn 18 18 S
--------------------------------------------------------------------------------
hd_pbuf_cnt 7048 2816 5000 0 2G-1 I
--------------------------------------------------------------------------------
j2_inodeCacheSize 400 400 400 1 1000 D
--------------------------------------------------------------------------------
j2_metadataCacheSize 400 400 400 1 1000 D
--------------------------------------------------------------------------------
j2_dynamicBufferPreallocation
16 16 16 0 256 16k slabs M
--------------------------------------------------------------------------------
j2_maxUsableMaxTransfer 512 512 512 1 4K pages M
--------------------------------------------------------------------------------
NFS Tuning
==========
# nfso -L
NAME CUR DEF BOOT MIN MAX UNIT
TYPE
DEPENDENCIES
--------------------------------------------------------------------------------
portcheck 1 0 1 0 1 On/Off D
--------------------------------------------------------------------------------
udpchecksum 1 1 1 0 1 On/Off D
--------------------------------------------------------------------------------
nfs_socketsize 600000 600000 600000 40000 16M Bytes D
--------------------------------------------------------------------------------
nfs_tcp_socketsize 600000 600000 600000 40000 16M Bytes D
--------------------------------------------------------------------------------
nfs_setattr_error 1 0 1 0 1 On/Off D
--------------------------------------------------------------------------------
nfs_gather_threshold 4K 4K 4K 512 8K+1 Bytes D
--------------------------------------------------------------------------------
nfs_repeat_messages 1 0 1 0 1 On/Off D
--------------------------------------------------------------------------------
nfs_udp_duplicate_cache_size
64K 5000 64K 5000 100000 Req I
--------------------------------------------------------------------------------
nfs_tcp_duplicate_cache_size
64K 5000 64K 5000 100000 Req I
--------------------------------------------------------------------------------
nfs_server_base_priority 0 0 0 31 125 Pri D
--------------------------------------------------------------------------------
nfs_dynamic_retrans 1 1 1 0 1 On/Off D
--------------------------------------------------------------------------------
nfs_iopace_pages 0 0 0 0 64K-1 Pages D
--------------------------------------------------------------------------------
nfs_max_connections 0 0 0 0 10000 Number D
--------------------------------------------------------------------------------
nfs_max_threads 3891 3891 3891 5 3891 Threads D
--------------------------------------------------------------------------------
nfs_use_reserved_ports 1 0 1 0 1 On/Off D
--------------------------------------------------------------------------------
nfs_device_specific_bufs 1 1 1 0 1 On/Off D
--------------------------------------------------------------------------------
nfs_server_clread 1 1 1 0 1 On/Off D
--------------------------------------------------------------------------------
nfs_rfc1323 1 0 1 0 1 On/Off D
--------------------------------------------------------------------------------
nfs_max_write_size 32K 32K 32K 512 64K Bytes D
--------------------------------------------------------------------------------
nfs_max_read_size 32K 32K 32K 512 64K Bytes D
--------------------------------------------------------------------------------
nfs_allow_all_signals 0 0 0 0 1 On/Off D
--------------------------------------------------------------------------------
nfs_v2_pdts 4 1 4 1 8 PDTs M
--------------------------------------------------------------------------------
nfs_v3_pdts 4 1 4 1 8 PDTs M
--------------------------------------------------------------------------------
nfs_v2_vm_bufs 2000 1000 2000 512 5000 Bufs I
--------------------------------------------------------------------------------
nfs_v3_vm_bufs 2000 1000 2000 512 5000 Bufs I
--------------------------------------------------------------------------------
nfs_securenfs_authtimeout 0 0 0 0 60 Seconds D
--------------------------------------------------------------------------------
nfs_v3_server_readdirplus 1 1 1 0 1 On/Off D
--------------------------------------------------------------------------------
lockd_debug_level 0 0 0 0 10 Level D
--------------------------------------------------------------------------------
statd_debug_level 0 0 0 0 10 Level D
--------------------------------------------------------------------------------
statd_max_threads 50 50 128 1 1000 Threads D
--------------------------------------------------------------------------------
# no -L
General Network Parameters
-------------------------------------------------------------------------------
NAME CUR DEF BOOT MIN MAX UNIT
TYPE
DEPENDENCIES
-------------------------------------------------------------------------------
extendednetstats 0 0 0 0 1 boolean D
-------------------------------------------------------------------------------
fasttimo 200 200 200 50 200 millisecond D
-------------------------------------------------------------------------------
inet_stack_size 16 16 16 1 32K-1 kbyte R
-------------------------------------------------------------------------------
nbc_limit 27852K 27852K 27852K 0 8E-1 kbyte D
-------------------------------------------------------------------------------
nbc_max_cache 131072 131072 131072 1 8E-1 byte D
-------------------------------------------------------------------------------
nbc_min_cache 1 1 1 1 131072 byte D
-------------------------------------------------------------------------------
nbc_ofile_hashsz 12841 12841 12841 1 999999 segment D
-------------------------------------------------------------------------------
nbc_pseg 0 0 0 0 2G-1 segment D
-------------------------------------------------------------------------------
nbc_pseg_limit 32M 32M 32M 0 2G-1 kbyte D
-------------------------------------------------------------------------------
net_malloc_police 0 0 0 0 8E-1 numeric D
-------------------------------------------------------------------------------
sb_max 16M 1M 16M 1 8E-1 byte D
-------------------------------------------------------------------------------
send_file_duration 300 300 300 0 8E-1 second D
-------------------------------------------------------------------------------
sockthresh 85 85 85 0 100 %_of_thewall D
-------------------------------------------------------------------------------
sodebug 0 0 0 0 1 boolean C
-------------------------------------------------------------------------------
somaxconn 1024 1024 1024 0 32K-1 numeric C
-------------------------------------------------------------------------------
tcp_inpcb_hashtab_siz 24499 24499 24499 1 999999 numeric R
-------------------------------------------------------------------------------
thewall 32M 32M 32M 0 1M kbyte S
-------------------------------------------------------------------------------
udp_inpcb_hashtab_siz 24499 24499 24499 1 83000 numeric R
-------------------------------------------------------------------------------
use_isno 1 1 1 0 1 boolean D
-------------------------------------------------------------------------------
TCP Network Tunable Parameters
-------------------------------------------------------------------------------
NAME CUR DEF BOOT MIN MAX UNIT TYPE
DEPENDENCIES
-------------------------------------------------------------------------------
clean_partial_conns 0 0 0 0 1 boolean D
-------------------------------------------------------------------------------
delayack 0 0 0 0 3 boolean D
-------------------------------------------------------------------------------
delayackports {} {} {} 0 10 ports_list D
-------------------------------------------------------------------------------
rfc1323 1 0 1 0 1 boolean C
-------------------------------------------------------------------------------
rfc2414 1 0 1 0 1 boolean C
-------------------------------------------------------------------------------
rto_high 64 64 64 2 8E-1 roundtriptime R
-------------------------------------------------------------------------------
rto_length 13 13 13 1 64 roundtriptime R
-------------------------------------------------------------------------------
rto_limit 7 7 7 1 64 roundtriptime R
-------------------------------------------------------------------------------
rto_low 1 1 1 1 63 roundtriptime R
-------------------------------------------------------------------------------
sack 1 0 1 0 1 boolean C
-------------------------------------------------------------------------------
tcp_bad_port_limit 1024 0 1024 0 8E-1 numeric D
-------------------------------------------------------------------------------
tcp_ecn 1 0 1 0 1 boolean C
-------------------------------------------------------------------------------
tcp_ephemeral_high 65535 65535 65535 32769 64K-1 numeric D
-------------------------------------------------------------------------------
tcp_ephemeral_low 32768 32768 32768 1024 65534 numeric D
-------------------------------------------------------------------------------
tcp_finwait2 1200 1200 1200 0 64K-1 halfsecond D
-------------------------------------------------------------------------------
tcp_init_window 0 0 0 0 32K-1 byte C
-------------------------------------------------------------------------------
tcp_keepcnt 8 8 8 0 8E-1 numeric D
-------------------------------------------------------------------------------
tcp_keepidle 14400 14400 14400 1 8E-1 halfsecond C
-------------------------------------------------------------------------------
tcp_keepinit 150 150 150 1 8E-1 halfsecond D
-------------------------------------------------------------------------------
tcp_keepintvl 150 150 150 1 32K-1 halfsecond C
-------------------------------------------------------------------------------
tcp_limited_transmit 1 1 1 0 1 boolean D
-------------------------------------------------------------------------------
tcp_maxburst 0 0 0 0 32K-1 numeric D
-------------------------------------------------------------------------------
tcp_mssdflt 1460 1460 1460 1 64K-1 byte C
-------------------------------------------------------------------------------
tcp_nagle_limit 65535 65535 65535 0 64K-1 byte D
-------------------------------------------------------------------------------
tcp_ndebug 100 100 100 0 32K-1 numeric D
-------------------------------------------------------------------------------
tcp_newreno 1 1 1 0 1 boolean D
-------------------------------------------------------------------------------
tcp_nodelayack 0 0 0 0 1 boolean D
-------------------------------------------------------------------------------
tcp_recvspace 65536 16384 65536 4096 8E-1 byte C
-------------------------------------------------------------------------------
tcp_sendspace 65536 16384 65536 4096 8E-1 byte C
-------------------------------------------------------------------------------
tcp_timewait 1 1 1 1 5 15_second D
-------------------------------------------------------------------------------
tcp_ttl 60 60 60 1 255 0.6_second C
-------------------------------------------------------------------------------
UDP Network Tunable Parameters
-------------------------------------------------------------------------------
NAME CUR DEF BOOT MIN MAX UNIT TYPE
DEPENDENCIES
-------------------------------------------------------------------------------
udp_bad_port_limit 1024 0 1024 0 8E-1 numeric D
-------------------------------------------------------------------------------
udp_ephemeral_high 65535 65535 65535 32769 64K-1 numeric D
-------------------------------------------------------------------------------
udp_ephemeral_low 32768 32768 32768 1024 65534 numeric D
-------------------------------------------------------------------------------
udp_recvspace 42080 42080 42080 4096 8E-1 byte C
-------------------------------------------------------------------------------
udp_sendspace 16384 9216 16384 4096 8E-1 byte C
-------------------------------------------------------------------------------
udp_ttl 30 30 30 1 255 second C
-------------------------------------------------------------------------------
udpcksum 1 1 1 0 1 boolean D
-------------------------------------------------------------------------------
IP Network Tunable Parameters
-------------------------------------------------------------------------------
NAME CUR DEF BOOT MIN MAX UNIT TYPE
DEPENDENCIES
-------------------------------------------------------------------------------
directed_broadcast 0 0 0 0 1 boolean D
-------------------------------------------------------------------------------
ie5_old_multicast_mapping 0 0 0 0 1 boolean D
-------------------------------------------------------------------------------
ip6_defttl 64 64 64 1 255 numeric D
-------------------------------------------------------------------------------
ip6_prune 1 1 1 1 8E-1 second D
-------------------------------------------------------------------------------
ip6forwarding 0 0 0 0 1 boolean D
-------------------------------------------------------------------------------
ip6srcrouteforward 1 1 1 0 1 boolean D
-------------------------------------------------------------------------------
ipforwarding 0 0 0 0 1 boolean D
-------------------------------------------------------------------------------
ipfragttl 60 60 60 1 255 halfsecond D
-------------------------------------------------------------------------------
ipignoreredirects 0 0 0 0 1 boolean D
-------------------------------------------------------------------------------
ipqmaxlen 1024 100 1024 100 8E-1 numeric R
-------------------------------------------------------------------------------
ipsendredirects 1 1 1 0 1 boolean D
-------------------------------------------------------------------------------
ipsrcrouteforward 1 1 1 0 1 boolean D
-------------------------------------------------------------------------------
ipsrcrouterecv 0 0 0 0 1 boolean D
-------------------------------------------------------------------------------
ipsrcroutesend 1 1 1 0 1 boolean D
-------------------------------------------------------------------------------
lo_perf 1 1 1 0 1 boolean R
-------------------------------------------------------------------------------
maxnip6q 20 20 20 1 32K-1 numeric D
-------------------------------------------------------------------------------
multi_homed 1 1 1 0 3 boolean D
-------------------------------------------------------------------------------
nonlocsrcroute 0 0 0 0 1 boolean D
-------------------------------------------------------------------------------
subnetsarelocal 1 1 1 0 1 boolean D
-------------------------------------------------------------------------------
ARP/NDP Network Tunable Parameters
-------------------------------------------------------------------------------
NAME CUR DEF BOOT MIN MAX UNIT TYPE
DEPENDENCIES
-------------------------------------------------------------------------------
arpqsize 64 12 64 1 32K-1 numeric D
-------------------------------------------------------------------------------
arpt_killc 20 20 20 0 32K-1 minute D
-------------------------------------------------------------------------------
arptab_bsiz 7 7 7 1 32K-1 bucket_size R
-------------------------------------------------------------------------------
arptab_nb 73 73 73 1 32K-1 buckets R
-------------------------------------------------------------------------------
dgd_packets_lost 10 3 10 1 32K-1 numeric D
-------------------------------------------------------------------------------
dgd_ping_time 5 5 5 1 8E-1 second D
-------------------------------------------------------------------------------
dgd_retry_time 5 5 5 1 32K-1 numeric D
-------------------------------------------------------------------------------
ndp_mmaxtries 3 3 3 0 8E-1 numeric D
-------------------------------------------------------------------------------
ndp_umaxtries 3 3 3 0 8E-1 numeric D
-------------------------------------------------------------------------------
ndpqsize 50 50 50 1 32K-1 numeric D
-------------------------------------------------------------------------------
ndpt_down 3 3 3 1 8E-1 halfsecond D
-------------------------------------------------------------------------------
ndpt_keep 120 120 120 1 8E-1 halfsecond D
-------------------------------------------------------------------------------
ndpt_probe 5 5 5 1 8E-1 halfsecond D
-------------------------------------------------------------------------------
ndpt_reachable 30 30 30 1 8E-1 halfsecond D
-------------------------------------------------------------------------------
ndpt_retrans 1 1 1 1 8E-1 halfsecond D
-------------------------------------------------------------------------------
passive_dgd 0 0 0 0 1 boolean D
-------------------------------------------------------------------------------
rfc1122addrchk 0 0 0 0 1 boolean D
-------------------------------------------------------------------------------
Stream Header Tunable Parameters
-------------------------------------------------------------------------------
NAME CUR DEF BOOT MIN MAX UNIT TYPE
DEPENDENCIES
-------------------------------------------------------------------------------
lowthresh 90 90 90 0 100 %_of_thewall D
-------------------------------------------------------------------------------
medthresh 95 95 95 0 100 %_of_thewall D
-------------------------------------------------------------------------------
nstrpush 8 8 8 8 32K-1 numeric R
-------------------------------------------------------------------------------
psebufcalls 20 20 20 20 8E-1 numeric I
-------------------------------------------------------------------------------
psecache 1 1 1 0 1 boolean D
-------------------------------------------------------------------------------
pseintrstack 24576 24576 24576 12288 8E-1 byte R
-------------------------------------------------------------------------------
psetimers 20 20 20 20 8E-1 numeric I
-------------------------------------------------------------------------------
strctlsz 1024 1024 1024 0 32K-1 byte D
-------------------------------------------------------------------------------
strmsgsz 0 0 0 0 32K-1 byte D
-------------------------------------------------------------------------------
strthresh 85 85 85 0 100 %_of_thewall D
-------------------------------------------------------------------------------
strturncnt 15 15 15 1 8E-1 numeric D
-------------------------------------------------------------------------------
Other Network Tunable Parameters
-------------------------------------------------------------------------------
NAME CUR DEF BOOT MIN MAX UNIT TYPE
DEPENDENCIES
-------------------------------------------------------------------------------
bcastping 1 0 1 0 1 boolean D
-------------------------------------------------------------------------------
icmp6_errmsg_rate 10 10 10 1 255 msg/second D
-------------------------------------------------------------------------------
icmpaddressmask 0 0 0 0 1 boolean D
-------------------------------------------------------------------------------
ifsize 256 256 256 8 1024 numeric R
-------------------------------------------------------------------------------
llsleep_timeout 3 3 3 1 8E-1 second D
-------------------------------------------------------------------------------
main_if6 0 0 0 0 1 boolean D
-------------------------------------------------------------------------------
main_site6 0 0 0 0 1 boolean D
-------------------------------------------------------------------------------
maxttl 255 255 255 1 255 second D
-------------------------------------------------------------------------------
pmtu_default_age 10 10 10 0 32K-1 minute D
-------------------------------------------------------------------------------
pmtu_rediscover_interval 30 30 30 0 32K-1 minute D
-------------------------------------------------------------------------------
route_expire 1 1 1 0 1 boolean D
-------------------------------------------------------------------------------
routerevalidate 1 0 1 0 1 boolean D
-------------------------------------------------------------------------------
site6_index 0 0 0 0 32K-1 numeric D
-------------------------------------------------------------------------------
tcp_pmtu_discover 1 1 1 0 1 boolean D
-------------------------------------------------------------------------------
udp_pmtu_discover 1 1 1 0 1 boolean D
-------------------------------------------------------------------------------
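The flattened `no -L` listing above is easy to post-process if you want to see
at a glance which tunables have been changed from their defaults. A minimal
sketch (Python; assumes the NAME/CUR/DEF column order shown above, and the
sample text is trimmed from the real listing):

```python
# Parse flattened "no -L"-style output and report tunables whose
# current (CUR) value differs from the default (DEF).
SAMPLE = """\
-------------------------------------------------------------------------------
sack                      1       0       1       0       1       boolean  C
-------------------------------------------------------------------------------
tcp_recvspace             65536   16384   65536   4096    8E-1    byte     C
-------------------------------------------------------------------------------
tcp_newreno               1       1       1       0       1       boolean  D
-------------------------------------------------------------------------------
"""

def changed_tunables(text):
    """Return (name, cur, default) tuples where CUR != DEF."""
    out = []
    for line in text.splitlines():
        if line.startswith("---") or not line.strip():
            continue                      # separator or blank line
        fields = line.split()
        if len(fields) < 3 or fields[0] in ("NAME", "DEPENDENCIES"):
            continue                      # header or continuation line
        name, cur, dflt = fields[0], fields[1], fields[2]
        if cur != dflt:
            out.append((name, cur, dflt))
    return out

for name, cur, dflt in changed_tunables(SAMPLE):
    print(f"{name}: CUR={cur} DEF={dflt}")
```

On the sample above this prints sack (1 vs 0) and tcp_recvspace (65536 vs
16384); run it against your own `no -L` output to get the full delta.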
On Tue, 14 Sep 2004, Jan-Frode Myklebust wrote:
j| Date: 14 Sep 2004 07:15:47 GMT
j| From: Jan-Frode Myklebust <janf...@parallab.uib.no>
j| Newsgroups: comp.unix.aix
j| Subject: Re: AIX 5.2 ls command hanging on JFS2 FS
j|
j| On 2004-09-13, Michael E. Thomadakis <mi...@hellas.tamu.edu> wrote:
j| >
j| > We run the 64-bit kernel and we are also an HPC shop. We have spent
j| > quite some time to tune up the VM and IO and other parameters.
j|
j| Could you please tell me your settings, and amount of memory ?
j| And maybe rationale behind the settings ?
j|
j| We have 3 nodes. Two with 64 GB, and one with 192 GB memory. All
j| running with:
j|
j| vmtune -R 64 -f 3840 -F 5888 -p 5 -P 10 -t 10 -y 1 -h 0
j|
j| When a lot of memory is used for page buffers, and suddenly a large
j| memory application wants to grab it, we get into a long period
j| of page scan activity. It seems the routine for freeing up memory
j| is highly inefficient. Have you seen this? Found any magic vmtune
j| to fix it? It's not a big problem, since we're mostly running GPFS,
j| but Gaussian running on a local JFS filesystem seems quite good at
j| triggering it.
j|
j| Also, one of the nodes has a local JFS filesystem we're running Tivoli
j| Space Manager (HSM) on, and exporting it to the two other nodes via NFS.
j| When writing large files to this filesystem over NFS, we get very high
j| system load on the NFS server. Most CPUs end up running in system mode,
j| killing the performance of user applications. I think this is again
j| related to the page scans, but any hints on NFS tuning would be greatly
j| appreciated.
j|
j| > I only hope that the JFS2 patch will hold.
j|
j| Good luck!
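For anyone reading the quoted vmtune line: -f (minfree) and -F (maxfree) are
counted in 4 KB page frames, so it's worth converting them to see how much
free memory those settings actually keep on hand. A quick sketch (the
flag-to-parameter mapping is standard AIX vmtune; the helper name is ours):

```python
PAGE_SIZE = 4096  # AIX default page frame size in bytes

def pages_to_mb(pages):
    """Convert a page-frame count to megabytes."""
    return pages * PAGE_SIZE / (1024 * 1024)

# vmtune -f 3840 -F 5888  ->  minfree / maxfree, in 4 KB pages
minfree_mb = pages_to_mb(3840)
maxfree_mb = pages_to_mb(5888)
print(f"minfree = {minfree_mb:.0f} MB, maxfree = {maxfree_mb:.0f} MB")
# -> minfree = 15 MB, maxfree = 23 MB
```

So the kernel starts stealing pages when the free list drops below 15 MB and
stops at 23 MB, which is quite small relative to 64-192 GB of RAM and may
explain the long page-scan bursts when a large allocation arrives.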