
db2hmon failing with "Too many open files"

pike

Mar 21, 2007, 8:08:29 AM

DB2 8.1 FP11 on AIX 5.3.0.0.

The db2diag.log is intermittently reporting EMFILE (24) "Too many open
files" errors. The culprit is always db2hmon. Sample db2diag.log
output follows:

2007-03-20-07.42.35.269106+060 I14996239C505 LEVEL: Severe (OS)
PID : 2289758 TID : 772 PROC : db2hmon 0
INSTANCE: defser_t NODE : 000
FUNCTION: DB2 UDB, SQO Memory Management, sqlocshr2, probe:200
CALLED : OS, -, shmat
OSERR : EMFILE (24) "Too many open files"
DATA #1 : Memory set handle, PD_TYPE_OSS_MEM_SET_HDL, 20 bytes
0x303B3BD0 : FFFF FFFF FFFF FFFF 0000 0000 2AB0 01A1 ............*...
0x303B3BE0 : 0004 0000 ....

2007-03-20-07.42.36.050018+060 I14996745C396 LEVEL: Error
PID : 2289758 TID : 772 PROC : db2hmon 0
INSTANCE: defser_t NODE : 000
FUNCTION: DB2 UDB, Health Monitor, HmonMainCB::refreshDbAutonomicSwitches, probe:160
MESSAGE : Failed connecting to database "CLASS_T "
DATA #1 : Hexdump, 4 bytes
0x303B75AC : FFFF FB38 ...8
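
To check that db2hmon really is the only process hitting EMFILE, the log entries can be tallied by their PROC field. This is only a sketch: the function name `count_emfile_by_proc` is made up for illustration, and it assumes that entries are separated by blank lines (as in the sample above) and that the PROC name follows the `PROC :` token.

```shell
# Tally db2diag.log entries that mention EMFILE, grouped by the PROC field.
# Assumption: entries are separated by blank lines, as in the sample above.
count_emfile_by_proc() {
    awk '
        BEGIN { RS = "" }    # paragraph mode: one record per blank-line-separated entry
        index($0, "EMFILE") {
            # Walk the whitespace-separated tokens looking for "PROC : <name>".
            for (i = 1; i + 2 <= NF; i++) {
                if ($i == "PROC" && $(i + 1) == ":") {
                    count[$(i + 2)]++
                    break
                }
            }
        }
        END { for (p in count) print count[p], p }
    ' "$1"
}

# Usage (the path is an assumption; use your own diagnostic directory):
# count_emfile_by_proc "$HOME/sqllib/db2dump/db2diag.log"
```

Each output line is a count followed by a process name, so a single line naming db2hmon would confirm that no other process is affected.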


Unless I'm mistaken, db2hmon doesn't use Java and runs locally
(i.e. it is not a client application), so neither EXTSHM nor TCP/IP
loopback connections look like a possible solution to me.
Short of switching off the health monitor, I'm at a loss to explain
these errors.

Any assistance appreciated!

Thank you.

Jeroen van den Broek

Mar 21, 2007, 9:31:00 AM

This is an OS-level error, so you probably need to see your sysadmin.
The failing call is 'shmat', so maybe a 'man shmat' will give more
info on this specific error.
My guess is that you should raise the value of SHMSEG; at least that's
what I would try on Linux. I'm not sure about AIX, though.

HTH.

--
Jeroen

Liam Finnie

Mar 21, 2007, 9:51:48 AM

Hello,

TCP/IP loopbacks won't help since db2hmon is an internal server-side
tool and doesn't perform a local connection, but EXTSHM could help.
EXTSHM is used to work around the 11-13 segment restriction on AIX for
any 32-bit process (64-bit processes don't have this
restriction). It's commonly required for Java clients, but it can be
useful in other cases where shmat returns EMFILE as well.

Cheers,
Liam.

Liam Finnie

Mar 21, 2007, 10:02:39 AM

Just to add to my last response... since this is a server-side tool,
you'll have to do the following to enable EXTSHM for the db2hmon
process:
db2stop
export EXTSHM=ON
db2set DB2ENVLIST=EXTSHM
db2start

If this is a partitioned instance (DPF), you'll need to add the
following to your sqllib/db2profile before you issue the db2start
command:
EXTSHM=ON
export EXTSHM
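
Before issuing db2start, it may be worth confirming that the variable is really exported in the shell that will start the instance. A minimal sketch, assuming a POSIX shell; the function name `extshm_exported` is made up for illustration, and the `db2set DB2ENVLIST` call is only run if the DB2 tools happen to be on the PATH:

```shell
# Returns 0 when EXTSHM=ON is exported (not merely set) in the current shell.
extshm_exported() {
    [ "${EXTSHM:-}" = "ON" ] && env | grep -q '^EXTSHM=ON$'
}

if extshm_exported; then
    echo "EXTSHM=ON is exported in this shell"
else
    echo "EXTSHM is not exported as ON in this shell"
fi

# If the DB2 command line tools are available, display the registry value too.
if command -v db2set >/dev/null 2>&1; then
    db2set DB2ENVLIST
fi
```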

Cheers,
Liam.

Ian

Mar 21, 2007, 5:08:00 PM
Liam Finnie wrote:
>
> If this is a partitioned instance (DPF), you'll need to add the
> following to your sqllib/db2profile before you issue the db2start
> command:
> EXTSHM=ON
> export EXTSHM

Not to claim that I know more than Liam, but best practice would be
to add this to sqllib/userprofile (which is intended for user
modification) -- not the db2profile.
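
As a sketch of that approach, the two lines could be appended to the userprofile only when they are not already present. The helper name `add_extshm_to_profile` is made up for illustration, and the default path `$HOME/sqllib/userprofile` assumes you are logged in as the instance owner:

```shell
# Append the EXTSHM export to a profile file unless it is already there.
# $1 is the profile path; the default below is an assumption.
add_extshm_to_profile() {
    profile=${1:-"$HOME/sqllib/userprofile"}
    if ! grep -q '^EXTSHM=ON$' "$profile" 2>/dev/null; then
        printf 'EXTSHM=ON\nexport EXTSHM\n' >> "$profile"
    fi
}

# Usage:
# add_extshm_to_profile                 # uses the assumed default path
# add_extshm_to_profile /some/other/profile
```

Running it twice leaves a single copy of the two lines, so it is safe to repeat.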

The Boss

Mar 21, 2007, 7:30:11 PM

The following IBM Technote describes a similar problem for 32-bit Informix on
AIX:
http://www-1.ibm.com/support/docview.wss?rs=631&context=SSGHZP&dc=DB520&uid=swg21157040&loc=en_US&cs=UTF-8&lang=en&rss=ct631db2

The given example of the error in the message log clearly points to SHMSEG
(like I suggested in my previous post):

23:00:11 shmat: [EMFILE][24]: out of shared memory segments, check system SHMSEG

However, the solution as given in the Technote is exactly the one you
described above.
Do you perhaps know the relationship between SHMSEG and EXTSHM?
And is this perhaps specific to AIX? I don't recall seeing this on Linux.

--
Jeroen


Liam Finnie

Mar 22, 2007, 9:07:52 AM
> AIX:http://www-1.ibm.com/support/docview.wss?rs=631&context=SSGHZP&dc=DB5...

>
> The given example of the error in the message log clearly points to SHMSEG
> (like I suggested in my previous post):
>
> 23:00:11 shmat: [EMFILE][24]: out of shared memory segments, check system
> SHMSEG
>
> However, the solution as given in the Technote is exactly the one you
> described above.
> Do you perhaps know the relationship between SHMSEG and EXTSHM?
> And is this perhaps specific for AIX? I don't recall seeing this on Linux.
>
> --
> Jeroen

Hi Jeroen,

As far as I know, AIX doesn't have a SHMSEG kernel tuneable (maybe
that was pre-4.3.1?). Maybe it's a reference to the maximum allowed
size of a single shared memory segment, but in that case, the error
would be coming from shmget, not shmat. Or it could just be a bad
reference in the error message.

This particular issue is only applicable on AIX for 32-bit processes
(moving to a 64-bit instance would make this issue go away too).
Other platforms have different restrictions - for instance, Linux has
a limit (shmall) on how much shared memory can be allocated by all
processes, among other limits (shmmni, shmmax, etc).
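
For comparison, on Linux those limits can be read straight from /proc. A small sketch (the function name `show_linux_shm_limits` is made up for illustration; on a system without these files it just reports them as unavailable):

```shell
# Print the System V shared memory limits that Linux exposes under /proc.
# shmall is counted in pages, shmmax in bytes, shmmni in segments.
show_linux_shm_limits() {
    for name in shmall shmmax shmmni; do
        file=/proc/sys/kernel/$name
        if [ -r "$file" ]; then
            printf '%s = %s\n' "$name" "$(cat "$file")"
        else
            printf '%s = (not available on this system)\n' "$name"
        fi
    done
}

show_linux_shm_limits
```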

As for the 11-13 segment limit on AIX: by default, each shared memory segment
is attached at a 256MB-aligned address, so 32-bit processes can
quickly run out of suitable addresses for new shared memory segments
(this is what causes the EMFILE errors from shmat). If you enable
EXTSHM, then multiple shared memory segments can be packed into each
256MB address range (the reason why EXTSHM is not the default is that
it does have a negative impact on performance). 64-bit processes on
AIX have things even easier - pretty much all shared memory
restrictions are removed.
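
The arithmetic behind that limit can be sketched as follows: a 32-bit address space holds only 16 slots of 256MB each, and the 11-13 figure above is the part of those 16 that is usable for shared memory (the exact split is not spelled out in this thread). The function name `segment_slots` is made up for illustration, and the calculation assumes a shell with 64-bit arithmetic:

```shell
# Number of 256MB-aligned slots in an address space of the given width in bits.
segment_slots() {
    bits=$1
    echo $(( (1 << $bits) / (256 * 1024 * 1024) ))
}

segment_slots 32    # prints 16
```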

Hope this clears things up.

Cheers,
Liam.

Jeroen van den Broek

Mar 26, 2007, 6:28:28 AM

Thanks Liam, most insightful.
This will probably become useful shortly, when I have to do some
DB2 sysadmin work on AIX (which I have mostly been doing on
zLinux until now).

--
Jeroen

