
Getting info on hung process


itclguy

Jan 21, 2010, 3:52:43 PM
I have a situation where a process hangs intermittently on AIX 5.3.
There are 13 of these processes running concurrently and 1 or more
hang at random times throughout the day. Does anyone know of a way to
gather information about this process such as a memory dump -
basically anything that may provide information about what the process
is waiting on?

The suspicion is the process is waiting on a row lock in an Oracle 10g
server, but so far we have been unable to determine this. I did not
see anything in the `kill` man page that may force the process to
dump. The code is compiled with gcc but no debug. Any pointers would
be greatly appreciated.

Henry

Jan 21, 2010, 4:34:03 PM

you could have a look at traffic with tcpdump, do some profiling with
tprof or look at procmon

itclguy

Jan 21, 2010, 5:03:59 PM
On Jan 21, 3:34 pm, Henry <snogfest_hosebe...@yahoo.com> wrote:
>
> you could have a look at traffic with tcpdump, do some profiling with
> tprof or look at procmon

Cool - I'll look into those - thanks!

Joachim Gann

Jan 22, 2010, 12:58:57 AM
On 21.01.2010 21:52, itclguy wrote:
> I have a situation where a process hangs intermittently on AIX 5.3.
> There are 13 of these processes running concurrently and 1 or more
> hang at random times throughout the day. Does anyone know of a way to
> gather information about this process such as a memory dump -
> basically anything that may provide information about what the process
> is waiting on?

If you are familiar with system calls, "truss -fp <pid>" will show you
what your process is doing; the -f flag also follows any child
processes it forks.

> The suspicion is the process is waiting on a row lock in an Oracle 10g
> server, but so far we have been unable to determine this. I did not
> see anything in the `kill` man page that may force the process to
> dump. The code is compiled with gcc but no debug. Any pointers would
> be greatly appreciated.

The default action for SIGQUIT and SIGIOT (among others) is to terminate
the process and write a core dump. If your process neither catches nor
ignores these signals, either of these commands should create the dump:
kill -QUIT <pid>
kill -IOT <pid>

You can analyze the core dump with dbx or gdb, or hand it to the developer.

Joachim

Heinrich Mislik

Jan 22, 2010, 6:54:51 AM
In article <141bb062-c6f2-4000...@22g2000yqr.googlegroups.com>, chadsm...@gmail.com says...

>The suspicion is the process is waiting on a row lock in an Oracle 10g
>server, but so far we have been unable to determine this.

When locking in Oracle is involved, you should look at the dynamic
performance views v$locked_object, v$lock and v$session. These
let you track down processes holding locks and waiting for locks.

Cheers

Heinrich

--
Heinrich Mislik
Zentraler Informatikdienst der Universitaet Wien
A-1010 Wien, Universitaetsstrasse 7
Tel.: (+43 1) 4277-14056, Fax: (+43 1) 4277-9140

itclguy

Jan 22, 2010, 10:43:20 AM

Joachim - thank you for the tips! I am not familiar with truss but I
will definitely look at this. Also, getting a core file from the
process would be perfect. The program does not handle SIGQUIT and
SIGIOT that I am aware of, so using `kill` to generate the core would
be great. However since the program was not compiled with -g, do you
know if the developer will be able to get any helpful info from it?

itclguy

Jan 22, 2010, 10:45:07 AM
On Jan 22, 5:54 am, Heinrich.Mis...@univie.ac.at (Heinrich Mislik)
wrote:
> In article <141bb062-c6f2-4000-a016-d670ef2be...@22g2000yqr.googlegroups.com>, chadsmith...@gmail.com says...

>
> >The suspicion is the process is waiting on a row lock in an Oracle 10g
> >server, but so far we have been unable to determine this.
>
> When locking in Oracle is involved, you should look at the dynamic
> performance views v$locked_object, v$lock and v$session. These
> let you track down processes holding locks and waiting for locks.
>
> Cheers
>
> Heinrich
>

Heinrich - thank you! I will discuss this with our DBA to get further
information as I am not familiar with dynamic performance views. This
is very helpful information. Thanks for your reply!

Thomas Braunbeck

Jan 22, 2010, 6:16:03 PM
On 21.01.2010 21:52, itclguy wrote:
>
> The suspicion is the process is waiting on a row lock in an Oracle 10g
> server, but so far we have been unable to determine this. I did not
> see anything in the `kill` man page that may force the process to
> dump. The code is compiled with gcc but no debug. Any pointers would
> be greatly appreciated.

Have a look at the proc tools:
    ls -l /usr/bin/proc*
For example, procstack <pid> will get you the stack for process <pid>.

    dbx -a <pid>
will attach dbx to the process <pid>. The dbx command detach will
detach again (if you quit dbx without detaching, the process is killed).

Someone pointed out that you can send a signal whose default handler
creates a core, but that does not help if the process catches the
signal :-( Better to use the gencore command:
    gencore <pid> <corefile>

You also got a pointer to truss. As suggested, truss will show the
system calls the program makes. Add the -u *::* option to see
library calls as well.

The AIX documentation is at
http://publib.boulder.ibm.com/infocenter/pseries/v5r3/index.jsp
There you'll find a book titled
General Programming Concepts: Writing and Debugging Programs
Take a look at this as well.

If your process is not stuck in user space but somewhere in the
kernel, then you need to use kdb. See also
http://www-01.ibm.com/support/docview.wss?uid=isg3T1011956
It has a link to a tool named pdump.sh. This uses the commands
mentioned above (proc*, dbx, kdb) to dump all the information
into one file that you can then look at.
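
The procstack and gencore steps above could be combined roughly like this. It is only a sketch: the function name and the output file names are made up, and it is not the pdump.sh tool from the link. Tools that are not installed are skipped.

```shell
# snapshot_hung_process <pid> <outdir>: collect basic diagnostics for a
# running process without stopping it. Hypothetical helper name.
snapshot_hung_process() {
    pid=$1
    outdir=$2
    # kill -0 sends no signal; it only checks that the process exists.
    if ! kill -0 "$pid" 2>/dev/null; then
        echo "snapshot_hung_process: no such process: $pid" >&2
        return 1
    fi
    mkdir -p "$outdir" || return 1
    # Process state and elapsed run time.
    ps -o pid,ppid,etime,args -p "$pid" > "$outdir/ps.txt" 2>&1
    # Stack of each thread, if the AIX proc tools are available.
    if command -v procstack >/dev/null 2>&1; then
        procstack "$pid" > "$outdir/procstack.txt" 2>&1
    fi
    # Core file of the live process; the process keeps running.
    if command -v gencore >/dev/null 2>&1; then
        gencore "$pid" "$outdir/core.$pid" > "$outdir/gencore.txt" 2>&1
    fi
    return 0
}
```

Running it as "snapshot_hung_process <pid> /tmp/hang.<pid>" once for each hung process gives you one directory per incident to compare later.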

itclguy

Jan 23, 2010, 11:49:05 AM
On Jan 22, 5:16 pm, Thomas Braunbeck <Thomas.Braunb...@t-online.de>
wrote:

> On 21.01.2010 21:52, itclguy wrote:
>
> > The suspicion is the process is waiting on a row lock in an Oracle 10g
> > server, but so far we have been unable to determine this. I did not
> > see anything in the `kill` man page that may force the process to
> > dump. The code is compiled with gcc but no debug. Any pointers would
> > be greatly appreciated.
>
> ls -l /usr/bin/proc*
> procstack <pid> will get you the stack for process <pid>, etc.
> dbx -a <pid>
> will attach dbx to the process <pid>. The dbx command detach
> will detach (if you quit dbx without detach the process is killed).
> Someone pointed out that you can send a signal whose default handler
> creates a core, but that does not help if the process catches the
> signal :-(
> Better use the gencore command:
> gencore <pid> <corefile>
> Then you got a pointer to truss. As suggested, truss will show the
> system calls the program calls. Add the -u *::* options to see
> library calls as well.
> The AIX documentation is at
> http://publib.boulder.ibm.com/infocenter/pseries/v5r3/index.jsp
> There you'll find a book titled
> General Programming Concepts: Writing and Debugging Programs
> Take a look at this as well. If your process is not stuck in
> user space but somewhere in the kernel, then you need to use
> kdb. And see
> http://www-01.ibm.com/support/docview.wss?uid=isg3T1011956
> It has a link to a tool named pdump.sh. This uses the above
> mentioned commands (proc*, dbx, kdb) to dump all information
> into one file you then can look at.

Thomas, this is excellent information. Thank you very much!
