
Process generating its own stack trace

Wind

May 15, 2008, 8:18:40 AM
I have a multi-threaded application on RHEL4 (Red Hat Enterprise
Linux 4). My application is supposed to generate its own stack trace
when requested. To do this, one of the threads in the application
calls fork(), and the child process calls execvp() to invoke pstack
<processid>. I also use pipe() to capture the child process's
stdout.
My application process becomes the parent of the newly created pstack
process. The pstack version used on RHEL4 is 1.2.6. This version
internally launches gdb in silent mode.


My application hangs when the pstack child process is started. When
I try to attach GDB to the hung process, it gives the error
"ptrace: operation not permitted".


The /proc/<pid>/task/<taskid>/status for all threads is as below:
Name: myapplcation
State: T (tracing stop)
SleepAVG: 88%
Tgid: 32519
Pid: 32519
PPid: 1
TracerPid: 32723
.........etc


It works with a sample multithreaded application like the one below:
pthread_t thread1, thread2;
char *message1 = "Thread 1";
char *message2 = "Thread 2";
int iret1, iret2;
iret1 = pthread_create(&thread1, NULL,
                       print_message_function, (void *) message1);
iret2 = pthread_create(&thread2, NULL, LaunchPstack, (void *) message2);

pthread_join( thread1, NULL);
pthread_join( thread2, NULL);
printf("Thread 1 returns: %d\n",iret1);
printf("Thread 2 returns: %d\n",iret2);
exit(0);


My real application has a wrapper over the pthread APIs; there is no
other difference.

Can anybody shed some light on this? Am I starting the "pstack" child
process correctly?
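
For readers following along, the arrangement described in this post (fork(), execvp() in the child, and a pipe() carrying the child's stdout back to the parent) can be sketched roughly as below. This is a minimal illustration and not the poster's actual code; the helper name capture_stdout and its buffer-based interface are assumptions made for the example.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not from the thread): run cmd[0] with arguments cmd
 * and copy up to buflen-1 bytes of its stdout into buf through a pipe.
 * Returns the number of bytes captured, or -1 on error. */
ssize_t capture_stdout(char *const cmd[], char *buf, size_t buflen)
{
    int fds[2];
    pid_t pid;
    size_t used = 0;
    ssize_t n;

    if (buflen == 0 || pipe(fds) != 0)
        return -1;

    pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {                     /* child */
        close(fds[0]);
        if (dup2(fds[1], STDOUT_FILENO) == -1)
            _exit(127);
        if (fds[1] != STDOUT_FILENO)
            close(fds[1]);
        execvp(cmd[0], cmd);
        _exit(127);                     /* reached only if execvp fails */
    }

    close(fds[1]);                      /* parent keeps only the read end */
    while (used < buflen - 1) {
        n = read(fds[0], buf + used, buflen - 1 - used);
        if (n > 0)
            used += (size_t)n;
        else if (n == 0)
            break;                      /* EOF: child closed its stdout */
        else if (errno != EINTR)
            break;                      /* real read error */
    }
    buf[used] = '\0';
    close(fds[0]);

    while (waitpid(pid, NULL, 0) == -1 && errno == EINTR)
        ;                               /* reap the child */
    return (ssize_t)used;
}
```

Note that the parent in this sketch must keep running in order to drain the pipe; that matters whenever the child is a tool that stops the parent.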

David Schwartz

May 15, 2008, 7:36:13 PM
On May 15, 5:18 am, Wind <umesh_wagh...@yahoo.com> wrote:

> 4). My application is supposed to generate its own stack trace when
> requested. To do this, one of the threads in the application calls
> fork(), and the child process calls execvp() to invoke pstack
> <processid>. I also use pipe() to capture the child process's
> stdout.

You can't call 'fork', use __libc_fork.

DS

Wind

May 16, 2008, 9:07:52 AM

Hi David,

Thanks for the reply.

I tried what you suggested, but I am still facing the same issue.
Below is the code snippet:

extern "C" int __libc_fork();

// childPid, stat_loc, path and argv are declared elsewhere.
int startpstack()
{
    // fork child process
    if ((childPid = __libc_fork()) < 0) {
        return -1;                      // error
    }
    else if (childPid == 0) {           // child
        execvp(path, (char **)argv);
        _exit(-1);                      // reached only if execvp fails
    }
    else {                              // parent
        waitpid(childPid, &stat_loc, 0);
    }
    return 0;
}


David Schwartz

May 17, 2008, 4:01:49 AM
On May 16, 6:07 am, Wind <umesh_wagh...@yahoo.com> wrote:

> Thanks for the reply.
>
> I tried what you suggested, but I am still facing the same issue.
> Below is the code snippet:

Sorry, that solved a very similar problem for me. I'm not sure what
else could be giving you a problem.

DS

Paul Pluzhnikov

May 17, 2008, 12:48:12 PM
Wind <umesh_...@yahoo.com> writes:

> My application hangs when the pstack child process is started. When
> I try to attach GDB to the hung process, it gives the error
> "ptrace: operation not permitted".
>
>
> The /proc/<pid>/task/<taskid>/status for all threads is as below:
> Name: myapplcation
> State: T (tracing stop)

This indicates that gdb successfully attached to your process.

It is likely that this gdb process is now waiting for something,
though I can't imagine what for. Possibly it tries to print something
to tty (possibly some warning), and gets itself SIGSTOPped?

Can you attach a new instance of gdb to the apparently-hung
pstack/gdb process?

What is the State in /proc/<pstack-pid>/status ?

Cheers,
--
In order to understand recursion you must first understand recursion.
Remove /-nsp/ for email.
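
The State and TracerPid checks discussed above can also be done programmatically. The following is a small sketch, assuming a Linux /proc filesystem with the "Field:<tab>value" layout shown in this thread; the helper name read_status_field is made up for illustration.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not from the thread): copy the value of one field
 * of /proc/<pid>/status (for example "State" or "TracerPid") into out.
 * Returns 0 if the field was found, -1 otherwise. */
int read_status_field(pid_t pid, const char *field, char *out, size_t outlen)
{
    char path[64];
    char line[256];
    size_t flen = strlen(field);
    FILE *fp;
    int rc = -1;

    if (outlen == 0)
        return -1;

    snprintf(path, sizeof(path), "/proc/%d/status", (int)pid);
    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;

    while (fgets(line, sizeof(line), fp) != NULL) {
        /* Match "<field>:" at the start of the line. */
        if (strncmp(line, field, flen) == 0 && line[flen] == ':') {
            const char *val = line + flen + 1;
            while (*val == ' ' || *val == '\t')
                val++;                       /* skip separator whitespace */
            snprintf(out, outlen, "%s", val);
            out[strcspn(out, "\n")] = '\0';  /* drop trailing newline */
            rc = 0;
            break;
        }
    }
    fclose(fp);
    return rc;
}
```

A State value starting with "T" together with a non-zero TracerPid would correspond to the "tracing stop" situation quoted above.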

Wind

May 19, 2008, 1:31:40 AM
On May 17, 5:48 pm, Paul Pluzhnikov <ppluzhnikov-...@gmail.com> wrote:

Hi Paul,

Thanks for the reply.
Following are a few more details:

ps -a
====
3088 29440 0 15:45 pts/5 00:00:00 /bin/sh /usr/bin/pstack 29440
3097 3088 0 15:45 pts/5 00:00:00 sed -n -e s/^(gdb) // -e /^#/p -e /^Thread/p
29440 15781 0 May15 pts/5 00:00:01 ./myapplication
3096 3088 1 15:45 pts/5 00:00:07 gdb --quiet --readnever -nx /proc/29440/exe 29440


Stack trace of gdb
#0 0x005a47a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1 0x00678dd3 in __write_nocancel () from /lib/tls/libc.so.6
#2 0x0061e89f in _IO_new_file_write () from /lib/tls/libc.so.6
#3 0x0061d31b in _IO_new_do_write () from /lib/tls/libc.so.6
#4 0x0061dfa2 in _IO_new_file_sync () from /lib/tls/libc.so.6
#5 0x006136bc in fflush () from /lib/tls/libc.so.6
#6 0x080fe16e in print_frame_info ()
#7 0x080ff5eb in parse_frame_specification ()
#8 0x0807ef6d in throw_exception ()
#9 0x0807f112 in catch_errors ()
#10 0x080ff307 in parse_frame_specification ()
#11 0x0807f52c in execute_command ()
#12 0x08101302 in save_infrun_state ()
#13 0x0807f52c in execute_command ()
#14 0x08104e40 in async_disable_stdin ()
#15 0x0810538e in async_disable_stdin ()
#16 0x081040ad in delete_file_handler ()
#17 0x08103bbb in standard_macro_lookup ()
#18 0x08104352 in gdb_do_one_event ()
#19 0x0807ef6d in throw_exception ()
#20 0x0807f112 in catch_errors ()
#21 0x080b9707 in _initialize_tui_hooks ()
#22 0x00000000 in ?? ()

/proc/<pstack-pid>/status
Name: pstack
State: S (sleeping)
SleepAVG: 68%
Tgid: 3088
Pid: 3088
PPid: 29440
TracerPid: 0
Uid: 5827 5827 5827 5827
Gid: 1000 1000 1000 1000
FDSize: 256
Groups: 1000
VmSize: 5472 kB
VmLck: 0 kB
VmRSS: 996 kB
VmData: 168 kB
VmStk: 1324 kB
VmExe: 577 kB
VmLib: 1271 kB
StaBrk: 080e3000 kB
Brk: 0972f000 kB
StaStk: bfeb8960 kB
ExecLim: 080d8000
Threads: 1
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000416203
SigIgn: 0000000000001005
SigCgt: 0000000000010002
CapInh: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000


/proc/<gdb-pid>/status

Name: gdb
State: S (sleeping)
SleepAVG: 88%
Tgid: 3096
Pid: 3096
PPid: 3088
TracerPid: 0
Uid: 5827 5827 5827 5827
Gid: 1000 1000 1000 1000
FDSize: 256
Groups: 1000
VmSize: 174804 kB
VmLck: 0 kB
VmRSS: 170508 kB
VmData: 168356 kB
VmStk: 472 kB
VmExe: 2124 kB
VmLib: 1648 kB
StaBrk: 08281000 kB
Brk: 0be9b000 kB
StaStk: bff92cb0 kB
ExecLim: 0825b000
Threads: 1
SigPnd: 0000000000000000
ShdPnd: 0000000000010000
SigBlk: 0000000000416203
SigIgn: 0000000000001000
SigCgt: 0000000000030087
CapInh: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000

/proc/<sed-pid>/status
Name: sed
State: S (sleeping)
SleepAVG: 90%
Tgid: 3097
Pid: 3097
PPid: 3088
TracerPid: 0
Uid: 5827 5827 5827 5827
Gid: 1000 1000 1000 1000
FDSize: 256
Groups: 1000
VmSize: 4816 kB
VmLck: 0 kB
VmRSS: 640 kB
VmData: 172 kB
VmStk: 1252 kB
VmExe: 40 kB
VmLib: 1252 kB
StaBrk: 08058000 kB
Brk: 08803000 kB
StaStk: bfecd130 kB
ExecLim: 08052000
Threads: 1
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000406203
SigIgn: 0000000000001001
SigCgt: 0000000000000000
CapInh: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000

Regards,
Umesh

John Reiser

May 19, 2008, 6:21:27 AM
> ps -a
> ====

> 3096 3088 1 15:45 pts/5 00:00:07 gdb --quiet --readnever -nx /proc/29440/exe 29440

gdb is waiting for input from "the console" of the processes in the
signal group for process 3096. Often "the console" is /dev/tty,
which is a pseudo-device that is specific to each process group.
See "man setpgrp" for some of the details.

In order to run gdb in batch mode, you *MUST* specify "-batch" as a
commandline parameter to gdb. If you do not specify "-batch" then
gdb assumes interactive use, and will adjust its usage of buffers
for stdin/stdout, and _will_ attempt to use interactive input.

"--readnever" avoids reading some of the symbolic debug information
for various ELF files, and has nothing to do with command input to gdb.

How to use gdb to get a backtrace while running was thoroughly discussed
about 3.5 to 4 years ago. For instance:
http://groups.google.com/group/comp.os.linux.development.apps/msg/2d78674dc148ed1f
-----
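#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Wrapper function added so the fragment compiles; the name is arbitrary. */
void print_backtrace(void)
{
    char str[100+4096];
    char path[4096];
    ssize_t len = readlink("/proc/self/exe", path, sizeof(path) - 1);

    if (len < 0)
        return;             /* could not resolve our own executable */
    path[len] = '\0';       /* readlink does not NUL-terminate */
    snprintf(str, sizeof(str),
             "echo 'bt\ndetach\nquit\n' | gdb -batch -x /dev/stdin %s %d\n",
             path, (int)getpid());
    system(str);
}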
-----
where the commands in the 'echo' string [and its length] should be adjusted
to taste. The commands might include "info threads\n", etc.

--

Paul Pluzhnikov

May 20, 2008, 12:59:44 AM
John Reiser <jre...@BitWagon.com> writes:

>> 3096 3088 1 15:45 pts/5 00:00:07 gdb --quiet --readnever -nx /proc/29440/exe 29440
>
> gdb is waiting for input from "the console" of the processes in the
> signal group for process 3096.

From the stack trace OP posted, I would conclude that gdb is waiting
in a write(), not a read(). Are you sure your diagnosis is correct?

> In order to run gdb in batch mode, you *MUST* specify "-batch" as a
> commandline parameter to gdb. If you do not specify "-batch" then
> gdb assumes interactive use, and will adjust its usage of buffers
> for stdin/stdout, and _will_ attempt to use interactive input.
>
> "--readnever" avoids reading some of the symbolic debug information
> for various ELF files, and has nothing to do with command input to gdb.

The OP isn't using gdb directly, he is using it via /usr/bin/pstack.
Missing '-batch' is a bug in pstack then... Which I see is not
fixed in at least pstack-1.2-7.2.2 on FC6.

Here is what that version of pstack does:

[$readnever=-readnever if gdb supports it;
$backtrace='bt' or 'thread apply all bt' depending on whether the
process is multithreaded.]

$GDB --quiet $readnever -nx /proc/$1/exe $1 <<EOF 2>&1 |
$backtrace
EOF
/bin/sed -n \
-e 's/^(gdb) //' \
-e '/^#/p' \
-e '/^Thread/p'


I don't see how this could hang on either input or output ...

Aha! I think I do: OP wrote:

Wind <umesh_...@yahoo.com> writes:
>> I am also using pipe() to capture the output of child
>> process on stdout.

I bet the output from /usr/bin/pstack is going into a pipe, and
the reader on that pipe is the application for which the stack
trace is being generated!

Wind, if that's in fact what you are doing, you need to think
about it some more. In particular, you need to understand what
happens to the pipe writer when the pipe reader is suspended.
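
To make the point about a suspended reader concrete: a pipe holds only a limited amount of data, and once it is full a blocking write() sleeps until somebody reads. The sketch below measures that limit by filling a pipe that nobody reads, using a non-blocking write end so that the program itself does not hang. The helper name pipe_capacity_bytes is an assumption for illustration, and the number it returns depends on the kernel.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper (not from the thread): return how many bytes a
 * pipe accepts while nobody reads from it, or -1 on error. */
long pipe_capacity_bytes(void)
{
    int fds[2];
    char chunk[512] = {0};
    long total = 0;
    int flags;

    if (pipe(fds) != 0)
        return -1;

    /* Make the write end non-blocking so a full pipe yields EAGAIN
     * instead of putting this process to sleep. */
    flags = fcntl(fds[1], F_GETFL);
    if (flags == -1 || fcntl(fds[1], F_SETFL, flags | O_NONBLOCK) == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    for (;;) {
        ssize_t n = write(fds[1], chunk, sizeof(chunk));
        if (n > 0) {
            total += n;
            continue;
        }
        if (n == -1 && errno == EINTR)
            continue;
        break;  /* typically EAGAIN: the pipe is full. A blocking writer
                   would sleep here until the reader drains the pipe. */
    }

    close(fds[0]);
    close(fds[1]);
    return total;
}
```

If the only reader is a process that the writer itself has stopped, that sleep never ends once the output exceeds this capacity.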

Wind

May 20, 2008, 10:48:22 AM
On May 20, 5:59 am, Paul Pluzhnikov <ppluzhnikov-...@gmail.com> wrote:
> Wind <umesh_wagh...@yahoo.com> writes:
> >> I am also using pipe() to capture the output of child
> >> process on stdout.
>
> I bet the output from /usr/bin/pstack is going into a pipe, and
> the reader on that pipe is the application for which the stack
> trace is being generated!
>
> Wind, if that's in fact what you are doing, you need to think
> about it some more. In particular, you need to understand what
> happens to pipe writer when pipe reader is suspended.
>
> Cheers,
> --
> In order to understand recursion you must first understand recursion.
> Remove /-nsp/ for email.

Hi Paul,

You are correct. After looking at the stack trace of gdb, I revisited
my code to find out why gdb was unable to write and was hung. I am
reading from the pipe in the parent process, which is suspended by gdb.

Regarding the bug in pstack: for test purposes I copied gstack as
testpstack into my home directory (pstack is a link to gstack, which
is itself a script). I modified
"$GDB --quiet $readnever -nx /proc/$1/exe $1 <<EOF 2>&1 |$backtrace"
(last line) to use "-batch". I observed that with this new command,
it never returns.

Do you know if there is an equivalent of Solaris pstack available
for Red Hat? I have pstack1-1-7 on RHEL3, which is a binary and not a
shell script, but with it I get the stack trace of only the main thread.

Regards,
Umesh

Paul Pluzhnikov

May 21, 2008, 2:05:28 AM
Wind <umesh_...@yahoo.com> writes:

> Regarding the bug in pstack: for test purposes I copied gstack as
> testpstack into my home directory (pstack is a link to gstack, which
> is itself a script). I modified
> "$GDB --quiet $readnever -nx /proc/$1/exe $1 <<EOF 2>&1 |$backtrace"
> (last line) to use "-batch". I observed that with this new command,
> it never returns.

The command returns all right; it just doesn't print anything.
With '-batch' gdb actually doesn't read anything from stdin,
so the whole "echo ... | gdb ..." business falls apart.

> Do you know if there is an equivalent of Solaris pstack available
> for Red Hat? I have pstack1-1-7 on RHEL3, which is a binary and not a
> shell script, but with it I get the stack trace of only the main thread.

You appear to think that switching to a different method of obtaining
the stack trace will cure the problem, but it will not: any method
you choose will have to temporarily stop your process. If you insist
on reading the resulting stack trace by the process itself, and
from a pipe, then you'll hit the exact same problem again.

You need to temporarily buffer pstack output somewhere other than
in the pipe going back to the process. A temporary file will solve
your problem quite nicely, and will work with your current pstack
implementation.

Wind

May 21, 2008, 4:54:07 AM
On May 21, 7:05 am, Paul Pluzhnikov <ppluzhnikov-...@gmail.com> wrote:

I agree. To solve this, I have opened one file and replaced
STDOUT_FILENO with the fd of opened file. The pstack output now goes
to the file pointed to by fd. Do you see any issue with this approach?
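
For illustration, the file-based approach described here might look roughly like the sketch below: the child's stdout (and, as an added assumption, stderr) is pointed at a regular file with dup2() before the exec, and the parent reads the file only after waitpid() returns. The helper name run_with_output_to_file is hypothetical, and this is not the poster's actual code.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not from the thread): run cmd[0] with arguments cmd,
 * sending its stdout and stderr to the file outpath.
 * Returns the child's exit status, or -1 on failure. */
int run_with_output_to_file(char *const cmd[], const char *outpath)
{
    int status;
    pid_t pid;
    int fd = open(outpath, O_WRONLY | O_CREAT | O_TRUNC, 0600);

    if (fd == -1)
        return -1;

    pid = fork();
    if (pid == -1) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        /* Child: in a multi-threaded parent, keep to simple system
         * calls between fork and exec. */
        if (dup2(fd, STDOUT_FILENO) == -1 || dup2(fd, STDERR_FILENO) == -1)
            _exit(127);
        if (fd > STDERR_FILENO)
            close(fd);
        execvp(cmd[0], cmd);
        _exit(127);                     /* reached only if execvp fails */
    }

    close(fd);                          /* parent does not need the file open */
    while (waitpid(pid, &status, 0) == -1) {
        if (errno != EINTR)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because a regular file never makes the writer wait for a reader, the child can finish and release the parent before the parent reads anything back. The file should be given a unique name (for example via mkstemp()) if several traces may be requested at once.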

John Reiser

May 21, 2008, 10:13:01 AM
> With '-batch' gdb actually doesn't read anything from stdin,
> so the whole "echo ... | gdb ..." business falls apart.

The code does work:
echo '...' | gdb -batch -x /dev/stdin /proc/<pid>/exe <pid>
where "<pid>" is syntax for the numerical process id.

It's true that "-batch" ignores stdin, but then "-x /dev/stdin"
causes gdb to open and read from file /dev/stdin, which accomplishes
the goal of allowing 'echo' to create a command stream via pipe.

--
