
return status


shaanxxx

Jan 29, 2007, 9:26:50 AM
I have the following piece of code:

void autoclose(void)
{
    closesystem(); // user function
}

int main()
{
    opensystem(); // user function
    atexit(autoclose);
    return 0;
}

When I run the above program and check the exit status, I get 144.

Now I modify the code as follows:

int main()
{
    opensystem(); // user function
    closesystem();
    return 0;
}

Now I get an exit status of zero. I am not able to figure out what could
be the reason for the different status. The expected status is zero.

I am getting zero on Solaris 32-bit and 64-bit machines, and I also get
the right status on 32-bit Linux. But on 64-bit Linux I am getting 144.
Any comments would be appreciated.

$ uname -a
Linux redhats.xyz.com 2.6.9-34.ELsmp #1 SMP Fri Feb 24 16:56:28 EST 2006 x86_64 x86_64 x86_64 GNU/Linux

redhats: $ g++ -v
Reading specs from /usr/lib/gcc/x86_64-redhat-linux/3.4.5/specs
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --enable-shared --enable-threads=posix --disable-checking --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-java-awt=gtk --host=x86_64-redhat-linux
Thread model: posix
gcc version 3.4.5 20051201 (Red Hat 3.4.5-2)

Thanks,
Shaan.

shaanxxx

Jan 29, 2007, 10:03:34 AM

Is there a way to know the exit status in an atexit handler?

Paul Pluzhnikov

Jan 29, 2007, 10:33:22 AM
"shaanxxx" <shaa...@yahoo.com> writes:

> int main()
> {
> opensystem(); // user function
> atexit(autoclose);
> return 0;
> }
>
> When i run above programme and check the status , i get 144.

This means that the program died with signal 16.
You can find out which signal that is by looking in /usr/include/signal.h
or on Linux, in /usr/include/asm/signal.h:

#define SIGSTKFLT 16
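
The arithmetic behind that reading: bash and most sh implementations report a process killed by signal N as status 128 + N, so 144 reads as 128 + 16. A minimal sketch of the decoding in C follows; the function name is made up for illustration, and the result is only a guess, since a plain exit(144) produces the same shell status.

```c
#include <assert.h>

/* Hypothetical helper: given a shell "$?" value, return the signal number
 * that a shell would have encoded as 128 + N, or 0 if the value is <= 128.
 * This is only a guess: exit(144) and death by signal 16 look identical. */
int signal_guess_from_shell_status(int status)
{
    assert(status >= 0 && status <= 255); /* shells only report 0..255 */
    return status > 128 ? status - 128 : 0;
}
```

For the status in question, signal_guess_from_shell_status(144) gives 16.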

> Now i modify above code to following code
> int main()
> {
> opensystem(); // user function
> closesystem();
> return 0;
> }
>
> Now i get return status as zero. I am not able to figure what could be
> the reason to get different status.

There is something about closesystem() that makes it crash when
called from exit handler. You need to find out why it crashes,
and in order to do that, run it under debugger (gdb on Linux).

> $ uname -a
> Linux redhats.xyz.com 2.6.9-34.ELsmp #1 SMP Fri Feb 24 16:56:28 EST
> 2006 x86_64 x86_64 x86_64 GNU/Linux

I was unable to construct a test case that dies with SIGSTKFLT:
stack exhaustion seems to cause SIGSEGV on my systems.

In fact, SIGSTKFLT isn't even referenced anywhere in the kernel
source:

$ find . -type f | xargs grep SIGSTKFLT
./include/asm-i386/signal.h:#define SIGSTKFLT 16
./include/asm-ia64/signal.h:#define SIGSTKFLT 16
./include/asm-m68k/signal.h:#define SIGSTKFLT 16
./include/asm-ppc/signal.h:#define SIGSTKFLT 16
./include/asm-x86_64/signal.h:#define SIGSTKFLT 16
./kernel/signal.c: * | SIGSTKFLT | terminate |

Are you sure you are getting exit status 144 ?

Cheers,
--
In order to understand recursion you must first understand recursion.
Remove /-nsp/ for email.

Paul Pluzhnikov

Jan 29, 2007, 10:36:59 AM
"shaanxxx" <shaa...@yahoo.com> writes:

> Is there way to know exit status in atexit handler ?

Think about it: exit status is the value that will be passed to the
parent process by the *kernel* when this process exits *or* dies. It
could be the value that the user passed to exit(3), or it could be
the value of fatal signal that kills the process.

How could the process know the future (without a time machine?)

Barry Margolin

Jan 29, 2007, 9:59:19 PM
In article <m3odohc...@somewhere.in.california.localhost>,
Paul Pluzhnikov <ppluzhn...@charter.net> wrote:

> "shaanxxx" <shaa...@yahoo.com> writes:
>
> > Is there way to know exit status in atexit handler ?
>
> Think about it: exit status is the value that will be passed to the
> parent process by the *kernel* when this process exits *or* dies. It
> could be the value that the user passed to exit(3), or it could be
> the value of fatal signal that kills the process.
>
> How could the process know the future (without a time machine?)

Since atexit handlers aren't run if the process dies due to a signal,
that case is obviously irrelevant. So the only relevant case is when
exit(3) is called, and it certainly would be possible to make the exit
code available to atexit handlers, e.g. put it in a global variable.
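
A minimal sketch of the global-variable idea, assuming the program agrees to call a wrapper instead of exit() directly. The names recorded_exit_code, remember_exit_code, exit_with and report_exit_code are hypothetical, not part of any library.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical global: the code passed to exit_with(), or -1 if unset. */
static int recorded_exit_code = -1;

/* Record the code so that atexit handlers can read it later. */
void remember_exit_code(int code)
{
    assert(code >= 0 && code <= 255); /* only the low 8 bits reach the parent */
    recorded_exit_code = code;
}

/* Accessor for handlers that live in other translation units. */
int remembered_exit_code(void)
{
    return recorded_exit_code;
}

/* Example atexit handler: reports the recorded code on stderr. */
void report_exit_code(void)
{
    fprintf(stderr, "exiting with status %d\n", recorded_exit_code);
}

/* Call this instead of exit(); it records the code, then exits. */
void exit_with(int code)
{
    remember_exit_code(code);
    exit(code);
}
```

Usage would be atexit(report_exit_code) early in main() and exit_with(3) wherever exit(3) was called before. A plain return from main() bypasses the wrapper, so the handler would then see -1.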

--
Barry Margolin, bar...@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
*** PLEASE don't copy me on replies, I'll read them in the group ***

Paul Pluzhnikov

Jan 29, 2007, 10:52:20 PM
Barry Margolin <bar...@alum.mit.edu> writes:

> Since atexit handlers aren't run if the process dies due to a signal,
> that case is obviously irrelevant.

It's not.

It is pretty clear from the original post that the process dies
*while* the atexit handler is executing (or shortly thereafter).

shaanxxx

Jan 30, 2007, 12:33:08 AM
On Jan 29, 8:33 pm, Paul Pluzhnikov <ppluzhnikov-...@charter.net>
wrote:

> "shaanxxx" <shaan...@yahoo.com> writes:
> > int main()
> > {
> > opensystem(); // user function
> > atexit(autoclose);
> > return 0;
> > }
>
> > When i run above programme and check the status , i get 144.
>
> This means that the program died with signal 16.
> You can find out which signal that is by looking in /usr/include/signal.h
> or on Linux, in /usr/include/asm/signal.h:
>
> #define SIGSTKFLT 16
>
> > Now i modify above code to following code
> > int main()
> > {
> > opensystem(); // user function
> > closesystem();
> > return 0;
> > }
>
> > Now i get return status as zero. I am not able to figure what could be
> > the reason to get different status.
>
> There is something about closesystem() that makes it crash when
> called from exit handler. You need to find out why it crashes,
> and in order to do that, run it under debugger (gdb on Linux).
>

I tried with gdb; it didn't crash.

gdb sessions:
[Thread debugging using libthread_db enabled]
[New Thread 182903270944 (LWP 2377)]
[New Thread 1084229984 (LWP 2380)]
[New Thread 1094719840 (LWP 2381)]

Program received signal SIG32, Real-time event 32.
[Switching to Thread 1094719840 (LWP 2381)]
0x000000320458f1a5 in __nanosleep_nocancel () from /lib64/tls/libc.so.6
(gdb) c
Continuing.
[Thread 1094719840 (LWP 2381) exited]

Program received signal SIG32, Real-time event 32.
[Switching to Thread 1084229984 (LWP 2380)]
0x000000320500b0cf in __read_nocancel () from /lib64/tls/libpthread.so.0
(gdb) c
Continuing.
[Thread 1084229984 (LWP 2380) exited]

Program exited with code 0220.

> > $ uname -a
> > Linux redhats.xyz.com 2.6.9-34.ELsmp #1 SMP Fri Feb 24 16:56:28 EST
> > 2006 x86_64 x86_64 x86_64 GNU/Linux
>
> I was unable to construct a test case that dies with SIGSTKFLT:
> stack exhaustion seems to cause SIGSEGV on my systems.
>
> In fact, SIGSTKFLT isn't even referenced anywhere in the kernel
> source:
>
> $ find . -type f | xargs grep SIGSTKFLT
> ./include/asm-i386/signal.h:#define SIGSTKFLT 16
> ./include/asm-ia64/signal.h:#define SIGSTKFLT 16
> ./include/asm-m68k/signal.h:#define SIGSTKFLT 16
> ./include/asm-ppc/signal.h:#define SIGSTKFLT 16
> ./include/asm-x86_64/signal.h:#define SIGSTKFLT 16
> ./kernel/signal.c: * | SIGSTKFLT | terminate |
>
> Are you sure you are getting exit status 144 ?

Yes, I am getting a return status of 144 in all my applications. Because
of this, all my tests started to fail.

shaanxxx

Jan 30, 2007, 4:31:48 AM
On Jan 29, 8:33 pm, Paul Pluzhnikov <ppluzhnikov-...@charter.net>
wrote:

I have installed a signal handler for SIGSTKFLT, but the handler is not
getting called. That is, the process is not receiving SIGSTKFLT.


Paul Pluzhnikov

Jan 30, 2007, 6:19:03 PM
"shaanxxx" <shaa...@yahoo.com> writes:

> I tried with dbg , i didnt crash .

The whole crash thing was a "red herring".

Your program actually exit()s with code 144.
When you look from shell, you can't distinguish that from "death
via SIGSTKFLT", but gdb tells you that it exited, not "was killed".

> gdb sessions:
> [Thread debugging using libthread_db enabled]
> [New Thread 182903270944 (LWP 2377)]

...


> Program received signal SIG32, Real-time event 32.

This is rather strange -- properly working gdb/libc/libthread_db
combination should never report SIG32 events.

Have you perhaps updated your libc without a matching update to
libthread_db ? (On RHEL both of these come from glibc package,
so it's unlikely, unless you build glibc "by hand").

> Program exited with code 0220.

So gdb tells you that the last thread exiting did it with exit
status 0220 == 144.

Next: set breakpoint on _exit (and possibly __GI__exit).

If the breakpoint is hit, find out who called _exit(144), and debug
from there.

Barry Margolin

Jan 31, 2007, 2:09:54 AM
In article <m37iv49...@somewhere.in.california.localhost>,
Paul Pluzhnikov <ppluzhn...@charter.net> wrote:

> "shaanxxx" <shaa...@yahoo.com> writes:
>
> > I tried with dbg , i didnt crash .
>
> The whole crash thing was a "red herring".
>
> Your program actually exit()s with code 144.
> When you look from shell, you can't distinguish that from "death
> via SIGSTKFLT", but gdb tells you that it exited, not "was killed".

How would exit(144) get confused with death from SIGSTKFLT? The status
code returned by wait() contains either the exit code or signal number
in the high-order 8 bits, while the low-order 7 bits contain the reason
why wait() returned -- 0 means the process exited normally, 0177 means
it stopped and can be resumed, and anything else means it terminated due
to a signal. You seem to be suggesting some kind of overflow, but I
don't see how that can happen.

Paul Pluzhnikov

Jan 31, 2007, 9:30:23 AM
Barry Margolin <bar...@alum.mit.edu> writes:

> How would exit(144) get confused with death from SIGSTKFLT?

echo "int main() { return 144; }" | gcc -xc -o t1 -
printf '#include <signal.h>\n#include <unistd.h>\nint main() { kill(getpid(), 16); return 0; }\n' | gcc -xc -o t2 -

cat << 'EOT' > t.sh
./t1; echo "t1 exited with $?"
./t2; echo "t2 exited with $?"
EOT

sh t.sh 2>/dev/null

Produces:

t1 exited with 144
t2 exited with 144

I claimed that *just* looking at exit code from *shell* (i.e. looking
at $?) one can't tell the difference.
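
From C the two cases can be told apart, because waitpid() returns the full wait status rather than the folded value the shell shows. A minimal sketch, assuming a POSIX system; the helper names are made up for illustration, and shell_status() mimics the 128 + N convention of bash-style shells.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Wait for child pid and return its raw wait status, or -1 on error. */
static int reap(pid_t pid)
{
    int wstatus = 0;
    if (pid < 0 || waitpid(pid, &wstatus, 0) < 0)
        return -1;
    return wstatus;
}

/* Fork a child that exits normally with the given code. */
int wait_status_of_exit(int code)
{
    pid_t pid = fork();
    if (pid == 0)
        _exit(code);
    return reap(pid);
}

/* Fork a child that kills itself with sig (default action restored). */
int wait_status_of_signal(int sig)
{
    assert(sig > 0);
    pid_t pid = fork();
    if (pid == 0) {
        sigset_t set;
        sigemptyset(&set);
        sigaddset(&set, sig);
        sigprocmask(SIG_UNBLOCK, &set, NULL);
        signal(sig, SIG_DFL);
        raise(sig);
        _exit(0); /* reached only if sig did not terminate the child */
    }
    return reap(pid);
}

/* The value a Bourne-style shell such as bash would show in $?. */
int shell_status(int wstatus)
{
    if (WIFEXITED(wstatus))
        return WEXITSTATUS(wstatus);
    if (WIFSIGNALED(wstatus))
        return 128 + WTERMSIG(wstatus);
    return -1;
}
```

With code 144 and signal 16 on Linux, WIFEXITED() is true for the first status and WIFSIGNALED() for the second, while shell_status() returns 144 for both.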

shaanxxx

Jan 31, 2007, 10:18:11 AM
On Jan 31, 7:30 pm, Paul Pluzhnikov <ppluzhnikov-...@charter.net>
wrote:

gdb session:
#0 0x000000320458f4b0 in _exit () from /lib64/tls/libc.so.6
#1 0x0000003204530cbb in exit () from /lib64/tls/libc.so.6
#2 0x000000320451c4c2 in __libc_start_main () from /lib64/tls/libc.so.6
#3 0x0000000000400d5a in _start ()

exit is called from __libc_start_main.

I tried strace to see the difference between the two programs and found
the following diff:

< exit_group(0) = ? // this program returns 0
---
> exit_group(705124752) = ? // this program returns 0

I am sure that exit(144) is not called from my program, as we saw in the
gdb session. And I think the program is not getting signal 16: I have
installed a signal handler for 16 and the handler was not called.

Any comments on this?

Paul Pluzhnikov

Jan 31, 2007, 11:20:40 AM
"shaanxxx" <shaa...@yahoo.com> writes:

> On Jan 31, 7:30 pm, Paul Pluzhnikov <ppluzhnikov-...@charter.net>
> wrote:
>>
>> echo "int main() { return 144; }" | gcc -xc -o t1 -
>> echo "int main() { kill(getpid(), 16); }" | gcc -xc -o t2 -

> gdb session :

What does this have to do with the message you are replying to?
Please "reply" to the message you are actually replying to, not
"any random" message in this thread.

> #0 0x000000320458f4b0 in _exit () from /lib64/tls/libc.so.6
> #1 0x0000003204530cbb in exit () from /lib64/tls/libc.so.6
> #2 0x000000320451c4c2 in __libc_start_main () from /lib64/tls/libc.so.6
> #3 0x0000000000400d5a in _start ()
>
> exit is called from __libc_start_main.

The code in sysdeps/generic/libc-start.c is roughly:

    if (!setjmp(...)) {
        /* first pass: setjmp() returned 0, so run main() */
        result = main (argc, argv, __environ);
    } else {
        result = 0;
        if (!last_thread) __exit_thread(0);
    }
    exit (result);

I can't think of any way for result to be 0x2a075990 (== 705124752),
unless that's what main() returned.

Debugging this will be difficult and will require dexterity with
'gdb' for assembly-level debugging.

If you can produce a complete test case exhibiting the same
behaviour, you'll have a better chance of someone debugging this
for you.

> I tried strace to see the difference of 2 programme and found
> following
> diff
> < exit_group(0) = ? // this programme return 0
> ---
> > exit_group(705124752) = ? // this programme returns 0

Right: the last byte of that exit code is 0x90 == 144.
The value itself looks like a reasonable pointer on x86_64.
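
The truncation is easy to check: only the low 8 bits of the value passed to exit() reach the parent, and 705124752 & 0xff is 0x90, which is 144. A minimal sketch, assuming a POSIX system; the function name is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that calls _exit(value) and return the exit status the
 * parent observes through waitpid(), or -1 if fork/waitpid fails. */
int exit_status_seen_by_parent(int value)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(value);

    int wstatus = 0;
    if (waitpid(pid, &wstatus, 0) < 0)
        return -1;
    assert(WIFEXITED(wstatus)); /* the child only ever calls _exit() */
    return WEXITSTATUS(wstatus);
}
```

Calling exit_status_seen_by_parent(705124752) yields 144, the status seen from the shell.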

> I am sure that exit(144) is not called from my programme as we saw in
> gdb session.

Yes, we can safely assume that.

> And I think , programme is not getting signal 16.

You *know* it doesn't -- gdb would have told you if it did.

> I have
> installed signal handler for 16 and the handler was not called from my
> programme.

That actually proves nothing.

Cheers

JoelKatz

Feb 1, 2007, 1:02:43 AM
On Jan 30, 1:31 am, "shaanxxx" <shaan...@yahoo.com> wrote:

> I have installed the signal handler for SIGSTKFLT, But signal handler
> is not getting called. I meant process is not getting SIGSTKFLT.

You really think the system can call a signal handler after a stack
fault? Where would it put the parameters to the signal handler
function?

DS

Paul Pluzhnikov

Feb 1, 2007, 1:24:23 AM
"JoelKatz" <dav...@webmaster.com> writes:

> You really think the system can call a signal handler after a stack
> fault?

Sure, especially if sigaltstack was called before the fault.

> Where would it put the parameters to the signal handler
> function?

On x86_64 under Linux, the first 6 integer or pointer call parameters
do not go onto the stack; they are passed in registers.

You are correct in that OP's installing SIGSTKFLT handler is futile
and doesn't prove anything, but your reasoning is all wrong.

JoelKatz

Feb 4, 2007, 11:35:34 AM
On Jan 31, 10:24 pm, Paul Pluzhnikov <ppluzhnikov-...@charter.net>
wrote:

> "JoelKatz" <dav...@webmaster.com> writes:
> > You really think the system can call a signal handler after a stack
> > fault?
>
> Sure, especially if sigaltstack was called before the fault.

There is absolutely nothing to suggest the OP did that.

> > Where would it put the parameters to the signal handler
> > function?
>
> On x86_64 under Linux, the first 6 integer or pointer call parameters
> do not go onto the stack; they are passed in registers.

Then where would it put the return address?

> You are correct in that OP's installing SIGSTKFLT handler is futile
> and doesn't prove anything, but your reasoning is all wrong.

What is the correct reasoning then?

DS

Paul Pluzhnikov

Feb 4, 2007, 11:48:12 AM
"JoelKatz" <dav...@webmaster.com> writes:

>> Sure, especially if sigaltstack was called before the fault.
>
> There is absolutely nothing to suggest the OP did that.

That's correct.
I was merely objecting to your statement:

JK> You really think the system can call a signal handler after
JK> a stack fault?

Yes, the system *can* call signal handler after a stack fault,
*provided* sigaltstack() was done before.

>> > Where would it put the parameters to the signal handler
>> > function?
>>
>> On x86_64 under Linux, the first 6 integer or pointer call parameters
>> do not go onto the stack; they are passed in registers.
>
> Then where would it put the return address?

Return address is pushed on the stack, but you didn't say "return
address", you said "parameters".

> What is the correct reasoning then?

It has already appeared in this thread:

1. Gdb says "exited with 0220", not "killed with ..." -- this is a sure
indication that SIGSTKFLT (or any other signal) did *not* happen.

2. The same is also evident from OP's strace:

exit_group(705124752) = ?

3. PP> In fact, SIGSTKFLT isn't even referenced anywhere in the kernel
PP> source

Each one of the above is sufficient to state that "installing
SIGSTKFLT is futile for this test case -- it will never fire".

JoelKatz

Feb 5, 2007, 6:40:44 AM
On Feb 4, 8:48 am, Paul Pluzhnikov <ppluzhnikov-...@charter.net>
wrote:

> "JoelKatz" <dav...@webmaster.com> writes:
> >> Sure, especially if sigaltstack was called before the fault.
>
> > There is absolutely nothing to suggest the OP did that.

> That's correct.
> I was merely objecting to your statement:

With an objection that's not relevant to the context in which I made
it?!

> JK> You really think the system can call a signal handler after
> JK> a stack fault?

> Yes, the system *can* call signal handler after a stack fault,
> *provided* sigaltstack() was done before.

Right, but that's not this case.

> >> > Where would it put the parameters to the signal handler
> >> > function?

> >> On x86_64 under Linux, the first 6 integer or pointer call parameters
> >> do not go onto the stack; they are passed in registers.

> > Then where would it put the return address?

> Return address is pushed on the stack, but you didn't say "return
> address", you said "parameters".

The return address is a parameter, a hidden parameter, but a parameter
nonetheless.

> > What is the correct reasoning then?
>
> It has already appeared in this thread:
>
> 1. Gdb says "exited with 0220", no "killed with ..." -- this is a sure
> indication that SIGSTKFLT (or any other signal) did *not* happen.
>
> 2. The same is also evident from OP's strace:
>
> exit_group(705124752) = ?
>
> 3. PP> In fact, SIGSTKFLT isn't even referenced anywhere in the kernel
> PP> source
>
> Each one of the above is sufficient to state that "installing
> SIGSTKFLT is futile for this test case -- it will never fire".

I was saying that his installing the signal handler didn't prove
anything. You said I was correct, but that my reasoning was wrong.
Then you list a bunch of explanations that have nothing to do with his
signal handler.

I'm baffled, but whatever.

DS

shaanxxx

Feb 13, 2007, 8:48:09 AM
On Jan 31, 9:20 pm, Paul Pluzhnikov <ppluzhnikov-...@charter.net>
wrote:
I have located the code that was causing the problem. Something is wrong
with it, but I am not able to figure out what. A slight change in the
program (adding a printf) changes the return status from one number to
another.

$ ./asm;echo $? #without printf statement
132
$ ./asm;echo $? #with printf statement.
got mutex :)
20


#include <stdio.h>
#include <stdlib.h>

int mutex(const int process,
          int wants_,
          int* slot,
          int* mut)
{
    // The fourth argument is available in the %rcx register. Check
    // http://www.x86-64.org. store a copy in a temporary
    __asm("movq %rcx, %r12");
    if (*mut == 0 ) {
        // Store 1 in the temporary register %r13
        __asm("movq $1, %r13");
        // Do the exchange b/w %r13 and fourth argument (access memory
        // location using the address stored in register.)
        __asm("xchg %r13, (%r12)");
        // AND
        __asm("test %r13, %r13");
        // If equal to zero (means %r13 is zero), we got mutex,
        // Otherwise jump.
        __asm("jnz .L1000");
        printf("got mutex :)\n");
        return 0; // True
    }
    __asm(".L1000:");

    printf("Busy :(\n");
    wants_ = 0;
    return -1;
}

int i=0;

void at (void)
{
    mutex(1,2,0,&i);
}

int main()
{
    atexit(at);
    return 0;
}
Thanks

Paul Pluzhnikov

Feb 14, 2007, 1:33:16 PM
"shaanxxx" <shaa...@yahoo.com> writes:

>> In order to understand recursion you must first understand recursion.

Please trim your replies, and read how to properly post on Usenet
e.g. here: http://www.xs4all.nl/%7ewijnands/nnq/nquote.html

> i have located the code which was causing the problem.

Ah! I should have asked whether you have any inline assembly.
A little knowledge is a dangerous thing.

> It has some problem

Indeed. The problem is that this code is bogus.

> int mutex(const int process,
> int wants_,
> int* slot,
> int* mut)
> {
> // The fourth argument is available in the %rcx register. Check
> // http://www.x86-64.org. store a copy in a temporary
> __asm("movq %rcx, %r12");

Who told you that %r12 is "free" (that compiler did not store
anything important in %r12)? What makes you think you can assign
values to "random" registers willy-nilly, and still have a correct
program?

> // If equal to zero (means %r13 is zero), we got mutex, Otherwise jump.
> __asm("jnz .L1000");

This amounts to complete compiler sabotage -- it's as if you wrote
some C code, and I later came in and inserted goto's in a random
fashion, then complained that your program doesn't work.

> printf("got mutex :)\n");
> return 0; // True

Didn't they teach you that 0 is FALSE in C?
Call it "success", but please don't call it "True" (even in comments).

> wants_ = 0;
> return -1;

The assignment above serves no purpose whatsoever.

Now, what we know so far is that mutex() may return garbage (due
to incorrect programming), but how does this translate into exit(20),
or on my system into exit(24) ? [The exit code also changes with
different optimization levels.]

Well, it's because according to
http://refspecs.freestandards.org/elf/x86_64-SysV-psABI.pdf
%r12-r15 are callee-saved registers, preserved across function calls.

Therefore, by assigning anything to %r12 (and %r13) and not restoring
the previous values, you can damage the function that called you (in
this case exit()), which expects to find %r12 undisturbed.

Bottom line: your "int mutex()" is buggy and must be rewritten.
Here is my attempt at it (with a little help from QT):

#include <stdio.h>
#include <stdlib.h>

// courtesy QT
int q_atomic_test_and_set_int(volatile int *ptr, int expected, int newval)
{
    unsigned char ret;
    __asm__ volatile("lock cmpxchgl %2,%3\nsete %1\n" :
                     "=a" (newval), "=qm" (ret) :
                     "r" (newval), "m" (*ptr), "0" (expected) :
                     "memory");
    return (int)ret;
}

int mutex(int *mut)
{
    if (*mut == 0 && q_atomic_test_and_set_int(mut, 0, 1)) {
        printf("got mutex :)\n");
        return 0; // success
    }
    printf("Busy :(\n");
    return -1; // failure
}
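
For comparison, the same test-and-set can be written with no inline assembly at all. This is a sketch assuming a C11 compiler that provides <stdatomic.h>, which was not available when this thread was written; try_lock and unlock are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Try to take the lock: atomically change *mut from 0 to 1.
 * Returns true if this caller got it, false if it was already held. */
bool try_lock(atomic_int *mut)
{
    int expected = 0;
    assert(mut != NULL);
    return atomic_compare_exchange_strong(mut, &expected, 1);
}

/* Release the lock by storing 0. */
void unlock(atomic_int *mut)
{
    assert(mut != NULL);
    atomic_store(mut, 0);
}
```

Because the compiler generates the atomic instruction itself, it also does its own register allocation around it, so the callee-saved register problem described above cannot arise.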


Cheers,


shaanxxx

Feb 16, 2007, 8:40:42 AM
> > // The fourth argument is available in the %rcx register. Check
> > //http://www.x86-64.org. store a copy in a temporary

> > __asm("movq %rcx, %r12");
>
> Who told you that %r12 is "free" (that compiler did not store
> anything important in %r12)? What makes you think you can assign
> values to "random" registers willy-nilly, and still have a correct
> program?
Can I use the r8, r9 and r11 registers to store something?
I believe using these registers will not hurt the caller, but I am not
sure whether it can affect the current function's working.

Paul Pluzhnikov

Feb 17, 2007, 12:00:56 AM
"shaanxxx" <shaa...@yahoo.com> writes:

> Can i use r8 , r9 and r11 register to store something.

Sure, but you have to tell the compiler that you are clobbering that
register, so it will either save/reload it for its own use, or will
avoid using it altogether.
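
A minimal sketch of what declaring a clobber looks like in GCC extended asm, using %r11 as the scratch register. The function name is hypothetical, and the non-x86_64 branch is a fallback that assumes the GCC/Clang __atomic builtins so the example still builds elsewhere.

```c
#include <assert.h>
#include <stddef.h>

/* Atomically store 1 into *mut and return the previous value. */
int swap_in_one(int *mut)
{
    assert(mut != NULL);
#if defined(__x86_64__) && defined(__GNUC__)
    int old;
    __asm__ volatile(
        "movl $1, %%r11d\n\t"   /* scratch register = 1 */
        "xchgl %%r11d, %1\n\t"  /* atomic swap with *mut */
        "movl %%r11d, %0"       /* hand the old value back */
        : "=r"(old), "+m"(*mut)
        :
        : "r11", "memory");     /* tell the compiler r11 is overwritten */
    return old;
#else
    return __atomic_exchange_n(mut, 1, __ATOMIC_SEQ_CST);
#endif
}
```

A return value of 0 means the caller took the lock, 1 means it was already held. With "r11" in the clobber list the compiler keeps its own values out of that register across the asm statement, which is exactly what the original basic-asm version never told it.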
