void autoclose(void)
{
closesystem(); //user function
}
int main()
{
opensystem(); // user function
atexit(autoclose);
return 0;
}
When I run the above program and check the exit status, I get 144.
Now I modify the code as follows:
int main()
{
opensystem(); // user function
closesystem();
return 0;
}
Now I get an exit status of zero. I am not able to figure out the
reason for the different status.
The expected status is zero.
I get zero on 32-bit and 64-bit Solaris machines, and I also get the
right status on 32-bit Linux. But on 64-bit Linux I get 144. Any
comments would be appreciated.
$ uname -a
Linux redhats.xyz.com 2.6.9-34.ELsmp #1 SMP Fri Feb 24 16:56:28 EST 2006 x86_64 x86_64 x86_64 GNU/Linux
redhats: $ g++ -v
Reading specs from /usr/lib/gcc/x86_64-redhat-linux/3.4.5/specs
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --enable-shared --enable-threads=posix --disable-checking --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-java-awt=gtk --host=x86_64-redhat-linux
Thread model: posix
gcc version 3.4.5 20051201 (Red Hat 3.4.5-2)
Thanks,
Shaan.
Is there a way to know the exit status in an atexit handler?
> int main()
> {
> opensystem(); // user function
> atexit(autoclose);
> return 0;
> }
>
> When i run above programme and check the status , i get 144.
This means that the program died with signal 16 (the shell reports
128 + signal number, and 144 - 128 = 16).
You can find out which signal that is by looking in /usr/include/signal.h
or on Linux, in /usr/include/asm/signal.h:
#define SIGSTKFLT 16
> Now i modify above code to following code
> int main()
> {
> opensystem(); // user function
> closesystem();
> return 0;
> }
>
> Now i get return status as zero. I am not able to figure what could be
> the reason to get different status.
There is something about closesystem() that makes it crash when
called from exit handler. You need to find out why it crashes,
and in order to do that, run it under debugger (gdb on Linux).
> $ uname -a
> Linux redhats.xyz.com 2.6.9-34.ELsmp #1 SMP Fri Feb 24 16:56:28 EST 2006 x86_64 x86_64 x86_64 GNU/Linux
I was unable to construct a test case that dies with SIGSTKFLT:
stack exhaustion seems to cause SIGSEGV on my systems.
In fact, SIGSTKFLT isn't even referenced anywhere in the kernel
source:
$ find . -type f | xargs grep SIGSTKFLT
./include/asm-i386/signal.h:#define SIGSTKFLT 16
./include/asm-ia64/signal.h:#define SIGSTKFLT 16
./include/asm-m68k/signal.h:#define SIGSTKFLT 16
./include/asm-ppc/signal.h:#define SIGSTKFLT 16
./include/asm-x86_64/signal.h:#define SIGSTKFLT 16
./kernel/signal.c: * | SIGSTKFLT | terminate |
Are you sure you are getting exit status 144 ?
Cheers,
--
In order to understand recursion you must first understand recursion.
> Is there way to know exit status in atexit handler ?
Think about it: exit status is the value that will be passed to the
parent process by the *kernel* when this process exits *or* dies. It
could be the value that the user passed to exit(3), or it could be
the value of fatal signal that kills the process.
How could the process know the future (without a time machine?)
> "shaanxxx" <shaa...@yahoo.com> writes:
>
> > Is there way to know exit status in atexit handler ?
>
> Think about it: exit status is the value that will be passed to the
> parent process by the *kernel* when this process exits *or* dies. It
> could be the value that the user passed to exit(3), or it could be
> the value of fatal signal that kills the process.
>
> How could the process know the future (without a time machine?)
Since atexit handlers aren't run if the process dies due to a signal,
that case is obviously irrelevant. So the only relevant case is when
exit(3) is called, and it certainly would be possible to make the exit
code available to atexit handlers, e.g. put it in a global variable.
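A minimal sketch of that idea, assuming every exit goes through a wrapper. The names exit_with(), pending_exit_status() and report_at_exit() are made up for illustration; they are not part of any library. Note that a plain "return n;" from main() bypasses the wrapper, so the handler would see no recorded status in that case. (glibc also has the non-standard on_exit(), whose handler receives the status as an argument, if portability is not a concern.)

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical globals: the status recorded by exit_with(), if any. */
static int saved_status = 0;
static int have_status = 0;

/* Returns 1 and stores the status in *out if exit_with() recorded one,
   otherwise returns 0 and leaves *out untouched. */
int pending_exit_status(int *out)
{
    if (!have_status) {
        return 0;
    }
    *out = saved_status;
    return 1;
}

/* atexit handler: reports the status that exit_with() was given. */
void report_at_exit(void)
{
    int status;
    if (pending_exit_status(&status)) {
        fprintf(stderr, "exiting with status %d\n", status);
    } else {
        fprintf(stderr, "exit status was not recorded\n");
    }
}

/* Wrapper to call instead of exit(): remember the status, then exit. */
void exit_with(int status)
{
    saved_status = status;
    have_status = 1;
    exit(status);
}
```

Usage would be atexit(report_at_exit) early in main(), and exit_with(n) wherever the program currently calls exit(n).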
--
Barry Margolin, bar...@alum.mit.edu
Arlington, MA
> Since atexit handlers aren't run if the process dies due to a signal,
> that case is obviously irrelevant.
It's not.
It is pretty clear from the original post that the process dies
*while* the atexit handler is executing (or shortly thereafter).
I tried it under gdb; it didn't crash.
gdb sessions:
[Thread debugging using libthread_db enabled]
[New Thread 182903270944 (LWP 2377)]
[New Thread 1084229984 (LWP 2380)]
[New Thread 1094719840 (LWP 2381)]
Program received signal SIG32, Real-time event 32.
[Switching to Thread 1094719840 (LWP 2381)]
0x000000320458f1a5 in __nanosleep_nocancel () from /lib64/tls/libc.so.6
(gdb) c
Continuing.
[Thread 1094719840 (LWP 2381) exited]
Program received signal SIG32, Real-time event 32.
[Switching to Thread 1084229984 (LWP 2380)]
0x000000320500b0cf in __read_nocancel () from /lib64/tls/libpthread.so.0
(gdb) c
Continuing.
[Thread 1084229984 (LWP 2380) exited]
Program exited with code 0220.
> > $ uname -a
> > Linux redhats.xyz.com 2.6.9-34.ELsmp #1 SMP Fri Feb 24 16:56:28 EST 2006 x86_64 x86_64 x86_64 GNU/Linux
>
> I was unable to construct a test case that dies with SIGSTKFLT:
> stack exhaustion seems to cause SIGSEGV on my systems.
>
> In fact, SIGSTKFLT isn't even referenced anywhere in the kernel
> source:
>
> $ find . -type f | xargs grep SIGSTKFLT
> ./include/asm-i386/signal.h:#define SIGSTKFLT 16
> ./include/asm-ia64/signal.h:#define SIGSTKFLT 16
> ./include/asm-m68k/signal.h:#define SIGSTKFLT 16
> ./include/asm-ppc/signal.h:#define SIGSTKFLT 16
> ./include/asm-x86_64/signal.h:#define SIGSTKFLT 16
> ./kernel/signal.c: * | SIGSTKFLT | terminate |
>
> Are you sure you are getting exit status 144 ?
Yes, I am getting 144 as the return status in all my applications.
Because of this, all the tests have started to fail.
I have installed a signal handler for SIGSTKFLT, but the handler is
not getting called. I mean the process is not receiving SIGSTKFLT.
> I tried with dbg , i didnt crash .
The whole crash thing was a "red herring".
Your program actually exit()s with code 144.
When you look from shell, you can't distinguish that from "death
via SIGSTKFLT", but gdb tells you that it exited, not "was killed".
> gdb sessions:
> [Thread debugging using libthread_db enabled]
> [New Thread 182903270944 (LWP 2377)]
...
> Program received signal SIG32, Real-time event 32.
This is rather strange -- a properly working gdb/libc/libthread_db
combination should never report SIG32 events.
Have you perhaps updated your libc without a matching update to
libthread_db? (On RHEL both of these come from the glibc package,
so it's unlikely, unless you built glibc "by hand").
> Program exited with code 0220.
So gdb tells you that the last thread exiting did it with exit
status 0220 == 144.
Next: set breakpoint on _exit (and possibly __GI__exit).
If the breakpoint is hit, find out who called _exit(144), and debug
from there.
> "shaanxxx" <shaa...@yahoo.com> writes:
>
> > I tried with dbg , i didnt crash .
>
> The whole crash thing was a "red herring".
>
> Your program ectually exit()s with code 144.
> When you look from shell, you can't distinguish that from "death
> via SIGSTKFLT", but gdb tells you that it exited, not "was killed".
How would exit(144) get confused with death from SIGSTKFLT? The status
code returned by wait() contains either the exit code or signal number
in the high-order 8 bits, while the low-order 7 bits contain the reason
why wait() returned -- 0 means the process exited normally, 0177 means
it stopped and can be resumed, and anything else means it terminated due
to a signal. You seem to be suggesting some kind of overflow, but I
don't see how that can happen.
> How would exit(144) get confused with death from SIGSTKFLT?
echo "int main() { return 144; }" | gcc -xc -o t1 -
echo "int main() { kill(getpid(), 16); }" | gcc -xc -o t2 -
cat << 'EOT' > t.sh
./t1; echo "t1 exited with $?"
./t2; echo "t2 exited with $?"
EOT
sh t.sh 2>/dev/null
Produces:
t1 exited with 144
t2 exited with 144
I claimed that *just* looking at exit code from *shell* (i.e. looking
at $?) one can't tell the difference.
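For comparison, a parent that calls waitpid() itself can tell the two cases apart. The sketch below assumes a POSIX system; child_wait_status() and shell_view() are hypothetical helper names, and shell_view() mimics the "128 + signal number" convention used by bash and similar shells. SIGTERM (15) is used here instead of signal 16 only because it behaves the same way on every POSIX platform.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that kills itself with `sig` (if sig != 0) or otherwise
   calls _exit(exit_code). Returns the raw wait status, or -1 on error. */
int child_wait_status(int exit_code, int sig)
{
    int status;
    pid_t pid = fork();

    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        if (sig != 0) {
            signal(sig, SIG_DFL);   /* make sure the default action applies */
            raise(sig);
        }
        _exit(exit_code);
    }
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return status;
}

/* What a Bourne-style shell would typically report in $?. */
int shell_view(int status)
{
    if (WIFEXITED(status)) {
        return WEXITSTATUS(status);
    }
    if (WIFSIGNALED(status)) {
        return 128 + WTERMSIG(status);
    }
    return -1;
}
```

With these helpers, child_wait_status(143, 0) and child_wait_status(0, SIGTERM) both map to 143 through shell_view(), yet WIFEXITED() is true only for the first and WIFSIGNALED() only for the second.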
gdb session :
#0 0x000000320458f4b0 in _exit () from /lib64/tls/libc.so.6
#1 0x0000003204530cbb in exit () from /lib64/tls/libc.so.6
#2 0x000000320451c4c2 in __libc_start_main () from /lib64/tls/libc.so.6
#3 0x0000000000400d5a in _start ()
exit is called from __libc_start_main.
I tried strace to see the difference between the two programs and found
the following diff:
< exit_group(0) = ? // this programme returns 0
---
> exit_group(705124752) = ? // this programme returns 0
I am sure that exit(144) is not called from my program, as we saw in
the gdb session. And I think the program is not getting signal 16: I
installed a signal handler for 16 and the handler was not called.
Any comments on this?
> On Jan 31, 7:30 pm, Paul Pluzhnikov <ppluzhnikov-...@charter.net>
> wrote:
>>
>> echo "int main() { return 144; }" | gcc -xc -o t1 -
>> echo "int main() { kill(getpid(), 16); }" | gcc -xc -o t2 -
> gdb session :
What does this have to do with the message you are replying to?
Please "reply" to the message you are actually replying to, not
"any random" message in this thread.
> #0 0x000000320458f4b0 in _exit () from /lib64/tls/libc.so.6
> #1 0x0000003204530cbb in exit () from /lib64/tls/libc.so.6
> #2 0x000000320451c4c2 in __libc_start_main () from /lib64/tls/libc.so.6
> #3 0x0000000000400d5a in _start ()
>
> exit is called from __libc_start_main.
The code in sysdeps/generic/libc-start.c is roughly:
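    if (!setjmp(...)) {
        /* first pass: setjmp() returned 0, run the program */
        result = main (argc, argv, __environ);
    } else {
        /* reached via longjmp when the initial thread exits */
        result = 0;
        if (!last_thread) __exit_thread(0);
    }
    exit (result);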
I can't think of any way for result to be 0x2a075990 (== 705124752),
unless that's what main() returned.
Debugging this will be difficult and will require dexterity with
'gdb' for assembly-level debugging.
If you can produce a complete test case exhibiting the same
behaviour, you'll have a better chance of someone debugging this
for you.
> I tried strace to see the difference of 2 programme and found
> following
> diff
> < exit_group(0) = ? // this programme
> return 0
> ---
>> exit_group(705124752) = ? // // this programme returns 0
Right: the last byte of that exit code is 0x90 == 144.
The value itself looks like a reasonable pointer on x86_64.
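The arithmetic is easy to check. The helper below is only an illustration (visible_exit_status() is a made-up name): the parent sees just the low 8 bits of whatever value reaches exit().

```c
/* Only the low 8 bits of the value passed to exit()/_exit() are
   reported to the parent process as the exit status. */
int visible_exit_status(long value)
{
    return (int)(value & 0xff);
}
```

Here visible_exit_status(705124752L) is 0x2a075990 & 0xff, i.e. 0x90, which is the 144 seen from the shell.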
> I am sure that exit(144) is not called from my programme as we saw in
> gdb session.
Yes, we can safely assume that.
> And I think , programme is not getting signal 16.
You *know* it doesn't -- gdb would have told you if it did.
> I have
> installed signal handler for 16 and the handler was not called from my
> programme.
That actually proves nothing.
Cheers
> I have installed the signal handler for SIGSTKFLT, But signal handler
> is not getting called. I meant process is not getting SIGSTKFLT.
You really think the system can call a signal handler after a stack
fault? Where would it put the parameters to the signal handler
function?
DS
> You really think the system can call a signal handler after a stack
> fault?
Sure, especially if sigaltstack was called before the fault.
> Where would it put the parameters to the signal handler
> function?
On x86_64 under Linux, the first 4 call parameters do not go onto
the stack; they are passed in registers.
You are correct in that OP's installing SIGSTKFLT handler is futile
and doesn't prove anything, but your reasoning is all wrong.
There is absolutely nothing to suggest the OP did that.
> > Where would it put the parameters to the signal handler
> > function?
>
> On x86_64 under Linux, the first 4 call parameters do not go onto
> the stack; they are passed in registers.
Then where would it put the return address?
> You are correct in that OP's installing SIGSTKFLT handler is futile
> and doesn't prove anything, but your reasoning is all wrong.
What is the correct reasoning then?
DS
>> Sure, especially if sigaltstack was called before the fault.
>
> There is absolutely nothing to suggest the OP did that.
That's correct.
I was merely objecting to your statement:
JK> You really think the system can call a signal handler after
JK> a stack fault?
Yes, the system *can* call signal handler after a stack fault,
*provided* sigaltstack() was done before.
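A rough sketch of what that setup looks like, assuming a POSIX system. The function and variable names are invented for this example, and the 64 KiB stack size is an arbitrary choice. The handler records whether one of its locals lives inside the alternate stack, so the effect can be observed with an ordinary raise() rather than a real stack overflow.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ALT_STACK_SIZE (64 * 1024)          /* arbitrary size for this sketch */

static char alt_stack[ALT_STACK_SIZE];
static volatile sig_atomic_t handler_ran = 0;
static volatile sig_atomic_t ran_on_alt_stack = 0;

/* Handler: note that it ran, and whether its frame is on alt_stack. */
static void on_signal(int signo)
{
    char probe;
    uintptr_t p = (uintptr_t)&probe;
    uintptr_t lo = (uintptr_t)alt_stack;

    (void)signo;
    handler_ran = 1;
    ran_on_alt_stack = (p >= lo && p < lo + ALT_STACK_SIZE);
}

/* Register alt_stack with the kernel and install on_signal for `signo`
   with SA_ONSTACK so it is delivered on that stack.
   Returns 0 on success, -1 on failure. */
int install_on_alt_stack(int signo)
{
    stack_t ss;
    struct sigaction sa;

    ss.ss_sp = alt_stack;
    ss.ss_size = sizeof alt_stack;
    ss.ss_flags = 0;
    if (sigaltstack(&ss, NULL) != 0) {
        return -1;
    }

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_signal;
    sa.sa_flags = SA_ONSTACK;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}
```

Without the sigaltstack() call and SA_ONSTACK, the handler frame would have to be built on the same stack that just overflowed.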
>> > Where would it put the parameters to the signal handler
>> > function?
>>
>> On x86_64 under Linux, the first 4 call parameters do not go onto
>> the stack; they are passed in registers.
>
> Then where would it put the return address?
Return address is pushed on the stack, but you didn't say "return
address", you said "parameters".
> What is the correct reasoning then?
It has already appeared in this thread:
1. Gdb says "exited with 0220", no "killed with ..." -- this is a sure
indication that SIGSTKFLT (or any other signal) did *not* happen.
2. The same is also evident from OP's strace:
exit_group(705124752) = ?
3. PP> In fact, SIGSTKFLT isn't even referenced anywhere in the kernel
PP> source
Each one of the above is sufficient to state that "installing
SIGSTKFLT is futile for this test case -- it will never fire".
> That's correct.
> I was merely objecting to your statement:
With an objection that's not relevant to the context in which I made
it?!
> JK> You really think the system can call a signal handler after
> JK> a stack fault?
> Yes, the system *can* call signal handler after a stack fault,
> *provided* sigaltstack() was done before.
Right, but that's not this case.
> >> > Where would it put the parameters to the signal handler
> >> > function?
> >> On x86_64 under Linux, the first 4 call parameters do not go onto
> >> the stack; they are passed in registers.
> > Then where would it put the return address?
> Return address is pushed on the stack, but you didn't say "return
> address", you said "parameters".
The return address is a parameter, a hidden parameter, but a parameter
nonetheless.
> > What is the correct reasoning then?
>
> It has already appeared in this thread:
>
> 1. Gdb says "exited with 0220", no "killed with ..." -- this is a sure
> indication that SIGSTKFLT (or any other signal) did *not* happen.
>
> 2. The same is also evident from OP's strace:
>
> exit_group(705124752) = ?
>
> 3. PP> In fact, SIGSTKFLT isn't even referenced anywhere in the kernel
> PP> source
>
> Each one of the above is sufficient to state that "installing
> SIGSTKFLT is futile for this test case -- it will never fire".
I was saying that his installing the signal handler didn't prove
anything. You said I was correct, but that my reasoning was wrong.
Then you list a bunch of explanations that have nothing to do with his
signal handler.
I'm baffled, but whatever.
DS
$ ./asm;echo $? #without printf statement
132
$ ./asm;echo $? #with printf statement.
got mutex :)
20
#include<stdio.h>
#include <stdlib.h>
int mutex(const int process,
int wants_,
int* slot,
int* mut)
{
// The fourth argument is available in the %rcx register. Check
// http://www.x86-64.org. store a copy in a temporary
__asm("movq %rcx, %r12");
if (*mut == 0 ) {
// Store 1 in the temporary register %r11
__asm("movq $1, %r13");
// Do the exchange b/w %r11 and fourth argument (access memory location
// using the address stored in register.)
__asm("xchg %r13, (%r12)");
// AND
__asm("test %r13, %r13");
// If equal to zero (means %r13 is zero), we got mutex, Otherwise jump.
__asm("jnz .L1000");
printf("got mutex :)\n");
return 0; // True
}
__asm(".L1000:");
printf("Busy :(\n");
wants_ = 0;
return -1;
}
int i=0;
void at (void)
{
mutex(1,2,0,&i);
}
int main()
{
atexit(at);
return 0;
}
Thanks
>> In order to understand recursion you must first understand recursion.
Please trim your replies, and read how to properly post on Usenet
e.g. here: http://www.xs4all.nl/%7ewijnands/nnq/nquote.html
> i have located the code which was causing the problem.
Ah! I should have asked whether you have any inline assembly.
A little knowledge is a dangerous thing.
> It has some problem
Indeed. The problem is that this code is bogus.
> int mutex(const int process,
> int wants_,
> int* slot,
> int* mut)
> {
> // The fourth argument is available in the %rcx register. Check
> // http://www.x86-64.org. store a copy in a temporary
> __asm("movq %rcx, %r12");
Who told you that %r12 is "free" (that the compiler did not store
anything important in %r12)? What makes you think you can assign
values to "random" registers willy-nilly, and still have a correct
program?
> // If equal to zero (means %r13 is zero), we got mutex, Otherwise jump.
> __asm("jnz .L1000");
This amounts to complete compiler sabotage -- it's as if you wrote
some C code, and I later came in and inserted goto's in a random
fashion, then complained that your program doesn't work.
> printf("got mutex :)\n");
> return 0; // True
Didn't they teach you that 0 is FALSE in C?
Call it "success", but please don't call it "True" (even in comments).
> wants_ = 0;
> return -1;
The assignment above serves no purpose whatsoever.
Now, what we know so far is that mutex() may return garbage (due
to incorrect programming), but how does this translate into exit(20),
or on my system into exit(24) ? [The exit code also changes with
different optimization levels.]
Well, it's because according to
http://refspecs.freestandards.org/elf/x86_64-SysV-psABI.pdf
%r12-r15 are callee-saved registers, preserved across function calls.
Therefore, by assigning anything to %r12 (and %r13) and not restoring
previous values, you can cause damage to the function that called
you (in this case you damage exit()) and expected to find %r12
undisturbed.
Bottom line: your "int mutex()" is buggy and must be rewritten.
Here is my attempt at it (with a little help from QT):
#include<stdio.h>
#include <stdlib.h>
// courtesy QT
int q_atomic_test_and_set_int(volatile int *ptr, int expected, int newval)
{
unsigned char ret;
__asm__ volatile("lock cmpxchgl %2,%3\nsete %1\n" :
"=a" (newval), "=qm" (ret) :
"r" (newval), "m" (*ptr), "0" (expected) :
"memory");
return (int)ret;
}
int mutex(int *mut)
{
if (*mut == 0 && q_atomic_test_and_set_int(mut, 0, 1)) {
printf("got mutex :)\n");
return 0; // success
}
printf("Busy :(\n");
return -1; // failure
}
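If a C11 compiler is available (much newer than the gcc 3.4.5 in this thread), the same test-and-set can be sketched without any inline assembly by using <stdatomic.h>. This is an alternative technique, not the Qt code above; mutex_try_acquire() and mutex_release() are names made up for the example.

```c
#include <stdatomic.h>

/* Try to take the mutex: atomically change *mut from 0 to 1.
   Returns 0 on success, -1 if the mutex was already held. */
int mutex_try_acquire(atomic_int *mut)
{
    int expected = 0;
    if (atomic_compare_exchange_strong(mut, &expected, 1)) {
        return 0;   /* success */
    }
    return -1;      /* failure: somebody else holds it */
}

/* Release the mutex by storing 0 back. */
void mutex_release(atomic_int *mut)
{
    atomic_store(mut, 0);
}
```

The compiler picks the registers and emits the locked instruction itself, so there is nothing to clobber by accident.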
Cheers,
> Can i use r8 , r9 and r11 register to store something.
Sure, but you have to tell the compiler that you are clobbering that
register, so it will either save/reload it for its own use, or will
avoid using it altogether.
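For instance, with GCC extended asm the clobbered register goes in the third colon-separated list. The following is only a sketch, assuming GCC or a compatible compiler on x86_64; try_lock_r11() is a made-up name, and the #else branch (a GCC builtin) is there just so the example still builds on other architectures.

```c
/* Atomically swap 1 into *mut, using %r11 as a scratch register.
   Returns 1 if the previous value was 0 (lock taken), 0 otherwise. */
int try_lock_r11(int *mut)
{
    int old;
#if defined(__x86_64__) && defined(__GNUC__)
    __asm__ volatile(
        "movl $1, %%r11d\n\t"    /* scratch value lives in r11 */
        "xchgl %%r11d, %1\n\t"   /* xchg with memory is implicitly locked */
        "movl %%r11d, %0"        /* previous *mut ends up in `old` */
        : "=r"(old), "+m"(*mut)  /* outputs: compiler chooses the register */
        :                        /* no pure inputs */
        : "r11", "memory");      /* clobbers: tell the compiler about r11 */
#else
    old = __atomic_exchange_n(mut, 1, __ATOMIC_SEQ_CST);
#endif
    return old == 0;
}
```

Because "r11" is listed as clobbered, the compiler will not keep anything of its own in that register across the asm statement. Letting the compiler choose every register through operand constraints, as the Qt code does, is usually simpler still.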