then stepi to walk the assembly. When it is a strange problem like
this the error can frequently be found in the very early code.
ron
-Skip
On Sat, Dec 18, 2010 at 8:51 PM, ron minnich <rmin...@gmail.com> wrote:
> FWIW when I have this kind of weird problem I get into gdb with the
> binary and use
> display/i $pc
>
>
> then stepi to walk the assembly. When it is a strange problem like
To fix something like this you need to tell what the illegal
instruction was. Especially if it is a regression like Skip suspects.
Kai
--
Kai Backman, programmer
http://tinkercad.com - The unprofessional solid CAD
Can you dump the value of lr at the crash site? My guess is that it's
0 which would explain the crash.
The next step would be to test with older versions of go and try to
pinpoint which CL caused the regression. Skip says it is a problem
with the newest release so you could start by binary searching that
interval.
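
The binary search over the interval could be sketched like this. This is only an illustration: the revision numbers and the `is_bad` check are hypothetical stand-ins for "build that revision, run the test program on the ARM board, see whether it crashes", and the sketch assumes the first revision is known good, the last is known bad, and the bug never disappears once introduced.

```python
def first_bad(revisions, is_bad):
    """Return the earliest revision for which is_bad(revision) is true.

    Assumptions: revisions is in chronological order, revisions[0] is
    known good, revisions[-1] is known bad, and every revision after
    the first bad one is also bad.
    """
    lo, hi = 0, len(revisions) - 1  # invariant: lo is good, hi is bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(revisions[mid]):
            hi = mid  # regression is at mid or earlier
        else:
            lo = mid  # regression is after mid
    return revisions[hi]


# Hypothetical interval of 101 revisions with the regression at 7042.
revs = list(range(7000, 7101))
culprit = first_bad(revs, lambda rev: rev >= 7042)
print(culprit)
```

With about a hundred revisions in the interval this needs roughly seven build-and-test rounds instead of a hundred.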
Kai
--
You're close but you're focusing on the wrong thing. The add pc, lr,
#0 is a pretty simple register-to-register move that's not going to
cause a fault. Remember, when you see that PC is x when you stop, the
faulting instruction was at x-4. The svc 0x00000000 may be a bad thing.
It's a supervisor call. So at the start I'm not real clear on two
things:
1) why the go linker would emit that instruction
2) how the kernel on different ARMs manages that instruction
So it's a 2-deep puzzle at the moment.
It's interesting to consider the following little sequence however:
> 0x210b0: mov r7, #173 ; 0xad
> 0x210b4: svc 0x00000000
> 0x210b8: add pc, lr, #0 ; 0x
The first instruction puts 173 in R7; the second does a software
interrupt; what's that third thing?
It's moving the lr (link register) to pc, and on arm, you get an
immediate 16-bit thing you can add for free; so what this looks like
to me is moving the lr to pc. What's the lr?
See any function call:
bl xyz
branch and link: branches to xyz, puts a return pc in the lr.
SO, add pc, lr, #0 is nothing more I think than a
return
See http://infocenter.arm.com/help/topic/com.arm.doc.qrc0001m/QRC0001_UAL.pdf
for more info.
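
To check the "it's just a return" reading, here is a small sketch that pulls the fields out of the instruction word. The word 0xE28EF000 is not from this thread; it is the encoding I would expect for add pc, lr, #0 under the standard 32-bit ARM data-processing layout, so treat it as an assumption and compare against the bytes gdb shows you (x/1xw 0x210b8).

```python
ADD_OPCODE = 0b0100  # data-processing opcode for ADD
LR, PC = 14, 15      # register numbers of the link register and pc


def decode_dp_immediate(word):
    """Split a 32-bit ARM data-processing (immediate) word into fields."""
    return {
        "cond": (word >> 28) & 0xF,               # 0xE means "always"
        "is_immediate": ((word >> 25) & 0x7) == 0b001,
        "opcode": (word >> 21) & 0xF,
        "rn": (word >> 16) & 0xF,                 # first operand register
        "rd": (word >> 12) & 0xF,                 # destination register
        "imm12": word & 0xFFF,                    # encoded immediate
    }


def is_return_via_lr(word):
    """True if the word is 'add pc, lr, #0', i.e. pc = lr + 0."""
    f = decode_dp_immediate(word)
    return (f["is_immediate"] and f["opcode"] == ADD_OPCODE
            and f["rn"] == LR and f["rd"] == PC and f["imm12"] == 0)


# Assumed encoding of "add pc, lr, #0" (not taken from the thread).
print(is_return_via_lr(0xE28EF000))
```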
173 is rt_sigreturn.
So my guess here is you've got a syscall which the kernel is upset
about and the kernel lets you know by sending you a signal. This is a
possible outcome when the kernel is very upset with something you're
trying to do :-)
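
When reading a mov r7, #N / svc 0 pair it helps to have the numbers at hand. The little table below is a sketch written from memory of the ARM Linux syscall numbering (173 being the rt_ variant of sigreturn); only those few entries are listed and they should be checked against asm/unistd.h on the target before trusting them.

```python
# A few ARM Linux syscall numbers, from memory; verify against
# asm/unistd.h on the target system.
ARM_SYSCALLS = {
    119: "sigreturn",
    173: "rt_sigreturn",
    174: "rt_sigaction",
}


def describe_syscall(r7):
    """Name the syscall selected by the value loaded into r7."""
    return ARM_SYSCALLS.get(r7, "unknown (%d)" % r7)


# The value from the disassembly: mov r7, #173 ; 0xad
print(describe_syscall(0xAD))
```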
OK, let's look at libc.
gdb /usr/lib/libc.so.6, and
disass syscall
and you'll see that the standard system call sequence on arm is basically
set up args
system call number in r7
svc 0
So I guessed right. It's very useful sometimes to take a known-good
thing like a shared library and use gdb to show you sequences of code
for comparison.
Note that gcc is callee-save and 5g looks like caller-save (as in Plan
9 compilers) so you can see interesting differences in the code.
All guesses based on possibly too little coffee and a long night on
planes, but I wanted to say something since I've been on travel and
not helpful since my last comment in this thread :-) Also, there's a
bit of "teach a man to fish" in this note as you can see :-)
Which leads to my next question: can you try running this program with
strace -f and see what you see?
ron
> It's moving the lr (link register) to pc, and on arm, you get an
> immediate 16-bit thing you can add for free
(fx:nitpick)
12 bits, which are an 8-bit value and a 4-bit rotate-twice-this-many-bits count.
IIRC.
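
A minimal sketch of that decoding, assuming the usual layout of the 12-bit field (rotate count in the top 4 bits, value in the low 8 bits, and the value rotated right by twice the count within 32 bits). The function name is made up for illustration.

```python
def decode_arm_immediate(imm12):
    """Expand a 12-bit ARM immediate field into its 32-bit value."""
    rotate = (imm12 >> 8) & 0xF  # 4-bit rotate count
    imm8 = imm12 & 0xFF          # 8-bit value
    shift = 2 * rotate           # rotate right by twice the count
    # 32-bit rotate right; the final mask also covers the shift == 0 case.
    return ((imm8 >> shift) | (imm8 << (32 - shift))) & 0xFFFFFFFF


print(hex(decode_arm_immediate(0x000)))  # the #0 in add pc, lr, #0
print(hex(decode_arm_immediate(0x0AD)))  # the #173 in mov r7, #173
print(hex(decode_arm_immediate(0x4FF)))  # 0xff rotated right by 8 bits
```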
Chris
--
Chris "left or your other left?" Dollin
I did not look closely but that makes a lot of sense, since the return
address has to be 32-bit aligned anyway.
The ARM is an interesting architecture, because things seem to have
had some real thought applied; a refreshing change from things you
find in the x86 at times :-)
thanks
ron
well, that's interesting.
to recap, sorry, my memory is not so good at this point, when you
don't use strace you just get an illegal instruction trap?
ron
Here's what my arm binaries do when they start.
rt_sigaction(SIGQUIT, {0x3fb10, ~[], SA_STACK|SA_SIGINFO|0x4000000}, NULL, 8) = 0
rt_sigaction(SIGILL, {0x3fb10, ~[], SA_STACK|SA_SIGINFO|0x4000000}, NULL, 8) = 0
rt_sigaction(SIGTRAP, {0x3fb10, ~[], SA_STACK|SA_SIGINFO|0x4000000}, NULL, 8) = 0
First thing after the execve.
http://linux.die.net/man/2/rt_sigaction
Now, as weird as this sounds, it would be nice to verify that whatever
kernel you're running agrees with your binary about the support for
that system call at that system call number. It's a little hard for me
to believe it does not, but .... something really weird going on here.
Here's a simple test I did to make sure *something* uses that call on
my arm system, and that strace can see and report about it:
strace -e trace=rt_sigaction /bin/ls
rt_sigaction(SIGRTMIN, {0x401ad2d8, [], SA_SIGINFO|0x4000000}, NULL, 8) = 0
rt_sigaction(SIGRT_1, {0x401ad170, [], SA_RESTART|SA_SIGINFO|0x4000000}, NULL, 8) = 0
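
If the strace log gets long, a few lines of script can pull out which signals got a handler and at what address. This is a rough sketch that only understands rt_sigaction lines shaped like the ones in this thread (a hex handler address, one call per line); the sample text is the Go binary's startup output quoted above, and the function name is made up.

```python
import re

# Matches lines such as:
#   rt_sigaction(SIGILL, {0x3fb10, ~[], SA_STACK|...}, NULL, 8) = 0
SIGACTION_RE = re.compile(
    r"^rt_sigaction\((\w+), \{(0x[0-9a-fA-F]+), .*\) = (-?\d+)"
)


def installed_handlers(strace_text):
    """Map signal name -> handler address for successful rt_sigaction calls."""
    handlers = {}
    for line in strace_text.splitlines():
        match = SIGACTION_RE.match(line.strip())
        if match and int(match.group(3)) == 0:
            handlers[match.group(1)] = int(match.group(2), 16)
    return handlers


SAMPLE = """\
rt_sigaction(SIGQUIT, {0x3fb10, ~[], SA_STACK|SA_SIGINFO|0x4000000}, NULL, 8) = 0
rt_sigaction(SIGILL, {0x3fb10, ~[], SA_STACK|SA_SIGINFO|0x4000000}, NULL, 8) = 0
rt_sigaction(SIGTRAP, {0x3fb10, ~[], SA_STACK|SA_SIGINFO|0x4000000}, NULL, 8) = 0
"""

handlers = installed_handlers(SAMPLE)
for name, addr in handlers.items():
    print("%-8s handler at %#x" % (name, addr))
```

Lines that use SIG_DFL or SIG_IGN instead of an address are skipped by this pattern, which is fine for spotting whether SIGILL has a handler installed.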
ron