Guidance on getting proper backtraces in tombstone


jim.b...@couchbase.com

Apr 17, 2024, 6:13:28 PM
to android-ndk
Hello again, and apologies for a question that probably has more to do with C and C++ than with Android specifically, but this problem has been plaguing me for quite some time.  Whenever we get a report from the field that includes a tombstone, the stack traces are underwhelming.  It seems there is not enough information present in the production native libraries to generate a stack trace.  I guess C++ just wants to make this difficult or something, but I notice that system libraries appear to get it right.  Here is a really bad example from a recent bug report.  This is via .NET, for informational purposes.

backtrace:
      #00 pc 00000000001ae3a4  /data/app/~~Y9JE4-VEbGMpV5xJIXd8Ag==/embc.zack-JXDgGUeOK9XH-CUJJYT4ow==/lib/arm64/libLiteCore.so (BuildId: 5a2f59a0c2d7ac3056f8b8f783b3ace25b38b0e8)
      #01 pc 000000000004fbd0  <anonymous:6db0603000>

Not only are function names not present in the final frame, but there are only two frames total and one of them is "anonymous".  Sometimes I get an actual trace, without function information for C++, which is a little better since I can see C methods properly and can sort of guess what is going on.  What does this "anonymous" mean and what can I do to prevent this?  I doubt the stack trace is ACTUALLY 2 frames.

For comparison, here are some other things I have found:

      #00 pc 00000000000a3064  /apex/com.android.runtime/lib64/bionic/libc.so (__rt_sigsuspend+4) (BuildId: 3e3c5ec517682e9d3afdceafc14d447f)
      #01 pc 00000000000608f0  /apex/com.android.runtime/lib64/bionic/libc.so (sigsuspend+52) (BuildId: 3e3c5ec517682e9d3afdceafc14d447f)
      #02 pc 0000000000287b40  /data/app/~~Y9JE4-VEbGMpV5xJIXd8Ag==/embc.zack-JXDgGUeOK9XH-CUJJYT4ow==/lib/arm64/libmonosgen-2.0.so (suspend_signal_handler+188)

Notice the proper function names present.  Are these simply shipping with debug information present?  Or is there some agreed-upon method to get JUST the information needed for stack traces to print properly?  We tried shipping with debug information present once and immediately got complaints about the library bloating from less than 10 MiB to over 40 MiB.

enh

Apr 17, 2024, 6:35:45 PM
to andro...@googlegroups.com
On Wed, Apr 17, 2024 at 3:13 PM 'jim.b...@couchbase.com' via
android-ndk <andro...@googlegroups.com> wrote:
>
> Hello again, and apologies for a question that probably has more to do with C and C++ than with Android specifically, but this problem has been plaguing me for quite some time. Whenever we get a report from the field that includes a tombstone, the stack traces are underwhelming. It seems there is not enough information present in the production native libraries to generate a stack trace. I guess C++ just wants to make this difficult or something, but I notice that system libraries appear to get it right. Here is a really bad example from a recent bug report. This is via .NET, for informational purposes.

thanks for mentioning that, because i suspect that's the active ingredient here!

> backtrace:
> #00 pc 00000000001ae3a4 /data/app/~~Y9JE4-VEbGMpV5xJIXd8Ag==/embc.zack-JXDgGUeOK9XH-CUJJYT4ow==/lib/arm64/libLiteCore.so (BuildId: 5a2f59a0c2d7ac3056f8b8f783b3ace25b38b0e8)
> #01 pc 000000000004fbd0 <anonymous:6db0603000>
>
> Not only are function names not present in the final frame, but there are only two frames total and one of them is "anonymous". Sometimes I get an actual trace, without function information for C++, which is a little better since I can see C methods properly and can sort of guess what is going on. What does this "anonymous" mean and what can I do to prevent this? I doubt the stack trace is ACTUALLY 2 frames.

my guess is that "that's the anonymous memory-mapped region that .NET
has JITed your code into".
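
To make the distinction concrete, here is a minimal Python sketch that parses tombstone backtrace lines like the ones quoted in this thread and classifies each frame. The regular expression, the `Frame` class, and the helper names are all made up for illustration and only cover the line shapes seen here (plus the `(offset 0x...)` form used for libraries loaded from inside an APK); it is not an official parser.

```python
import re
from dataclasses import dataclass
from typing import Optional

# One backtrace frame: "#NN pc <hex> <mapping> [(offset 0x..)] [(symbol+N)] [(BuildId: ..)]"
# This pattern is an assumption based on the frames quoted in this thread.
FRAME_RE = re.compile(
    r"#(?P<index>\d+)\s+pc\s+(?P<pc>[0-9a-fA-F]+)\s+(?P<mapping>\S+)"
    r"(?:\s+\(offset\s+0x[0-9a-fA-F]+\))?"
    r"(?:\s+\((?P<symbol>.+?)\+(?P<offset>\d+)\))?"
    r"(?:\s+\(BuildId:\s*(?P<build_id>[0-9a-fA-F]+)\))?"
)


@dataclass
class Frame:
    index: int
    pc: int                  # pc relative to the mapping, as printed
    mapping: str             # library path or "<anonymous:...>"
    symbol: Optional[str]    # None when no function name was printed
    build_id: Optional[str]

    @property
    def is_anonymous(self) -> bool:
        # Anonymous mappings have no backing file, so no ELF symbols exist.
        return self.mapping.startswith("<anonymous:")


def parse_frame(line: str) -> Optional[Frame]:
    """Return a Frame for a backtrace line, or None if the line is not a frame."""
    m = FRAME_RE.search(line)
    if m is None:
        return None
    return Frame(
        index=int(m.group("index")),
        pc=int(m.group("pc"), 16),
        mapping=m.group("mapping"),
        symbol=m.group("symbol"),
        build_id=m.group("build_id"),
    )


def describe(frame: Frame) -> str:
    """Summarize what kind of frame this is."""
    if frame.is_anonymous:
        return "anonymous mapping (no backing file, e.g. JIT-generated code)"
    if frame.symbol is None:
        return "library frame without a symbol name"
    return f"library frame in {frame.symbol}"
```

Feeding it the two frames from the report gives one "library frame without a symbol name" (libLiteCore.so) and one "anonymous mapping"; the libc frames from the comparison trace come back with their function names.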

> For comparison, here are some other things I have found:
>
> #00 pc 00000000000a3064 /apex/com.android.runtime/lib64/bionic/libc.so (__rt_sigsuspend+4) (BuildId: 3e3c5ec517682e9d3afdceafc14d447f)
> #01 pc 00000000000608f0 /apex/com.android.runtime/lib64/bionic/libc.so (sigsuspend+52) (BuildId: 3e3c5ec517682e9d3afdceafc14d447f)
> #02 pc 0000000000287b40 /data/app/~~Y9JE4-VEbGMpV5xJIXd8Ag==/embc.zack-JXDgGUeOK9XH-CUJJYT4ow==/lib/arm64/libmonosgen-2.0.so (suspend_signal_handler+188)
>
> Notice the proper function names present. Are these simply shipping with debug information present?

not exactly. we strip, but we keep symbol names. (for exactly this use
case. it's a relatively small fraction of the total debugging
information, and has the most direct practical use for the most
people.)
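
One way to get "stripped, but with symbol names" is to remove only the debug sections and leave the symbol table in place. The Python sketch below just builds such a command line. It assumes `llvm-strip` (which ships in the NDK toolchain) is on `PATH` and uses its `--strip-debug` option; the function names and file names are hypothetical, and this is an illustration rather than a description of what the platform build actually runs.

```python
import shutil
import subprocess
from typing import List


def strip_debug_keep_symbols_cmd(src: str, dst: str, tool: str = "llvm-strip") -> List[str]:
    """Build a command that drops debug sections but keeps the symbol table."""
    # --strip-debug removes the .debug_* sections; function names in the
    # symbol table are left in the output file.
    return [tool, "--strip-debug", "-o", dst, src]


def run_if_tool_present(cmd: List[str]) -> bool:
    """Run the command only if the tool can be found; return whether it ran."""
    if shutil.which(cmd[0]) is None:
        return False
    subprocess.run(cmd, check=True)
    return True


# Hypothetical file names, for illustration only.
cmd = strip_debug_keep_symbols_cmd("libmylib.unstripped.so", "libmylib.so")
print(" ".join(cmd))
```

Comparing the size of the result against a fully stripped copy of the same library shows how much the retained symbol names actually cost.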

> Or is there some agreed-upon method to get JUST the information needed for stack traces to print properly? We tried shipping with debug information present once and immediately got complaints about the library bloating from less than 10 MiB to over 40 MiB.

to be able to unwind through a .NET JIT frame, you'll need the .NET
implementation to do some work. (but iirc ART does just use the usual
gdb JIT interface, so if they fix/add the necessary, it'll work better
for everyone else too. it's possible it's already there, and you just
need to enable it, but that's a question for people who know something
about your .NET implementation :-) )

> --
> You received this message because you are subscribed to the Google Groups "android-ndk" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to android-ndk...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/android-ndk/fa653c95-8e85-43e6-8232-6da5a1ef0c06n%40googlegroups.com.

jim.b...@couchbase.com

Apr 17, 2024, 9:14:48 PM
to android-ndk
Thanks for the info.  Is there a resource where I can learn how to strip but keep the symbol names so that stack traces come out more sanely?  I'm actually the author of the .NET implementation in question (in a rare instance of having written about 90% of the lines of code personally), so I know almost everything about it.  (Needless background information: I started working on this project as pure .NET in 2015, then in 2018 we made a C++ core that it binds to via a C interface.  I became the lead of the C++ core for a while before being shifted back to lead of the .NET implementation, now with extensive knowledge of how the C++ core works.  I know this product inside and out from top to bottom, so feel free to throw anything you think might be useful my way, regardless of .NET or C++.)  Extraneous information out of the way, I've never heard of this work that the .NET implementation needs to do, so I'm very curious about it as well.  Could you elaborate?

I did find a new technique for another native library I am also working on.  I summarized what I did in this gist (library renamed to "mylib.so") but this still doesn't seem to be enough (it embeds "minisym" into the actual lib, and pulls the rest into an external file).  This one was actually from Java on Linux (no .NET) but I assume the techniques are pretty similar for the native libraries (of which .NET and Java share the same exact one).  The stack trace obviously didn't have any anonymous frames in it but as the last two frames it contained:

C  [mylib.so+0x16448b]
C  [mylib.so+0x581f9]

Which is symptomatic of not having any information to go on.  I'm curious what sort of stripping Android native libs do.
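
If the full debug information lives in an external file, raw `library+offset` frames like these can usually still be resolved after the fact. Below is a minimal Python sketch that collects the offsets per library and builds an `llvm-symbolizer` command line for them. It assumes `llvm-symbolizer` is available, that the external debug file is named `mylib.so.debug` (a made-up name), and that the printed offsets can be used directly as addresses within the library, which is typically the case for a shared object whose first load segment starts at address 0.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List

# Matches frames of the form "[<library>+0x<offset>]" as shown above.
LIB_OFFSET_RE = re.compile(r"\[(?P<lib>[^\[\]]+)\+0x(?P<offset>[0-9a-fA-F]+)\]")


def collect_offsets(lines: Iterable[str]) -> Dict[str, List[int]]:
    """Group the offsets found in crash-log frames by library name."""
    offsets: Dict[str, List[int]] = defaultdict(list)
    for line in lines:
        m = LIB_OFFSET_RE.search(line)
        if m:
            offsets[m.group("lib")].append(int(m.group("offset"), 16))
    return dict(offsets)


def symbolizer_cmd(debug_file: str, offsets: List[int],
                   tool: str = "llvm-symbolizer") -> List[str]:
    """Build a command that asks the symbolizer to resolve each offset."""
    return [tool, f"--obj={debug_file}", *[hex(o) for o in offsets]]


frames = [
    "C  [mylib.so+0x16448b]",
    "C  [mylib.so+0x581f9]",
]
by_lib = collect_offsets(frames)
# "mylib.so.debug" is a hypothetical name for the external debug file.
print(" ".join(symbolizer_cmd("mylib.so.debug", by_lib["mylib.so"])))
```

The debug file has to come from exactly the same build as the shipped library for the addresses to line up.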

enh

Apr 18, 2024, 6:22:52 PM
to andro...@googlegroups.com
(try again. googlegroups rejected this.)

On Thu, Apr 18, 2024 at 3:18 PM enh <e...@google.com> wrote:
>
> On Wed, Apr 17, 2024 at 6:15 PM 'jim.b...@couchbase.com' via
> android-ndk <andro...@googlegroups.com> wrote:
> >
> > Thanks for the info. Is there a resource where I can learn how to strip but keep the symbol names so that stack traces come out more sanely?
>
> https://cs.android.com/android/platform/superproject/main/+/main:build/soong/scripts/strip.sh
> is the script our OS build uses, implementing all the variants.
> (annoyingly, there's not just a --do-the-right-thing for this :-( )
>
> > I'm actually the author of the .NET implementation in question (in a rare instance of having written about 90% of the lines of code personally), so I know almost everything about it. (Needless background information: I started working on this project as pure .NET in 2015, then in 2018 we made a C++ core that it binds to via a C interface. I became the lead of the C++ core for a while before being shifted back to lead of the .NET implementation, now with extensive knowledge of how the C++ core works. I know this product inside and out from top to bottom, so feel free to throw anything you think might be useful my way, regardless of .NET or C++.) Extraneous information out of the way, I've never heard of this work that the .NET implementation needs to do, so I'm very curious about it as well. Could you elaborate?
>
> https://www.google.com/search?q=gdb+jit+interface shows you both the
> docs themselves (for what they're worth) but also commentary from
> others who've done this (for v8, say). there's also the code in ART
> :-)

jim.b...@couchbase.com

Apr 18, 2024, 9:25:30 PM
to android-ndk
Ah ok, I think I misunderstood ".NET implementation" to mean "the .NET product I am working on" rather than "the .NET runtime provided by Microsoft".  There seem to have been various discussions about doing this on the Microsoft side dating back to 2016 or so, but I'm unsure of the current state.  Oh well, I can live without that, I think, but thanks for the strip script.  I will digest it and see if I can get the native library to at least output symbol names.

enh

Apr 19, 2024, 12:11:51 PM
to andro...@googlegroups.com
heh, yeah, i should have guessed we were talking at cross purposes
from the fact that you weren't sure what the anonymous region in the
backtrace was --- but since basically everyone on our team works on
one language compiler/runtime or another... :-) (plus i'm pretty
ignorant where .NET is concerned, and although i have some vague idea
that there are multiple implementations, i've no idea how many, nor do
i really know anything about them!)


jim.b...@couchbase.com

Apr 19, 2024, 6:01:26 PM
to android-ndk
Not to clutter the mailing list with possibly superfluous information, but the reason I got mixed up was the wording of "implementation" vs "runtime".  There probably isn't a standard for this wording, but I consider the SDK I work on in C# to be an implementation of our product in C# (.NET), while the thing in charge of the JIT interface would be the runtime.  You are correct that more than one exists (probably two, but maybe more that I have not heard of), but at this point they have all been consolidated under Microsoft, basically, and "de-duplicated" target-wise, so to speak, so that each one is meant for a unique set of deployments.