Potential bug in NDK r16?


Andreas Grässer

Nov 23, 2017, 9:02:58 AM
to android-ndk
Hi,
We integrated Ableton Link into all our apps a while ago.
Ableton Link is a technology that synchronizes beat, phase and tempo of Ableton Live and Link-enabled apps over a wireless network.
The initial integration was done in April this year with NDK r14. The current live version is compiled with NDK r15c.

The integration is based on the Ableton Link cross-platform library (https://github.com/Ableton/link) and partially on Peter Brinkmann’s Ableton Link for PD Android example (https://github.com/libpd/abl_link).
Ableton Link uses ifaddrs for the communication over WLAN: https://github.com/libpd/abl_link/tree/master/external/android-ifaddrs

Last week we ran some first tests with NDK r16.
Compiling with NDK r16 worked without any problems.

But we ran into crashes on ARMv7 devices as soon as Ableton Link starts to communicate over the wireless network.
All devices with ARMv7 CPU architecture ran into these crashes, regardless of the installed Android version. I’ve tested it on Android 5.x, 6.x and 7.x. I don’t know how it behaves on Android 8.x, since our only devices running the latest OS are ARM64v8a devices. But I’m pretty sure that the behavior will be the same.

Devices with ARM64v8a CPU architecture work without any problems (tested on Android 7.x and 8.x).
Also, when compiling the same code with NDK r15c, all CPU architectures work perfectly fine.

Here's the log output of the r16 crashes:

11-23 08:00:30.531 4396-4438/? A/libc: Fatal signal 7 (SIGBUS), code 1, fault addr 0x449d in tid 4438 (neth.linktomidi)
11-23 08:00:30.588 190-190/? A/DEBUG: *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
11-23 08:00:30.588 190-190/? A/DEBUG: Build fingerprint: 'google/razor/flo:6.0.1/MOB30X/3036618:user/release-keys'
11-23 08:00:30.588 190-190/? A/DEBUG: Revision: '0'
11-23 08:00:30.588 190-190/? A/DEBUG: ABI: 'arm'
11-23 08:00:30.588 190-190/? A/DEBUG: pid: 4396, tid: 4438, name: neth.linktomidi >>> com.planeth.linktomidi <<<
11-23 08:00:30.588 190-190/? A/DEBUG: signal 7 (SIGBUS), code 1 (BUS_ADRALN), fault addr 0x449d
11-23 08:00:30.612 190-190/? A/DEBUG: r0 b6c994b5 r1 0000449d r2 a03fce3c r3 a03fce8e
11-23 08:00:30.612 190-190/? A/DEBUG: r4 00004499 r5 a0adb968 r6 d66ab45d r7 a0cf37c8
11-23 08:00:30.612 190-190/? A/DEBUG: r8 a03e6000 r9 b3abb3d8 sl a0aaff40 fp a0cf389c
11-23 08:00:30.612 190-190/? A/DEBUG: ip a0fdf804 sp a0cf37c0 lr a0f633cb pc a0f633e4 cpsr 000f0030
11-23 08:00:30.627 190-190/? A/DEBUG: backtrace:
11-23 08:00:30.627 190-190/? A/DEBUG: #00 pc 0006e3e4 /data/app/com.planeth.linktomidi-1/lib/arm/libabl-link-facade.so
11-23 08:00:30.627 190-190/? A/DEBUG: #01 pc 0006e3c7 /data/app/com.planeth.linktomidi-1/lib/arm/libabl-link-facade.so (_ZNSt14__shared_countILN9__gnu_cxx12_Lock_policyE2EED2Ev+10)
11-23 08:00:30.628 190-190/? A/DEBUG: #02 pc 0007e34b /data/app/com.planeth.linktomidi-1/lib/arm/libabl-link-facade.so (_ZNK7ableton4util16SafeAsyncHandlerINS_9platforms4asio6SocketILj512EE4ImplEEclIJRKSt10error_codeRKjEEEvDpOT_+82)
11-23 08:00:30.628 190-190/? A/DEBUG: #03 pc 0007e24b /data/app/com.planeth.linktomidi-1/lib/arm/libabl-link-facade.so (_ZN4asio6detail27reactive_socket_recvfrom_opINS_17mutable_buffers_1ENS_2ip14basic_endpointINS3_3udpEEEN7ableton4util16SafeAsyncHandlerINS7_9platforms4asio6SocketILj512EE4ImplEEEE11do_completeEPvPNS0_19scheduler_operationERKSt10error_codej+90)
11-23 08:00:30.628 190-190/? A/DEBUG: #04 pc 00070d43 /data/app/com.planeth.linktomidi-1/lib/arm/libabl-link-facade.so (_ZN4asio6detail9scheduler10do_run_oneERNS0_11scoped_lockINS0_11posix_mutexEEERNS0_21scheduler_thread_infoERKSt10error_code+218)
11-23 08:00:30.628 190-190/? A/DEBUG: #05 pc 00070bbd /data/app/com.planeth.linktomidi-1/lib/arm/libabl-link-facade.so (_ZN4asio6detail9scheduler3runERSt10error_code+92)
11-23 08:00:30.628 190-190/? A/DEBUG: #06 pc 00070b35 /data/app/com.planeth.linktomidi-1/lib/arm/libabl-link-facade.so (_ZN4asio10io_context3runEv+32)
11-23 08:00:30.628 190-190/? A/DEBUG: #07 pc 000766d9 /data/app/com.planeth.linktomidi-1/lib/arm/libabl-link-facade.so (_ZZN7ableton9platforms4asio7ContextINS0_5posix13ScanIpIfAddrsENS_4util7NullLogEEC1INS_4link10ControllerISt8functionIFvjEESB_IFvNS9_5TempoEEENS0_5linux5ClockILi1EEES7_E23UdpSendExceptionHandlerEEET_ENKUlRN4asio10io_contextESL_E_clESP_SL_+38)
11-23 08:00:30.628 190-190/? A/DEBUG: #08 pc 000b53cf /data/app/com.planeth.linktomidi-1/lib/arm/libabl-link-facade.so
11-23 08:00:30.629 190-190/? A/DEBUG: #09 pc 0003f45f /system/lib/libc.so (_ZL15__pthread_startPv+30)
11-23 08:00:30.629 190-190/? A/DEBUG: #10 pc 00019b43 /system/lib/libc.so (__start_thread+6)
11-23 08:00:31.391 190-190/? A/DEBUG: Tombstone written to: /data/tombstones/tombstone_09
11-23 08:00:31.391 190-190/? E/DEBUG: AM write failed: Broken pipe

Any help would be appreciated.
Thanks in advance

Andreas Grässer

Nov 23, 2017, 3:36:33 PM
to android-ndk
Just for completeness:
I just did the same tests with Peter Brinkmann’s Ableton Link for PD Android example (https://github.com/libpd/abl_link ).
Compiling with NDK r15c: All devices work without problems.
Compiling with NDK r16: ARMv7 devices run into the same crash as they do with our integration, as soon as network traffic occurs between the app and at least one other Ableton Link peer.
The crash can, but usually does not, occur right after joining the Ableton Link session.
It can be forced either by changing the tempo on one of the other peers, or by turning the wireless LAN off and then on again (on the peer running the r16-compiled app).

11-23 17:01:36.388 17870-17899/? A/libc: Fatal signal 7 (SIGBUS), code 1, fault addr 0xb510b375 in tid 17899 (r.abllinksample)
11-23 17:01:36.448 193-193/? A/DEBUG: *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
11-23 17:01:36.448 193-193/? A/DEBUG: Build fingerprint: 'google/razor/flo:6.0.1/MOB30X/3036618:user/release-keys'
11-23 17:01:36.448 193-193/? A/DEBUG: Revision: '0'
11-23 17:01:36.448 193-193/? A/DEBUG: ABI: 'arm'
11-23 17:01:36.449 193-193/? A/DEBUG: pid: 17870, tid: 17899, name: r.abllinksample  >>> com.noisepages.nettoyeur.abllinksample <<<
11-23 17:01:36.449 193-193/? A/DEBUG: signal 7 (SIGBUS), code 1 (BUS_ADRALN), fault addr 0xb510b375
11-23 17:01:36.476 193-193/? A/DEBUG:     r0 b6cf34b5  r1 b510b375  r2 b38a683c  r3 b38a688e
11-23 17:01:36.476 193-193/? A/DEBUG:     r4 b510b371  r5 b3877768  r6 77f5e481  r7 a19c37c8
11-23 17:01:36.477 193-193/? A/DEBUG:     r8 ab338d00  r9 b36ab368  sl ab338740  fp a19c389c
11-23 17:01:36.477 193-193/? A/DEBUG:     ip aecb78c8  sp a19c37c0  lr aec3ef7b  pc aec3ef94  cpsr 800f0030
11-23 17:01:36.493 193-193/? A/DEBUG: backtrace:
11-23 17:01:36.493 193-193/? A/DEBUG:     #00 pc 0006df94  /data/app/com.noisepages.nettoyeur.abllinksample-2/lib/arm/libabl_link_tilde.so
11-23 17:01:36.494 193-193/? A/DEBUG:     #01 pc 0006df77  /data/app/com.noisepages.nettoyeur.abllinksample-2/lib/arm/libabl_link_tilde.so (_ZNSt14__shared_countILN9__gnu_cxx12_Lock_policyE2EED2Ev+10)
11-23 17:01:36.494 193-193/? A/DEBUG:     #02 pc 0007956b  /data/app/com.noisepages.nettoyeur.abllinksample-2/lib/arm/libabl_link_tilde.so (_ZNK7ableton4util16SafeAsyncHandlerINS_9platforms4asio6SocketILj512EE4ImplEEclIJRKSt10error_codeRKjEEEvDpOT_+82)
11-23 17:01:36.494 193-193/? A/DEBUG:     #03 pc 0007946b  /data/app/com.noisepages.nettoyeur.abllinksample-2/lib/arm/libabl_link_tilde.so (_ZN4asio6detail27reactive_socket_recvfrom_opINS_17mutable_buffers_1ENS_2ip14basic_endpointINS3_3udpEEEN7ableton4util16SafeAsyncHandlerINS7_9platforms4asio6SocketILj512EE4ImplEEEE11do_completeEPvPNS0_19scheduler_operationERKSt10error_codej+90)
11-23 17:01:36.494 193-193/? A/DEBUG:     #04 pc 0006da8f  /data/app/com.noisepages.nettoyeur.abllinksample-2/lib/arm/libabl_link_tilde.so (_ZN4asio6detail9scheduler10do_run_oneERNS0_11scoped_lockINS0_11posix_mutexEEERNS0_21scheduler_thread_infoERKSt10error_code+218)
11-23 17:01:36.495 193-193/? A/DEBUG:     #05 pc 0006d8b1  /data/app/com.noisepages.nettoyeur.abllinksample-2/lib/arm/libabl_link_tilde.so (_ZN4asio6detail9scheduler3runERSt10error_code+92)
11-23 17:01:36.495 193-193/? A/DEBUG:     #06 pc 0006d829  /data/app/com.noisepages.nettoyeur.abllinksample-2/lib/arm/libabl_link_tilde.so (_ZN4asio10io_context3runEv+32)
11-23 17:01:36.495 193-193/? A/DEBUG:     #07 pc 0007019f  /data/app/com.noisepages.nettoyeur.abllinksample-2/lib/arm/libabl_link_tilde.so (_ZZN7ableton9platforms4asio7ContextINS0_5posix13ScanIpIfAddrsENS_4util7NullLogEEC1INS_4link10ControllerISt8functionIFvjEESB_IFvNS9_5TempoEEENS0_3stl5ClockES7_E23UdpSendExceptionHandlerEEET_ENKUlRN4asio10io_contextESK_E_clESO_SK_+40)
11-23 17:01:36.495 193-193/? A/DEBUG:     #08 pc 000b30a7  /data/app/com.noisepages.nettoyeur.abllinksample-2/lib/arm/libabl_link_tilde.so
11-23 17:01:36.495 193-193/? A/DEBUG:     #09 pc 0003f45f  /system/lib/libc.so (_ZL15__pthread_startPv+30)
11-23 17:01:36.496 193-193/? A/DEBUG:     #10 pc 00019b43  /system/lib/libc.so (__start_thread+6)
11-23 17:01:37.095 193-193/? W/debuggerd: type=1400 audit(0.0:344): avc: denied { read } for name="kgsl-3d0" dev="tmpfs" ino=4008 scontext=u:r:debuggerd:s0 tcontext=u:object_r:gpu_device:s0 tclass=chr_file permissive=0
11-23 17:01:37.095 193-193/? W/debuggerd: type=1400 audit(0.0:345): avc: denied { read } for name="kgsl-3d0" dev="tmpfs" ino=4008 scontext=u:r:debuggerd:s0 tcontext=u:object_r:gpu_device:s0 tclass=chr_file permissive=0
11-23 17:01:37.293 193-193/? A/DEBUG: Tombstone written to: /data/tombstones/tombstone_00
11-23 17:01:37.293 193-193/? E/DEBUG: AM write failed: Broken pipe

Phil Burk

Nov 27, 2017, 3:52:14 PM
to android-ndk
Hello Andreas,

Thanks for the bug report. I have created an internal bug for this: b/69802177

There are pretty clear instructions for building a test app here:


So we should be able to reproduce it.

Phil Burk

Mikhail Naganov

Nov 28, 2017, 11:07:19 AM
to android-ndk
Although, it seems that you are just using std::shared_ptr (from looking at https://github.com/Ableton/link/blob/master/include/ableton/util/SafeAsyncHandler.hpp).

Then indeed you are not much in control of how the shared_ptr's counter is allocated, so this must be a compiler / STL bug in this NDK.

Mikhail Naganov

Nov 28, 2017, 11:07:19 AM
to android-ndk
I have typically seen such crashes (signal 7 (SIGBUS), code 1 (BUS_ADRALN)) on ARMv7 whenever there is an attempt to access a field atomically and the field isn't aligned on a 32-bit boundary. Looking at your fault addresses, they indeed aren't.

This usually happens when you don't specify alignment for your fields. Then it all depends on how the compiler arranges them, and this can change from version to version, so updating the NDK version could cause that.

If you believe you specify the alignment properly, then it might be a compiler error. In this case please provide a minimal repro case.
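
As a rough illustration of the alignment point above, here is a minimal C++ sketch. The names `is_aligned` and `ControlBlock` are hypothetical and not taken from Ableton Link or the NDK; the sketch only shows how a field that is accessed atomically can be given an explicit 32-bit alignment, and how an address such as the reported fault addresses can be checked against that boundary.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Returns true when the address p is a multiple of `alignment` bytes.
// (Hypothetical helper, for illustration only.)
inline bool is_aligned(const void* p, std::size_t alignment) {
    assert(alignment != 0);
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Hypothetical structure with an atomically accessed counter.
// alignas(4) states the 32-bit alignment requirement explicitly instead of
// relying on whatever layout the compiler picks by default.
struct ControlBlock {
    char tag = 0;
    alignas(4) std::atomic<std::int32_t> refs{0};
};

// The whole struct must be at least as strictly aligned as its members.
static_assert(alignof(ControlBlock) >= 4,
              "ControlBlock must be aligned on a 32-bit boundary");
```

With this helper, both fault addresses from the logs (0x449d and 0xb510b375) are reported as not 4-byte aligned, which matches the BUS_ADRALN code in the tombstones.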



Andreas Grässer

Nov 28, 2017, 11:07:30 AM
to android-ndk
Thanks for your answer, Phil.
The problem has just been solved.
It was caused by the clang optimization setting (the default is -Oz in NDK r16).
Switching back to -Os solved the problem.

All details here:

Andreas

Ryan Prichard

Nov 28, 2017, 11:07:34 AM
to andro...@googlegroups.com
The app developer also filed an issue on the NDK GitHub repository: https://github.com/android-ndk/ndk/issues/579.

-Ryan



Andreas Grässer

Nov 28, 2017, 1:48:45 PM
to android-ndk
Thank you Mikhail.
The problem has been resolved in the meantime.
You were right, though: it was/is a compiler problem.

Peter Holly

Dec 1, 2017, 10:45:37 AM
to android-ndk
Hi,

we had a similar problem with this code:
a = bswap(*(unsigned int*)block);

Compiled with NDK r16:

.text:0006A0D8             t = R0                                  ; int
.text:0006A0D8 DB F8 00 10                 LDR.W           R1, [block]
.text:0006A0DC 04 C9                       LDMIA           R1!, {R2}              ; <- code crashed here
.text:0006A0DE CB F8 00 10                 STR.W           R1, [block]
.text:0006A0E2 11 BA                       REV             R1, R2
.text:0006A0E4             a = R1

Compiled with NDK r15:

.text:0006365E                 LDR             R0, [block]
.text:00063660                 MOVS            R1, #0
.text:00063662
.text:00063662 loc_63662
.text:00063662 t = R1                                  ; int
.text:00063662                 LDR.W           R2, [R0,t,LSL#2]
.text:00063666                 REV             R2, R2
.text:00063668 a = R2 


r16 used the LDMIA instruction, which does not support unaligned addresses. Since it loads only one register here, it's probably not the best choice.
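
One common way to avoid depending on which load instruction the compiler picks is to stop dereferencing a possibly unaligned pointer and copy the bytes with memcpy instead. Dereferencing a misaligned `unsigned int*` is undefined behavior in C and C++, so the compiler is allowed to assume alignment. The following is only a sketch: the function names are hypothetical, and `__builtin_bswap32` (a GCC/Clang builtin) stands in for the `bswap` used in the snippet above, whose definition is not shown in this thread.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reads 32 bits from p in host byte order without assuming any alignment.
// The compiler is free to turn the memcpy into a single load where the
// target allows it, and into byte loads where it does not.
inline std::uint32_t load_u32_unaligned(const unsigned char* p) {
    assert(p != nullptr);
    std::uint32_t value;
    std::memcpy(&value, p, sizeof value);
    return value;
}

// Alignment-safe counterpart of bswap(*(unsigned int*)block).
// Assumption: bswap reverses the byte order of a 32-bit value.
inline std::uint32_t load_u32_swapped(const unsigned char* p) {
    return __builtin_bswap32(load_u32_unaligned(p));
}
```

A call such as `a = load_u32_swapped(block);` would then replace the cast in the original line, assuming `block` points to bytes (an `unsigned char*` or similar).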

peter