Can't see symbols from Erlang NIF library in core file


Attila Rajmund Nohl

July 20, 2021, 13:38:18
To: Erlang
Hello!

I'm working on an Erlang wrapper over a 3rd party C library on Ubuntu
Linux on x86, so I'm creating a NIF. Sometimes my code (I think)
crashes, resulting in a core file. Unfortunately the stacktrace is not
really helpful:

(gdb) bt
#0 0x00007fc22229968a in ?? ()
#1 0x0000000060e816d8 in ?? ()
#2 0x0000000007cd48b0 in ?? ()
#3 0x00007fc228031410 in ?? ()
#4 0x00007fc228040b80 in ?? ()
#5 0x00007fc228040c50 in ?? ()
#6 0x00007fc22223de0b in ?? ()
#7 0x0000000000000000 in ?? ()

even though I built my NIF .so file with debug info:

ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically
linked, BuildID[sha1]=b70dd1f2450f5c0e9980c8396aaad2e1cd29024c, with
debug_info, not stripped

The beam binary also has debug info:

ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically
linked, interpreter /lib64/ld-linux-x86-64.so.2,
BuildID[sha1]=e0a5dba6507b8c2b333faebc89fbc6ea2f7263b9, for GNU/Linux
3.2.0, with debug_info, not stripped

However, info sharedlibrary shows neither the NIF nor the 3rd party lib:

(gdb) info sharedlibrary
From                To                  Syms Read   Shared Object Library
0x00007fc28942ed50  0x00007fc289432004  Yes         /lib/x86_64-linux-gnu/libgtk3-nocsd.so.0
0x00007fc289429220  0x00007fc28942a179  Yes         /lib/x86_64-linux-gnu/libdl.so.2
0x00007fc2892e83c0  0x00007fc28938ef18  Yes         /lib/x86_64-linux-gnu/libm.so.6
0x00007fc2892b76a0  0x00007fc2892c517c  Yes         /lib/x86_64-linux-gnu/libtinfo.so.6
0x00007fc28928dae0  0x00007fc28929d4d5  Yes         /lib/x86_64-linux-gnu/libpthread.so.0
0x00007fc2890b9630  0x00007fc28922e20d  Yes         /lib/x86_64-linux-gnu/libc.so.6
0x00007fc289657100  0x00007fc289679674  Yes (*)     /lib64/ld-linux-x86-64.so.2
0x00007fc24459c040  0x00007fc2445ab8ad  Yes         /home/nar/otp/23.3.4.2/lib/crypto-4.9.0.2/priv/lib/crypto.so
0x00007fc2239e3000  0x00007fc223b7c800  Yes (*)     /lib/x86_64-linux-gnu/libcrypto.so.1.1
0x00007fc2896500e0  0x00007fc28965028c  Yes         /home/nar/otp/23.3.4.2/lib/crypto-4.9.0.2/priv/lib/crypto_callback.so
0x00007fc289649380  0x00007fc28964bc1c  Yes         /home/nar/otp/23.3.4.2/lib/asn1-5.0.15/priv/lib/asn1rt_nif.so
0x00007fc289638720  0x00007fc28963bd70  Yes         /lib/x86_64-linux-gnu/librt.so.1

I found an answer at stackoverflow
(https://stackoverflow.com/a/32727752/2414208) mentioning that "The
Erlang VM doesn't load NIF libraries with global symbols exposed".
Could this be the reason why I don't see the symbols? Is there a way
to tell gdb to look up symbols from my .so file?
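
[Editor's sketch] One quick sanity check on output like the above is to test whether any backtrace address falls inside a range that gdb reports for a loaded library. The Python below does that with the addresses copied from this post. It assumes the From/To columns bound each library's code, and the names `LOADED`, `BACKTRACE` and `owner` are made up for this illustration.

```python
# (From, To, path) rows copied from the `info sharedlibrary` output above.
LOADED = [
    (0x00007fc28942ed50, 0x00007fc289432004, "/lib/x86_64-linux-gnu/libgtk3-nocsd.so.0"),
    (0x00007fc289429220, 0x00007fc28942a179, "/lib/x86_64-linux-gnu/libdl.so.2"),
    (0x00007fc2892e83c0, 0x00007fc28938ef18, "/lib/x86_64-linux-gnu/libm.so.6"),
    (0x00007fc2892b76a0, 0x00007fc2892c517c, "/lib/x86_64-linux-gnu/libtinfo.so.6"),
    (0x00007fc28928dae0, 0x00007fc28929d4d5, "/lib/x86_64-linux-gnu/libpthread.so.0"),
    (0x00007fc2890b9630, 0x00007fc28922e20d, "/lib/x86_64-linux-gnu/libc.so.6"),
    (0x00007fc289657100, 0x00007fc289679674, "/lib64/ld-linux-x86-64.so.2"),
    (0x00007fc24459c040, 0x00007fc2445ab8ad, "/home/nar/otp/23.3.4.2/lib/crypto-4.9.0.2/priv/lib/crypto.so"),
    (0x00007fc2239e3000, 0x00007fc223b7c800, "/lib/x86_64-linux-gnu/libcrypto.so.1.1"),
    (0x00007fc2896500e0, 0x00007fc28965028c, "/home/nar/otp/23.3.4.2/lib/crypto-4.9.0.2/priv/lib/crypto_callback.so"),
    (0x00007fc289649380, 0x00007fc28964bc1c, "/home/nar/otp/23.3.4.2/lib/asn1-5.0.15/priv/lib/asn1rt_nif.so"),
    (0x00007fc289638720, 0x00007fc28963bd70, "/lib/x86_64-linux-gnu/librt.so.1"),
]

# Frame addresses #0..#7 copied from the `bt` output above.
BACKTRACE = [
    0x00007fc22229968a, 0x0000000060e816d8, 0x0000000007cd48b0,
    0x00007fc228031410, 0x00007fc228040b80, 0x00007fc228040c50,
    0x00007fc22223de0b, 0x0000000000000000,
]


def owner(addr, ranges=LOADED):
    """Return the path of the listed library whose range contains addr, else None."""
    for start, end, path in ranges:
        if start <= addr < end:
            return path
    return None


for i, addr in enumerate(BACKTRACE):
    print(f"#{i} {addr:#018x} -> {owner(addr) or 'not in any listed library'}")
```

None of the eight frames lands in a listed library. That fits a library missing from the list, but it fits a corrupted stack just as well, so the check alone does not settle which one it is.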

Lukas Larsson

August 10, 2021, 03:07:19
To: Attila Rajmund Nohl, Erlang
Hello!

Did you manage to figure out how to get symbols for your NIF? There should not be anything special that you have to do other than compiling the NIF with debug symbols. Could it be the 3rd party C library that does not have symbols?

The asn1rt_nif.so and crypto.so NIFs get no special treatment and are loaded in exactly the same way as a user-defined NIF. So there is most likely something different or wrong with how your NIF is compiled and/or loaded.

> I found an answer at stackoverflow
> (https://stackoverflow.com/a/32727752/2414208) mentioning that "The
> Erlang VM doesn't load NIF libraries with global symbols exposed".
> Could this be the reason why I don't see the symbols? Is there a way
> to tell gdb to look up symbols from my .so file?

I don't think the Stack Overflow question is related to your problem, as it seems to be about symbols not resolving during dlopen rather than about symbol lookup in gdb.

Attila Rajmund Nohl

August 10, 2021, 07:32:04
To: Erlang
Hello!

The solution was to start the VM with -debug, so I got a crash at one
of the asserts and that led me to the bug. The 3rd party library is
stripped, but I thought that my code should be somewhere in the call
stack and gdb should show it. My bug was that I released a different
resource than I intended, so there was a "double free", probably leading
to memory corruption, which might explain why the call stack was so
bad.

Lukas Larsson

August 10, 2021, 08:09:46
To: Erlang
On Tue, Aug 10, 2021 at 1:31 PM Attila Rajmund Nohl <attila...@gmail.com> wrote:
> Hello!
>
> The solution was to start the VM with -debug, so I got a crash at one
> of the asserts and that led me to the bug. The 3rd party library is
> stripped, but I thought that my code should be somewhere in the call
> stack and gdb should show it. My bug was that I released a different
> resource than I intended, so there was a "double free", probably leading
> to memory corruption, which might explain why the call stack was so
> bad.

Great that you found the bug! Running the debug emulator is always a good idea to catch errors in NIFs.

Depending on what flags were used to compile the third-party library, gdb may need its debug information to properly unwind the stack so that you can see your own frames.
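
[Editor's sketch] The bug class described in this thread, releasing one reference-counted resource twice while the intended one is never released, can be modelled in a few lines. This Python toy is not the erl_nif API and not how the emulator is implemented. `Resource`, `release` and the `debug` switch are hypothetical names that only show why a build with extra assertions stops at the faulty call while a normal build carries on with a corrupted count.

```python
class Resource:
    """Toy stand-in for a reference-counted native resource."""

    def __init__(self, name):
        self.name = name
        self.refs = 1        # the creator holds one reference
        self.freed = False


def release(res, debug=True):
    """Drop one reference; with debug=True, assert the resource is still live."""
    if debug:
        # The kind of check a debug build can afford on every call.
        assert not res.freed and res.refs > 0, f"release on freed resource {res.name}"
    res.refs -= 1
    if res.refs == 0:
        res.freed = True


a, b = Resource("a"), Resource("b")
release(a)                    # intended release of a
try:
    release(a)                # bug: the author meant release(b)
    caught = None
except AssertionError as err:
    caught = str(err)         # debug mode stops right at the faulty call

c = Resource("c")
release(c, debug=False)
release(c, debug=False)       # no check: the count silently goes negative
print(caught, b.refs, c.refs)
```

In the checked run the failure points at the second release of `a`, and `b` is left with its reference still held. In the unchecked run nothing fails at the call site, so any damage only shows up later and somewhere else.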