[llvm-dev] objcopy --prefix-symbols and undefined symbols

Keith Smiley via llvm-dev

Nov 12, 2021, 8:01:52 PM
to llvm...@lists.llvm.org
Hey folks,

I went to implement objcopy's --prefix-symbols support for MachO binaries and was a bit surprised by its behavior with ELF binaries (which matches binutils' objcopy as well). Prefixing applies to all symbols, including undefined symbols, meaning that even something as simple as this example will not work:

```
% cat /tmp/main.c
#include <stdio.h>

int main() {
  printf("hi\n");
}
% clang -c /tmp/main.c -o /tmp/main.o
% llvm-objcopy --prefix-symbols=bar /tmp/main.o
% clang /tmp/main.o
/usr/bin/ld: error in /tmp/main.o(.eh_frame); no .eh_frame_hdr table will be created
/usr/bin/ld: /usr/bin/../lib/gcc/x86_64-linux-gnu/10/../../../x86_64-linux-gnu/crt1.o: in function `_start':
(.text+0x24): undefined reference to `main'
clang: error: linker command failed with exit code 1 (use -v to see invocation)
```

While prefixing `main` specifically might not be a common use case, my expected use case for this would be to prefix symbols in a static library, which has a similar issue for undefined symbols:

```
/usr/bin/ld: prefixed.o:(.data+0x6b0): undefined reference to `barmunmap'
/usr/bin/ld: prefixed.o:(.data+0x6c8): undefined reference to `barmremap'
/usr/bin/ld: prefixed.o:(.data+0x6f8): undefined reference to `barreadlink'
/usr/bin/ld: prefixed.o:(.data+0x710): undefined reference to `barlstat64'
clang: error: linker command failed with exit code 1 (use -v to see invocation)
```

I must be misunderstanding the purpose of this flag in general, so I'm curious if someone could clarify its use case, to see whether MachO support is worth following through on.

Thanks!
--
Keith Smiley

James Henderson via llvm-dev

Nov 15, 2021, 3:40:01 AM
to Keith Smiley, llvm-dev
Hi Keith,

llvm-objcopy's ELF behaviour is closely modelled on GNU objcopy's behaviour, with the intent that it is a drop-in replacement for that tool. There are a limited number of exceptions to this, but --prefix-symbols is not one of them. When using GNU objcopy v2.30, I get the following behaviour:

```
$ cat test.cpp
void foo();
int main(){
   foo();
}
$ gcc -c test.cpp
$ nm test.o
                 U _GLOBAL_OFFSET_TABLE_
0000000000000000 T main
                 U _Z3foov
$ objcopy test.o test.prefix.o --prefix-symbols=bar
$ nm test.prefix.o
                 U bar_GLOBAL_OFFSET_TABLE_
0000000000000000 T barmain
                 U bar_Z3foov
```

I agree with you: I don't know in what situations this behaviour would be useful, apart from some pretty niche cases. Perhaps it's worth raising on the GNU mailing list?

James


_______________________________________________
LLVM Developers mailing list
llvm...@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Fangrui Song via llvm-dev

Nov 20, 2021, 4:58:25 PM
to jh737...@my.bristol.ac.uk, llvm-dev

The upside of the behavior is:
relocatable_link(prefix_symbols(a.o), prefix_symbols(b.o)) = prefix_symbols(relocatable_link(a.o, b.o))

I.e. it makes symbol resolution work for the symbol set within the library.
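
The identity above can be checked on a toy model. The sketch below is only an illustration, not how objcopy or the linker are implemented: it assumes an "object" is just an nm-like list of `T name` (defined) and `U name` (undefined) lines, and the helper names `prefix_symbols` and `relocatable_link`, the `.syms` files, and the sample symbols are all made up for the example.

```shell
#!/bin/sh
# Toy model: an "object" is a text file of "<type> <name>" lines,
# where T means defined and U means undefined. This is a simplification
# for illustration, not the real ELF symbol table.

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

printf '%s\n' 'T main' 'U foo' 'U printf' > "$workdir/a.syms"
printf '%s\n' 'T foo' 'U printf' > "$workdir/b.syms"

# Prefix every symbol name, defined or undefined.
# usage: prefix_symbols PREFIX FILE
prefix_symbols() {
  awk -v p="$1" '{ print $1, p $2 }' "$2"
}

# Merge symbol lists: a name defined in any input becomes defined in the
# output; a name that is only referenced stays undefined.
# usage: relocatable_link FILE...
relocatable_link() {
  awk '
    $1 == "T" { defined[$2] = 1 }
    { seen[$2] = 1 }
    END {
      for (name in seen) {
        type = (name in defined) ? "T" : "U"
        print type, name
      }
    }' "$@" | LC_ALL=C sort
}

# Right-hand side: link first, then prefix.
relocatable_link "$workdir/a.syms" "$workdir/b.syms" > "$workdir/ab.syms"
linked_then_prefixed=$(prefix_symbols bar "$workdir/ab.syms")

# Left-hand side: prefix each input, then link.
prefix_symbols bar "$workdir/a.syms" > "$workdir/a.bar.syms"
prefix_symbols bar "$workdir/b.syms" > "$workdir/b.bar.syms"
prefixed_then_linked=$(relocatable_link "$workdir/a.bar.syms" "$workdir/b.bar.syms")

printf 'link then prefix:\n%s\n' "$linked_then_prefixed"
printf 'prefix then link:\n%s\n' "$prefixed_then_linked"
```

In this model the reference to `foo` still resolves against the definition in the other object because both sides receive the same prefix. The same rule is what turns an outside symbol such as `printf` into the unresolvable `barprintf`.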

That said, I think --prefix-symbols is not useful most of the time.
Users need --redefine-syms, which applies to a selected set of symbols,
because with --prefix-symbols it is very difficult to rule out:

undefined libc symbols
undefined libgcc/clang_rt.builtins symbols
undefined ABI symbols (e.g. _GLOBAL_OFFSET_TABLE_)
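
One possible way to build such a selective set is to derive a rename map from the defined symbols only. The sketch below is a hedged illustration rather than a recommended workflow: it hard-codes the `nm` output from the earlier `test.o` example instead of running `nm`, and it assumes that lines with three fields (address, type, name) are the defined symbols.

```shell
#!/bin/sh
# Sample nm output copied from the test.o example earlier in the thread.
# Undefined symbols have no address column, so they have only two fields.
nm_output='                 U _GLOBAL_OFFSET_TABLE_
0000000000000000 T main
                 U _Z3foov'

prefix=bar

# Emit "old new" pairs for defined symbols only (assumption: a defined
# symbol line has exactly three fields and its type is not U).
redefine_map=$(printf '%s\n' "$nm_output" |
  awk -v p="$prefix" 'NF == 3 && $2 != "U" { print $3, p $3 }')

printf '%s\n' "$redefine_map"
```

With a real object file, the same filter could presumably be fed from `nm --defined-only test.o`, written to a file (say `syms.map`, a hypothetical name), and passed as `--redefine-syms=syms.map`, which reads one `old new` pair per line. Undefined references such as `_GLOBAL_OFFSET_TABLE_` then keep their original names.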
