[PATCH v2] Add new mechanism for function aliases - outside original source file

Nadav Har'El

Aug 3, 2020, 7:18:13 PM
to osv...@googlegroups.com, Nadav Har'El
For a long time, adding an alias to a function required us to use the
"weak_alias" macro in the *same* source file as the original function.
This forced us to modify some musl files we didn't want to modify.

In this patch I add a new mechanism for creating an alias for functions
without modifying their original file. The original symbol's address is
only known at link time, not compile time, so we do this symbol copying
via a linker script - we have a new file libc/aliases.ld with a simple
list of symbol assignments.

To demonstrate how easy and useful this feature is, we drop one
file, res_init.c, which we had copied from musl and changed only
because we wanted to add an alias to it. With the new aliases.ld, we
no longer need to modify the original file.

In followup patches we can move a lot of the aliases we added in other
ways (weak_alias / alias in modified files, wrappers in libc/math/aliases.cc)
to the new aliases.ld.

Signed-off-by: Nadav Har'El <n...@scylladb.com>
---
Makefile | 2 +-
arch/aarch64/loader.ld | 1 +
arch/x64/loader.ld | 1 +
libc/aliases.ld | 12 ++++++++++++
libc/network/res_init.c | 7 -------
5 files changed, 15 insertions(+), 8 deletions(-)
create mode 100644 libc/aliases.ld
delete mode 100644 libc/network/res_init.c

diff --git a/Makefile b/Makefile
index cd490a76..9bab08c0 100644
--- a/Makefile
+++ b/Makefile
@@ -1375,7 +1375,7 @@ musl += network/getservbyport.o
libc += network/getifaddrs.o
libc += network/if_nameindex.o
musl += network/if_freenameindex.o
-libc += network/res_init.o
+musl += network/res_init.o

musl += prng/rand.o
musl += prng/rand_r.o
diff --git a/arch/aarch64/loader.ld b/arch/aarch64/loader.ld
index ad2135ee..a02e52b7 100644
--- a/arch/aarch64/loader.ld
+++ b/arch/aarch64/loader.ld
@@ -8,6 +8,7 @@
*/

INCLUDE "loader_options.ld"
+INCLUDE "libc/aliases.ld"
SECTIONS
{
/* Set the initial program counter to one page beyond the minimal
diff --git a/arch/x64/loader.ld b/arch/x64/loader.ld
index f981859d..dc963108 100644
--- a/arch/x64/loader.ld
+++ b/arch/x64/loader.ld
@@ -6,6 +6,7 @@
*/

INCLUDE "loader_options.ld"
+INCLUDE "libc/aliases.ld"
SECTIONS
{
/* Set the initial program counter to one page beyond the minimal
diff --git a/libc/aliases.ld b/libc/aliases.ld
new file mode 100644
index 00000000..89a87373
--- /dev/null
+++ b/libc/aliases.ld
@@ -0,0 +1,12 @@
+/* This file defines symbols as *aliases* to other symbols. The linker
+ * statically-linking the OSv kernel will set the alias's address to be
+ * the same one as the original symbol.
+ *
+ * This technique is more powerful than the C compiler's "alias(...)"
+ * attribute - the compiler-only technique is only usable when the alias
+ * and original symbol are defined in the same translation unit, because
+ * it is the compiler - not the linker - that needs to copy the symbol's
+ * address.
+ */
+
+__res_init = res_init;
diff --git a/libc/network/res_init.c b/libc/network/res_init.c
deleted file mode 100644
index 66f3f95a..00000000
--- a/libc/network/res_init.c
+++ /dev/null
@@ -1,7 +0,0 @@
-#include "libc.h"
-
-int res_init()
-{
- return 0;
-}
-weak_alias(res_init, __res_init);
--
2.26.2
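
The same-file restriction described in the commit message can be sketched in C. This is only an illustration, assuming GCC or Clang on an ELF target: the weak_alias macro below is modeled on musl's and is not copied from OSv, and the demo_* names are hypothetical stand-ins for res_init and __res_init.

```c
#include <assert.h>

/* Assumed definition, modeled on musl's weak_alias(); the macro OSv
 * actually uses is not shown in this thread. */
#define weak_alias(old, new) \
    extern __typeof(old) new __attribute__((weak, alias(#old)))

/* Hypothetical stand-in for res_init(). */
int demo_res_init(void)
{
    return 0;
}

/* Accepted by the compiler only because demo_res_init is defined in
 * this same translation unit. */
weak_alias(demo_res_init, demo_res_init_alias);

/* Returns 1 when both names resolve to the same address. */
int demo_alias_matches(void)
{
    return demo_res_init_alias == demo_res_init;
}

/* Aborts if the alias does not behave like the original. */
void demo_alias_check(void)
{
    assert(demo_alias_matches());
    assert(demo_res_init_alias() == 0);
}
```

Moving the weak_alias() line into a different source file makes the compiler reject it, because the alias target must be defined in the same translation unit. The one-line assignment in aliases.ld avoids that by letting the linker copy the address instead.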

Waldek Kozaczuk

Aug 5, 2020, 12:36:54 AM
to Nadav Har'El, osv...@googlegroups.com
This looks like a great solution. I was about to commit, but I realized that alias symbols like __res_init become global symbols rather than weak ones in loader.elf. Does that have any practical consequences for how the OSv dynamic linker handles them?


Nadav Har'El

Aug 5, 2020, 3:23:42 AM
to Waldek Kozaczuk, Pekka Enberg, Osv Dev
On Wed, Aug 5, 2020 at 7:36 AM Waldek Kozaczuk <jwkoz...@gmail.com> wrote:
> This looks like a great solution. I was about to commit, but I realized that alias symbols like __res_init become global symbols rather than weak ones in loader.elf. Does that have any practical consequences for how the OSv dynamic linker handles them?

I never understood why we used weak aliases in OSv; I just used the "weak_alias" macro we had, like a parrot. Pekka, maybe you remember?

Now that you have asked me to think about it, my conclusion is: 1. we never needed to use weak aliases, and 2. we used them wrong :-)

The correct use of weak aliases is in statically-linked libraries. You usually define the normal name as a weak symbol and the
"__" name as a regular symbol (the opposite of what we did in OSv many times!). Here is a random example from nm of glibc.a:

0000000000000000 T _IO_fputs
0000000000000000 W fputs

This pair of symbols allows users to redefine fputs() in their own code as some sort of wrapper that can still call the original version
under its alternative name, _IO_fputs. If fputs() were not a weak symbol and the user code redefined it, that code could not
call _IO_fputs: while bringing in _IO_fputs, the linker would also bring in fputs (the static linker pulls in entire objects from
the ".a" archive), causing a duplicate symbol error. A weak symbol doesn't cause this duplicate symbol problem.

This is not relevant to OSv, where applications are dynamically linked to the OSv kernel and each symbol can be overridden by the
application regardless of any other symbols. We never should have used weak symbols in OSv; they never made any difference.
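
The library side of that strong/weak pairing can be sketched as follows. This is a minimal illustration, assuming GCC or Clang on an ELF target; demo_IO_count and demo_count are hypothetical names playing the roles of _IO_fputs and fputs, and the code is not taken from glibc or OSv.

```c
#include <assert.h>
#include <string.h>

/* Strong "internal" name, in the role of _IO_fputs. */
int demo_IO_count(const char *s)
{
    return (int)strlen(s);
}

/* Weak public name, in the role of fputs. It shares the address of
 * demo_IO_count unless another object file supplies a strong
 * definition of demo_count. */
extern __typeof(demo_IO_count) demo_count
    __attribute__((weak, alias("demo_IO_count")));

/* A separate, user-supplied source file could then contain:
 *
 *     int demo_IO_count(const char *s);
 *     int demo_count(const char *s) { return demo_IO_count(s) + 1; }
 *
 * With static linking, that strong demo_count replaces the weak one
 * without a duplicate-symbol error, and it can still reach the
 * original through demo_IO_count. */

/* With no override linked in, the weak name reaches the original. */
int demo_default_behavior(void)
{
    return demo_count("abc");
}

/* Aborts if the weak name does not resolve to the strong one. */
void demo_weak_check(void)
{
    assert(demo_count == demo_IO_count);
    assert(demo_default_behavior() == 3);
}
```

If demo_count were a regular (strong) symbol in this file, the user's own demo_count would collide with it as soon as the linker pulled this object in to satisfy demo_IO_count.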

Nadav.

Commit Bot

Aug 5, 2020, 9:50:31 AM
to osv...@googlegroups.com, Nadav Har'El
From: Nadav Har'El <n...@scylladb.com>
Committer: Waldemar Kozaczuk <jwkoz...@gmail.com>
Branch: master

Add new mechanism for function aliases - outside original source file

For a long time, adding an alias to a function required us to use the
"weak_alias" macro in the *same* source file as the original function.
This forced us to modify some musl files we didn't want to modify.

In this patch I add a new mechanism for creating an alias for functions
without modifying their original file. The original symbol's address is
only known at link time, not compile time, so we do this symbol copying
via a linker script - we have a new file libc/aliases.ld with a simple
list of symbol assignments.

To demonstrate how easy and useful this feature is, we drop one
file, res_init.c, which we had copied from musl and changed only
because we wanted to add an alias to it. With the new aliases.ld, we
no longer need to modify the original file.

In followup patches we can move a lot of the aliases we added in other
ways (weak_alias / alias in modified files, wrappers in libc/math/aliases.cc)
to the new aliases.ld.

Signed-off-by: Nadav Har'El <n...@scylladb.com>
Message-Id: <20200803231809...@scylladb.com>
