Hiding C++ standard library

Waldek Kozaczuk

May 28, 2020, 1:10:19 AM
to OSv Development
The question of whether to hide the C++ standard library in the kernel stands at the crossroads of solving two issues:
To recap: on one hand, keeping the C++ standard library exposed reduces the size of the image (kernel + app files) and lowers memory utilization for C++ apps, because libstdc++.so is effectively provided by the kernel. On the other hand, it increases the size of the kernel ELF, is wasteful for non-C++ apps that do not depend on libstdc++.so, and potentially creates incompatibility issues when a C++ app depends on a different version of libstdc++ than the one the OSv kernel is linked with via '--whole-archive'.

Hiding the C++ standard library, in essence, reverses the advantages and disadvantages stated above. 

The extra caveat is that many internal apps (cpiod, httpserver, etc.) are C++ apps and, worse, they use the C++ API to interact with the OSv kernel.

In theory, there are 3 options we might consider:
  1. Keep the C++ standard library exposed, which means keeping all relevant C++ symbols exported when fixing #821. I am not sure how we could solve the "different C++ library" incompatibility issue.
  2. Hide the C++ standard library completely: link it with --no-whole-archive and hide its symbols using a version script file. This would affect the internal C++ apps: at a minimum we would need to add libstdc++.so to the image, and we would probably also have to change the existing C++ API in the OSv kernel that those apps use into a C API.
    • What about exceptions? Do we need to worry about them in any way as far as calls between apps and the kernel go?
  3. Hide the C++ standard library "as much as possible". In other words, link only enough of it into the kernel, expose it only to the internal C++ apps, and hide it completely from all other C++ apps. I fear this may be impossible or very cumbersome to accomplish.
I think that option 2 is the cleanest and easiest one to accomplish. What do you think?
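To make the "change the C++ API to a C API" part of option 2 more concrete, here is a minimal sketch of what such a wrapper could look like. Every name in it (the `sketch` namespace, `kernel_version`, `sketch_kernel_version`) is hypothetical and is not an existing OSv function. The point is only that nothing but plain C types crosses the app/kernel boundary, so the app and the kernel would be free to use different libstdc++ versions.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical kernel-internal C++ API (NOT an actual OSv function).
// It returns std::string, so exposing it directly would tie callers
// to the kernel's libstdc++ ABI.
namespace sketch {
std::string kernel_version() { return "v0.0-sketch"; }
}

// Hypothetical C-linkage wrapper: only a char buffer and a length cross
// the boundary. Returns the string length on success, or -1 if the
// arguments are invalid or the buffer is too small.
extern "C" int sketch_kernel_version(char* buf, std::size_t len) {
    if (buf == nullptr || len == 0) {
        return -1;
    }
    const std::string v = sketch::kernel_version();
    if (v.size() + 1 > len) {
        return -1;  // no room for the string plus its terminating NUL
    }
    std::memcpy(buf, v.c_str(), v.size() + 1);
    return static_cast<int>(v.size());
}
```

An internal app would then call `sketch_kernel_version(buf, sizeof buf)` instead of the C++ function, at the cost of some extra copying and less convenient error handling.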

Relatedly, here are the sizes of some libraries and OSv kernel when built on Ubuntu 19.04 with gcc 8.3 and Ubuntu 20.04 with gcc 9.3. I am also showing a reduced kernel size when linking C++ standard library with '--no-whole-archive' after applying this patch (please note we are not really hiding any symbols from C++ library yet):

diff --git a/Makefile b/Makefile
index 20ddf3b1..fbad7b21 100644
--- a/Makefile
+++ b/Makefile
@@ -1874,9 +1874,9 @@ $(out)/kernel.elf: $(stage1_targets) arch/$(arch)/loader.ld $(out)/empty_bootfs.
                -Bdynamic --export-dynamic --eh-frame-hdr --enable-new-dtags -L$(out)/arch/$(arch) \
            $(^:%.ld=-T %.ld) \
            --whole-archive \
-             $(libstdc++.a) $(libgcc_eh.a) \
+             $(libgcc_eh.a) \
              $(boost-libs) \
-           --no-whole-archive $(libgcc.a), \
+           --no-whole-archive $(libgcc.a) $(libstdc++.a), \
                LINK kernel.elf)
        $(call quiet, $(STRIP) $(out)/kernel.elf -o $(out)/kernel-stripped.elf, STRIP kernel.elf -> kernel-stripped.elf )
        $(call very-quiet, cp $(out)/kernel-stripped.elf $(out)/kernel.elf)


Ubuntu 19.04 with gcc 8.3:
4.7M /usr/lib/gcc/x86_64-linux-gnu/8/libstdc++.a
56K /usr/lib/gcc/x86_64-linux-gnu/8/libgcc_eh.a
68K /usr/lib/gcc/x86_64-linux-gnu/8/../../../x86_64-linux-gnu/libboost_system.a
2.9M /usr/lib/gcc/x86_64-linux-gnu/8/libgcc.a
1.9M /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.26

6.2M    ./build/release/kernel.elf #Before the patch
5.9M    ./build/release/kernel.elf #After the patch

Ubuntu 20.04 with gcc 9.3:
5.6M /usr/lib/gcc/x86_64-linux-gnu/9/libstdc++.a
60K /usr/lib/gcc/x86_64-linux-gnu/9/libgcc_eh.a
4.0K /usr/lib/gcc/x86_64-linux-gnu/9/../../../x86_64-linux-gnu/libboost_system.a
3.0M /usr/lib/gcc/x86_64-linux-gnu/9/libgcc.a
1.9M /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.28

6.8M    ./build/release/kernel.elf #Before the patch
6.1M    ./build/release/kernel.elf #After the patch

Why does the unpatched kernel differ in size between 20.04 and 19.04 (6.8M vs. 6.2M)? Is it the gcc version and the larger libstdc++.a? BTW, it is still amazing that, given libstdc++.a is 5.6M in size, linking the whole archive produces "only" a 6.8M kernel.elf.

Why is there still a difference after applying the patch (6.1M on 20.04 vs. 5.9M on 19.04)? Is it because gcc 9.3 and 8.3 optimize the code for size differently, possibly because some default optimizations are off in 9.3?

Finally, here is the output from bloaty, an ELF size analyzer, run against kernel.elf built on Ubuntu 20.04 before and after the patch:

Before the patch:
    FILE SIZE        VM SIZE    
 --------------  -------------- 
  55.8%  3.74Mi  51.3%  3.74Mi    .text
  10.4%   714Ki   9.6%   714Ki    .dynstr
   0.0%       0   8.1%   605Ki    .bss
   8.7%   597Ki   8.0%   597Ki    .eh_frame
   6.8%   467Ki   6.3%   467Ki    .rodata
   5.8%   397Ki   5.3%   397Ki    .dynsym
   2.8%   192Ki   2.6%   192Ki    .percpu
   2.1%   146Ki   2.0%   146Ki    .gnu.hash
   1.9%   130Ki   1.7%   130Ki    .hash
   1.9%   130Ki   1.7%   130Ki    .eh_frame_hdr
   1.5%   102Ki   1.4%   102Ki    .gcc_except_table
   1.2%  80.1Ki   1.1%  80.0Ki    .data.rel.ro
   0.4%  26.7Ki   0.4%  26.7Ki    .data
   0.2%  10.4Ki   0.2%  12.5Ki    [LOAD #0 [RWX]]
   0.2%  12.3Ki   0.2%  12.2Ki    .tracepoint_patch_sites
   0.2%  10.6Ki   0.1%  10.6Ki    .data.rel.local
   0.1%  6.98Ki   0.1%  3.95Ki    [47 Others]
   0.1%  5.61Ki   0.1%  5.55Ki    .data.rel
   0.0%  2.22Ki   0.0%  2.16Ki    .init_array
   0.0%  1.99Ki   0.0%       0    .shstrtab
   0.0%       0   0.0%  1.73Ki    .tbss
 100.0%  6.71Mi 100.0%  7.30Mi    TOTAL

After the patch:
    FILE SIZE        VM SIZE    
 --------------  -------------- 
  56.4%  3.42Mi  51.4%  3.42Mi    .text
   9.8%   608Ki   8.9%   608Ki    .dynstr
   0.0%       0   8.9%   604Ki    .bss
   8.7%   541Ki   7.9%   541Ki    .eh_frame
   7.3%   454Ki   6.7%   454Ki    .rodata
   5.7%   352Ki   5.2%   352Ki    .dynsym
   3.1%   192Ki   2.8%   192Ki    .percpu
   1.9%   119Ki   1.8%   119Ki    .eh_frame_hdr
   1.7%   106Ki   1.6%   106Ki    .gnu.hash
   1.5%  90.9Ki   1.3%  90.9Ki    .hash
   1.4%  90.1Ki   1.3%  90.0Ki    .gcc_except_table
   1.2%  74.6Ki   1.1%  74.5Ki    .data.rel.ro
   0.4%  26.7Ki   0.4%  26.7Ki    .data
   0.2%  11.7Ki   0.2%  13.8Ki    [LOAD #0 [RWX]]
   0.2%  12.3Ki   0.2%  12.2Ki    .tracepoint_patch_sites
   0.2%  10.6Ki   0.2%  10.6Ki    .data.rel.local
   0.1%  6.43Ki   0.1%  3.83Ki    [39 Others]
   0.1%  5.61Ki   0.1%  5.55Ki    .data.rel
   0.0%  2.20Ki   0.0%  2.13Ki    .init_array
   0.0%       0   0.0%  1.73Ki    .tbss
   0.0%  1.62Ki   0.0%       0    .shstrtab
 100.0%  6.07Mi 100.0%  6.66Mi    TOTAL

And against kernel.elf built on Ubuntu 19.04 before the patch and after:

Before the patch:
    FILE SIZE        VM SIZE    
 --------------  -------------- 
  54.7%  3.36Mi  49.9%  3.36Mi    .text
  10.5%   663Ki   9.6%   663Ki    .dynstr
   0.0%       0   8.8%   605Ki    .bss
   9.0%   567Ki   8.2%   567Ki    .eh_frame
   7.5%   471Ki   6.8%   471Ki    .rodata
   6.0%   379Ki   5.5%   379Ki    .dynsym
   3.1%   192Ki   2.8%   192Ki    .percpu
   2.0%   124Ki   1.8%   124Ki    .eh_frame_hdr
   1.8%   111Ki   1.6%   111Ki    .gnu.hash
   1.5%  95.3Ki   1.4%  95.3Ki    .hash
   1.4%  91.1Ki   1.3%  91.0Ki    .gcc_except_table
   1.2%  74.8Ki   1.1%  74.7Ki    .data.rel.ro
   0.4%  26.7Ki   0.4%  26.7Ki    .data
   0.2%  10.3Ki   0.2%  12.4Ki    [LOAD #0 [RWX]]
   0.2%  12.3Ki   0.2%  12.3Ki    .tracepoint_patch_sites
   0.2%  10.6Ki   0.2%  10.6Ki    .data.rel.local
   0.1%  6.81Ki   0.1%  3.95Ki    [44 Others]
   0.1%  5.61Ki   0.1%  5.55Ki    .data.rel
   0.0%  2.21Ki   0.0%  2.15Ki    .init_array
   0.0%  1.84Ki   0.0%       0    .shstrtab
   0.0%       0   0.0%  1.73Ki    .tbss
 100.0%  6.14Mi 100.0%  6.73Mi    TOTAL

After the patch:
    FILE SIZE        VM SIZE    
 --------------  -------------- 
  55.2%  3.25Mi  50.2%  3.25Mi    .text
   0.0%       0   9.1%   604Ki    .bss
  10.0%   601Ki   9.1%   601Ki    .dynstr
   9.0%   541Ki   8.2%   541Ki    .eh_frame
   7.7%   461Ki   7.0%   461Ki    .rodata
   5.8%   349Ki   5.3%   349Ki    .dynsym
   3.2%   192Ki   2.9%   192Ki    .percpu
   2.0%   118Ki   1.8%   118Ki    .eh_frame_hdr
   1.8%   106Ki   1.6%   106Ki    .gnu.hash
   1.5%  90.3Ki   1.4%  90.3Ki    .hash
   1.4%  85.4Ki   1.3%  85.3Ki    .gcc_except_table
   1.2%  70.6Ki   1.1%  70.5Ki    .data.rel.ro
   0.4%  26.7Ki   0.4%  26.7Ki    .data
   0.2%  12.7Ki   0.2%  14.8Ki    [LOAD #0 [RWX]]
   0.2%  12.3Ki   0.2%  12.3Ki    .tracepoint_patch_sites
   0.2%  10.6Ki   0.2%  10.6Ki    .data.rel.local
   0.1%  6.53Ki   0.1%  3.87Ki    [41 Others]
   0.1%  5.61Ki   0.1%  5.55Ki    .data.rel
   0.0%  2.20Ki   0.0%  2.13Ki    .init_array
   0.0%       0   0.0%  1.73Ki    .tbss
   0.0%  1.66Ki   0.0%       0    .shstrtab
 100.0%  5.88Mi 100.0%  6.47Mi    TOTAL

Waldek

Nadav Har'El

May 28, 2020, 2:43:39 AM
to Waldek Kozaczuk, OSv Development
On Thu, May 28, 2020 at 8:10 AM Waldek Kozaczuk <jwkoz...@gmail.com> wrote:
The question of whether to hide the C++ standard library in the kernel stands at the crossroads of solving two issues:
To recap, on one hand keeping C++ exposed, reduces the size of the image (kernel + app files) and lowers memory utilization for C++ apps as the libstdc++.so is provided.

I see you did some nice benchmarks below of how much kernel size "--whole-archive" costs us, and it's around 0.3 - 0.7 MB.
If that is the cost, then for C++ applications there is definitely a benefit in paying 0.3 - 0.7 MB in kernel size instead of adding a 2 MB libstdc++.so to the image.
On the other hand, both numbers are pretty low, and it might be beneficial to pay an extra
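The trade-off can be worked through with the sizes reported in this thread. The sketch below is only rough arithmetic: it assumes a kernel with libstdc++ hidden would be about as large as the "--no-whole-archive" kernel measured above (that patch does not actually hide any symbols yet), and that a C++ app would then ship the full 1.9M libstdc++.so. The struct and function names are made up for illustration.

```cpp
#include <string>

// All sizes are in tenths of a megabyte, copied from the figures
// reported in this thread (e.g. 62 means 6.2M).
struct Build {
    std::string name;
    int kernel_whole_archive;     // kernel.elf before the patch
    int kernel_no_whole_archive;  // kernel.elf after the patch
    int libstdcxx_so;             // libstdc++.so that a C++ app would ship
};

// Extra image size a C++ app would pay if libstdc++.so had to be added
// next to the smaller kernel.
int cxx_app_extra(const Build& b) {
    return b.kernel_no_whole_archive + b.libstdcxx_so - b.kernel_whole_archive;
}

// Image size a non-C++ app would save with the smaller kernel.
int non_cxx_app_saving(const Build& b) {
    return b.kernel_whole_archive - b.kernel_no_whole_archive;
}

const Build ubuntu_19_04{"Ubuntu 19.04, gcc 8.3", 62, 59, 19};
const Build ubuntu_20_04{"Ubuntu 20.04, gcc 9.3", 68, 61, 19};
```

Under those assumptions, a C++ app's image grows by about 1.6M (19.04) or 1.2M (20.04), while a non-C++ app's image shrinks by about 0.3M or 0.7M.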

If we stop using "--whole-archive", we can also start considering whether some specific C++ features we use cause a lot of stuff from libstdc++.a to be included, and try to avoid this big stuff.


 But then, on the other hand, it increases the size of the kernel ELF and is wasteful for non-C++ apps that do not depend on libstdc++.so and also potentially creates incompatibility issues where some C++ apps depend on a different version of the stdc++ than the one OSv kernel is linked '--whole-archive' with.

Hiding the C++ standard library, in essence, reverses the advantages and disadvantages stated above. 

The extra caveat is that many internal apps (cpiod, httpserver, etc) are C++ apps, and worse they use the C++ API to interact with OSv kernel.

The latter thing is the biggest issue, and the main reason why we put the C++ library in the kernel. Maybe we need to rethink it if there're important advantages to not having C++ exposed by the kernel, and change the very few C++ APIs we have to not be C++ :-(

The internal apps could perhaps be compiled as C++ against the static libstdc++.a (another possibility besides adding libstdc++.so to the image).


In theory, there are 3 options we might consider:
  1. Keep the C++ standard library exposed which means keep all relevant C++ symbols exported when fixing #821. Not sure how we could solve "different C++ library" incompatibility issue.
  2. Hide the C++ standard library completely (--no-whole-archive) and hide in a version script file. This would affect internal C++ apps - at least we would need to add libstdc++.so to the image but also probably change existing C++ API in OSv kernel that is used by those apps to integrate to the C API.
    • What about exceptions? Do we need to worry about them in any way as far as calls between apps and kernel goes?
 I don't think so. OSv functions should not throw exceptions to the applications that use them (these applications don't even have to use C++).
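
One common way to guarantee that no exception escapes to the application is to catch everything at the C-linkage boundary and translate it into an error code. The sketch below illustrates that pattern only; `sketch::parse_port` and `sketch_parse_port` are hypothetical names, not OSv functions, and the errno-style return values are an assumption about what such an API might choose.

```cpp
#include <cerrno>
#include <new>
#include <stdexcept>
#include <string>

namespace sketch {
// Hypothetical internal C++ routine that may throw.
int parse_port(const std::string& s) {
    int port = std::stoi(s);  // throws std::invalid_argument or std::out_of_range
    if (port < 1 || port > 65535) {
        throw std::out_of_range("port");
    }
    return port;
}
}

// Hypothetical C-linkage entry point. Every exception is caught here and
// mapped to a negative errno-style code, so callers never see a C++
// exception and do not even need to be written in C++.
extern "C" int sketch_parse_port(const char* text, int* out) noexcept {
    if (text == nullptr || out == nullptr) {
        return -EINVAL;
    }
    try {
        *out = sketch::parse_port(text);
        return 0;
    } catch (const std::invalid_argument&) {
        return -EINVAL;
    } catch (const std::out_of_range&) {
        return -ERANGE;
    } catch (const std::bad_alloc&) {
        return -ENOMEM;
    } catch (...) {
        return -EIO;
    }
}
```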