Building customized kernel


Waldek Kozaczuk

Jun 11, 2020, 12:55:57 AM
to OSv Development
Imagine we wanted to allow building a kernel that is customized to a specific app - in other words, one that provides more or less only the functionality needed by that app and nothing more. Please note that one of the previous conversations was about creating a "microVM" version of kernel.elf, which is also customized, but to a specific type of hypervisor. So how could we achieve this?

It turns out it is not that difficult as long as we rely on a version script and on the linker garbage-collecting all unneeded code and data. All we need to do is to manufacture a custom version script that exports only the symbols needed by the given app. For example, a native hello world would have a version script like this:

{
  global:
                 __cxa_finalize;
                 __gmon_start__;
                 _ITM_deregisterTMCloneTable;
                 _ITM_registerTMCloneTable;
                 __libc_start_main;
                 puts;

  local:
    *;
};

I have hand-crafted it, but generating it should be quite easy to automate, starting from the output of:

nm -uD apps/native-example/hello
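A minimal sketch of how that automation might look, assuming the only input is the `nm -uD` output of the app. The function name `gen_version_script` and the output file name used below are made up for illustration and are not part of the OSv build:

```shell
# gen_version_script (hypothetical helper): reads `nm -uD` output on stdin
# and writes a linker version script on stdout.
gen_version_script() {
    printf '{\n  global:\n'
    # The symbol name is the last field of each line; strip any @VERSION
    # suffix that newer binutils print, then sort and de-duplicate.
    awk 'NF >= 2 { sub(/@.*/, "", $NF); print $NF }' \
        | LC_ALL=C sort -u \
        | sed 's/^/    /; s/$/;/'
    printf '\n  local:\n    *;\n};\n'
}
```

It could then be used as `nm -uD apps/native-example/hello | gen_version_script > hello.version`, where `hello.version` is an arbitrary file name.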

After re-linking with this specific version script we arrive at a kernel.elf that is only 2.7M in size (2728432 bytes) and exposes only 3 symbols:

nm -CD build/release/kernel.elf 
00000000402233b0 T __cxa_finalize
00000000403211c0 T __libc_start_main
0000000040357480 T puts

The size difference is around 10%, which is not that dramatic, but one advantage is that the kernel exposes only the symbols needed by the app, so it is somewhat more secure.

I have also experimented with less trivial apps like redis and tomcat (java), and even then the resulting kernel.elf was still around 2.7M (2826976 bytes) with around 400 symbols exposed. Java depends on many shared libraries, so I had to sum the sets of needed symbols across all of them and intersect the result with the symbols from the generic kernel version script.
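The sum-and-intersect step could be sketched roughly like this. Both helper names are hypothetical, and the sketch assumes the generic kernel exports are already available as a plain sorted list with one symbol name per line (how that list is extracted from the generic version script is not covered here):

```shell
# needed_symbols (hypothetical helper): print the sorted union of the
# undefined dynamic symbols of all ELF files given as arguments.
needed_symbols() {
    for f in "$@"; do nm -uD "$f"; done \
        | awk 'NF >= 2 { sub(/@.*/, "", $NF); print $NF }' \
        | LC_ALL=C sort -u
}

# intersect_sorted (hypothetical helper): print the lines common to two
# files that are both sorted in the C locale.
intersect_sorted() {
    LC_ALL=C comm -12 "$1" "$2"
}
```

For example, `needed_symbols` could be run over the app executable and every shared library it loads, with its output saved to a file and passed to `intersect_sorted` together with the list of generic kernel exports.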

Please note that the recipe sketched above to produce a customized kernel does not require re-compiling any kernel sources, only re-linking the final artifact - kernel.elf - so that it provides only the needed symbols and drops any unneeded code and data. This means it is pretty fast (unless you start using lto, about which I will write later).

Waldek

Dor Laor

Jun 11, 2020, 1:21:32 AM
to Waldek Kozaczuk, OSv Development
Waldek, can you please describe what's your motivation in shrinking
the size of the kernel? Compared for example to modern RAM and also
to the JVM that many times will be there, the current size is already
'small enough', isn't it?

--
You received this message because you are subscribed to the Google Groups "OSv Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email to osv-dev+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/osv-dev/da86ac70-a301-47e0-a870-1b4a1b3b4913n%40googlegroups.com.

Pekka Enberg

Jun 11, 2020, 2:25:06 AM
to Dor Laor, Waldek Kozaczuk, OSv Development
Hi Dor,

On Thu, Jun 11, 2020 at 8:21 AM Dor Laor <d...@scylladb.com> wrote:
Waldek, can you please describe what's your motivation in shrinking
the size of the kernel? Compared for example to modern RAM and also
to the JVM that many times will be there, the current size is already
'small enough', isn't it?

Not necessarily if the lifetime of a VM is in the order of, say, hundreds of milliseconds.

My understanding is that these extreme optimizations to reduce size are motivated by the emergence of light-weight VMs, which allow cloud providers to launch VMs on-demand for a group of service requests. This allows them to more efficiently utilize the hardware because an idle VM still consumes some resources. And in any case, smaller image size anyway means shorter boot time and more packing of VMs per node.

- Pekka

Waldek Kozaczuk

Jun 11, 2020, 9:32:59 AM
to OSv Development
Exactly. Density is the main motivation. But also somewhat better security - the app only has access (in terms of the dynamic linker at least) to a subset of libc, so at least in theory it should not be able to do anything other than what is advertised by its ELF.

With a kernel of size 2.7M we can run the hello app on OSv with only 11M of memory (~5MB less) and boot 1-2ms faster - now I pretty consistently get boot times of ~4ms on firecracker (20-30% faster).

Another advantage is that there are fewer symbols in the kernel for the dynamic linker to search when resolving. I know it uses a hashmap that makes things pretty fast already, but still, the smaller the hashmap and its buckets, the faster the lookup should be.

Finally, whatever the mechanism for building custom kernels ends up being, it should also account for usage of the syscall instruction and properly eliminate unused stuff from linux.cc so that the linker can garbage collect even better.

- Pekka

Dor Laor

Jun 11, 2020, 2:04:02 PM
to Waldek Kozaczuk, OSv Development
On Thu, Jun 11, 2020 at 6:33 AM Waldek Kozaczuk <jwkoz...@gmail.com> wrote:


On Thursday, June 11, 2020 at 2:25:06 AM UTC-4 Pekka Enberg wrote:
Hi Dor,

On Thu, Jun 11, 2020 at 8:21 AM Dor Laor <d...@scylladb.com> wrote:
Waldek, can you please describe what's your motivation in shrinking
the size of the kernel? Compared for example to modern RAM and also
to the JVM that many times will be there, the current size is already
'small enough', isn't it?

Not necessarily if the lifetime of a VM is in the order of, say, hundreds of milliseconds.

My understanding is that these extreme optimizations to reduce size are motivated by the emergence of light-weight VMs, which allow cloud providers to launch VMs on-demand for a group of service requests. This allows them to more efficiently utilize the hardware because an idle VM still consumes some resources. And in any case, smaller image size anyway means shorter boot time and more packing of VMs per node.
Exactly. The density is the main motivation. But also somewhat better security - the app has only access to (in terms of the dynamic linker at least) to the subset of libc so at least in theory it should not be able to do anything else than advertised per its ELF.  

With the kernel of size 2.7M we can run hello app on OSv with 11M memory only (~5MB less) and boot 1-2ms faster - now I pretty consistently get boot times of ~4ms on firecracker (20-30% faster). 

4ms is pretty amazing, it makes OSv+kvm/firecracker a good candidate for lambdas.
For heavier weight frameworks like JVMs, the best would be to preboot them and
load the app dynamically.
 

Another advantage is that there are fewer symbols in the kernel the dynamic linker has to search to resolve. I know it uses a hashmap that makes things pretty fast already but still smaller the hashmap with smaller buckets the faster lookup should be. 

Finally, whatever mechanism to build custom kernels is, it should also account for the usage of syscall instruction and properly eliminate stuff from linux.cc so that the linker can garbage collect even better.

- Pekka
