Hi all – I'd call Geraldo's idea exogenous tree-shaking, where the code to shake out would not be application code but external libraries, kernel modules, APIs, syscalls, etc. This sounds good if OSv supports it – that is, if it's modular enough along the right boundaries for that kind of tree-shaking.
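At the toolchain level there's already a crude analog of this for statically linked code – section-level garbage collection – though it only prunes at function/data granularity, not whole subsystems. A rough sketch with the GNU toolchain (file names are just placeholders):

  gcc -O2 -ffunction-sections -fdata-sections -c app.c -o app.o
  gcc -O2 app.o -Wl,--gc-sections -o app   # linker drops any unreferenced function/data sections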
Nadav asked: "The biggest question I have is what sort of applications will truly see a big gain from the smaller kernel."
Good question, with lots of possible answers. Using conventional approaches, I predict you'll see the biggest gains by applying link-time optimization (-flto) and profile-guided optimization (PGO) to the remaining, smaller kernel. I searched the enormous OSv root makefile, and LTO is never used. In real-world applications, some of the biggest wins come not from -O2 or -O3 but from LTO and PGO.
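For reference, the two-pass LTO + PGO workflow with GCC looks roughly like this – a sketch, not OSv's actual build; file names are placeholders:

  gcc -O2 -flto -fprofile-generate -c module.c -o module.o
  gcc -O2 -flto -fprofile-generate module.o -o instrumented
  ./instrumented      # run a representative workload; writes .gcda profile data
  gcc -O2 -flto -fprofile-use -c module.c -o module.o
  gcc -O2 -flto -fprofile-use module.o -o optimized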
But there's an interesting approach that would bring OSv and the application together in a novel way: the ALLVM project:
An ‘ALLVM system’ is one in which all software components — except a small set needed for bootstrapping — are represented in a virtual instruction set instead of native machine code. The goal of the approach is to enable sophisticated compiler analyses and transformations to be applied across arbitrary software boundaries — not just caller-callee boundaries analyzed using traditional interprocedural techniques, but also several others: between applications and third-party libraries; applications and the underlying operating system; and between communicating processes in a distributed system.
Many software components already ship as virtual instruction sets (loosely defined as “not a native hardware instruction set”), including software in managed languages like Java, C# and Scala; scripting languages like Python and Javascript; and GPGPU code in languages like CUDA and OpenCL. The major change ALLVM enables is for statically compiled languages like C, C++, Fortran, OCaml, Swift, etc. For software written in these languages, we represent and ship code using the LLVM Virtual Instruction Set (see http://llvm.org), previously developed in our research group and now widely used in production systems, including MacOS, iOS, and FreeBSD. LLVM already provides some of the capabilities required for an ALLVM system, including the ability to ship software in LLVM bitcode form and the ability to perform install-time and just-in-time compilation.
The key difference between LLVM and ALLVM is that LLVM enables individual software components to be analyzed and optimized throughout their lifetime (“lifelong compilation”) whereas ALLVM enables all the software on a system to be analyzed and optimized together, throughout the lifetime of the software (“system-wide, lifelong compilation”). Several research projects within the ALLVM umbrella are exploring the performance, reliability, security and software engineering benefits of the ALLVM approach.
This sounds cool as hell – if it works. I haven't had time to explore how experimental or usable it is right now, but it feels inevitable to me. It makes too much sense to fail.
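I haven't dug into ALLVM's actual tooling, but the stock LLVM toolchain already sketches the basic idea: keep everything as bitcode until the very end so optimization can cross the usual app/library boundaries. Something like this (file names are placeholders):

  clang -O2 -emit-llvm -c app.c -o app.bc
  clang -O2 -emit-llvm -c somelib.c -o somelib.bc
  llvm-link app.bc somelib.bc -o whole.bc   # one bitcode module spanning app + library
  opt -O3 whole.bc -o whole.opt.bc          # whole-program analysis and optimization
  clang whole.opt.bc -o app                 # lower to native code only at the end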
Another observation: when OSv is built/installed, you take the standard *nix approach of installing countless app and library binaries that were compiled by random strangers using extremely weak compiler optimization settings. Like this line:
apt-get install build-essential libboost-all-dev genromfs autoconf libtool openjdk-7-jdk ant qemu-utils maven libmaven-shade-plugin-java python-dpkt tcpdump gdb qemu-system-x86 gawk gnutls-bin openssl python-requests lib32stdc++-4.9-dev p11-kit
My big complaint about Unix/Linux culture is that people don't pay nearly enough attention to the compiler, and to all the things modern compilers can do for them. Behind every one of the packages above is a vanilla makefile that doesn't use LTO, fast math, or other major optimizations, and that treats something like a Pentium 4 as the baseline hardware and instruction set (in many cases because it doesn't even stipulate a -march). Beyond that, there are other compilers in the world – you don't have to use gcc. For example, scattered evidence suggests the Intel compiler (Parallel Studio) is better than gcc 7.x. clang might be better now too, though past gcc vs. clang benchmarks were somewhat murky. And MS Visual Studio 2017 might build better code than gcc for both Windows servers and Linux (it might still lean on clang for Linux, or on the Intel compiler – I'm not sure).
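Concretely, the gap I'm describing is just a few flags – illustrative values, assuming you know the deployment hardware is Haswell or newer:

  CFLAGS="-O2"                                        # typical distro package: generic x86-64 baseline
  CFLAGS="-O3 -flto -march=haswell -mtune=haswell"    # tuned for a known fleet
  CFLAGS="-O3 -flto -march=native"                    # when building on the machine it will run on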
For both optimizations and modularity, you might want to take a look at Intel's Clear Linux. Note also that compiling the OS yourself, or at least the kernel, is somewhat more common in the FreeBSD community than in Linux land – there might be something useful in what they're doing beyond pruning unneeded drivers, so it could be worth a look.
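For instance, FreeBSD users will often set something like the following in /etc/make.conf before rebuilding world or the kernel (I'm quoting from memory – check /usr/share/examples/etc/make.conf for the accepted values):

  CPUTYPE?=haswell        # build the base system for a specific CPU family
  CFLAGS+= -O2 -pipe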
I'm pretty sure we're giving away more than 10 percent of performance with the status quo of vanilla makefiles and unused compiler optimization flags. At this point, people shouldn't even be talking about gcc 4.x or 5.x. We need to move forward, and I think there's a lot of performance headroom left for OSv right now with gcc 7.3, LTO, PGO, etc. Ideally, a unikernel should support exogenous tree-shaking of the sort Geraldo proposed, where the app gets only the libraries and calls it needs. (I think the Microsoft unikernels or library OSes were like this.) Then we should be able to optimize the hell out of such app/kernel hybrids with modern compilers, -march settings that assume Haswell or newer, and boundary-busting bytecode like the ALLVM project.
Speaking of static analysis, why did OSv stop using the free Coverity Scan service?

It would be handy to be able to break out different profiles and subsets of OSv and submit them to Coverity separately. In principle, the application/kernel hybrids that y'all are talking about could be forked into different projects and also submitted to Coverity Scan.
Cheers,
Joe Duarte, PhD
Phoenix, AZ