Last week I was trying to hack OSv to run on hyperkit, and I finally managed to execute the native hello world example with ROFS.
Here is the timing on hyperkit/OS X (the bootchart does not work on hyperkit because the timer is not granular enough):

OSv v0.24-516-gc872202
Hello from C code

real 0m0.075s
user 0m0.012s
sys 0m0.058s

Command to boot it (please note that I hacked the lzloader ELF to support multiboot):

hyperkit -A -m 512M \
-s 0:0,hostbridge \
-s 31,lpc \
-l com1,stdio \
-s 4,virtio-blk,test.img \
-f multiboot,lzloader.elf
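The hack boils down to making lzloader.elf Multiboot-bootable, since the -f multiboot loader looks for a Multiboot 1 header within the first 8 KB of the image. Here is a minimal sketch of such a header - illustrative only; the section name and the patch details are hypothetical, not the actual OSv change:

// Multiboot 1 header sketch: loaders scan the first 8192 bytes of the
// image for the magic value, 4-byte aligned; magic + flags + checksum
// must wrap to zero mod 2^32. ".multiboot_header" is a hypothetical
// section the linker script would place near the start of lzloader.elf.
#include <cstdint>

struct multiboot_header {
    uint32_t magic;     // 0x1BADB002 identifies a Multiboot 1 header
    uint32_t flags;     // feature request bits (none requested here)
    uint32_t checksum;  // chosen so the three fields sum to 0 mod 2^32
};

static constexpr uint32_t MB_MAGIC    = 0x1BADB002;
static constexpr uint32_t MB_FLAGS    = 0;
static constexpr uint32_t MB_CHECKSUM = 0u - (MB_MAGIC + MB_FLAGS);

__attribute__((section(".multiboot_header"), used, aligned(4)))
static const multiboot_header mb_header = {
    MB_MAGIC, MB_FLAGS, MB_CHECKSUM
};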
For comparison, here is the timing on QEMU/KVM:

ROFS mounted: 128.65ms, (+2.28ms)
Total time: 128.65ms, (+0.00ms)
Hello from C code
VFS: unmounting /dev
VFS: unmounting /proc
VFS: unmounting /
ROFS: spent 1.00 ms reading from disk
ROFS: read 21 512-byte blocks from disk
ROFS: allocated 18 512-byte blocks of cache memory
ROFS: hit ratio is 89.47%
Powering off.

real 0m1.049s
user 0m0.173s
sys 0m0.253s

booted like so:

qemu-system-x86_64 -m 2G -smp 4 \
-device virtio-blk-pci,id=blk0,bootindex=0,drive=hd0,scsi=off \
-drive file=/home/wkozaczuk/projects/osv/build/last/usr.img,if=none,id=hd0,cache=none,aio=native \
-enable-kvm -cpu host,+x2apic \
-chardev stdio,mux=on,id=stdio,signal=off \
-mon chardev=stdio,mode=readline \
-device isa-serial,chardev=stdio

In both cases I am not using networking - only the block device. BTW, I have not tested networking or SMP on hyperkit with OSv. So as you can see, OSv boots roughly ten (10) times faster on hyperkit than on QEMU/KVM on the same hardware. I am not sure if my results are representative, but if they are, it would mean that QEMU is probably the culprit. Please see my questions/considerations toward the end of the email.

Anyway, let me give you some background. What is hyperkit? Hyperkit (https://github.com/moby/hyperkit) is a fork by Docker of xhyve (https://github.com/mist64/xhyve), which itself is a port of bhyve (https://www.freebsd.org/doc/handbook/virtualization-host-bhyve.html), the hypervisor on FreeBSD. The bhyve architecture is similar to that of KVM/QEMU, but bhyve's QEMU-equivalent part is much lighter and simpler:

"The bhyve BSD-licensed hypervisor became part of the base system with FreeBSD 10.0-RELEASE. This hypervisor supports a number of guests, including FreeBSD, OpenBSD, and many Linux® distributions. By default, bhyve provides access to a serial console and does not emulate a graphical console. Virtualization offload features of newer CPUs are used to avoid the legacy methods of translating instructions and manually managing memory mappings.

The bhyve design requires a processor that supports Intel® Extended Page Tables (EPT) or AMD® Rapid Virtualization Indexing (RVI) or Nested Page Tables (NPT). Hosting Linux® guests or FreeBSD guests with more than one vCPU requires VMX unrestricted mode support (UG). Most newer processors, specifically the Intel® Core™ i3/i5/i7 and Intel® Xeon™ E3/E5/E7, support these features. UG support was introduced with Intel's Westmere micro-architecture. For a complete list of Intel® processors that support EPT, refer to http://ark.intel.com/search/advanced?s=t&ExtendedPageTables=true. RVI is found on the third generation and later of the AMD Opteron™ (Barcelona) processors."
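As an aside, the VMX part of those requirements is easy to check from user space via CPUID (leaf 1, ECX bit 5); EPT and unrestricted-guest support are reported in IA32_VMX_* MSRs that user space cannot read, so on Linux one usually looks for the "ept" flag in /proc/cpuinfo instead. A small sketch:

// Host-side sanity check for VMX (sketch; compile with g++ on x86).
// EPT/UG capability lives in privileged IA32_VMX_* MSRs, so only the
// basic VMX bit is visible from here.
#include <cpuid.h>
#include <cstdio>

int main()
{
    unsigned eax, ebx, ecx, edx;
    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx)) {
        std::printf("VMX supported: %s\n", (ecx & (1u << 5)) ? "yes" : "no");
    }
    return 0;
}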
Hyperkit/xhyve is a port of bhyve, but it targets Apple OS X as the host system and, instead of the FreeBSD vmm kernel module, uses the Apple Hypervisor framework (https://developer.apple.com/documentation/hypervisor). Docker, I think, forked xhyve to create hyperkit in order to provide a lighter alternative for running Docker containers on Linux on a Mac. So in essence, hyperkit is the component of Docker for Mac, as opposed to Docker Machine/Toolbox (which is based on VirtualBox). Please see the details here - https://docs.docker.com/docker-for-mac/docker-toolbox/.

How does this apply to OSv? It only applies if you want to run OSv on a Mac. Right now the only choices are QEMU (dog slow because there is no KVM) or VirtualBox (pretty fast once OSv is up, but it takes a long time to boot and has other configuration quirks). Based on my experiments, hyperkit becomes a very compelling new alternative.
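To give a sense of why xhyve/hyperkit can stay so small: Hypervisor.framework reduces the whole VMX entry/exit machinery to a handful of user-space calls. Below is a rough sketch of what a vCPU loop looks like - my reading of the framework's C API, not hyperkit's actual code, with guest setup and error handling elided (link with -framework Hypervisor):

// Rough sketch of a Hypervisor.framework vCPU loop on x86 Macs.
// Not hyperkit's code; all guest state setup and error checks elided.
#include <Hypervisor/hv.h>
#include <Hypervisor/hv_vmx.h>
#include <cstdint>
#include <cstdlib>

int main()
{
    hv_vm_create(HV_VM_DEFAULT);               // one VM per process

    const size_t ram_size = 64 * 1024 * 1024;
    void* ram = valloc(ram_size);              // page-aligned host buffer
    hv_vm_map(ram, 0, ram_size,
              HV_MEMORY_READ | HV_MEMORY_WRITE | HV_MEMORY_EXEC);

    hv_vcpuid_t vcpu;
    hv_vcpu_create(&vcpu, HV_VCPU_DEFAULT);
    // ... load the guest image into ram and set up registers/VMCS here ...

    for (;;) {
        hv_vcpu_run(vcpu);                     // run the guest until a VM exit
        uint64_t reason;
        hv_vmx_vcpu_read_vmcs(vcpu, VMCS_RO_EXIT_REASON, &reason);
        // a real monitor dispatches here: PIO/MMIO emulation, HLT, ...
        break;
    }

    hv_vcpu_destroy(vcpu);
    hv_vm_destroy();
    return 0;
}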
Relatedly, you may be aware of uKVM (https://github.com/Solo5/solo5) - a very light hypervisor for running clean-slate unikernels like MirageOS or IncludeOS. There is an OSv issue - https://github.com/cloudius-systems/osv/issues/886 - and a corresponding one on uKVM - https://github.com/Solo5/solo5/issues/249 - to track what it would take to boot OSv on it. It turns out (read the issues for details) that in its current form uKVM is too minimalistic for OSv. For example, there are no interrupts. The experiment with hyperkit made me think that it would be nice if there were an alternative to QEMU on Linux - simpler and lighter than QEMU but not as minimalistic as uKVM - something equivalent to bhyve for Linux.
Waldek

PS. I have created an issue - https://github.com/cloudius-systems/osv/issues/948 - to track what it would take to make OSv run on hyperkit.
On Tue, Apr 10, 2018 at 10:29 PM, Waldek Kozaczuk <jwkoz...@gmail.com> wrote:

> Last week I was trying to hack OSv to run on hyperkit, and I finally
> managed to execute the native hello world example with ROFS.

Excellent :-)
> Here is the timing on hyperkit/OS X (the bootchart does not work on
> hyperkit because the timer is not granular enough):
>
> OSv v0.24-516-gc872202
> Hello from C code
>
> real 0m0.075s

Impressive :-)
> user 0m0.012s
> sys 0m0.058s
>
> Command to boot it (please note that I hacked the lzloader ELF to support
> multiboot):

What kind of hack is this?
> How does this apply to OSv? It only applies if you want to run OSv on a
> Mac. Right now the only choices are QEMU (dog slow because there is no
> KVM) or VirtualBox (pretty fast once OSv is up, but it takes a long time
> to boot and has other configuration quirks). Based on my experiments,
> hyperkit becomes a very compelling new alternative.

Yes, it definitely looks like it. Probably super-important for people with Macs (i.e., not me :-) ).
> The experiment with hyperkit made me think that it would be nice if there
> were an alternative to QEMU on Linux - simpler and lighter than QEMU but
> not as minimalistic as uKVM - something equivalent to bhyve for Linux.

I wonder if kvmtool is such a thing? But I never tried it myself.
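For reference, kvmtool's front end is the lkvm binary; from memory of its README (untested with OSv, so treat the flags as approximate), a typical invocation looks like:

lkvm run -c 2 -m 1024 -k bzImage -d disk.img

It loads a Linux-style kernel image directly instead of going through a BIOS, which is a large part of why it stays so small - and presumably also why booting OSv on it would need loader work similar to the multiboot hack above.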
--
Asias
OSv v0.24-516-gc872202
console_multiplexer::console_multiplexer()
acpi::early_init()
interrupt_descriptor_table initialized
### apic_driver: Read base 00000000fee00900
### apic_driver: _apic_base 00000000fee00000
Hello from C code
real 0m0.165s
user 0m0.027s
sys 0m0.141s
OSv v0.24-519-g94a7640
console_multiplexer::console_multiplexer()
acpi::early_init()
interrupt_descriptor_table initialized
### apic_driver:read_base() - read base as : 00000000fee00900
### apic_driver:read_base() - saved base as : 00000000fee00000
### apic_driver:enable() - enabling with base as : 00000000fee00900
1 CPUs detected
Firmware vendor: BHYVE
bsd: initializing - done
VFS: mounting ramfs at /
VFS: mounting devfs at /dev
net: initializing - done
---> blk::blk - enabled MSI 1
device_register(): registering device vblk0
device_register(): registering device vblk0.1
virtio-blk: Add blk device instances 0 as vblk0, devsize=40128000
device_register(): registering device console
device_register(): registering device null
random: intel drng, rdrand registered as a source.
device_register(): registering device random
device_register(): registering device urandom
random: <Software, Yarrow> initialized
VFS: unmounting /dev
VFS: mounting rofs at /rofs
[rofs] device vblk0.1 opened!
[rofs] read superblock!
[rofs] read structure blocks!
VFS: mounting devfs at /dev
VFS: mounting procfs at /proc
VFS: mounting ramfs at /tmp
java.so: Starting JVM app using: io/osv/nonisolated/RunNonIsolatedJvmApp
java.so: Setting Java system classloader to NonIsolatingOsvSystemClassLoader
random: device unblocked.
Hello, World!
VFS: unmounting /dev
VFS: unmounting /proc
VFS: unmounting /
ROFS: spent 42.90 ms reading from disk
ROFS: read 35323 512-byte blocks from disk
ROFS: allocated 35568 512-byte blocks of cache memory
ROFS: hit ratio is 83.34%
Powering off.
real 0m0.338s
user 0m0.035s
sys 0m0.298s
OSv v0.24-520-gf577249
console_multiplexer::console_multiplexer()
acpi::early_init()
smp_init()
interrupt_descriptor_table initialized
### Before create_apic_driver
### apic_driver:read_base() - read base as : 00000000fee00900
### apic_driver:read_base() - saved base as : 00000000fee00000
### xapic:enable() - enabling with base as : 00000000fee00900
1 CPUs detected
Firmware vendor: BHYVE
smp_launch() -> DONE
bsd: initializing - done
VFS: mounting ramfs at /
VFS: mounting devfs at /dev
net: initializing - done
eth0: ethernet address: d2:e2:e3:b0:2e:38
---> blk::blk - enabled MSI 1
device_register(): registering device vblk0
device_register(): registering device vblk0.1
virtio-blk: Add blk device instances 0 as vblk0, devsize=17775616
device_register(): registering device console
device_register(): registering device null
random: intel drng, rdrand registered as a source.
device_register(): registering device random
device_register(): registering device urandom
random: <Software, Yarrow> initialized
VFS: unmounting /dev
VFS: mounting rofs at /rofs
[rofs] device vblk0.1 opened!
[rofs] read superblock!
[rofs] read structure blocks!
VFS: mounting devfs at /dev
VFS: mounting procfs at /proc
VFS: mounting ramfs at /tmp
[I/27 dhcp]: Broadcasting DHCPDISCOVER message with xid: [1012600569]
[I/27 dhcp]: Waiting for IP...
random: device unblocked.
[I/27 dhcp]: Broadcasting DHCPDISCOVER message with xid: [2082401196]
[I/27 dhcp]: Waiting for IP...
[I/33 dhcp]: Received DHCPOFFER message from DHCP server: 192.168.64.1 regarding offerred IP address: 192.168.64.4
[I/33 dhcp]: Broadcasting DHCPREQUEST message with xid: [2082401196] to SELECT offered IP: 192.168.64.4
[I/33 dhcp]: Received DHCPACK message from DHCP server: 192.168.64.1 regarding offerred IP address: 192.168.64.4
[I/33 dhcp]: Server acknowledged IP 192.168.64.4 for interface eth0 with time to lease in seconds: 85536
eth0: 192.168.64.4
[I/33 dhcp]: Configuring eth0: ip 192.168.64.4 subnet mask 255.255.255.0 gateway 192.168.64.1 MTU 1500
httpserver: loaded plugin from path: /usr/mgmt/plugins/libhttpserver-api_env.so
httpserver: loaded plugin from path: /usr/mgmt/plugins/libhttpserver-api_trace.so
httpserver: loaded plugin from path: /usr/mgmt/plugins/libhttpserver-api_api.so
could not load libicui18n.so.55
could not load libicuuc.so.55
httpserver: loaded plugin from path: /usr/mgmt/plugins/libhttpserver-api_file.so
httpserver: loaded plugin from path: /usr/mgmt/plugins/libhttpserver-api_network.so
httpserver: loaded plugin from path: /usr/mgmt/plugins/libhttpserver-api_os.so
httpserver: loaded plugin from path: /usr/mgmt/plugins/libhttpserver-api_app.so
httpserver: loaded plugin from path: /usr/mgmt/plugins/libhttpserver-api_hardware.so
httpserver: loaded plugin from path: /usr/mgmt/plugins/libhttpserver-api_fs.so
OSv v0.24-519-g94a7640
console_multiplexer::console_multiplexer()
acpi::early_init()
smp_init()
interrupt_descriptor_table initialized
### Before create_apic_driver
### apic_driver:read_base() - read base as : 00000000fee00900
### apic_driver:read_base() - saved base as : 00000000fee00000
### xapic:enable() - enabling with base as : 00000000fee00900
2 CPUs detected
Firmware vendor: BHYVE
### xapic:enable() - enabling with base as : 00000000fee00800
smp_launch() -> DONE
bsd: initializing - done
VFS: mounting ramfs at /
VFS: mounting devfs at /dev
net: initializing - done
---> blk::blk - enabled MSI 1
device_register(): registering device vblk0
device_register(): registering device vblk0.1
virtio-blk: Add blk device instances 0 as vblk0, devsize=8520192
device_register(): registering device console
device_register(): registering device null
random: intel drng, rdrand registered as a source.
device_register(): registering device random
device_register(): registering device urandom
random: <Software, Yarrow> initialized
VFS: unmounting /dev
VFS: mounting rofs at /rofs
[rofs] device vblk0.1 opened!
[rofs] read superblock!
[rofs] read structure blocks!
VFS: mounting devfs at /dev
VFS: mounting procfs at /proc
VFS: mounting ramfs at /tmp
Hello from C code
VFS: unmounting /dev
VFS: unmounting /proc
VFS: unmounting /
ROFS: spent 0.14 ms reading from disk
ROFS: read 21 512-byte blocks from disk
ROFS: allocated 18 512-byte blocks of cache memory
ROFS: hit ratio is 89.47%
Powering off.
real 0m0.085s
user 0m0.013s
sys 0m0.068s
To make SMP work I had to hack OSv to pass 00000000fee00900 when enabling the APIC for the first CPU and 00000000fee00800 for all the other CPUs. It looks like (based on the source code of hyperkit) it requires that the APIC register memory area base address passed in when enabling be the same as the value that was read. But why is it different for each CPU? It looks like the QEMU/KVM, VMware, and Xen hypervisors OSv runs on do not have this requirement. Unfortunately I am not very familiar with the APIC, so if anybody can enlighten me I would appreciate it.

I was also wondering if anybody knows the reason behind this logic:

void apic_driver::read_base()
{
    static constexpr u64 base_addr_mask = 0xFFFFFF000;
    _apic_base = rdmsr(msr::IA32_APIC_BASE) & base_addr_mask;
}

Why are we masking with 0xFFFFFF000? Based on the logs from OSv when running on hyperkit, this logic effectively rewrites the original APIC base address to 00000000fee00000:

### apic_driver:read_base() - read base as : 00000000fee00900
### apic_driver:read_base() - saved base as : 00000000fee00000
### xapic:enable() - enabling with base as : 00000000fee00900

So in the case of hyperkit, when we pass 00000000fee00000 instead of 00000000fee00900 (which is what hyperkit returned in read_base()), it rejects it. However, the same logic works just fine with other hypervisors.
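For what it's worth, decoding the value against the IA32_APIC_BASE layout in the Intel SDM suggests where the per-CPU difference comes from: the low bits are flags, not part of the address. A sketch of the bit meanings (the helper is hypothetical, not OSv code; u64 mirrors OSv's typedef):

// IA32_APIC_BASE per the Intel SDM:
//   bit 8        BSP flag - set only on the bootstrap processor
//   bit 10       x2APIC mode enable
//   bit 11       xAPIC global enable
//   bits 12..35  APIC base physical address (hence the 0xFFFFFF000 mask)
// So 0xfee00900 = 0xfee00000 | enable | BSP (the first CPU) and
// 0xfee00800 = 0xfee00000 | enable (every other CPU): the base address
// is identical, only the BSP flag differs. Hyperkit apparently insists
// on seeing those flag bits again on the enabling write, which the
// masking in read_base() throws away.
#include <cstdint>
using u64 = uint64_t;  // mirrors OSv's u64 typedef

static constexpr u64 apic_base_bsp    = 1ull << 8;
static constexpr u64 apic_base_x2apic = 1ull << 10;
static constexpr u64 apic_base_enable = 1ull << 11;
static constexpr u64 apic_base_addr   = 0xFFFFFF000ull;

// Hypothetical helper: keep the flag bits when writing the MSR back
// instead of passing only the masked base.
static inline u64 apic_enable_msr_value(u64 msr_as_read)
{
    return msr_as_read &
           (apic_base_addr | apic_base_bsp | apic_base_x2apic | apic_base_enable);
}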