Unusable disk performance inside VMs


Matthew Mosesohn

Jan 9, 2019, 10:46:01 AM
to kubevirt-dev
I'm trying to use KubeVirt to do Kubernetes-in-Kubernetes deployments as part of a CI system. The k8s version is v1.13.1 and KubeVirt is v0.11.0. The hosts have SSD disks and the kube nodes have no workloads on them except a single test VM. Using dd on the host I see ~600 MB/s write performance, but inside a VM I only get ~10 MB/s. Standing up k8s in a VM is not working well because the disk is too slow for kube-apiserver to respond quickly enough to health checks from kubelet.

I asked on IRC and was directed here. Here is the instance definition and the qemu command that was run on the system:
- apiVersion: kubevirt.io/v1alpha2
  kind: VirtualMachine
  metadata:
    creationTimestamp: "2019-01-09T12:13:34Z"
    generation: 3
    name: instance-0
    namespace: 42700454-143441269
    resourceVersion: "20285"
    selfLink: /apis/kubevirt.io/v1alpha2/namespaces/42700454-143441269/virtualmachines/instance-0
    uid: fc4a07d3-1407-11e9-ae0b-0cc47a2059ee
  spec:
    running: true
    template:
      metadata:
        creationTimestamp: null
        labels:
          kubevirt.io/domain: 42700454-143441269
          kubevirt.io/size: small
      spec:
        domain:
          cpu:
            cores: 2
          devices:
            disks:
            - disk:
                bus: virtio
              name: containerdisk
              volumeName: containervolume
            - disk:
                bus: virtio
              name: cloudinitdisk
              volumeName: cloudinitvolume
            interfaces:
            - bridge: {}
              name: default
          machine:
            type: ""
          resources:
            requests:
              memory: 4Gi
        networks:
        - name: default
          pod: {}
        terminationGracePeriodSeconds: 0
        volumes:
        - containerDisk:
            image: quay.io/kubespray/vm-ubuntu-1604
          name: containervolume
        - cloudInitNoCloud:
            userDataBase64: (REDACTED)
          name: cloudinitvolume
  status:
    created: true
    ready: true

/usr/bin/qemu-system-x86_64 -name guest=42700454-143441269_instance-0,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-1-42700454-143441269_i/master-key.aes -machine pc-q35-3.0,accel=tcg,usb=off,dump-guest-core=off -cpu EPYC,acpi=on,ss=on,hypervisor=on,erms=on,mpx=on,pcommit=on,clwb=on,pku=on,la57=on,3dnowext=on,3dnow=on,vme=off,fma=off,avx=off,f16c=off,rdrand=off,avx2=off,rdseed=off,sha-ni=off,xsavec=off,fxsr_opt=off,misalignsse=off,3dnowprefetch=off,osvw=off,topoext=off -m 4096 -realtime mlock=off -smp 2,sockets=1,cores=2,threads=1 -object iothread,id=iothread1 -uuid 754ab090-2e93-5ebb-a827-d658e196366b -no-user-config -nodefaults -chardev socket,id=charmonitor,fd=22,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -boot strict=on -device pcie-root-port,port=0x10,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x2 -device pcie-root-port,port=0x11,chassis=2,id=pci.2,bus=pcie.0,addr=0x2.0x1 -device pcie-root-port,port=0x12,chassis=3,id=pci.3,bus=pcie.0,addr=0x2.0x2 -device pcie-root-port,port=0x13,chassis=4,id=pci.4,bus=pcie.0,addr=0x2.0x3 -device pcie-root-port,port=0x14,chassis=5,id=pci.5,bus=pcie.0,addr=0x2.0x4 -device virtio-serial-pci,id=virtio-serial0,bus=pci.2,addr=0x0 -drive file=/var/run/kubevirt-ephemeral-disks/container-disk-data/42700454-143441269/instance-0/disk_containervolume/disk-image.qcow2,format=qcow2,if=none,id=drive-ua-containerdisk,cache=none -device virtio-blk-pci,scsi=off,bus=pci.3,addr=0x0,drive=drive-ua-containerdisk,id=ua-containerdisk,bootindex=1,write-cache=on -drive file=/var/run/kubevirt-ephemeral-disks/cloud-init-data/42700454-143441269/instance-0/noCloud.iso,format=raw,if=none,id=drive-ua-cloudinitdisk,cache=none -device virtio-blk-pci,scsi=off,bus=pci.4,addr=0x0,drive=drive-ua-cloudinitdisk,id=ua-cloudinitdisk,write-cache=on -netdev tap,fd=24,id=hostua-default -device virtio-net-pci,netdev=hostua-default,id=ua-default,mac=2a:40:d5:6a:c4:59,bus=pci.1,addr=0x0 -chardev socket,id=charserial0,fd=25,server,nowait -device isa-serial,chardev=charserial0,id=serial0 -chardev socket,id=charchannel0,fd=26,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 -vnc vnc=unix:/var/run/kubevirt-private/fc4d7ef0-1407-11e9-ae0b-0cc47a2059ee/virt-vnc -device VGA,id=video0,vgamem_mb=16,bus=pcie.0,addr=0x1 -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny -msg timestamp=on

I am prepared to try setting up local-volume-provisioner to work around the issue, but I don't think it should be necessary. Pods using emptyDir get decent disk performance, so I suspect it's a problem with the instance definition.

Roman Mohr

Jan 9, 2019, 10:51:23 AM
to Matthew Mosesohn, kubevirt-dev
Hi Matthew,

On Wed, Jan 9, 2019 at 4:46 PM Matthew Mosesohn <matthew....@gmail.com> wrote:
I'm trying to use KubeVirt to do Kubernetes-in-Kubernetes deployments as part of a CI system. The k8s version is v1.13.1 and KubeVirt is v0.11.0. The hosts have SSD disks and the kube nodes have no workloads on them except a single test VM. Using dd on the host I see ~600 MB/s write performance, but inside a VM I only get ~10 MB/s. Standing up k8s in a VM is not working well because the disk is too slow for kube-apiserver to respond quickly enough to health checks from kubelet.


It looks like you are currently writing your data directly into the containerDisk (so into the overlay fs of the container runtime). If you have fast SSDs on the nodes where you run the VMs and want to use them directly, you can try an `emptyDisk` [1] or a `hostDisk` [2] volume.
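
For illustration, a minimal sketch of what that could look like against the v1alpha2 manifest above (the scratch disk/volume names, capacity, and host path below are made-up placeholders, not from the original manifest):

            disks:
            - disk:
                bus: virtio
              name: scratchdisk
              volumeName: scratchvolume
        volumes:
        - emptyDisk:
            capacity: 20Gi                 # placeholder size; allocated next to the pod's ephemeral storage
          name: scratchvolume
        # or, to pin the backing file to a specific location on the node:
        # - hostDisk:
        #     path: /var/lib/kubevirt-disks/instance-0-scratch.img   # placeholder path
        #     type: DiskOrCreate
        #     capacity: 20Gi
        #   name: scratchvolume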

Let us know if that helps.

Best Regards,

Roman
 

Roman Mohr

Jan 9, 2019, 10:55:47 AM
to Matthew Mosesohn, kubevirt-dev
On Wed, Jan 9, 2019 at 4:51 PM Roman Mohr <rm...@redhat.com> wrote:
Hi Matthew,

On Wed, Jan 9, 2019 at 4:46 PM Matthew Mosesohn <matthew....@gmail.com> wrote:
I'm trying to use KubeVirt to do Kubernetes-in-Kubernetes deployments as part of a CI system. The k8s version is v1.13.1 and KubeVirt is v0.11.0. The hosts have SSD disks and the kube nodes have no workloads on them except a single test VM. Using dd on the host I see ~600 MB/s write performance, but inside a VM I only get ~10 MB/s. Standing up k8s in a VM is not working well because the disk is too slow for kube-apiserver to respond quickly enough to health checks from kubelet.


It looks like you are currently writing your data directly into the containerDisk (so into the overlay fs of the container runtime). If you have fast SSDs on the nodes where you run the VMs and want to use them directly, you can try an `emptyDisk` [1] or a `hostDisk` [2] volume.


A small addition: an `emptyDisk` is placed wherever an `emptyDir` on a Pod would be.

Matthew Mosesohn

Jan 9, 2019, 11:29:42 AM
to kubevirt-dev
Roman, I appreciate your reply and suggestion.

Unfortunately, emptyDisk does not perform any differently from the base VM disk. I believe they are effectively the same type of disk.
hostDisk will probably be more promising, but it has the downside that it will not be cleaned up automatically after the VM is deleted.

Roman Mohr

Jan 9, 2019, 12:01:01 PM
to Matthew Mosesohn, kubevirt-dev
On Wed, Jan 9, 2019 at 5:30 PM Matthew Mosesohn <matthew....@gmail.com> wrote:
Roman, I appreciate your reply and suggestion.

Unfortunately, emptyDisk does not perform any differently from the base VM disk. I believe they are effectively the same type of disk.
hostDisk will probably be more promising, but it has the downside that it will not be cleaned up automatically after the VM is deleted.

Hm, do you see the ~600 MB/s write performance inside an emptyDir? I understand that qcow2 is slower than raw or than writing directly to a file in an emptyDir, but it should not be almost zero.

Regarding the low write performance on the containerDisk I am not surprised, but for the other one, could you confirm that emptyDir itself is fast? Also, did you actually run the tests on the mounted disk inside the VM? We also have some options to fine-tune disk performance [3], if nothing else helps.
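
For illustration only (this is a sketch of the kind of per-disk tuning fields KubeVirt exposes, not necessarily reference [3]; whether a given field is available depends on the KubeVirt version):

            disks:
            - disk:
                bus: virtio
              cache: none              # set an explicit cache mode instead of the default
              dedicatedIOThread: true  # only available in newer KubeVirt releases
              name: containerdisk
              volumeName: containervolume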

Finally, just to rule out other possibilities: the kube nodes where you install KubeVirt are bare-metal machines?

Best Regards,
Roman


Fabian Deutsch

Jan 9, 2019, 1:09:06 PM
to Matthew Mosesohn, Adam Litke, kubevirt-dev
On Wed, Jan 9, 2019 at 5:30 PM Matthew Mosesohn <matthew....@gmail.com> wrote:
Roman, I appreciate your reply and suggestion.

Unfortunately, emptyDisk does not perform any differently from the base VM disk. I believe they are effectively the same type of disk.

I really wonder why the performance is that bad.

Adam, any idea where this slowness could come from in the containerDisk case?

My rough guess, along the same lines as Roman's, is that it is a side effect of the layered container fs plus some write/cache flags libvirt is setting.
 
hostDisk will probably be more promising, but it has the downside that it will not be cleaned up automatically after the VM is deleted.

Did you consider a DataVolume?

Its lifecycle is bound to a VirtualMachine (i.e. if you remove the VM, the DV is removed as well).
In essence it adds some features (cloning, provisioning, a VM-bound lifecycle) on top of PVs.
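
For reference, a rough sketch of how that can look (the DataVolume name, size, storage class, and source URL below are placeholders, and the exact apiVersion/fields depend on the CDI version installed):

  spec:
    dataVolumeTemplates:
    - metadata:
        name: instance-0-rootdisk
      spec:
        pvc:
          accessModes:
          - ReadWriteOnce
          resources:
            requests:
              storage: 20Gi
          storageClassName: local-ssd                    # e.g. a local-volume-provisioner class
        source:
          http:
            url: http://example.com/ubuntu-1604.qcow2    # placeholder source image
    template:
      spec:
        domain:
          devices:
            disks:
            - disk:
                bus: virtio
              name: rootdisk
              volumeName: rootvolume
        volumes:
        - dataVolume:
            name: instance-0-rootdisk
          name: rootvolume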

Adam might have more details on this as well.

- fabian
 

Matthew Mosesohn

Jan 10, 2019, 2:05:54 AM
to Fabian Deutsch, Adam Litke, kubevirt-dev
Fabian, thanks for the additional suggestion.

This test cluster doesn't use PVCs yet, but using a DataVolume is definitely an option and it was my next thought. I was just so surprised by the poor performance that I wanted to find some answers first. This is a bare-metal cluster, so I will use local-volume-provisioner for PVs.

Best Regards,
Matthew Mosesohn

Roman Mohr

Jan 10, 2019, 2:21:47 AM
to Matthew Mosesohn, Fabian Deutsch, Adam Litke, kubevirt-dev
On Thu, Jan 10, 2019 at 8:05 AM Matthew Mosesohn <matthew....@gmail.com> wrote:
Fabian, thanks for the additional suggestion.

This test cluster doesn't use PVCs yet, but using a DataVolume is definitely an option and it was my next thought. I was just so surprised by the poor performance that I wanted to find some answers first. This is a bare-metal cluster, so I will use local-volume-provisioner for PVs.

This is really strange. There should not be any real difference between emptyDir and a local-volume PVC performance-wise.

Roman 

Matthew Mosesohn

Jan 10, 2019, 4:28:21 AM
to kubevirt-dev
Sorry for the bad issue report. KubeVirt was misconfigured with the following option: --from-literal debug.useEmulation=true. Disabling it brought disk write performance up to 90-95% of host disk performance. Thank you all for your suggestions!
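
For anyone hitting the same problem: that setting lives in KubeVirt's config ConfigMap, so disabling it looks roughly like this (assuming the default kubevirt-config ConfigMap; the name and namespace depend on how KubeVirt was deployed):

apiVersion: v1
kind: ConfigMap
metadata:
  name: kubevirt-config        # default config ConfigMap name (assumption)
  namespace: kube-system       # or kubevirt, depending on the deployment
data:
  debug.useEmulation: "false"  # or remove the key entirely so KVM is used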

Roman Mohr

Jan 10, 2019, 4:31:31 AM
to Matthew Mosesohn, kubevirt-dev
On Thu, Jan 10, 2019 at 10:28 AM Matthew Mosesohn <matthew....@gmail.com> wrote:
Sorry for the bad issue report. KubeVirt was misconfigured with the following option: --from-literal debug.useEmulation=true. Disabling it brought disk write performance up to 90-95% of host disk performance. Thank you all for your suggestions!

Great to hear that it is working now.

Roman
 

Fabian Deutsch

Jan 10, 2019, 6:32:42 AM
to Roman Mohr, Matthew Mosesohn, kubevirt-dev
On Thu, Jan 10, 2019 at 10:31 AM Roman Mohr <rm...@redhat.com> wrote:


On Thu, Jan 10, 2019 at 10:28 AM Matthew Mosesohn <matthew....@gmail.com> wrote:
Sorry for the bad issue report. KubeVirt was misconfigured with the following option: --from-literal debug.useEmulation=true. Disabling it brought disk write performance up to 90-95% of host disk performance. Thank you all for your suggestions!

Great to hear that it is working now.

+1

Glad to hear this :)

- fabian
 

Adam Litke

Jan 10, 2019, 11:10:02 AM
to Fabian Deutsch, Matthew Mosesohn, kubevirt-dev
On Wed, Jan 9, 2019 at 1:09 PM Fabian Deutsch <fdeu...@redhat.com> wrote:


On Wed, Jan 9, 2019 at 5:30 PM Matthew Mosesohn <matthew....@gmail.com> wrote:
Roman, I appreciate your reply and suggestion.

Unfortunately, emptyDisk does not perform any differently from the base VM disk. I believe they are effectively the same type of disk.

I really wonder why the performance is that bad.

Adam, any idea where this slowness could come from in the containerDisk case?

My rough guess, along the same lines as Roman's, is that it is a side effect of the layered container fs plus some write/cache flags libvirt is setting.

I would agree with this. I was still waiting to hear whether the performance was better when using a PVC. CDI is delivering a feature soon that will allow you to use the same containerDisk image in a PVC, so this should become easier to test soon.
 
 
hostDisk will probably be more promising, but it has the downside that it will not be cleaned up automatically after the VM is deleted.

Did you consider a DataVolume?

Its lifecycle is bound to a VirtualMachine (i.e. if you remove the VM, the DV is removed as well).
In essence it adds some features (cloning, provisioning, a VM-bound lifecycle) on top of PVs.

Adam might have more details on this as well.

You should try to use either a DataVolume or a direct PVC as a disk. As far as I know, the containerDisk mode is mostly for demonstration purposes and write performance to the overlayfs is not optimized.
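
For example, a PVC that already contains a disk image can be attached directly (the claim and volume names here are placeholders; a matching disk entry is still needed in the devices section):

        volumes:
        - persistentVolumeClaim:
            claimName: instance-0-root   # placeholder claim name
          name: rootvolume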


--
Adam Litke

Adam Litke

Jan 10, 2019, 11:10:57 AM
to Fabian Deutsch, Matthew Mosesohn, kubevirt-dev
I should have read the latest messages in the thread before posting.  Glad to hear that overlayfs performs well enough!
--
Adam Litke

Roman Mohr

Jan 10, 2019, 11:18:20 AM
to Adam Litke, Fabian Deutsch, Matthew Mosesohn, kubevirt-dev
On Thu, Jan 10, 2019 at 5:10 PM Adam Litke <ali...@redhat.com> wrote:


On Wed, Jan 9, 2019 at 1:09 PM Fabian Deutsch <fdeu...@redhat.com> wrote:


On Wed, Jan 9, 2019 at 5:30 PM Matthew Mosesohn <matthew....@gmail.com> wrote:
Roman, I appreciate your reply and suggestion.

Unfortunately, emptyDisk does not perform any differently from the base VM disk. I believe they are effectively the same type of disk.

I really wonder why the performance is that bad.

Adam, any idea where this slowness could come from in the containerDisk case?

My rough guess, along the same lines as Roman's, is that it is a side effect of the layered container fs plus some write/cache flags libvirt is setting.

I would agree with this. I was still waiting to hear whether the performance was better when using a PVC. CDI is delivering a feature soon that will allow you to use the same containerDisk image in a PVC, so this should become easier to test soon.
 
 
hostDisk will probably be more promising, but it has the downside that it will not be cleaned up automatically after the VM is deleted.

Did you consider a DataVolume?

Its lifecycle is bound to a VirtualMachine (i.e. if you remove the VM, the DV is removed as well).
In essence it adds some features (cloning, provisioning, a VM-bound lifecycle) on top of PVs.

Adam might have more details on this as well.

You should try to use either a DataVolume or a direct PVC as a disk. As far as I know, the containerDisk mode is mostly for demonstration purposes and write performance to the overlayfs is not optimized.

It is very helpful for ephemeral VMIs where you want to use fast local storage on your nodes. You basically use a "containerDisk" or a "hostDisk" with the base image and then add an "emptyDisk" for writing. That means you can, for instance, create isolated k8s clusters with KubeVirt where you don't need distributed storage at all.
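
As a sketch of that pattern (the scratch volume name and size are placeholders; each volume needs a matching disk entry, as in the manifest earlier in the thread):

        volumes:
        - containerDisk:
            image: quay.io/kubespray/vm-ubuntu-1604   # read-mostly base image
          name: containervolume
        - emptyDisk:
            capacity: 30Gi                            # placeholder scratch size for all writes
          name: scratchvolume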

Best Regards,
Roman
 