Bulk VMI deployment and selective VMI deletion

Huy Pham

Dec 11, 2020, 2:37:25 PM

to kubevirt-dev

Hey guys,

My team has been experimenting with bulk VMI creation requests. Is that something the community has discussed before? While exploring whether VMIReplicaSet (VMIRS) fits our needs, we came up with some follow-up questions about the VMIRS implementation, performance, and scalability.

Does VMIRS have runtime performance benefits? VMIRS seems to be built on top of VMI objects as an additional abstraction for better VMI management.

Is there a way to make VMI processing more efficient for time-sensitive, large workloads, such as parallel scheduling?

With the VMIRS approach, is there a way to scale down a VMIRS by deleting specific VMIs? We want control over which VMIs are removed.

Even with sequential processing, we found that tuning QPS and controller threads improves performance. Is there a benchmark that records the maximum QPS and controller thread count for optimal performance?

We look forward to your feedback and hope that we can solve this together.

Thank you,

Huy

dvo...@redhat.com

Dec 14, 2020, 8:50:52 AM

to kubevirt-dev

On Friday, December 11, 2020 at 2:37:25 PM UTC-5 hup...@nvidia.com wrote:

Does VMIRS have runtime performance benefits? VMIRS seems to be built on top of VMI objects as an additional abstraction for better VMI management.

VMIRS does not have any runtime performance benefits outside of what you can do with a standalone VMI directly.

Is there a way to make VMI processing more efficient for time-sensitive, large workloads, such as parallel scheduling?

This will likely help: https://kubevirt.io/user-guide/#/creation/dedicated-cpu. If you have multiple NUMA nodes and need NUMA affinity, look into the Topology Manager to schedule workloads on a dedicated NUMA node. This gets pretty complicated and is largely workload and hardware dependent. If you find any gaps in what KubeVirt offers for your use case, we're definitely interested in improving the situation.
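For readers who want to see what the dedicated-CPU suggestion looks like in a manifest, here is a minimal sketch. It assumes the `spec.domain.cpu.dedicatedCpuPlacement` field described in the linked user-guide page; the VMI name, core count, memory size, and `apiVersion` are placeholders chosen for illustration, and disks and volumes are left out.

```python
import json


def dedicated_cpu_vmi(name: str, cores: int, memory: str) -> dict:
    """Build a VMI manifest (as a dict) that requests dedicated, pinned CPUs."""
    return {
        # Assumption: use whichever kubevirt.io API version your cluster serves.
        "apiVersion": "kubevirt.io/v1",
        "kind": "VirtualMachineInstance",
        "metadata": {"name": name},
        "spec": {
            "domain": {
                "cpu": {
                    "cores": cores,
                    # Ask KubeVirt to pin each vCPU to a dedicated host CPU.
                    "dedicatedCpuPlacement": True,
                },
                "resources": {"requests": {"memory": memory}},
                # Disks and interfaces are omitted to keep the sketch short.
                "devices": {},
            }
        },
    }


if __name__ == "__main__":
    # Print JSON that could be piped to `kubectl apply -f -`.
    print(json.dumps(dedicated_cpu_vmi("vmi-pinned", 2, "2Gi"), indent=2))
```

Dedicated placement only takes effect on nodes where the kubelet CPU manager runs with the static policy, so the manifest alone is not enough.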

You may also want to look into kernel tuning on the host machines that the VMs run on, to further tune how the scheduler behaves.

With the VMIRS approach, is there a way to scale down a VMIRS by deleting specific VMIs? We want control over which VMIs are removed.

Not right now. How would you expect something like this to work?

Even with sequential processing, we found that tuning QPS and controller threads improves performance. Is there a benchmark that records the maximum QPS and controller thread count for optimal performance?

By tuning controller threads, what are you hoping to speed up? Is this an attempt to shave off time from scheduling up to the point of booting the VMI image?
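To make the QPS part of the question concrete, here is a rough back-of-the-envelope sketch. It assumes the client-side limiter behaves like a token bucket with a refill rate (QPS) and a burst size, which is how client-go's rate limiter works; the function name and the numbers are made up for illustration and are not measured KubeVirt values.

```python
def min_seconds_to_send(requests: int, qps: float, burst: int) -> float:
    """Lower bound on the time a token-bucket limiter needs to admit `requests` calls."""
    if qps <= 0:
        raise ValueError("qps must be positive")
    # The first `burst` requests can go out immediately; every further request
    # has to wait for a token, and tokens refill at `qps` per second.
    waiting = max(0, requests - burst)
    return waiting / qps


if __name__ == "__main__":
    # Illustrative settings only.
    for qps, burst in [(5, 10), (50, 100)]:
        print(qps, burst, min_seconds_to_send(1000, qps, burst))
```

This only bounds the time spent waiting on the rate limiter. It says nothing about how long the API server, the scheduler, or the node takes to act on each request.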

dvo...@redhat.com

Dec 14, 2020, 9:06:55 AM

to kubevirt-dev

On Monday, December 14, 2020 at 8:50:52 AM UTC-5 dvo...@redhat.com wrote:

With the VMIRS approach, is there a way to scale down a VMIRS by deleting specific VMIs? We want control over which VMIs are removed.

Not right now. How would you expect something like this to work?

I take that back. Technically, you could achieve this by pausing the VMIRS using vmirs.Spec.Paused, selectively deleting the VMIs, setting the replica count to the new desired count, and finally unpausing the VMIRS.
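The pause, delete, resize, unpause sequence above could be scripted roughly as follows. This is a sketch, not a tested tool: it only builds the `kubectl` command lines rather than running them, it assumes the `vmirs` and `vmi` resource names are registered in the cluster, and the namespace, replica set name, and VMI names are placeholders.

```python
import json
import shlex
from typing import List


def selective_scale_down_plan(vmirs: str, namespace: str,
                              victims: List[str], new_replicas: int) -> List[List[str]]:
    """Return the kubectl invocations for a selective VMIRS scale-down, in order."""

    def patch(spec: dict) -> List[str]:
        # JSON merge patch against the VMIRS spec.
        return ["kubectl", "-n", namespace, "patch", "vmirs", vmirs,
                "--type", "merge", "-p", json.dumps({"spec": spec})]

    plan = [patch({"paused": True})]                  # 1. pause the controller
    plan += [["kubectl", "-n", namespace, "delete", "vmi", name]
             for name in victims]                     # 2. delete the chosen VMIs
    plan.append(patch({"replicas": new_replicas}))    # 3. set the new replica count
    plan.append(patch({"paused": False}))             # 4. unpause
    return plan


if __name__ == "__main__":
    # Placeholder names for illustration only.
    for cmd in selective_scale_down_plan("my-vmirs", "default", ["vmi-a", "vmi-c"], 3):
        print(shlex.join(cmd))
```

The order matters: if the replica count is lowered before the VMIRS is paused, the controller may pick its own deletion candidates first.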

Roman Mohr

Dec 14, 2020, 9:16:31 AM

to dvo...@redhat.com, kubevirt-dev

The following does not exist yet, but we could also come up with something like an annotation on a VMI, so that when the VMIRS is scaled down the controller prefers to delete VMIs carrying that annotation.

Also, I think my deletion-candidate detection algorithm is not as good as the one in the k8s ReplicaSet. I think we can improve this too.

Best Regards,

Roman
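To illustrate the annotation idea, here is a small sketch of how a controller might order deletion candidates. It is purely hypothetical: the annotation key `example.com/prefer-delete` and the function name are invented for this example, and no such behavior exists in KubeVirt as of this thread.

```python
from typing import Dict, List

# Hypothetical annotation key; nothing in KubeVirt reads it today.
PREFER_DELETE = "example.com/prefer-delete"


def pick_deletion_candidates(vmis: List[Dict], count: int) -> List[str]:
    """Return the names of `count` VMIs to delete, annotated VMIs first."""

    def annotated(vmi: Dict) -> bool:
        annotations = vmi.get("metadata", {}).get("annotations") or {}
        return PREFER_DELETE in annotations

    # sorted() is stable, so VMIs keep their relative order within each group.
    ordered = sorted(vmis, key=lambda v: 0 if annotated(v) else 1)
    return [v["metadata"]["name"] for v in ordered[:max(0, count)]]


if __name__ == "__main__":
    vmis = [
        {"metadata": {"name": "a"}},
        {"metadata": {"name": "b", "annotations": {PREFER_DELETE: "true"}}},
        {"metadata": {"name": "c"}},
    ]
    print(pick_deletion_candidates(vmis, 2))
```

A real controller would likely combine such a preference with other criteria, for example whether a VMI is already running, much as the k8s ReplicaSet ranks its pods.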

--
You received this message because you are subscribed to the Google Groups "kubevirt-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to kubevirt-dev...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/kubevirt-dev/480f0f00-be08-4053-b00b-18ec2a587c68n%40googlegroups.com.

Huy Pham

Dec 14, 2020, 12:48:31 PM

to kubevirt-dev

Thank you all for the feedback. That's correct, we're looking for a way to reduce the time from when the bulk VMI requests are sent until the VMIs are available for service.

Since we can't control the boot time of the VMI image, we want to shave off as much scheduling time as possible.

I will loop this information back to my team and update y'all soon.

David Vossel

Dec 14, 2020, 2:29:27 PM

to Huy Pham, kubevirt-dev

In many cases the KubeVirt controllers aren't actually the component that accounts for most of the time in the VM startup chain of events.

For instance, if you're using a containerDisk which isn't cached on the host a VMI is about to run on, then the kubelet pulling that container image down is likely way more time intensive than anything else. It would be worth investigating further where the time is spent during VMI startup for your specific setup. There might be gains to be had that are outside of the kubevirt runtime entirely.

Ezra Silvera

Dec 14, 2020, 3:37:30 PM

to Huy Pham, kubevirt-dev

I'm not sure the scheduling time is the main issue here (or we may be defining scheduling differently). In general, the actual scheduling depends mainly on the number of Nodes (and any special constraints), as the scheduler scans and then scores those nodes. So unless you have a very large cluster, this shouldn't take too much time. In addition, although the initial placement is serial, once a candidate node is found it is marked, and the k8s scheduler moves on to the next placement request even before we know whether the VM/Pod was brought up successfully on the node.

Do you have a breakdown of your timing measurements? You could, for example, monitor the Pod until you see that it has been scheduled.
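One way to get such a breakdown is to read the condition timestamps on the virt-launcher Pod that backs each VMI. The sketch below assumes the Pod object has been fetched as JSON (for example with `kubectl get pod <name> -o json`) and uses the standard `PodScheduled` and `Ready` conditions; the function name and the sample data are made up.

```python
from datetime import datetime
from typing import Dict, Optional


def _parse(ts: str) -> datetime:
    # Kubernetes serializes these timestamps as RFC 3339 in UTC, e.g. 2020-12-14T15:37:30Z.
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ")


def startup_breakdown(pod: Dict) -> Dict[str, Optional[float]]:
    """Seconds from Pod creation until each tracked condition became True."""
    created = _parse(pod["metadata"]["creationTimestamp"])
    result: Dict[str, Optional[float]] = {"PodScheduled": None, "Ready": None}
    for cond in pod.get("status", {}).get("conditions", []):
        if cond.get("type") in result and cond.get("status") == "True":
            elapsed = _parse(cond["lastTransitionTime"]) - created
            result[cond["type"]] = elapsed.total_seconds()
    return result


if __name__ == "__main__":
    # Made-up sample data for illustration.
    sample = {
        "metadata": {"creationTimestamp": "2020-12-14T10:00:00Z"},
        "status": {"conditions": [
            {"type": "PodScheduled", "status": "True",
             "lastTransitionTime": "2020-12-14T10:00:02Z"},
            {"type": "Ready", "status": "True",
             "lastTransitionTime": "2020-12-14T10:00:45Z"},
        ]},
    }
    print(startup_breakdown(sample))
```

These timestamps have one-second resolution, so the numbers are coarse, but they are usually enough to tell whether scheduling or what happens afterwards (image pull, container start, guest boot) dominates.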
