Hi all!Recently, we have observed that if you enable a given feature gate, spin a VM using the feature behind the feature, remove the feature gate and try to stop the VM, you get an error from the vmi create admitter.For instance, if we have a VMI trying to use downwardMetrics:1. Enable the DownwardMetrics FG.2. Spin a VM using the DownwardMetrics (add spec.template.spec.domain.devices.downwardMetrics: {})3. Remove the DownwardMetrics FG.4. Stop the VM.Then you will get the following error:Error stopping VirtualMachine Internal error occurred: admission webhook "virtualmachine-validator.kubevirt.io" denied the request: downwardMetrics virtio serial is not allowed: DownwardMetrics feature gate is not enabledI've checked it out with other FG such as VSOCK, and it shows the same behavior.
Therefore, I have two questions:1) Should we add some special case to allow stopping a running VM even if the FG has been disabled?2) Should we do not allow disabling a FG if a running VM is using a feature protected by that FG?
Thanks a lot!J.Cano.--
You received this message because you are subscribed to the Google Groups "kubevirt-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to kubevirt-dev...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/kubevirt-dev/ea74706c-1a6a-4e66-b25c-0e870df153c6n%40googlegroups.com.
On Wed, Apr 3, 2024 at 5:55 PM Edward Haas <edw...@redhat.com> wrote:On Wed, Apr 3, 2024 at 5:30 PM Javier Cano Cano <jcan...@redhat.com> wrote:Hi all!Recently, we have observed that if you enable a given feature gate, spin a VM using the feature behind the feature, remove the feature gate and try to stop the VM, you get an error from the vmi create admitter.For instance, if we have a VMI trying to use downwardMetrics:1. Enable the DownwardMetrics FG.2. Spin a VM using the DownwardMetrics (add spec.template.spec.domain.devices.downwardMetrics: {})3. Remove the DownwardMetrics FG.4. Stop the VM.Then you will get the following error:Error stopping VirtualMachine Internal error occurred: admission webhook "virtualmachine-validator.kubevirt.io" denied the request: downwardMetrics virtio serial is not allowed: DownwardMetrics feature gate is not enabledI've checked it out with other FG such as VSOCK, and it shows the same behavior.Nice catch!
Therefore, I have two questions:1) Should we add some special case to allow stopping a running VM even if the FG has been disabled?2) Should we do not allow disabling a FG if a running VM is using a feature protected by that FG?I think this is a classic incident of when validation admitters are a bad solution to a validation need.A webhook admitter does not provide the ability for the system or the user to fix the problem, potentiallycanceling the desired request instead of reconciling (retry) it until success.The solution IMO is to move the validation to the relevant controller. At which point it will know notto validate the FG beyond start time (or whatever strategy is chosen).
To mitigate this, here is a potential practical (quick?) solution:Make sure the FG validations are only occurring on creation of the manifest and not on updates.
As a follow up, FG validations on the spec (i.e. do not input a field that its FG is not enabled) should be done onlyat the controller. But this may be controversial and more effort needs to be put in it.I suggest a design, as the behavior will change a bit and convestation is expected.
On Wed, Apr 3, 2024 at 5:30 PM Javier Cano Cano <jcan...@redhat.com> wrote:Hi all!Recently, we have observed that if you enable a given feature gate, spin a VM using the feature behind the feature, remove the feature gate and try to stop the VM, you get an error from the vmi create admitter.For instance, if we have a VMI trying to use downwardMetrics:1. Enable the DownwardMetrics FG.2. Spin a VM using the DownwardMetrics (add spec.template.spec.domain.devices.downwardMetrics: {})3. Remove the DownwardMetrics FG.4. Stop the VM.Then you will get the following error:Error stopping VirtualMachine Internal error occurred: admission webhook "virtualmachine-validator.kubevirt.io" denied the request: downwardMetrics virtio serial is not allowed: DownwardMetrics feature gate is not enabledI've checked it out with other FG such as VSOCK, and it shows the same behavior.Nice catch!Therefore, I have two questions:1) Should we add some special case to allow stopping a running VM even if the FG has been disabled?2) Should we do not allow disabling a FG if a running VM is using a feature protected by that FG?I think this is a classic incident of when validation admitters are a bad solution to a validation need.A webhook admitter does not provide the ability for the system or the user to fix the problem, potentiallycanceling the desired request instead of reconciling (retry) it until success.The solution IMO is to move the validation to the relevant controller. At which point it will know notto validate the FG beyond start time (or whatever strategy is chosen).
Hi,
Am Mittwoch, dem 03.04.2024 um 17:58 +0300 schrieb Orel Misan:
> Hi Javier,
>
> It makes sense to me that in general, the settings that were present
> in the cluster when the VM was first created - should
> persist throughout the VM's life-cycle.
To guarantee the presence of a feature persists throughout the
lifecycle of a VM it sounds to me like we need a webhook guarding the
removal of feature gates still in use. IMO that would be suited best
for the way feature gates work at the moment and without changing
actual controller logic, at least in short term.
> For example, the logic behind the AlignCPUs feature gate [1] persists
> the state that was determined during the creation of the VMI.
> Changing this feature gate will not change the behavior of
> existing VMs, but will affect new ones.
--
You received this message because you are subscribed to the Google Groups "kubevirt-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to kubevirt-dev...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/kubevirt-dev/95ace7c4c16ef3a904c7a9e3d3c8258d0a60f334.camel%40redhat.com.
On Tue, Apr 9, 2024 at 10:02 AM Felix Matouschek <fmato...@redhat.com> wrote:Hi,
Am Mittwoch, dem 03.04.2024 um 17:58 +0300 schrieb Orel Misan:
> Hi Javier,
>
> It makes sense to me that in general, the settings that were present
> in the cluster when the VM was first created - should
> persist throughout the VM's life-cycle.
To guarantee the presence of a feature persists throughout the
lifecycle of a VM it sounds to me like we need a webhook guarding the
removal of feature gates still in use. IMO that would be suited best
for the way feature gates work at the moment and without changing
actual controller logic, at least in short term.This may be a good short term work-around, but I am concerned about the long term and larger installations.This would require an admin to halt new VM creation in-order to shutdown affected VMs and disable feature gates.
To view this discussion on the web visit https://groups.google.com/d/msgid/kubevirt-dev/CAHbqe-m-9-5dU8PcaRM4f-KEoZsbrduLvoufV5FJ%2BjhgrHeBFg%40mail.gmail.com.
On Tue, Apr 9, 2024 at 6:52 PM Sarah Bennert ( she / her ) <sben...@redhat.com> wrote:On Tue, Apr 9, 2024 at 10:02 AM Felix Matouschek <fmato...@redhat.com> wrote:Hi,
Am Mittwoch, dem 03.04.2024 um 17:58 +0300 schrieb Orel Misan:
> Hi Javier,
>
> It makes sense to me that in general, the settings that were present
> in the cluster when the VM was first created - should
> persist throughout the VM's life-cycle.
To guarantee the presence of a feature persists throughout the
lifecycle of a VM it sounds to me like we need a webhook guarding the
removal of feature gates still in use. IMO that would be suited best
for the way feature gates work at the moment and without changing
actual controller logic, at least in short term.This may be a good short term work-around, but I am concerned about the long term and larger installations.This would require an admin to halt new VM creation in-order to shutdown affected VMs and disable feature gates.It is for sure inconvenient, but I tend to agree with Felix that it would be correct.If a feature is in use, but the FG is representing something else, then this is just asking for error conditions and problems.
On Wed, Apr 10, 2024 at 3:09 AM Fabian Deutsch <fdeu...@redhat.com> wrote:On Tue, Apr 9, 2024 at 6:52 PM Sarah Bennert ( she / her ) <sben...@redhat.com> wrote:On Tue, Apr 9, 2024 at 10:02 AM Felix Matouschek <fmato...@redhat.com> wrote:Hi,
Am Mittwoch, dem 03.04.2024 um 17:58 +0300 schrieb Orel Misan:
> Hi Javier,
>
> It makes sense to me that in general, the settings that were present
> in the cluster when the VM was first created - should
> persist throughout the VM's life-cycle.
To guarantee the presence of a feature persists throughout the
lifecycle of a VM it sounds to me like we need a webhook guarding the
removal of feature gates still in use. IMO that would be suited best
for the way feature gates work at the moment and without changing
actual controller logic, at least in short term.This may be a good short term work-around, but I am concerned about the long term and larger installations.This would require an admin to halt new VM creation in-order to shutdown affected VMs and disable feature gates.It is for sure inconvenient, but I tend to agree with Felix that it would be correct.If a feature is in use, but the FG is representing something else, then this is just asking for error conditions and problems.I do not disagree.Perhaps moving forward, a more granular approach to all feature gates might be considered.If all feature gates could be limited to a namespace, role, or user, then the impact can be managed better than any cluster-wide feature gate would be.