State of voting lanes for Kubernetes 1.35 providers

Daniel Hiller

Feb 9, 2026, 6:00:27 AM

to kubevirt-dev

Hey all!

The PR for making 1.35 voting [1] is not yet merged. Two tests are still failing on the sig-compute periodics for 1.35 [2]; they are being tracked in [3].
As of Friday the failures were still under investigation [4]; a PR to fix them is now on the way [5].

Thanks to everyone involved!

[1]: https://github.com/kubevirt/project-infra/pull/4662

[2]: https://testgrid.k8s.io/kubevirt-periodics#periodic-kubevirt-e2e-k8s-1.35-sig-compute&width=20

[3]: https://github.com/kubevirt/kubevirt/issues/15976

[4]: https://github.com/kubevirt/kubevirt/issues/15976#issuecomment-3859955440

[5]: https://github.com/kubevirt/kubevirt/pull/16758

--

Kind regards,

Daniel Hiller

He / Him / His

Principal Software Engineer, KubeVirt CI, OpenShift Virtualization

Red Hat

dhi...@redhat.com

Red Hat GmbH, Registered seat: Werner von Siemens Ring 12, D-85630 Grasbrunn, Germany  
Commercial register: Amtsgericht Muenchen/Munich, HRB 153243,
Managing Directors: Ryan Barnhart, Charles Cachera, Avril Crosse O'Flaherty

Federico Fossemo

Feb 17, 2026, 10:48:04 AM

to Daniel Hiller, kubevirt-dev

Hey Daniel,

Since https://github.com/kubevirt/kubevirt/pull/16758 was merged, we can already see good results:
https://testgrid.k8s.io/kubevirt-periodics#periodic-kubevirt-e2e-k8s-1.35-sig-compute&width=20

Thank you to everyone involved. We can start moving forward.

Regards,
-FF

--
You received this message because you are subscribed to the Google Groups "kubevirt-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to kubevirt-dev...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/kubevirt-dev/CAK%2BeyL6mRSiTfD_RMNbG3WVLOKt9Mg4zpU%2Bvt%3DrzXX9x3YhixQ%40mail.gmail.com.

Dan Kenigsberg

Feb 17, 2026, 10:58:57 AM

to Federico Fossemo, Daniel Hiller, kubevirt-dev

Thank you very much for this progress with 1.35!

Do we have a clue why other lanes are failing so much?
https://kubevirt.io/ci-health/#kubevirtkubevirt
For example, pull-kubevirt-e2e-k8s-1.34-sig-compute-serial is failing 30% of the time on merged PRs.

Regards,

Dan.


Federico Fossemo

Feb 17, 2026, 11:16:46 AM

to Dan Kenigsberg, Daniel Hiller, kubevirt-dev

Hey Dan,

We were discussing this in the last sig-ci meeting[1].
Generally speaking, we observed an increase in the clustered failures.
We discovered a correlation between this increase in serial failures and setting the 1.35 lanes to `always_run: true`.
The root cause is still under investigation, but I think that switching to the mandatory 1.35 provider and
deprecating the 1.32 provider lanes will help a lot in this direction.
Right now, we are running 4 provider jobs for each PR.
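
For context, here is a minimal sketch of how these settings might look in a Prow presubmit definition. The job name is taken from the testgrid link in this thread; the field values are illustrative assumptions and are not copied from kubevirt/project-infra. In Prow, `always_run: true` triggers the job on every PR, and a lane is typically considered voting (merge-blocking) when it is not marked `optional: true`.

```yaml
# Illustrative sketch only; values are assumptions, not the actual project-infra config.
presubmits:
  kubevirt/kubevirt:
  - name: pull-kubevirt-e2e-k8s-1.35-sig-compute-serial
    always_run: true   # run on every PR without an explicit /test trigger
    optional: false    # required ("voting"): a failure blocks the merge
```

With a fixed pool of CI capacity, each additional `always_run: true` lane adds one more e2e job per PR, which is why adding the 1.35 lanes before removing the 1.32 lanes increases the overall load.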

Hope this helps,

[1] https://docs.google.com/document/d/1pWg6JvjStffoZ0NnVk4ALQOs3MYOdHtBpkLBlJlHTqA/edit?tab=t.0#heading=h.l7tbe52517xe

Regards,

--FF

Daniel Hiller

Feb 17, 2026, 12:35:22 PM

to Federico Fossemo, Dan Kenigsberg, kubevirt-dev

Hey all,

On Tue, Feb 17, 2026 at 5:16 PM Federico Fossemo <ffos...@redhat.com> wrote:

> Hey Dan,
>
> We were discussing this in the last sig-ci meeting[1].
> Generally speaking, we observed an increase in the clustered failures.

To expand on that: all sig-compute periodic lanes show several clustered failures (e.g. 1.35 [2]), while the presubmit lanes do not. I believe the clustered failures on the periodics correlate with adding the 1.35 always_run lanes on top; however, the graphs give no strong indication of this.

https://grafana.ci.kubevirt.io/d/efpTS3t4z/e2e-jobs-overview-v2?orgId=1&refresh=2h&from=now-45d&to=now&var-job_name=periodic-kubevirt-e2e-k8s-.%2A&viewPanel=15

Blue lines from left to right:

* 1st marks the switch to always_run: true for 1.35

* 2nd marks the 1st occurrence of a clustered failure on 1.35 sig-compute

* 3rd marks the KubeVirtCI bump of 1.35 to stable

> We discovered a correlation between this increase in serial failures and setting the 1.35 lanes to `always_run: true`.
> The root cause is still under investigation, but I think that switching to the mandatory 1.35 provider and
> deprecating the 1.32 provider lanes will help a lot in this direction.

I agree. Removing the old lanes will also give more capacity overall.

[2]: https://testgrid.k8s.io/kubevirt-periodics#periodic-kubevirt-e2e-k8s-1.35-sig-compute&width=20

--

Best,

Daniel

Daniel Hiller

Feb 17, 2026, 12:38:27 PM

to Federico Fossemo, Dan Kenigsberg, kubevirt-dev

Correction to my previous message: the presubmit lanes are affected as well. Both the 1.34 and 1.35 serial presubmit lanes show clustered failures [3].

[3]: https://testgrid.k8s.io/kubevirt-presubmits#pull-kubevirt-e2e-k8s-1.35-sig-compute-serial&width=20

--

Best,

Daniel

Dan Kenigsberg

Feb 17, 2026, 3:11:13 PM

to Daniel Hiller, Federico Fossemo, kubevirt-dev

What is holding us back from stopping 1.32 on main right now?

Luboslav Pivarc

Feb 17, 2026, 3:28:05 PM

to Dan Kenigsberg, Daniel Hiller, Federico Fossemo, kubevirt-dev

That's a very good question... https://github.com/kubevirt/project-infra/pull/4726


Daniel Hiller

Feb 18, 2026, 6:12:34 AM

to Luboslav Pivarc, Dan Kenigsberg, Federico Fossemo, kubevirt-dev

Hey all,

The 1.35 lanes are voting now: https://github.com/kubevirt/project-infra/pull/4662

--

Best,

Daniel
