[RFC] Quarantine lifecycle timelines and tracking requirements


Daniel Hiller

Apr 28, 2026, 12:30:11 PM
to kubevirt-dev, Denis Ollier Pinas, Dylan White
Hey folks,                                                  

I've opened a PR to add explicit lifecycle timelines and tracking requirements to our test quarantine process: https://github.com/kubevirt/kubevirt/pull/17633

Why?

Our quarantine process currently defines clear entry and exit criteria, but has no deadlines between them. In practice this means quarantined tests can, and do, remain quarantined indefinitely, sometimes for over a year. Without a fixed timeline or tracking requirements, no mechanism ensures that quarantined tests are actively being worked on. Quarantine thus becomes a permanent escape hatch rather than a temporary safety valve.

The goal is straightforward: every quarantined test should either be fixed or deleted within a bounded timeframe. This keeps SIGs accountable and prevents the quarantine list from growing without bound.

The PR only changes docs/quarantine.md; there are no code or CI changes. I'd appreciate feedback on whether the proposed timelines work.

Later I will follow up on how the transition for existing quarantined tests should work; I still need to think about that.

--

Kind regards,


Daniel Hiller

He / Him / His

Principal Software Engineer, KubeVirt CI, OpenShift Virtualization

Red Hat

dhi...@redhat.com   

Red Hat GmbH, Registered seat: Werner von Siemens Ring 12, D-85630 Grasbrunn, Germany  
Commercial register: Amtsgericht Muenchen/Munich, HRB 153243,
Managing Directors: Ryan Barnhart, Charles Cachera, Avril Crosse O'Flaherty  

Daniel Hiller

Apr 30, 2026, 3:47:52 AM
to kubevirt-dev, Denis Ollier Pinas, Dylan White, Luboslav Pivarc, Jed Lejosne, Orel Misan, Felix Matouschek
Hey all,



OK, after thinking about this, I think I got your point: this is about roles and responsibilities.

The real issue, I believe, is that responsibility is not assigned to a specific person; however, sig-ci wouldn't be the right group to assign it to.

How about we detail the process further:

* The test is quarantined and receives a SIG label, and the SIG chair is assigned to the tracker issue
* The SIG chair
  * either reassigns the issue to a contributor responsible for handling the test, or
  * keeps the assignment themselves
* The assignee decides what to do with the test (fix/delete)
* If the assignee commits to fixing the test, the fix must land within the fix window
* After three weeks, sig-ci warns that the fix window will expire within one week
* After four weeks, the deletion window begins
* The assignee may ask for an extension of at most two weeks to fix the test
* If that extension expires, the test is deleted
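
To make the timeline in the list above concrete, here is a rough Python sketch. It is only an illustration and not part of the PR: the three-week warning, the four-week fix window, and the two-week maximum extension come from the list, while the record type, the function name, and the state names are made up. It also assumes that an extension is counted from the end of the fix window, which the list does not spell out.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Durations taken from the list above.
WARN_AFTER = timedelta(weeks=3)      # sig-ci warns the assignee
FIX_WINDOW = timedelta(weeks=4)      # deletion window begins afterwards
MAX_EXTENSION = timedelta(weeks=2)   # longest extension an assignee may request


@dataclass
class QuarantinedTest:
    # Hypothetical record; the field names are illustrative only.
    name: str
    quarantined_on: date
    extension: timedelta = timedelta(0)

    def __post_init__(self) -> None:
        if not timedelta(0) <= self.extension <= MAX_EXTENSION:
            raise ValueError("extension must be between zero and two weeks")


def quarantine_state(test: QuarantinedTest, today: date) -> str:
    """Return where the test stands in the proposed lifecycle on a given day."""
    age = today - test.quarantined_on
    if age < WARN_AFTER:
        return "fix-window"
    if age < FIX_WINDOW:
        return "warned"            # fix window expires within one week
    # Assumption: an extension is counted from the end of the fix window.
    if age < FIX_WINDOW + test.extension:
        return "extended"
    return "deletion-window"       # sig-ci may now delete the test
```

A periodic job could call `quarantine_state` for every open tracker issue and comment on those that are "warned" or in the "deletion-window"; whether that is automated at all is an open question here.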

An alternative to deleting the test would be to create the tracker issue with release-blocker state, which would block the next release until the issue is closed.

WDYT?
 
                                                                                                                

-- 
Best,
Daniel

Daniel Hiller

Apr 30, 2026, 3:49:00 AM
to kubevirt-dev, Denis Ollier Pinas, Dylan White, Luboslav Pivarc, Jed Lejosne, Orel Misan, Felix Matouschek
One point I forgot to mention: sig-ci is responsible for deleting the test.
 


Daniel Hiller

May 6, 2026, 5:56:12 AM
to kubevirt-dev, Denis Ollier Pinas, Dylan White, Luboslav Pivarc, Jed Lejosne, Orel Misan, Felix Matouschek
Hey all,

During today's Test Quarantine meeting, someone asked about the proposal's progress.

Thank you to those who've chimed in on the PR; I've incorporated your feedback. Please take another look to see if this satisfies you.

We also agreed to create tracker issues for the quarantined tests to help track the transition towards the policy. We will create one tracker issue per SIG to make it easier to track the currently quarantined tests until the lifecycle becomes effective.
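
One way to picture the per-SIG tracker issues is to group the currently quarantined tests by SIG and render one checklist per group. The following is a minimal Python sketch under assumptions: it supposes each test name carries its SIG in a `[sig-...]` tag, and the function names and the issue layout are invented for illustration rather than taken from the PR.

```python
import re
from collections import defaultdict

# Assumption: the owning SIG appears in the test name as a "[sig-...]" tag.
SIG_TAG = re.compile(r"\[(sig-[a-z0-9-]+)\]")


def group_by_sig(test_names):
    """Map each SIG to the sorted list of its quarantined tests."""
    groups = defaultdict(list)
    for name in test_names:
        match = SIG_TAG.search(name)
        sig = match.group(1) if match else "unassigned"
        groups[sig].append(name)
    return {sig: sorted(names) for sig, names in sorted(groups.items())}


def tracker_issue_body(sig, names):
    """Render a checklist body for one tracker issue (layout is made up)."""
    lines = [f"Quarantined tests owned by {sig}:"]
    lines += [f"- [ ] {name}" for name in names]
    return "\n".join(lines)
```

Tests without a recognizable tag end up under "unassigned", so they stay visible instead of silently dropping out of the transition.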
 


Daniel Hiller

May 6, 2026, 8:29:36 AM
to kubevirt-dev, Denis Ollier Pinas, Dylan White, Luboslav Pivarc, Jed Lejosne, Orel Misan, Felix Matouschek