Prevent SNR from installing on/remediating control plane nodes

JamesG

Oct 15, 2025, 3:42:56 PM
to medik8s
Is it possible, via configuration, to disable SNR control plane remediation and/or prevent the daemonset pods from being installed on control plane nodes? I would like to prevent control plane remediation for now. I've been reading through the documentation and only see how to add additional tolerations, not how to remove the existing defaults.

Thanks!
James

Carlo Lobrano

Oct 16, 2025, 3:28:52 AM
to medik8s
Hi James,

I think you could set the "remediation.medik8s.io/exclude-from-remediation" label to "true" on the nodes you want to exclude, and then manually delete the daemonset pods on those nodes. They won't be redeployed, because of the daemonset's nodeAffinity configuration:

```
# https://github.com/medik8s/self-node-remediation/blob/0bfa3d8cc4f3191c831d36b702aed4b32b7ef625/install/self-node-remediation-deamonset.yaml#L29-L37
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: remediation.medik8s.io/exclude-from-remediation
              operator: NotIn
              values:
                - "true"
```
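
To illustrate why labelled nodes drop out of scheduling, here is a minimal Python sketch of how a `NotIn` match expression like the one above is evaluated. It is a simplification, not the scheduler's actual code. The helper names (`matches_not_in`, `schedulable_nodes`) and the worker node names are hypothetical. It relies on the Kubernetes rule that a node without the label key also satisfies `NotIn`.

```python
# Simplified model of a `NotIn` node-affinity expression.
# Helper names and the example cluster are hypothetical, for illustration only.
EXCLUDE_KEY = "remediation.medik8s.io/exclude-from-remediation"


def matches_not_in(node_labels, key, values):
    """Return True if the node satisfies `key NotIn values`.

    A node that does not carry the key at all also satisfies NotIn.
    """
    return node_labels.get(key) not in values


def schedulable_nodes(nodes):
    """Return the names of nodes on which the daemonset pod may be scheduled."""
    return [
        name
        for name, labels in nodes.items()
        if matches_not_in(labels, EXCLUDE_KEY, ["true"])
    ]


# Hypothetical cluster: one excluded control plane node and two workers.
nodes = {
    "master-00": {EXCLUDE_KEY: "true"},
    "worker-00": {},
    "worker-01": {EXCLUDE_KEY: "false"},
}

print(schedulable_nodes(nodes))  # ['worker-00', 'worker-01']
```

Only a label value of exactly "true" excludes a node. Unlabelled nodes, and nodes labelled with any other value, remain eligible.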

JamesG

Oct 17, 2025, 9:29:06 AM
to medik8s
Close, but no luck.
It looked like it was going to work. Applying the label did remove the daemonset pods from the master nodes, but the remaining worker daemonset pods started crashlooping after they were rolled.

Error for reference:
```
ERROR    setup    problem running manager    {"error": "Node master-00 has no matching Pod"}
main.main
    /app/self-node-remediation/main.go:167
runtime.main
    /usr/lib/golang/src/runtime/proc.go:272
```


self-node-remediation v0.10.2
OCP 4.17.26 / 4.18.22


Carlo Lobrano

Oct 17, 2025, 10:46:11 AM
to medik8s
{"error": "Node master-00 has no matching Pod"}

Did that actually cause any problem?
That error message shouldn't prevent the other nodes from working.

JamesG

Oct 21, 2025, 5:57:46 AM
to medik8s
It did not cause any problems initially, but once the remaining daemonset pods rolled, they started crashlooping. The crashlooping did not stop until I brought the masters back online. I observed this on a few clusters I tested on. I can test applying the changes again.

Carlo Lobrano

Oct 22, 2025, 2:37:49 AM
to medik8s
I see. Could you tell me which SNR version you are running? You might have run into a problem we noticed a couple of weeks ago (specifically, the "no matching Pod" error causing issues).

JamesG

Oct 22, 2025, 11:59:23 AM
to medik8s
Apologies, I thought I had added the version info:
OCP 4.17.26, 4.18.22
SNR v0.10.2

Carlo Lobrano

Oct 28, 2025, 10:40:14 AM
to medik8s
Sorry for the delay, and thanks for the information. There is indeed a bug in that version of SNR which should explain this specific problem, and a fix is already in the main branch.