SNR and FAR operators with a Cluster API k8s cluster


Hai Wu

Oct 6, 2024, 11:50:22 PM
to medik8s
It seems that, to work properly, the SNR (Self Node Remediation) operator deletes the unhealthy k8s node from the cluster. For a k8s cluster installed via Cluster API, that would conflict with the Cluster API provider. Does that mean the SNR operator can't be used for such clusters?

With the FAR (Fence Agents Remediation) operator, there would be no such concern for a k8s cluster installed by a Cluster API provider.

Am I missing anything about the SNR operator? Is this understanding correct?

Marc Sluiter

Oct 7, 2024, 3:59:46 AM
to Hai Wu, medik8s
Hey,

It seems you're using an old SNR version; SNR has not deleted nodes for quite a while. It either just deletes the pods on the unhealthy node, or uses the "out of service" taint if available.

BR, Marc
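
For readers unfamiliar with the "out of service" taint mentioned above: Kubernetes defines it under the key node.kubernetes.io/out-of-service. The Python sketch below is not SNR's code. It only illustrates, on a plain Node manifest held as a dict, what applying that taint amounts to. The helper name with_out_of_service_taint and the sample node "worker-1" are hypothetical; the value "nodeshutdown" and the effect "NoExecute" are assumptions taken from the example in the Kubernetes documentation on non-graceful node shutdown, and SNR may set different values.

```python
import copy

# Taint key defined by Kubernetes for non-graceful node shutdown handling.
OUT_OF_SERVICE_KEY = "node.kubernetes.io/out-of-service"


def with_out_of_service_taint(node, value="nodeshutdown", effect="NoExecute"):
    """Return a copy of a Node manifest (plain dict) that carries the taint.

    Hypothetical helper for illustration only. The default value and effect
    are assumptions based on the Kubernetes docs example, not on SNR.
    """
    tainted = copy.deepcopy(node)
    taints = tainted.setdefault("spec", {}).setdefault("taints", [])
    # Idempotent: skip if a taint with the same key and effect already exists.
    already_set = any(
        t.get("key") == OUT_OF_SERVICE_KEY and t.get("effect") == effect
        for t in taints
    )
    if not already_set:
        taints.append({"key": OUT_OF_SERVICE_KEY, "value": value, "effect": effect})
    return tainted


# Hypothetical unhealthy node. The Node object is kept, only a taint is added.
node = {"apiVersion": "v1", "kind": "Node", "metadata": {"name": "worker-1"}}
tainted = with_out_of_service_taint(node)
print(tainted["spec"]["taints"])
```

The relevant point for the Cluster API question is that the Node object stays in the cluster in this approach. Per the Kubernetes documentation, pods on a node tainted this way are force-deleted if they do not tolerate the taint, and their volumes are detached so the workloads can be rescheduled elsewhere.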



Marc Sluiter

Oct 8, 2024, 6:11:40 AM
to hai wu, medik8s


hai wu

Oct 8, 2024, 9:41:09 AM
to Marc Sluiter, medik8s
Thanks, Marc, for confirming this!

I guess the docs need to be updated accordingly. I was
reading the docs here:
https://www.medik8s.io/remediation/self-node-remediation/faq/ and
here: https://www.medik8s.io/remediation/self-node-remediation/how-it-works/,
and a few places there say that SNR would delete the node
object.

Also, the OpenShift 4.11 docs mention a NodeDeletion strategy here:
https://docs.openshift.com/container-platform/4.11/nodes/nodes/eco-self-node-remediation-operator.html,
but I could not find SNR mentioned in the docs for OpenShift versions
higher than 4.11.

Thanks
Hai


Or Raz

Oct 8, 2024, 9:49:17 AM
to hai wu, medik8s, Marc Sluiter
Please be aware that there are two more (optional) remediators that you can use with NHC: Fence Agents Remediation and Machine Deletion Remediation.
Their documentation is on the same page (and under the product Workload Availability for Red Hat OpenShift).

We need to update our upstream documentation. Thank you for bringing this to our attention.

Best regards,
OR

