Go Operator Resource Ownership

Nick Zelei

Jul 22, 2022, 12:51:00 PM
to Operator Framework
Hey all,
I'm trying to figure out the best way to handle resource ownership with an operator I'm building.

I can't figure out why the resources my operator creates don't have the metadata.ownerReferences associated with them.
I want this so that if the main resource my operator controls gets deleted, it automatically handles cleaning up those sub-resources that it creates.

I assumed that configuring the SetupWithManager function would be sufficient, but it doesn't seem to tie ownership to resources and I'm having to write code to manually handle the resource cleanup during the reconcile method.

Example:
func (r *EnvironmentReconciler) SetupWithManager(mgr ctrl.Manager) error {
  return ctrl.NewControllerManagedBy(mgr).
    For(&entitiesv1alpha1.Environment{}).
    Owns(&corev1.Namespace{}).
    Owns(&istionetworkingv1beta1.Gateway{}).
    Owns(&istiosecurityv1beta1.PeerAuthentication{}).
    Owns(&istiosecurityv1beta1.AuthorizationPolicy{}).
    Owns(&certmgrv1.Certificate{}).
    Complete(r)
}

Basically, my Environment resource manages the resources that are marked by the Owns function. I assumed that was enough to have it attach ownership metadata to them, but I must be misunderstanding how all of this is connected, and I couldn't find much information in the docs.

Any help is appreciated!

Nick Carboni

Jul 22, 2022, 2:03:36 PM
to Nick Zelei, Operator Framework
I've typically done this manually on the resources my operator creates using SetControllerReference from controller-runtime [1]


--
You received this message because you are subscribed to the Google Groups "Operator Framework" group.
To unsubscribe from this group and stop receiving emails from it, send an email to operator-framew...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/operator-framework/1053fa53-9897-4b30-a6cf-110caea7bb02n%40googlegroups.com.

Nick Zelei

Jul 22, 2022, 4:57:59 PM
to Operator Framework
I see! Yes I think that is what I am looking for.
I was hoping that was (somehow) handled automatically by the system, but appears not so. 
That's ok, as long as there is a way to handle it. Of course, now I'm not entirely sure what that manager setup configuration is for, but maybe that's another research project. Still new to the operator system.

Anyways, thanks for the help!

Michael Hrivnak

Jul 22, 2022, 5:20:38 PM
to Nick Zelei, Operator Framework
The "Owns" calls cause your controller to not only watch resources of that type, but to also reconcile the owning resource in response to events on the owned resource.

For example in your case, if a Certificate resource gets modified or for any reason generates an event, and that Certificate resource has an owner reference to an Environment resource, then your controller will reconcile the Environment. That gives your controller an opportunity to correct config drift, re-create the Certificate if someone deleted it, etc.
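The event-to-owner mapping described above can be sketched with a small standard-library model. This is not controller-runtime code: `Object`, `OwnerRef`, `Request`, and `requestsForOwned` are hypothetical stand-ins, and the sketch assumes the watch only follows the owner reference that is marked as the controller.

```go
package main

import "fmt"

// OwnerRef is a simplified stand-in for metav1.OwnerReference.
type OwnerRef struct {
	Kind       string
	Name       string
	Controller bool
}

// Object is a simplified stand-in for a Kubernetes object.
type Object struct {
	Kind      string
	Namespace string
	Name      string
	Owners    []OwnerRef
}

// Request identifies the primary resource that should be reconciled.
type Request struct {
	Namespace string
	Name      string
}

// requestsForOwned models an Owns(...) watch: an event on an owned object
// enqueues its controller owner of the primary kind. An object without such
// an owner reference produces no request at all.
func requestsForOwned(obj Object, primaryKind string) []Request {
	var reqs []Request
	for _, ref := range obj.Owners {
		if ref.Controller && ref.Kind == primaryKind {
			// Owner and dependent share a namespace, so reuse obj.Namespace.
			reqs = append(reqs, Request{Namespace: obj.Namespace, Name: ref.Name})
		}
	}
	return reqs
}

func main() {
	owned := Object{
		Kind: "Certificate", Namespace: "team-a", Name: "tls",
		Owners: []OwnerRef{{Kind: "Environment", Name: "staging", Controller: true}},
	}
	manual := Object{Kind: "Certificate", Namespace: "team-a", Name: "manual"}

	fmt.Println("owned certificate enqueues:", requestsForOwned(owned, "Environment"))
	fmt.Println("unowned certificate enqueues:", requestsForOwned(manual, "Environment"))
}
```

The second call shows why a missing owner reference makes the watch look inert: without the reference there is nothing to map the event back to.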

--

Michael Hrivnak

Senior Principal Software Engineer, RHCE

Red Hat

Nick Zelei

Jul 22, 2022, 6:17:06 PM
to Operator Framework
Makes a lot of sense.

In doing some testing, I am actually not seeing this at all though.

As a simple example:

I have a resource the operator manages called DockerLogin - when that is created, it creates an accompanying k8s secret with docker credentials in it.
If I update the DockerLogin resource, the reconciler kicks in. However, if I update or delete the Secret, nothing happens. It's also not attaching the owner references to it (I updated the code to make the call to SetControllerReference)

err = ctrl.SetControllerReference(dl, secret, r.Scheme)
if err != nil {
  logger.Error(err, "unable to set controller reference on create")
}

I should note: the operator is in a completely different namespace from the resources it is managing. Both the DockerLogin and its accompanying Secret are in the same namespace (Namespace B), while the operator is in Namespace A.

// SetupWithManager sets up the controller with the Manager.
func (r *DockerLoginReconciler) SetupWithManager(mgr ctrl.Manager) error {
  return ctrl.NewControllerManagedBy(mgr).
    For(&entitiesv1alpha1.DockerLogin{}).
    Owns(&corev1.Secret{}).
    Owns(&corev1.ServiceAccount{}).
    Complete(r)
}

Michael Hrivnak

Jul 22, 2022, 6:26:00 PM
to Nick Zelei, Operator Framework
Are you calling "r.client.Update(ctx, secret)" after setting the controller reference? The SetControllerReference call does not actually save the resource for you; it only mutates its in-memory state.
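The difference between mutating the local object and saving it can be illustrated with a toy model. Everything here is hypothetical (`fakeAPI`, `Secret`, `setOwner` are stand-ins, not the real client or `SetControllerReference`); it only assumes that the client hands back a local copy and that nothing reaches the server until an explicit write.

```go
package main

import "fmt"

// Secret is a simplified stand-in for corev1.Secret.
type Secret struct {
	Name     string
	OwnerUID string // empty means no owner reference is set
}

// fakeAPI is a hypothetical in-memory stand-in for the API server.
type fakeAPI struct {
	stored map[string]Secret
}

// Get returns a copy of the stored object, like a client read.
func (a *fakeAPI) Get(name string) (Secret, bool) {
	s, ok := a.stored[name]
	return s, ok
}

// Update persists the caller's local copy.
func (a *fakeAPI) Update(s Secret) {
	a.stored[s.Name] = s
}

// setOwner models the in-memory effect of setting a controller reference:
// it changes only the local copy passed in.
func setOwner(s *Secret, ownerUID string) {
	s.OwnerUID = ownerUID
}

func main() {
	api := &fakeAPI{stored: map[string]Secret{"docker-creds": {Name: "docker-creds"}}}

	local, _ := api.Get("docker-creds")
	setOwner(&local, "uid-of-dockerlogin")

	before, _ := api.Get("docker-creds")
	fmt.Printf("stored owner before Update: %q\n", before.OwnerUID)

	api.Update(local)
	after, _ := api.Get("docker-creds")
	fmt.Printf("stored owner after Update: %q\n", after.OwnerUID)
}
```

In a real reconciler the same ordering applies: set the reference first, then create or update the object so the reference is part of what gets written.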

If the code is open, feel free to send a github link.

Nick Zelei

Jul 22, 2022, 6:37:23 PM
to Operator Framework
Huzzah! I missed that part in the docs. 
I updated the logic to set the ref prior to a create/update and it's setting the owner reference now!

And making changes to the secret now results in proper reconciler updates. Wonderful!
This is all making tons of sense now.

This works great for resources that exist in the same namespace right? Meaning: object ownership only works for resources that exist in the same namespace. And a finalizer needs to be used for any cross-ns resource cleanup?

The use case here is that I need to create a certificate in the istio-system namespace for TLS, but the main managed resource is created in a separate namespace.

P.S. Code is unfortunately not open to the public, otherwise I'd send a link!

Thanks for the help and quick turnaround!

Nick Zelei

Jul 22, 2022, 7:19:15 PM
to Operator Framework
The other option is to of course just write the code to manually clean it up too, which is probably the easier solution.
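For the cross-namespace cleanup raised in this thread, the finalizer flow can be sketched as a small state machine. This is a standard-library model, not operator code: `Resource`, `reconcile`, and the finalizer name are hypothetical, and `Deleting` stands in for a set `metadata.deletionTimestamp`.

```go
package main

import "fmt"

// cleanupFinalizer is a hypothetical finalizer name used for illustration.
const cleanupFinalizer = "example.com/cross-namespace-cleanup"

// Resource is a simplified stand-in for a custom resource.
type Resource struct {
	Name       string
	Deleting   bool     // models a non-zero metadata.deletionTimestamp
	Finalizers []string // models metadata.finalizers
}

func hasFinalizer(r *Resource, f string) bool {
	for _, x := range r.Finalizers {
		if x == f {
			return true
		}
	}
	return false
}

func removeFinalizer(r *Resource, f string) {
	kept := make([]string, 0, len(r.Finalizers))
	for _, x := range r.Finalizers {
		if x != f {
			kept = append(kept, x)
		}
	}
	r.Finalizers = kept
}

// reconcile adds the finalizer while the resource is live. Once the resource
// is marked for deletion it runs cleanup and removes the finalizer only if
// cleanup succeeded, so a failed cleanup is retried on the next pass.
func reconcile(r *Resource, cleanup func() error) (string, error) {
	if !r.Deleting {
		if !hasFinalizer(r, cleanupFinalizer) {
			r.Finalizers = append(r.Finalizers, cleanupFinalizer)
			return "finalizer added", nil
		}
		return "nothing to do", nil
	}
	if hasFinalizer(r, cleanupFinalizer) {
		if err := cleanup(); err != nil {
			return "", err
		}
		removeFinalizer(r, cleanupFinalizer)
		return "cleaned up", nil
	}
	return "nothing to do", nil
}

func main() {
	env := &Resource{Name: "staging"}
	cleanup := func() error {
		fmt.Println("deleting certificate in istio-system")
		return nil
	}

	action, _ := reconcile(env, cleanup)
	fmt.Println(action)

	env.Deleting = true // the user deleted the resource
	action, _ = reconcile(env, cleanup)
	fmt.Println(action, "- remaining finalizers:", len(env.Finalizers))
}
```

The same cleanup function could also be called without a finalizer, but then nothing holds the resource in place until the cleanup has actually run.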

Camila Macedo

Jul 23, 2022, 7:14:42 AM
to Nick Zelei, Operator Framework
Hi Nick, 

The best approach is to ensure that every resource created during reconciliation has its ownerRef set. That way, you let the Kubernetes API do its job of removing all resources that depend on the custom resource you are reconciling when that resource is removed. More info: https://kubernetes.io/docs/tasks/administer-cluster/use-cascading-deletion/. It will also allow you to "watch/observe" the changes properly, see: 

How to set the ownerRef:
// Set the ownerRef for the Deployment
// More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/owners-dependents/
	if err := ctrl.SetControllerReference(busybox, dep, r.Scheme); err != nil {
		return nil, err
	}
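What the ownerRef buys you at deletion time can be sketched with a toy garbage collector. This is a simplified model under stated assumptions: `Object` and `deleteWithCascade` are hypothetical, each object has at most one owner, and details of the real collector such as foreground/background propagation are left out.

```go
package main

import "fmt"

// Object is a simplified stand-in for a Kubernetes object with one owner.
type Object struct {
	UID      string
	Name     string
	OwnerUID string // empty means the object has no owner
}

// deleteWithCascade deletes the object with the given UID, then keeps
// removing objects whose owner no longer exists until nothing changes.
func deleteWithCascade(objs map[string]Object, uid string) {
	delete(objs, uid)
	for changed := true; changed; {
		changed = false
		for id, o := range objs {
			if o.OwnerUID == "" {
				continue
			}
			if _, ownerExists := objs[o.OwnerUID]; !ownerExists {
				delete(objs, id) // deleting during range is safe in Go
				changed = true
			}
		}
	}
}

func main() {
	objs := map[string]Object{
		"u1": {UID: "u1", Name: "busybox-cr"},
		"u2": {UID: "u2", Name: "busybox-deployment", OwnerUID: "u1"},
		"u3": {UID: "u3", Name: "unrelated-configmap"},
	}

	deleteWithCascade(objs, "u1")

	for _, o := range objs {
		fmt.Println("still present:", o.Name)
	}
}
```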

Now, let's understand what the code you shared does:

Example:
func (r *EnvironmentReconciler) SetupWithManager(mgr ctrl.Manager) error {
  return ctrl.NewControllerManagedBy(mgr).
    For(&entitiesv1alpha1.Environment{}).
    Owns(&corev1.Namespace{}).
    Owns(&istionetworkingv1beta1.Gateway{}). 
    Owns(&istiosecurityv1beta1.PeerAuthentication{}).
    Owns(&istiosecurityv1beta1.AuthorizationPolicy{}).
    Owns(&certmgrv1.Certificate{}).
    Complete(r)
}

The above code configures the "watches" feature. When events for creating, updating, or deleting these resources are raised, the reconciliation is re-triggered, so you can ensure the desired state on the cluster.
  • For(&myCRDKind) specifies the custom resource (a Kind defined by a custom resource definition) as the primary resource to watch. The reconciliation runs when a CR of this Kind is created, updated, or deleted.
Example: For(&cachev1alpha1.Memcached{}).
>> You will find it in the SDK tutorial. It ensures that the manager watches for changes in resources of this Kind. When we create a CR of the Memcached Kind, it triggers the reconciliation, and we can ensure that there is a Deployment on the cluster to run the Operand image.
  • Owns(&dependentKind), where the Kind is that of a resource that depends on the CR to realize the desired state the CR represents on the cluster, specifies that resource as a secondary resource to watch. The reconciliation runs when resources of this Kind that are owned by the CR (via the ownerRef) are created, updated, or deleted.
Example: Owns(&appsv1.Deployment{}).
>> You will find it in the SDK tutorial. It ensures that the manager watches for changes in the Deployments that carry the ownerRef (that is, those owned/created by the CR). For example, if you change the number of replicas, the reconciliation is re-triggered. In the tutorial example, we ensure that the number of replicas matches what the CR specifies via the spec size.

See the tutorial: https://sdk.operatorframework.io/docs/building-operators/golang/

By using Operators, it’s possible not only to provide all expected resources but also to manage them dynamically, programmatically, and at runtime. To illustrate this idea, imagine that someone accidentally changed a configuration or removed a resource by mistake; in this case, the Operator could fix it without any human intervention.

Therefore, the goal is to always ensure a desired state on the cluster via idempotent solutions, and because of this, we "watch" the required resources on the cluster.
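An idempotent reconcile of this kind can be sketched as follows. This is a minimal standard-library model, not the SDK tutorial's code: `Deployment`, the `cluster` map, and `reconcile` are hypothetical, and the only desired state tracked is a replica count taken from the CR's spec size.

```go
package main

import "fmt"

// Deployment is a simplified stand-in for appsv1.Deployment.
type Deployment struct {
	Replicas int
}

// reconcile drives the cluster toward the desired state and reports what it
// did. Calling it again with the same inputs changes nothing.
func reconcile(cluster map[string]Deployment, name string, size int) string {
	current, exists := cluster[name]
	switch {
	case !exists:
		// The Deployment is missing (never created, or deleted by mistake).
		cluster[name] = Deployment{Replicas: size}
		return "created"
	case current.Replicas != size:
		// Someone changed the replica count: correct the drift.
		current.Replicas = size
		cluster[name] = current
		return "updated"
	default:
		return "unchanged"
	}
}

func main() {
	cluster := map[string]Deployment{}

	fmt.Println(reconcile(cluster, "memcached", 3)) // first pass
	fmt.Println(reconcile(cluster, "memcached", 3)) // repeated pass

	cluster["memcached"] = Deployment{Replicas: 1} // manual edit
	fmt.Println(reconcile(cluster, "memcached", 3))

	delete(cluster, "memcached") // accidental deletion
	fmt.Println(reconcile(cluster, "memcached", 3))
}
```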

I recommend you check the common suggestions doc under the best practices section on the SDK website, see: https://sdk.operatorframework.io/docs/best-practices/common-recommendation/.

I hope that can help you out.

Cheers, 

CAMILA MACEDO

SR. SOFTWARE ENGINEER 

RED HAT Operator framework

Red Hat UK

She / Her / Hers

IM: cmacedo

I respect your work-life balance. Therefore there is no need to answer this email out of your office hours.





Nick Zelei

Jul 27, 2022, 1:21:25 PM
to Operator Framework
Appreciate the detailed response Camila!

Does this work for resources that exist across namespaces as well? 
For instance the CRD is placed in Namespace A, but it owns a resource that is in Namespace B?

Camila Macedo

Jul 28, 2022, 5:55:54 AM
to Nick Zelei, Operator Framework
Hi Nick, 

Appreciate the detailed response Camila!
Thank you for the feedback. I am glad that was helpful.

Does this work for resources that exist across namespaces as well? 

By default, all projects scaffolded with the tool are cluster-scoped. That means the manager will watch/cache the whole cluster. So, yes, it will work across all namespaces on the cluster. For more info see: https://sdk.operatorframework.io/docs/building-operators/golang/operator-scope/

For instance the CRD is placed in Namespace A, but it owns a resource that is in Namespace B?

The CRD is what defines the API, and by default it is scaffolded as cluster-scoped.
Therefore, after you apply it to the cluster, it is valid for the whole cluster.

What you apply in the namespaces are the Custom Resources (each is like an instance of the CRD).
I'd like to recommend you check out the doc: https://book.kubebuilder.io/cronjob-tutorial/gvks.html for a better understanding. 

Shawn Hurley

Jul 28, 2022, 12:04:35 PM
to Camila Macedo, Nick Zelei, Operator Framework
Hello All,

I don’t know if this is fixed in a newer version, but there was an issue where, if you have two namespaced objects in different namespaces and one owns the other, garbage collection could delete the owned resource.
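The Kubernetes documentation on owners and dependents states that cross-namespace owner references are disallowed by design: a namespaced dependent can name a cluster-scoped or a same-namespace owner, and a cluster-scoped dependent can only name a cluster-scoped owner. The rule can be sketched as a small check. `Object` and `validateOwner` here are hypothetical stand-ins written for illustration; as far as I know, controller-runtime's SetControllerReference performs a comparable validation, but treat that as an assumption and check the version you use.

```go
package main

import (
	"errors"
	"fmt"
)

// Object is a simplified stand-in for a Kubernetes object.
// An empty Namespace models a cluster-scoped resource.
type Object struct {
	Kind      string
	Namespace string
	Name      string
}

// validateOwner models the ownership rule: a cluster-scoped owner may own
// anything, while a namespaced owner may only own objects in its namespace.
func validateOwner(owner, dependent Object) error {
	if owner.Namespace == "" {
		return nil
	}
	if dependent.Namespace == "" {
		return errors.New("cluster-scoped resource must not have a namespaced owner")
	}
	if owner.Namespace != dependent.Namespace {
		return fmt.Errorf("cross-namespace owner reference: owner in %q, dependent in %q",
			owner.Namespace, dependent.Namespace)
	}
	return nil
}

func main() {
	env := Object{Kind: "Environment", Namespace: "team-a", Name: "staging"}
	sameNS := Object{Kind: "Secret", Namespace: "team-a", Name: "creds"}
	otherNS := Object{Kind: "Certificate", Namespace: "istio-system", Name: "tls"}

	fmt.Println("same namespace:", validateOwner(env, sameNS))
	fmt.Println("other namespace:", validateOwner(env, otherNS))
}
```

Under this rule, the certificate in istio-system from earlier in the thread cannot be cleaned up through an owner reference from a resource in another namespace, which is consistent with the finalizer or manual-cleanup options discussed above.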

Sorry if this is unhelpful or I am misunderstanding the question.

Thanks,

Shawn Hurley 




Alex Greene

Jul 29, 2022, 5:28:11 PM
to Shawn Hurley, Camila Macedo, Nick Zelei, Operator Framework
Does this work for resources that exist across namespaces as well? 
For instance the CRD is placed in Namespace A, but it owns a resource that is in Namespace B?



--
Alexander Greene
He - Him - His
Senior Software Developer
IRC: agreene
