Hi all,
I would really appreciate it if someone could clarify my query regarding the RabbitMQ operator pod's behaviour.
We are using rabbitmq cluster operator v0.8.0 with rabbitmq image 3.8.5 (or in some cases 3.8.3) in GKE clusters.
We have noticed that the rabbitmq operator pod has cluster-level permissions, so a single operator pod is able to reconcile RabbitMQ clusters deployed in multiple different namespaces.
We had assumed that an operator pod in one namespace would not be able to affect RabbitMQ clusters in other namespaces.
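For context, this is how we checked the operator's cluster-wide permissions. (The ClusterRoleBinding and ClusterRole names below are the ones the default manifests created in our setup; they may differ in yours.)

```shell
# List ClusterRoleBindings that reference the operator's service account.
# A ClusterRoleBinding (as opposed to a namespaced RoleBinding) grants the
# bound ClusterRole across ALL namespaces.
kubectl get clusterrolebinding -o wide | grep rabbitmq

# Inspect the ClusterRole granted to the operator; in our deployment it
# includes verbs such as get/list/watch/update/delete on rabbitmqclusters,
# with no namespace restriction.
kubectl describe clusterrole rabbitmq-cluster-operator-role
```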
A few consequences we have noticed as a result:
1. The operator pod repeatedly crashes with an OOMKilled error, since it is essentially managing more than its capacity. (At the time of this observation, the operator pod already had 8 GB of memory allocated, which I assume is more than enough for a single pod to function properly.)
2. The operator pod tries to delete RabbitMQ resources that no longer exist in the cluster, for example a RabbitMQ cluster in another namespace that was already deleted. This fills the operator pod logs with error messages and makes it difficult to figure out their root cause.
Now, my query is two-fold:
a) If the rabbitmq operator is deployed in multiple namespaces of the same GKE cluster, do we need to worry that one operator pod can control (update/delete) RabbitMQ resources in other namespaces? If so, how can we mitigate this? Would giving unique names to the RabbitMQ clusters and operator pods help?
b) Is namespace isolation planned as a feature for this operator, so that an operator pod can only control resources in its own namespace? Does a newer operator version already solve this?
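For reference, regarding (b): if newer operator versions support restricting the watch scope via an environment variable (I have seen OPERATOR_SCOPE_NAMESPACE mentioned for later releases, but please correct me if that is wrong), I imagine the deployment could be patched along these lines. The namespace names here are placeholders:

```shell
# Hypothetical: restrict the operator to watching a single namespace,
# assuming the operator honours an OPERATOR_SCOPE_NAMESPACE variable.
kubectl -n rabbitmq-system set env deployment/rabbitmq-cluster-operator \
  OPERATOR_SCOPE_NAMESPACE=my-app-namespace
```

Is something like this the recommended approach, or is a separate per-namespace install with namespaced Roles the intended model?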
Any and all information helps.
Thanks,
Anjitha M.