rabbitmq operator pod reconciling with rabbitmq clusters in multiple namespaces

Anjitha M

unread,

Mar 19, 2021, 12:06:08 PM3/19/21

to rabbitmq-users

Hi all,

I would really appreciate if someone could please clarify my query regarding to the rabbitmq pod behaviour.

We are using rabbitmq cluster operator v0.8.0 with rabbitmq image 3.8.5 (or in some cases 3.8.3) in GKE clusters.

We have noticed that the rabbitmq operator pod essentially has cluster level permissions so that the operator pod is able to reconcile to rabbitmq clusters deployed in multiple different namespaces.

We were working on the idea that the operator pod in one namespace would not be able to affect the rabbitmq clusters in other namespaces.

A few consequences that we have noticed due to this is:

1. Operator pod repeatedly crashing with OOM Killed error since it is essentially managing more than its capacity. (At the time of this observation, operator pod already had 8GB of memory allocated; I assume it is more than enough for proper functioning of one pod).

2. Operator pod trying to delete rabbitmq resources which doesn’t exist anymore in the cluster. For example trying to delete a rabbitmq cluster in another namespace which were already deleted. This results in error messages in the operator pod logs and it has become difficult to figure out the reason for error messages.

Now, my query is two-fold -

a) If there are multiple namespaces with rabbitmq operator deployed in the same GKE cluster, do we need to be worried that one operator pod can control (update/delete) rabbitmq resources from other namespaces? If so, how can we combat this issue - would giving unique names to rabbitmq clusters and operator pods help?

b) Is there a future scope of this operator providing namespace isolation as a feature so the operator pod is only able to control resources in a particular namespace? Does the updated operator version solve this issue?

Any and all information helps.

Thanks,

Anjitha M.

M K

unread,

Mar 19, 2021, 1:49:10 PM3/19/21

to rabbitmq-users

I have asked a few folks who work on the Operator to comment. In the meantime, please upgrade

to RabbitMQ 3.8.14. You are about a year behind.

M K

unread,

Mar 19, 2021, 1:50:05 PM3/19/21

to rabbitmq-users

oh, and the latest Kubernetes Operator release is 1.5.0 [1].

1. https://github.com/rabbitmq/cluster-operator/releases/tag/v1.5.0

Message has been deleted

M K

unread,

Mar 19, 2021, 4:04:55 PM3/19/21

to rabbitmq-users

I'm posting this on behalf of my VMware colleague, David Ansari, who responded

to these questions on our team's Slack.

You can set environment variable OPERATOR_SCOPE_NAMESPACE to have the operator watch a single namespace.
See here.
So, you could deploy one RabbitMQ operator per namespace which then only reconciles RabbitMQ clusters in its namespace.Two related issues which aren’t implemented:
https://github.com/rabbitmq/cluster-operator/issues/166: Currently, it’s not possible to have the operator watch multiple namespace but not all.
https://github.com/rabbitmq/cluster-operator/issues/212: Resource bindings need to be updated8GB memory for the operator pod is a lot. How many RabbitMQ clusters does your operator handle?For your 2nd point “trying to delete a rabbitmq cluster in another namespace which were already deleted. This results in error messages in the operator pod logs and it has become difficult to figure out the reason for error messages.“:
Would you mind opening an issue in https://github.com/rabbitmq/cluster-operator with steps to reproduce?
I just tried to reproduce. The operator indeed logs some errors when just deleting a namespace.
However, once the namespace is already deleted, deleting the RabbitMQ cluster results in Error from server (NotFound): rabbitmqclusters.rabbitmq.com “r1” not foundwithout the operator logging errors.

On Friday, March 19, 2021 at 7:06:08 PM UTC+3 anjit...@gmail.com wrote:

Anjitha M

unread,

Mar 30, 2021, 3:31:41 AM3/30/21

to rabbitmq-users

Hi,

@michael.s.klishin thanks for all the information, this really helps.

Can you clarify a query about this statement?

"You can set environment variable OPERATOR_SCOPE_NAMESPACE to have the operator watch a single namespace.
See here."

By this, did you mean that I would modify the operator deployment file (as part of https://github.com/rabbitmq/cluster-operator/releases/download/v1.5.0/cluster-operator.yml ) to add this env variable and then provide the value for the variable as whichever namespace I want the operator to watch?