Kubernetes persistent volume replication strategy


Abhinav Kulkarni

Jun 30, 2021, 5:17:12 AM
to rabbitmq-users
Hi,

I managed to spin up a simple hello-world RabbitMQ cluster on Kubernetes, and I noticed that the PVC access mode is RWO (ReadWriteOnce). This means only one pod can read from and write to this PV. If I scale the cluster to, say, 5 replicas, 5 different PVs are created.

I have a couple of questions regarding this:
  1. Is the data shared across all the 5 PVs?
  2. Is the data replicated across all the 5 PVs?
  3. If I gracefully shut down one pod, what happens to the data? Does it get moved to the remaining 4 PVs?
  4. What if a PV crashes due to network failure, disk failure, etc.? Is the data lost forever in that case?
Thanks!

Michal Kuratczyk

Jul 5, 2021, 5:47:41 AM
to rabbitm...@googlegroups.com
Hi,

1. You can just use the RabbitMQ Operator to easily deploy RabbitMQ to Kubernetes
2. Each node/pod should have its own PV
3. Some data is always replicated between RabbitMQ nodes (metadata about vhosts, queues, etc.), and some data can be replicated (quorum queues, etc.), but each node is responsible for persisting its data locally
4. Since each node should have its own PV, an unavailable PV should only cause unavailability of one node. That node will resynchronize its state once it joins the cluster again.
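
To illustrate point 2, here is a minimal Python sketch of how a StatefulSet ends up with one PersistentVolumeClaim per pod. It relies on the Kubernetes convention that a claim created from a volume claim template is named `<template>-<pod>`, where the pod is `<statefulset>-<ordinal>`. The names `persistence` and `hello-world-server` are assumptions chosen for illustration and are not taken from this thread.

```python
from typing import List


def pvc_names(template: str, statefulset: str, replicas: int) -> List[str]:
    """Return the PVC name that each StatefulSet pod would claim.

    Each pod ordinal gets its own claim, so no two pods share a volume.
    """
    return [f"{template}-{statefulset}-{i}" for i in range(replicas)]


# Assumed names, for illustration only.
claims = pvc_names("persistence", "hello-world-server", 5)
for name in claims:
    print(name)
```

Scaling to 5 replicas therefore yields 5 distinct claims, each bound to its own PV, which matches what was observed in the original question.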

Best,

--
Michał
RabbitMQ team

Albion

Jul 5, 2023, 11:03:18 AM
to rabbitmq-users

Hi all,

I also have a question related to this.
I am trying to create Persistent Volumes on vSphere for my RabbitMQ cluster. As you know, the PV claims for a RabbitMQ cluster use the access mode ReadWriteOnce.
According to the official documentation:

ReadWriteOnce
the volume can be mounted as read-write by a single node. ReadWriteOnce access mode still can allow multiple pods to access the volume when the pods are running on the same node.

But what will happen if the pod hosting one of the RabbitMQ nodes is restarted, and that "same" pod is then started on another Kubernetes node?
That would mean the RabbitMQ node now lives on another Kubernetes node and does not have write access to its earlier PV, true?
What consequences would this have for the RabbitMQ cluster in general?

Thank you

Albion

Mark Weaver

Jul 5, 2023, 11:40:59 AM
to rabbitm...@googlegroups.com
This depends on your storage. Kubernetes supports several kinds of storage, including node-local storage and network-attached storage. In the former case, Kubernetes will only schedule (run) the pod on the node that has the local storage attached. In the latter case the pod can be scheduled elsewhere, and since the storage is network attached (and almost certainly replicated), it doesn't matter that the pod no longer runs on the original node. Since these are persistent volumes, the data will always be present, but if you have local storage and the node is unavailable, the pod will not start. This is OK if you have enough RabbitMQ members (for example, with 3 nodes on local storage you can take one offline and still have a quorum).
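
As a rough illustration of the two cases above, here is a small Python model. It is a sketch rather than how the Kubernetes scheduler is implemented; the function names and node names are hypothetical, and "quorum" is taken to mean a strict majority of cluster members.

```python
from typing import Optional, Set


def schedulable_nodes(ready_nodes: Set[str], pinned_to: Optional[str]) -> Set[str]:
    """Nodes where a pod may run, given where its volume lives.

    pinned_to is the node holding a node-local PV, or None for
    network-attached storage that any ready node can mount.
    """
    if pinned_to is None:
        return set(ready_nodes)
    # A node-local PV restricts the pod to that one node, if it is ready.
    return {pinned_to} & ready_nodes


def has_quorum(cluster_size: int, running: int) -> bool:
    """True when a strict majority of cluster members is running."""
    return running > cluster_size // 2


ready = {"node-b", "node-c"}  # hypothetical: node-a is offline
local = schedulable_nodes(ready, pinned_to="node-a")
network = schedulable_nodes(ready, pinned_to=None)
print(local)    # empty: the pod stays pending until node-a returns
print(sorted(network))
print(has_quorum(3, 2))
```

With local storage the pod pinned to the offline node cannot start anywhere else, but two of three members still form a majority, which is the situation described above.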

vSphere seems to have its own Kubernetes CSI driver:

https://docs.vmware.com/en/VMware-vSphere-Container-Storage-Plug-in/index.html

I don't know how this works, but it looks like it provides some network storage (one of the pictures says NFS). It's probably worth figuring out how to create a node-local or a NAS-based PVC using this CSI. For small numbers of cluster nodes I think node-local storage makes sense: RabbitMQ already replicates data between cluster members (for replicated queue types such as quorum queues), so there doesn't seem to be much point in backing that with a replicated storage solution. For larger numbers of nodes it might not be worth having to micromanage things that much.

In our use case we have real hardware and use OpenEBS LocalPV on top of LVM. This works well for us, and since we have a small number of nodes we just schedule RabbitMQ on each one.

Hope that helps,

Mark