Compatibility of RabbitMQ with a GlusterFS StorageClass


Leo Retrain

Nov 9, 2020, 4:19:59 AM
to rabbitmq-users

Hello,
I have an issue in my environment, so I wonder whether RabbitMQ is compatible with a GlusterFS StorageClass.

Let me explain the context and the issue:

I work on a cluster using a GlusterFS StorageClass, and I run RabbitMQ with 3 replicas.
For each replica there is a PVC declared (in GlusterFS) on:
          - name: data
            mountPath: /var/lib/rabbitmq/
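
For context, the PVCs come from a StatefulSet volumeClaimTemplate, so each replica gets its own claim. A minimal sketch (names, image, and sizes are illustrative, not our exact manifest):

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: crmq
spec:
  serviceName: crmq
  replicas: 3
  selector:
    matchLabels:
      app: rabbitmq
  template:
    metadata:
      labels:
        app: rabbitmq
    spec:
      containers:
        - name: rabbitmq
          image: rabbitmq:3.8
          volumeMounts:
            - name: data
              mountPath: /var/lib/rabbitmq/
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: glusterfs   # our GlusterFS StorageClass
        resources:
          requests:
            storage: 8Gi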
The issue is that the RabbitMQ replicas need to restart 2-3 times before stabilizing (~15 min),
and I get these errors in the logs:

"Cannot open dets table",schema,[{file,"/var/lib/rabbitmq/mnesia/rabbit@ornery-clownfish-crmq-1/"Cannot open dets table",schema,[{file,"/var/lib/rabbitmq/mnesia/rabbit@ornery-clownfish-crmq-1/schema.DAT"

I also see in the logs after a restart:

dets: file "/var/lib/rabbitmq/mnesia/rabbit@ornery-clownfish-crmq-1/schema.DAT" not properly closed, repairing ...

I also checked the files in the replicas that are not working (just before they restart), and it seems that some files are missing.

I did the same test with another PVC (still in GlusterFS) on:

          - name: log
            mountPath: /var/log/rabbitmq/

And with that one I don't get any issue at all.

And if I use another StorageClass, there are no issues either.

I wonder if you have already seen a similar issue,
if you know how to fix it, or if there are indeed some limitations with GlusterFS.
We suspect something like multiple RabbitMQ processes accessing the same file, maybe because GlusterFS is sharing the data of the 3 pods.

One more piece of information: the 3 replicas are launched sequentially, one after the other (each once the previous one's readiness probe reports 1/1); see the sketch below.
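
The sequential startup comes from the StatefulSet's default ordered pod management combined with the readiness probe; a sketch of the relevant bits (the probe command and timings are assumptions, not necessarily our exact config):

spec:
  podManagementPolicy: OrderedReady   # default: pod N+1 starts only once pod N is Ready
  template:
    spec:
      containers:
        - name: rabbitmq
          readinessProbe:             # must report 1/1 before the next replica starts
            exec:
              command: ["rabbitmq-diagnostics", "ping"]
            initialDelaySeconds: 10
            periodSeconds: 30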

Thanks a lot.

Regards,
Léo

Alex K

Nov 9, 2020, 5:14:53 AM
to rabbitm...@googlegroups.com
I guess that since you are sharing /var/lib/rabbitmq across all nodes, and since each node, when clustered, uses this directory in its own way, you are facing race/locking issues. I am using RabbitMQ on top of Gluster with /var/lib/rabbitmq shared between all nodes, but with only one RabbitMQ instance running. This is a simpler clustering approach, at least for me, as I am not yet familiar with the native RabbitMQ clustering features.



Leo Retrain

Nov 9, 2020, 9:16:49 AM
to rabbitmq-users
Thanks for your answer,
this is what I was afraid of.
I'm going to wait a little to see if somebody else has already tried RabbitMQ with multiple replicas on GlusterFS.
You never know, there may be a way.

Leo Retrain

Nov 18, 2020, 4:12:55 AM
to rabbitmq-users

Hello,

We are still investigating the issue on our side, and we found that it is actually not a race/locking issue.

It can't be, because although we use GlusterFS (so one might think the data is shared between the 3 pods), in our case we declare a separate PVC for each pod.

So each pod has its own data.
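
For illustration, assuming the volumeClaimTemplate is named data and the StatefulSet is named ornery-clownfish-crmq (as the node names in the logs suggest), Kubernetes creates one independent claim per pod, something like:

data-ornery-clownfish-crmq-0
data-ornery-clownfish-crmq-1
data-ornery-clownfish-crmq-2

Each of these is bound to its own GlusterFS volume, so the pods do not share a filesystem path.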

The issue still comes from the file schema.DAT, which is corrupted.

This file relates to "Schema Definition Export and Import"; for more information, check: https://www.rabbitmq.com/definitions.html#overview

The thing is that this file stores information such as "metadata or topology. Users, vhosts, queues, exchanges, bindings, runtime parameters ..."

And the interesting part is: "Definitions are stored in an internal database and replicated across all cluster nodes. Every node in a cluster has its own replica of all definitions. When a part of definitions changes, the update is performed on all nodes in a single transaction. This means that in practice, definitions can be exported from any cluster node with the same result."

So in our case, because our file is corrupted, when crmq-1 tries to access this file to update its own copy, it fails and then restarts.

The unknown part is now why this file gets corrupted when we use GlusterFS.
