Copy Mnesia Data

Kushagra Bindal

Dec 23, 2019, 7:53:39 AM
to rabbitm...@googlegroups.com
Hi,

We are running a proof of concept on how RabbitMQ 3.8.2 behaves with a blue/green deployment strategy. As part of this, we are trying to migrate the Mnesia data from the blue to the green environment in our Google Cloud deployment.

I was able to detach and attach the data volume successfully, but the green deployment is not populated with the data that is already present in the blue deployment.

Please help me identify the root cause of this.

If there is an established procedure for this kind of migration, please let me know.

--
Regards,
Kushagra Bindal

Kushagra Bindal

Dec 23, 2019, 9:45:00 AM
to rabbitm...@googlegroups.com
In addition to that, I want to highlight a few more points.
1. Our preferred approach is to migrate the data by moving the data disk, as I mentioned earlier.
2. We are using a Google Cloud deployment in our project. So far we have done the following:
a. We created the Mnesia backup through the Google API.
b. On the green deployment, we detached the existing disk and imported the snapshot disk of the blue environment.
c. After these changes, the RabbitMQ UI of the green environment became unresponsive and returned "{"error":"Object Not Found","reason":"Not Found"}".
d. We are working in a Docker environment, so we had to restart Docker to get the RabbitMQ UI working again.
e. We observed that after doing so the existing mqowner/mqowner user itself was gone.
f. After creating the mqowner user again, we found that none of the metadata (users, vhosts, queues, etc.) from blue had been imported, and we were left with an empty RabbitMQ.

We found https://www.rabbitmq.com/backup.html#definitions-export and tried to follow the steps described there.
While executing the rename_cluster_node command we got the error below.

Error: this command requires the target node to be stopped.
Arguments given:
        rename_cluster_node rabbit@<<BLUE_NODE>> rabbit@<<GREEN_NODE>>
 
Usage
 
rabbitmqctl [--node <node>] [--longnames] [--quiet] rename_cluster_node <oldnode1> <newnode1> [oldnode2] [newnode2] ...

We executed the rabbitmqctl stop_app command before running the above command.

Please let us know if we are missing anything.

  
--
Regards,
Kushagra Bindal

Johan Rhodin

Dec 23, 2019, 10:42:00 AM
to rabbitm...@googlegroups.com
There is usually no need to copy "mnesia data" between the clusters in a blue/green upgrade scenario. You can export the definitions from the old cluster and import them into the new cluster without manually copying Mnesia files. See the official guide [1] and an example of queue federation [2].

/Johan

[1] https://www.rabbitmq.com/blue-green-upgrade.html
[2] https://www.cloudamqp.com/blog/2015-07-08-migrate-between-plans-rabbitmq-queue-federation.html
> --
> You received this message because you are subscribed to the Google Groups "rabbitmq-users" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to rabbitmq-user...@googlegroups.com.
> To view this discussion on the web, visit https://groups.google.com/d/msgid/rabbitmq-users/CALGrN%2B9uNh-Ff71zxp8jY3o5dDANNhOmi-c440Go7jxe%2BqA7eQ%40mail.gmail.com.

Kushagra Bindal

Dec 24, 2019, 12:03:19 AM
to rabbitm...@googlegroups.com
Hi Johan,

Thanks for your response. We are in fact already using the same approach to shovel our data.

Because our metadata is large, we would like to simply migrate the Mnesia data from the blue to the green environment.

We do this with a few third-party tools, and it works properly with them. Does RabbitMQ offer any supported way to copy Mnesia data directly from the blue to the green environment?

--
Regards,
Kushagra Bindal

Kushagra Bindal

Dec 24, 2019, 7:30:02 AM
to rabbitm...@googlegroups.com
Hi,

We are trying to move the Mnesia persistent database from one node to another and perform a blue/green switch. To do this, we copied everything under “/var/lib/rabbitmq/mnesia/” on node A to the same path on node B.

We then used “rename_cluster_node” to bring over all the queues, exchanges, etc. We followed the steps below.

Single-node standalone RabbitMQ:

  • rabbitmqctl shutdown

    Output:
    Shutting down RabbitMQ node rabbit@devyashish-mayank running at PID 16735
    Waiting for PID 16735 to terminate
    RabbitMQ node rabbit@devyashish-mayank running at PID 16735 successfully shut down

  • rabbitmqctl rename_cluster_node rabbit@<<OLD_NODE>>  rabbit@<<NEW_NODE>>

    Output:
    Renaming cluster nodes:
    rabbit@<<OLD_NODE>> -> rabbit@<<NEW_NODE>>

    Impersonating node: rabbit@<<OLD_NODE>>... done
    Mnesia directory  : /var/lib/rabbitmq/mnesia/rabbit@<<OLD_NODE>>
    Impersonating node: rabbit@<<NEW_NODE>>... done
    Mnesia directory  : /var/lib/rabbitmq/mnesia/rabbit@<<OLD_NODE>>

  • rabbitmqctl status

    Output:
    Error: unable to perform an operation on node 'rabbit@ <<OLD_NODE>> '. Please see diagnostics information and suggestions below.

    Most common reasons for this are:

    * Target node is unreachable (e.g. due to hostname resolution, TCP connection or firewall issues)
    * CLI tool fails to authenticate with the server (e.g. due to CLI tool's Erlang cookie not matching that of the server)
    * Target node is not running

    In addition to the diagnostics info below:

    * See the CLI, clustering and networking guides on https://rabbitmq.com/documentation.html to learn more
    * Consult server logs on node rabbit@ <<OLD_NODE>>
    * If target node is configured to use long node names, don't forget to use --longnames with CLI tools

    DIAGNOSTICS
    ===========

    attempted to contact: ['rabbit@ <<OLD_NODE>> ']

    rabbit@ <<OLD_NODE>> :
      * connected to epmd (port 4369) on <<OLD_NODE>>
      * epmd reports: node 'rabbit' not running at all
                      no other nodes on <<OLD_NODE>>
      * suggestion: start the node

    Current node details:
    * node name: 'rabbitmqcli-8318-rabbit@<<OLD_NODE>> '
    * effective user's home directory: /var/lib/rabbitmq
    * Erlang cookie hash: xkhwV9fuCJ5pw7E3BSr4dw==

  • service rabbitmq-server restart

Even after running all these commands, none of the queues, exchanges or other metadata appeared on the new node.

When we ran the “rename_cluster_node” command again, we got the error below:

  • rabbitmqctl rename_cluster_node rabbit@ <<OLD_NODE>>   rabbit@ <<NEW_NODE>>

    Renaming cluster nodes:
    rabbit@<<OLD_NODE>>   -> rabbit@ <<NEW_NODE>>

    Error:
    {:rename_failed, {:rename_in_progress, 'Restart node under old name to roll back'}}

Single-node Docker standalone RabbitMQ:

Apart from the above, we also have a problem in our Docker environment: the container stops when we execute “rabbitmqctl shutdown” and then gets restarted automatically before we can execute the “rename_cluster_node” command. The resulting error is shown below.

  • docker exec -it c0a57e3cd739 rabbitmqctl shutdown

    warning: the VM is running with native name encoding of latin1 which may cause Elixir to malfunction as it expects utf8. Please ensure your locale is set to UTF-8 (which can be verified by running "locale" in your shell)
    Shutting down RabbitMQ node rabbit@ <<NEW_NODE>> running at PID 197
    Waiting for PID 197 to terminate

  • docker exec -it c0a57e3cd739 rabbitmqctl rename_cluster_node rabbit<<OLD_NODE>>   rabbit@<<NEW_NODE>>

    warning: the VM is running with native name encoding of latin1 which may cause Elixir to malfunction as it expects utf8. Please ensure your locale is set to UTF-8 (which can be verified by running "locale" in your shell)
    Renaming cluster nodes:
    rabbit@ <<OLD_NODE>> -> rabbit@ <<NEW_NODE>>

      * Impersonating node: rabbit@ <<OLD_NODE>> ...Error:
    {:rename_failed, {:badmatch, {:error, {{:shutdown, {:failed_to_start_child, :net_kernel, {:EXIT, :nodistribution}}}, {:child, :undefined, :net_sup_dynamic, {:erl_distribution, :start_link, [[:"rabbit@ <<OLD_NODE>> ", :shortnames], false]}, :permanent, 1000, :supervisor, [:erl_distribution]}}}}}

Re-running the “rename_cluster_node” command after this gives the error below:

  • docker exec -it c0a57e3cd739 rabbitmqctl rename_cluster_node rabbit@ <<OLD_NODE>>   rabbit@<<NEW_NODE>>

    warning: the VM is running with native name encoding of latin1 which may cause Elixir to malfunction as it expects utf8. Please ensure your locale is set to UTF-8 (which can be verified by running "locale" in your shell)
    Renaming cluster nodes:
    rabbit@ <<OLD_NODE>> -> rabbit@ <<NEW_NODE>>

    Error:
    {:rename_failed, {:rename_in_progress, 'Restart node under old name to roll back'}}

We followed the guide at https://www.rabbitmq.com/backup.html

Since the documentation on moving the Mnesia directory from one node to another is not very clear, we cannot tell what went wrong during the blue/green switch.

Please help us with this issue and, if available, point us to detailed documentation for this use case.

--
Regards,
Kushagra Bindal

Johan Rhodin

Dec 24, 2019, 4:31:29 PM
to rabbitm...@googlegroups.com
Is your metadata really too big to migrate by using import/export? In my experience this is a sign that something is wrong in the architecture (millions of bindings, exchanges, queues, etc?)

From the list of things you have done, it is hard to tell where the issue is. Is it Docker, the Google backup, or something else? I would do it manually first to make sure I have all the data: stop RabbitMQ, calculate checksum(s) for your data directory, tar -zcvf it, untar on the new machine and calculate the checksum(s) there. Make sure environment files are copied too, where applicable. The backup and restore guide is helpful [2].
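
The manual copy-and-verify procedure described above could be sketched roughly as follows. This is only an illustration under assumptions: the function names (dir_checksum, pack_dir, unpack_dir) are made up for this example, and it assumes a Linux host with GNU sha256sum and tar available. The node should be stopped before the data directory is read.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical and not part of RabbitMQ.

# Print a single checksum covering every regular file under a directory.
# Files are hashed in sorted order so the result is comparable across machines.
dir_checksum() {
    # $1 = directory to fingerprint
    (cd "$1" && find . -type f | LC_ALL=C sort | while IFS= read -r f; do
        sha256sum "$f"
    done | sha256sum | cut -d ' ' -f 1)
}

# Archive the contents of a data directory into a gzip-compressed tarball.
pack_dir() {
    # $1 = directory to archive, $2 = tarball to create (outside of $1)
    tar -zcf "$2" -C "$1" .
}

# Unpack a tarball created by pack_dir into a target directory.
unpack_dir() {
    # $1 = tarball, $2 = directory to restore into
    mkdir -p "$2" && tar -zxf "$1" -C "$2"
}

# Possible usage (paths are examples only):
#   on the old machine:  dir_checksum /var/lib/rabbitmq/mnesia
#                        pack_dir /var/lib/rabbitmq/mnesia /tmp/mnesia.tgz
#   on the new machine:  unpack_dir /tmp/mnesia.tgz /var/lib/rabbitmq/mnesia
#                        dir_checksum /var/lib/rabbitmq/mnesia
```

If the two checksums differ, the copy is incomplete or was modified in transit, and there is little point in debugging RabbitMQ itself until they match.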

/Johan

[1] https://www.rabbitmq.com/relocate.html
[2] https://www.rabbitmq.com/backup.html

Kushagra Bindal

Dec 24, 2019, 9:55:17 PM
to rabbitm...@googlegroups.com
Hi Johan

My largest data set will have approximately 1,600 users/vhosts and approximately 16,000 queues, which I do not think is very large. We are using an n1-standard-8 machine, and during the blue/green switch the exported file is approximately 8 MB and the green deployment fails while importing it. In my last discussion in another thread, I was advised to either break the JSON into smaller chunks or choose another approach.

So we have opted to copy the complete Mnesia directory to the green deployment. We are using a Google Cloud machine, and inside it we deploy Erlang and RabbitMQ in Docker.

Yes, I know that the information provided in the email below is very detailed. It lists what we tried in order to reach this goal, so far without success.

Please let me know if you need any further information from my side so that I can address your queries, and hopefully we will be able to achieve what we are looking for.

Merry Christmas.

Regards
Kushagra 


Luke Bakken

Dec 26, 2019, 12:22:20 PM
to rabbitmq-users
Hello,

After running rename_cluster_node you must also rename the appropriate mnesia directories so they use the new node name. Did you do that?

Luke Bakken

Dec 26, 2019, 12:23:21 PM
to rabbitmq-users
That step is documented here:


Rename the local database directories (note, you do not need to do this if you have set the RABBITMQ_MNESIA_DIR environment variable):

mv \
  /var/lib/rabbitmq/mnesia/rabbit\@misshelpful \
  /var/lib/rabbitmq/mnesia/rabbit\@cordelia
mv \
  /var/lib/rabbitmq/mnesia/rabbit\@misshelpful-rename \
  /var/lib/rabbitmq/mnesia/rabbit\@cordelia-rename
mv \
  /var/lib/rabbitmq/mnesia/rabbit\@misshelpful-plugins-expand \
  /var/lib/rabbitmq/mnesia/rabbit\@cordelia-plugins-expand
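
The same renaming step can be wrapped in a small helper so that it works for any pair of node names. This is a sketch, not something from the RabbitMQ documentation: the function name rename_mnesia_dirs is hypothetical, and it assumes the three directory suffixes shown in the mv commands above are the only ones that need renaming.

```shell
#!/bin/sh
# Sketch only: rename_mnesia_dirs is a hypothetical helper, not a RabbitMQ tool.

# Rename the node's database directories so they use the new node name.
# $1 = mnesia base directory, $2 = old node name, $3 = new node name
rename_mnesia_dirs() {
    base=$1
    old=$2
    new=$3
    for suffix in "" "-rename" "-plugins-expand"; do
        src="$base/$old$suffix"
        dst="$base/$new$suffix"
        # Skip directories that do not exist on this node.
        [ -d "$src" ] || continue
        # Refuse to overwrite an existing target directory.
        if [ -e "$dst" ]; then
            echo "refusing to overwrite $dst" >&2
            return 1
        fi
        mv "$src" "$dst" || return 1
    done
}

# Possible usage, with the node stopped:
#   rename_mnesia_dirs /var/lib/rabbitmq/mnesia rabbit@misshelpful rabbit@cordelia
```

The overwrite check matters when the target node has already been started once under its new name, because an empty database directory for that name will then exist already and has to be moved aside by hand first.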
        

Johan Rhodin

Dec 26, 2019, 1:02:49 PM
to rabbitm...@googlegroups.com
OK, the previous conversation was [1], and I think the recommendation there still holds: split up the definitions.
Another workaround I can think of is to keep a script that generates all the virtual hosts, queues, bindings, etc. via the RabbitMQ HTTP API [2] and run that script on the new node.
Lastly, I don't know whether importing the definitions via load_definitions at RabbitMQ startup [3] works differently from the regular HTTP API, but I've had cases where it works when the HTTP API doesn't.
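
A script of the kind mentioned in the second workaround might start out like the sketch below. It uses the management HTTP API's PUT /api/vhosts/{name} and PUT /api/queues/{vhost}/{name} endpoints through curl. Everything else is an assumption made for illustration: the endpoint URL, the guest:guest credentials, the helper names, and the DRY_RUN switch that prints requests instead of sending them.

```shell
#!/bin/sh
# Sketch only: helper names, URL and credentials are assumptions.

# Assumed defaults; replace with the real management endpoint and credentials.
RABBIT_API=${RABBIT_API:-http://localhost:15672/api}
RABBIT_AUTH=${RABBIT_AUTH:-guest:guest}

# Send a PUT with a JSON body; with DRY_RUN=1 only print what would be sent.
api_put() {
    # $1 = path below /api, $2 = JSON body
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf 'PUT %s/%s %s\n' "$RABBIT_API" "$1" "$2"
    else
        curl -sS -f -u "$RABBIT_AUTH" -H 'content-type: application/json' \
            -X PUT -d "$2" "$RABBIT_API/$1"
    fi
}

# Vhost names must already be percent-encoded (the default vhost "/" is %2F).
create_vhost() {
    api_put "vhosts/$1" '{}'
}

create_durable_queue() {
    # $1 = percent-encoded vhost, $2 = queue name
    api_put "queues/$1/$2" '{"durable":true}'
}

# Possible usage (names are examples only):
#   create_vhost tenant1
#   create_durable_queue tenant1 orders
```

Running it once with DRY_RUN=1 gives a reviewable list of requests before anything is created on the new node.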

/Johan

[1] https://groups.google.com/d/msg/rabbitmq-users/HDS9RaGiogw/Cby-V6ZqBAAJ
[2] https://rawcdn.githack.com/rabbitmq/rabbitmq-management/v3.8.1/priv/www/api/index.html
[3] https://www.rabbitmq.com/management.html#load-definitions
