2. With the same node arrangement listed above... I set up A (created vhost,
users, set permissions, etc.) and then clustered B against A,B (rabbitmqctl
cluster nodeA nodeB). I took down A and my messages did not persist even
though B should be a disk node. Is there something wrong with my steps?
3. What is the conventional/non-conventional way to determine whether
individual nodes in a cluster are disk or ram nodes?
Thanks.
--
View this message in context: http://old.nabble.com/Disk-node-vs-ram-node-tp29356379p29356379.html
The disk nodes are the ones explicitly listed in the "rabbitmqctl
cluster" command.
So,
% rabbitmqctl cluster rabbit@A rabbit@B
results in a cluster where A and B are disk nodes. If that command is
run on either A or B, the cluster has just the two nodes.
If, on the other hand, that command is run on C, you get a cluster with
A and B as disc nodes (same as above) and C as a ram node.
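The rule described above can be written down as a small model. This is only a sketch of the behaviour as stated in this message, not of how rabbitmqctl is implemented; the function name `classify_nodes` is made up for illustration.

```python
def classify_nodes(listed_nodes, run_on):
    """Model of the rule above for "rabbitmqctl cluster <listed_nodes>".

    Every node named in the command is a disc node. If the node the
    command is run on is not in that list, it joins as a ram node.
    """
    roles = {node: "disc" for node in listed_nodes}
    if run_on not in roles:
        roles[run_on] = "ram"
    return roles


# Run on A (or B): a two-node cluster, both disc.
print(classify_nodes(["rabbit@A", "rabbit@B"], run_on="rabbit@A"))
# Run on C: A and B are disc, C is ram.
print(classify_nodes(["rabbit@A", "rabbit@B"], run_on="rabbit@C"))
```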
> 2. With the same node arrangement listed above.... I set up A(created vhost,
> users, set permissions etc) and then clustered B against A,B (rabbitmqctl
> cluster nodeA nodeB). I took down A and my messages did not persist even
> though B should be a disk node. Is there something wrong with my steps?
Queues exist physically only on the node on which they were declared.
Clustering just makes them (and exchanges) accessible transparently on
all of the nodes.
Thus, if the queue was declared on A, and A was taken down, the messages
stored on the queue would become inaccessible.
If you want to never lose messages, have a look at our high-availability
clustering guide in the documentation section of the website.
> 3. What is the conventional/non-conventional way to determine whether or not
> individual nodes in a cluster are a disk or ram nodes?
To find out whether nodes are disc or ram, run
% rabbitmqctl status
on any node in the cluster.
Unfortunately, due to a bug, that command will not report the correct
node type, so it's not of much help to you now. This problem has been
fixed in our default branch and will be in the next release.
Cheers,
Alex
I've just tried that on default.
All the nodes start up as disc nodes. At that point, you turn *Node2*
into a ram node. In the third group of commands, you also turn *Node3*
into a ram node.
So, the sequence is:
1. Start Node1 as a disc node
2. Cluster Node2 with Node1; Node2 becomes a ram node
3. Cluster Node3 with Node1; Node3 becomes a ram node
At the end, Node1 is disc; Node2 and Node3 are ram
> 2. To be clear, if I send a message that creates the queue Q1 in node A in a cluster with node A, node B, and node C and node A goes down, then I will lose those messages on node A. Node A is down, but B and C should still be alive. If I send the same message to be queued on Q1, will Q1 be created on node B and node C? For me so far, that is not the case. When I send a message to be queued on Q1 (where the messages on Q1 on node A are now lost because node A is down) I do not see the queue getting created on either node B or node C. Is there something wrong with my steps?
I'm not quite sure what you mean by ``message that creates the queue''.
If you declare a queue Q1 on node A with queue.declare, then Q1 will
reside on node A. If node A goes down, all the messages on its queues
(including Q1) will become inaccessible. If Q1 was declared as durable,
then the messages published as persistent on it will become available
when node A comes back online.
In a cluster, when a node goes down, its queues become inaccessible.
They are *not* automatically created on other nodes.
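To make the behaviour concrete, here is a toy model in Python. It does not talk to a broker; `ClusterModel` and its methods are invented for this illustration and only encode what is described above: a queue lives on the node where it was declared, it is unreachable while that node is down, and a durable queue comes back with its persistent messages when the node restarts.

```python
class ClusterModel:
    """Toy model: each queue lives on exactly one node (its home node)."""

    def __init__(self, nodes):
        self.running = set(nodes)
        self.queues = {}  # queue name -> {"home", "durable", "messages"}

    def declare(self, node, name, durable):
        self.queues[name] = {"home": node, "durable": durable, "messages": []}

    def publish(self, name, body, persistent):
        self.queues[name]["messages"].append((body, persistent))

    def stop(self, node):
        self.running.discard(node)

    def start(self, node):
        # On restart only durable queues return, holding only the
        # messages that were published as persistent.
        for name, queue in list(self.queues.items()):
            if queue["home"] != node:
                continue
            if queue["durable"]:
                queue["messages"] = [m for m in queue["messages"] if m[1]]
            else:
                del self.queues[name]
        self.running.add(node)

    def read(self, name):
        queue = self.queues.get(name)
        if queue is None or queue["home"] not in self.running:
            raise LookupError("queue %r is not reachable" % name)
        return [body for body, _ in queue["messages"]]


cluster = ClusterModel(["A", "B", "C"])
cluster.declare("A", "Q1", durable=True)
cluster.publish("Q1", "m1", persistent=True)
cluster.publish("Q1", "m2", persistent=False)

cluster.stop("A")
try:
    cluster.read("Q1")          # B and C are up, but Q1 lives on A
except LookupError as err:
    print(err)

cluster.start("A")
print(cluster.read("Q1"))       # only the persistent message is left
```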
Incidentally, which client are you using?
> 3. I am using RabbitMQ 1.7.2. Just to be clear there is _no way_ on 1.7.2 to determine whether or not a node is a disk node or ram node?
There's no way to do it with the provided tools.
If you're not squeamish about running a bit of Erlang, you could do
this:
% erl -sname mgmtsh -remsh rabbit@localhost
This gives you a remote Erlang shell named mgmtsh, connected to
rabbit@localhost.
Then, running this should give you the disc nodes:
(rabbit@localhost)1> mnesia:table_info(rabbit_durable_exchange, disc_copies).
And running this should give you the ram nodes:
(rabbit@localhost)2> mnesia:table_info(rabbit_durable_exchange, ram_copies).
There's no automatic way to create missing queues inside a cluster in
RabbitMQ, as far as I know.
Whether a node is disc or ram refers to where the broker metadata
(queues, exchanges, bindings) is kept. Persistent messages are always
stored on disk. When a ram node goes down, it loses all its metadata.
If it's in a cluster, it will just connect to some other node and
recover the information. If the entire cluster goes down (say, power
outage), the metadata can only be recovered if at least one node was
disc. In general, there's not much sense in having a lot of disc nodes.
One or two should be enough. Ram nodes are faster.
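As a rough sketch of the two recovery cases just described (the function names are made up, and this models only the statements in this message, not Mnesia itself):

```python
def ram_node_restart(running_peers):
    """A ram node that restarts copies its metadata from a peer that is up.

    Returns the peer it would copy from, or None if no peer is running.
    """
    return sorted(running_peers)[0] if running_peers else None


def survives_full_outage(roles):
    """After every node has stopped, metadata comes back only from a disc node."""
    return any(role == "disc" for role in roles.values())


print(ram_node_restart({"rabbit@A", "rabbit@C"}))
print(survives_full_outage({"rabbit@A": "disc", "rabbit@B": "ram"}))
print(survives_full_outage({"rabbit@A": "ram", "rabbit@B": "ram"}))
```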
> #3. Not sure if I should do anything before hand, but I ran the command you provided and got this output:
> $ erl -sname mgmtsh -remsh rabbit@localhost
> Erlang R13B03 (erts-5.7.4) [source] [smp:4:4] [rq:4] [async-threads:0] [hipe] [kernel-poll:false]
>
> *** ERROR: Shell process terminated! (^G to start new job) ***
> {error_logger,{{2010,8,5},{14,52,59}},"~s~n",["Error in process <0.33.0> on node 'mgmtsh@ebony' with exit value: {badarg,[{erlang,list_to_existing_atom,[\"rabbit@ubuntu\"]},{dist_util,recv_challenge,1},{dist_util,handshake_we_started,1}]}\n"]}
>
> =ERROR REPORT==== 5-Aug-2010::14:52:59 ===
> Error in process <0.33.0> on node 'mgmtsh@ebony' with exit value: {badarg,[{erlang,list_to_existing_atom,["rabbit@ubuntu"]},{dist_util,recv_challenge,1},{dist_util,handshake_we_started,1}]}
>
> FYI: I'm running Ubuntu and installed rabbitMQ 1.7.2 with sudo apt-get install rabbitmq-server.
I assumed the rabbit@localhost was your node. Try rabbit@ebony.
On Aug 5, 2010, at 5:38 PM, Alexandru Scvortov <alex...@rabbitmq.com> wrote:
> If you declare a queue Q1 on node A with queue.declare, then Q1 will
> reside on node A. If node A goes down, all the messages on its queues
> (including Q1) will become inaccessible. If Q1 was declared as durable,
> then the messages published as persistent on it will become available
> when node A comes back online.
>
> In a cluster, when a node goes down, its queues become inaccessible.
> They are *not* automatically created on other nodes.
I'm using the Java client (1.8.1), and I automatically reconnect my producers/consumers to another node when a node goes down. I do this by registering a shutdown listener on the connection.
When a node with a queue goes down, can I write something using the Java client that re-declares all known queues on another node? Maybe have all the queue names (and params) registered somewhere in the app and, upon reconnect, passively re-declare them. If the passive declaration throws an exception, catch it and do an actual declaration. Would that work?
If that works, what happens when we bring back up the node that originally had the queue? Would it play nice?
I'd like to find a solution without resorting to the Pacemaker approach.
This is important not just for node failures but also for other things, such as upgrading or reconfiguring Rabbit without affecting any clients (our rabbits sit behind a round-robin load balancer).
> When a node with a queue goes down, can I write something using the Java client that re-declares all known queues on another node? Maybe have all the queue names (and params) registered somewhere in the app and, upon reconnect, passively re-declare them. If the passive declaration throws an exception, catch it and do an actual declaration. Would that work?
>
> If that works, then what happens if we bring up the node that went down that had the queue originally? Would it play nice?
If you:
1) declare a durable queue on one of the clustered nodes,
2) take down that node,
3) declare the same queue on another node
you get a "404: Not Found".
If the queue is not durable, then sure, you can declare it again on another node.
As an aside, if you passive declare something, then immediately declare
it, the first step is redundant. There's no problem with declaring
the same thing repeatedly. (assuming it's declared in exactly the
same way and it's not unreachable)
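A minimal sketch of the "re-declare everything on reconnect" idea, written against a fake broker so it runs on its own. Everything here is hypothetical: `NotFound` stands in for the "404: Not Found" error, and `make_fake_declare` just mimics the behaviour described above (a durable queue whose home node is down cannot be re-declared). With a real client a channel error normally closes the channel, so you would need a fresh channel after each failure.

```python
class NotFound(Exception):
    """Stands in for the broker's "404: Not Found" channel error."""


def redeclare_all(declare, specs):
    """Declare every registered queue; return the names the broker refused.

    A plain declare is used directly, with no passive probe first, since
    declaring the same thing repeatedly is harmless.
    """
    refused = []
    for name, durable in specs:
        try:
            declare(name, durable)
        except NotFound:
            refused.append(name)
    return refused


def make_fake_declare(down_durable_queues, declared):
    """Fake broker: durable queues whose home node is down answer 404."""
    def declare(name, durable):
        if durable and name in down_durable_queues:
            raise NotFound(name)
        declared.add(name)
    return declare


declared = set()
declare = make_fake_declare({"orders"}, declared)
refused = redeclare_all(declare, [("orders", True), ("scratch", False)])
print("refused:", refused)
print("declared:", sorted(declared))
```

In this model the non-durable queue is re-created on the surviving node, while the durable one stays unavailable until its original node returns.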
Cheers,
Alex
Ouch... so there's no way to ensure that something is there to catch messages when the
node with the original durable queue goes down?
There are several mentions throughout the site of out-of-the-box live
failover in a future release. How far out is that? 6 months? 1 year?
I'm not a big fan of the Pacemaker approach due to the number of moving parts and
different packages (pacemaker/corosync/heartbeat/drbd) to maintain and manage.
I also haven't seen any indication of people actually using that setup in a
complex VM environment in production.
----- Original Message ----
From: Alexandru Scvorţov <alex...@rabbitmq.com>
To: Dave Greggory <davegr...@yahoo.com>
Cc: "rabbitmq...@lists.rabbitmq.com" <rabbitmq...@lists.rabbitmq.com>
Sent: Thu, August 12, 2010 9:33:54 AM
Subject: Re: [rabbitmq-discuss] Disk node vs ram node
One of the things on our todo list, as you can imagine, is to
understand how to best provide HA in a VM based setting. Some VM
technologies provide useful failover / snapshot capability...
alexis