On 10.06.2015 16:22, Jean-Sébastien Pédron wrote:
> On 09.06.2015 16:17, Bogdan Dobrelya wrote:
> I know neither Pacemaker nor the OCF spec in detail, so I may
> understand things incorrectly. For instance, when a node is demoted from
> master to slave, RabbitMQ (the Erlang application, not the entire node)
> is stopped but not restarted. Am I correct? Is this expected?
Yes. As you can see from the flow charts, the OCF agent starts the rabbit
app only after the Master Pacemaker resource has been promoted
successfully, or after a Slave resource has started and joined the rabbit
cluster through the existing Master. This is handled by the post-promote
and post-start notifications that Pacemaker sends cluster-wide.
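To make the flow above a bit more concrete, here is a toy model in
Python (not the agent's actual shell code). The Node class, the
on_notify function and the field names are all made up for
illustration; only the two event names come from Pacemaker's
notifications.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical, simplified view of one rabbit node."""
    name: str
    role: str = "Slave"
    app_running: bool = False
    joined_to: Optional[str] = None


def on_notify(node: Node, event: str, master: Optional[str]) -> None:
    """Toy handler: start the rabbit app only once a Master is known.

    Assumption: 'master' is the name of the node currently holding (or
    just promoted to) the Master role, or None if there is none yet.
    """
    if event not in ("post-promote", "post-start"):
        return  # other notifications are ignored in this sketch
    if master is None:
        return  # nothing to join yet, keep the app stopped
    if node.name == master:
        # The promoted node starts its own app.
        node.role = "Master"
        node.app_running = True
    else:
        # A Slave joins the existing Master, then starts the app.
        node.joined_to = master
        node.app_running = True
```

In this model a Slave that gets a post-start event while no Master
exists simply stays stopped until a later post-promote arrives.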
> When a node is stopped, it is removed from the cluster. I don't see
> anywhere in the code that you wait for the HA synchronization to finish
> before removing a node which is the master for some queues.
This OCF agent considers any non-running state of the rabbit Pacemaker
resource a failure, so synchronization is not expected (the most
pessimistic case is assumed). If a Slave went down, it will be
re-joined later, once and if it becomes available again. If the Master
went down, a new one will be elected from the remaining nodes still
running as Slaves, as part of the fail-over procedure. The node whose
rabbit app has the longest uptime wins the Master role. The rest re-join
it on the post-promote or post-start events they receive. Whether Mnesia
is reset depends on the situation, for example when a node thinks it is
clustered with another node but that node disagrees.
But you're right. Perhaps the HA synchronization should be expected if
the resource stop is graceful, for example at the operator's request.
Do you think this case should be addressed as a bug?
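For illustration, the "longest uptime wins" rule from the fail-over
description above could look like the Python sketch below. The
function name, the input format and the tie-breaking by node name are
my assumptions here, not something taken from the agent.

```python
from typing import Dict, Optional


def elect_master(uptimes: Dict[str, float]) -> Optional[str]:
    """Pick the new Master among nodes still running as Slaves.

    uptimes maps a node name to the uptime (in seconds) of its rabbit
    app. Assumption: ties are broken by the lexicographically smallest
    node name, only to keep the result deterministic.
    """
    if not uptimes:
        return None  # no surviving Slaves, nothing to promote
    # max() keeps the first maximal element, so iterating over sorted
    # names makes the smallest name win a tie.
    return max(sorted(uptimes), key=lambda name: uptimes[name])
```

With three surviving Slaves the node with the largest uptime is
chosen, and the other two would then re-join it.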
> By the way, when you call a node a master, does this mean you want all
> queue's masters to run on that particular node?
No, queue masters may belong to any rabbit node. The Master and Slave
we're referring to here are only the Pacemaker multistate clone roles.
The "Master" is also the rabbit node that the other members ("Slaves")
normally specify as the target of the join_cluster command.
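As a rough sketch of what "target for the join_cluster command" means
in practice, the Python helper below builds the usual rabbitmqctl
sequence for a Slave joining the Master. The helper name and the node
name in the usage line are hypothetical, and the exact sequence the
agent runs may differ; stop_app, join_cluster and start_app are the
standard rabbitmqctl subcommands.

```python
from typing import List


def join_commands(master_node: str) -> List[List[str]]:
    """Return the rabbitmqctl calls a Slave would run to join a Master.

    Assumption: the rabbit app must be stopped before join_cluster and
    is started again afterwards.
    """
    return [
        ["rabbitmqctl", "stop_app"],
        ["rabbitmqctl", "join_cluster", master_node],
        ["rabbitmqctl", "start_app"],
    ]


# Hypothetical node name, for illustration only.
for cmd in join_commands("rabbit@node-1"):
    print(" ".join(cmd))
```

Each inner list could be passed to subprocess.run() as is. Queue
masters are not touched by any of this; only cluster membership is.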
> I agree, it would be easier to comment on the code. For now, I attached
> a commented version to this mail. My comments start with "# XXX" and
> address only implementation details, not the workflow itself.
> The current RA is barely a copy of a simple init script with no policy
> enforcement. Your implementation is designed for a cluster with all
> resources replicated on all nodes and dynamically removes/adds nodes to
> the cluster. As you enforce a particular choice, I'm not sure this new
> resource agent can be backward-compatible with the current one.
> Of course, this doesn't mean we won't include it. I personally have no
> idea how the current RA is used in production by users: the current one
> brings nearly no value compared to a simple init script.
> I would like to hear from people using the OCF RA: what do you think of
> this new one? Should we include both? Should the new one replace the
> current one? In general, what do you expect from the official RA?