Hi community,
We use RabbitMQ Server (release 3.6.0-1) with Erlang 1.8 on an Oracle Linux 7.1 server.
We are trying to configure a Pacemaker cluster for RabbitMQ.
Description: the cluster consists of 2 machines, b1pc01 and b1pc36.
I would like to use a virtual IP address (10.126.70.239).
Here are the pcs commands I used:
pcs property set stonith-enabled=false
pcs property set maintenance-mode=false
pcs property set no-quorum-policy=ignore
pcs resource defaults resource-stickiness=100
pcs -f cib.xml resource create vipRabbit ocf:heartbeat:IPaddr2 ip="10.126.70.239" cidr_netmask="22" op start interval="0s" timeout="60s" op monitor interval="5s" timeout="20s" op stop interval="0s" timeout="60s"
pcs -f cib.xml resource create serviceRabbit ocf:rabbitmq:rabbitmq-server ip="10.126.70.239" nodename="rabbit@localhost"
pcs cluster cib-push cib.xml
Here is the output of pcs status:
pcs status
Cluster name: pcmk-cluster
Last updated: Wed Apr 6 06:52:56 2016 Last change: Wed Apr 6 06:49:48 2016 by root via cibadmin on b1pc01
Stack: corosync
Current DC: b1pc01 (version 1.1.13-10.el7-44eb2dd) - partition with quorum
2 nodes and 2 resources configured
Online: [ b1pc01 b1pc36 ]
Full list of resources:
vipRabbit (ocf::heartbeat:IPaddr2): Started b1pc01
serviceRabbit (ocf::rabbitmq:rabbitmq-server): FAILED (unmanaged)[ b1pc36 b1pc01 ]
Failed Actions:
* serviceRabbit_stop_0 on b1pc36 'unknown error' (1): call=10, status=complete, exitreason='none',
last-rc-change='Wed Apr 6 06:49:50 2016', queued=0ms, exec=767ms
* serviceRabbit_stop_0 on b1pc01 'unknown error' (1): call=11, status=complete, exitreason='none',
last-rc-change='Wed Apr 6 06:49:50 2016', queued=0ms, exec=767ms
PCSD Status:
b1pc01: Offline
b1pc36: Offline
Daemon Status:
corosync: active/disabled
pacemaker: active/disabled
pcsd: unknown/disabled
I would like to understand why the serviceRabbit resource is unmanaged. Have I done something wrong, or is my configuration incomplete?
I have modified the agent file "/usr/lib/ocf/resource.d/rabbitmq/rabbitmq-server" so that it also handles exit code 69 from rabbitmqctl, and with this change it seems to work:
rabbitmqctl_action() {
    local rc
    local action
    action=$@
    $RABBITMQ_CTL $NODENAME_ARG $action > /dev/null 2> /dev/null
    rc=$?
    case "$rc" in
        0)
            ocf_log debug "RabbitMQ server is running normally"
            return $OCF_SUCCESS
            ;;
        2)
            ocf_log debug "RabbitMQ server is not running"
            return $OCF_NOT_RUNNING
            ;;
        69)
            # Added by me: treat exit code 69 the same way as code 2
            ocf_log debug "RabbitMQ server is not running (code error 69)"
            return $OCF_NOT_RUNNING
            ;;
        *)
            # Any other code ends the agent with a generic error
            ocf_log err "Unexpected return from rabbitmqctl $NODENAME_ARG $action: $rc"
            exit $OCF_ERR_GENERIC
            ;;
    esac
}
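To make the effect of the change easier to check outside Pacemaker, here is a minimal standalone sketch of the same mapping. It is not the shipped agent: map_ctl_rc is a hypothetical helper name, the OCF_* values are hard-coded to the standard OCF return codes (0, 1, 7) instead of being sourced from the OCF shell functions, and no rabbitmqctl call is made. It only shows that, without the 69 branch, the code would fall into the default case and yield 1, which matches the 'unknown error' (1) reported for serviceRabbit_stop_0.

```shell
#!/bin/sh
# Standard OCF return codes, hard-coded here so the sketch runs on its own.
OCF_SUCCESS=0
OCF_ERR_GENERIC=1
OCF_NOT_RUNNING=7

# map_ctl_rc (hypothetical helper): translate a rabbitmqctl exit code
# into an OCF return code, mirroring the case statement in the agent.
map_ctl_rc() {
    case "$1" in
        0)    return $OCF_SUCCESS ;;
        2|69) return $OCF_NOT_RUNNING ;;   # 69 is the branch added in my change
        *)    return $OCF_ERR_GENERIC ;;
    esac
}

# Feed a few exit codes through the mapping and print the result.
for rc in 0 2 69 1; do
    ocf_rc=0
    map_ctl_rc "$rc" || ocf_rc=$?
    echo "rabbitmqctl rc=$rc -> OCF rc=$ocf_rc"
done
```

Note that this sketch returns the generic error, whereas the agent function calls exit in the default case, which ends the whole agent run.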
My first impression is that there may be a bug in version 3.6 of RabbitMQ.
Thanks for your help.
Regards