Service Rabbitmq unmanaged in pacemaker cluster with rabbitmq-server version 3.6.* in Oracle Linux


Jean-Pierre PEYEN

Apr 6, 2016, 7:08:00 AM
to rabbitmq-users
Hi community,

We use RabbitMQ Server (release 3.6.0-1) with Erlang 1.8 on Oracle Linux Server 7.1.

We are trying to configure a Pacemaker cluster for RabbitMQ.

Description: the cluster consists of 2 machines, b1pc01 and b1pc36.

I would like to use a virtual IP address (10.126.70.239).

Here are the pcs commands I used:


pcs property set stonith-enabled=false
pcs property set maintenance-mode=false
pcs property set no-quorum-policy=ignore
pcs resource defaults resource-stickiness=100

pcs -f cib.xml resource create vipRabbit ocf:heartbeat:Ipaddr2 ip="10.126.70.239" cidr_netmask="22" op start interval="0s" timeout="60s" op monitor interval="5s" timeout="20s" op stop interval="0s" timeout="60s"

pcs -f cib.xml resource create serviceRabbit ocf:rabbitmq:rabbitmq-server ip="10.126.70.239" nodename="rabbit@localhost"

pcs cluster cib-push cib.xml



Here is the output of pcs status:

pcs status
Cluster name: pcmk-cluster
Last updated: Wed Apr  6 06:52:56 2016          Last change: Wed Apr  6 06:49:48 2016 by root via cibadmin on b1pc01
Stack: corosync
Current DC: b1pc01 (version 1.1.13-10.el7-44eb2dd) - partition with quorum
2 nodes and 2 resources configured


Online: [ b1pc01 b1pc36 ]


Full list of resources:


 vipRabbit      (ocf::heartbeat:IPaddr2):       Started b1pc01
 serviceRabbit  (ocf::rabbitmq:rabbitmq-server):        FAILED (unmanaged)[ b1pc36 b1pc01 ]


Failed Actions:
* serviceRabbit_stop_0 on b1pc36 'unknown error' (1): call=10, status=complete, exitreason='none',
    last-rc-change='Wed Apr  6 06:49:50 2016', queued=0ms, exec=767ms
* serviceRabbit_stop_0 on b1pc01 'unknown error' (1): call=11, status=complete, exitreason='none',
    last-rc-change='Wed Apr  6 06:49:50 2016', queued=0ms, exec=767ms




PCSD Status:
  b1pc01: Offline
  b1pc36: Offline

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: unknown/disabled



I would like to understand why the serviceRabbit resource is unmanaged. Have I done something wrong, or is my configuration incomplete?


I modified the agent file "/usr/lib/ocf/resource.d/rabbitmq/rabbitmq-server" so that it also handles exit code 69 from rabbitmqctl, and with that change it seems to work:


rabbitmqctl_action() {
    local rc
    local action
    action=$@
    $RABBITMQ_CTL $NODENAME_ARG $action > /dev/null 2> /dev/null
    rc=$?
    case "$rc" in
        0)
            ocf_log debug "RabbitMQ server is running normally"
            return $OCF_SUCCESS
            ;;
        2)
            ocf_log debug "RabbitMQ server is not running"
            return $OCF_NOT_RUNNING
            ;;
        69)
            ocf_log debug "RabbitMQ server is not running (code error 69)"
            return $OCF_NOT_RUNNING
            ;;
        *)
            ocf_log err "Unexpected return from rabbitmqctl $NODENAME_ARG $action: $rc"
            exit $OCF_ERR_GENERIC
    esac
}
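
To make the effect of the change easier to see outside the cluster, here is a minimal standalone sketch of the same exit-code mapping. The helper name map_rabbitmqctl_rc is hypothetical and is not part of the resource agent; the numeric values are the standard OCF return codes (0 success, 1 generic error, 7 not running). That 69 means "node not running" is my assumption, based on its matching EX_UNAVAILABLE in sysexits.h.

```shell
#!/bin/sh
# Standard OCF return codes, defined here so the sketch runs on its own.
OCF_SUCCESS=0
OCF_ERR_GENERIC=1
OCF_NOT_RUNNING=7

# Hypothetical helper (not in the real agent): translate a rabbitmqctl
# exit code into the OCF code the patched agent would report.
map_rabbitmqctl_rc() {
    case "$1" in
        0)    echo "$OCF_SUCCESS" ;;      # server is running
        2|69) echo "$OCF_NOT_RUNNING" ;;  # 69 treated as "not running" (assumption)
        *)    echo "$OCF_ERR_GENERIC" ;;  # anything else is an error
    esac
}

# Show the mapping for the codes discussed above plus one unexpected code.
for rc in 0 2 69 1; do
    echo "rabbitmqctl rc=$rc -> OCF rc=$(map_rabbitmqctl_rc "$rc")"
done
```

Without the 69 branch, that code falls through to the generic error case, which would be consistent with the failed stop actions shown in the pcs status output.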



My first impression is that there may be a bug in version 3.6 of RabbitMQ.


Thanks for your help.

Regards






