Strange behavior on a 2 node configuration


Pierre Mavro

Mar 20, 2014, 5:52:37 PM
to prm-d...@googlegroups.com
Hello,

I'm seeing strange behavior with MariaDB 5.5 replication managed by PRM. I'm wondering whether my corosync configuration is correct and whether someone could give me some pointers to resolve the issue.

Here is the scenario. I have this configuration:
node master1 \
    attributes p_mysql_mysql_master_IP="192.168.33.31"
node master2 \
    attributes p_mysql_mysql_master_IP="192.168.33.32"
primitive p_mysql ocf:percona:mysql \
    params config="/etc/mysql/my.cnf" pid="/var/run/mysqld/mysqld.pid" socket="/var/run/mysqld/mysqld.sock" replication_user="replication" replication_passwd="password" max_slave_lag="60" evict_outdated_slaves="false" binary="/usr/sbin/mysqld" test_user="test_user" test_passwd="password" \
    op monitor interval="5s" role="Master" OCF_CHECK_LEVEL="1" \
    op monitor interval="2s" role="Slave" OCF_CHECK_LEVEL="1" \
    op start interval="0" timeout="60s" \
    op stop interval="0" timeout="60s"
primitive reader_vip_1 ocf:heartbeat:IPaddr2 \
    params ip="192.168.33.101" nic="eth2" \
    op monitor interval="10s" \
    meta target-role="Started"
primitive reader_vip_2 ocf:heartbeat:IPaddr2 \
    params ip="192.168.33.102" nic="eth2" \
    op monitor interval="10s"
primitive writer_vip ocf:heartbeat:IPaddr2 \
    params ip="192.168.33.100" nic="eth2" \
    op monitor interval="10s"
ms ms_MySQL p_mysql \
    meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" globally-unique="false" target-role="Started" is-managed="true"
location loc-No-reader-vip-2 reader_vip_2 \
    rule $id="rule-no-reader-vip-2" -inf: readable gt 0
location loc-no-reader-vip-1 reader_vip_1 \
    rule $id="rule-no-reader-vip-1" -inf: readable gt 0
colocation writer_vip_on_master inf: writer_vip ms_MySQL:Master
order ms_MySQL_promote_before_vip inf: ms_MySQL:promote writer_vip:start
property $id="cib-bootstrap-options" \
    dc-version="1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff" \
    cluster-infrastructure="openais" \
    expected-quorum-votes="2" \
    no-quorum-policy="ignore" \
    stonith-enabled="false" \
    last-lrm-refresh="1395333299"
property $id="mysql_replication" \
    p_mysql_REPL_INFO="192.168.33.31|mariadb-bin.000190|245" \
    p_mysql_REPL_STATUS="mariadb-bin.000190|245|104857600"

When I start corosync, for a few seconds I can see that the reader VIPs are up, but not the writer VIP:
reader_vip_1    (ocf::heartbeat:IPaddr2):       master1
reader_vip_2    (ocf::heartbeat:IPaddr2):       master2
writer_vip      (ocf::heartbeat:IPaddr2):       Stopped


Then the reader VIPs become completely unusable and only the writer VIP is running:
reader_vip_1    (ocf::heartbeat:IPaddr2):       Stopped
reader_vip_2    (ocf::heartbeat:IPaddr2):       Stopped
writer_vip      (ocf::heartbeat:IPaddr2):       Started master1


If I look at master2, it is correctly acting as a slave server, so I don't see why the cluster won't put a reader IP on it.

In the logs, it reports a problem in check_slave. However, I can confirm that replication is working correctly! Both MySQL instances are started and replication seems to be working, yet I do not get any VIP for reads. Here is a sample of what I get in the logs:
Mar 20 21:40:06 master1 mysql[10951]: ERROR: check_slave invoked on an instance that is not a replication slave.
Mar 20 21:40:06 master1 mysql[10951]: WARNING: Attempted to unset the replication master on an instance that is not configured as a replication slave
...
Mar 20 21:40:06 master1 crm_resource: [10916]: info: Invoked: /usr/sbin/crm_resource --list 
Mar 20 21:40:06 master1 mysql[10895]: ERROR: check_slave invoked on an instance that is not a replication slave.
Mar 20 21:40:06 master1 lrmd: [10450]: info: operation notify[10] on p_mysql:0 for client 10453: pid 10895 exited with return code 0
Mar 20 21:40:06 master1 crmd: [10453]: info: process_lrm_event: LRM operation p_mysql:0_notify_0 (call=10, rc=0, cib-update=0, confirmed=true) ok
Mar 20 21:40:06 master1 lrmd: [10450]: info: rsc:p_mysql:0 promote[11] (pid 10951)
Mar 20 21:40:06 master1 crm_resource: [10972]: info: Invoked: /usr/sbin/crm_resource --list 
Mar 20 21:40:06 master1 mysql[10951]: ERROR: check_slave invoked on an instance that is not a replication slave.
Mar 20 21:40:06 master1 mysql[10951]: WARNING: Attempted to unset the replication master on an instance that is not configured as a replication slave
Mar 20 21:40:06 master1 attrd: [10451]: notice: attrd_trigger_update: Sending flush op to all hosts for: master-p_mysql:0

Does anybody have an idea?

Thanks in advance

Pierre

Yves Trudeau

Mar 22, 2014, 1:13:35 PM
to prm-d...@googlegroups.com

Hi,
   The location rule for the reader VIP is wrong. The attribute readable is set to 1 when everything is OK, so the rule can't be "-inf: readable gt 0"; try "lt 1" instead.
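
   For illustration, here is a sketch of the two location constraints from the configuration above with only the comparison changed. It assumes nothing else in the rules needs to differ; the resource and rule names are copied from the original post.

```
# Sketch only: same constraints as in the original configuration,
# with "gt 0" replaced by "lt 1" as suggested above.
location loc-No-reader-vip-2 reader_vip_2 \
    rule $id="rule-no-reader-vip-2" -inf: readable lt 1
location loc-no-reader-vip-1 reader_vip_1 \
    rule $id="rule-no-reader-vip-1" -inf: readable lt 1
```

   With readable set to 1 on a healthy node, "readable lt 1" is false there, so the -inf score only bans a reader VIP from a node whose readable attribute has dropped below 1. The original "gt 0" did the opposite and banned the VIPs from every healthy node. If your Pacemaker version supports it, crm_mon -A -1 should list the node attributes so you can check the value of readable.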

Yves


Yves Trudeau

Mar 25, 2014, 2:29:54 PM
to prm-d...@googlegroups.com
Hi,
   I realized the documentation was wrong; it will be corrected soon.

Regards,

Yves


Pierre Mavro

Mar 25, 2014, 5:13:52 PM
to prm-d...@googlegroups.com
Thanks, I'll test ASAP and get back to you. But it seems you're right: the documentation is wrong.

Pierre Mavro

Mar 27, 2014, 1:33:10 AM
to prm-d...@googlegroups.com
Hi,

It works like a charm now.

Thanks a lot