Standby question

Erik De Neve

Sep 24, 2013, 6:08:12 AM
to prm-d...@googlegroups.com
Hi Yves, Hi List,

I'm rather new to using your RA, but I have a question about it.

We're using the RA for a master-slave MySQL setup for our Zabbix monitoring tool.
I've also configured zabbix as an LSB resource in corosync.

My question is:

If I put the master in 'standby', the database is immediately put into read-only mode and demoted (I think because of the notify).
The problem is that Zabbix then tries to flush all the data in its buffers, but MySQL is already gone and the database becomes corrupt.

I thought the normal behavior when putting a node in standby would be to follow the order defined in corosync:
first shut down zabbix (so it can flush all its buffers), and only after that stop MySQL.

Is my understanding right or not?
So if we want to do maintenance on the master, we can't just put it into standby mode (because of this problem).


Thanks!
Erik


Here is our corosync config:

node zabbix-node1 \
        attributes standby="off"
node zabbix-testnode3 \
        attributes standby="off"
primitive p_mysql ocf:percona:mysql \
        params config="/etc/mysql/my.cnf" log="/var/log/mysql/error.log" pid="/var/lib/mysql/mysqld.pid" socket="/var/run/mysqld/mysqld.sock" replication_user="repl_user" replication_passwd="xxx" max_slave_lag="60" evict_outdated_slaves="false" binary="/usr/sbin/mysqld" test_user="test_user" test_passwd="xxx" \
        op monitor interval="5s" role="Master" OCF_CHECK_LEVEL="1" \
        op monitor interval="2s" role="Slave" OCF_CHECK_LEVEL="1" \
        op start interval="0" timeout="300s" \
        op stop interval="0" timeout="300s"
primitive vip ocf:heartbeat:IPaddr2 \
        params ip="172.24.195.51" nic="bond0" \
        op monitor interval="15" \
        meta target-role="Started"
primitive zabbix lsb:zabbix-server \
        op start interval="0" timeout="60" delay="5s" \
        op monitor interval="30s"
ms ms_MySQL p_mysql \
        meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" globally-unique="false" target-role="Started" is-managed="true"
colocation vip_and_zabbix_on_master inf: vip zabbix ms_MySQL:Master
order ms_MySQL_promote_before_zabbix inf: ms_MySQL:promote zabbix
order zabbix_before_vip inf: zabbix vip:start
property $id="cib-bootstrap-options" \
        dc-version="1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="2" \
        stonith-enabled="false" \
        no-quorum-policy="ignore"
property $id="mysql_replication" \
        p_mysql_REPL_INFO="zabbix-node1|mysql-bin-node1.000153|107"
rsc_defaults $id="rsc-options" \
        resource-stickiness="100"

Yves Trudeau

Sep 25, 2013, 3:03:43 PM
to prm-d...@googlegroups.com
Hi Erik,
    I am currently in a team meeting; I'll take a look as soon as I have a minute.

Regards,

Yves


2013/9/24 Erik De Neve <e.de...@gmail.com>

--
You received this message because you are subscribed to the Google Groups "PRM-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to prm-discuss...@googlegroups.com.
To post to this group, send email to prm-d...@googlegroups.com.
Visit this group at http://groups.google.com/group/prm-discuss.
For more options, visit https://groups.google.com/groups/opt_out.

Erik De Neve

Oct 2, 2013, 4:08:49 AM
to prm-d...@googlegroups.com
Hi Yves,

Just a quick update:
the same issue happens if I migrate the resource to another node.

Regards,
Erik


2013/9/25 Yves Trudeau <trud...@gmail.com>

Yves Trudeau

Oct 2, 2013, 1:21:49 PM
to prm-d...@googlegroups.com

Hi Erik,
  Great, I am back home and trying to replicate your setup.  It could be just a configuration issue in Pacemaker.  I'll update shortly.

Regards,

Yves

Erik De Neve

Oct 2, 2013, 1:31:12 PM
to prm-d...@googlegroups.com
Thanks in advance Yves!

Erik

Yves Trudeau

Oct 3, 2013, 2:16:33 PM
to prm-d...@googlegroups.com
Hi Erik,
   Looking at the config: why are you forcing zabbix to run on the master node? That seems counter-intuitive to me.

Regards,

Yves


2013/10/2 Erik De Neve <e.de...@gmail.com>

Erik De Neve

Oct 3, 2013, 3:29:42 PM
to prm-d...@googlegroups.com
Yves,

We've indeed chosen to run all resources on one node.
There are a few reasons for this (long story ... ;-) ), e.g. one server is more powerful (the master); the slave is a little less powerful and less responsive when it is the master (although it works).
Also, if we ran them on different nodes, we would need 2 VIPs: one for the db (which the zabbix server points to) and another for zabbix-server (which all monitored devices point to). We think that makes it more complex, and since we have one powerful server that can handle both the db AND the zabbix load, we don't see an advantage in it ...
I could give some more reasons, but it is indeed our intention to run all resources on the same node.

Is this the problem?

Thanks,
Erik


On Thursday, October 3, 2013 at 8:16:33 PM UTC+2, yves wrote:

Yves Trudeau

Oct 3, 2013, 5:42:27 PM
to prm-d...@googlegroups.com
Hi Erik,
   I'll look at the notifications the zabbix script receives; I'll create a dummy script to emulate it.  I was pulled into something else today, so I'll set this up tomorrow morning and test.  You may need to beef up your zabbix script so it understands notifications.

Regards,

Yves


2013/10/3 Erik De Neve <e.de...@gmail.com>

Erik De Neve

Oct 4, 2013, 2:28:44 AM
to prm-d...@googlegroups.com
Hi Yves,

Thanks!

Attached is the zabbix script we use.
I don't think the problem is related to zabbix itself.
It happens because zabbix is trying to flush its buffers while the database is already read-only.
So (in my opinion) setting the database read-only should wait until zabbix has done a 'clean shutdown' (written out its buffers).
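(For what it's worth, a quick way to see this race is to check the read-only flag on the node being demoted while zabbix is still flushing; `read_only` is the standard MySQL variable, though whether PRM has already flipped it at that moment is exactly the question here:)

```sql
-- On the node being demoted: if this already shows ON while zabbix is
-- still flushing its buffers, those writes will be rejected.
SHOW GLOBAL VARIABLES LIKE 'read_only';
```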


Erik


2013/10/3 Yves Trudeau <trud...@gmail.com>
[Attachment: zabbix-server]

Yves Trudeau

Oct 4, 2013, 3:24:53 PM
to prm-d...@googlegroups.com
Hi Erik,
   thanks.  I understand the issue with zabbix flushing while stopping.  Setting read_only in the pre-demote notification is mandatory in the PRM logic.  Your case is quite exceptional, though, since only zabbix talks to MySQL and they are co-hosted.  The odds are very small (if any) that zabbix will write to a slave and break replication, since it will be stopped before MySQL.  What if you give the SUPER privilege to the zabbix MySQL user?  That would be the simplest workaround.
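(As a sketch of that workaround: SUPER lets a user keep writing even when the server is read_only. The account name 'zabbix'@'localhost' is an assumption; use whatever user zabbix actually connects as:)

```sql
-- Hypothetical account name -- adjust to the real zabbix MySQL user.
-- SUPER bypasses the read_only flag set by the pre-demote notification.
GRANT SUPER ON *.* TO 'zabbix'@'localhost';
FLUSH PRIVILEGES;
```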

Regards,

Yves


2013/10/4 Erik De Neve <e.de...@gmail.com>

Erik De Neve

Oct 7, 2013, 3:16:57 AM
to prm-d...@googlegroups.com
Hi Yves,

We will do that as a workaround.

Thanks for your time!
Erik

On Friday, October 4, 2013 at 9:24:53 PM UTC+2, yves wrote:

Erik De Neve

Oct 8, 2013, 3:53:50 AM
to prm-d...@googlegroups.com
Yves,

I've tested with the SUPER privilege, but it didn't succeed.
What happens is the following:

I set the master node in standby. Zabbix flushes its data (which can take up to 1 minute or so, because it holds a huge amount in RAM), but before it is finished, MySQL is already gone ...
The DB and zabbix then start on the other node, but I get 'duplicate entry ... for primary key' errors.

So I need to investigate this further.
Your tips are welcome!

Thanks already for your time!

Erik

On Monday, October 7, 2013 at 9:16:57 AM UTC+2, Erik De Neve wrote:

Yves Trudeau

Oct 8, 2013, 10:13:06 AM
to prm-d...@googlegroups.com
Hi Erik,
  That's strange, maybe it is related to this:

 if start-stop-daemon --stop --quiet --pidfile $PID --retry=TERM/10/KILL/5; then

from the stop action of the zabbix LSB script.  The stop sends a TERM, which starts the flushing, waits 10s, and then sends a KILL (-9).  The KILL certainly ends the flushing.  Could that be the issue you are facing?  10s is very short if the flush needs 1min+.  That shouldn't happen only with PRM, though.  Other than that, I suggest you add this setting to the zabbix-server primitive:

 op stop interval="0" timeout="300s"

so that pacemaker gives zabbix enough time to stop.  300s is maybe overkill, but let's see if it fixes things first.

If that doesn't fix it, we'll need to look at MySQL and see what exactly is colliding and why.
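(A side note on the --retry syntax: it is a slash-separated schedule of signal/timeout pairs, applied left to right. A minimal sketch of reading it, plain shell, no cluster or zabbix needed; the `decode` helper is just for illustration:)

```shell
#!/bin/sh
# Decode the first signal/wait pair of a start-stop-daemon --retry
# schedule.  TERM/10/KILL/5 means: send SIGTERM, wait up to 10s for the
# process to exit, then send SIGKILL and wait up to 5s more.
decode() {
    schedule=$1
    sig=$(echo "$schedule" | cut -d/ -f1)
    wait=$(echo "$schedule" | cut -d/ -f2)
    echo "$schedule: SIG$sig, then up to ${wait}s before escalating"
}

decode "TERM/10/KILL/5"    # stock script: only 10s of graceful shutdown
decode "TERM/600/KILL/5"   # relaxed: up to 600s for zabbix to flush
```

The point is that the stock 10s window is what cuts the flush short; any schedule whose TERM wait exceeds the worst-case flush time (plus a matching Pacemaker op stop timeout) avoids the KILL.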

Regards,

Yves

2013/10/8 Erik De Neve <e.de...@gmail.com>

Erik De Neve

Oct 9, 2013, 9:47:30 AM
to prm-d...@googlegroups.com
Yves,

Indeed, that --retry looks a bit strange.
I've modified the lsb script to:

  stop)
        log_daemon_msg "Stopping $DESC" "$NAME"
        if start-stop-daemon --stop --quiet --pidfile $PID --retry=TERM/600/KILL/5; then
                log_end_msg 0
        else
                start-stop-daemon --stop --oknodo --exec $DAEMON --name $NAME --retry=TERM/600/KILL/5
                log_end_msg $?
        fi
        ;;


I've also modified my corosync configuration (added the stop timeout as you suggested, and also a timeout on promote, because it was too short: 20s => 600s).


node zabbix-node1 \
        attributes standby="off"
node zabbix-testnode3 \
        attributes standby="on"

primitive p_mysql ocf:percona:mysql \
        params config="/etc/mysql/my.cnf" log="/var/log/mysql/error.log" pid="/var/lib/mysql/mysqld.pid" socket="/var/run/mysqld/mysqld.sock" replication_user="repl_user" replication_passwd="slavepw" max_slave_lag="60" evict_outdated_slaves="false" binary="/usr/sbin/mysqld" test_user="test_user" test_passwd="testpw" \
        op monitor interval="5s" role="Master" OCF_CHECK_LEVEL="1" \
        op monitor interval="2s" role="Slave" OCF_CHECK_LEVEL="1" \
        op start interval="0" timeout="600s" \
        op stop interval="0" timeout="600s" \
        op promote interval="0" timeout="600s"

primitive vip ocf:heartbeat:IPaddr2 \
        params ip="172.24.195.51" nic="bond0" \
        op monitor interval="15" \
        meta target-role="Started"
primitive zabbix lsb:zabbix-server \
        op start interval="0" timeout="60" delay="5s" \
        op monitor interval="30s" \
        op stop interval="0" timeout="600s"

ms ms_MySQL p_mysql \
        meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" globally-unique="false" target-role="Started" is-managed="true"
colocation vip_and_zabbix_on_master inf: vip zabbix ms_MySQL:Master
order ms_MySQL_promote_before_zabbix inf: ms_MySQL:promote zabbix
order zabbix_before_vip inf: zabbix vip:start
property $id="cib-bootstrap-options" \
        dc-version="1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="2" \
        stonith-enabled="false" \
        no-quorum-policy="ignore" \
        last-lrm-refresh="1381309001"
property $id="mysql_replication" \
        p_mysql_REPL_INFO="zabbix-testnode3|mysql-bin.000002|267851"

rsc_defaults $id="rsc-options" \
        resource-stickiness="100"


But now I have a 'duplicate entry' problem in the database.
It seems like the 'zabbix sync' goes fine, but after the sync MySQL flushes all its pages (see the output below), and at that moment the DB on the other node is already up (and writing to the new master database).
I think this is what creates the 'duplicate entries' I see in the log, and the DB gets corrupt.
Shouldn't the promote wait until the demote is finished? Or how does it work internally?
PS: MySQL page flushing can take up to 10 minutes on the slowest server.


It's during this process (on the slave that is going into standby) that the other DB is already up and accepting queries ... (though, in my opinion, it has not yet processed all queries from the 'old' master).

131009 15:37:29 [Note] /usr/sbin/mysqld: Normal shutdown
131009 15:37:29 [Note] Event Scheduler: Purging the queue. 0 events
131009 15:37:29  InnoDB: Starting shutdown...
131009 15:37:31  InnoDB: Waiting for 211 pages to be flushed
131009 15:38:29  InnoDB: Waiting for master thread to be suspended
131009 15:38:33  InnoDB: Waiting for 208 pages to be flushed
131009 15:39:29  InnoDB: Waiting for master thread to be suspended
131009 15:39:34  InnoDB: Waiting for 209 pages to be flushed
131009 15:40:29  InnoDB: Waiting for master thread to be suspended
131009 15:40:35  InnoDB: Waiting for 195 pages to be flushed
131009 15:41:00  InnoDB: Shutdown completed; log sequence number 218746000465


Thanks!
Erik

On Tuesday, October 8, 2013 at 4:13:06 PM UTC+2, yves wrote:

Erik De Neve

Oct 10, 2013, 8:56:20 AM
to prm-d...@googlegroups.com
After digging into it, we think the following is happening...

The zabbix sync works fine, but the problem is that after the notify is sent to MySQL (because we executed the standby), it no longer sends the binlog to the other node.
So with the SUPER privilege on, the local database will have 'more' data than the database on the other node (the new master).

And after migrating it a few times, it becomes corrupt, with duplicate entries ...

We are thinking about how we can solve it ...

Thanks again for your time, Yves!
Erik


On Wednesday, October 9, 2013 at 3:47:30 PM UTC+2, Erik De Neve wrote:

Yves Trudeau

Oct 10, 2013, 10:47:47 AM
to prm-d...@googlegroups.com
Hi Erik,
   I'll process it on my side too.  If your goal is only to switch roles between the servers, you can also do:

crm resource demote ms_MySQL; sleep 2; crm resource promote ms_MySQL

That will not shut down MySQL and should allow replication to complete.

Regards,

Yves


2013/10/10 Erik De Neve <e.de...@gmail.com>