MySQL 5.6 / GTID status?


Thorn Roby

Jan 13, 2014, 2:20:09 PM
to prm-d...@googlegroups.com
I understood from the December webinar on the Geo enhancements that MySQL with GTID support would be available later that month. I've downloaded the GitHub repository and I'm not really sure what to do with it. Do I just need the mysql-prm agent piece? At this point I don't need the booth components.
My main uncertainty is that GTID requires log-slave-updates, while the installation document prohibits it.
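
For context: MySQL 5.6 will not start with gtid_mode=ON unless binary logging (log-bin) and log-slave-updates are both enabled, along with enforce-gtid-consistency. A quick way to see which of the first two a node's my.cnf is missing is sketched below. The function name is made up for illustration, and it only checks that the option appears somewhere in the file, not its value or which section it is in.

```shell
# Hypothetical helper: print each GTID prerequisite option that is
# absent from the given my.cnf-style file.
# Accepts both spellings (log-bin / log_bin), as mysqld does.
missing_gtid_prereqs() {
    for opt in log-bin log-slave-updates; do
        # Turn "log-bin" into the pattern "log[-_]bin"
        pat=$(printf '%s' "$opt" | sed 's/-/[-_]/g')
        # Match the option followed by "=" or by end of line,
        # so that "log-bin-index" is not mistaken for "log-bin"
        grep -Eq "^[[:space:]]*${pat}([[:space:]]*=|[[:space:]]*\$)" "$1" || echo "$opt"
    done
}

# Example: missing_gtid_prereqs /etc/my.cnf
```

If the function prints nothing, both options are present in the file.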

Yves Trudeau

Jan 13, 2014, 6:29:44 PM
to prm-d...@googlegroups.com

Hi Thorn,
   I am working on it. Most of Fred's work is integrated, and I am now merging the new master score code.  Ping me back in a week; it should be ready.

Regards,

Yves


Yves Trudeau

Jan 16, 2014, 12:49:25 PM
to prm-d...@googlegroups.com
Hi Thorn,
   Have a look on GitHub; I pushed it this morning.  I'll blog and announce it in the next few days.

Regards,

Yves



Thorn Roby

Jan 16, 2014, 3:24:41 PM
to prm-d...@googlegroups.com
Great, thanks, I'll give it a try. Am I right in understanding that if I'm not doing the Geo stuff the only piece I need to download is the mysql-prm agent?



Yves Trudeau

Jan 16, 2014, 3:43:16 PM
to prm-d...@googlegroups.com
Geo support is built in.  For GTID, download mysql_prm56.

Regards,

Yves



Thorn Roby

Jan 22, 2014, 7:13:36 PM
to prm-d...@googlegroups.com
I've set up a 3-node cluster using the mysql_prm56 agent, with one set of addresses for the VIPs to be accessed by DB clients (on the first NIC, named em1 on these servers, with addresses on "MM.NN.50.X" in the following status output) and another set on the second NIC (em2, on "MM.NN.32.X"). MySQL (Percona Server 5.6.15 using GTID) is running and replication is working, except that the cluster seems to intermittently kill the mysql process on the 2 slaves. After trying a number of things, I eventually added an explicit "nic" attribute for the em2 interfaces, but that made no difference. The VIPs (these are distinct from the primary physical addresses of the em1 interfaces) appear to be assigned correctly, but the mysql processes on the em2 interfaces are never seen. Here is the output of "crm configure show", "crm status" and "show slave status":

crm configure show:

        attributes p_mysql_mysql_master_IP="MM.NN.32.180" nic="em2"
        attributes p_mysql_mysql_master_IP="MM.NN.32.181" nic="em2"
        attributes p_mysql_mysql_master_IP="MM.NN.32.182" nic="em2"
primitive p_mysql ocf:rootpass:mysql \
        params config="/etc/my.cnf" pid="/var/lib/mysql/mysqld.pid" socket="/var/lib/mysql/mysql.sock" replication_user="repl" replication_passwd="reppass" max_slave_lag="60" evict_outdated_slaves="false" binary="/root/PS5615/bin/mysqld" test_user="root" test_passwd="rootpass" \
        op monitor interval="5s" role="Master" OCF_CHECK_LEVEL="1" \
        op monitor interval="2s" role="Slave" OCF_CHECK_LEVEL="1" \
        op start interval="0" timeout="60s" \
        op stop interval="0" timeout="60s"
primitive reader_vip_1 ocf:heartbeat:IPaddr2 \
        params ip="MM.NN.2.10" nic="em1" \
        op monitor interval="10s"
primitive reader_vip_2 ocf:heartbeat:IPaddr2 \
        params ip="MM.NN.2.12" nic="em1" \
        op monitor interval="10s"
primitive reader_vip_3 ocf:heartbeat:IPaddr2 \
        params ip="MM.NN.2.14" nic="em1" \
        op monitor interval="10s"
primitive writer_vip ocf:heartbeat:IPaddr2 \
        params ip="MM.NN.2.9" nic="em1" \
        op monitor interval="10s"
ms ms_MySQL p_mysql \
        meta master-max="1" master-node-max="1" clone-max="3" clone-node-max="1" notify="true" globally-unique="false" target-role="Master" is-managed="true"
location loc-No-reader-vip-2 reader_vip_2 \
        rule $id="rule-no-reader-vip-2" -inf: readable gt 0
location loc-No-reader-vip-3 reader_vip_3 \
        rule $id="rule-no-reader-vip-3" -inf: readable gt 0
location loc-no-reader-vip-1 reader_vip_1 \
        rule $id="rule-no-reader-vip-1" -inf: readable gt 0
colocation writer_vip_on_master inf: writer_vip ms_MySQL:Master
order ms_MySQL_promote_before_vip inf: ms_MySQL:promote writer_vip:start
property $id="cib-bootstrap-options" \
        dc-version="1.1.8-7.el6-394e906" \
        cluster-infrastructure="classic openais (with plugin)" \
        expected-quorum-votes="3" \
        no-quorum-policy="ignore" \
        stonith-enabled="false" \
        last-lrm-refresh="1338928815"
property $id="mysql_replication" \
        p_mysql_REPL_STATUS="9f4f986b-82e5-11e3-9869-d4ae52e8cd5b:9" \
        p_mysql_REPL_INFO="eng-mysqlha-p2.mydomain.net|:"
#vim:set syntax=pcmk

crm status:

Last updated: Wed Jan 22 16:45:56 2014
Last change: Wed Jan 22 16:44:33 2014 via cibadmin on eng-mysqlha-p1.mydomain.net
Stack: classic openais (with plugin)
Current DC: eng-mysqlha-p3.mydomain.net - partition with quorum
Version: 1.1.8-7.el6-394e906
6 Nodes configured, 3 expected votes
7 Resources configured.



 reader_vip_1   (ocf::heartbeat:IPaddr2):       Started eng-mysqlha-p2.mydomain.net
 reader_vip_2   (ocf::heartbeat:IPaddr2):       Started eng-mysqlha-p3.mydomain.net
 reader_vip_3   (ocf::heartbeat:IPaddr2):       Started eng-mysqlha-p1.mydomain.net
 Master/Slave Set: ms_MySQL [p_mysql]
     Masters: [ eng-mysqlha-p1.mydomain.net ]
     Stopped: [ p_mysql:1 p_mysql:2 ]

mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: MM.NN.32.180
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql_bin.000010
          Read_Master_Log_Pos: 1837
               Relay_Log_File: mysqld-relay-bin.000013
                Relay_Log_Pos: 448
        Relay_Master_Log_File: mysql_bin.000010
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 1837
              Relay_Log_Space: 622
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 1
                  Master_UUID: 9f4f986b-82e5-11e3-9869-d4ae52e8cd5b
             Master_Info_File: /data/mysql/data5615/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
           Master_Retry_Count: 86400
                  Master_Bind:
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp:
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set: 9f4f986b-82e5-11e3-9869-d4ae52e8cd5b:1-14
            Executed_Gtid_Set: 05b8fa69-82f4-11e3-98c8-d4ae52e8ccb9:1-7,
9f4f986b-82e5-11e3-9869-d4ae52e8cd5b:1-14
                Auto_Position: 1

Yves Trudeau

Jan 23, 2014, 11:03:03 AM
to prm-d...@googlegroups.com
Hi Thorn,
   Is it normal that you have 6 nodes defined?  I think you shouldn't have the eng-mysqlem2-p*.mydomain.net nodes, only the eng-mysqlha-p*.mydomain.net nodes.  In such a case you should have:


        attributes p_mysql_mysql_master_IP="MM.NN.32.180" nic="em2"
        attributes p_mysql_mysql_master_IP="MM.NN.32.181" nic="em2"
        attributes p_mysql_mysql_master_IP="MM.NN.32.182" nic="em2"
        attributes p_mysql_mysql_master_IP="MM.NN.32.180" nic="em2"
        attributes p_mysql_mysql_master_IP="MM.NN.32.180" nic="em2"
node eng-mysqlha-p3.mydomain.net \
        attributes p_mysql_mysql_master_IP="MM.NN.32.182" nic="em2"







Yves Trudeau

Jan 23, 2014, 11:08:00 AM
to prm-d...@googlegroups.com
Sent too soon...  gmail shortcuts :(

In such a case you should have only:

node eng-mysqlha-p1.mydomain.net \
        attributes p_mysql_mysql_master_IP="MM.NN.32.180" nic="em2"
node eng-mysqlha-p2.mydomain.net \
        attributes p_mysql_mysql_master_IP="MM.NN.32.181" nic="em2"
node eng-mysqlha-p3.mydomain.net \
        attributes p_mysql_mysql_master_IP="MM.NN.32.182" nic="em2"

Also activate the trace log on the slaves; that may tell us why they are being stopped.  To enable the trace:

mkdir -p /tmp/mysql.ocf.ra.debug
touch /tmp/mysql.ocf.ra.debug/log

If there's a stop, there will be a "stop" event in the log file; look at the preceding monitor event for a clue as to why.
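
The two steps above, plus the log check, can be sketched as a pair of shell helpers. This assumes the trace file contains the literal word "stop" for stop events, as described; the function names and the TRACE_DIR override are illustrative only and not part of the agent.

```shell
# Directory the agent is told to trace into; overridable so the
# helpers can be tried outside a cluster (the override is an assumption).
TRACE_DIR="${TRACE_DIR:-/tmp/mysql.ocf.ra.debug}"

# Create the trace directory and the log file that switches tracing on.
enable_trace() {
    mkdir -p "$TRACE_DIR" && touch "$TRACE_DIR/log"
}

# Print every line mentioning "stop" together with the N lines before it
# (default 20), so the preceding monitor output is visible.
stop_context() {
    grep -n -B "${1:-20}" 'stop' "$TRACE_DIR/log"
}

# Example, on each slave:
#   enable_trace
#   ... wait for Pacemaker to stop mysql ...
#   stop_context 40
```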

Regards,

Yves



Thorn Roby

Jan 23, 2014, 1:35:56 PM
to prm-d...@googlegroups.com
Maybe I should clarify what I'm trying to do: keep corosync/pacemaker/replication traffic on a "private" network, MM.NN.32.176/28, on the second NIC, and "public" DB client access to the VIPs (as well as other access like ssh) on MM.NN.2.50/24 on the first NIC:

   ---------+--------------------------+--------------------------+-----------  db client access
            | em1                      | em1                      | em1
   +--------+---------+       +--------+---------+       +--------+---------+
   | MM.NN.2.28 (+VIP)|       | MM.NN.2.29 (+VIP)|       | MM.NN.2.30 (+VIP)|
   | mysqlha-p1       |       | mysqlha-p2       |       | mysqlha-p3       |
   |                  |       |                  |       |                  |
   | MM.NN.32.180     |       | MM.NN.32.181     |       | MM.NN.32.182     |
   | mysqlem2-p1      |       | mysqlem2-p2      |       | mysqlem2-p3      |
   +--------+---------+       +--------+---------+       +--------+---------+
            | em2                      | em2                      | em2
   ---------+--------------------------+--------------------------+-----------  replication, corosync, pacemaker traffic

There is no gateway on the MM.NN.32.176/28 network. It is supposedly configured for multicast; I'm not certain of that, but corosync seems OK.
Here are my assumptions: 
1. Replication: traffic on em2, initially the first system is the master, the other 2 are slaves. MySQL thinks this is working.
2. Corosync/Pacemaker - monitoring on em2 interfaces, able to establish VIP reader/writer addresses on em1 interfaces.
3. Client DB access to VIPs established on em1.
4. Other access (ssh) to physical em1 address.

Corosync.conf:

compatibility: whitetank

totem {
        version: 2
        secauth: on
        threads: 0
        interface {
                ringnumber: 0
                bindnetaddr: MM.NN.32.176
                mcastaddr: 226.94.1.1
                mcastport: 5405
                ttl: 1
        }
}

logging {
        fileline: off
        to_stderr: no
        to_logfile: yes
        to_syslog: yes
        debug: off
        logfile: /var/log/cluster/corosync.log
        debug: on
        timestamp: on
        logger_subsys {
                subsys: AMF
                debug: off
        }
}

amf {
     mode: disabled
}

When I define 3 nodes on em2 like this:
node eng-mysqlem2-p1.mydomain.net  attributes p_mysql_mysql_master_IP="MM.NN.32.180" nic="em2"
node eng-mysqlem2-p2.mydomain.net  attributes p_mysql_mysql_master_IP="MM.NN.32.181" nic="em2"
node eng-mysqlem2-p3.mydomain.net  attributes p_mysql_mysql_master_IP="MM.NN.32.182" nic="em2"

The other mysqlha nodes on em1 get automatically restored to the pacemaker configuration, whether I make the change on node 1 or node 3, which has become the current DC:

Current DC: eng-mysqlha-p3.mydomain.net - partition with quorum
Version: 1.1.8-7.el6-394e906
6 Nodes configured, 3 expected votes
7 Resources configured.


 reader_vip_1   (ocf::heartbeat:IPaddr2):       Started eng-mysqlha-p2.mydomain.net
 reader_vip_2   (ocf::heartbeat:IPaddr2):       Started eng-mysqlha-p3.mydomain.net
 reader_vip_3   (ocf::heartbeat:IPaddr2):       Started eng-mysqlha-p1.mydomain.net
 Master/Slave Set: ms_MySQL [p_mysql]
     Masters: [ eng-mysqlha-p1.mydomain.net ]
     Stopped: [ p_mysql:1 p_mysql:2 ]
[root@eng-mysqlha-p1 ~]# crm configure show
        attributes p_mysql_mysql_master_IP="10.50.32.180" nic="em2"
        attributes p_mysql_mysql_master_IP="10.50.32.181" nic="em2"
        attributes p_mysql_mysql_master_IP="10.50.32.182" nic="em2"

If I define the 3 nodes as the em1 interfaces:

        attributes p_mysql_mysql_master_IP="MM.NN.2.28" nic="em1"
        attributes p_mysql_mysql_master_IP="MM.NN.2.29" nic="em1"
        attributes p_mysql_mysql_master_IP="MM.NN.2.30" nic="em1"

[root@eng-mysqlha-p3 ~]# crm status
Last updated: Thu Jan 23 11:30:37 2014
Last change: Thu Jan 23 11:30:33 2014 via cibadmin on eng-mysqlha-p3.mydomain.net
Stack: classic openais (with plugin)
Current DC: eng-mysqlha-p3.mydomain.net - partition with quorum
Version: 1.1.8-7.el6-394e906
3 Nodes configured, 3 expected votes
 reader_vip_1   (ocf::heartbeat:IPaddr2):       Started eng-mysqlha-p2.mydomain.net
 reader_vip_2   (ocf::heartbeat:IPaddr2):       Started eng-mysqlha-p3.mydomain.net
 reader_vip_3   (ocf::heartbeat:IPaddr2):       Started eng-mysqlha-p1.mydomain.net
 Master/Slave Set: ms_MySQL [p_mysql]
     Masters: [ eng-mysqlha-p1.mydomain.net ]
     Stopped: [ p_mysql:1 p_mysql:2 ]

Which looks pretty good except the slave mysql processes are actually running. Do I need to manually start them in pacemaker? And is it OK to have a configuration that shows no knowledge of the network on which the pacemaker traffic is running?

Yves Trudeau

Jan 23, 2014, 3:30:22 PM
to prm-d...@googlegroups.com
Hi Thorn,

in the node section, you need this:

        attributes p_mysql_mysql_master_IP="10.50.32.180"
        attributes p_mysql_mysql_master_IP="10.50.32.181"
        attributes p_mysql_mysql_master_IP="10.50.32.182"

and that's all. Replication will use the 10.50.32.x IPs.

Check the trace log files on the slaves to see why Pacemaker thinks they are stopped.  Have you started them manually with the init.d script, or have you let Pacemaker start them?  If you start manually, you must make sure the pid and socket files in the CIB match the ones in my.cnf.  Also, starting manually uses mysqld_safe, which is not desirable.  It is best to let Pacemaker start MySQL.
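
One way to check that the pid and socket match is to pull the values out of my.cnf and compare them by eye with the pid and socket params shown by "crm configure show". The helper below is a hypothetical sketch: it ignores [section] headers and !include directives and simply returns the last assignment of the option found in the file.

```shell
# Hypothetical helper: print the last value assigned to KEY in a
# my.cnf-style file. "-" and "_" in option names are treated as equal.
cnf_value() {
    key=$(printf '%s' "$1" | sed 's/-/_/g')
    awk -v key="$key" '
    {
        line = $0
        sub(/^[ \t]+/, "", line)
        c = substr(line, 1, 1)
        if (c == "#" || c == ";" || c == "[") next   # comment or [section]
        eq = index(line, "=")
        if (eq == 0) next                            # no assignment on this line
        name = substr(line, 1, eq - 1)
        gsub(/[ \t]/, "", name)
        gsub(/-/, "_", name)
        if (name == key) {
            val = substr(line, eq + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", val)
            found = val
        }
    }
    END { if (found != "") print found }
    ' "$2"
}

# Example:
#   cnf_value pid-file /etc/my.cnf
#   cnf_value socket   /etc/my.cnf
```

Because sections are ignored, a socket set under both [client] and [mysqld] would report whichever comes last, so the result is worth a second look in that case.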

Regards,

Yves



Thorn Roby

Jan 23, 2014, 7:11:32 PM
to prm-d...@googlegroups.com

I tried that but there was no change. Then I tried to simplify things, but that didn't help either.
I think there's something wrong with how the OCF variables are being populated by the mysql_prm56 script. I remember having some similar issues when I tried to set it up last year. What seems to happen is that the mysqld command line that the script executes fails because it's not getting all the variables out of my.cnf. In particular, I have the mysql datadir set to /data/mysql/data5615, and that should also be the default InnoDB data directory, but the mysqld command line started by the script uses /var/lib/mysql instead (it creates a new ibdata1 file there, then dies because it doesn't find an existing one). I tried symlinking my real directory to /var/lib/mysql, but it still dies, with a different error, which leads me to suspect some OCF variable is unset. I tried forcing OCF_ROOT to /usr/lib/ocf in root's environment but that didn't help.
The mysql error I'm getting that suggests a problem with the OCF_ROOT (or some other) variable (this is without forcing it in root's env) :

2014-01-23 16:58:11 7f12c981e7e0  InnoDB: Operating system error number 2 in a file operation.
InnoDB: The error means the system cannot find the path specified.
InnoDB: If you are installing InnoDB, remember that you must create
InnoDB: directories yourself, InnoDB does not create them.
2014-01-23 16:58:11 27911 [ERROR] InnoDB: File .//data/mysql/data5615/ibdata1: 'create' returned OS error 71. Cannot continue operation

The .//data prefix should be just /data.

If I try to run the script manually, I get
 /bin/bash /usr/lib/ocf/resource.d/percona/mysql start
/usr/lib/ocf/resource.d/percona/mysql: line 61: /lib/heartbeat/ocf-shellfuncs: No such file or directory

(missing the /usr/lib/ocf prefix which I think OCF_ROOT should be set to)



Thorn Roby

Jan 23, 2014, 7:14:27 PM
to prm-d...@googlegroups.com

Sorry, by "simplify things" I meant I reassigned everything to a single network (the MM.NN.32.0 network where the replication would normally run).

Yves Trudeau

Jan 24, 2014, 9:17:05 AM
to prm-d...@googlegroups.com
Before you even start, you need replication to work.  The agent is not supposed to be called directly; it needs numerous variables set by Pacemaker.  Next, in your config you have this:

primitive p_mysql ocf:rootpass:mysql \

and based on what I see in your email, it should be:

primitive p_mysql ocf:percona:mysql \

For the nodes, it _must_ be like I wrote, otherwise, it will not work.

Regards,

Yves



Thorn Roby

Jan 24, 2014, 12:23:28 PM
to prm-d...@googlegroups.com

Yes, I followed the setup guide. And, as I said, replication is working fine. The configuration does have
 
 primitive p_mysql ocf:percona:mysql 

(the other text was an error resulting from overzealous redaction on my part).

The nodes are configured as you suggested:

        attributes p_mysql_mysql_master_IP="MM.NN.32.180"
        attributes p_mysql_mysql_master_IP="MM.NN.32.181"
        attributes p_mysql_mysql_master_IP="MM.NN.32.182"

I realize that running the script from the shell requires environment variables, in particular OCF_ROOT, to be set. I tried that, and also tried ocf-tester, with no luck. However, my observations about the apparent failure to read variables from my.cnf (specifically datadir, which is what causes the restarted mysql instance to fail) were taken from watching the actual arguments in use during the attempted startup, not from running the script manually. In the command as the script starts it, datadir is forced to /var/lib/mysql, not /data/mysql/data5615 as it is in my.cnf. I tried specifically adding the innodb-path-to-datadir variable (which defaults to the mysql datadir if it is not specified, which should also work) but that was also ignored. I also tried symlinking the real datadir to /var/lib/mysql, which also failed but for other reasons. Here are the startup arguments as run by the mysqld script (mysqld_safe is not running; I turned off automatic startup via chkconfig):
 
/root/PS5615/bin/mysqld --defaults-file=/etc/my.cnf --enforce_gtid_consistency=1 --gtid_mode=on --pid-file=/var/lib/mysql/mysqld.pid --socket=/var/lib/mysql/mysql.sock --datadir=/var/lib/mysql --user=mysql --skip-slave-start --read-only 

With the real datadir symlinked to /var/lib/mysql, the failure looks like this (which is what suggests to me that some of the OCF variables are not getting set right, independent of the problem of not reading datadir from my.cnf):

grep datadir /etc/my.cnf
datadir=/data/mysql/data5615

2014-01-23 17:39:40 35056 [Note] InnoDB: Completed initialization of buffer pool
2014-01-23 17:39:42 7f8fcfd0c7e0  InnoDB: Operating system error number 2 in a file operation.
InnoDB: The error means the system cannot find the path specified.
InnoDB: If you are installing InnoDB, remember that you must create
InnoDB: directories yourself, InnoDB does not create them.
2014-01-23 17:39:42 35056 [ERROR] InnoDB: File .//data/mysql/data5615/ibdata1: 'create' returned OS error 71. Cannot continue operation

If the working directory of the mysqld script happens to be "/", this directory syntax would be OK, but in fact ibdata1 already exists in that directory and does not need to be recreated.

lrwxrwxrwx 1 root root 20 Jan 23 16:38 /var/lib/mysql -> /data/mysql/data5615
 ls -ld /data/mysql/data5615
drwxr-xr-x 5 mysql mysql 104 Jan 24 09:59 /data/mysql/data5615
 ls -l /data/mysql/data5615/ibdata1
-rw-rw---- 1 mysql mysql 12582912 Jan 23 16:13 /data/mysql/data5615/ibdata1


Yves Trudeau

Jan 24, 2014, 1:53:44 PM
to prm-d...@googlegroups.com
Hi Thorn,
   Hmm, it would be simpler if you just set datadir=/data/mysql/data5615 in the p_mysql CIB configuration instead of pointing to /var/lib/mysql.  Maybe I am wrong, but have you set something like this in your my.cnf?

innodb_data_file_path=/data/mysql/data5615/ibdata1

If so, keep in mind that the path will be interpreted as:

/var/lib/mysql/.//data/mysql/data5615/ibdata1


by MySQL.  That looks like your error.  I suggest you simply remove this line from your my.cnf and retry.
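
To check whether my.cnf contains such a line, something like the following works. It is a sketch with a made-up function name; it prints any innodb_data_file_path setting whose value begins with "/". As far as the MySQL manual goes, absolute paths in that option are meant to be used together with innodb_data_home_dir set to an empty string, so removing the line is the simpler route.

```shell
# Hypothetical helper: list (with line numbers) any innodb_data_file_path
# setting in the given file whose value starts with "/".
# Returns non-zero when no such line exists.
find_absolute_innodb_path() {
    grep -nE '^[[:space:]]*innodb[-_]data[-_]file[-_]path[[:space:]]*=[[:space:]]*/' "$1"
}

# Example:
#   find_absolute_innodb_path /etc/my.cnf || echo "no absolute innodb_data_file_path set"
```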

Regards,

Yves



Thorn Roby

Jan 27, 2014, 4:10:59 PM
to prm-d...@googlegroups.com

By adding the "datadir" parameter to the CIB I was able to get the cluster to start up and start mysql. However, only 2 of the 3 nodes (P2, P3) successfully join the cluster and start replication. Replication is running on all 3 nodes and is caught up, but the cluster reports that P1 is not accessible, although it also assigns all 3 reader_vips to it. The cluster sees P2 as the current master, and P3 agrees, but P1 still sees the master as P3 (which it was earlier) instead of P2. Manually changing the master on P1 to P2 succeeds, but the cluster still reports that p_mysql on P1 is not started.

 Current DC: eng-mysqlha-p1.mydomain.net - partition with quorum
Version: 1.1.8-7.el6-394e906
3 Nodes configured, 3 expected votes
7 Resources configured.



 reader_vip_1   (ocf::heartbeat:IPaddr2):       Started eng-mysqlha-p1.mydomain.net
 reader_vip_2   (ocf::heartbeat:IPaddr2):       Started eng-mysqlha-p1.mydomain.net
 reader_vip_3   (ocf::heartbeat:IPaddr2):       Started eng-mysqlha-p1.mydomain.net
 writer_vip     (ocf::heartbeat:IPaddr2):       Started eng-mysqlha-p2.mydomain.net
 Master/Slave Set: ms_MySQL [p_mysql]
     Masters: [ eng-mysqlha-p2.mydomain.net ]
     Slaves: [ eng-mysqlha-p3.mydomain.net ]
     Stopped: [ p_mysql:2 ]

Failed actions:
    p_mysql_start_0 (node=eng-mysqlha-p1.mydomain.net, call=31, rc=1, status=Timed Out): unknown error

I've also noticed that "service pacemaker stop" shuts down the mysql instances on P2 and P3, but not on P1 (I have to run "service mysql stop" to shut it down).


Thorn Roby

Jan 29, 2014, 5:34:51 PM
to prm-d...@googlegroups.com

After increasing the "start" timeout from 60 to 120 seconds I am able to get the 3 node cluster running. For a while I was able to connect to reader VIPs. However, now the pattern is that the cluster comes up, briefly shows all reader vips assigned to a single node (the one most recently started), then after about one minute all reader VIPs disappear, and no mysql connection can be made to them. The writer VIP remains intact, and all instances of mysql are running. Testing the readable parameter results in all nodes reporting readable:

cibadmin -Q |grep readable |grep nvpair
          <nvpair id="status-eng-mysqlha-p1.mydomain.net-readable" name="readable" value="1"/>
          <nvpair id="status-eng-mysqlha-p3.mydomain.net-readable" name="readable" value="1"/>
          <nvpair id="status-eng-mysqlha-p2.mydomain.net-readable" name="readable" value="1"/>

Restarting all pacemaker processes produces the same end result. Replication is intact, mysql connections can be made via the physical nic addresses and the writer VIP, but no reader VIPs exist.
 

Yves Trudeau

Jan 29, 2014, 7:03:42 PM
to prm-d...@googlegroups.com

Hi Thorn,
  At least there's progress.  Please send me the output of 'crm_mon -A1' when the reader VIPs are gone.

Regards,

Yves


Thorn Roby

Jan 30, 2014, 12:12:59 PM
to prm-d...@googlegroups.com
Last updated: Thu Jan 30 10:09:47 2014
Last change: Wed Jan 29 17:05:40 2014 via crmd on eng-mysqlha-p2.mydomain.net
Stack: classic openais (with plugin)
Current DC: eng-mysqlha-p3.mydomain.net - partition with quorum
Version: 1.1.8-7.el6-394e906
3 Nodes configured, 3 expected votes
7 Resources configured.



 writer_vip     (ocf::heartbeat:IPaddr2):       Started eng-mysqlha-p3.mydomain.net
 Master/Slave Set: ms_MySQL [p_mysql]
     Masters: [ eng-mysqlha-p3.mydomain.net ]

Node Attributes:
    + master-p_mysql                    : 60
    + nic                               : em2
    + p_mysql_mysql_master_IP           : MM.NN.32.180
    + readable                          : 1
    + master-p_mysql                    : 60
    + nic                               : em2
    + p_mysql_mysql_master_IP           : MM.NN.32.181
    + readable                          : 1
    + master-p_mysql                    : 1060
    + nic                               : em2
    + p_mysql_mysql_master_IP           : MM.NN.32.182
    + readable                          : 1

Yves Trudeau

Jan 30, 2014, 4:38:04 PM
to prm-d...@googlegroups.com
Hi Thorn,
   I think I found your problem...


location loc-No-reader-vip-2 reader_vip_2 \
        rule $id="rule-no-reader-vip-2" -inf: readable gt 0
location loc-No-reader-vip-3 reader_vip_3 \
        rule $id="rule-no-reader-vip-3" -inf: readable gt 0
location loc-no-reader-vip-1 reader_vip_1 \
        rule $id="rule-no-reader-vip-1" -inf: readable gt 0

If readable is 1, that rule forbids the resource from starting.  You should have:

location loc-No-reader-vip-2 reader_vip_2 \
        rule $id="rule-no-reader-vip-2" -inf: readable eq 0
location loc-No-reader-vip-3 reader_vip_3 \
        rule $id="rule-no-reader-vip-3" -inf: readable eq 0
location loc-no-reader-vip-1 reader_vip_1 \
        rule $id="rule-no-reader-vip-1" -inf: readable eq 0
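
The effect of the two operators follows from how the -inf score is applied: the rule bans the resource on every node where the expression is true. With readable set to 1 on all three nodes, "readable gt 0" is true everywhere, so no node may host a reader VIP, while "readable eq 0" bans only the nodes that are not readable. A quick scan for the inverted form, as a sketch (the function name is illustrative, and it reads configuration text on stdin):

```shell
# Hypothetical helper: read crm configuration text on stdin and print
# (with line numbers) any rule that bans a resource where readable > 0.
# Returns non-zero when no such rule is found.
find_inverted_readable_rules() {
    # -e is needed because the pattern itself starts with a dash
    grep -nE -e '-inf:[[:space:]]+readable[[:space:]]+gt[[:space:]]+0'
}

# Example:
#   crm configure show | find_inverted_readable_rules
```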

Regards,

Yves




Thorn Roby

Jan 30, 2014, 6:02:52 PM
to prm-d...@googlegroups.com
Great, thanks. It looks like I cut-and-pasted the summary of all CIB entries at the bottom of the setup guide, which has "gt 0", but looking back at the original entry further up in the document, it is "eq 0". I'll go over the rest of them and see if there are any other discrepancies.

Thorn Roby

Feb 3, 2014, 7:56:52 PM
to prm-d...@googlegroups.com
The cluster runs but continually resets the slave. I'm not sure whether this has been happening all along; it's possible I just didn't notice, because the status looks OK unless I tail the mysql log, and replication does work (I tested a few inserts). I'm attaching mysql, corosync and pacemaker logs covering one cycle: starting corosync, then pacemaker, waiting until the master/slave status is established, then shutting everything down.
harlandlogs.tgz

Thorn Roby

Feb 4, 2014, 12:03:29 PM
to prm-d...@googlegroups.com
I forgot to mention that with pacemaker off, replication functions normally with no thread restarts.

Yves Trudeau

Feb 4, 2014, 1:14:03 PM
to prm-d...@googlegroups.com
Hi Thorn,
   The pace.log was the way to go!  In the merge process, a "return" went missing, causing the continuous reconfig.  I pushed the updated agent; please update yours.

wget https://github.com/percona/percona-pacemaker-agents/raw/master/agents/mysql_prm56
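
After downloading, the file still has to replace the installed agent on every node. A hedged sketch of that step, assuming the agent lives at /usr/lib/ocf/resource.d/percona/mysql (the path that appears earlier in this thread); the function name and the .bak backup are illustrative choices, not part of PRM:

```shell
# Hypothetical helper: install a downloaded agent file over the existing
# one, keeping a backup copy of the previous version next to it.
install_agent() {
    src=$1
    dest=${2:-/usr/lib/ocf/resource.d/percona/mysql}   # assumed install path
    # Refuse to install a missing or empty download
    [ -s "$src" ] || { echo "missing or empty: $src" >&2; return 1; }
    if [ -f "$dest" ]; then
        cp -p "$dest" "$dest.bak" || return 1
    fi
    cp "$src" "$dest" && chmod 755 "$dest"
}

# Example, as root on each node after the wget above:
#   install_agent mysql_prm56
```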

Regards,

Yves

