Read-only node on non-primary status - SET GLOBAL read_only=1 via wsrep_notifiy_cmd processes FOREVER :(

1,243 views
Skip to first unread message

Theo Smith

unread,
Jul 24, 2013, 8:16:00 AM7/24/13
to codersh...@googlegroups.com
Hi Guys,

I am loving Galera, truly a well written and thought out multi-master replication solution.

I unfortunately only have one issue or one requirement. I need a non-primary server to be READ, so still serving select queries.

The best way I thought of doing this was through the wsrep_notify_cmd. No matter if I write a custom script, or use the "example"
script to pick up the cases, when I parse the queries to MySQL, the queries sit in the processlist forever.

I can however parse the commands/queries on CLI without a problem, it is only a problem when it executes through the wsrep_notify_cmd :(

Here is my present script, I have written several person, with or without cases, parsing mysql queries differently, nothing help :(

(I have tried various options as well, with splitbrain on true, etc, same result)

--
#!/bin/sh -eu

PRIMARY=`mysql -e "show status like 'wsrep_cluster_status';" -uroot -p123456 | awk '{print $2}' | sed -n '2p'`

echo $PRIMARY >/tmp/PRIMARY_STATUS

#echo "$2" >/tmp/PRIMARY
if [ "$PRIMARY" = "non-Primary" ]; then
        echo "SET GLOBAL wsrep_on=0;" >/tmp/read_only.sql
        echo "FLUSH TABLES WITH READ LOCK;" >>/tmp/read_only.sql
        echo "SET GLOBAL read_only = ON;" >>/tmp/read_only.sql
        mysql -B -uroot -p123456 < /tmp/read_only.sql
        echo "now in non-primary" > /tmp/primary
fi
if [ "$PRIMARY" = "Primary" ]; then
        echo "SET GLOBAL read_only = OFF;" >/tmp/read_only_off.sql
        echo "UNLOCK TABLES;" >>/tmp/read_only_off.sql
        mysql -B -uroot -p123456 < /tmp/read_only_off.sql
        echo "now in primary" > /tmp/primary
fi
exit 0
--

--
Every 1,0s: mysql -e "show processlist;" -p123456                                                      Wed Jul 24 14:14:09 2013

Id      User    Host    db      Command Time    State   Info
1       system user             NULL    Sleep   71626   committed 1846  NULL
2       system user             NULL    Sleep   75516   wsrep aborter idle      NULL
3       system user             NULL    Sleep   71864   committed 1842  NULL
4       system user             NULL    Sleep   71684   committed 1845  NULL
5       system user             NULL    Sleep   71744   committed 1844  NULL
164     root    localhost       NULL    Query   22      NULL    FLUSH TABLES WITH READ LOCK <- Just sits forever :(
187     root    localhost       NULL    Query   0       NULL    show processlist
--

Any help or suggestion would greatly help.

Not sure if it is a bug or something!?

Regards,

Theo

Ps.

Here is some information:

# cat /etc/issue
Ubuntu 12.04.2 LTS \n \l

# dpkg -l | grep -i galera
ii  galera                               23.2.4                       Galera Replication Framework

# dpkg -l | grep -i mysql
ii  libdbd-mysql-perl                    4.020-1build2                Perl5 database interface to the MySQL database
rc  libmysqlclient16                     5.1.66-0ubuntu0.11.10.2      MySQL database client library
ii  libmysqlclient18                     1:5.5.32-rel31.0-549.precise Percona Server database client library
ii  mysql-client                         5.5.31-0ubuntu0.12.04.2      MySQL database client (metapackage depending on the latest version)
ii  mysql-client-5.5                     5.5.31-0ubuntu0.12.04.2      MySQL database client binaries
ii  mysql-client-core-5.5                5.5.31-0ubuntu0.12.04.2      MySQL database core client binaries
ii  mysql-common                         5.5.31-0ubuntu0.12.04.2      MySQL database common files, e.g. /etc/mysql/my.cnf
ii  mysql-server-wsrep                   5.5.29-23.7.3                wsrep-enabled MySQL server
ii  php5-mysql                           5.3.10-1ubuntu3.7            MySQL module for php5

# cat /etc/mysql/my.cnf | grep -i notify
wsrep_notify_cmd=/usr/local/bin/wsrep_notify.sh

# cat /etc/mysql/my.cnf | grep -i provider_options
wsrep_provider_options="gcache.size=1G; evs.keepalive_period=PT3S; evs.inactive_check_period=PT7S; evs.suspect_timeout=PT15S; evs.inactive_timeout=PT30S; evs.consensus_timeout=PT30S; evs.send_window=1024; evs.user_send_window=512;"




Theo Smith

unread,
Jul 29, 2013, 3:47:12 AM7/29/13
to codersh...@googlegroups.com
Bump

I am no closer to solving this issue :(

Anybody trying to do something similar!?

Ilias Bertsimas

unread,
Jul 29, 2013, 9:01:57 AM7/29/13
to codersh...@googlegroups.com
Hi Theo,

I do not think you can execute a flush tables after the node has lost primary.

The only way that I can see making a node available to queries is by making it PRIMARY on it's own by setting wsrep_provider_options="pc.bootstrap=1".
This will probably make the node inconsistent with the old cluster and you will not have the ability to make it join the old one automatically once the network issues or whatever issue made it split from the cluster in the first place is resolved.

If it happens to keep the same cluster id when it becomes PRIMARY and you make sure there are no writes so the writeset position remains the same one as when it lost PRIMARY there might be a chance for it to join the old cluster. That is of course pure speculation on my part as I have not tested something like that.

Regards,
Ilias.

Alex Yurchenko

unread,
Jul 29, 2013, 4:50:03 PM7/29/13
to codersh...@googlegroups.com
Theo,

You can't use flush tables with read lock in such situation (I guess
that's a bug), and that should not be necessary. Just

SET GLOBAL read_only=1; SET GLOBAL wsrep_on=0;

(in that order, recovery should be in the opposite order) should be
enough.

If that does not work for you, please submit feature request at
https://launchpad.net/codership-mysql

Regards,
Alex
>> *164 root localhost NULL Query 22 NULL FLUSH
>> TABLES WITH READ LOCK* <- Just sits forever :(
--
Alexey Yurchenko,
Codership Oy, www.codership.com
Skype: alexey.yurchenko, Phone: +358-400-516-011

Andrew Basterfield

unread,
Jan 23, 2015, 12:29:54 PM1/23/15
to codersh...@googlegroups.com
SET GLOBAL read_only=1 will also hang indefinitely due to the fix for this bug http://bugs.mysql.com/bug.php?id=11733 in MySQL 5.1+
Reply all
Reply to author
Forward
0 new messages