Production usage

6,045 views
Skip to first unread message

Simon Balz

unread,
Feb 10, 2012, 8:38:32 AM2/10/12
to codership
Dear all,

I'm currently evaluating a MySQL galera cluster and I'm very happy
with the results so far.
My question is, if you already can recommend galera cluster for
production usage? Does anyone have experience, running such a cluster
on a virtualized environment, VMware for example?
What is the worst-case scenario when using galera? The plan is to run
daily full dumps and activating binary logging to have the
possibility, to restore the database on a native mysql server in case
of troubles.

Looking forward for your replies.
Best regards,
Simon

Sam Bashton

unread,
Feb 10, 2012, 8:48:46 AM2/10/12
to Simon Balz, codership

We're using in production on Amazon Web Services, in multiple
availability zones in a single region. Been running since the beginning
of the year with 1.1 and no issues thus far.

There's no issues with restoring a dump created from a Galera server
into standard MySQL. I *think* you could have a standard MySQL server as
a slave of one of the Galera servers, as a warm standby, although I've
not actually tested this.

Regards,


--
Sam Bashton, Director: Bashton Ltd
Bashton: Business Focussed Open Source
Building highly scalable Linux systems since 2004
www.bashton.com | 0161-424-9600 | DDI 0161-424-9611

signature.asc

Henrik Ingo

unread,
Feb 10, 2012, 8:51:41 AM2/10/12
to Simon Balz, codership
On Fri, Feb 10, 2012 at 3:38 PM, Simon Balz <si...@balz.me> wrote:
> I'm currently evaluating a MySQL galera cluster and I'm very happy
> with the results so far.
> My question is, if you already can recommend galera cluster for
> production usage?

We are yet to actually run it in production, but yes, I am
recommending it for production usage. I benchmarked the 0.8 version
for 5 weeks last Summer without a crash or issue. I'm also more
confident in Galera not screwing up data integrity compared to other
commonly used solutions like MySQL replicaiton - and I find the
management more pleasant.

> Does anyone have experience, running such a cluster
> on a virtualized environment, VMware for example?

Haven't yet. I don't really see an issue, other than things that could
be said about standalone MySQL in a VM too.

> What is the worst-case scenario when using galera? The plan is to run
> daily full dumps and activating binary logging to have the
> possibility, to restore the database on a native mysql server in case
> of troubles.

Sounds like a good idea. Notice that the binary log is only useful
together with a backup taken from the same node, and the node must not
have been subject to a full state snapshot after the backup.

henrik

--
henri...@avoinelama.fi
+358-40-8211286 skype: henrik.ingo irc: hingo
www.openlife.cc

My LinkedIn profile: http://www.linkedin.com/profile/view?id=9522559

Henrik Ingo

unread,
Feb 10, 2012, 8:53:17 AM2/10/12
to Sam Bashton, Simon Balz, codership
> into standard MySQL.  I *think* you could have a standard MySQL server as
> a slave of one of the Galera servers, as a warm standby, although I've
> not actually tested this.

Yep, done that.

admin extremeshok.com

unread,
Feb 10, 2012, 9:11:57 AM2/10/12
to codersh...@googlegroups.com
Wait for galera 2 release, as it fixes a bunch of nasty bugs, before
going into production.

Simon Balz

unread,
Feb 10, 2012, 9:35:56 AM2/10/12
to codership
> Sounds like a good idea. Notice that the binary log is only useful
> together with a backup taken from the same node, and the node must not
> have been subject to a full state snapshot after the backup.
So how can I achieve a consistent archive log without massively
decrease the performance? Does that mean I have to activate binary
logging on all cluster nodes?
Or what is the best practice for a point-in-time recovery backup of a
galera cluster?

And what do you guys use for load distribution? Galera recommends a
setup with a local haproxy or similar on each client connecting to the
cluster.
Do you think, using a physical loadbalancer, e.g. F5 is a good idea?

Sam Bashton

unread,
Feb 10, 2012, 9:41:16 AM2/10/12
to Simon Balz, codership
On Fri, Feb 10, 2012 at 06:35:56AM -0800, Simon Balz wrote:
> > Sounds like a good idea. Notice that the binary log is only useful
> > together with a backup taken from the same node, and the node must not
> > have been subject to a full state snapshot after the backup.
> And what do you guys use for load distribution? Galera recommends a
> setup with a local haproxy or similar on each client connecting to the
> cluster.
> Do you think, using a physical loadbalancer, e.g. F5 is a good idea?

We're using HAProxy on each client. No reason you couldn't use an F5 if
you've got one lying around, but it'd be hard to justify the expenditure
for this use only.

signature.asc

admin extremeshok.com

unread,
Feb 10, 2012, 9:44:11 AM2/10/12
to Simon Balz, codership
using HAProxy for loadbalancing and ha.

For backups, we run an extra galera server that we use for mysql dump at
30minute intervals.

HA PROXY <===> 3x galera (r/w) ----> galera (r) (backup only)

Alexey Yurchenko

unread,
Feb 10, 2012, 10:15:12 AM2/10/12
to codership
Hi Simon,

I guess I'll add something although I'm not impartial.

On Feb 10, 3:38 pm, Simon Balz <si...@balz.me> wrote:
> Dear all,
>
> I'm currently evaluating a MySQL galera cluster and I'm very happy
> with the results so far.
> My question is, if you already can recommend galera cluster for
> production usage?

1) Galera is a tool. Like any tool it has limitations, both obvious
and subtle, which determine its suitability for your use case.
2) As any sufficiently complex piece of software it has bugs, both
known and unknown and will always have, as it is not sanely possible
to test it against the whole multitude of possible use cases.

So, whare does it leave us with respect to "ready-for-production"
idiom? I guess it is only developers' inner feeling that "well, now
this can be used". And so it is any GA release. Because it works
perfectly - sans limitations and sans bugs. If it works for you, well,
then it works. So in the end it is you who decides.

> Does anyone have experience, running such a cluster
> on a virtualized environment, VMware for example?

We routinely use EC2 as a testbed. Also KVM on local machines. As
Henrik noted it is no defferent from running standalone MySQL server
and mostly that means rather poor client connection latencies.

> What is the worst-case scenario when using galera?

The worst case is the abort of N-1 nodes due to database corruption on
the remaining one (e.g. transaction tries to update a row that exists
only on this node). It is very user-unfriendly, but that the situation
at the moment. At some point we'll implement a quorum procedure for
such accidents.

In short, in that case you need to shut down the remaining server and
restart aborted ones.

> The plan is to run
> daily full dumps and activating binary logging to have the
> possibility, to restore the database on a native mysql server in case
> of troubles.

Unless you want to achieve point in time recovery, this is
unnecessary. You already have N servers with _identical_ states, you
don't need a dump and binlog to recover a failed node (even in the
"worst case" mentioned above). If what you're looking for is PITR, one
thing to note is that in this case you don't need various "HA"
settings like sync-binlog. That should speed things up a little.

And yes, you'd better wait for 2.0 ;)

Regards,
Alex

Alexey Yurchenko

unread,
Feb 10, 2012, 10:23:49 AM2/10/12
to codership
On Feb 10, 4:35 pm, Simon Balz <si...@balz.me> wrote:
> Do you think, using a physical loadbalancer, e.g. F5 is a good idea?

Well, HW load balancer is the next best thing after in-application
balancing. User-space load balancers are terrible CPU hogs. But
beware, it will only detect node failure, but not cluster partition
(i.e. when a node is alive but can't connect to other nodes and so
can't replicate and commit). This should be a very rare situation
though, probably impossible in LAN.

Haris Zukanovic

unread,
Feb 10, 2012, 11:01:38 AM2/10/12
to codersh...@googlegroups.com
Hi Alex,

Are you actively participating/monitoring also bugs reported in relation
to Percona mysql server?
Are these planned to be included resolved in upcoming 2.0 release?
https://bugs.launchpad.net/percona-xtradb-cluster

kind regards
Haris

--
--
Haris Zukanovic

Alex Yurchenko

unread,
Feb 10, 2012, 11:12:01 AM2/10/12
to codersh...@googlegroups.com
On 2012-02-10 19:01, Haris Zukanovic wrote:
> Hi Alex,
>
> Are you actively participating/monitoring also bugs reported in
> relation to Percona mysql server?
> Are these planned to be included resolved in upcoming 2.0 release?
> https://bugs.launchpad.net/percona-xtradb-cluster
>
>
>
> kind regards
> Haris
>

Hi Haris,

All of those bugs (except lp:917837, which is Percona-specific) are
mapped to either MySQL-wsrep or Galera projects. And if I'm not mistaken
they are all fixed already in Codership trees (so yes, they will be
included in 2.0). The process is then such that Percona pulls fixes from
Codership repos and applies them to their code base.

Regards,
Alex

PATRICKZ...@comcast.net

unread,
Feb 10, 2012, 4:58:10 PM2/10/12
to ad...@extremeshok.com, codership, Simon Balz
Hi,

Was curious if you'd be able to expand on how you are using the HA Proxy for LB and HA?

How do you handle the traffic to a specific node when it's removed from the cluster (ie RSU ops or a node failure)?  How does HA Proxy know that the desync'd node is no longer available?

I have a colleague looking at Scalebase to try and manage the traffic between the App Server and DBs and I don't think he has found a good way to handle desync'd nodes (or even to ID when a new node is brought into the cluster).

Was just curious if you were able to share more specifics about your use of HA Proxy?

Cheers
Patrick


From: "admin extremeshok.com" <ad...@extremeshok.com>
To: "Simon Balz" <si...@balz.me>
Cc: "codership" <codersh...@googlegroups.com>
Sent: Friday, February 10, 2012 9:44:11 AM
Subject: Re: [codership-team] Re: Production usage
--
You received this message because you are subscribed to the Google Groups "codership" group.
To post to this group, send email to codersh...@googlegroups.com.
To unsubscribe from this group, send email to codership-tea...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/codership-team?hl=en.

Haris Zukanovic

unread,
Feb 10, 2012, 6:38:38 PM2/10/12
to codersh...@googlegroups.com
As far as I know, the only way so far to recognise a desync'ed node is to use the notification script and notify the LB.

my.cnf cutout

# Command to call when node status or cluster membership changes.
# Will be passed all or some of the following options:
# --status  - new status of this node
# --uuid    - UUID of the cluster
# --primary - whether the component is primary or not ("yes"/"no")
# --members - comma-separated list of members
# --index   - index of this node in the list
#wsrep_notify_cmd=

I am not sure if that will work in a cluster partitioned due to network failures.
-- 
--
Haris Zukanovic

Simon Balz

unread,
Feb 11, 2012, 8:16:42 AM2/11/12
to codership
Hi Alex

Thanks for your input.
Still one more quesition:

> Unless you want to achieve point in time recovery, this is
> unnecessary. You already have N servers with _identical_ states, you
> don't need a dump and binlog to recover a failed node (even in the
> "worst case" mentioned above). If what you're looking for is PITR, one
> thing to note is that in this case you don't need various "HA"
> settings like sync-binlog. That should speed things up a little.

The main goal for PITR is not for system recovery but for data
recovery after user maloperation, like somebody (or some bad piece of
software ;) ) drops a table at 04:07pm.
Does this somehow work in a galera cluster? I didn't understand that
so far, sorry :)

> And yes, you'd better wait for 2.0 ;)
It won't be long until, right? :)

Simon

Simon Balz

unread,
Feb 11, 2012, 8:22:12 AM2/11/12
to codership

On Feb 10, 4:23 pm, Alexey Yurchenko <alexey.yurche...@codership.com>
wrote:
> Well, HW load balancer is the next best thing after in-application
> balancing. User-space load balancers are terrible CPU hogs. But
> beware, it will only detect node failure, but not cluster partition
> (i.e. when a node is alive but can't connect to other nodes and so
> can't replicate and commit). This should be a very rare situation
> though, probably impossible in LAN.

What is your best-practice so far? Does glb fit best with a galera
cluster? I've read a lot about haproxy and mysql loadbalancing and I'm
not very optimistic but let's see. Actually, the test setup is running
behind a physical loadbalancer, which works pretty fine since it
natively supports layer 7 mysql checks (e.g. running a simple query
every x seconds), which is a huge advantage compared to haproxy.

Sam Bashton

unread,
Feb 11, 2012, 8:36:20 AM2/11/12
to Simon Balz, codership

HAProxy also supports layer 7 checks, albeit in a slightly hacky way
because you have to return HTTP content.
I can't reproduce the script we use in production because it does some
specific checks for that implementation, but we took the following blog
post as 'inspiration':

http://www.alexwilliams.ca/blog/2009/08/10/using-haproxy-for-mysql-failover-and-redundancy/

We actually use a Python script, started via upstart to ensure if it dies for
some reason it gets restarted.

Anyway, if your hardware load balancer can do layer 7 mysql checks it
seems like it might be simpler to just use that, and IMO simpler is
better in 99% of cases :)

signature.asc

Alex Yurchenko

unread,
Feb 11, 2012, 10:19:45 AM2/11/12
to codersh...@googlegroups.com

Yes, I'd say that would be the only purpose of making backups of Galera
cluster. And it works like with a standalone mysql: all Galera nodes are
identical. You backuped one - you backuped them all. You restored table
to one - you restored it to all.

I didn't mean to say that it is not an important use case. I just
wanted to point out that Galera cluster is supposed to give you enough
redundancy against hardware or software failures (in a sense you have
synchronous binlogging to other nodes already), so the focus of taking
backups is somewhat different.

>> And yes, you'd better wait for 2.0 ;)
> It won't be long until, right? :)

Yes, yes, we're just squashing the regressions. Meanwhile you can try
2.0beta - just to have a feel.

Regards,
Alex

Alex Yurchenko

unread,
Feb 11, 2012, 11:11:58 AM2/11/12
to codersh...@googlegroups.com
On 2012-02-11 16:22, Simon Balz wrote:
> On Feb 10, 4:23 pm, Alexey Yurchenko <alexey.yurche...@codership.com>
> wrote:
>> Well, HW load balancer is the next best thing after in-application
>> balancing. User-space load balancers are terrible CPU hogs. But
>> beware, it will only detect node failure, but not cluster partition
>> (i.e. when a node is alive but can't connect to other nodes and so
>> can't replicate and commit). This should be a very rare situation
>> though, probably impossible in LAN.
>
> What is your best-practice so far?

Heh, we're software developers and, ironically, have little use for
MySQL clusters. We practice with gcc and gdb. ;)

Seriously, most of experience that we have comes from our internal
testing, and that one is rather synthetic (to be efficient in hitting
problems) and obviously it is in no way comparable to using cluster in
production. Moreover, users whom we know to use Galera in production
have such different use cases and requirements, that we can't derive
much from there.

What I can say for certain, is that any user-space load balancer at
least doubles client connection latency, and usually it is even worse.
And client connection latency is so bad for SQL performance, that
sometimes it is impossible to saturate 16 core server to more than 400%
with sysbench no matter how many connections you use. (It is most likely
a bug in sysbench, but still). In short, the client network can easily
be your bottleneck and whatever you can do to make it faster, you should
do.

> Does glb fit best with a galera cluster?

GLB was written for our testing purposes, so the main requirement for
it was performance. So it is rather primitive feature-wise and, most
importantly, we didn't have much feedback about it in production.

> I've read a lot about haproxy and mysql loadbalancing and I'm not
> very optimistic but let's see.

HAProxy is rather popular, and I guess that's its main advantage -
there are volumes of proven track records.

> Actually, the test setup is running
> behind a physical loadbalancer, which works pretty fine since it
> natively supports layer 7 mysql checks (e.g. running a simple query
> every x seconds), which is a huge advantage compared to haproxy.

As I said before, I would not even consider anything else had I had a
HW load balancer. You obviously have a LAN cluster there, cluster
partition with preserved client connectivity is next to impossible
there, so even layer 7 checks are not of much importance.

In any case, you could write a simple notification script that changes
balancer's routing table in response to node leaving primary component.
This is way better than layer 7 polling.

Regards,
Alex


Simon Balz

unread,
Feb 11, 2012, 11:12:07 AM2/11/12
to codership
Ok. For final understanding, does all that mean I need the binary log
only on one node?
Does the binary log affect the overall performance of the cluster
somehow?

Thanks for patience with my questions :)
Simon

On Feb 11, 4:19 pm, Alex Yurchenko <alexey.yurche...@codership.com>
wrote:

Alex Yurchenko

unread,
Feb 11, 2012, 2:59:02 PM2/11/12
to codersh...@googlegroups.com
On 2012-02-11 19:12, Simon Balz wrote:
> Ok. For final understanding, does all that mean I need the binary log
> only on one node?

Exactly. You should not forget to enable log_slave_updates though.

> Does the binary log affect the overall performance of the cluster
> somehow?

Well, it obviously does, but it is impossible to foretell by how much.
Since for PITR, you don't need to sync or flush it, it should not be
much.

Regards,
Alex

EllisGL

unread,
Feb 11, 2012, 7:48:36 PM2/11/12
to codership
Here's what I have done for HAProxy.

On each Galera node:
/etc/services
mysqlchk 9200/tcp # mysqlchk

/opt/mysqlchk (chmod +x)
#!/bin/bash
#
# This script checks if a mysql server is healthy running on
localhost. It will
# return:
#
# "HTTP/1.x 200 OK\r" (if mysql is running smoothly)
#
# - OR -
#
# "HTTP/1.x 500 Internal Server Error\r" (else)
#
# The purpose of this script is make haproxy capable of monitoring
mysql properly
#
# Author: Unai Rodriguez
#
# It is recommended that a low-privileged-mysql user is created to be
used by
# this script. Something like this:
#
# mysql> GRANT SELECT on mysql.* TO 'mysqlchkusr'@'localhost' \
# -> IDENTIFIED BY '257retfg2uysg218' WITH GRANT OPTION;
# mysql> flush privileges;

MYSQL_HOST="localhost"
MYSQL_PORT="3306"
MYSQL_USERNAME="mysqlchkusr"
MYSQL_PASSWORD="mysqlchkusrpassword!"

TMP_FILE="/tmp/mysqlchk.out"
ERR_FILE="/tmp/mysqlchk.err"

#
# We perform a simple query that should return a few results :-p
#
/usr/bin/mysql --host=$MYSQL_HOST --port=$MYSQL_PORT --user=
$MYSQL_USERNAME \
--password=$MYSQL_PASSWORD -e"show databases;" > $TMP_FILE 2>
$ERR_FILE

#
# Check the output. If it is not empty then everything is fine and we
return
# something. Else, we just do not return anything.
#
if [ "$(/bin/cat $TMP_FILE)" != "" ]
then
# mysql is fine, return http 200
/bin/echo -e "HTTP/1.1 200 OK\r\n"
/bin/echo -e "Content-Type: Content-Type: text/plain\r\n"
/bin/echo -e "\r\n"
/bin/echo -e "MySQL is running.\r\n"
/bin/echo -e "\r\n"
else
# mysql is fine, return http 503
/bin/echo -e "HTTP/1.1 503 Service Unavailable\r\n"
/bin/echo -e "Content-Type: Content-Type: text/plain\r\n"
/bin/echo -e "\r\n"
/bin/echo -e "MySQL is *down*.\r\n"
/bin/echo -e "\r\n"
fi

/etc/xinetd.d/mysqlchk (chmod 0644)
# default: on
# description: mysqlchk
service mysqlchk
{
disable = no
flags = REUSE
socket_type = stream
port = 9200
wait = no
user = nobody
server = /opt/mysqlchk
log_on_failure += USERID
only_from = 0.0.0.0/0
# recommended to put the IPs that need
# to connect exclusively (security purposes)
per_source = UNLIMITED
# Recently added (May 20, 2010)
# Prevents the system from complaining
# about having too many connections open from
# the same IP. More info:
# http://www.linuxfocus.org/English/November2000/article175.shtml
}


On the HAProxy box:
/etc/haproxy.cfg

global
daemon
maxconn 4096
quiet
user haproxy
group haproxy
#stats socket /tmp/haproxy
stats socket /tmp/haproxy mode 0600 level admin
pidfile /var/run/haproxy.pid

defaults
log global
mode tcp
option dontlognull retries 3 option redispatch
maxconn 2000
contimeout 5000
clitimeout 50000
srvtimeout 50000

listen MySQL 10.0.0.10:3306
mode tcp
option httpchk
balance roundrobin

server 10.0.0.11:3306 10.0.0.17:3306 check port 9200 inter
12000 rise 3 fall 3
server 10.0.0.12:3306 10.0.0.18:3306 check port 9200 inter
12000 rise 3 fall 3
server 10.0.0.13:3306 10.0.0.19:3306 check port 9200 inter
12000 rise 3 fall 3

source 10.0.0.10

Simon Balz

unread,
Feb 12, 2012, 4:45:54 AM2/12/12
to codership
Hey thanks for those snippets.
There's only one slight little problem in the mysqlchk script:
The command 'SHOW DATABASES' is not reliable to check a Galera node
state because it still works even when wsrep_local_state is < 4.
For example a 'SELECT' won't work anymore:

mysql> show status like 'wsrep_local_state';
+-------------------+-------+
| Variable_name | Value |
+-------------------+-------+
| wsrep_local_state | 0 |
+-------------------+-------+
1 row in set (0.00 sec)

mysql> select 1 from dual;
ERROR 1047 (08S01): Unknown command

So I'd recommend to change the mysqlchk command to
/usr/bin/mysql --host=$MYSQL_HOST --port=$MYSQL_PORT --user=
$MYSQL_USERNAME --password=$MYSQL_PASSWORD -e"SELECT 1 FROM DUAL" >
$TMP_FILE 2> $ERR_FILE
>         #http://www.linuxfocus.org/English/November2000/article175.shtml

Kacper Gogół

unread,
Feb 14, 2012, 4:16:43 PM2/14/12
to codership
This one is much much better:

WSSREP_STATUS=`/usr/bin/mysql --host=$MYSQL_HOST --user=
$MYSQL_USERNAME --password=$MYSQL_PASSWORD -e "show status like
'wsrep_local_state';" | awk '{if (NR!=1){print $2}}' 2>/dev/null`

#
# Check the galera cluster consistent on node, your solution still
allow connect to node even if cluster is desynced but mysql hear on
# 3306
#
#
if [ "$WSSREP_STATUS" == "4" ]
then
# mysql is fine, return http 200
/bin/echo -e "HTTP/1.1 200 OK\r\n"
/bin/echo -e "Content-Type: Content-Type: text/plain\r\n"
/bin/echo -e "\r\n"
/bin/echo -e "MySQL is running.\r\n"
/bin/echo -e "\r\n"
else
# mysql is fine, return http 503
/bin/echo -e "HTTP/1.1 503 Service Unavailable\r\n"
/bin/echo -e "Content-Type: Content-Type: text/plain\r\n"
/bin/echo -e "\r\n"
/bin/echo -e "MySQL is *down*.\r\n"
/bin/echo -e "\r\n"
fi

>         #http://www.linuxfocus.org/English/November2000/article175.shtml

Apoorva Gaurav

unread,
Aug 15, 2014, 9:59:07 PM8/15/14
to codersh...@googlegroups.com


On Wednesday, February 15, 2012 2:46:43 AM UTC+5:30, Kacper Gogół wrote:
This one is much much better:

WSSREP_STATUS=`/usr/bin/mysql --host=$MYSQL_HOST --user=
$MYSQL_USERNAME --password=$MYSQL_PASSWORD -e "show status like
'wsrep_local_state';" | awk '{if (NR!=1){print $2}}' 2>/dev/null`
we can also try this -e"show status where variable_name='wsrep_local_state_comment' and  value in ('Donor/Desynced', 'Synced');" 
Reply all
Reply to author
Forward
0 new messages