MariaDB Galera multiple datacenters

Mark C

unread,

Jan 7, 2014, 8:26:08 AM1/7/14

to codersh...@googlegroups.com

Hello,

I’m looking into using MariaDB & Galera to replace my current setup of MySQL and was looking for suggestions on how best to set up DR between two datacenters.

The servers are accessed via an app using the internet (already have failover in place for web traffic)

Current set up:

4(2 at each datacenter) MySQL servers (using innodb), setup in a Master – Slave replication as below:

dc1-master replicates to dc1-slave

dc2-master replicates to dc2-slave

dc1-master replicates to dc2-master

I only write to one master hence why there’s no replication back between dc1-master & dc2-master

The data centres are in two different geographical locations connected by a 1GB link (used as extension to local network); latency on this connection is around 5-6ms. They also have external internet links (100/1000MB).

So I was wondering what the best set is up to go for, I was thinking:

One galera cluster containing all 4 servers connected via the 1GB link which is fine but from what I’ve read if the link was to go down(temporary loss e.g. 60mins) both parts of the cluster would stop accepting reads and writes to avoid split brain due to there not been a majority for quorum to use.

I have seen that I could add more servers/weight quorum to my primary dc (dc1) so then that would have majority but what if dc1 fully goes down (power failure/total network loss), would dc2 be set to not accept connections?

Is there a way I could get set up so that if dc1 is fully down then dc2 will be given majority so it can accept connections automatically like my web traffic does?

Thanks

Mark

Alex Yurchenko

unread,

Jan 7, 2014, 4:55:54 PM1/7/14

to codersh...@googlegroups.com

Yes. If majority fails, you lose quorum.

> Is there a way I could get set up so that if dc1 is fully down then dc2
> will be given majority so it can accept connections automatically like
> my
> web traffic does?

Unfortunately, no, not without some third entity. Galera won't be able
to distinguish "dc1 down" from "dc1-dc2 connection down". Web traffic
goes to web servers which are stateless. With databases you need to make
sure you're not writing to outdated replica, and that's much harder.

Regards,
Alex

> Thanks
>
> Mark

--
Alexey Yurchenko,
Codership Oy, www.codership.com
Skype: alexey.yurchenko, Phone: +358-400-516-011

Mark C

unread,

Jan 8, 2014, 6:28:33 AM1/8/14

to codersh...@googlegroups.com

Hi Alexey,

Thanks for your reply.

How would you say is best to set up a third entity, e.g. a full server (mariadb+galera) or an arbitor?

and as I'm wanting a preference to dc1 would i set it up there or elsewhere e.g. my office,cloud based server(amazon or other)?

just thought would it be worth putting both ips(internal & external) of the servers into wsrep_cluster_address= or would that add network traffic and not solve the issue? can you get it to prioritize servers?

how would I go about making dc2 the live one if dc1 is fully down? would it be a case of on the cluster in dc2 by doing the following:

choose the most updated of the nodes. This can be done by checking the output of SHOW STATUS LIKE 'wsrep_last_committed'. Choose the node with the highest value.
run SET GLOBAL wsrep_provider_options='pc.bootstrap=yes' on it
set app connections accordingly (ignore dc1 and write to the server in dc2 thats had step 2 run on it)

Thanks

Mark

Alex Yurchenko

unread,

Jan 8, 2014, 8:34:33 AM1/8/14

to codersh...@googlegroups.com

On 2014-01-08 13:28, Mark C wrote:
> Hi Alexey,
>
> Thanks for your reply.
>
> How would you say is best to set up a third entity, e.g. a full server
> (mariadb+galera) or an arbitor?

Well, actually, "third entity" is to be understood in a broader,
theoretical sense. It can be a node or it can be a human operator - some
party that has additional, external information on the situation, beyond
simple "connection to dc1 lost". Something that would allow us to be
sure that dc1 is no longer operational. If it is a human, then he might
know that dc1 is really down. If it is a node, then dc1 is in minority.

> and as I'm wanting a preference to dc1 would i set it up there or
> elsewhere
> e.g. my office,cloud based server(amazon or other)?
>
> just thought would it be worth putting both ips(internal & external) of
> the
> servers into wsrep_cluster_address= or would that add network traffic
> and
> not solve the issue? can you get it to prioritize servers?

Arbitrator just counts as a node when quorum is calculated.

So if you put arbitrator in dc1, it would be identical to increasing
weight of one of dc1 nodes, only less efficient.

If you put arbitrator outside, then you should not worry about
prioritizing dc1 as it will fail only when it looses connection to both
dc2 AND arbitrator, which is likely to mean that it lost all outside
connectivity. Note in this case you want to minimize arbitrator
connection latency to both datacenters. So EC2 may be not a good idea.

> how would I go about making dc2 the live one if dc1 is fully down?
> would it
> be a case of on the cluster in dc2 by doing the following:
>
>

> 1. choose the most updated of the nodes. This can be done by

> checking
> the output of SHOW STATUS LIKE 'wsrep_last_committed'. Choose the
> node
> with the highest value.

> 2. run SET GLOBAL wsrep_provider_options='pc.bootstrap=yes' on it
> 3. set app connections accordingly (ignore dc1 and write to the

> server
> in dc2 thats had step 2 run on it)

Right on.

Mark C

unread,

Jan 8, 2014, 9:23:36 AM1/8/14

to codersh...@googlegroups.com

Thanks for your reply,

would using both WAN and LAN server IP's in wsrep_cluster_address work as a backup method if the 1GB LAN link between the datacenters goes down and the WAN connection was still working?

from what I'm understanding you are suggesting in my case to keep the failover of databases between datacentres as a manual thing that a human would carry out to be sure that which ever one has failed has actually failed?

Thanks

Mark

Alex Yurchenko

unread,

Jan 8, 2014, 10:09:49 AM1/8/14

to codersh...@googlegroups.com

On 2014-01-08 16:23, Mark C wrote:
> Thanks for your reply,
>
> would using both WAN and LAN server IP's in wsrep_cluster_address work
> as a
> backup method if the 1GB LAN link between the datacenters goes down and
> the
> WAN connection was still working?

1) wsrep_cluster_address value is used only on initial connect. As long
as at least one address there is valid, the node connects to cluster and
gets all other addresses from there.

2) Galera maintains a single link between each pair of nodes.

3) What you are asking for is essentially interface bonding. Galera can
not do that ATM. But I suppose you can, using other tools. Even if you
have a single physical NIC with 2 different IPs on it, you probably
still can do some tricks using bridging or even just by setting up smart
routing table.

> from what I'm understanding you are suggesting in my case to keep the
> failover of databases between datacentres as a manual thing that a
> human
> would carry out to be sure that which ever one has failed has actually
> failed?

I'm not qualified to make any advices in the matters like that. It
depends on your requirements and resources and the choice is yours. I
can only explain alternatives.

Mark C

unread,

May 12, 2014, 11:16:45 AM5/12/14

to codersh...@googlegroups.com

Hi

I've now moved over to using galera cluster and finding it very good.

But I'm now investigating arbitrators again and would like some help regarding if it would work in my situation.

The setup I went for is two nodes in each data center so four nodes in total, all of the nodes connect via internal (private) IP's (192.168.0.1, 192.168.0.2, etc) but if I was to use a arbitrator it would be hosted elsewhere (a third party) so would be connecting to the current cluster via an external IP, would it work having a mix of internal and external IP's?

Thanks

Mark

Daniel Black

unread,

May 12, 2014, 10:57:26 PM5/12/14

to Mark C, codersh...@googlegroups.com

----- Original Message -----
> The setup I went for is two nodes in each data center so four nodes in
> total, all of the nodes connect via internal (private) IP's
> (192.168.0.1, 192.168.0.2, etc) but if I was to use a arbitrator it
> would be hosted elsewhere (a third party) so would be connecting to
> the current cluster via an external IP, would it work having a mix of
> internal and external IP's?

Addressing and routing is outside the scope of what galera cares about so yes you can mix unicast addresses in anyway you want.

Be sure to use a different gmcast.segment value in each DC for network optimisation.

--
--
Daniel Black, Engineer @ Open Query (http://openquery.com.au)
Remote expertise & maintenance for MySQL/MariaDB server environments.

Mark C

unread,

May 13, 2014, 4:06:34 AM5/13/14

to codersh...@googlegroups.com, Mark C, daniel...@openquery.com

Hi Daniel

Thanks for the reply, I'll look at setting the gmcast.segment value.

If my limited understanding of arbitrators is correct, the arbitrator acts as a listening node (doesn’t store any data) and is used to help with quorum, so for example if one of the data centers suffers a full outage, whichever can see the most nodes (themselves + arbitrator) will carry on as the 'Primary component'.

So in my case, if the 1GB link between the two data centers go down but both data centers can still see the internet would they then use the arbitrator as the go between node as it were? And the cluster would carry on as if nothing had happened to the link?

Thanks

Mark

yan.zhang

unread,

May 14, 2014, 3:01:10 AM5/14/14

to codersh...@googlegroups.com, Mark C, daniel...@openquery.com

Yes. Because galera use multicast communication underlying. And if dc1 wants to send data to dc2, it can use garb as relay. And vice versa. Although it's not efficient, cluster still works.

Thanks
Mark

Mark C

unread,

Aug 20, 2014, 7:08:35 AM8/20/14

to codersh...@googlegroups.com, orcm...@gmail.com, daniel...@openquery.com

Hi All

I've now added a garb node (third party hosted), its connecting to the cluster using the nodes external ips (145.251.x.x) and the rest of the nodes are using internal ips (10.0.x.x) from looking at log on the garb node it contains these messages every two minutes:

2014-08-20 10:46:32.016 INFO: (7c5a354f-2486-11e4-ae76-93bf7d9be6b3, 'ssl://0.0.0.0:4567') reconnecting to fcf31c42-2230-11e4-b838-a6a7ddd912e3 (ssl://10.0.0.51:4567), attempt 13260

2014-08-20 10:46:33.517 INFO: (7c5a354f-2486-11e4-ae76-93bf7d9be6b3, 'ssl://0.0.0.0:4567') reconnecting to 45691e47-2223-11e4-9dfc-8e68b8558cf2 (ssl://10.0.0.50:4567), attempt 13260

2014-08-20 10:46:35.517 INFO: (7c5a354f-2486-11e4-ae76-93bf7d9be6b3, 'ssl://0.0.0.0:4567') reconnecting to 7ebd042a-1e4f-11e4-b6e3-6f7312bdf15b (ssl://10.0.0.120:4567), attempt 13260

The IPs are of the other servers in the cluster (its 4 node + garb), the IP that doesn't appear is the one of the node it connects to via its external IP (145.251.x.x). So I was wondering what these messages mean and if I'm able to fix them? Also when the garb node dropped out (guessing a blip in internet connection) it also caused the cluster to throw:

Duplicate entry 'xxxxx' for key 'PRIMARY'

wsrep_auto_increment_control is set to ON so I didn't think could Duplicate entries could happen.