WAN Cluster unstable


Ross McFadyen

Nov 4, 2015, 6:00:23 AM11/4/15
to codership
Hi, my name's Ross,

First post here, please be gentle :D

I'm currently in the process of setting up a Galera cluster over WAN, spanning various data centres across the world (mostly Europe). However, we seem to be having problems with the cluster breaking up into partitions. The behaviour seems quite strange, as it does not happen at a particular time and is not specific to a certain number of nodes or group of nodes. In some cases, after an indeterminate amount of time, the nodes leave the partitioned cluster and rejoin the `main` cluster, although not always.

We are currently using the following versions across all the nodes: Server version: 5.6.21 MySQL Community Server (GPL), wsrep_25.9

We build our nodes via Puppet, so the my.cnf below is consistent across all nodes (I have obfuscated our IPs and removed node names, etc.):

[mysql]
port = 54998
socket = /usr/local/mysql-galera/tmp/mysql-galera.sock
pid-file=/usr/local/mysql-galera/galera.pid


[mysqld]
federated
basedir=/usr/local/mysql-galera
port = 54998

socket = /usr/local/mysql-galera/tmp/mysql-galera.sock

datadir = /usr/local/mysql-galera/data

log-error=/var/log/galera/staging.log

tmpdir = /usr/local/mysql-galera/tmp

wsrep_cluster_name = $cluster_name

wsrep_node_name = $node_name

wsrep_node_address = 172.16.**.**

pid-file=/usr/local/mysql-galera/galera.pid


user=mysql
wsrep_provider = /usr/lib64/galera-3/libgalera_smm.so
wsrep_notify_cmd = /usr/local/bin/galeranotify.py
wsrep_sst_method = xtrabackup-v2
wsrep_sst_auth = wsrep:citnow2015
wsrep_provider_options = "evs.keepalive_period = PT3S"
wsrep_provider_options = "evs.suspect_timeout = PT30S"
wsrep_provider_options = "evs.inactive_timeout = PT1M"
wsrep_provider_options = "evs.install_timeout = PT1M"
wsrep_provider_options = "evs.inactive_check_period = PT3S"
wsrep_provider_options = "evs.join_retrans_period = PT1S"

wsrep_sst_receive_address = 172.16.**.**:4444


The only entries in the servers' logs relate to evs::proto. I can only assume this is eviction related, but I can't find much on it. In an attempt to find out, I added the evs variables to our my.cnf as you can see above, but the problem still persists.


2015-10-28 21:04:20 25974 [Note] WSREP: (05d47528, 'tcp://0.0.0.0:4567') address 'tcp://172.16.*.**:4567' pointing to uuid 05d47528 is blacklisted, skipping
2015-10-28 21:04:20 25974 [Note] WSREP: (05d47528, 'tcp://0.0.0.0:4567') address 'tcp://172.16.*.**:4567' pointing to uuid 05d47528 is blacklisted, skipping
2015-10-28 21:04:21 25974 [Note] WSREP: (05d47528, 'tcp://0.0.0.0:4567') reconnecting to c26a3519 (tcp://192.168.**.**:4567), attempt 0
2015-10-28 21:04:22 25974 [Note] WSREP: evs::proto(05d47528, OPERATIONAL, view_id(REG,05d47528,63)) suspecting node: c26a3519
2015-10-28 21:04:22 25974 [Note] WSREP: evs::proto(05d47528, OPERATIONAL, view_id(REG,05d47528,63)) suspected node without join message, declaring inactive
2015-10-28 21:04:23 25974 [Note] WSREP: declaring 2612d743 at tcp://192.168.**.**:4567 stable
2015-10-28 21:04:23 25974 [Note] WSREP: declaring 3b467ac0 at tcp://192.168.**.**:54567 stable
2015-10-28 21:04:23 25974 [Note] WSREP: declaring 4cfabdc3 at tcp://10.44.1.***:4567 stable
2015-10-28 21:04:23 25974 [Note] WSREP: declaring 5d9a2da1 at tcp://192.168.**.**:4567 stable
2015-10-28 21:04:23 25974 [Note] WSREP: declaring 8aa69abb at tcp://192.168.44.**:4567 stable
2015-10-28 21:04:23 25974 [Note] WSREP: declaring 910c09a4 at tcp://10.44.*.***:4567 stable
2015-10-28 21:04:23 25974 [Note] WSREP: declaring e4502e3e at tcp://192.168.**.**:4567 stable
2015-10-28 21:04:23 25974 [Note] WSREP: declaring ef10664a at tcp://192.168.**.**:4567 stable
2015-10-28 21:04:23 25974 [Note] WSREP: declaring fe8e5a2c at tcp://192.168.*.*:4567 stable
2015-10-28 21:04:23 25974 [Note] WSREP: Node 05d47528 state prim
2015-10-28 21:04:24 25974 [Note] WSREP: view(view_id(PRIM,05d47528,64) memb {
        05d47528,0
        2612d743,0
        3b467ac0,0
        4cfabdc3,0
        5d9a2da1,0
        8aa69abb,0
        910c09a4,0
        e4502e3e,0
        ef10664a,0
        fe8e5a2c,0
} joined {
} left {
} partitioned {
        c26a3519,0
})

Can anyone provide any help?

Many Thanks,
Ross McFadyen



Philip Stoev

Nov 4, 2015, 6:28:52 AM11/4/15
to Ross McFadyen, codersh...@googlegroups.com
Hello,

The log seems to indicate that a node is trying and failing to connect to
another on its private IP address:

2015-10-28 21:04:21 25974 [Note] WSREP: (05d47528, 'tcp://0.0.0.0:4567')
reconnecting to c26a3519 (tcp://192.168.**.**:4567), attempt 0
2015-10-28 21:04:22 25974 [Note] WSREP: evs::proto(05d47528, OPERATIONAL,
view_id(REG,05d47528,63)) suspecting node: c26a3519

Can you check that all your nodes have the correct wsrep_node_address
setting? It is also useful to check that all firewalls are open both ways
between all the nodes on all Galera ports.
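As an illustrative sketch (not from the original thread), the "open both ways" check can be scripted. The standard Galera ports are 4567 (group communication), 4568 (IST) and 4444 (SST); this cluster also uses custom ports such as 54567 and 54998, so adjust the list accordingly. The peer address below is a placeholder:

```python
import socket

# Standard Galera ports: group communication, IST, SST.
# Adjust for custom ports (e.g. 54567, 54998 in this cluster).
GALERA_PORTS = [4567, 4568, 4444]

def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Placeholder peer address -- run this from every node against every
# other node, since asymmetric firewall rules are exactly what it catches.
peers = ["172.16.0.10"]
for host in peers:
    for port in GALERA_PORTS:
        print(f"{host}:{port}", "open" if port_open(host, port) else "BLOCKED")
```

Running it from each node in both directions exposes the asymmetric case where node X can reach node Y but not vice versa.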

If that does not help, please find the log of the node that was evicted from
the cluster at that time and see if any interesting events are registered
there.

Thank you.

Philip Stoev

Teemu Ollakka

Nov 4, 2015, 6:50:25 AM11/4/15
to codership

Hi,

First of all, what is your Galera library version?


On Wednesday, November 4, 2015 at 1:00:23 PM UTC+2, Ross McFadyen wrote:
wsrep_provider_options = "evs.keepalive_period = PT3S"
wsrep_provider_options = "evs.suspect_timeout = PT30S"
wsrep_provider_options = "evs.inactive_timeout = PT1M"
wsrep_provider_options = "evs.install_timeout = PT1M"
wsrep_provider_options = "evs.inactive_check_period = PT3S"

Try removing the evs.inactive_check_period setting. It might not be the cause of your troubles, but it does not do any good either.
 
wsrep_provider_options = "evs.join_retrans_period = PT1S"


- Teemu 

Ross McFadyen

Nov 4, 2015, 8:37:06 AM11/4/15
to codership, ross.m...@citnow.com
Hello,

Thanks for the reply. The nodes stay stable for days at a time, then a subset will partition for a while and later rejoin. I think this rules out the firewall and an incorrect wsrep_node_address setting? Please correct me if I'm wrong, though.

We are using a mixture of EC2 and other boxes, but all routing is done via our VPN, so it's hopefully not EC2-related.

Cheers,
Ross

Philip Stoev

Nov 4, 2015, 9:19:19 AM11/4/15
to Ross McFadyen, codership, ross.m...@citnow.com
Hello,

Without the logs from all nodes involved it is difficult to say, but if the settings and the firewalls are not fully symmetrical, node X may sometimes be able to establish a TCP connection to node Y, thus forming a cluster, while a later connection attempt from node Y back to node X fails, all other things being equal.

Ross McFadyen

Nov 4, 2015, 10:51:03 AM11/4/15
to codership, ross.m...@citnow.com
Hello,

Okay, that's useful. The last partition we had was at 11:56 AM (GMT); here are the logs from each of the servers corresponding to that time.

I have taken the logs covering what I believe to be the events running up to and during the partitioning from 4 of the 11 nodes. I have omitted all the "pointing to uuid xxxxxxx is blacklisted, skipping" messages to make them more readable.

Hopefully from these logs you can help us understand and fix the problem.

Cheers,
Ross
UK.log
UKACG.log
Germany.log
South Africa.log

Philip Stoev

Nov 5, 2015, 10:59:47 AM11/5/15
to Ross McFadyen, codership, ross.m...@citnow.com
Hello,

Can you please try the following:

* Put all your wsrep_provider_options settings on a single line, in a single string separated by semicolons. Having multiple wsrep_provider_options directives in your my.cnf causes only the last one to be processed. Therefore, most of the timeouts you specified did not kick in and the defaults were used, which are more suitable for LAN than for WAN. On a running server, you can execute SHOW VARIABLES LIKE 'wsrep_provider_options' and check the timeouts that are actually in effect.

* Specify the same gmcast.segment value for all nodes located in a single data center, with a distinct value for each data center.
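For illustration only (a sketch, not taken from Ross's actual config): the evs options from the my.cnf above, merged into the single wsrep_provider_options string described here, together with an example gmcast.segment value, might look like:

```ini
# One wsrep_provider_options directive -- only the last occurrence is
# processed, so every option must live in this single string.
# evs.inactive_check_period is omitted per Teemu's earlier suggestion.
wsrep_provider_options = "evs.keepalive_period = PT3S; evs.suspect_timeout = PT30S; evs.inactive_timeout = PT1M; evs.install_timeout = PT1M; evs.join_retrans_period = PT1S; gmcast.segment = 1"
```

Here gmcast.segment = 1 is a placeholder: every node in the same data centre would share one value, with a different value (0, 1, 2, ...) per data centre. The effective settings can then be confirmed with SHOW VARIABLES LIKE 'wsrep_provider_options'.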

Ross McFadyen

Nov 6, 2015, 6:44:07 AM11/6/15
to codership, ross.m...@citnow.com
Hello,

You were right: the wsrep_provider_options were not taking effect, as only the last one was being processed, so effectively our setup was not optimised for WAN. I have also done as Teemu suggested and removed the inactive_check_period setting, so this will now be back to the default.

Now we wait and see if the cluster is stable. I will post my findings here, but hopefully this has resolved the issue.

Many Thanks,
Ross McFadyen

James Wang

Nov 6, 2015, 9:46:00 AM11/6/15
to codership, ross.m...@citnow.com
Any update, please? Thanks

Ross McFadyen

Nov 6, 2015, 10:50:34 AM11/6/15
to codership, ross.m...@citnow.com
Hello James,

The cluster is running stable at the moment. Will update when/if anything happens.

Cheers,
Ross

Ross McFadyen

Nov 9, 2015, 9:39:55 AM11/9/15
to codership, ross.m...@citnow.com
Hello Philip,

I implemented the changes you suggested on the 6th to all of our servers apart from 1, so 10 of our 11 servers had the wsrep_provider_options changed. I verified this using the SHOW VARIABLES LIKE command as you suggested.

The cluster remained stable until the early morning of the 8th, when 3 nodes partitioned at different times; 2 of the nodes had the updated wsrep_provider_options and 1 did not.

I have attached the logs from all 11 servers so we can hopefully get to the bottom of why the 2 nodes with the updated wsrep_provider_options partitioned. The partitioning lasted roughly 4 minutes, from 6:36 am to 6:40 am (GMT).

Cheers,
Ross
ag.txt
us.txt
za-log.log
aunz-log.log
bfogs.txt
de.txt
eu.txt
m-log.log
ns-log.log
px-log.log
uk.txt