WAN Cluster unstable


Ross McFadyen

Nov 4, 2015, 6:00:23 AM11/4/15
to codership
Hi, my name's Ross,

First post here, please be gentle :D

I'm currently in the process of setting up a Galera cluster over WAN, spanning various data centres across the world (mostly Europe). However, we seem to be having problems with the cluster breaking up into partitions. The behaviour seems quite strange, as it does not happen at a particular time and is not specific to a certain number of nodes or group of nodes. In some cases, after an indeterminate amount of time, the nodes leave the partitioned cluster and rejoin the `main` cluster, although not always.

We are currently using the following versions across all the nodes: Server version: 5.6.21 MySQL Community Server (GPL), wsrep_25.9

We build our nodes via Puppet, so the my.cnf below is consistent across all nodes (I have obfuscated our IPs and removed node names, etc.):

[mysql]
port = 54998
socket = /usr/local/mysql-galera/tmp/mysql-galera.sock
pid-file=/usr/local/mysql-galera/galera.pid


[mysqld]
federated
basedir=/usr/local/mysql-galera
port = 54998

socket = /usr/local/mysql-galera/tmp/mysql-galera.sock

datadir = /usr/local/mysql-galera/data

log-error=/var/log/galera/staging.log

tmpdir = /usr/local/mysql-galera/tmp

wsrep_cluster_name = $cluster_name

wsrep_node_name = $node_name

wsrep_node_address = 172.16.**.**

pid-file=/usr/local/mysql-galera/galera.pid


user=mysql
wsrep_provider = /usr/lib64/galera-3/libgalera_smm.so
wsrep_notify_cmd = /usr/local/bin/galeranotify.py
wsrep_sst_method = xtrabackup-v2
wsrep_sst_auth = wsrep:citnow2015
wsrep_provider_options = "evs.keepalive_period = PT3S"
wsrep_provider_options = "evs.suspect_timeout = PT30S"
wsrep_provider_options = "evs.inactive_timeout = PT1M"
wsrep_provider_options = "evs.install_timeout = PT1M"
wsrep_provider_options = "evs.inactive_check_period = PT3S"
wsrep_provider_options = "evs.join_retrans_period = PT1S"

wsrep_sst_receive_address = 172.16.**.**:4444


The only entries in the servers' logs relate to evs::proto. I can only assume this is eviction related, but I can't find much on it. In an attempt to find out, I added the evs variables to our my.cnf as you can see above, but the problem still persists.


2015-10-28 21:04:20 25974 [Note] WSREP: (05d47528, 'tcp://0.0.0.0:4567') address 'tcp://172.16.*.**:4567' pointing to uuid 05d47528 is blacklisted, skipping
2015-10-28 21:04:20 25974 [Note] WSREP: (05d47528, 'tcp://0.0.0.0:4567') address 'tcp://172.16.*.**:4567' pointing to uuid 05d47528 is blacklisted, skipping
2015-10-28 21:04:21 25974 [Note] WSREP: (05d47528, 'tcp://0.0.0.0:4567') reconnecting to c26a3519 (tcp://192.168.**.**:4567), attempt 0
2015-10-28 21:04:22 25974 [Note] WSREP: evs::proto(05d47528, OPERATIONAL, view_id(REG,05d47528,63)) suspecting node: c26a3519
2015-10-28 21:04:22 25974 [Note] WSREP: evs::proto(05d47528, OPERATIONAL, view_id(REG,05d47528,63)) suspected node without join message, declaring inactive
2015-10-28 21:04:23 25974 [Note] WSREP: declaring 2612d743 at tcp://192.168.**.**:4567 stable
2015-10-28 21:04:23 25974 [Note] WSREP: declaring 3b467ac0 at tcp://192.168.**.**:54567 stable
2015-10-28 21:04:23 25974 [Note] WSREP: declaring 4cfabdc3 at tcp://10.44.1.***:4567 stable
2015-10-28 21:04:23 25974 [Note] WSREP: declaring 5d9a2da1 at tcp://192.168.**.**:4567 stable
2015-10-28 21:04:23 25974 [Note] WSREP: declaring 8aa69abb at tcp://192.168.44.**:4567 stable
2015-10-28 21:04:23 25974 [Note] WSREP: declaring 910c09a4 at tcp://10.44.*.***:4567 stable
2015-10-28 21:04:23 25974 [Note] WSREP: declaring e4502e3e at tcp://192.168.**.**:4567 stable
2015-10-28 21:04:23 25974 [Note] WSREP: declaring ef10664a at tcp://192.168.**.**:4567 stable
2015-10-28 21:04:23 25974 [Note] WSREP: declaring fe8e5a2c at tcp://192.168.*.*:4567 stable
2015-10-28 21:04:23 25974 [Note] WSREP: Node 05d47528 state prim
2015-10-28 21:04:24 25974 [Note] WSREP: view(view_id(PRIM,05d47528,64) memb {
        05d47528,0
        2612d743,0
        3b467ac0,0
        4cfabdc3,0
        5d9a2da1,0
        8aa69abb,0
        910c09a4,0
        e4502e3e,0
        ef10664a,0
        fe8e5a2c,0
} joined {
} left {
} partitioned {
        c26a3519,0
})

Can anyone provide any help?

Many Thanks,
Ross McFadyen



Philip Stoev

Nov 4, 2015, 6:28:52 AM11/4/15
to Ross McFadyen, codersh...@googlegroups.com
Hello,

The log seems to indicate that a node is trying and failing to connect to
another on its private IP address:

2015-10-28 21:04:21 25974 [Note] WSREP: (05d47528, 'tcp://0.0.0.0:4567')
reconnecting to c26a3519 (tcp://192.168.**.**:4567), attempt 0
2015-10-28 21:04:22 25974 [Note] WSREP: evs::proto(05d47528, OPERATIONAL,
view_id(REG,05d47528,63)) suspecting node: c26a3519

Can you check that all your nodes have the correct wsrep_node_address
setting? It is also useful to check that all firewalls are open both ways
between all the nodes on all Galera ports.
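As an illustrative sketch (not from the original thread), the "open both ways" check can be scripted. The standard Galera ports are 4567 (group communication), 4568 (IST) and 4444 (SST); this cluster also uses custom ports such as 54567 and 54998, so adjust the list accordingly. The peer address below is a placeholder:

```python
import socket

# Standard Galera ports: group communication, IST, SST.
# Adjust for custom ports (e.g. 54567, 54998 in this cluster).
GALERA_PORTS = [4567, 4568, 4444]

def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Placeholder peer address -- run this from every node against every
# other node, since asymmetric firewall rules are exactly what it catches.
peers = ["172.16.0.10"]
for host in peers:
    for port in GALERA_PORTS:
        print(f"{host}:{port}", "open" if port_open(host, port) else "BLOCKED")
```

Running it from each node in both directions exposes the asymmetric case where node X can reach node Y but not vice versa.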

If that does not help, please find the log of the node that was evicted from
the cluster at that time and see if any interesting events are registered
there.

Thank you.

Philip Stoev

Teemu Ollakka

Nov 4, 2015, 6:50:25 AM11/4/15
to codership

Hi,

First of all, what is your Galera library version?


On Wednesday, November 4, 2015 at 1:00:23 PM UTC+2, Ross McFadyen wrote:
wsrep_provider_options = "evs.keepalive_period = PT3S"
wsrep_provider_options = "evs.suspect_timeout = PT30S"
wsrep_provider_options = "evs.inactive_timeout = PT1M"
wsrep_provider_options = "evs.install_timeout = PT1M"
wsrep_provider_options = "evs.inactive_check_period = PT3S"

Try removing the evs.inactive_check_period setting. It might not be the cause of your troubles, but it does not do any good either.
 
wsrep_provider_options = "evs.join_retrans_period = PT1S"


- Teemu 

Ross McFadyen

Nov 4, 2015, 8:37:06 AM11/4/15
to codership, ross.m...@citnow.com
Hello,

Thanks for the reply. The nodes stay stable for days at a time, then a subset will partition for a while and later rejoin. I think this rules out the firewall and an incorrect wsrep_node_address setting? Please correct me if I'm wrong, though.

We are using a mixture of EC2 and other boxes, but all routing is done via our VPN, so it's hopefully not EC2-related.

Cheers,
Ross

Philip Stoev

Nov 4, 2015, 9:19:19 AM11/4/15
to Ross McFadyen, codership, ross.m...@citnow.com
Hello,

Without the logs from all nodes involved it is difficult to say, but if the settings and the firewalls are not fully symmetrical, node X may sometimes be able to establish a TCP connection to node Y, thus forming a cluster, while a later connection attempt from node Y back to node X fails, all other things being equal.

Ross McFadyen

Nov 4, 2015, 10:51:03 AM11/4/15
to codership, ross.m...@citnow.com
Hello,

Okay, that's useful. The last partition we had was at 11:56 AM (GMT); here are the logs from each of the servers corresponding to that time.

I have taken the logs covering what I believe to be the events running up to and during the partitioning from 4 of the 11 nodes. I have omitted all the "pointing to uuid xxxxxxx is blacklisted, skipping" messages to make them more readable.

Hopefully from these logs you can help us understand and fix the problem.

Cheers,
Ross
UK.log
UKACG.log
Germany.log
South Africa.log

Philip Stoev

Nov 5, 2015, 10:59:47 AM11/5/15
to Ross McFadyen, codership, ross.m...@citnow.com
Hello,

Can you please try the following:

* Put all your wsrep_provider_options settings on a single line, in a single string separated by semicolons. Having multiple wsrep_provider_options directives in your my.cnf causes only the last one to be processed. Therefore, most of the timeouts you specified did not kick in and the defaults were used, which are more suitable for LAN than for WAN. On a running server, you can execute SHOW VARIABLES LIKE 'wsrep_provider_options' and check the timeouts that are actually in effect.

* Specify the same gmcast.segment value for all nodes located in a single data center, with a distinct value for each data center.
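For illustration only (a sketch, not taken from Ross's actual config): the evs options from the my.cnf above, merged into the single wsrep_provider_options string described here, together with an example gmcast.segment value, might look like:

```ini
# One wsrep_provider_options directive -- only the last occurrence is
# processed, so every option must live in this single string.
# evs.inactive_check_period is omitted per Teemu's earlier suggestion.
wsrep_provider_options = "evs.keepalive_period = PT3S; evs.suspect_timeout = PT30S; evs.inactive_timeout = PT1M; evs.install_timeout = PT1M; evs.join_retrans_period = PT1S; gmcast.segment = 1"
```

Here gmcast.segment = 1 is a placeholder: every node in the same data centre would share one value, with a different value (0, 1, 2, ...) per data centre. The effective settings can then be confirmed with SHOW VARIABLES LIKE 'wsrep_provider_options'.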

Ross McFadyen

Nov 6, 2015, 6:44:07 AM11/6/15
to codership, ross.m...@citnow.com
Hello,

You were right: the wsrep_provider_options were not taking effect, as only the last one was being processed, so effectively our setup was not optimised for WAN. I have also done as Teemu suggested and removed the inactive_check_period setting, so this will now be back to the default.

Now we wait and see if the cluster is stable. I will post my findings here, but hopefully this has resolved the issue.

Many Thanks,
Ross McFadyen

James Wang

Nov 6, 2015, 9:46:00 AM11/6/15
to codership, ross.m...@citnow.com
Any update, please? Thanks

Ross McFadyen

Nov 6, 2015, 10:50:34 AM11/6/15
to codership, ross.m...@citnow.com
Hello James,

The cluster is running stable at the moment. Will update when/if anything happens.

Cheers,
Ross

Ross McFadyen

Nov 9, 2015, 9:39:55 AM11/9/15
to codership, ross.m...@citnow.com
Hello Philip,

I implemented the changes you suggested on the 6th to all of our servers apart from 1, so 10 of our 11 servers had the wsrep_provider_options changed. I verified this using the SHOW VARIABLES LIKE command as you suggested.

The cluster remained stable until the early morning of the 8th, when 3 nodes partitioned at different times; 2 of the nodes had the updated wsrep_provider_options and 1 did not.

I have attached the logs from all 11 servers so we can hopefully get to the bottom of why the 2 nodes with the updated wsrep_provider_options partitioned. The partitioning lasted roughly 4 minutes, from 6:36 am to 6:40 am (GMT).

Cheers,
Ross
ag.txt
us.txt
za-log.log
aunz-log.log
bfogs.txt
de.txt
eu.txt
m-log.log
ns-log.log
px-log.log
uk.txt