MM and asynchronous slaves


KTWalrus

Mar 21, 2012, 5:38:00 PM
to codership
Just discovered Galera and have a question...

If I set up a Galera Multi-Master cluster, can I still hang normal
MySQL asynchronous slaves off each master?

My situation is that I need to control when the slaves process
updates. The slaves are read-only during the day and sync with the
master at night. Clients that need the most up to date info connect
to the master. All others connect to the read-only slaves. Also, I'm
thinking of distributing two masters geographically and have slaves in
one location sync with closest master.

I'd like to scale my master with Galera and still not affect the
slaves. But, when I sync the slaves with the master, I need to get
all changes that have happened to the master without losing any. Does
this require row-based asynchronous replication built into MySQL for a
specific master in the Galera Cluster to provide updates to the read-
only slaves?

Chris Boulton

Mar 21, 2012, 5:53:45 PM
to KTWalrus, codership
> If I set up a Galera Multi-Master cluster, can I still hang normal
> MySQL asynchronous slaves off each master?

You can, and it's the way I see the scale-out strategy for a MySQL cluster with Galera. You introduce several masters to scale out writes, all running Galera for replication. You can then create a slave/read-only network by hanging slaves off those masters to scale your reads.

Think of Galera as just providing an additional method of replication. The old MySQL replication is still there, and works exactly how it used to.

Regards,

Chris Boulton
BigCommerce

Web: http://www.bigcommerce.com




--
You received this message because you are subscribed to the Google Groups "codership" group.
To post to this group, send email to codersh...@googlegroups.com.
To unsubscribe from this group, send email to codership-tea...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/codership-team?hl=en.


KTWalrus

Mar 21, 2012, 7:45:54 PM
to codership
> Think of Galera as just providing an additional method of replication. The
> old MySQL replication is still there, and works exactly how it used to.

Thanks.

Do I need to use row-based replication for slaves?

KTWalrus

Mar 21, 2012, 7:55:50 PM
to codership
One more question... If I have the Galera Cluster split
geographically in two locations, and each location has 4 masters, what
happens if the network connection between the two locations goes
down? Will each side continue to operate and when the network is
working again, will the masters all resync automatically? Or, will I
have a "split-brain" problem that will only be resolved by manually
syncing a master at one location with a master at the other datacenter?

Alex Yurchenko

Mar 21, 2012, 8:03:09 PM
to codersh...@googlegroups.com

Hi,

You will end up with row-based replication anyway, because Galera uses
row-based events by default and that's what will be re-replicated to the
slaves. And you can't really recover statements from row events. Is that
a problem?

Regards,
Alex

--
Alexey Yurchenko,
Codership Oy, www.codership.com
Skype: alexey.yurchenko, Phone: +358-400-516-011

Alexey Yurchenko

Mar 21, 2012, 8:27:00 PM
to codersh...@googlegroups.com
Yes, there will be split-brain. Your options are:
- odd number of masters,
- odd number of datacenters,
- an arbitrator outside (essentially in a 3rd datacenter again)

The benefit of 3 datacenters in your case would be that if the A-B connection breaks but the A-C and B-C connections stay up, A to B replication will happen through C with minimal impact. I'd think about it ;)

To recover from split-brain you
- either wait for the connection to recover
- or _manually_ promote one component to PRIMARY; then, when the connection recovers, the other component will reconnect and sync with the primary. Telling the cluster which component is PRIMARY is the only thing you do manually.
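
The reasoning behind the "odd number" options above can be sketched in a few lines. This is only a simplified illustration, assuming every node carries equal weight and that a component stays PRIMARY only when it holds a strict majority of the previous membership; the function name `has_quorum` is hypothetical and is not part of Galera.

```python
def has_quorum(component_size: int, previous_size: int) -> bool:
    """Return True if a partitioned component still holds a strict
    majority of the nodes that were in the cluster before the split.

    Simplified model: all nodes have equal weight.
    """
    return 2 * component_size > previous_size


if __name__ == "__main__":
    # Two datacenters with 4 masters each, link between them goes down:
    # each side sees 4 of 8 nodes, so neither side has a majority.
    print(has_quorum(4, 8))

    # Same layout plus one arbitrator in a third location (9 members):
    # the side that can still reach the arbitrator has 5 of 9.
    print(has_quorum(5, 9))
    print(has_quorum(4, 9))
```

Under this model an even split can never leave a majority on either side, which is why an extra node, datacenter, or arbitrator is suggested.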

Henrik Ingo

Mar 22, 2012, 3:29:12 AM
to Chris Boulton, KTWalrus, codership
On Wed, Mar 21, 2012 at 11:53 PM, Chris Boulton <ch...@bigcommerce.com> wrote:
>> If I set up a Galera Multi-Master cluster, can I still hang normal
>> MySQL asynchronous slaves off each master?
>
>
> You can, and it's the way I see the scale-out strategy for a MySQL cluster
> with Galera. You introduce several masters to scale out writes, all running
> Galera for replication. You can then create a slave/read-only network by
> hanging slaves off those masters to scale your reads.

Just a comment here: While the original use case by Mr Walrus was
quite valid, for just general scale-out I'd consider tossing MySQL
replication altogether. There is no reason why you couldn't have 10
Galera nodes in a cluster. You can choose to designate some of them
read-only if you prefer. It is much simpler to not mix 2 different
kinds of replication unless you have a good reason.

Good reasons in my opinion would be:
- you really want the replication to be asynchronous, such as time
delayed, offline, etc... (this thread)
- slow WAN links (slaves in China, Australia, Africa etc...)
- You really have lots of slaves, like 50 or more. It could be useful
to have them more decoupled in such a case. (Although, Galera might
be just fine even then.)

henrik

--
henri...@avoinelama.fi
+358-40-8211286 skype: henrik.ingo irc: hingo
www.openlife.cc

My LinkedIn profile: http://www.linkedin.com/profile/view?id=9522559

Chris Boulton

Mar 22, 2012, 8:17:43 AM
to henri...@avoinelama.fi, codership
Thanks - I was hoping someone who's been around a bit longer would chime in.

Not to take things too off topic, but right now I'm thinking MySQL replication would be beneficial to slaves that are used for backup purposes, right?

I'm assuming that, because running backups could potentially cause a bit of I/O contention on one of the nodes, having that node in a synchronous cluster may affect the performance of the overall cluster as changes are committed across all nodes.

Regards,

Chris Boulton
BigCommerce

Web: http://www.bigcommerce.com



Alexey Yurchenko

Mar 22, 2012, 9:14:17 AM
to codersh...@googlegroups.com, henri...@avoinelama.fi
Hi Chris,

Thanks for bringing that up. We have a way to "desynchronize" a node from the cluster, but for now it is implemented only for running DDLs if you want to do a rolling schema upgrade. Perhaps it would make sense to turn it into a generic feature.

However all is not lost.

1) You can set up a very high slave queue length tolerance on a given node so that it won't slow the cluster down.

2) This is probably a preferred way to do backups on galera cluster as you will get a backup corresponding to a known global transaction ID - and therefore can recover from it more reliably. The thing is that backup is nothing but a procedure of taking state snapshot. So if you run garbd as follows:

$ ./garbd -o gmcast.listen_addr=tcp://0.0.0.0:3333 -g my_test_cluster -a gcomm://192.168.0.1:4567 --sst backup --donor <desired donor name>

the desired donor will run wsrep_sst_backup script like this

120322 14:53:25 [Note] WSREP: Running: 'wsrep_sst_backup 'donor' 'ackup' 'root:rootpass' '/mnt/d1/mysql/var/' '/mnt/d1/mysql/etc/my.cnf' '3a52eaed-7360-11e1-0800-70cdaf716282' '0' '0'

and there you can write whatever you like (note that atm the second parameter is screwed up, will be fixed next release). See wsrep_sst_mysqldump and wsrep_sst_rsync for examples, http://www.codership.com/wiki/doku.php?id=scriptable_state_snapshot_transfer for calling convention.

Doing backup this way will
- give you a backup with a known global transaction ID, which you will be able to use to quickly promote new nodes
- take care that the donor does not slow down the cluster

Getting back to async slaves - I think the only (but quite valid) reason to use native async replication for slaves is when you do WAN replication and have a few slaves at each datacenter. Currently, running those slaves as Galera nodes will generate quite a lot of extra WAN traffic, which may cost extra or degrade WAN link performance. That's the reason I'm not advocating going full Galera in such a setup. Although if your WAN link permits that, going all Galera would save you a lot of trouble with node recovery and master failover.

Regards,
Alex

KTWalrus

Mar 22, 2012, 11:45:12 AM
to codership
Thanks for all the info. I'll have to think about this and formulate a
plan.

A couple more questions do come to mind:

What if I have two datacenters, each running a separate Galera
Cluster? I would need to sync the masters once a day (at night). Can
I set up asynchronous two way sync that is efficient over the WAN link
using Galera (that is, does Galera support asynchronous delayed
replication between two clusters)? Or, can this be done with native
MySQL replication (sync'ing two masters while each master is a member
of separate Galera Clusters)?

Right now, I'm thinking that I will do away with the read-only slaves
(currently there are 4 of them) and just go with a 4 node Galera
Cluster in its place. The web servers will have nginx/PHP/MySQL
services all on each server. These web servers use haproxy for load
balancing and high availability. Eventually, I hope to have this set
up at 2 datacenters (one on the west coast and the other on the east
coast). When that time comes, I will need to figure out how to sync
the two clusters overnight without affecting performance of the two
clusters during the day. If Galera can support such asynchronous WAN
cluster sync, I'd prefer to use that. That is, I would have a cron
job initiate the cluster sync during the night and I only want to
transfer the minimum amount of data over the WAN link.

Any opinions on this approach?


Alex Yurchenko

Mar 22, 2012, 2:36:43 PM
to codersh...@googlegroups.com
Until Galera supports efficient inter-datacenter replication, native
master-slave is quite a valid option to synchronize two Galera clusters,
especially if you don't have hard sync requirements. And I don't think
it will affect performance much - native replication is single-threaded.
In fact, if you want it to have minimal impact, you'd better spread it
over the whole day. Just don't forget to set log_slave_updates on the
slave end. Maybe for speed you can have one pair of nodes doing it in one
direction and another pair of nodes in the opposite direction.
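
As a rough illustration of the setting mentioned above, the sketch below only prints a my.cnf fragment for the node that acts as the async slave; it does not talk to MySQL. The helper name `replication_fragment`, the `server_id` value and the `mysql-bin` log name are assumptions made for the example, while `log_bin`, `binlog_format` and `log_slave_updates` are standard MySQL options.

```python
import configparser
import io


def replication_fragment(server_id: int) -> str:
    """Render a [mysqld] fragment for a node that receives native
    asynchronous replication (values are illustrative only)."""
    config = configparser.ConfigParser()
    config["mysqld"] = {
        "server_id": str(server_id),   # must be unique per server
        "log_bin": "mysql-bin",        # binary logging has to be enabled
        "binlog_format": "ROW",        # Galera works with row-based events
        "log_slave_updates": "1",      # the option named in this post
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    print(replication_fragment(2))
```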

Regards,
Alex

pu21...@hotmail.com

Jan 10, 2013, 6:19:37 AM
to codersh...@googlegroups.com
Hi Alex, I have encountered a problem: when I run a hot copy with the wsrep_sst_xtrabackup script provided by Codership, I get this error:
120110 19:01:32 [ERROR] WSREP: Process completed with error: wsrep_sst_backup --role 'donor' --address '' --auth 'sst:123abc' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' --gtid '857b217e-3a8d-11e1-0800-5eb8ec0b0a53:26': 2 (No such file or directory)

The wsrep_sst_xtrabackup script was renamed to wsrep_sst_backup and placed in the system path: /usr/sbin

I am not sure whether a missing script is what leads to this problem.
BTW, I want to ask a question: do I need to write the wsrep_sst_backup script myself? Is there anywhere to get it?

Many thanks!

The logs are as follows:

120110 19:01:32 [Note] WSREP: forgetting 0f66d434-727c-11e1-0800-26a6b6183efc (tcp://10.0.211.78:3333)
120110 19:01:32 [Note] WSREP: (147eaaa8-3b79-11e1-0800-1caef94ada9d, 'tcp://0.0.0.0:4567') turning message relay requesting off
120110 19:01:32 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 1, memb_num = 3
120110 19:01:32 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.
120110 19:01:32 [Note] WSREP: STATE EXCHANGE: sent state msg: 1057a792-727c-11e1-0800-6c48fedc2a8d
120110 19:01:32 [Note] WSREP: STATE EXCHANGE: got state msg: 1057a792-727c-11e1-0800-6c48fedc2a8d from 0 (localhost.localdomain)
120110 19:01:32 [Note] WSREP: STATE EXCHANGE: got state msg: 1057a792-727c-11e1-0800-6c48fedc2a8d from 1 (10.0.211.79)
120110 19:01:32 [Note] WSREP: STATE EXCHANGE: got state msg: 1057a792-727c-11e1-0800-6c48fedc2a8d from 2 (10.0.211.81)
120110 19:01:32 [Note] WSREP: Quorum results:
        version    = 2,
        component  = PRIMARY,
        conf_id    = 10,
        members    = 3/3 (joined/total),
        act_id     = 26,
        last_appl. = 0,
        protocols  = 0/4/2 (gcs/repl/appl),
        group UUID = 857b217e-3a8d-11e1-0800-5eb8ec0b0a53
120110 19:01:32 [Note] WSREP: Flow-control interval: [28, 28]
120110 19:01:32 [Note] WSREP: New cluster view: global state: 857b217e-3a8d-11e1-0800-5eb8ec0b0a53:26, view# 11: Primary, number of nodes: 3, my index: 1, protocol version 2
120110 19:01:32 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
120110 19:01:32 [Note] WSREP: Assign initial position for certification: 26, protocol version: 2
120110 19:01:32 [Warning] WSREP: Could not find peer: 0f66d434-727c-11e1-0800-26a6b6183efc
120110 19:01:32 [Warning] WSREP: 1 (10.0.211.79): State transfer to -1 (left the group) failed: -1 (Operation not permitted)
120110 19:01:32 [Note] WSREP: Shifting DONOR/DESYNCED -> JOINED (TO: 26)
120110 19:01:32 [Note] WSREP: Member 1 (10.0.211.79) synced with group.
120110 19:01:32 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 26)
120110 19:01:32 [Note] WSREP: Synchronized with group, ready for connections
120110 19:01:32 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
120110 19:01:32 [ERROR] WSREP: Process completed with error: wsrep_sst_backup --role 'donor' --address '' --auth 'sst:123abc' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' --gtid '857b217e-3a8d-11e1-0800-5eb8ec0b0a53:26': 2 (No such file or directory)
120110 19:01:32 [Warning] WSREP: Protocol violation. JOIN message sender 1 (10.0.211.79) is not in state transfer (SYNCED). Message ignored.
120110 19:01:37 [Note] WSREP:  cleaning up 0f66d434-727c-11e1-0800-26a6b6183efc (tcp://10.0.211.78:3333)



On Friday, March 23, 2012 at 2:36:43 AM UTC+8, Alexey Yurchenko wrote:

Alex Yurchenko

Jan 10, 2013, 9:44:23 AM
to codersh...@googlegroups.com
Hi Danny,

Unfortunately

1) the log you posted is less than 1 second worth

2) the log messages seem to be drastically out of order (which suggests
that your system is very overloaded)

Both of these make it impossible to say anything definite, but it is
very likely that /usr/sbin is not in the PATH of the mysqld process, so
it can't find the script.

As for your second question:

> BTW,I want to ask a question:Whether or not need me to write the
> wsrep_sst_backup scripts ?is there anywhere to get it ?

I'm afraid I didn't understand it. Three SST scripts:
wsrep_sst_mysqldump, wsrep_sst_rsync, wsrep_sst_xtrabackup are part of
MySQL-wsrep package and are installed (normally) in /usr/bin.
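
One way to check the PATH theory is to ask whether a script name resolves under a given search path. The sketch below is only an illustration using Python's standard `shutil.which`; the helper names `find_script` and `demo` are hypothetical, and the demo uses a throwaway directory with a stand-in script rather than a real Galera installation.

```python
import os
import shutil
import tempfile


def find_script(name, search_path):
    """Return the full path of executable `name` found in `search_path`
    (a PATH-style string), or None if it cannot be resolved."""
    return shutil.which(name, path=search_path)


def demo():
    # Build a throwaway directory holding a stand-in for the SST script.
    with tempfile.TemporaryDirectory() as script_dir:
        script = os.path.join(script_dir, "wsrep_sst_backup")
        with open(script, "w") as handle:
            handle.write("#!/bin/sh\nexit 0\n")
        os.chmod(script, 0o755)  # must be executable to be resolved

        other_dir = os.path.join(script_dir, "elsewhere")
        os.mkdir(other_dir)

        found = find_script("wsrep_sst_backup", script_dir)
        missing = find_script("wsrep_sst_backup", other_dir)
        return found is not None, missing is None


if __name__ == "__main__":
    print(demo())
```

To test a real server, pass the PATH that the mysqld process actually runs with, which may differ from the PATH of an interactive shell; on Linux it can usually be read from /proc/<pid>/environ given sufficient permissions.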

Regards,
Alex


pu21...@hotmail.com

Jan 11, 2013, 1:42:33 AM
to codership
Hi Alex,

I copied the wsrep_sst_xtrabackup file, which I downloaded from the
Codership website, to my /usr/bin directory and renamed it to
wsrep_sst_backup. Then, on a joiner machine that wasn't in the cluster
at the time, I issued:

garbd -o gmcast.listen_addr=tcp://0.0.0.0:3333 -g my_wsrep_cluster -a gcomm://10.0.211.79:4567 --sst backup --donor 10.0.211.79

The donor then seems to work, apart from a minor error:

WSREP_SST: [ERROR] innobackupex /tmp 2> /var/lib/mysql//innobackup.backup.log | nc (20120111 13:52:08.781)

This is a statement I added to the wsrep_sst_backup script.

I reviewed the script and know that the variable WSREP_SST_OPT_ADDR
was not set.

What is wrong with it?

If I want to take a hot copy of any node of the cluster, is the
operation above correct? What are the steps for recovery?

Can you give me some instructions about it?

Many thanks!


Best regards,

Danny Pu
Jinan, Shandong Province, China

The donor's log was as follows:

120111 13:52:01 [Note] WSREP: Flow-control interval: [28, 28]
120111 13:52:01 [Note] WSREP: New cluster view: global state: 857b217e-3a8d-11e1-0800-5eb8ec0b0a53:28, view# 103: Primary, number of nodes: 3, my index: 2, protocol version 2
120111 13:52:01 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
120111 13:52:01 [Note] WSREP: Assign initial position for certification: 28, protocol version: 2
120111 13:52:01 [Note] WSREP: Member 0 (10.0.211.81) synced with group.
120111 13:52:07 [Note] WSREP: declaring 0520ccde-5bb3-11e2-0800-5625fa869020 stable
120111 13:52:07 [Note] WSREP: declaring 0833def2-5bb3-11e2-0800-8cca055d837b stable
120111 13:52:07 [Note] WSREP: declaring 688ee448-72f4-11e1-0800-71e26c0e9177 stable
120111 13:52:08 [Note] WSREP: view(view_id(PRIM, 0520ccde-5bb3-11e2-0800-5625fa869020,107) memb {
0520ccde-5bb3-11e2-0800-5625fa869020,
0833def2-5bb3-11e2-0800-8cca055d837b,
688ee448-72f4-11e1-0800-71e26c0e9177,
aea8d302-3bf5-11e1-0800-fba69358a1b5,
} joined {
} left {
} partitioned {
})
120111 13:52:08 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 3, memb_num = 4
120111 13:52:08 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.
120111 13:52:08 [Note] WSREP: STATE EXCHANGE: sent state msg: 09b8aca8-5bb3-11e2-0800-27e8b55ea5c5
120111 13:52:08 [Note] WSREP: STATE EXCHANGE: got state msg: 09b8aca8-5bb3-11e2-0800-27e8b55ea5c5 from 0 (10.0.211.81)
120111 13:52:08 [Note] WSREP: STATE EXCHANGE: got state msg: 09b8aca8-5bb3-11e2-0800-27e8b55ea5c5 from 2 (localhost.localdomain)
120111 13:52:08 [Note] WSREP: STATE EXCHANGE: got state msg: 09b8aca8-5bb3-11e2-0800-27e8b55ea5c5 from 3 (10.0.211.79)
120111 13:52:08 [Note] WSREP: STATE EXCHANGE: got state msg: 09b8aca8-5bb3-11e2-0800-27e8b55ea5c5 from 1 (garb)
120111 13:52:08 [Note] WSREP: Quorum results:
        version    = 2,
        component  = PRIMARY,
        conf_id    = 103,
        members    = 3/4 (joined/total),
        act_id     = 28,
        last_appl. = 0,
        protocols  = 0/4/2 (gcs/repl/appl),
        group UUID = 857b217e-3a8d-11e1-0800-5eb8ec0b0a53
120111 13:52:08 [Note] WSREP: Flow-control interval: [32, 32]
120111 13:52:08 [Note] WSREP: New cluster view: global state: 857b217e-3a8d-11e1-0800-5eb8ec0b0a53:28, view# 104: Primary, number of nodes: 4, my index: 3, protocol version 2
120111 13:52:08 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
120111 13:52:08 [Note] WSREP: Assign initial position for certification: 28, protocol version: 2
120111 13:52:08 [Note] WSREP: Node 1 (garb) requested state transfer from '10.0.211.79'. Selected 3 (10.0.211.79)(SYNCED) as donor.
120111 13:52:08 [Note] WSREP: Shifting SYNCED -> DONOR/DESYNCED (TO: 28)
120111 13:52:08 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
120111 13:52:08 [Note] WSREP: Running: 'wsrep_sst_backup --role 'donor' --address '' --auth 'sst:123abc' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' --gtid '857b217e-3a8d-11e1-0800-5eb8ec0b0a53:28''
120111 13:52:08 [Note] WSREP: 1 (garb): State transfer from 3 (10.0.211.79) complete.
120111 13:52:08 [Note] WSREP: sst_donor_thread signaled with 0
120111 13:52:08 [Note] WSREP: declaring 0520ccde-5bb3-11e2-0800-5625fa869020 stable
120111 13:52:08 [Note] WSREP: declaring 688ee448-72f4-11e1-0800-71e26c0e9177 stable
120111 13:52:08 [Note] WSREP: (aea8d302-3bf5-11e1-0800-fba69358a1b5, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://10.0.211.81:3333
120111 13:52:08 [Note] WSREP: view(view_id(PRIM, 0520ccde-5bb3-11e2-0800-5625fa869020,108) memb {
0520ccde-5bb3-11e2-0800-5625fa869020,
688ee448-72f4-11e1-0800-71e26c0e9177,
aea8d302-3bf5-11e1-0800-fba69358a1b5,
} joined {
} left {
} partitioned {
0833def2-5bb3-11e2-0800-8cca055d837b,
})
120111 13:52:08 [Note] WSREP: forgetting 0833def2-5bb3-11e2-0800-8cca055d837b (tcp://10.0.211.81:3333)
120111 13:52:08 [Note] WSREP: (aea8d302-3bf5-11e1-0800-fba69358a1b5, 'tcp://0.0.0.0:4567') turning message relay requesting off
120111 13:52:08 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 2, memb_num = 3
120111 13:52:08 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.
120111 13:52:08 [Note] WSREP: STATE EXCHANGE: sent state msg: 0a082e2c-5bb3-11e2-0800-8f0715bdf090
120111 13:52:08 [Note] WSREP: STATE EXCHANGE: got state msg: 0a082e2c-5bb3-11e2-0800-8f0715bdf090 from 0 (10.0.211.81)
120111 13:52:08 [Note] WSREP: STATE EXCHANGE: got state msg: 0a082e2c-5bb3-11e2-0800-8f0715bdf090 from 1 (localhost.localdomain)
120111 13:52:08 [Note] WSREP: STATE EXCHANGE: got state msg: 0a082e2c-5bb3-11e2-0800-8f0715bdf090 from 2 (10.0.211.79)
120111 13:52:08 [Note] WSREP: Quorum results:
        version    = 2,
        component  = PRIMARY,
        conf_id    = 104,
        members    = 3/3 (joined/total),
        act_id     = 28,
        last_appl. = 0,
        protocols  = 0/4/2 (gcs/repl/appl),
        group UUID = 857b217e-3a8d-11e1-0800-5eb8ec0b0a53
120111 13:52:08 [Note] WSREP: Flow-control interval: [28, 28]
120111 13:52:08 [Note] WSREP: New cluster view: global state: 857b217e-3a8d-11e1-0800-5eb8ec0b0a53:28, view# 105: Primary, number of nodes: 3, my index: 2, protocol version 2
120111 13:52:08 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
120111 13:52:08 [Note] WSREP: Assign initial position for certification: 28, protocol version: 2
WSREP_SST: [ERROR] innobackupex /tmp 2> /var/lib/mysql//innobackup.backup.log | nc (20120111 13:52:08.781)
usage: nc [-46DdhklnrStUuvzC] [-i interval] [-p source_port]
          [-s source_ip_address] [-T ToS] [-w timeout] [-X proxy_version]
          [-x proxy_address[:port]] [hostname] [port[s]]
120111 13:52:14 [Note] WSREP: cleaning up 0833def2-5bb3-11e2-0800-8cca055d837b (tcp://10.0.211.81:3333)
WSREP_SST: [ERROR] innobackupex finished with error: 25. Check /var/lib/mysql//innobackup.backup.log (20120111 13:52:15.028)
120111 13:52:15 [ERROR] WSREP: Failed to read from: wsrep_sst_backup --role 'donor' --address '' --auth 'sst:123abc' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' --gtid '857b217e-3a8d-11e1-0800-5eb8ec0b0a53:28'
120111 13:52:15 [ERROR] WSREP: Process completed with error: wsrep_sst_backup --role 'donor' --address '' --auth 'sst:123abc' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' --gtid '857b217e-3a8d-11e1-0800-5eb8ec0b0a53:28': 22 (Invalid argument)
120111 13:52:15 [Warning] WSREP: Could not find peer: 0833def2-5bb3-11e2-0800-8cca055d837b
120111 13:52:15 [Warning] WSREP: 2 (10.0.211.79): State transfer to -1 (left the group) failed: -1 (Operation not permitted)


joiner's terminal was as follows:

[root@10 pushuaiye]# garbd -o gmcast.listen_addr=tcp://0.0.0.0:3333 -g my_wsrep_cluster -a gcomm://10.0.211.79:4567 --sst backup --donor 10.0.211.79
2013-01-11 14:11:37.379 INFO: Read config:
        daemon: 0
        address: gcomm://10.0.211.79:4567
        group: my_wsrep_cluster
        sst: backup
        donor: 10.0.211.79
        options: gmcast.listen_addr=tcp://0.0.0.0:3333; gcs.fc_limit=9999999; gcs.fc_factor=1.0; gcs.fc_master_slave=yes
        cfg:
        log:

2013-01-11 14:11:37.386 INFO: protonet asio version 0
2013-01-11 14:11:37.387 INFO: backend: asio
2013-01-11 14:11:37.396 INFO: GMCast version 0
2013-01-11 14:11:37.400 INFO: (c1d7fb8e-5bb5-11e2-0800-58929ca59ad6, 'tcp://0.0.0.0:3333') listening at tcp://0.0.0.0:3333
2013-01-11 14:11:37.400 INFO: (c1d7fb8e-5bb5-11e2-0800-58929ca59ad6, 'tcp://0.0.0.0:3333') multicast: , ttl: 1
2013-01-11 14:11:37.418 INFO: EVS version 0
2013-01-11 14:11:37.423 INFO: PC version 0
2013-01-11 14:11:37.423 INFO: gcomm: connecting to group 'my_wsrep_cluster', peer '10.0.211.79:4567'
2013-01-11 14:11:37.431 INFO: (c1d7fb8e-5bb5-11e2-0800-58929ca59ad6, 'tcp://0.0.0.0:3333') turning message relay requesting on, nonlive peers: tcp://10.0.211.78:4567 tcp://10.0.211.81:4567
2013-01-11 14:11:37.705 INFO: (c1d7fb8e-5bb5-11e2-0800-58929ca59ad6, 'tcp://0.0.0.0:3333') turning message relay requesting off
2013-01-11 14:11:37.705 INFO: (c1d7fb8e-5bb5-11e2-0800-58929ca59ad6, 'tcp://0.0.0.0:3333') cleaning up established 0x1f29f2f0 which is duplicate of 0x1f295e20
2013-01-11 14:11:38.902 INFO: declaring 0520ccde-5bb3-11e2-0800-5625fa869020 stable
2013-01-11 14:11:38.902 INFO: declaring 688ee448-72f4-11e1-0800-71e26c0e9177 stable
2013-01-11 14:11:38.902 INFO: declaring aea8d302-3bf5-11e1-0800-fba69358a1b5 stable
2013-01-11 14:11:39.906 INFO: view(view_id(PRIM, 0520ccde-5bb3-11e2-0800-5625fa869020,109) memb {
0520ccde-5bb3-11e2-0800-5625fa869020,
688ee448-72f4-11e1-0800-71e26c0e9177,
aea8d302-3bf5-11e1-0800-fba69358a1b5,
c1d7fb8e-5bb5-11e2-0800-58929ca59ad6,
} joined {
} left {
} partitioned {
})
2013-01-11 14:11:39.930 INFO: gcomm: connected
2013-01-11 14:11:39.930 INFO: Changing maximum packet size to 64500, resulting msg size: 32636
2013-01-11 14:11:39.930 INFO: Shifting CLOSED -> OPEN (TO: 0)
2013-01-11 14:11:39.930 INFO: Opened channel 'my_wsrep_cluster'
2013-01-11 14:11:39.931 INFO: New COMPONENT: primary = yes, bootstrap = no, my_idx = 3, memb_num = 4
2013-01-11 14:11:39.931 INFO: STATE EXCHANGE: Waiting for state UUID.
2013-01-11 14:11:39.931 INFO: STATE EXCHANGE: sent state msg: c35812c8-5bb5-11e2-0800-708ccd89c894
2013-01-11 14:11:39.931 INFO: STATE EXCHANGE: got state msg: c35812c8-5bb5-11e2-0800-708ccd89c894 from 0 (10.0.211.81)
2013-01-11 14:11:39.931 INFO: STATE EXCHANGE: got state msg: c35812c8-5bb5-11e2-0800-708ccd89c894 from 1 (localhost.localdomain)
2013-01-11 14:11:39.931 INFO: STATE EXCHANGE: got state msg: c35812c8-5bb5-11e2-0800-708ccd89c894 from 2 (10.0.211.79)
2013-01-11 14:11:39.933 INFO: STATE EXCHANGE: got state msg: c35812c8-5bb5-11e2-0800-708ccd89c894 from 3 (garb)
2013-01-11 14:11:39.933 INFO: Quorum results:
        version    = 2,
        component  = PRIMARY,
        conf_id    = 105,
        members    = 3/4 (joined/total),
        act_id     = 28,
        last_appl. = -1,
        protocols  = 0/4/2 (gcs/repl/appl),
        group UUID = 857b217e-3a8d-11e1-0800-5eb8ec0b0a53
2013-01-11 14:11:39.933 INFO: Flow-control interval: [9999999, 9999999]
2013-01-11 14:11:39.933 INFO: Shifting OPEN -> PRIMARY (TO: 28)
2013-01-11 14:11:39.933 INFO: Sending state transfer request: 'backup', size: 6
2013-01-11 14:11:39.934 INFO: Node 3 (garb) requested state transfer from '10.0.211.79'. Selected 2 (10.0.211.79)(SYNCED) as donor.
2013-01-11 14:11:39.934 INFO: Shifting PRIMARY -> JOINER (TO: 28)
2013-01-11 14:11:39.934 INFO: Closing send monitor...
2013-01-11 14:11:39.934 INFO: Closed send monitor.
2013-01-11 14:11:39.934 INFO: gcomm: terminating thread
2013-01-11 14:11:39.934 INFO: gcomm: joining thread
2013-01-11 14:11:39.935 INFO: gcomm: closing backend
2013-01-11 14:11:39.964 INFO: view(view_id(NON_PRIM, 0520ccde-5bb3-11e2-0800-5625fa869020,109) memb {
c1d7fb8e-5bb5-11e2-0800-58929ca59ad6,
} joined {
} left {
} partitioned {
0520ccde-5bb3-11e2-0800-5625fa869020,
688ee448-72f4-11e1-0800-71e26c0e9177,
aea8d302-3bf5-11e1-0800-fba69358a1b5,
})
2013-01-11 14:11:39.964 INFO: view((empty))
2013-01-11 14:11:39.964 INFO: gcomm: closed
2013-01-11 14:11:39.966 INFO: 3 (garb): State transfer from 2 (10.0.211.79) complete.
2013-01-11 14:11:39.966 INFO: Shifting JOINER -> JOINED (TO: 28)
2013-01-11 14:11:39.966 WARN: 0x1f26ee80 down context(s) not set
2013-01-11 14:11:39.966 WARN: Failed to send SYNC signal: -107 (Transport endpoint is not connected)
2013-01-11 14:11:39.966 INFO: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
2013-01-11 14:11:39.966 INFO: Flow-control interval: [9999999, 9999999]
2013-01-11 14:11:39.966 INFO: Received NON-PRIMARY.
2013-01-11 14:11:39.967 INFO: Shifting JOINED -> OPEN (TO: 28)
2013-01-11 14:11:39.967 INFO: Received self-leave message.
2013-01-11 14:11:39.967 INFO: Flow-control interval: [9999999, 9999999]
2013-01-11 14:11:39.967 INFO: Received SELF-LEAVE. Closing connection.
2013-01-11 14:11:39.967 INFO: Shifting OPEN -> CLOSED (TO: 28)
2013-01-11 14:11:39.967 INFO: RECV thread exiting 0: Success
2013-01-11 14:11:39.969 INFO: recv_thread() joined.
2013-01-11 14:11:39.970 INFO: Closing slave action queue.
2013-01-11 14:11:39.970 WARN: Attempt to close a closed connection
2013-01-11 14:11:39.970 INFO: Exiting main loop
2013-01-11 14:11:39.971 INFO: Shifting CLOSED -> DESTROYED (TO: 28)

Alex Yurchenko

Jan 11, 2013, 2:21:25 AM
to codersh...@googlegroups.com
Try

--sst 'backup|1.2.3.4:5678'

remember to escape the '|'
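
To see what the escaping amounts to, here is a small sketch that builds the command line with Python's standard `shlex.quote`, which wraps any argument containing shell-special characters such as `|` in single quotes. The helper name `quoted_command` is hypothetical and the address is the placeholder from the suggestion above; this only shows shell quoting and says nothing about how garbd or the donor interprets the value.

```python
import shlex


def quoted_command(argv):
    """Join arguments into a shell-safe command line, quoting only the
    arguments that contain characters the shell would interpret."""
    return " ".join(shlex.quote(arg) for arg in argv)


if __name__ == "__main__":
    # Without quoting, the shell would treat '|' as a pipe.
    print(quoted_command(["garbd", "--sst", "backup|1.2.3.4:5678"]))
    # prints: garbd --sst 'backup|1.2.3.4:5678'
```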

pu21...@hotmail.com

Jan 11, 2013, 3:55:07 AM
to codership
Hi Alex,

The problem remains when I issue this command on the joiner
machine (10.0.211.81):

garbd -o gmcast.listen_addr=tcp://0.0.0.0:3333 -g my_wsrep_cluster -a gcomm://10.0.211.79:4567 --sst 'backup|10.0.211.79:4567' --donor 10.0.211.79

Many thanks!

120111 16:46:40 [Note] WSREP: Flow-control interval: [32, 32]
120111 16:46:40 [Note] WSREP: New cluster view: global state: 857b217e-3a8d-11e1-0800-5eb8ec0b0a53:28, view# 114: Primary, number of nodes: 4, my index: 3, protocol version 2
120111 16:46:40 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
120111 16:46:40 [Note] WSREP: Assign initial position for certification: 28, protocol version: 2
120111 16:46:40 [Note] WSREP: Node 2 (garb) requested state transfer from '10.0.211.79'. Selected 3 (10.0.211.79)(SYNCED) as donor.
120111 16:46:40 [Note] WSREP: Shifting SYNCED -> DONOR/DESYNCED (TO: 28)
120111 16:46:40 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
120111 16:46:40 [Note] WSREP: Running: 'wsrep_sst_backup|10.0.211.79 --role 'donor' --address '4567' --auth 'sst:123abc' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' --gtid '857b217e-3a8d-11e1-0800-5eb8ec0b0a53:28''
120111 16:46:40 [Note] WSREP: 2 (garb): State transfer from 3 (10.0.211.79) complete.
120111 16:46:40 [Note] WSREP: declaring 0520ccde-5bb3-11e2-0800-5625fa869020 stable
120111 16:46:40 [Note] WSREP: declaring 688ee448-72f4-11e1-0800-71e26c0e9177 stable
120111 16:46:40 [Note] WSREP: view(view_id(PRIM, 0520ccde-5bb3-11e2-0800-5625fa869020,118) memb {
0520ccde-5bb3-11e2-0800-5625fa869020,
688ee448-72f4-11e1-0800-71e26c0e9177,
aea8d302-3bf5-11e1-0800-fba69358a1b5,
} joined {
} left {
} partitioned {
6aaaf5da-5bcb-11e2-0800-cc9afbb01c0a,
})
120111 16:46:40 [Note] WSREP: forgetting 6aaaf5da-5bcb-11e2-0800-cc9afbb01c0a (tcp://10.0.211.81:3333)
120111 16:46:40 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 2, memb_num = 3
120111 16:46:40 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.
120111 16:46:40 [Note] WSREP: STATE EXCHANGE: sent state msg: 6be243e0-5bcb-11e2-0800-5b63016b03fe
120111 16:46:40 [Note] WSREP: STATE EXCHANGE: got state msg: 6be243e0-5bcb-11e2-0800-5b63016b03fe from 0 (10.0.211.81)
120111 16:46:40 [Note] WSREP: STATE EXCHANGE: got state msg: 6be243e0-5bcb-11e2-0800-5b63016b03fe from 1 (localhost.localdomain)
120111 16:46:40 [Note] WSREP: STATE EXCHANGE: got state msg: 6be243e0-5bcb-11e2-0800-5b63016b03fe from 2 (10.0.211.79)
120111 16:46:40 [Note] WSREP: Quorum results:
        version    = 2,
component = PRIMARY,
conf_id = 114,
members = 3/3 (joined/total),
act_id = 28,
last_appl. = 0,
protocols = 0/4/2 (gcs/repl/appl),
group UUID = 857b217e-3a8d-11e1-0800-5eb8ec0b0a53
120111 16:46:40 [Note] WSREP: Flow-control interval: [28, 28]
/usr//bin/wsrep_sst_backup: line 79: WSREP_SST_OPT_DATA: unbound
variable
sh: 10.0.211.79: command not found
120111 16:46:40 [Note] WSREP: sst_donor_thread signaled with 0
120111 16:46:40 [Note] WSREP: New cluster view: global state:
857b217e-3a8d-11e1-0800-5eb8ec0b0a53:28, view# 115: Primary, number of
nodes: 3, my index: 2, protocol version 2
120111 16:46:40 [Note] WSREP: wsrep_notify_cmd is not defined,
skipping notification.
120111 16:46:40 [Note] WSREP: Assign initial position for
certification: 28, protocol version: 2
120111 16:46:40 [ERROR] WSREP: Failed to read from: wsrep_sst_backup|
10.0.211.79 --role 'donor' --address '4567' --auth 'sst:123abc' --
socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/' --
defaults-file '/etc/my.cnf' --gtid
'857b217e-3a8d-11e1-0800-5eb8ec0b0a53:28'
120111 16:46:40 [ERROR] WSREP: Process completed with error:
wsrep_sst_backup|10.0.211.79 --role 'donor' --address '4567' --auth
'sst:123abc' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/
mysql/' --defaults-file '/etc/my.cnf' --gtid
'857b217e-3a8d-11e1-0800-5eb8ec0b0a53:28': 2 (No such file or
directory)
120111 16:46:40 [Warning] WSREP: Could not find peer:
6aaaf5da-5bcb-11e2-0800-cc9afbb01c0a
120111 16:46:40 [Warning] WSREP: 2 (10.0.211.79): State transfer to -1
(left the group) failed: -1 (Operation not permitted)
120111 16:46:40 [Note] WSREP: Shifting DONOR/DESYNCED -> JOINED (TO:
28)
120111 16:46:40 [Note] WSREP: Member 2 (10.0.211.79) synced with
group.
120111 16:46:40 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 28)
120111 16:46:40 [Note] WSREP: Synchronized with group, ready for
connections
120111 16:46:40 [Note] WSREP: wsrep_notify_cmd is not defined,
skipping notification.
120111 16:46:45 [Note] WSREP: cleaning up 6aaaf5da-5bcb-11e2-0800-
cc9afbb01c0a (tcp://10.0.211.81:3333)

On Jan 11, 3:21 PM, Alex Yurchenko <alexey.yurche...@codership.com>
wrote:
> Try
>
> --sst 'backup|1.2.3.4:5678'
>
> remember to escape the '|'

pu21...@hotmail.com

Jan 11, 2013, 3:58:51 AM
to codership
Hi Alex,

The problem remains when I issue this command on the joiner
machine (10.0.211.81):
garbd -o gmcast.listen_addr=tcp://0.0.0.0:3333 -g my_wsrep_cluster -a
gcomm://10.0.211.79:4567 --sst 'backup|10.0.211.79:4567' --donor
10.0.211.79




Alex Yurchenko

Jan 11, 2013, 5:33:43 AM
to codersh...@googlegroups.com
On 2013-01-11 10:58, pu21...@hotmail.com wrote:
> hi Alex
>
> problem remains when I issue the command on the joiner
> machine(10.0.211.81)
> garbd -o gmcast.listen_addr=tcp://0.0.0.0:3333 -g my_wsrep_cluster -a
> gcomm://10.0.211.79:4567 --sst 'backup|10.0.211.79:4567' --donor
> 10.0.211.79
>
>
>

Oh, sorry, it must be a colon instead of a pipe:

--sst 'backup:10.0.211.79:4567'

But then what are you trying to achieve? Is another nc listening for
backup at 10.0.211.79:4567?
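
For reference, the donor log above suggests why the pipe form failed: the request string appears to be cut at the first colon, so 'backup|10.0.211.79:4567' produced the script name wsrep_sst_backup|10.0.211.79 and the address 4567, and sh then tried to run 10.0.211.79 as a command. A minimal sketch of that split in plain shell follows. The function name split_sst_request is made up for illustration, and the split rule is only inferred from the log, not taken from Galera's source:

```shell
#!/bin/sh
# Hypothetical helper: split an SST request string at the FIRST colon,
# which is the behaviour the donor log above suggests (an assumption).
split_sst_request() {
    req=$1
    method=${req%%:*}    # everything before the first ':'
    address=${req#*:}    # everything after the first ':'
    printf 'script=wsrep_sst_%s address=%s\n' "$method" "$address"
}

# Pipe form: the host ends up inside the script name.
split_sst_request 'backup|10.0.211.79:4567'
# Colon form: the method is just "backup" and the address keeps host:port.
split_sst_request 'backup:10.0.211.79:4567'
```

With the colon form the method stays a plain word, so the resulting script name no longer contains a shell metacharacter.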

nikoinlove

Jan 27, 2013, 5:47:20 AM
to codersh...@googlegroups.com
Good day. I've tried to read the whole topic, but I still have questions. Let's look at an example.

I have a Galera cluster of 3 nodes. All have log-slave-updates enabled, and the binary log format is ROW. Is this enough to set up asynchronous replication?
Questions then:
1) Which node should I use as a master? Can I replicate from a proxy (HAProxy/glb) in front of my Galera cluster?
2) What if one node fails? Do all nodes have the same binlog positions? Can I continue to replicate from another node from the same binlog position?

Thanks in advance.

Henrik Ingo

Jan 27, 2013, 9:32:09 AM
to nikoinlove, codership
Hi Niko

On Sun, Jan 27, 2013 at 12:47 PM, nikoinlove <nikito...@gmail.com> wrote:
> I have a galera cluster of 3 nodes. All have log-slave-updates, binary log
> format is row. Is this enough to setup an asynchronous replication?

Yes. (I assume binary log is also being written to file.)
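
One way to double-check those prerequisites is to look at the option file on the node you plan to replicate from. The sketch below is only an illustration: check_binlog_cnf is a made-up helper, it inspects a single file (no includes, no command-line options), and it accepts both the '-' and '_' spellings of the option names.

```shell
#!/bin/sh
# Hypothetical helper: report whether a my.cnf-style file enables the
# settings discussed above. Prints one line per missing setting and
# returns non-zero if anything is missing.
check_binlog_cnf() {
    cnf=$1
    rc=0
    # Binary log written to file (log-bin, with or without a base name).
    grep -Eq '^[[:space:]]*log[-_]bin[[:space:]]*(=|$)' "$cnf" \
        || { echo "missing: log-bin"; rc=1; }
    # Replicated (Galera) writesets are also written to the binlog.
    grep -Eq '^[[:space:]]*log[-_]slave[-_]updates' "$cnf" \
        || { echo "missing: log-slave-updates"; rc=1; }
    # Row-based binary log format.
    grep -Eiq '^[[:space:]]*binlog[-_]format[[:space:]]*=[[:space:]]*row' "$cnf" \
        || { echo "missing: binlog_format=ROW"; rc=1; }
    return $rc
}
```

For example, `check_binlog_cnf /etc/my.cnf` prints nothing and returns 0 when all three settings are present in that file.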


> Questions then:
> 1) What node should I use as a master?

Whichever you want. Just pick one.

> Can I replicate from a
> proxy(haproxy/glb) above my galera cluster?

No. The slave has to stay on the same node all the time, because
binlog positions differ between nodes.

> 2) What if one node fails?

The best solution is to try to recover the node and get it back into
the galera cluster. In this case the asynchronous replication can
simply continue.

If it is not possible to get the failed node back, you have two options:


1) Use the CHANGE MASTER TO command to start replicating from another
node. However, it is not easy to figure out which binlog position to
continue from, because positions are not the same on each node. If you
can figure out the correct position, you can just continue from there:
the binlog will contain all the right transactions after that position.

In other words, the binlogs on all galera nodes will contain the same
sequence of transactions, but will not have the same binlog names or
positions.

MySQL 5.6 introduces global transaction IDs, which make it much easier
to find the right position in the other binlog.

2) Set up replication from scratch: choose another node, take a backup
from it, record the binlog position, provision the slave, start replication.
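
For option 2, mysqldump can record the binlog position for you: with --master-data=2 it writes the coordinates into the dump as a commented-out CHANGE MASTER TO line. Here is a minimal sketch, assuming that header format. The helper name coords_from_dump, the host, and the file names are placeholders, and the function only prints SQL without connecting to any server:

```shell
#!/bin/sh
# Hypothetical helper: pull the binlog file and position out of a dump taken
# with `mysqldump --master-data=2` and print a CHANGE MASTER TO statement
# for the slave. Arguments: <dump file> <new master host>.
coords_from_dump() {
    dump=$1
    master_host=$2
    # The dump header contains a line such as:
    # -- CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000003', MASTER_LOG_POS=107;
    line=$(grep '^-- CHANGE MASTER TO MASTER_LOG_FILE=' "$dump" | head -n 1)
    [ -n "$line" ] || return 1
    file=$(printf '%s\n' "$line" | sed -n "s/.*MASTER_LOG_FILE='\([^']*\)'.*/\1/p")
    pos=$(printf '%s\n' "$line" | sed -n 's/.*MASTER_LOG_POS=\([0-9]*\).*/\1/p')
    printf "CHANGE MASTER TO MASTER_HOST='%s', MASTER_LOG_FILE='%s', MASTER_LOG_POS=%s;\n" \
        "$master_host" "$file" "$pos"
}
```

A possible workflow: run `mysqldump --single-transaction --master-data=2 --all-databases > dump.sql` on the chosen node, load the dump on the slave, then run the statement printed by `coords_from_dump dump.sql 10.0.211.79` (adding MASTER_USER and MASTER_PASSWORD) followed by START SLAVE.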


> Do all nodes have the same binlog positions?

No.


> Can
> I continue to replicate from another node from the same binlog position?

Yes, you just have to find the position first.

>
> Thanks in advance.

You're welcome. Good questions.

henrik




--
henri...@avoinelama.fi

nikoinlove

Jan 28, 2013, 9:11:20 AM
to codersh...@googlegroups.com, nikoinlove, henri...@avoinelama.fi
Seems clear :) Another small question:

> In other words, the binlogs on all galera nodes will contain the same
> sequence of transactions, but will not have the same binlog names or
> positions.

Does "the same sequence" mean that the transactions in the log are in the same order on all nodes?
If yes, why would nodes ever have different positions in the binlog?

Ilias Bertsimas

Jan 28, 2013, 9:20:07 AM
to codersh...@googlegroups.com, nikoinlove, henri...@avoinelama.fi
Hi Niko,

I believe that is because each node acts as an independent MySQL server as far as the binlogs are concerned, and it may rotate the binlog under different local circumstances. So even if you manage to start the nodes at exactly the same time and keep them in sync, they will eventually diverge and log the same transactions in different binlog files and at different positions. Matching positions across nodes should become possible with MySQL 5.6, where global transaction IDs will be used in replication.

Kind Regards,
Ilias.

nikoinlove

Jan 28, 2013, 11:03:47 AM
to codersh...@googlegroups.com, nikoinlove, henri...@avoinelama.fi
So the conclusion is: in 5.5 I can't switch a slave to another master from the Galera cluster, because I will never know the corresponding binlog position on the other node, right?

Ilias Bertsimas

Jan 28, 2013, 11:14:30 AM
to codersh...@googlegroups.com, nikoinlove, henri...@avoinelama.fi
Yep, that seems to be the case. The only way to switch is with a fresh snapshot from the new master, as Henrik already mentioned.