How to back up a Galera cluster through garbd? SST fails


Merlin Morgenstern

Sep 1, 2015, 6:47:50 AM
to Philip Stoev, codersh...@googlegroups.com, Josten Landtroop

Thank you everybody for your help on getting me started on MySQL Galera.

I have a cluster of 3 nodes that is perfectly operational. Now I would like to adapt my backup procedure to use SST, following this example: http://galeracluster.com/documentation-webpages/backingupthecluster.html

The command I am using:

node1:~$ sudo /usr/bin/garbd --address gcomm://10.0.0.120:3307?gmcast.listen_addr=tcp://0.0.0.0:4444 --group example_cluster --donor MyNode1 --sst backup

10.0.0.120 is the local address of node1. I also tried 10.0.0.10:3306, which is the VIP of the cluster through HAProxy.

Both fail with the error:

FATAL: Exception in creating receive loop: Failed to open connection to group: 110 (Connection timed out) at garb/garb_gcs.cpp:Gcs():35

Leaving me with the following questions:

  1. How do I specify the backup file that I can transfer to an FTP server for potential recovery of the cluster?

  2. Why does the connection fail? Do I need to configure the cluster for backup access?

Thank you in advance for any help.

I am attaching the detailed output after issuing the backup command:

2015-09-01 12:29:56.240  INFO: CRC-32C: using hardware acceleration.
2015-09-01 12:29:56.241  INFO: Read config: 
    daemon:  0
    name:    garb
    address: gcomm://10.0.0.120:3307?gmcast.listen_addr=tcp://0.0.0.0:4444
    group:   example_cluster
    sst:     backup
    donor:   MyNode1
    options: gcs.fc_limit=9999999; gcs.fc_factor=1.0; gcs.fc_master_slave=yes
    cfg:     
    log:     

2015-09-01 12:29:56.245  INFO: protonet asio version 0
2015-09-01 12:29:56.245  INFO: Using CRC-32C for message checksums.
2015-09-01 12:29:56.246  INFO: backend: asio
2015-09-01 12:29:56.246  WARN: access file(./gvwstate.dat) failed(No such file or directory)
2015-09-01 12:29:56.247  INFO: restore pc from disk failed
2015-09-01 12:29:56.248  INFO: GMCast version 0
2015-09-01 12:29:56.249  INFO: (63af44e0, 'tcp://0.0.0.0:4444') listening at tcp://0.0.0.0:4444
2015-09-01 12:29:56.249  INFO: (63af44e0, 'tcp://0.0.0.0:4444') multicast: , ttl: 1
2015-09-01 12:29:56.250  INFO: EVS version 0
2015-09-01 12:29:56.250  INFO: gcomm: connecting to group 'example_cluster', peer '10.0.0.120:3307'
2015-09-01 12:29:59.255  WARN: no nodes coming from prim view, prim not possible
2015-09-01 12:29:59.256  INFO: view(view_id(NON_PRIM,63af44e0,1) memb {
    63af44e0,0
} joined {
} left {
} partitioned {
})
2015-09-01 12:29:59.759  WARN: last inactive check more than PT1.5S ago (PT3.50916S), skipping check
2015-09-01 12:30:29.287  INFO: view((empty))
2015-09-01 12:30:29.288 ERROR: failed to open gcomm backend connection: 110: failed to reach primary view: 110 (Connection timed out)
     at gcomm/src/pc.cpp:connect():162
2015-09-01 12:30:29.289 ERROR: gcs/src/gcs_core.cpp:gcs_core_open():206: Failed to open backend connection: -110 (Connection timed out)
2015-09-01 12:30:29.290 ERROR: gcs/src/gcs.cpp:gcs_open():1379: Failed to open channel 'example_cluster' at 'gcomm://10.0.0.120:3307?gmcast.listen_addr=tcp://0.0.0.0:4444': -110 (Connection timed out)
2015-09-01 12:30:29.290 FATAL: Exception in creating receive loop: Failed to open connection to group: 110 (Connection timed out)
     at garb/garb_gcs.cpp:Gcs():35

Philip Stoev

Sep 1, 2015, 7:46:42 AM
to Merlin Morgenstern, Josten Landtroop, codersh...@googlegroups.com
Hello,

Galera does not ship with a script that performs backups using garbd. That
script needs to be created for the specific installation, using one of the
existing wsrep_sst_* scripts as a template.

You may find it easier to do the following:
1. instead of garbd, start a regular mysqld just for backup purposes
2. wait for mysqld to sync with the cluster via SST
3. shut down the server
4. the data directory of the server is now consistent and can be copied over
and archived.
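The four steps above could be sketched as a shell script. This is only a dry-run sketch: the datadir, backup path, and cluster address are assumptions for this installation, and the run() wrapper only echoes each command, so nothing executes until the echo is removed.

```shell
#!/bin/sh
# Dry-run sketch of the backup-node procedure described above.
# run() only prints the command; drop the echo to actually execute it.
run() { echo "+ $*"; }

DATADIR=/var/lib/mysql            # assumption: datadir of the throwaway node
BACKUP=/data/backup/$(date +%F)   # assumption: where the archive should land

# 1. Start a regular mysqld just for backup purposes (it joins the cluster).
run mysqld --wsrep_cluster_address=gcomm://10.0.0.120:4567

# 2. Wait until the node reports Synced (i.e. SST/IST has finished).
run mysql -e "SHOW STATUS LIKE 'wsrep_local_state_comment'"

# 3. Shut the server down cleanly.
run mysqladmin shutdown

# 4. The data directory is now consistent; archive it for transfer.
run tar czf "$BACKUP.tar.gz" -C "$DATADIR" .
```

The resulting tarball is the "backup file" from question 1 and can be shipped to an FTP server like any other file.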

Traditional backups via mysqldump and xtrabackup using an existing node will
also work.

Philip Stoev

Merlin Morgenstern

Sep 2, 2015, 9:45:02 AM
to Philip Stoev, Josten Landtroop, codersh...@googlegroups.com
Thank you, Philip, for the clarification.

I am now trying with XtraBackup. During setup I found that I had imported some MyISAM tables into Galera by mistake, and XtraBackup told me that ibdata1 is corrupt. I removed those tables and also the InnoDB log files together with the binary logs, and restarted all nodes after that.

Unfortunately, XtraBackup keeps telling me that my ibdata1 file is corrupt (though on a different page now!). I double-checked all tables and they are perfectly fine.

This is the output of XtraBackup:
xtrabackup version 2.2.12 based on MySQL server 5.6.24 Linux (x86_64) (revision id: 8726828)
xtrabackup: uses posix_fadvise().
xtrabackup: cd to /data/mysql/data
xtrabackup: open files limit requested 0, set to 1024
xtrabackup: using the following InnoDB configuration:
xtrabackup:   innodb_data_home_dir = ./
xtrabackup:   innodb_data_file_path = ibdata1:12M:autoextend
xtrabackup:   innodb_log_group_home_dir = ./
xtrabackup:   innodb_log_files_in_group = 2
xtrabackup:   innodb_log_file_size = 50331648
>> log scanned up to (22054624442)
xtrabackup: Generating a list of tablespaces
[01] Copying ./ibdata1 to /data/backup/2015-09-02_15-23-50/ibdata1
[01] xtrabackup: Database page corruption detected at page 1320, retrying...
I am using this command:

sudo innobackupex --user=bkpuser --password=test --databases="test" /data/backup

Am I missing something, or is InnoDB really corrupt? I cannot see any indication of that other than what xtrabackup tells me.

Thank you in advance for any help.
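One way to cross-check the page independently of xtrabackup might be innochecksum (the page number and path come from the xtrabackup output above; the exact innochecksum flags are an assumption for MySQL 5.6, and the run() wrapper only echoes, so this is a dry run):

```shell
#!/bin/sh
# Dry-run sketch: verify page 1320 of ibdata1 without going through xtrabackup.
run() { echo "+ $*"; }

# innochecksum needs the server stopped while it reads the file.
run mysqladmin shutdown
run innochecksum -p 1320 /data/mysql/data/ibdata1

# With the server running, CHECK TABLE inspects tables instead
# (test.some_table is a placeholder, not a table from this thread).
run mysql -e "CHECK TABLE test.some_table"
```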



Philip Stoev

Sep 2, 2015, 9:55:43 AM
to Merlin Morgenstern, Josten Landtroop, codersh...@googlegroups.com
Hello,

Unfortunately I cannot help with the error you are seeing; this seems to be
an InnoDB or xtrabackup problem rather than a Galera one. You may wish to
contact the mailing list that the maintainers of xtrabackup have provided:

https://groups.google.com/forum/#!forum/percona-discussion

Thank you.

Yin Xuesong

Sep 1, 2016, 1:45:56 AM
to codership, philip...@galeracluster.com, jos...@no-io.net
I think you should use port 4567, like this:
node1:~$ sudo /usr/bin/garbd --address gcomm://10.0.0.120:4567,10.0.0.121:4567,10.0.0.121:4567?gmcast.listen_addr=tcp://0.0.0.0:4444 --group example_cluster --donor MyNode1 --sst backup
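One more detail: the `?` in the gcomm URL is a shell glob character, so the whole address should be quoted or the shell may mangle it before garbd sees it. A sketch with the addresses from this thread:

```shell
#!/bin/sh
# Quote the gcomm URL so '?' is not treated as a shell glob.
ADDR='gcomm://10.0.0.120:4567,10.0.0.121:4567?gmcast.listen_addr=tcp://0.0.0.0:4444'
echo "$ADDR"

# The garbd invocation would then be (commented out here):
# garbd --address "$ADDR" --group example_cluster --donor MyNode1 --sst backup
```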

On Tuesday, September 1, 2015 at 6:47:50 PM UTC+8, Merlin Morgenstern wrote:

hunter86bg

Sep 2, 2016, 2:34:37 AM
to codership, philip...@galeracluster.com, jos...@no-io.net
Could you share the versions of your Galera Cluster, socat, netcat, and rsync packages?
They should be the same on all nodes (including the arbitrator).

hunter86bg

Sep 2, 2016, 2:40:21 AM
to codership, philip...@galeracluster.com, jos...@no-io.net
Also, I would be happy if you could give me feedback on the following: Galera Cluster backup via SST (rsync).
I haven't tested it with recent versions, but it might work; check the comments as well.

James Wang

Sep 5, 2016, 10:37:21 AM
to codership, philip...@galeracluster.com, jos...@no-io.net
try:  set global wsrep_desync=ON;
backup ....
set global wsrep_desync=OFF;
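Spelled out, that desync-then-backup sequence might look like this (a dry-run sketch: the user, password, and backup path are placeholders, and run() only echoes the commands):

```shell
#!/bin/sh
# Dry-run sketch: desync one node, back it up, resync it.
run() { echo "+ $*"; }

# Take the node out of flow control so the backup does not stall the cluster.
run mysql -e "SET GLOBAL wsrep_desync=ON"

# Any backup method works here; innobackupex as used earlier in the thread.
run innobackupex --user=bkpuser --password=secret /data/backup

# Put the node back in sync once the backup is done.
run mysql -e "SET GLOBAL wsrep_desync=OFF"
```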