Garbd SST backup problem

312 views
Skip to first unread message

Søren Dam

unread,
Feb 21, 2017, 11:12:43 AM2/21/17
to codership
Hi All, i've been troubling with Garbd backup, i've some trouble making it work.

i'm running this command:

./usr/bin/garbd -a gcomm://galera1:4567?gmcast.listen_addr=tcp://0.0.0.0:4444 --group galera --donor galera1 --sst xtrabackup


It's pretty clear to me that it's the socat pipe from donor tio garbd host tht's not working, but i can't see why..

Everything else is working very fine.. 

and have the following console output

017-02-21 13:26:54.162  INFO: CRC-32C: using hardware acceleration.
2017-02-21 13:26:54.162  INFO: Read config: 
daemon:  0
name:    garb
address: gcomm://dgalera-sql-001:4567?gmcast.listen_addr=tcp://0.0.0.0:4444
group:   galera
sst:     xtrabackup
donor:   galera1
options: gcs.fc_limit=9999999; gcs.fc_factor=1.0; gcs.fc_master_slave=yes
cfg:     
log:     
 
2017-02-21 13:26:54.165  INFO: protonet asio version 0
2017-02-21 13:26:54.165  INFO: Using CRC-32C for message checksums.
2017-02-21 13:26:54.166  INFO: backend: asio
2017-02-21 13:26:54.166  INFO: gcomm thread scheduling priority set to other:0 
2017-02-21 13:26:54.166  WARN: access file(./gvwstate.dat) failed(No such file or directory)
2017-02-21 13:26:54.166  INFO: restore pc from disk failed
2017-02-21 13:26:54.168  INFO: GMCast version 0
2017-02-21 13:26:54.170  INFO: (691c4d47, 'tcp://0.0.0.0:4444') listening at tcp://0.0.0.0:4444
2017-02-21 13:26:54.170  INFO: (691c4d47, 'tcp://0.0.0.0:4444') multicast: , ttl: 1
2017-02-21 13:26:54.171  INFO: EVS version 0
2017-02-21 13:26:54.172  INFO: gcomm: connecting to group 'galera', peer 'galera1:4567'
2017-02-21 13:26:54.293  INFO: (691c4d47, 'tcp://0.0.0.0:4444') connection established to b2113ed7 tcp://8.8.8.8:4567
2017-02-21 13:26:54.293  INFO: (691c4d47, 'tcp://0.0.0.0:4444') turning message relay requesting on, nonlive peers: tcp://8.8.8.8:4567 tcp://8.8.8.9:4567 tcp://8.8.8.7:4567 
2017-02-21 13:26:54.362  INFO: declaring 69b5a251 at tcp://8.8.8.7:4567 stable
2017-02-21 13:26:54.363  INFO: declaring a3b7a69d at tcp://8.8.8.8:4567 stable
2017-02-21 13:26:54.363  INFO: declaring b2113ed7 at tcp://10.3.71.150:4567 stable
2017-02-21 13:26:54.363  INFO: declaring e2bbd2bb at tcp://8.8.8.9:4567 stable
2017-02-21 13:26:54.366  INFO: Node 69b5a251 state prim
2017-02-21 13:26:54.395  INFO: view(view_id(PRIM,691c4d47,19) memb {
691c4d47,0
69b5a251,0
a3b7a69d,1
b2113ed7,1
e2bbd2bb,1
} joined {
} left {
} partitioned {
})
2017-02-21 13:26:54.395  INFO: save pc into disk
2017-02-21 13:26:54.486  INFO: (691c4d47, 'tcp://0.0.0.0:4444') connection established to 69b5a251 tcp://8.8.8.7:4567
2017-02-21 13:26:54.490  INFO: (691c4d47, 'tcp://0.0.0.0:4444') connection established to e2bbd2bb tcp://8.8.8.9:4567
2017-02-21 13:26:54.527  INFO: (691c4d47, 'tcp://0.0.0.0:4444') connection established to a3b7a69d tcp://8.8.8.8:4567
2017-02-21 13:26:54.527  INFO: (691c4d47, 'tcp://0.0.0.0:4444') connection established to a3b7a69d tcp://8.8.8.8:4567
2017-02-21 13:26:54.673  INFO: gcomm: connected
2017-02-21 13:26:54.673  INFO: Changing maximum packet size to 64500, resulting msg size: 32636
2017-02-21 13:26:54.674  INFO: Shifting CLOSED -> OPEN (TO: 0)
2017-02-21 13:26:54.674  INFO: Opened channel 'galera'
2017-02-21 13:26:54.674  INFO: New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 5
2017-02-21 13:26:54.675  INFO: STATE_EXCHANGE: sent state UUID: 6969c80d-f839-11e6-8c64-6b225d74d978
2017-02-21 13:26:54.677  INFO: STATE EXCHANGE: sent state msg: 6969c80d-f839-11e6-8c64-6b225d74d978
2017-02-21 13:26:54.678  INFO: STATE EXCHANGE: got state msg: 6969c80d-f839-11e6-8c64-6b225d74d978 from 1 (garb)
2017-02-21 13:26:54.678  INFO: STATE EXCHANGE: got state msg: 6969c80d-f839-11e6-8c64-6b225d74d978 from 2 (dkcph-t-galera-sql-002)
2017-02-21 13:26:54.678  INFO: STATE EXCHANGE: got state msg: 6969c80d-f839-11e6-8c64-6b225d74d978 from 3 (dkcph-t-galera-sql-001)
2017-02-21 13:26:54.678  INFO: STATE EXCHANGE: got state msg: 6969c80d-f839-11e6-8c64-6b225d74d978 from 4 (dkcph-t-galera-sql-003)
2017-02-21 13:26:54.679  INFO: STATE EXCHANGE: got state msg: 6969c80d-f839-11e6-8c64-6b225d74d978 from 0 (garb)
2017-02-21 13:26:54.679  INFO: Quorum results:
version    = 4,
component  = PRIMARY,
conf_id    = 18,
members    = 4/5 (joined/total),
act_id     = 75895172,
last_appl. = -1,
protocols  = 0/7/3 (gcs/repl/appl),
group UUID = 14d8aedd-e93d-11e6-816c-525ae11f96b7
2017-02-21 13:26:54.679  INFO: Flow-control interval: [9999999, 9999999]
2017-02-21 13:26:54.679  INFO: Shifting OPEN -> PRIMARY (TO: 75895172)
2017-02-21 13:26:54.679  INFO: Sending state transfer request: 'xtrabackup', size: 10
2017-02-21 13:26:54.680  INFO: Member 0.0 (garb) requested state transfer from 'dkcph-t-galera-sql-001'. Selected 3.1 (dkcph-t-galera-sql-001)(SYNCED) as donor.
2017-02-21 13:26:54.680  INFO: Shifting PRIMARY -> JOINER (TO: 75895172)
2017-02-21 13:26:54.680  INFO: Closing send monitor...
2017-02-21 13:26:54.680  INFO: Closed send monitor.
2017-02-21 13:26:54.680  INFO: gcomm: terminating thread
2017-02-21 13:26:54.680  INFO: gcomm: joining thread
2017-02-21 13:26:54.680  INFO: gcomm: closing backend
2017-02-21 13:26:54.685  INFO: 0.0 (garb): State transfer from 3.1 (dkcph-t-galera-sql-001) complete.
2017-02-21 13:26:54.685  INFO: Shifting JOINER -> JOINED (TO: 75895172)
2017-02-21 13:26:54.685  INFO: view(view_id(NON_PRIM,691c4d47,19) memb {
691c4d47,0
} joined {
} left {
} partitioned {
69b5a251,0
a3b7a69d,1
b2113ed7,1
e2bbd2bb,1
})
2017-02-21 13:26:54.686  INFO: view((empty))
2017-02-21 13:26:54.686  INFO: gcomm: closed
2017-02-21 13:26:54.686  WARN: 0x2638098 down context(s) not set
2017-02-21 13:26:54.686  WARN: Failed to send SYNC signal: -107 (Transport endpoint is not connected)
2017-02-21 13:26:54.686  INFO: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
2017-02-21 13:26:54.686  INFO: Flow-control interval: [9999999, 9999999]
2017-02-21 13:26:54.686  INFO: Received NON-PRIMARY.
2017-02-21 13:26:54.686  INFO: Shifting JOINED -> OPEN (TO: 75895172)
2017-02-21 13:26:54.686  INFO: Received self-leave message.
2017-02-21 13:26:54.686  INFO: Flow-control interval: [9999999, 9999999]
2017-02-21 13:26:54.687  INFO: Received SELF-LEAVE. Closing connection.
2017-02-21 13:26:54.687  INFO: Shifting OPEN -> CLOSED (TO: 75895172)
2017-02-21 13:26:54.687  INFO: RECV thread exiting 0: Success
2017-02-21 13:26:54.687  INFO: recv_thread() joined.
2017-02-21 13:26:54.687  INFO: Closing replication queue.
2017-02-21 13:26:54.687  INFO: Closing slave action queue.
2017-02-21 13:26:54.687  WARN: Attempt to close a closed connection
2017-02-21 13:26:54.687  INFO: Exiting main loop
2017-02-21 13:26:54.687  INFO: Shifting CLOSED -> DESTROYED (TO: 75895172)




on donor

2017-02-21 13:26:55 140110903371520 [Note] WSREP: Flow-control interval: [32, 32]
2017-02-21 13:26:55 140110892575488 [Note] WSREP: New cluster view: global state: 14d8aedd-e93d-11e6-816c-525ae11f96b7:75895172, view# 20: Primary, number of nodes: 4, my index: 2, protocol version 3
2017-02-21 13:26:55 140110892575488 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2017-02-21 13:26:55 140110892575488 [Note] WSREP: REPL Protocols: 7 (3, 2)
2017-02-21 13:26:55 140110892575488 [Note] WSREP: Assign initial position for certification: 75895172, protocol version: 3
2017-02-21 13:26:55 140110964193024 [Note] WSREP: Service thread queue flushed.
WSREP_SST: [ERROR] innobackupex finished with error: 1.  Check /var/lib/mysql//innobackup.backup.log (20170221 13:26:57.302)
WSREP_SST: [ERROR] Cleanup after exit with status:22 (20170221 13:26:57.303)
2017-02-21 13:26:57 140110100166400 [ERROR] WSREP: Failed to read from: wsrep_sst_xtrabackup --role 'donor' --address '' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/'    --binlog '/var/mysqlbinlog/mysql-bin' --gtid '14d8aedd-e93d-11e6-816c-525ae11f96b7:75895172' --gtid-domain-id '0'
2017-02-21 13:26:57 140110100166400 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup --role 'donor' --address '' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/'    --binlog '/var/mysqlbinlog/mysql-bin' --gtid '14d8aedd-e93d-11e6-816c-525ae11f96b7:75895172' --gtid-domain-id '0': 22 (Invalid argument)
2017-02-21 13:26:57 140110100166400 [ERROR] WSREP: Command did not run: wsrep_sst_xtrabackup --role 'donor' --address '' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/'    --binlog '/var/mysqlbinlog/mysql-bin' --gtid '14d8aedd-e93d-11e6-816c-525ae11f96b7:75895172' --gtid-domain-id '0'
2017-02-21 13:26:57 140110903371520 [Warning] WSREP: Could not find peer: 691c4d47-f839-11e6-b349-c2fe1a316eee
2017-02-21 13:26:57 140110903371520 [Warning] WSREP: 2.1 (galarea1): State transfer to -1.-1 (left the group) failed: -22 (Invalid argument)
2017-02-21 13:26:57 140110903371520 [Note] WSREP: Shifting DONOR/DESYNCED -> JOINED (TO: 75895172)
2017-02-21 13:26:57 140110903371520 [Note] WSREP: Member 2.1 (galera1) synced with group.
2017-02-21 13:26:57 140110903371520 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 75895172)
2017-02-21 13:26:57 140110892575488 [Note] WSREP: Synchronized with group, ready for connections
2017-02-21 13:26:57 140110892575488 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2017-02-21 13:26:57 140110913861376 [Note] WSREP: (b2113ed7, 'tcp://0.0.0.0:4567') turning message relay requesting off
2017-02-21 13:27:00 140110913861376 [Note] WSREP:  cleaning up 691c4d47 (tcp://8.8.8.7:4444)


and 


/var/lib/mysql//innobackup.backup.log
 
170221 13:26:55 innobackupex: Starting the backup operation
 
IMPORTANT: Please check that the backup run completes successfully.
           At the end of a successful backup run innobackupex
           prints "completed OK!".
 
170221 13:26:56  version_check Connecting to MySQL server with DSN 'dbi:mysql:;mysql_read_default_group=xtrabackup;mysql_socket=/var/lib/mysql/mysql.sock' as 'root'  (using password: YES).
170221 13:26:56  version_check Connected to MySQL server
170221 13:26:56  version_check Executing a version check against the server...
170221 13:26:56  version_check Done.
170221 13:26:56 Connecting to MySQL server host: localhost, user: root, password: set, port: 0, socket: /var/lib/mysql/mysql.sock
Using server version 10.1.21-MariaDB
innobackupex version 2.4.4 based on MySQL server 5.7.13 Linux (x86_64) (revision id: df58cf2)
xtrabackup: uses posix_fadvise().
xtrabackup: cd to /var/lib/mysql/
xtrabackup: open files limit requested 0, set to 90000
xtrabackup: using the following InnoDB configuration:
xtrabackup:   innodb_data_home_dir = .
xtrabackup:   innodb_data_file_path = ibdata1:12M:autoextend
xtrabackup:   innodb_log_group_home_dir = ./
xtrabackup:   innodb_log_files_in_group = 2
xtrabackup:   innodb_log_file_size = 50331648
InnoDB: Number of pools: 1
170221 13:26:56 >> log scanned up to (3039513784)
xtrabackup: Generating a list of tablespaces
InnoDB: Allocated tablespace ID 780 for testsd/sbtest1, old maximum was 0
innobackupex: Error writing file 'UNOPENED' (Errcode: 32 - Broken pipe)
archive_write_header() failed.
[01] xtrabackup: error: cannot open the destination stream for ibdata1
[01] xtrabackup: Error: xtrabackup_copy_datafile() failed.
[01] xtrabackup: Error: failed to copy datafile.


On the donor side the logs look like this on mysql.err log

2017-02-21 13:26:55 140110903371520 [Note] WSREP: Flow-control interval: [32, 32]
2017-02-21 13:26:55 140110892575488 [Note] WSREP: New cluster view: global state: 14d8aedd-e93d-11e6-816c-525ae11f96b7:75895172, view# 20: Primary, number of nodes: 4, my index: 2, protocol version 3
2017-02-21 13:26:55 140110892575488 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2017-02-21 13:26:55 140110892575488 [Note] WSREP: REPL Protocols: 7 (3, 2)
2017-02-21 13:26:55 140110892575488 [Note] WSREP: Assign initial position for certification: 75895172, protocol version: 3
2017-02-21 13:26:55 140110964193024 [Note] WSREP: Service thread queue flushed.
WSREP_SST: [ERROR] innobackupex finished with error: 1.  Check /var/lib/mysql//innobackup.backup.log (20170221 13:26:57.302)
WSREP_SST: [ERROR] Cleanup after exit with status:22 (20170221 13:26:57.303)
2017-02-21 13:26:57 140110100166400 [ERROR] WSREP: Failed to read from: wsrep_sst_xtrabackup --role 'donor' --address '' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/'    --binlog '/var/mysqlbinlog/mysql-bin' --gtid '14d8aedd-e93d-11e6-816c-525ae11f96b7:75895172' --gtid-domain-id '0'
2017-02-21 13:26:57 140110100166400 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup --role 'donor' --address '' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/'    --binlog '/var/mysqlbinlog/mysql-bin' --gtid '14d8aedd-e93d-11e6-816c-525ae11f96b7:75895172' --gtid-domain-id '0': 22 (Invalid argument)
2017-02-21 13:26:57 140110100166400 [ERROR] WSREP: Command did not run: wsrep_sst_xtrabackup --role 'donor' --address '' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/'    --binlog '/var/mysqlbinlog/mysql-bin' --gtid '14d8aedd-e93d-11e6-816c-525ae11f96b7:75895172' --gtid-domain-id '0'
2017-02-21 13:26:57 140110903371520 [Warning] WSREP: Could not find peer: 691c4d47-f839-11e6-b349-c2fe1a316eee
2017-02-21 13:26:57 140110903371520 [Warning] WSREP: 2.1 (galarea1): State transfer to -1.-1 (left the group) failed: -22 (Invalid argument)
2017-02-21 13:26:57 140110903371520 [Note] WSREP: Shifting DONOR/DESYNCED -> JOINED (TO: 75895172)
2017-02-21 13:26:57 140110903371520 [Note] WSREP: Member 2.1 (galera1) synced with group.
2017-02-21 13:26:57 140110903371520 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 75895172)
2017-02-21 13:26:57 140110892575488 [Note] WSREP: Synchronized with group, ready for connections
2017-02-21 13:26:57 140110892575488 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2017-02-21 13:26:57 140110913861376 [Note] WSREP: (b2113ed7, 'tcp://0.0.0.0:4567') turning message relay requesting off
2017-02-21 13:27:00 140110913861376 [Note] WSREP:  cleaning up 691c4d47 (tcp://8.8.8.7:4444)


and /var/lib/mysql//innobackup.backup.log

170221 13:26:55 innobackupex: Starting the backup operation
 
IMPORTANT: Please check that the backup run completes successfully.
           At the end of a successful backup run innobackupex
           prints "completed OK!".
 
170221 13:26:56  version_check Connecting to MySQL server with DSN 'dbi:mysql:;mysql_read_default_group=xtrabackup;mysql_socket=/var/lib/mysql/mysql.sock' as 'root'  (using password: YES).
170221 13:26:56  version_check Connected to MySQL server
170221 13:26:56  version_check Executing a version check against the server...
170221 13:26:56  version_check Done.
170221 13:26:56 Connecting to MySQL server host: localhost, user: root, password: set, port: 0, socket: /var/lib/mysql/mysql.sock
Using server version 10.1.21-MariaDB
innobackupex version 2.4.4 based on MySQL server 5.7.13 Linux (x86_64) (revision id: df58cf2)
xtrabackup: uses posix_fadvise().
xtrabackup: cd to /var/lib/mysql/
xtrabackup: open files limit requested 0, set to 90000
xtrabackup: using the following InnoDB configuration:
xtrabackup:   innodb_data_home_dir = .
xtrabackup:   innodb_data_file_path = ibdata1:12M:autoextend
xtrabackup:   innodb_log_group_home_dir = ./
xtrabackup:   innodb_log_files_in_group = 2
xtrabackup:   innodb_log_file_size = 50331648
InnoDB: Number of pools: 1
170221 13:26:56 >> log scanned up to (3039513784)
xtrabackup: Generating a list of tablespaces
InnoDB: Allocated tablespace ID 780 for testsd/sbtest1, old maximum was 0
innobackupex: Error writing file 'UNOPENED' (Errcode: 32 - Broken pipe)
archive_write_header() failed.
[01] xtrabackup: error: cannot open the destination stream for ibdata1
[01] xtrabackup: Error: xtrabackup_copy_datafile() failed.
[01] xtrabackup: Error: failed to copy datafile.



alexey.y...@galeracluster.com

unread,
Feb 22, 2017, 6:04:23 AM2/22/17
to Søren Dam, codership
Unlike mysqld, garbd does not start receiving script automatically, it
merely broadcasts SST request through group communication mechanism.
Backup host must be prepared manually before running garbd in this case.

Søren Dam

unread,
Feb 22, 2017, 1:40:22 PM2/22/17
to codership, soer...@gmail.com
Ok, i didn't see that in any of the documentation i've seen. Thanks for the answer  Alexey:-)

So you do you script your own backup scripts with netcat/socat tunnel creation? 

alexey.y...@galeracluster.com

unread,
Feb 26, 2017, 3:42:13 AM2/26/17
to Søren Dam, codership
Yes, I would suggest taking an existing SST script and modify it to your
liking or use unmodified.
Reply all
Reply to author
Forward
0 new messages