SST fails on Galera


Pierre Schwartz

Nov 30, 2017, 7:40:58 AM
to codership
Hello,
I have 2 nodes: galera1 (synced, with 40 GB+ of data) and galera2, which is trying to fully sync with galera1 (~10,000 tables).

When galera2 starts up, the SST starts fine. galera1 sends the databases to galera2 (I see disk usage growing on galera2, and I see the innobackupex syslog entries on galera1).
At the end of the SST, it stops with these logs:

On the receiver (galera2):
2017-11-30 12:36:08 140115384989440 [Note] WSREP: Member 1.0 (galera2) requested state transfer from '*any*'. Selected 0.0 (galera1)(SYNCED) as donor.
2017-11-30 12:36:08 140115384989440 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 15455972)
2017-11-30 12:36:08 140115960735488 [Note] WSREP: Requesting state transfer: success, donor: 0
2017-11-30 12:36:08 140115960735488 [Note] WSREP: GCache history reset: old(51993a17-cb9f-11e7-ba08-5a9b1bd63677:0) -> new(51993a17-cb9f-11e7-ba08-5a9b1bd63677:15455859)
2017-11-30 12:36:10 140115401766656 [Note] WSREP: (a7af52fa, 'tcp://0.0.0.0:4567') connection to peer a7af52fa with addr tcp://192.168.0.131:4567 timed out, no messages seen in PT3S
2017-11-30 12:36:10 140115401766656 [Note] WSREP: (a7af52fa, 'tcp://0.0.0.0:4567') turning message relay requesting off
2017-11-30 12:39:20 140115384989440 [Note] WSREP: Created page /var/lib/mysql/gcache.page.000000 of size 134217728 bytes
2017-11-30 12:42:27 140115384989440 [Note] WSREP: Created page /var/lib/mysql/gcache.page.000001 of size 134217728 bytes
2017-11-30 12:45:49 140115384989440 [Note] WSREP: Created page /var/lib/mysql/gcache.page.000002 of size 134217728 bytes
2017-11-30 12:50:29 140115384989440 [Note] WSREP: Created page /var/lib/mysql/gcache.page.000003 of size 134217728 bytes
2017-11-30 13:03:32 140115384989440 [Note] WSREP: Created page /var/lib/mysql/gcache.page.000004 of size 134217728 bytes
2017-11-30 13:27:56 140115384989440 [Warning] WSREP: 0.0 (galera1): State transfer to 1.0 (galera2) failed: -22 (Invalid argument)
2017-11-30 13:27:56 140115384989440 [ERROR] WSREP: gcs/src/gcs_group.cpp:gcs_group_handle_join_msg():736: Will never receive state. Need to abort.
2017-11-30 13:27:56 140115384989440 [Note] WSREP: gcomm: terminating thread
2017-11-30 13:27:56 140115384989440 [Note] WSREP: gcomm: joining thread
2017-11-30 13:27:56 140115384989440 [Note] WSREP: gcomm: closing backend
2017-11-30 13:27:56 140115359823616 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup-v2 --role 'joiner' --address '192.168.0.131' --datadir '/var/lib/mysql/'   --parent '11869'  '' : 2 (No such file or directory)
2017-11-30 13:27:56 140115359823616 [ERROR] WSREP: Failed to read uuid:seqno and wsrep_gtid_domain_id from joiner script.
2017-11-30 13:27:56 140115960951360 [ERROR] WSREP: SST failed: 2 (No such file or directory)
2017-11-30 13:27:56 140115960951360 [ERROR] Aborting


On the donor (galera1):
... transferring the data, compressing and streaming ...
Nov 30 13:27:53 galera1 -innobackupex-backup: 171130 13:27:53 [01]        ...done
Nov 30 13:27:54 galera1 -innobackupex-backup: 171130 13:27:54 Executing FLUSH NO_WRITE_TO_BINLOG TABLES...
Nov 30 13:27:54 galera1 -innobackupex-backup: 171130 13:27:54 >> log scanned up to (398747638500)
Nov 30 13:27:54 galera1 -innobackupex-backup: 171130 13:27:54 Executing FLUSH TABLES WITH READ LOCK...
Nov 30 13:27:55 galera1 -innobackupex-backup: 171130 13:27:55 >> log scanned up to (398748098498)
Nov 30 13:27:56 galera1 -innobackupex-backup: 171130 13:27:56 >> log scanned up to (398748109280)
Nov 30 13:27:56 galera1 -innobackupex-backup: Error: failed to execute query FLUSH TABLES WITH READ LOCK: Lock wait timeout exceeded; try restarting transaction
Nov 30 13:27:56 galera1 -wsrep-sst-donor: innobackupex finished with error: 1.  Check /var/lib/mysql//innobackup.backup.log
Nov 30 13:27:56 galera1 -wsrep-sst-donor: Cleanup after exit with status:22
Nov 30 13:27:56 galera1 -wsrep-sst-donor: Cleaning up temporary directories


Any idea about the error? Maybe a mysqld config option is wrong.
Has anyone run into this before?

Lammert Bies

Dec 4, 2017, 5:51:15 AM
to codership
The donor log says "innobackupex finished with error: 1.  Check /var/lib/mysql//innobackup.backup.log"

Does this backup log file give any additional information about the error?

Pierre Schwartz

Dec 4, 2017, 6:01:19 AM
to codership
No, the innobackup.backup.log file does not exist.

FISH

Dec 6, 2017, 7:10:22 AM
to codership
The SST with xtrabackup did not complete successfully. If xtrabackup still fails after several attempts, please retry the SST with rsync.

 -wsrep-sst-donor: innobackupex finished with error: 1.  Check /var/lib/mysql//innobackup.backup.log
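
Since the innobackup.backup.log file is missing, the donor's syslog is the only trace available. Below is a minimal Python sketch for pulling the first error line out of such a syslog excerpt. It assumes the line format shown in this thread (timestamp, host, "-tag:", message). The names `first_error`, `LINE_RE` and `DONOR_LOG` are hypothetical helpers for illustration, not part of Galera or xtrabackup.

```python
import re

# Sample donor syslog lines copied from the first post in this thread.
DONOR_LOG = """\
Nov 30 13:27:54 galera1 -innobackupex-backup: 171130 13:27:54 Executing FLUSH TABLES WITH READ LOCK...
Nov 30 13:27:56 galera1 -innobackupex-backup: Error: failed to execute query FLUSH TABLES WITH READ LOCK: Lock wait timeout exceeded; try restarting transaction
Nov 30 13:27:56 galera1 -wsrep-sst-donor: innobackupex finished with error: 1.  Check /var/lib/mysql//innobackup.backup.log
Nov 30 13:27:56 galera1 -wsrep-sst-donor: Cleanup after exit with status:22
"""

# Assumed layout: "<Mon DD HH:MM:SS> <host> -<tag>: <message>".
LINE_RE = re.compile(r"^(\w{3}\s+\d+ \d\d:\d\d:\d\d) (\S+) -([\w-]+): (.*)$")


def first_error(log_text):
    """Return (timestamp, tag, message) for the first line whose message
    contains 'error' (case-insensitive), or None if there is no such line."""
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if match and "error" in match.group(4).lower():
            return match.group(1), match.group(3), match.group(4)
    return None


if __name__ == "__main__":
    print(first_error(DONOR_LOG))
```

On the excerpt above, the first match is the innobackupex line reporting that FLUSH TABLES WITH READ LOCK hit a lock wait timeout. It comes before the donor script's "exit with status:22" line, which appears to correspond to the -22 reported on the receiver.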