Re: Issues with wsrep_sst_xtrabackup in latest PXC (5.5.28)

Michael Eklund

Nov 16, 2012, 3:39:38 PM
to percona-d...@googlegroups.com
I can confirm the 1st case, though it is more nefarious than described.  WSREP's wsrep_sst_xtrabackup will use only the last included cnf file.  It does not appear to treat the included files as a unit.

Mike E.

On Friday, November 16, 2012 9:43:19 AM UTC-6, Freejack wrote:
Hi.  Installed the latest version of PXC yesterday in our test environment.  We've been using xtrabackup for SST and it has been working well up to now.  It seems to be broken in the latest version though.  Here is what I have installed:

Percona-XtraDB-Cluster-shared-5.5.28-23.7.369.rhel5
Percona-XtraDB-Cluster-galera-2.0-1.117.rhel5
Percona-XtraDB-Cluster-server-5.5.28-23.7.369.rhel5
Percona-XtraDB-Cluster-client-5.5.28-23.7.369.rhel5
percona-xtrabackup-2.0.3-470.rhel5

Two issues...
1) innobackupex looks in the wsrep.cnf file for the datadir.  As most of us probably have this configured in my.cnf instead, innobackupex will bomb out.  I added datadir to wsrep.cnf as well and it was able to continue.
2) After the db was fully transferred to the recipient node (based on size of datadir), the SST process terminates prematurely with the following.  innobackupex completed OK though.  First log is from donor, second from recipient:



DONOR:
121116  8:45:32 [Note] WSREP: Running: 'wsrep_sst_xtrabackup --role 'donor' --address '10.1.8.233:4444/xtrabackup_sst' --auth '***:***' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/mysql/wsrep.cnf' --gtid 'f472d7be-2e86-11e2-0800-cf63dfd4bfa3:421807''
121116  8:45:32 [Note] WSREP: sst_donor_thread signaled with 0
mktemp: invalid option -- -
Usage: mktemp [-V] | [-dqtu] [-p prefix] [template]
121116  9:21:09 [Note] WSREP: Provider paused at f472d7be-2e86-11e2-0800-cf63dfd4bfa3:421807
121116  9:25:11 [Note] WSREP: Provider resumed.
/usr//bin/wsrep_sst_xtrabackup: line 37: $1: unbound variable
121116  9:25:13 [ERROR] WSREP: Failed to read from: wsrep_sst_xtrabackup --role 'donor' --address '10.1.8.233:4444/xtrabackup_sst' --auth '***:***' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/mysql/wsrep.cnf' --gtid 'f472d7be-2e86-11e2-0800-cf63dfd4bfa3:421807'
121116  9:25:13 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup --role 'donor' --address '10.1.8.233:4444/xtrabackup_sst' --auth '***:***' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/mysql/wsrep.cnf' --gtid 'f472d7be-2e86-11e2-0800-cf63dfd4bfa3:421807': 1 (Operation not permitted)
121116  9:25:13 [Warning] WSREP: 0 (): State transfer to 1 () failed: -1 (Operation not permitted)
121116  9:25:13 [Note] WSREP: Shifting DONOR/DESYNCED -> JOINED (TO: 421807)
121116  9:25:13 [Note] WSREP: Member 0 () synced with group.
121116  9:25:13 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 421807)
121116  9:25:13 [Note] WSREP: Synchronized with group, ready for connections
121116  9:25:13 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
121116  9:25:14 [Note] WSREP: view(view_id(PRIM,03efd9aa-2ff2-11e2-0800-8f099c644f65,7) memb {
        03efd9aa-2ff2-11e2-0800-8f099c644f65,
} joined {
} left {
} partitioned {
        e1c9287a-2ff3-11e2-0800-a5873d6c135d,
})
121116  9:25:14 [Note] WSREP: forgetting e1c9287a-2ff3-11e2-0800-a5873d6c135d (tcp://10.1.8.233:4567)
121116  9:25:14 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 1



RECIPIENT:
121116  8:45:31 [Note] WSREP: Prepared SST request: xtrabackup|10.1.8.233:4444/xtrabackup_sst
121116  8:45:31 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
121116  8:45:31 [Note] WSREP: Assign initial position for certification: 421807, protocol version: 2
121116  8:45:31 [Warning] WSREP: Failed to prepare for incremental state transfer: Local state UUID (00000000-0000-0000-0000-000000000000) does not match group state UUID (f472d7be-2e86-11e2-0800-cf63dfd4bfa3): 1 (Operation not permitted)
         at galera/src/replicator_str.cpp:prepare_for_IST():440. IST will be unavailable.
121116  8:45:31 [Note] WSREP: Node 1 () requested state transfer from '*any*'. Selected 0 ()(SYNCED) as donor.
121116  8:45:31 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 421807)
121116  8:45:31 [Note] WSREP: Requesting state transfer: success, donor: 0
tar: xtrabackup_checkpoints: time stamp 2012-11-16 09:25:11 is 1 s in the future
tar: xtrabackup_binary: time stamp 2012-11-16 09:25:13 is 1 s in the future
121116  9:25:12 [Warning] WSREP: 0 (): State transfer to 1 () failed: -1 (Operation not permitted)
121116  9:25:12 [ERROR] WSREP: gcs/src/gcs_group.c:gcs_group_handle_join_msg():712: Will never receive state. Need to abort.
121116  9:25:12 [Note] WSREP: gcomm: terminating thread
121116  9:25:12 [Note] WSREP: gcomm: joining thread
121116  9:25:12 [Note] WSREP: gcomm: closing backend
121116  9:25:13 [Note] WSREP: view(view_id(NON_PRIM,03efd9aa-2ff2-11e2-0800-8f099c644f65,6) memb {
        e1c9287a-2ff3-11e2-0800-a5873d6c135d,
} joined {
} left {
} partitioned {
        03efd9aa-2ff2-11e2-0800-8f099c644f65,
})
121116  9:25:13 [Note] WSREP: view((empty))
121116  9:25:13 [Note] WSREP: gcomm: closed
121116  9:25:13 [Note] WSREP: /usr/sbin/mysqld: Terminated.
WSREP_SST: [ERROR] Parent mysqld process (PID:6995) terminated unexpectedly. (20121116 09:25:13.866)
121116 09:25:13 mysqld_safe mysqld from pid file /var/lib/mysql/r8-qadb-node1.pid ended



Thanks!

Michael Eklund

Nov 16, 2012, 3:48:46 PM
to percona-d...@googlegroups.com
Actually, my datadir was fine.  The problem is that it uses the wrong cnf file, so it won't pick up the right innodb* settings if you use custom ones.
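
To see what a single cnf file provides on its own, which appears to be all the SST script gets when it is handed one --defaults-file, something like the following can help. This is a minimal sketch, not part of any PXC tooling: cnf_get is a hypothetical helper, it only reads the [mysqld] section, and it deliberately does not follow !include or !includedir directives.

```shell
#!/bin/sh
# cnf_get FILE KEY: print the last value of KEY in the [mysqld] section of FILE.
# Hypothetical helper for illustration only. It does NOT follow !include or
# !includedir, so it shows what one file contributes when read in isolation.
cnf_get() {
    awk -v key="$2" '
        # A section header switches the "inside [mysqld]" flag on or off.
        /^[ \t]*\[/ { in_mysqld = ($0 ~ /^[ \t]*\[mysqld\]/); next }
        # Skip full-line comments.
        /^[ \t]*[#;]/ { next }
        in_mysqld {
            n = index($0, "=")
            if (n == 0) next
            k = substr($0, 1, n - 1)
            v = substr($0, n + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", k)   # trim the key
            gsub(/^[ \t]+|[ \t]+$/, "", v)   # trim the value
            gsub(/-/, "_", k)                # accept data-dir style spelling
            if (k == key) val = v            # last occurrence wins
        }
        END { if (val != "") print val }
    ' "$1"
}

# Example (paths are assumptions, adjust to your layout):
#   cnf_get /etc/mysql/wsrep.cnf datadir
#   cnf_get /etc/my.cnf innodb_log_file_size
```

If the first call prints nothing while the second one does, the setting lives in a file the SST script never reads.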

Michael Eklund

Nov 16, 2012, 4:33:45 PM
to percona-d...@googlegroups.com
Filed a bug here:  https://bugs.launchpad.net/codership-mysql/+bug/1079892

Vote it up!

It is due to recursive calls in mysys/default.c::search_default_file_with_ext, which the wsrep mysql patches do not account for.

Mike E.

Michael Eklund

Nov 16, 2012, 4:37:50 PM
to percona-d...@googlegroups.com
Going to post this in the codership group, as it really is an upstream problem.

Michael Eklund

Nov 16, 2012, 4:46:59 PM
to percona-d...@googlegroups.com
I do not see the second problem in my environment.  It may be due to clocks not being in sync.

Freejack

Nov 16, 2012, 4:55:59 PM
to percona-d...@googlegroups.com
I suspected that as well so I synced and tried again with the same results.

amol

Nov 26, 2012, 12:22:35 AM
to percona-d...@googlegroups.com
Any luck with this, guys?  I am facing the same errors on my donor and receiver nodes with the same Percona versions.

Freejack

Nov 26, 2012, 12:52:26 PM
to percona-d...@googlegroups.com
I had to go back to using codership's patched mysql build for now.  Couldn't figure out why the SST fails at the end.

Amol Kedar

Nov 26, 2012, 1:12:09 PM
to percona-d...@googlegroups.com
Did you use that build on all 3 nodes, or just the node that was not starting?
Do you have instructions on how to get the build and install it?

Thanks once again..


Freejack

Nov 26, 2012, 1:15:27 PM
to percona-d...@googlegroups.com
You can download everything you need here: http://www.codership.com/downloads/download-mysqlgalera

I used PXC on all nodes without success, then reverted all nodes back to codership mysql.  Just install the RPMs.  The only thing you need to change in the config is to point to where the galera lib is.  I believe codership and Percona install the library to different places.
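
If it helps, here is a rough sketch for finding where the galera library landed and printing the matching config line. find_galera_lib is a hypothetical helper and the default search directories are an assumption; libgalera_smm.so and wsrep_provider are the usual library file name and option name.

```shell
#!/bin/sh
# find_galera_lib [DIR...]: print a wsrep_provider line for the first
# libgalera_smm.so found under the given directories.
# Hypothetical helper; the default directories below are an assumption.
find_galera_lib() {
    [ "$#" -gt 0 ] || set -- /usr/lib /usr/lib64
    # Errors for missing directories are silenced; only the first hit is used.
    lib=$(find "$@" -name 'libgalera_smm.so' 2>/dev/null | head -n 1)
    [ -n "$lib" ] || return 1
    echo "wsrep_provider=$lib"
}

# Example: find_galera_lib            (search the default directories)
#          find_galera_lib /opt       (search somewhere else)
```

The printed line is what would go into the [mysqld] section of whichever cnf file the server actually reads.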

Amol Kedar

Nov 26, 2012, 1:21:36 PM
to percona-d...@googlegroups.com
Oh, so you are saying that you gave up on PXC and moved to other cluster software?
OK, I was thinking you patched the same installation of PXC but packaged by codership, etc.

OK, I think I'll try to revert back to 5.5.27 and see if that helps.

Thanks for your help and advice.

Freejack

Nov 26, 2012, 1:27:12 PM
to percona-d...@googlegroups.com
Not really different cluster software.  PXC (Percona XtraDB Cluster) is built on codership's galera multi-master code.  Codership's build is more of a generic version of mysql but with the same clustering capability.

Alex Yurchenko

Dec 25, 2012, 8:55:51 PM
to percona-d...@googlegroups.com
Hi,

Just wanted to warn whoever is concerned that this is not a real fix.
The script will pass, but it will create a file that won't be used.
There is no need to call mktemp in the first place - pidfile is stored
in a predefined, unchangeable location. A properly fixed version of the
script is available as a separate download here:
https://launchpad.net/codership-mysql/5.5/5.5.28-23.7/+download/wsrep_sst_xtrabackup
.

Regards,
Alex

On 2012-12-25 03:22, Syahrul Sazli Shaharir wrote:
> Hi Freejack,
>
> This might help you or someone else in the same environment as mine
> (latest PXC in CentOS/RHEL 5) - in my case, the telling error was:-
>
> mktemp: invalid option -- -
> Usage: mktemp [-V] | [-dqtu] [-p prefix] [template]
> ( which you also have on your output logs )
>
> which led to a simple fix in /usr/bin/wsrep_sst_xtrabackup :-
>
> 118c118,120
> < XTRABACKUP_PID=$(mktemp --tmpdir wsrep_sst_xtrabackupXXXX.pid)
> ---
> > #XTRABACKUP_PID=$(mktemp --tmpdir wsrep_sst_xtrabackupXXXX.pid)
> > # Fix for CentOS 5
> > XTRABACKUP_PID=$(mktemp -t wsrep_sst_xtrabackupXXXX.pid)
>
> (root cause: mktemp version)
>
> and the sync succeeded right afterwards. Hope it helps - thanks.
Alexey Yurchenko,
Codership Oy, www.codership.com
Skype: alexey.yurchenko, Phone: +358-400-516-011
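
For anyone curious about the portability issue behind the mktemp error in the quoted patch, here is a minimal sketch of a call that tries the long option first and falls back to -t. make_pidfile is a hypothetical helper, and the template with trailing X characters is an assumption chosen to suit older mktemp builds. As noted above, the fixed script does not need mktemp at all, so this only illustrates the option difference.

```shell
#!/bin/sh
# make_pidfile: create a temporary file and print its path.
# Hypothetical helper. It tries the GNU coreutils long option first and
# falls back to the short -t form that older mktemp builds accept.
make_pidfile() {
    template="wsrep_sst_xtrabackupXXXXXX"   # assumed template, X's at the end
    mktemp --tmpdir "$template" 2>/dev/null || mktemp -t "$template"
}

# Example: pidfile=$(make_pidfile) && echo "created $pidfile"
```

The stderr of the first attempt is discarded so that the "invalid option" message from an older mktemp does not end up in the error log.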

Freejack

Dec 28, 2012, 11:39:05 AM
to percona-d...@googlegroups.com
Thanks Alex.  I haven't had a chance to try this yet but will let you know how it goes.

Alex Yurchenko

May 1, 2013, 12:42:49 PM
to percona-d...@googlegroups.com
This is most likely a permissions problem, not related to the OP. Make sure that
1) selinux/apparmor is disabled (not only in config, but in real life)
2) xtrabackup is installed and is in mysqld's PATH
3) you can run the failed SST command manually: cut and paste it from
the error log.
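
The first two checks can be sketched roughly as below. sst_preflight is a hypothetical helper, not something shipped with PXC. It only looks at SELinux via getenforce (AppArmor would need a separate check), and it tests the PATH of the shell you run it from, which may differ from the PATH mysqld was started with.

```shell
#!/bin/sh
# sst_preflight TOOL...: report the SELinux mode and whether each TOOL is
# reachable through PATH. Returns non-zero if SELinux is enforcing or a
# tool is missing. Hypothetical helper for illustration only.
sst_preflight() {
    status=0
    # 1) SELinux: getenforce exists only where the SELinux tools are installed.
    if command -v getenforce >/dev/null 2>&1; then
        mode=$(getenforce)
        echo "selinux: $mode"
        if [ "$mode" = "Enforcing" ]; then status=1; fi
    else
        echo "selinux: getenforce not found"
    fi
    # 2) Each named tool must resolve through PATH.
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            status=1
        fi
    done
    return "$status"
}

# Example: sst_preflight innobackupex xtrabackup wsrep_sst_xtrabackup
```

Running it as the mysql user gets closer to what mysqld sees, though the manual run in step 3 is still the real test.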

On 2013-05-01 16:11, adi.h...@fiverr.com wrote:
> I am experiencing the same issue.
> Any idea what I should do?
>
> On Thursday, February 28, 2013 8:43:51 AM UTC+2, Peter Pang wrote:
>>
>> Hi mate, I hit this problem with the latest version of Percona cluster
>>
>> first node log:
>>
>>
>> 130228 17:35:04 [ERROR] WSREP: Failed to read from:
>> wsrep_sst_xtrabackup
>> --role 'donor' --address '10.132.23.69:4444/xtrabackup_sst' --auth
>> 'root:C02G327PDHJQ' --socket '/var/run/mysqld/mysqld.sock' --datadir
>> '/var/lib/mysql/' --defaults-file '/etc/mysql/my.cnf' --gtid
>> 'c77d33b4-816e-11e2-0800-24d345dbcebc:0'
>> 130228 17:35:04 [Note] WSREP: sst_donor_thread signaled with 0
>> 130228 17:35:04 [ERROR] WSREP: Process completed with error:
>> wsrep_sst_xtrabackup --role 'donor' --address '
>> 10.132.23.69:4444/xtrabackup_sst' --auth 'root:C02G327PDHJQ' --socket
>> '/var/run/mysqld/mysqld.sock' --datadir '/var/lib/mysql/'
>> --defaults-file
>> '/etc/mysql/my.cnf' --gtid 'c77d33b4-816e-11e2-0800-24d345dbcebc:0':
>> 2 (No
>> such file or directory)
>>
>> second node log:
>>
>> 130228 17:42:08 [Warning] WSREP: 0 (ip-10-132-23-68): State transfer
>> to 1
>> (ip-10-132-23-69) failed: -1 (Operation not permitted)
>> 130228 17:42:08 [ERROR] WSREP:
>> gcs/src/gcs_group.c:gcs_group_handle_join_msg():719: Will never
>> receive
>> state. Need to abort.
>> 130228 17:42:08 [Note] WSREP: gcomm: terminating thread
>> 130228 17:42:08 [Note] WSREP: gcomm: joining thread
>> 130228 17:42:08 [Note] WSREP: gcomm: closing backend
>> WSREP_SST: [ERROR] Error while getting st data from donor node: 1, 2
>> (20130228 17:42:08.675)
>> 130228 17:42:08 [ERROR] WSREP: Process completed with error:
>> wsrep_sst_xtrabackup --role 'joiner' --address '10.132.23.69' --auth
>> 'root:C02G327PDHJQ' --datadir '/var/lib/mysql/' --defaults-file
>> '/etc/mysql/my.cnf' --parent '12178': 32 (Broken pipe)
>> 130228 17:42:08 [ERROR] WSREP: Failed to read uuid:seqno from joiner
>> script.
>> 130228 17:42:08 [ERROR] WSREP: SST failed: 32 (Broken pipe)
>> 130228 17:42:08 [ERROR] Aborting
>>
>>
>> I think that is because the script "wsrep_sst_xtrabackup" cannot
>> be found by Percona.
>>
>> Could you help me fix it?
>>
>> thanks.