Remove one node and join it again with IST?


P4tt4nz

Nov 1, 2012, 7:36:28 AM
to codersh...@googlegroups.com
Hi

Is there a way to remove one node from the cluster, keep it online in read-only mode, make changes on the rest of the cluster, and then let that node rejoin with only an IST?

I have tried restarting a node with gcomm:// so that it starts as a new cluster. I save its grastate.dat before I restart the node.
I then set the node to read_only mode.
Then I make changes on the remaining cluster.
When I'm done, I restart the read-only node with the old grastate.dat and, instead of gcomm://, point it at the donor.

It then joins the cluster, but I can only get it to do so with an SST.

Here is the log I got from the restarted node:
121101 12:16:39 [Note] WSREP: Running: 'wsrep_sst_xtrabackup 'joiner' '172.27.13.94' 'root:dev' '/db/data/' '/db/data/my.cnf' '26252' 2>sst.err'
121101 12:16:39 [Note] WSREP: Prepared SST request: xtrabackup|172.27.13.94:4444/xtrabackup_sst
121101 12:16:39 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
121101 12:16:39 [Note] WSREP: Assign initial position for certification: 12674, protocol version: 2
121101 12:16:39 [Warning] WSREP: Failed to prepare for incremental state transfer: Local state UUID (00000000-0000-0000-0000-000000000000) does not match group state UUID (966b97ea-0324-11e2-0800-1a9b14b7fe9c): 1 (Operation not permitted)
at galera/src/replicator_str.cpp:prepare_for_IST():440. IST will be unavailable.
121101 12:16:39 [Note] WSREP: Node 2 (e2) requested state transfer from 'e1'. Selected 0 (e1)(SYNCED) as donor.
121101 12:16:39 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 12674)
121101 12:16:39 [Note] WSREP: Requesting state transfer: success, donor: 0

And this is from the donor:
121101 12:16:37 [Note] WSREP: New cluster view: global state: 966b97ea-0324-11e2-0800-1a9b14b7fe9c:12674, view# 11: Primary, number of nodes: 3, my index: 0, protocol version 2
121101 12:16:37 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
121101 12:16:37 [Note] WSREP: Assign initial position for certification: 12674, protocol version: 2
121101 12:16:39 [Note] WSREP: Node 2 (e2) requested state transfer from 'e1'. Selected 0 (ekund1)(SYNCED) as donor.
121101 12:16:39 [Note] WSREP: Shifting SYNCED -> DONOR/DESYNCED (TO: 12674)
121101 12:16:39 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
121101 12:16:39 [Note] WSREP: Running: 'wsrep_sst_xtrabackup 'donor' '172.27.13.94:4444/xtrabackup_sst' 'root:dev' '/db/data/' '/db/data/my.cnf' '966b97ea-0324-11e2-0800-1a9b14b7fe9c' '12674' '0''
121101 12:16:39 [Note] WSREP: sst_donor_thread signaled with 0

Cheers
Patrik

P4tt4nz

Nov 1, 2012, 8:51:46 AM
to codersh...@googlegroups.com
Hmm, it seems I can't even stop a node, run a CREATE TABLE on the cluster, and start the stopped node again without a full SST. The strange thing in the log on the restarted node is:
Group state: 966b97ea-0324-11e2-0800-1a9b14b7fe9c:12677
Local state: 00000000-0000-0000-0000-000000000000:-1

But if I check the grastate.dat file on the restarted node:
more grastate.dat
# GALERA saved state
version: 2.1
uuid:    966b97ea-0324-11e2-0800-1a9b14b7fe9c
seqno:   12670
cert_index:
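
To make the mismatch concrete, here is a minimal Python sketch of the comparison the log warning reports: the saved UUID against the group UUID, and how far the saved seqno is behind. The function names (parse_grastate, split_position, ist_gap) are hypothetical and this is not Galera's own logic; it only reads the "key: value" layout shown above. Even when the UUIDs match, whether an IST actually happens also depends on the donor still having the missing writesets cached.

```python
# Hypothetical helpers for illustration only; not part of Galera.
NIL_UUID = "00000000-0000-0000-0000-000000000000"


def parse_grastate(text):
    """Parse the 'key: value' lines of a grastate.dat file into a dict."""
    state = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the '# GALERA saved state' header
        key, _, value = line.partition(":")
        state[key.strip()] = value.strip()
    return state


def split_position(position):
    """Split a 'uuid:seqno' string, as printed in the log, into (uuid, seqno)."""
    uuid, _, seqno = position.rpartition(":")
    return uuid, int(seqno)


def ist_gap(saved, group_position):
    """Return how many writesets the saved state is behind the group,
    or None when the saved state cannot be matched to the group at all."""
    group_uuid, group_seqno = split_position(group_position)
    uuid = saved.get("uuid", NIL_UUID)
    seqno = int(saved.get("seqno", "-1"))
    if uuid == NIL_UUID or uuid != group_uuid or seqno < 0:
        return None
    return group_seqno - seqno


saved = parse_grastate("""# GALERA saved state
version: 2.1
uuid:    966b97ea-0324-11e2-0800-1a9b14b7fe9c
seqno:   12670
cert_index:
""")

gap = ist_gap(saved, "966b97ea-0324-11e2-0800-1a9b14b7fe9c:12677")
print("writesets behind the group:", gap)
```

With the file above, the saved UUID matches the group, so the file itself looks usable; the node is reporting a nil UUID and seqno -1 as its local state instead, which is what forces the SST.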

/Patrik

Alex Yurchenko

Nov 1, 2012, 9:06:32 AM
to codersh...@googlegroups.com
On 2012-11-01 14:51, P4tt4nz wrote:
> Hmm seems i cant even stop, do a create table on cluster, and start the
> stopped node without a full SST. Strange thing in the log on the stopped
> and started node:
> Group state: 966b97ea-0324-11e2-0800-1a9b14b7fe9c:12677
> Local state: 00000000-0000-0000-0000-000000000000:-1
>
> but if i check the grastate.dat file on restarted node:
> more grastate.dat
> # GALERA saved state
> version: 2.1
> uuid: 966b97ea-0324-11e2-0800-1a9b14b7fe9c
> seqno: 12670
> cert_index:
>
> /Patrik

We need more logs from the joiner, beginning at node startup. The donor is
irrelevant here, as the joiner simply does not recognize its saved state.
--
Alexey Yurchenko,
Codership Oy, www.codership.com
Skype: alexey.yurchenko, Phone: +358-400-516-011