Can't add new node to existing cluster

sla...@tempus.com

May 23, 2017, 1:37:36 PM
to codership
Hi,
I'm trying to add another MySQL node to an existing Galera cluster, but without success. I'm hoping someone can help me solve this. I currently have a cluster with 2 nodes running, with the following config:

Node 1 config:

[mysqld]
binlog_format=ROW
default-storage-engine=innodb
innodb_autoinc_lock_mode=2
bind-address=0.0.0.0

# Galera Provider Configuration
wsrep_on=ON
wsrep_provider=/usr/lib/galera/libgalera_smm.so

# Galera Cluster Configuration
wsrep_cluster_name="test_cluster"
wsrep_cluster_address="gcomm://172.16.101.61,172.16.101.62,172.16.101.63"

# Galera Synchronization Configuration
wsrep_sst_method=rsync

# Galera Node Configuration
wsrep_node_address="172.16.101.61"
wsrep_node_name="db-ha-01"

 

Node 2 config:

[mysqld]
binlog_format=ROW
default-storage-engine=innodb
innodb_autoinc_lock_mode=2
bind-address=0.0.0.0

# Galera Provider Configuration
wsrep_on=ON
wsrep_provider=/usr/lib/galera/libgalera_smm.so

# Galera Cluster Configuration
wsrep_cluster_name="test_cluster"
wsrep_cluster_address="gcomm://172.16.101.61,172.16.101.62,172.16.101.63"

# Galera Synchronization Configuration
wsrep_sst_method=rsync

# Galera Node Configuration
wsrep_node_address="172.16.101.62"
wsrep_node_name="db-ha-02"

 

Failed node (Node 3) config:

[mysqld]
binlog_format=ROW
default-storage-engine=innodb
innodb_autoinc_lock_mode=2
bind-address=0.0.0.0

# Galera Provider Configuration
wsrep_on=ON
wsrep_provider=/usr/lib/galera/libgalera_smm.so

# Galera Cluster Configuration
wsrep_cluster_name="test_cluster"
wsrep_cluster_address="gcomm://172.16.101.61,172.16.101.62,172.16.101.63"

# Galera Synchronization Configuration
wsrep_sst_method=rsync

# Galera Node Configuration
wsrep_node_address="172.16.101.63"
wsrep_node_name="db-ha-03"
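
As far as I can tell, the three configs above differ only in wsrep_node_address and wsrep_node_name. A quick way to double-check that is a small Python sketch like the one below. The helper names (load_mysqld, differing_keys) and the shortened sample configs are made up for illustration, and it assumes a simple my.cnf with a [mysqld] section and no !include directives, which configparser would not understand.

```python
import configparser

def load_mysqld(text):
    """Parse my.cnf-style text and return the [mysqld] options as a dict."""
    # allow_no_value covers bare options; interpolation=None keeps '%' literal.
    parser = configparser.ConfigParser(allow_no_value=True, interpolation=None)
    parser.read_string(text)
    return dict(parser["mysqld"])

def differing_keys(a, b):
    """Return the sorted option names whose values differ between two nodes."""
    return sorted(k for k in a.keys() | b.keys() if a.get(k) != b.get(k))

# Shortened copies of the Node 1 and Node 3 configs (hypothetical samples).
NODE1 = """
[mysqld]
# Galera Cluster Configuration
wsrep_cluster_name="test_cluster"
wsrep_sst_method=rsync
wsrep_node_address="172.16.101.61"
wsrep_node_name="db-ha-01"
"""

NODE3 = """
[mysqld]
# Galera Cluster Configuration
wsrep_cluster_name="test_cluster"
wsrep_sst_method=rsync
wsrep_node_address="172.16.101.63"
wsrep_node_name="db-ha-03"
"""

if __name__ == "__main__":
    print(differing_keys(load_mysqld(NODE1), load_mysqld(NODE3)))
```

In practice the two strings would be replaced with the contents of each node's real config file.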

 


When I try to join Node 3, I get the following error:


2017-05-22 16:56:58 3186 [Note] WSREP: Read nil XID from storage engines, skipping position init

2017-05-22 16:56:58 3186 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib/galera/libgalera_smm.so'

2017-05-22 16:56:58 3186 [Note] WSREP: wsrep_load(): Galera 3.20(r7e383f7) by Codership Oy <in...@codership.com> loaded successfully.

2017-05-22 16:56:58 3186 [Note] WSREP: CRC-32C: using hardware acceleration.

2017-05-22 16:56:58 3186 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1, safe_to_bootsrap: 1

2017-05-22 16:56:58 3186 [Note] WSREP: Passing config to GCS: base_dir = /var/lib/mysql/; base_host = 172.16.101.63; base_port = 4567; cert.log_conflicts = no; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.recover = no; gcache.size = 128M; gcomm.thread_prio = ; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0; pc.announce_timeout = PT3S; pc.checksum = false; pc.

2017-05-22 16:56:58 3186 [Note] WSREP: GCache history reset: old(86979474-043a-11e7-ae6c-be317c558b2e:0) -> new(00000000-0000-0000-0000-000000000000:-1)

2017-05-22 16:56:58 3186 [Note] WSREP: Assign initial position for certification: -1, protocol version: -1

2017-05-22 16:56:58 3186 [Note] WSREP: wsrep_sst_grab()

2017-05-22 16:56:58 3186 [Note] WSREP: Start replication

2017-05-22 16:56:58 3186 [Note] WSREP: Setting initial position to 00000000-0000-0000-0000-000000000000:-1

2017-05-22 16:56:58 3186 [Note] WSREP: protonet asio version 0

2017-05-22 16:56:58 3186 [Note] WSREP: Using CRC-32C for message checksums.

2017-05-22 16:56:58 3186 [Note] WSREP: backend: asio

2017-05-22 16:56:58 3186 [Note] WSREP: gcomm thread scheduling priority set to other:0 

2017-05-22 16:56:58 3186 [Warning] WSREP: access file(/var/lib/mysql//gvwstate.dat) failed(No such file or directory)

2017-05-22 16:56:58 3186 [Note] WSREP: restore pc from disk failed

2017-05-22 16:56:58 3186 [Note] WSREP: GMCast version 0

2017-05-22 16:56:58 3186 [Note] WSREP: (93a5b880, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567

2017-05-22 16:56:58 3186 [Note] WSREP: (93a5b880, 'tcp://0.0.0.0:4567') multicast: , ttl: 1

2017-05-22 16:56:58 3186 [Note] WSREP: EVS version 0

2017-05-22 16:56:58 3186 [Note] WSREP: gcomm: connecting to group 'test_cluster', peer '172.16.101.62:'

2017-05-22 16:56:58 3186 [Note] WSREP: (93a5b880, 'tcp://0.0.0.0:4567') connection established to 14a11e6f tcp://172.16.101.62:4567

2017-05-22 16:56:58 3186 [Note] WSREP: (93a5b880, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://172.16.101.61:4567 

2017-05-22 16:56:58 3186 [Note] WSREP: (93a5b880, 'tcp://0.0.0.0:4567') connection established to 3e7aa8ab tcp://172.16.101.61:4567

2017-05-22 16:56:58 3186 [Note] WSREP: declaring 14a11e6f at tcp://172.16.101.62:4567 stable

2017-05-22 16:56:58 3186 [Note] WSREP: declaring 3e7aa8ab at tcp://172.16.101.61:4567 stable

2017-05-22 16:56:58 3186 [Note] WSREP: Node 14a11e6f state prim

2017-05-22 16:56:58 3186 [Note] WSREP: view(view_id(PRIM,14a11e6f,1201) memb {

        14a11e6f,0

        3e7aa8ab,0

        93a5b880,0

} joined {

} left {

} partitioned {

})

2017-05-22 16:56:58 3186 [Note] WSREP: save pc into disk

2017-05-22 16:56:59 3186 [Note] WSREP: gcomm: connected

2017-05-22 16:56:59 3186 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636

2017-05-22 16:56:59 3186 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)

2017-05-22 16:56:59 3186 [Note] WSREP: Opened channel 'test_cluster'

2017-05-22 16:56:59 3186 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 2, memb_num = 3

2017-05-22 16:56:59 3186 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.

2017-05-22 16:56:59 3186 [Note] WSREP: Waiting for SST to complete.

2017-05-22 16:56:59 3186 [Note] WSREP: STATE EXCHANGE: sent state msg: 93f32dff-3f39-11e7-9389-6a54f1b9c8c9

2017-05-22 16:56:59 3186 [Note] WSREP: STATE EXCHANGE: got state msg: 93f32dff-3f39-11e7-9389-6a54f1b9c8c9 from 0 (db-ha-02)

2017-05-22 16:56:59 3186 [Note] WSREP: STATE EXCHANGE: got state msg: 93f32dff-3f39-11e7-9389-6a54f1b9c8c9 from 1 (db-ha-01)

2017-05-22 16:56:59 3186 [Note] WSREP: STATE EXCHANGE: got state msg: 93f32dff-3f39-11e7-9389-6a54f1b9c8c9 from 2 (db-ha-03)

2017-05-22 16:56:59 3186 [Note] WSREP: Quorum results:

        version    = 4,

        component  = PRIMARY,

        conf_id    = 1197,

        members    = 2/3 (joined/total),

        act_id     = 9746813,

        last_appl. = -1,

        protocols  = 0/7/3 (gcs/repl/appl),

        group UUID = 86979474-043a-11e7-ae6c-be317c558b2e

2017-05-22 16:56:59 3186 [Note] WSREP: Flow-control interval: [28, 28]

2017-05-22 16:56:59 3186 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 9746813)

2017-05-22 16:56:59 3186 [Note] WSREP: State transfer required: 

        Group state: 86979474-043a-11e7-ae6c-be317c558b2e:9746813

        Local state: 00000000-0000-0000-0000-000000000000:-1

2017-05-22 16:56:59 3186 [Note] WSREP: New cluster view: global state: 86979474-043a-11e7-ae6c-be317c558b2e:9746813, view# 1198: Primary, number of nodes: 3, my index: 2, protocol version 3

2017-05-22 16:56:59 3186 [Warning] WSREP: Gap in state sequence. Need state transfer.

2017-05-22 16:56:59 3186 [Note] WSREP: Running: 'wsrep_sst_rsync --role 'joiner' --address '172.16.101.63' --datadir '/var/lib/mysql/' --defaults-file '/etc/mysql/my.cnf' --defaults-group-suffix '' --parent '3186'  '' '

2017-05-22 16:56:59 3186 [Note] WSREP: Prepared SST request: rsync|172.16.101.63:4444/rsync_sst

2017-05-22 16:56:59 3186 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.

2017-05-22 16:56:59 3186 [Note] WSREP: REPL Protocols: 7 (3, 2)

2017-05-22 16:56:59 3186 [Note] WSREP: Assign initial position for certification: 9746813, protocol version: 3

2017-05-22 16:56:59 3186 [Note] WSREP: Service thread queue flushed.

2017-05-22 16:56:59 3186 [Warning] WSREP: Failed to prepare for incremental state transfer: Local state UUID (00000000-0000-0000-0000-000000000000) does not match group state UUID (86979474-043a-11e7-ae6c-be317c558b2e): 1 (Operation not permitted)

         at galera/src/replicator_str.cpp:prepare_for_IST():482. IST will be unavailable.

2017-05-22 16:56:59 3186 [Note] WSREP: Member 2.0 (db-ha-03) requested state transfer from '*any*'. Selected 0.0 (db-ha-02)(SYNCED) as donor.

2017-05-22 16:56:59 3186 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 9746825)

2017-05-22 16:56:59 3186 [Note] WSREP: Requesting state transfer: success, donor: 0

2017-05-22 16:56:59 3186 [Note] WSREP: GCache history reset: old(00000000-0000-0000-0000-000000000000:0) -> new(86979474-043a-11e7-ae6c-be317c558b2e:9746813)

2017-05-22 16:56:59 3186 [Warning] WSREP: 0.0 (db-ha-02): State transfer to 2.0 (db-ha-03) failed: -2 (No such file or directory)

2017-05-22 16:56:59 3186 [ERROR] WSREP: gcs/src/gcs_group.cpp:gcs_group_handle_join_msg():736: Will never receive state. Need to abort.

2017-05-22 16:56:59 3186 [Note] WSREP: gcomm: terminating thread

2017-05-22 16:56:59 3186 [Note] WSREP: gcomm: joining thread

2017-05-22 16:56:59 3186 [Note] WSREP: gcomm: closing backend

2017-05-22 16:56:59 3186 [Note] WSREP: view(view_id(NON_PRIM,14a11e6f,1201) memb {

        93a5b880,0

} joined {

} left {

} partitioned {

        14a11e6f,0

        3e7aa8ab,0

})

2017-05-22 16:56:59 3186 [Note] WSREP: view((empty))

2017-05-22 16:56:59 3186 [Note] WSREP: gcomm: closed

2017-05-22 16:56:59 3186 [Note] WSREP: /usr/sbin/mysqld: Terminated.

WSREP_SST: [ERROR] Parent mysqld process (PID:3186) terminated unexpectedly. (20170522 16:57:00.206)

WSREP_SST: [INFO] Joiner cleanup. rsync PID: 3226 (20170522 16:57:00.211)

WSREP_SST: [INFO] Joiner cleanup done. (20170522 16:57:00.719)
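
Since most of the log above is [Note] output, here is a minimal Python sketch for pulling out only the lines tagged [Warning] or [ERROR]. The function name wsrep_problems is made up for illustration, and the sample lines are copied from the joiner log above. It only filters on the tags and does not interpret the messages.

```python
import re

# Matches the "[Warning]" and "[ERROR]" tags used by mysqld and the SST script.
PROBLEM_TAG = re.compile(r"\[(Warning|ERROR)\]")

def wsrep_problems(lines):
    """Return the log lines that carry a Warning or ERROR tag, in order."""
    return [line.rstrip("\n") for line in lines if PROBLEM_TAG.search(line)]

if __name__ == "__main__":
    sample = [
        "2017-05-22 16:56:59 3186 [Note] WSREP: Requesting state transfer: success, donor: 0",
        "2017-05-22 16:56:59 3186 [Warning] WSREP: 0.0 (db-ha-02): State transfer to 2.0 (db-ha-03) failed: -2 (No such file or directory)",
        "2017-05-22 16:56:59 3186 [ERROR] WSREP: gcs/src/gcs_group.cpp:gcs_group_handle_join_msg():736: Will never receive state. Need to abort.",
        "WSREP_SST: [INFO] Joiner cleanup done. (20170522 16:57:00.719)",
    ]
    for line in wsrep_problems(sample):
        print(line)
```

To run it against a real log, pass an open file object instead of the sample list, for example wsrep_problems(open(path)) with path pointing at the node's error log.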



