Not able to restart mysql on nodes that leave cluster (3 node cluster - 2 nodes mysql stopped)

880 views
Skip to first unread message

quig...@healthsciencessc.org

unread,
Jul 28, 2017, 11:16:47 AM7/28/17
to codership
Hi, 

We recently set up a database cluster for one of our applications. 

It is a 3 node database cluster set up:
Node 1 (started with the parameter --wsrep-new-cluster) : Subnet A
Node 2: Subnet A
Node 3: Subnet B

I have a Virtual IP on subnet A (the applications connecting to the database cluster connect through this) that will failover to Node 2 if something happens to Node 1. This was set up for automatic failover so there is no downtime in the application. Since Node 3 is on a different subnet than the other two, it is used only for backups and with the idea that if something happened to the datacenter where Node 1 and Node 2 live we can manually failover the application to use Node 3. 

Yesterday, Node 1 stopped and Node 3 shutdown too. Node 2 is still up and ws_cluster_size returns 1 and the VIP failover was successful and the VIP is associated with Node 2. I have been unable to restart mysql on Node 1 and Node 3. 

Node 1 Log:

 

170727 12:02:02 [Note] WSREP: (e196056a, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://NODE_2_IP:4567

170727 12:02:02 [Note] WSREP: declaring 570523b9 at tcp://NODE_3_IP:4567 stable

170727 12:02:02 [Note] WSREP: forgetting e2004ea9 (tcp://NODE_2_IP:4567)

170727 12:02:02 [Note] WSREP: (e196056a, 'tcp://0.0.0.0:4567') turning message relay requesting off

170727 12:02:02 [Note] WSREP: Node 570523b9 state prim

170727 12:02:02 [Note] WSREP: view(view_id(PRIM,570523b9,6) memb {

        570523b9,0

        e196056a,0

} joined {

} left {

} partitioned {

        e2004ea9,0

})

170727 12:02:02 [Note] WSREP: save pc into disk

170727 12:02:02 [Note] WSREP: forgetting e2004ea9 (tcp://NODE_2_IP:4567)

170727 12:02:02 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 1, memb_num = 2

170727 12:02:02 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.

170727 12:02:02 [Note] WSREP: STATE EXCHANGE: sent state msg: ee015468-72e4-11e7-8514-478779e3836e

170727 12:02:02 [Note] WSREP: STATE EXCHANGE: got state msg: ee015468-72e4-11e7-8514-478779e3836e from 0 (NODE_3)

170727 12:02:02 [Note] WSREP: STATE EXCHANGE: got state msg: ee015468-72e4-11e7-8514-478779e3836e from 1 (NODE_1)

170727 12:02:02 [Note] WSREP: Quorum results:

        version    = 4,

        component  = PRIMARY,

        conf_id    = 5,

        members    = 2/2 (joined/total),

        act_id     = 12090178,

        last_appl. = 12090147,

        protocols  = 0/7/3 (gcs/repl/appl),

        group UUID = e197148b-366e-11e7-ac22-a2a37d1a5dae

170727 12:02:02 [Note] WSREP: Flow-control interval: [23, 23]

170727 12:02:03 [Note] WSREP: New cluster view: global state: e197148b-366e-11e7-ac22-a2a37d1a5dae:12090178, view# 6: Primary, number of nodes: 2, my index: 1, protocol version 3

170727 12:02:03 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.

170727 12:02:03 [Note] WSREP: REPL Protocols: 7 (3, 2)

170727 12:02:03 [Note] WSREP: Assign initial position for certification: 12090178, protocol version: 3

170727 12:02:03 [Note] WSREP: Service thread queue flushed.

170727 12:02:03 [Note] WSREP: autoinc decrease: 3 -> 2

THD: 7459652, current: 26717405, autoinc: 26717405

170727 12:02:03 [Note] WSREP: autoinc decrease: 3 -> 2

THD: 7459652, current: 21642794, autoinc: 21642794

170727 12:02:03 [Note] WSREP: autoinc decrease: 3 -> 2

THD: 7459653, current: 8481746, autoinc: 8481746

 

170727 12:02:08 [Note] WSREP:  cleaning up e2004ea9 (tcp://NODE_2_IP:4567)

170727 12:02:08 [Note] WSREP: forgetting 570523b9 (tcp://NODE_3_IP:4567)

170727 12:02:08 [Note] WSREP: Node e196056a state prim

170727 12:02:08 [Note] WSREP: view(view_id(PRIM,e196056a,7) memb {

        e196056a,0

} joined {

} left {

} partitioned {

        570523b9,0

})

170727 12:02:08 [Note] WSREP: save pc into disk

170727 12:02:08 [Note] WSREP: forgetting 570523b9 (tcp://NODE_3_IP:4567)

170727 12:02:08 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 1

170727 12:02:08 [Note] WSREP: STATE_EXCHANGE: sent state UUID: f1914216-72e4-11e7-80c5-5b0dec40d614

170727 12:02:08 [Note] WSREP: STATE EXCHANGE: sent state msg: f1914216-72e4-11e7-80c5-5b0dec40d614

170727 12:02:08 [Note] WSREP: STATE EXCHANGE: got state msg: f1914216-72e4-11e7-80c5-5b0dec40d614 from 0 (NODE_1)

170727 12:02:08 [Note] WSREP: Quorum results:

        version    = 4,

        component  = PRIMARY,

        conf_id    = 6,

        members    = 1/1 (joined/total),

        act_id     = 12090208,

        last_appl. = 12090147,

        protocols  = 0/7/3 (gcs/repl/appl),

        group UUID = e197148b-366e-11e7-ac22-a2a37d1a5dae

170727 12:02:08 [Note] WSREP: Flow-control interval: [16, 16]

170727 12:02:08 [Note] WSREP: New cluster view: global state: e197148b-366e-11e7-ac22-a2a37d1a5dae:12090208, view# 7: Primary, number of nodes: 1, my index: 0, protocol version 3

170727 12:02:08 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.

170727 12:02:08 [Note] WSREP: REPL Protocols: 7 (3, 2)

170727 12:02:08 [Note] WSREP: Assign initial position for certification: 12090208, protocol version: 3

170727 12:02:08 [Note] WSREP: Service thread queue flushed.

170727 12:02:08 [Note] WSREP: autoinc decrease: 2 -> 1

THD: 7459662, current: 26717416, autoinc: 26717416

170727 12:02:08 [Note] WSREP: autoinc decrease: 2 -> 1

THD: 7459662, current: 21642806, autoinc: 21642806

170727 12:02:08 [Note] WSREP: autoinc decrease: 2 -> 1

THD: 7459662, current: 8481750, autoinc: 8481750

 

 

170727 12:02:11 [Note] WSREP: autoinc decrease: 3 -> 1

THD: 7459674, current: 26717426, autoinc: 26717426

170727 12:02:11 [Note] WSREP: (e196056a, 'tcp://0.0.0.0:4567') connection established to f33ecf46 tcp://NODE_2_IP:4567

170727 12:02:11 [Note] WSREP: (e196056a, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers:

170727 12:02:12 [Note] WSREP: declaring f33ecf46 at tcp://NODE_2_IP:4567 stable

170727 12:02:12 [Note] WSREP: Node e196056a state prim

170727 12:02:12 [Note] WSREP: view(view_id(PRIM,e196056a,8) memb {

        e196056a,0

        f33ecf46,0

} joined {

} left {

} partitioned {

})

170727 12:02:12 [Note] WSREP: save pc into disk

170727 12:02:12 [Note] WSREP: forgetting 570523b9 (tcp://NODE_3_IP:4567)

170727 12:02:12 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 2

170727 12:02:12 [Note] WSREP: STATE_EXCHANGE: sent state UUID: f39ec4ca-72e4-11e7-91b1-7e3360df7d2c

170727 12:02:12 [Note] WSREP: STATE EXCHANGE: sent state msg: f39ec4ca-72e4-11e7-91b1-7e3360df7d2c

170727 12:02:12 [Note] WSREP: STATE EXCHANGE: got state msg: f39ec4ca-72e4-11e7-91b1-7e3360df7d2c from 0 (NODE_1)

170727 12:02:12 [Note] WSREP: (e196056a, 'tcp://0.0.0.0:4567') connection established to 570523b9 tcp://NODE_3_IP:4567

170727 12:02:12 [Warning] WSREP: discarding established (time wait) 570523b9 (tcp://NODE_3_IP:4567)

170727 12:02:12 [Note] WSREP: STATE EXCHANGE: got state msg: f39ec4ca-72e4-11e7-91b1-7e3360df7d2c from 1 (NODE_2)

170727 12:02:12 [Note] WSREP: Quorum results:

        version    = 4,

        component  = PRIMARY,

        conf_id    = 7,

        members    = 1/2 (joined/total),

        act_id     = 12090282,

        last_appl. = 12090275,

        protocols  = 0/7/3 (gcs/repl/appl),

        group UUID = e197148b-366e-11e7-ac22-a2a37d1a5dae

170727 12:02:12 [Note] WSREP: Flow-control interval: [23, 23]

170727 12:02:12 [Note] WSREP: New cluster view: global state: e197148b-366e-11e7-ac22-a2a37d1a5dae:12090282, view# 8: Primary, number of nodes: 2, my index: 0, protocol version 3

170727 12:02:12 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.

170727 12:02:12 [Note] WSREP: REPL Protocols: 7 (3, 2)

170727 12:02:12 [Note] WSREP: Assign initial position for certification: 12090282, protocol version: 3

170727 12:02:12 [Note] WSREP: Service thread queue flushed.

170727 12:02:13 [Note] WSREP: Member 1.0 (NODE_2) requested state transfer from '*any*'. Selected 0.0 (NODE_1)(SYNCED) as donor.

170727 12:02:13 [Note] WSREP: Shifting SYNCED -> DONOR/DESYNCED (TO: 12090285)

170727 12:02:13 [Note] WSREP: IST request: e197148b-366e-11e7-ac22-a2a37d1a5dae:12090178-12090282|tcp://NODE_2_IP:4568

170727 12:02:13 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.

170727 12:02:13 [Note] WSREP: Running: 'wsrep_sst_rsync --role 'donor' --address 'NODE_2_IP:4444/rsync_sst' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' --gtid 'e197148b-366e-11e7-ac22-a2a37d1a5dae:12090178' --bypass'

170727 12:02:13 [Note] WSREP: sst_donor_thread signaled with 0

WSREP_SST: [INFO] Bypassing state dump. (20170727 12:02:13.632)

170727 12:02:13 [Note] WSREP: async IST sender starting to serve tcp://NODE_2_IP:4568 sending 12090179-12090282

170727 12:02:13 [Note] WSREP: 0.0 (NODE_1): State transfer to 1.0 (NODE_2) complete.

170727 12:02:13 [Note] WSREP: Shifting DONOR/DESYNCED -> JOINED (TO: 12090285)

170727 12:02:13 [Note] WSREP: Member 0.0 (NODE_1) synced with group.

170727 12:02:13 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 12090285)

170727 12:02:13 [Note] WSREP: Synchronized with group, ready for connections

170727 12:02:13 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.

170727 12:02:15 [Note] WSREP: (e196056a, 'tcp://0.0.0.0:4567') turning message relay requesting off

170727 12:02:15 [Note] WSREP:  cleaning up 570523b9 (tcp://NODE_3_IP:4567)

170727 12:02:15 [Note] WSREP: autoinc decrease: 3 -> 2

THD: 7459680, current: 26717429, autoinc: 26717429

170727 12:02:15 [Note] WSREP: autoinc decrease: 1 -> 2

THD: 7459680, current: 21642818, autoinc: 21642818

170727 12:02:15 [Note] WSREP: autoinc decrease: 1 -> 2

THD: 7459681, current: 8481754, autoinc: 8481754

170727 12:02:23 [Note] WSREP: 1.0 (NODE_2): State transfer from 0.0 (NODE_1) complete.

170727 12:02:23 [Note] WSREP: async IST sender served

170727 12:02:23 [Note] WSREP: Member 1.0 (NODE_2) synced with group.

170727 12:02:31 [Note] /usr/sbin/mysqld: Normal shutdown

 

170727 12:02:31 [Note] WSREP: Stop replication

170727 12:02:31 [Note] WSREP: Provider disconnect

170727 12:02:31 [Note] WSREP: Closing send monitor...

170727 12:02:31 [Note] WSREP: Closed send monitor.

170727 12:02:31 [Note] WSREP: gcomm: terminating thread

170727 12:02:31 [Note] WSREP: gcomm: joining thread

170727 12:02:31 [Note] WSREP: gcomm: closing backend

170727 12:02:32 [Note] WSREP: Before Lock_thread_count

170727 12:02:35 [Note] WSREP: (e196056a, 'tcp://0.0.0.0:4567') connection to peer f33ecf46 with addr tcp://NODE_2_IP:4567 timed out, no messages seen in PT3S

170727 12:02:35 [Note] WSREP: (e196056a, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://NODE_2_IP:4567

170727 12:02:36 [Note] WSREP: (e196056a, 'tcp://0.0.0.0:4567') reconnecting to f33ecf46 (tcp://NODE_2_IP:4567), attempt 0

170727 12:02:37 [Note] WSREP: evs::proto(e196056a, LEAVING, view_id(REG,e196056a,8)) suspecting node: f33ecf46

170727 12:02:37 [Note] WSREP: evs::proto(e196056a, LEAVING, view_id(REG,e196056a,8)) suspected node without join message, declaring inactive

170727 12:02:37 [Note] WSREP: view(view_id(NON_PRIM,e196056a,8) memb {

        e196056a,0

} joined {

} left {

} partitioned {

        f33ecf46,0

})

170727 12:02:37 [Note] WSREP: view((empty))

170727 12:02:37 [Note] WSREP: gcomm: closed

170727 12:02:37 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1

170727 12:02:37 [Note] WSREP: Flow-control interval: [16, 16]

170727 12:02:37 [Note] WSREP: Received NON-PRIMARY.

170727 12:02:37 [Note] WSREP: Shifting SYNCED -> OPEN (TO: 12090475)

170727 12:02:37 [Note] WSREP: Received self-leave message.

170727 12:02:37 [Note] WSREP: Flow-control interval: [0, 0]

170727 12:02:37 [Note] WSREP: Received SELF-LEAVE. Closing connection.

170727 12:02:37 [Note] WSREP: Shifting OPEN -> CLOSED (TO: 12090475)

170727 12:02:37 [Note] WSREP: RECV thread exiting 0: Success

170727 12:02:37 [Note] WSREP: New cluster view: global state: e197148b-366e-11e7-ac22-a2a37d1a5dae:12090475, view# -1: non-Primary, number of nodes: 1, my index: 0, protocol version 3

170727 12:02:37 [Note] WSREP: Setting wsrep_ready to 0

170727 12:02:37 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.

170727 12:02:37 [Note] WSREP: New cluster view: global state: e197148b-366e-11e7-ac22-a2a37d1a5dae:12090475, view# -1: non-Primary, number of nodes: 0, my index: -1, protocol version 3

170727 12:02:37 [Note] WSREP: Setting wsrep_ready to 0

170727 12:02:37 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.

170727 12:02:37 [Note] WSREP: applier thread exiting (code:0)

170727 12:02:37 [Note] WSREP: closing applier 2

170727 12:02:37 [Note] WSREP: recv_thread() joined.

170727 12:02:37 [Note] WSREP: Closing replication queue.

170727 12:02:37 [Note] WSREP: wsrep running threads now: 1

170727 12:02:37 [Note] WSREP: Closing slave action queue.

170727 12:02:39 [Note] WSREP: waiting for client connections to close: 1

170727 12:02:39 [Note] WSREP: closing wsrep thread 1

170727 12:02:39 [Note] WSREP: WSREP rollback thread wakes for signal

170727 12:02:39 [Note] WSREP: WSREP rollback thread has empty abort queue

170727 12:02:39 [Note] WSREP: rollbacker thread exiting

170727 12:02:39 [Note] WSREP: wsrep running threads now: 0

170727 12:02:39 [Note] Event Scheduler: Purging the queue. 0 events

170727 12:02:39 [Note] WSREP: dtor state: CLOSED

170727 12:02:39 [Note] WSREP: mon: entered 12090475 oooe fraction 0 oool fraction 0

170727 12:02:39 [Note] WSREP: mon: entered 12090475 oooe fraction 0.00089922 oool fraction 9.09807e-07

170727 12:02:39 [Note] WSREP: mon: entered 12248941 oooe fraction 0 oool fraction 0

170727 12:02:39 [Note] WSREP: cert index usage at exit 0

170727 12:02:39 [Note] WSREP: cert trx map usage at exit 126

170727 12:02:39 [Note] WSREP: deps set usage at exit 0

170727 12:02:39 [Note] WSREP: avg deps dist 1.61573

170727 12:02:39 [Note] WSREP: avg cert interval 0.0113644

170727 12:02:39 [Note] WSREP: cert index size 339

170727 12:02:39 [Note] WSREP: Service thread queue flushed.

170727 12:02:39 [Note] WSREP: wsdb trx map usage 0 conn query map usage 0

170727 12:02:39 [Note] WSREP: MemPool(LocalTrxHandle): hit ratio: 0.999984, misses: 196, in use: 0, in pool: 196

170727 12:02:39 [Note] WSREP: MemPool(SlaveTrxHandle): hit ratio: 0.852064, misses: 129, in use: 0, in pool: 129

170727 12:02:39 [Note] WSREP: Shifting CLOSED -> DESTROYED (TO: 12090475)

170727 12:02:40 [Note] WSREP: Flushing memory map to disk...

170727 12:02:44  InnoDB: Starting shutdown...

170727 12:02:49  InnoDB: Shutdown completed; log sequence number 158493979344

170727 12:02:49 [Note] /usr/sbin/mysqld: Shutdown complete

 

170727 12:02:50 mysqld_safe mysqld from pid file /var/lib/mysql/NODE_1-v.pid ended

170727 12:02:51 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql

170727 12:02:51 mysqld_safe WSREP: Running position recovery with --log_error='/var/lib/mysql/wsrep_recovery.GxHJfb' --pid-file='/var/lib/mysql/NODE_1-v-recover.pid'

170727 12:02:51 [Note] /usr/sbin/mysqld (mysqld 5.5.57-MariaDB-wsrep) starting as process 107475 ...

170727 14:28:03 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql

170727 14:28:03 mysqld_safe WSREP: Running position recovery with --log_error='/var/lib/mysql/wsrep_recovery.luw7yX' --pid-file='/var/lib/mysql/NODE_1-v-recover.pid'

170727 14:28:03 [Note] /usr/sbin/mysqld (mysqld 5.5.57-MariaDB-wsrep) starting as process 1933 ...

170727 14:34:25 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql

170727 14:34:25 mysqld_safe WSREP: Running position recovery with --log_error='/var/lib/mysql/wsrep_recovery.hYxf3l' --pid-file='/var/lib/mysql/NODE_1-v-recover.pid'

170727 14:34:25 [Note] /usr/sbin/mysqld (mysqld 5.5.57-MariaDB-wsrep) starting as process 3509 ...

170727 14:34:30 mysqld_safe WSREP: Recovered position e197148b-366e-11e7-ac22-a2a37d1a5dae:12090475

170727 14:34:30 [Note] WSREP: wsrep_start_position var submitted: 'e197148b-366e-11e7-ac22-a2a37d1a5dae:12090475'

170727 14:34:30 [Note] /usr/sbin/mysqld (mysqld 5.5.57-MariaDB-wsrep) starting as process 3574 ...

170727 14:34:30 [Note] WSREP: Setting wsrep_ready to 0

170727 14:34:30 [Note] WSREP: Read nil XID from storage engines, skipping position init

170727 14:34:30 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib64/galera/libgalera_smm.so'

170727 14:34:30 [Note] WSREP: wsrep_load(): Galera 25.3.20(r3703) by Codership Oy <in...@codership.com> loaded successfully.

170727 14:34:30 [Note] WSREP: CRC-32C: using hardware acceleration.

170727 14:34:30 [Note] WSREP: Found saved state: e197148b-366e-11e7-ac22-a2a37d1a5dae:12090475, safe_to_bootsrap: 0

170727 14:34:30 [Note] WSREP: Passing config to GCS: base_dir = /var/lib/mysql/; base_host = NODE_1_IP; base_port = 4567; cert.log_conflicts = no; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.recover = no; gcache.size = 128M; gcomm.thread_prio = ; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0; pc.announce_timeout = PT3S; pc.checksum = false; pc.

170727 14:34:30 [Note] WSREP: GCache history reset: old(e197148b-366e-11e7-ac22-a2a37d1a5dae:0) -> new(e197148b-366e-11e7-ac22-a2a37d1a5dae:12090475)

170727 14:34:30 [Note] WSREP: Assign initial position for certification: 12090475, protocol version: -1

170727 14:34:30 [Note] WSREP: wsrep_sst_grab()

170727 14:34:30 [Note] WSREP: Start replication

170727 14:34:30 [Note] WSREP: Setting initial position to e197148b-366e-11e7-ac22-a2a37d1a5dae:12090475

170727 14:34:30 [Note] WSREP: protonet asio version 0

170727 14:34:30 [Note] WSREP: Using CRC-32C for message checksums.

170727 14:34:30 [Note] WSREP: backend: asio

170727 14:34:30 [Note] WSREP: gcomm thread scheduling priority set to other:0

170727 14:34:30 [Warning] WSREP: access file(/var/lib/mysql//gvwstate.dat) failed(No such file or directory)

170727 14:34:30 [Note] WSREP: restore pc from disk failed

170727 14:34:30 [Note] WSREP: GMCast version 0

170727 14:34:30 [Warning] WSREP: Failed to resolve tcp:// NODE_2_IP:4567

170727 14:34:30 [Warning] WSREP: Failed to resolve tcp:// NODE_3_IP:4567

170727 14:34:30 [Note] WSREP: (3a9c4360, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567

170727 14:34:30 [Note] WSREP: (3a9c4360, 'tcp://0.0.0.0:4567') multicast: , ttl: 1

170727 14:34:30 [Note] WSREP: EVS version 0

170727 14:34:30 [Note] WSREP: gcomm: connecting to group 'CLUSTER_NAME', peer 'NODE_1_IP:, NODE_2_IP:, NODE_3_IP:'

170727 14:34:30 [Note] WSREP: (3a9c4360, 'tcp://0.0.0.0:4567') connection established to 3a9c4360 tcp://NODE_1_IP:4567

170727 14:34:30 [Warning] WSREP: (3a9c4360, 'tcp://0.0.0.0:4567') address 'tcp://NODE_1_IP:4567' points to own listening address, blacklisting

170727 14:34:33 [Note] WSREP: (3a9c4360, 'tcp://0.0.0.0:4567') connection to peer 3a9c4360 with addr tcp://NODE_1_IP:4567 timed out, no messages seen in PT3S

170727 14:34:33 [Warning] WSREP: no nodes coming from prim view, prim not possible

170727 14:34:33 [Note] WSREP: view(view_id(NON_PRIM,3a9c4360,1) memb {

        3a9c4360,0

} joined {

} left {

} partitioned {

})

170727 14:34:34 [Warning] WSREP: last inactive check more than PT1.5S ago (PT3.50126S), skipping check

170727 14:35:03 [Note] WSREP: view((empty))

170727 14:35:03 [ERROR] WSREP: failed to open gcomm backend connection: 110: failed to reach primary view: 110 (Connection timed out)

         at gcomm/src/pc.cpp:connect():158

170727 14:35:03 [ERROR] WSREP: gcs/src/gcs_core.cpp:gcs_core_open():208: Failed to open backend connection: -110 (Connection timed out)

170727 14:35:03 [ERROR] WSREP: gcs/src/gcs.cpp:gcs_open():1404: Failed to open channel 'CLUSTER_NAME' at 'gcomm://NODE_1_IP, NODE_2_IPNODE_3_IP': -110 (Connection timed out)

170727 14:35:03 [ERROR] WSREP: gcs connect failed: Connection timed out

170727 14:35:03 [ERROR] WSREP: wsrep::connect(gcomm://NODE_1_IPNODE_2_IPNODE_3_IP) failed: 7

170727 14:35:03 [ERROR] Aborting

 

170727 14:35:03 [Note] WSREP: waiting for client connections to close: 0

170727 14:35:03 [Note] WSREP: Service disconnected.

170727 14:35:04 [Note] WSREP: Some threads may fail to exit.

170727 14:35:04 [Note] /usr/sbin/mysqld: Shutdown complete

 

170727 14:35:04 mysqld_safe mysqld from pid file /var/lib/mysql/NODE_1-v.pid ended

 


Node 3 Log:

170727 12:02:02 [Note] WSREP: declaring e196056a at tcp://NODE_1_IP:4567 stable

170727 12:02:02 [Note] WSREP: forgetting e2004ea9 (tcp://NODE_2_IP:4567)

170727 12:02:02 [Note] WSREP: Node 570523b9 state prim

170727 12:02:02 [Note] WSREP: view(view_id(PRIM,570523b9,6) memb {

        570523b9,0

        e196056a,0

} joined {

} left {

} partitioned {

        e2004ea9,0

})

170727 12:02:02 [Note] WSREP: save pc into disk

170727 12:02:02 [Note] WSREP: forgetting e2004ea9 (tcp://NODE_2_IP:4567)

170727 12:02:02 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 2

170727 12:02:02 [Note] WSREP: STATE_EXCHANGE: sent state UUID: ee015468-72e4-11e7-8514-478779e3836e

170727 12:02:02 [Note] WSREP: STATE EXCHANGE: sent state msg: ee015468-72e4-11e7-8514-478779e3836e

170727 12:02:02 [Note] WSREP: STATE EXCHANGE: got state msg: ee015468-72e4-11e7-8514-478779e3836e from 0 (NODE_3)

170727 12:02:02 [Note] WSREP: STATE EXCHANGE: got state msg: ee015468-72e4-11e7-8514-478779e3836e from 1 (NODE_1)

170727 12:02:02 [Note] WSREP: Quorum results:

        version    = 4,

        component  = PRIMARY,

        conf_id    = 5,

        members    = 2/2 (joined/total),

        act_id     = 12090178,

        last_appl. = 12090147,

        protocols  = 0/7/3 (gcs/repl/appl),

        group UUID = e197148b-366e-11e7-ac22-a2a37d1a5dae

170727 12:02:02 [Note] WSREP: Flow-control interval: [23, 23]

170727 12:02:02 [Note] WSREP: New cluster view: global state: e197148b-366e-11e7-ac22-a2a37d1a5dae:12090178, view# 6: Primary, number of nodes: 2, my index: 0, protocol version 3

170727 12:02:02 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.

170727 12:02:02 [Note] WSREP: REPL Protocols: 7 (3, 2)

170727 12:02:02 [Note] WSREP: Assign initial position for certification: 12090178, protocol version: 3

170727 12:02:02 [Note] WSREP: Service thread queue flushed.

170727 12:02:07 [Note] /usr/sbin/mysqld: Normal shutdown

 

170727 12:02:07 [Note] WSREP: Stop replication

170727 12:02:07 [Note] WSREP: Provider disconnect

170727 12:02:07 [Note] WSREP: Closing send monitor...

170727 12:02:07 [Note] WSREP: Closed send monitor.

170727 12:02:07 [Note] WSREP: gcomm: terminating thread

170727 12:02:07 [Note] WSREP: gcomm: joining thread

170727 12:02:07 [Note] WSREP: gcomm: closing backend

170727 12:02:08 [Note] WSREP:  cleaning up e2004ea9 (tcp://NODE_2_IP:4567)

170727 12:02:11 [Note] WSREP: (570523b9, 'tcp://0.0.0.0:4567') connection to peer e196056a with addr tcp://NODE_1_IP:4567 timed out, no messages seen in PT3S

170727 12:02:11 [Note] WSREP: (570523b9, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://NODE_1_IP:4567

170727 12:02:12 [Note] WSREP: (570523b9, 'tcp://0.0.0.0:4567') reconnecting to e196056a (tcp://NODE_1_IP:4567), attempt 0

170727 12:02:13 [Note] WSREP: evs::proto(570523b9, LEAVING, view_id(REG,570523b9,6)) suspecting node: e196056a

170727 12:02:13 [Note] WSREP: evs::proto(570523b9, LEAVING, view_id(REG,570523b9,6)) suspected node without join message, declaring inactive

170727 12:02:13 [Note] WSREP: view(view_id(NON_PRIM,570523b9,6) memb {

        570523b9,0

} joined {

} left {

} partitioned {

        e196056a,0

})

170727 12:02:13 [Note] WSREP: view((empty))

170727 12:02:13 [Note] WSREP: gcomm: closed

170727 12:02:13 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1

170727 12:02:13 [Note] WSREP: Flow-control interval: [16, 16]

170727 12:02:13 [Note] WSREP: Received NON-PRIMARY.

170727 12:02:13 [Note] WSREP: Shifting SYNCED -> OPEN (TO: 12090208)

170727 12:02:13 [Note] WSREP: Received self-leave message.

170727 12:02:13 [Note] WSREP: Flow-control interval: [0, 0]

170727 12:02:13 [Note] WSREP: Received SELF-LEAVE. Closing connection.

170727 12:02:13 [Note] WSREP: Shifting OPEN -> CLOSED (TO: 12090208)

170727 12:02:13 [Note] WSREP: New cluster view: global state: e197148b-366e-11e7-ac22-a2a37d1a5dae:12090208, view# -1: non-Primary, number of nodes: 1, my index: 0, protocol version 3

170727 12:02:13 [Note] WSREP: Setting wsrep_ready to 0

170727 12:02:13 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.

170727 12:02:13 [Note] WSREP: RECV thread exiting 0: Success

170727 12:02:13 [Note] WSREP: New cluster view: global state: e197148b-366e-11e7-ac22-a2a37d1a5dae:12090208, view# -1: non-Primary, number of nodes: 0, my index: -1, protocol version 3

170727 12:02:13 [Note] WSREP: Setting wsrep_ready to 0

170727 12:02:13 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.

170727 12:02:13 [Note] WSREP: applier thread exiting (code:0)

170727 12:02:13 [Note] WSREP: closing applier 2

170727 12:02:13 [Note] WSREP: wsrep running threads now: 1

170727 12:02:13 [Note] WSREP: recv_thread() joined.

170727 12:02:13 [Note] WSREP: Closing replication queue.

170727 12:02:13 [Note] WSREP: Closing slave action queue.

170727 12:02:15 [Note] WSREP: waiting for client connections to close: 1

170727 12:02:15 [Note] WSREP: closing wsrep thread 1

170727 12:02:15 [Note] WSREP: WSREP rollback thread wakes for signal

170727 12:02:15 [Note] WSREP: WSREP rollback thread has empty abort queue

170727 12:02:15 [Note] WSREP: rollbacker thread exiting

170727 12:02:15 [Note] WSREP: wsrep running threads now: 0

170727 12:02:15 [Note] WSREP: Before Lock_thread_count

170727 12:02:15 [Note] Event Scheduler: Purging the queue. 0 events

170727 12:02:15 [Note] WSREP: dtor state: CLOSED

170727 12:02:15 [Note] WSREP: mon: entered 12090208 oooe fraction 0 oool fraction 0

170727 12:02:15 [Note] WSREP: mon: entered 12090208 oooe fraction 0 oool fraction 0

170727 12:02:15 [Note] WSREP: mon: entered 12248667 oooe fraction 0 oool fraction 8.16415e-08

170727 12:02:15 [Note] WSREP: cert index usage at exit 0

170727 12:02:15 [Note] WSREP: cert trx map usage at exit 30

170727 12:02:15 [Note] WSREP: deps set usage at exit 0

170727 12:02:15 [Note] WSREP: avg deps dist 1.61574

170727 12:02:15 [Note] WSREP: avg cert interval 0.0113642

170727 12:02:15 [Note] WSREP: cert index size 57

170727 12:02:15 [Note] WSREP: Service thread queue flushed.

170727 12:02:15 [Note] WSREP: wsdb trx map usage 0 conn query map usage 0

170727 12:02:15 [Note] WSREP: MemPool(LocalTrxHandle): hit ratio: 0, misses: 0, in use: 0, in pool: 0

170727 12:02:15 [Note] WSREP: MemPool(SlaveTrxHandle): hit ratio: 0.999984, misses: 194, in use: 0, in pool: 194

170727 12:02:15 [Note] WSREP: Shifting CLOSED -> DESTROYED (TO: 12090208)

170727 12:02:15 [Note] WSREP: Flushing memory map to disk...

170727 12:02:15  InnoDB: Starting shutdown...

170727 12:02:18  InnoDB: Shutdown completed; log sequence number 157221199329

170727 12:02:18 [Note] /usr/sbin/mysqld: Shutdown complete

 

170727 12:02:18 mysqld_safe mysqld from pid file /var/lib/mysql/NODE_3-v.pid ended

170727 12:02:19 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql

170727 12:02:19 mysqld_safe WSREP: Running position recovery with --log_error='/var/lib/mysql/wsrep_recovery.oFMhDx' --pid-file='/var/lib/mysql/NODE_3-v-recover.pid'

170727 12:02:19 [Note] /usr/sbin/mysqld (mysqld 5.5.57-MariaDB-wsrep) starting as process 63868 ...

170727 16:05:09 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql

170727 16:05:09 mysqld_safe WSREP: Running position recovery with --log_error='/var/lib/mysql/wsrep_recovery.W4RzUH' --pid-file='/var/lib/mysql/NODE_3-v-recover.pid'

170727 16:05:09 [Note] /usr/sbin/mysqld (mysqld 5.5.57-MariaDB-wsrep) starting as process 90310 ...

170727 16:32:37 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql

170727 16:32:37 mysqld_safe WSREP: Running position recovery with --log_error='/var/lib/mysql/wsrep_recovery.GOG9b9' --pid-file='/var/lib/mysql/NODE_3-v-recover.pid'

170727 16:32:37 [Note] /usr/sbin/mysqld (mysqld 5.5.57-MariaDB-wsrep) starting as process 94081 ...

170727 16:44:58 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql

170727 16:44:58 mysqld_safe WSREP: Running position recovery with --log_error='/var/lib/mysql/wsrep_recovery.3hdsNb' --pid-file='/var/lib/mysql/NODE_3-v-recover.pid'

170727 16:44:58 [Note] /usr/sbin/mysqld (mysqld 5.5.57-MariaDB-wsrep) starting as process 95689 ...

170727 18:08:13 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql

170727 18:08:13 mysqld_safe WSREP: Running position recovery with --log_error='/var/lib/mysql/wsrep_recovery.REpH2A' --pid-file='/var/lib/mysql/NODE_3-v-recover.pid'

170727 18:08:13 [Note] /usr/sbin/mysqld (mysqld 5.5.57-MariaDB-wsrep) starting as process 105345 ...

170727 18:08:16 mysqld_safe WSREP: Recovered position e197148b-366e-11e7-ac22-a2a37d1a5dae:12090208

170727 18:08:16 [Note] WSREP: wsrep_start_position var submitted: 'e197148b-366e-11e7-ac22-a2a37d1a5dae:12090208'

170727 18:08:16 [Note] /usr/sbin/mysqld (mysqld 5.5.57-MariaDB-wsrep) starting as process 105393 ...

170727 18:08:16 [Note] WSREP: Setting wsrep_ready to 0

170727 18:08:16 [Note] WSREP: Read nil XID from storage engines, skipping position init

170727 18:08:16 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib64/galera/libgalera_smm.so'

170727 18:08:16 [Note] WSREP: wsrep_load(): Galera 25.3.20(r3703) by Codership Oy <in...@codership.com> loaded successfully.

170727 18:08:16 [Note] WSREP: CRC-32C: using hardware acceleration.

170727 18:08:16 [Note] WSREP: Found saved state: e197148b-366e-11e7-ac22-a2a37d1a5dae:12090208, safe_to_bootsrap: 0

170727 18:08:16 [Note] WSREP: Passing config to GCS: base_dir = /var/lib/mysql/; base_host = NODE_3_IP; base_port = 4567; cert.log_conflicts = no; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.recover = no; gcache.size = 128M; gcomm.thread_prio = ; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0; pc.announce_timeout = PT3S; pc.checksum = false; pc

170727 18:08:16 [Note] WSREP: GCache history reset: old(e197148b-366e-11e7-ac22-a2a37d1a5dae:0) -> new(e197148b-366e-11e7-ac22-a2a37d1a5dae:12090208)

170727 18:08:16 [Note] WSREP: Assign initial position for certification: 12090208, protocol version: -1

170727 18:08:16 [Note] WSREP: wsrep_sst_grab()

170727 18:08:16 [Note] WSREP: Start replication

170727 18:08:16 [Note] WSREP: Setting initial position to e197148b-366e-11e7-ac22-a2a37d1a5dae:12090208

170727 18:08:16 [Note] WSREP: protonet asio version 0

170727 18:08:16 [Note] WSREP: Using CRC-32C for message checksums.

170727 18:08:16 [Note] WSREP: backend: asio

170727 18:08:16 [Note] WSREP: gcomm thread scheduling priority set to other:0

170727 18:08:16 [Warning] WSREP: access file(/var/lib/mysql//gvwstate.dat) failed(No such file or directory)

170727 18:08:16 [Note] WSREP: restore pc from disk failed

170727 18:08:16 [Note] WSREP: GMCast version 0

170727 18:08:16 [Warning] WSREP: Failed to resolve tcp:// NODE_2_IP:4567

170727 18:08:16 [Warning] WSREP: Failed to resolve tcp:// NODE_3_IP:4567

170727 18:08:16 [Note] WSREP: (173284dd, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567

170727 18:08:16 [Note] WSREP: (173284dd, 'tcp://0.0.0.0:4567') multicast: , ttl: 1

170727 18:08:16 [Note] WSREP: EVS version 0

170727 18:08:16 [Note] WSREP: gcomm: connecting to group 'CLUSTER_NAME', peer 'NODE_1_IP:, NODE_2_IP:, NODE_3_IP:'

170727 18:08:19 [Warning] WSREP: no nodes coming from prim view, prim not possible

170727 18:08:19 [Note] WSREP: view(view_id(NON_PRIM,173284dd,1) memb {

        173284dd,0

} joined {

} left {

} partitioned {

})

170727 18:08:19 [Warning] WSREP: last inactive check more than PT1.5S ago (PT3.50322S), skipping check

170727 18:08:49 [Note] WSREP: view((empty))

170727 18:08:49 [ERROR] WSREP: failed to open gcomm backend connection: 110: failed to reach primary view: 110 (Connection timed out)

         at gcomm/src/pc.cpp:connect():158

170727 18:08:49 [ERROR] WSREP: gcs/src/gcs_core.cpp:gcs_core_open():208: Failed to open backend connection: -110 (Connection timed out)

170727 18:08:49 [ERROR] WSREP: gcs/src/gcs.cpp:gcs_open():1404: Failed to open channel 'CLUSTER_NAME' at 'gcomm://NODE_1_IP, NODE_2_IP, NODE_3_IP': -110 (Connection timed out)

170727 18:08:49 [ERROR] WSREP: gcs connect failed: Connection timed out

170727 18:08:49 [ERROR] WSREP: wsrep::connect(gcomm://NODE_1_IP, NODE_2_IP, NODE_3_IP) failed: 7

170727 18:08:49 [ERROR] Aborting

 

170727 18:08:49 [Note] WSREP: waiting for client connections to close: 0

170727 18:08:49 [Note] WSREP: Service disconnected.

170727 18:08:50 [Note] WSREP: Some threads may fail to exit.

170727 18:08:50 [Note] /usr/sbin/mysqld: Shutdown complete

 

170727 18:08:50 mysqld_safe mysqld from pid file /var/lib/mysql/NODE_3-v.pid ended



We want to get NODE 3 up and connected to the cluster but have been unsuccessful. In the log above, it saying there is connection timeout. I'd appreciate any help with this. I have not been able to figure out why I can't rejoin the Node 1 and Node 2 back to the cluster.


Thanks,


Erin Quigley

Systems Administrator, Biomedical Informatics Center

Medical University South Carolina

Erin Quigley

unread,
Jul 28, 2017, 12:08:00 PM7/28/17
to codership
This is the log file for NODE 2:

170727 12:02:02 [Note] /usr/sbin/mysqld: Normal shutdown

 

170727 12:02:02 [Note] WSREP: Stop replication

170727 12:02:02 [Note] WSREP: Provider disconnect

170727 12:02:02 [Note] WSREP: Closing send monitor...

170727 12:02:02 [Note] WSREP: Closed send monitor.

170727 12:02:02 [Note] WSREP: gcomm: terminating thread

170727 12:02:02 [Note] WSREP: gcomm: joining thread

170727 12:02:02 [Note] WSREP: gcomm: closing backend

170727 12:02:02 [Note] WSREP: view(view_id(NON_PRIM,570523b9,5) memb {

     e2004ea9,0

} joined {

} left {

} partitioned {

     570523b9,0

     e196056a,0

})

170727 12:02:02 [Note] WSREP: view((empty))

170727 12:02:02 [Note] WSREP: gcomm: closed

170727 12:02:02 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1

170727 12:02:02 [Note] WSREP: Flow-control interval: [16, 16]

170727 12:02:02 [Note] WSREP: Received NON-PRIMARY.

170727 12:02:02 [Note] WSREP: Shifting SYNCED -> OPEN (TO: 12090178)

170727 12:02:02 [Note] WSREP: Received self-leave message.

170727 12:02:02 [Note] WSREP: Flow-control interval: [0, 0]

170727 12:02:02 [Note] WSREP: Received SELF-LEAVE. Closing connection.

170727 12:02:02 [Note] WSREP: Shifting OPEN -> CLOSED (TO: 12090178)

170727 12:02:02 [Note] WSREP: RECV thread exiting 0: Success

170727 12:02:02 [Note] WSREP: New cluster view: global state: e197148b-366e-11e7-ac22-a2a37d1a5dae:12090178, view# -1: non-Primary, number of nodes: 1, my index: 0, protocol version 3

170727 12:02:02 [Note] WSREP: Setting wsrep_ready to 0

170727 12:02:02 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.

170727 12:02:02 [Note] WSREP: recv_thread() joined.

170727 12:02:02 [Note] WSREP: Closing replication queue.

170727 12:02:02 [Note] WSREP: Closing slave action queue.

170727 12:02:02 [Note] WSREP: New cluster view: global state: e197148b-366e-11e7-ac22-a2a37d1a5dae:12090178, view# -1: non-Primary, number of nodes: 0, my index: -1, protocol version 3

170727 12:02:02 [Note] WSREP: Setting wsrep_ready to 0

170727 12:02:02 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.

170727 12:02:02 [Note] WSREP: applier thread exiting (code:0)

170727 12:02:02 [Note] WSREP: closing applier 2

170727 12:02:02 [Note] WSREP: wsrep running threads now: 1

170727 12:02:02 [Note] WSREP: Before Lock_thread_count

170727 12:02:04 [Note] WSREP: waiting for client connections to close: 1

170727 12:02:04 [Note] WSREP: closing wsrep thread 1

170727 12:02:04 [Note] WSREP: WSREP rollback thread wakes for signal

170727 12:02:04 [Note] WSREP: WSREP rollback thread has empty abort queue

170727 12:02:04 [Note] WSREP: rollbacker thread exiting

170727 12:02:04 [Note] WSREP: wsrep running threads now: 0

170727 12:02:04 [Note] Event Scheduler: Purging the queue. 0 events

170727 12:02:04 [Note] WSREP: dtor state: CLOSED

170727 12:02:04 [Note] WSREP: mon: entered 12068252 oooe fraction 0 oool fraction 0

170727 12:02:04 [Note] WSREP: mon: entered 12068252 oooe fraction 6.62896e-07 oool fraction 0

170727 12:02:04 [Note] WSREP: mon: entered 12210097 oooe fraction 0 oool fraction 0

170727 12:02:04 [Note] WSREP: cert index usage at exit 0

170727 12:02:04 [Note] WSREP: cert trx map usage at exit 31

170727 12:02:04 [Note] WSREP: deps set usage at exit 0

170727 12:02:04 [Note] WSREP: avg deps dist 1.60182

170727 12:02:04 [Note] WSREP: avg cert interval 0.0113849

170727 12:02:04 [Note] WSREP: cert index size 278

170727 12:02:04 [Note] WSREP: Service thread queue flushed.

170727 12:02:04 [Note] WSREP: wsdb trx map usage 0 conn query map usage 0

170727 12:02:04 [Note] WSREP: MemPool(LocalTrxHandle): hit ratio: 0.852064, misses: 129, in use: 0, in pool: 129

170727 12:02:04 [Note] WSREP: MemPool(SlaveTrxHandle): hit ratio: 0.999984, misses: 194, in use: 0, in pool: 194

170727 12:02:04 [Note] WSREP: Shifting CLOSED -> DESTROYED (TO: 12090178)

170727 12:02:04 [Note] WSREP: Flushing memory map to disk...

170727 12:02:04  InnoDB: Starting shutdown...

170727 12:02:05  InnoDB: Waiting for 200 pages to be flushed

170727 12:02:06  InnoDB: Shutdown completed; log sequence number 157096526824

170727 12:02:06 [Note] /usr/sbin/mysqld: Shutdown complete

 

170727 12:02:06 mysqld_safe mysqld from pid file /var/lib/mysql/NODE_2-v.pid ended

170727 12:02:08 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql

170727 12:02:08 mysqld_safe WSREP: Running position recovery with --log_error='/var/lib/mysql/wsrep_recovery.xhDIHJ' --pid-file='/var/lib/mysql/NODE_2-v-recover.pid'

170727 12:02:08 [Note] /usr/sbin/mysqld (mysqld 5.5.57-MariaDB-wsrep) starting as process 118718 ...

170727 12:02:11 mysqld_safe WSREP: Recovered position e197148b-366e-11e7-ac22-a2a37d1a5dae:12090178

170727 12:02:11 [Note] WSREP: wsrep_start_position var submitted: 'e197148b-366e-11e7-ac22-a2a37d1a5dae:12090178'

170727 12:02:11 [Note] /usr/sbin/mysqld (mysqld 5.5.57-MariaDB-wsrep) starting as process 118768 ...

170727 12:02:11 [Note] WSREP: Setting wsrep_ready to 0

170727 12:02:11 [Note] WSREP: Read nil XID from storage engines, skipping position init

170727 12:02:11 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib64/galera/libgalera_smm.so'

170727 12:02:11 [Note] WSREP: wsrep_load(): Galera 25.3.20(r3703) by Codership Oy <in...@codership.com> loaded successfully.

170727 12:02:11 [Note] WSREP: CRC-32C: using hardware acceleration.

170727 12:02:11 [Note] WSREP: Found saved state: e197148b-366e-11e7-ac22-a2a37d1a5dae:12090178, safe_to_bootsrap: 0

170727 12:02:11 [Note] WSREP: Passing config to GCS: base_dir = /var/lib/mysql/; base_host = NODE_2_IP; base_port = 4567; cert.log_conflicts = no; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.recover = no; gcache.size = 128M; gcomm.thread_prio = ; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0; pc.announce_timeout = PT3S; pc.checksum = false; pc.i

170727 12:02:11 [Note] WSREP: GCache history reset: old(e197148b-366e-11e7-ac22-a2a37d1a5dae:0) -> new(e197148b-366e-11e7-ac22-a2a37d1a5dae:12090178)

170727 12:02:11 [Note] WSREP: Assign initial position for certification: 12090178, protocol version: -1

170727 12:02:11 [Note] WSREP: wsrep_sst_grab()

170727 12:02:11 [Note] WSREP: Start replication

170727 12:02:11 [Note] WSREP: Setting initial position to e197148b-366e-11e7-ac22-a2a37d1a5dae:12090178

170727 12:02:11 [Note] WSREP: protonet asio version 0

170727 12:02:11 [Note] WSREP: Using CRC-32C for message checksums.

170727 12:02:11 [Note] WSREP: backend: asio

170727 12:02:11 [Note] WSREP: gcomm thread scheduling priority set to other:0

170727 12:02:11 [Warning] WSREP: access file(/var/lib/mysql//gvwstate.dat) failed(No such file or directory)

170727 12:02:11 [Note] WSREP: restore pc from disk failed

170727 12:02:11 [Note] WSREP: GMCast version 0

170727 12:02:11 [Warning] WSREP: Failed to resolve tcp:// NODE_2_IP:4567

170727 12:02:11 [Warning] WSREP: Failed to resolve tcp:// NODE_3_IP:4567

170727 12:02:11 [Note] WSREP: (f33ecf46, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567

170727 12:02:11 [Note] WSREP: (f33ecf46, 'tcp://0.0.0.0:4567') multicast: , ttl: 1

170727 12:02:11 [Note] WSREP: EVS version 0

170727 12:02:11 [Note] WSREP: gcomm: connecting to group 'CLUSTER_NAME', peer 'NODE_1_IP:, NODE_2_IP:, NODE_3_IP:'

170727 12:02:11 [Note] WSREP: (f33ecf46, 'tcp://0.0.0.0:4567') connection established to e196056a tcp://NODE_1_IP:4567

170727 12:02:11 [Note] WSREP: (f33ecf46, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers:

170727 12:02:12 [Note] WSREP: declaring e196056a at tcp://NODE_1_IP:4567 stable

170727 12:02:12 [Note] WSREP: Node e196056a state prim

170727 12:02:12 [Note] WSREP: view(view_id(PRIM,e196056a,8) memb {

     e196056a,0

     f33ecf46,0

} joined {

} left {

} partitioned {

})

170727 12:02:12 [Note] WSREP: save pc into disk

170727 12:02:12 [Note] WSREP: gcomm: connected

170727 12:02:12 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636

170727 12:02:12 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)

170727 12:02:12 [Note] WSREP: Opened channel 'CLUSTER_NAME'

170727 12:02:12 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 1, memb_num = 2

170727 12:02:12 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.

170727 12:02:12 [Note] WSREP: Waiting for SST to complete.

170727 12:02:12 [Note] WSREP: STATE EXCHANGE: sent state msg: f39ec4ca-72e4-11e7-91b1-7e3360df7d2c

170727 12:02:12 [Note] WSREP: STATE EXCHANGE: got state msg: f39ec4ca-72e4-11e7-91b1-7e3360df7d2c from 0 (NODE_1)

170727 12:02:12 [Note] WSREP: STATE EXCHANGE: got state msg: f39ec4ca-72e4-11e7-91b1-7e3360df7d2c from 1 (NODE_2)

170727 12:02:12 [Note] WSREP: Quorum results:

     version    = 4,

     component  = PRIMARY,

     conf_id    = 7,

     members    = 1/2 (joined/total),

     act_id     = 12090282,

     last_appl. = -1,

     protocols  = 0/7/3 (gcs/repl/appl),

     group UUID = e197148b-366e-11e7-ac22-a2a37d1a5dae

170727 12:02:12 [Note] WSREP: Flow-control interval: [23, 23]

170727 12:02:12 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 12090282)

170727 12:02:12 [Note] WSREP: State transfer required:

     Group state: e197148b-366e-11e7-ac22-a2a37d1a5dae:12090282

     Local state: e197148b-366e-11e7-ac22-a2a37d1a5dae:12090178

170727 12:02:12 [Note] WSREP: New cluster view: global state: e197148b-366e-11e7-ac22-a2a37d1a5dae:12090282, view# 8: Primary, number of nodes: 2, my index: 1, protocol version 3

170727 12:02:12 [Warning] WSREP: Gap in state sequence. Need state transfer.

170727 12:02:12 [Note] WSREP: Setting wsrep_ready to 0

170727 12:02:12 [Note] WSREP: Running: 'wsrep_sst_rsync --role 'joiner' --address 'NODE_2_IP' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' --parent '118768''

170727 12:02:13 [Note] WSREP: Prepared SST request: rsync|NODE_2_IP:4444/rsync_sst

170727 12:02:13 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.

170727 12:02:13 [Note] WSREP: REPL Protocols: 7 (3, 2)

170727 12:02:13 [Note] WSREP: Assign initial position for certification: 12090282, protocol version: 3

170727 12:02:13 [Note] WSREP: Service thread queue flushed.

170727 12:02:13 [Note] WSREP: IST receiver addr using tcp://NODE_2_IP:4568

170727 12:02:13 [Note] WSREP: Prepared IST receiver, listening at: tcp://NODE_2_IP:4568

170727 12:02:13 [Note] WSREP: Member 1.0 (NODE_2) requested state transfer from '*any*'. Selected 0.0 (NODE_1)(SYNCED) as donor.

170727 12:02:13 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 12090285)

170727 12:02:13 [Note] WSREP: Requesting state transfer: success, donor: 0

170727 12:02:13 [Note] WSREP: GCache history reset: old(e197148b-366e-11e7-ac22-a2a37d1a5dae:0) -> new(e197148b-366e-11e7-ac22-a2a37d1a5dae:12090282)

170727 12:02:13 [Note] WSREP: 0.0 (NODE_1): State transfer to 1.0 (NODE_2) complete.

170727 12:02:13 [Note] WSREP: Member 0.0 (NODE_1) synced with group.

WSREP_SST: [INFO] Joiner cleanup. rsync PID: 118815 (20170727 12:02:14.193)

WSREP_SST: [INFO] Joiner cleanup done. (20170727 12:02:14.699)

170727 12:02:14 [Note] WSREP: SST complete, seqno: 12090178

170727 12:02:14 [Note] Plugin 'FEEDBACK' is disabled.

170727 12:02:14 InnoDB: The InnoDB memory heap is disabled

170727 12:02:14 InnoDB: Mutexes and rw_locks use InnoDB's own implementation

170727 12:02:14 InnoDB: Compressed tables use zlib 1.2.7

170727 12:02:14 InnoDB: Using Linux native AIO

170727 12:02:14 InnoDB: Initializing buffer pool, size = 1.0G

170727 12:02:14 InnoDB: Completed initialization of buffer pool

170727 12:02:14 InnoDB: highest supported file format is Barracuda.

170727 12:02:14  InnoDB: Waiting for the background threads to start

170727 12:02:15 [Note] WSREP: (f33ecf46, 'tcp://0.0.0.0:4567') turning message relay requesting off

170727 12:02:15 InnoDB: 5.5.57-MariaDB started; log sequence number 157096526824

170727 12:02:15 [Note] Server socket created on IP: '0.0.0.0'.

170727 12:02:16 [Note] Event Scheduler: Loaded 0 events

170727 12:02:16 [Note] WSREP: Signalling provider to continue.

170727 12:02:16 [Note] WSREP: SST received: e197148b-366e-11e7-ac22-a2a37d1a5dae:12090178

170727 12:02:16 [Note] /usr/sbin/mysqld: ready for connections.

Version: '5.5.57-MariaDB-wsrep'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MariaDB Server, wsrep_25.21.r9949137

170727 12:02:16 [Note] WSREP: Receiving IST: 104 writesets, seqnos 12090178-12090282

170727 12:02:23 [Note] WSREP: IST received: e197148b-366e-11e7-ac22-a2a37d1a5dae:12090282

170727 12:02:23 [Note] WSREP: 1.0 (NODE_2): State transfer from 0.0 (NODE_1) complete.

170727 12:02:23 [Note] WSREP: Shifting JOINER -> JOINED (TO: 12090404)

170727 12:02:23 [Note] WSREP: Member 1.0 (NODE_2) synced with group.

170727 12:02:23 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 12090404)

170727 12:02:23 [Note] WSREP: Synchronized with group, ready for connections

170727 12:02:23 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.

170727 12:02:23 [Note] WSREP: Nobody is waiting for SST.

 

Status information:

 

Current dir: /var/lib/mysql/

Running threads: 2  Stack size: 294912

Current locks:

lock: 0x7fdb988992c0:

 

lock: 0x7fdb98899040:

 

lock: 0x7fdb98898f00:

 

lock: 0x7fdb98898dc0:

 

lock: 0x7fdb98898c80:

 

lock: 0x7fdb98898a00:

 

lock: 0x7fdb98898b40:

 

lock: 0x7fdb988988c0:

 

lock: 0x7fdb98898780:

 

lock: 0x7fdb98898640:

 

lock: 0x7fdb988983c0:

 

lock: 0x7fdb98898280:

 

lock: 0x7fdba6484b78:

 

lock: 0x7fdba646b178:

 

lock: 0x7fdba6481b78:

 

lock: 0x7fdba6469378:

 

lock: 0x7fdba6467f78:

 

lock: 0x7fdba6466178:

 

lock: 0x7fdba6545d78:

 

lock: 0x7fdb83fff378:

 

lock: 0x7fdba6480378:

 

lock: 0x12610e0:

 

lock: 0x1261700:

 

lock: 0x12617e0:

 

lock: 0x12618c0:

 

lock: 0x12611c0:

 

lock: 0x12612a0:

 

lock: 0x1261000:

 

lock: 0x1260e40:

 

lock: 0x1260f20:

 

lock: 0x12619a0:

 

lock: 0x1261380:

 

lock: 0x1261620:

 

lock: 0x1261540:

 

lock: 0x1261460:

 

lock: 0x1260ba0:

 

lock: 0x1260c80:

 

lock: 0x1260d60:

 

 

Key caches:

default

Buffer_size:     134217728

Block_size:           1024

Division_limit:        100

Age_threshold:         300

Partitions:              0

blocks used:             0

not flushed:             0

w_requests:              0

writes:                  0

r_requests:              0

reads:                   0

 

 

handler status:

read_key:           17

read_next:           0

read_rnd             0

read_first:          0

write:             136

delete               0

update:           1435

 

Table status:

Opened tables:         12

Open tables:           38

Open files:            24

Open streams:           0

 

Alarm status:

Active alarms:   0

Max used alarms: 0

Next alarm time: 0

 

Memory status:

Non-mmapped space allocated from system: 0

Number of free chunks:              1

Number of fastbin blocks:      0

Number of mmapped regions:          0

Space in mmapped regions:      0

Maximum total allocated space:      0

Space available in freed fastbin blocks: 0

Total allocated space:              0

Total free space:              0

Top-most, releasable space:         0

Estimated memory (with thread stack):    589824

 

 

 

Events status:

LLA = Last Locked At  LUA = Last Unlocked At

WOC = Waiting On Condition  DL = Data Locked

 

Event scheduler status:

State      : INITIALIZED

Thread id  : 0

LLA        : n/a:0

LUA        : n/a:0

WOC        : NO

Workers    : 0

Executed   : 0

Data locked: NO

 

Event queue status:

Element count   : 0

Data locked     : NO

Attempting lock : NO

LLA             : init_queue:139

LUA             : init_queue:149

WOC             : NO

Next activation : never

170727 12:02:32 [Note] WSREP: forgetting e196056a (tcp://NODE_1_IP:4567)

170727 12:02:32 [Note] WSREP: Node f33ecf46 state prim

170727 12:02:32 [Note] WSREP: view(view_id(PRIM,f33ecf46,9) memb {

     f33ecf46,0

} joined {

} left {

} partitioned {

     e196056a,0

})

170727 12:02:32 [Note] WSREP: save pc into disk

170727 12:02:32 [Note] WSREP: forgetting e196056a (tcp://NODE_1_IP:4567)

170727 12:02:32 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 1

170727 12:02:32 [Note] WSREP: STATE_EXCHANGE: sent state UUID: ffe3b6b1-72e4-11e7-bc7f-4f5d43660e01

170727 12:02:32 [Note] WSREP: STATE EXCHANGE: sent state msg: ffe3b6b1-72e4-11e7-bc7f-4f5d43660e01

170727 12:02:32 [Note] WSREP: STATE EXCHANGE: got state msg: ffe3b6b1-72e4-11e7-bc7f-4f5d43660e01 from 0 (NODE_2)

170727 12:02:32 [Note] WSREP: Quorum results:

     version    = 4,

     component  = PRIMARY,

     conf_id    = 8,

     members    = 1/1 (joined/total),

     act_id     = 12090475,

     last_appl. = 12090349,

     protocols  = 0/7/3 (gcs/repl/appl),

     group UUID = e197148b-366e-11e7-ac22-a2a37d1a5dae

170727 12:02:32 [Note] WSREP: Flow-control interval: [16, 16]

170727 12:02:32 [Note] WSREP: New cluster view: global state: e197148b-366e-11e7-ac22-a2a37d1a5dae:12090475, view# 9: Primary, number of nodes: 1, my index: 0, protocol version 3

170727 12:02:32 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.

170727 12:02:32 [Note] WSREP: REPL Protocols: 7 (3, 2)

170727 12:02:32 [Note] WSREP: Assign initial position for certification: 12090475, protocol version: 3

170727 12:02:32 [Note] WSREP: Service thread queue flushed.

170727 12:02:36 [Note] WSREP: (f33ecf46, 'tcp://0.0.0.0:4567') connection established to e196056a tcp://NODE_1_IP:4567

170727 12:02:36 [Warning] WSREP: discarding established (time wait) e196056a (tcp://NODE_1_IP:4567)

170727 12:02:38 [Note] WSREP:  cleaning up e196056a (tcp://NODE_1_IP:4567)

WSREP: Deadlock detected

 

Reply all
Reply to author
Forward
0 new messages