Hi,
I am attempting to join a 4th node to a working cluster that is currently receiving writes. The 3 existing nodes are working fine. When I join the 4th node, the join appears to succeed: I cannot identify any problems in the logs, and the wsrep_on variable is set to ON on the joiner.
If I stop all writes to the cluster, the new node can be restarted; it then does not perform another SST and works properly.
Here are the logs on the joiner:
121024 12:32:50 [Warning] You need to use --log-bin to make --binlog-format work.
121024 12:32:50 [Note] WSREP: wsrep_load(): loading provider library 'none'
121024 12:32:52 [Note] WSREP: Service disconnected.
121024 12:32:53 [Note] WSREP: Some threads may fail to exit.
121024 12:32:53 [Warning] You need to use --log-bin to make --binlog-format work.
121024 12:32:53 [Note] WSREP: wsrep_load(): loading provider library 'none'
121024 12:32:53 [Note] WSREP: Service disconnected.
121024 12:32:54 [Note] WSREP: Some threads may fail to exit.
121024 12:33:49 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib/galera/libgalera_smm.so'
121024 12:33:49 [Note] WSREP: wsrep_load(): Galera 22.1.1(r95) by Codership Oy loaded succesfully.
121024 12:33:49 [Note] WSREP: Preallocating 134219040/134219040 bytes in '/var/lib/mysql//galera.cache'...
121024 12:33:49 [Note] WSREP: Passing config to GCS: gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.size = 128M; gcs.fc_debug = 0; gcs.fc_factor = 0.5; gcs.fc_limit = 16; gcs.fc_master_slave = NO; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; replicator.commit_order = 3
121024 12:33:49 [Note] WSREP: wsrep_sst_grab()
121024 12:33:49 [Note] WSREP: Start replication
121024 12:33:49 [Warning] WSREP: state file not found: /var/lib/mysql//grastate.dat
121024 12:33:49 [Note] WSREP: Assign initial position for certification: -1, protocol version: 1
121024 12:33:49 [Note] WSREP: Setting initial position to 00000000-0000-0000-0000-000000000000:-1
121024 12:33:49 [Note] WSREP: protonet asio version 0
121024 12:33:49 [Note] WSREP: backend: asio
121024 12:33:49 [Note] WSREP: GMCast version 0
121024 12:33:49 [Note] WSREP: (dc3f073b-1d7a-11e2-0800-6c1bbe3cc082, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
121024 12:33:49 [Note] WSREP: (dc3f073b-1d7a-11e2-0800-6c1bbe3cc082, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
121024 12:33:49 [Note] WSREP: EVS version 0
121024 12:33:49 [Note] WSREP: PC version 0
121024 12:33:49 [Note] WSREP: gcomm: connecting to group 'my_wsrep_cluster', peer ''
121024 12:33:49 [Note] WSREP: GMCast::handle_stable_view: view(view_id(PRIM,dc3f073b-1d7a-11e2-0800-6c1bbe3cc082,1) memb {
dc3f073b-1d7a-11e2-0800-6c1bbe3cc082,
} joined {
} left {
} partitioned {
})
121024 12:33:49 [Note] WSREP: gcomm: connected
121024 12:33:49 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636
121024 12:33:49 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
121024 12:33:49 [Note] WSREP: Opened channel 'my_wsrep_cluster'
121024 12:33:49 [Note] WSREP: New COMPONENT: primary = yes, my_idx = 0, memb_num = 1
121024 12:33:49 [Note] WSREP: Waiting for SST to complete.
121024 12:33:49 [Note] WSREP: Starting new group from scratch: dc3fe281-1d7a-11e2-0800-67dd803567ca
121024 12:33:49 [Note] WSREP: STATE_EXCHANGE: sent state UUID: dc401389-1d7a-11e2-0800-cf32a98ae4bd
121024 12:33:49 [Note] WSREP: STATE EXCHANGE: sent state msg: dc401389-1d7a-11e2-0800-cf32a98ae4bd
121024 12:33:49 [Note] WSREP: STATE EXCHANGE: got state msg: dc401389-1d7a-11e2-0800-cf32a98ae4bd from 0 (Test63)
121024 12:33:49 [Note] WSREP: Quorum results:
version = 2,
component = PRIMARY,
conf_id = 0,
members = 1/1 (joined/total),
act_id = 0,
last_appl. = -1,
protocols = 0/1/1 (gcs/repl/appl),
group UUID = dc3fe281-1d7a-11e2-0800-67dd803567ca
121024 12:33:49 [Note] WSREP: Flow-control interval: [8, 16]
121024 12:33:49 [Note] WSREP: Restored state OPEN -> JOINED (0)
121024 12:33:49 [Note] WSREP: Member 0 (Test63) synced with group.
121024 12:33:49 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 0)
121024 12:33:49 [Note] WSREP: New cluster view: global state: dc3fe281-1d7a-11e2-0800-67dd803567ca:0, view# 1: Primary, number of nodes: 1, my index: 0, protocol version 1
121024 12:33:49 [Note] WSREP: SST complete, seqno: 0
121024 12:33:49 InnoDB: The InnoDB memory heap is disabled
121024 12:33:49 InnoDB: Mutexes and rw_locks use GCC atomic builtins
121024 12:33:49 InnoDB: Compressed tables use zlib 1.2.3.3
121024 12:33:49 InnoDB: Initializing buffer pool, size = 128.0M
121024 12:33:49 InnoDB: Completed initialization of buffer pool
InnoDB: The first specified data file ./ibdata1 did not exist:
InnoDB: a new database to be created!
121024 12:33:49 InnoDB: Setting file ./ibdata1 size to 10 MB
InnoDB: Database physically writes the file full: wait...
121024 12:33:49 InnoDB: Log file ./ib_logfile0 did not exist: new to be created
InnoDB: Setting log file ./ib_logfile0 size to 5 MB
InnoDB: Database physically writes the file full: wait...
121024 12:33:49 InnoDB: Log file ./ib_logfile1 did not exist: new to be created
InnoDB: Setting log file ./ib_logfile1 size to 5 MB
InnoDB: Database physically writes the file full: wait...
InnoDB: Doublewrite buffer not found: creating new
InnoDB: Doublewrite buffer created
InnoDB: 127 rollback segment(s) active.
InnoDB: Creating foreign key constraint system tables
InnoDB: Foreign key constraint system tables created
121024 12:33:50 InnoDB: Waiting for the background threads to start
121024 12:33:51 InnoDB: 1.1.8 started; log sequence number 0
121024 12:33:51 [Note] Event Scheduler: Loaded 0 events
121024 12:33:51 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
121024 12:33:51 [Note] WSREP: Assign initial position for certification: 0, protocol version: 1
121024 12:33:51 [Note] WSREP: Synchronized with group, ready for connections
121024 12:33:51 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
121024 12:33:51 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.5.17' socket: '/var/run/mysqld/mysqld.sock' port: 3306 wsrep_22.3.r3645
121024 12:34:36 [Note] /usr/sbin/mysqld: Normal shutdown
121024 12:34:36 [Note] WSREP: Stop replication
121024 12:34:36 [Note] WSREP: Closing send monitor...
121024 12:34:36 [Note] WSREP: Closed send monitor.
121024 12:34:36 [Note] WSREP: gcomm: terminating thread
121024 12:34:36 [Note] WSREP: gcomm: joining thread
121024 12:34:36 [Note] WSREP: gcomm: closing backend
121024 12:34:36 [Note] WSREP: GMCast::handle_stable_view: view((empty))
121024 12:34:36 [Note] WSREP: Received self-leave message.
121024 12:34:36 [Note] WSREP: gcomm: closed
121024 12:34:36 [Note] WSREP: Flow-control interval: [0, 0]
121024 12:34:36 [Note] WSREP: Received SELF-LEAVE. Closing connection.
121024 12:34:36 [Note] WSREP: Shifting SYNCED -> CLOSED (TO: 0)
121024 12:34:36 [Note] WSREP: RECV thread exiting 0: Success
121024 12:34:36 [Note] WSREP: New cluster view: global state: dc3fe281-1d7a-11e2-0800-67dd803567ca:0, view# -1: non-Primary, number of nodes: 0, my index: -1, protocol version 1
121024 12:34:36 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
121024 12:34:36 [Note] WSREP: applier thread exiting (code:0)
121024 12:34:36 [Note] WSREP: recv_thread() joined.
121024 12:34:36 [Note] WSREP: Closing slave action queue.
121024 12:34:38 [Note] WSREP: rollbacker thread exiting
121024 12:34:38 [Note] Event Scheduler: Purging the queue. 0 events
121024 12:34:38 [Note] WSREP: dtor state: CLOSED
121024 12:34:38 [Note] WSREP: apply mon: entered 0
121024 12:34:38 [Note] WSREP: apply mon: entered 0
121024 12:34:38 [Note] WSREP: mon: entered 3 oooe fraction 0 oool fraction 0
121024 12:34:38 [Note] WSREP: cert index usage at exit 0
121024 12:34:38 [Note] WSREP: cert trx map usage at exit 0
121024 12:34:38 [Note] WSREP: deps set usage at exit 0
121024 12:34:38 [Note] WSREP: avg deps dist 0
121024 12:34:38 [Note] WSREP: wsdb trx map usage 0 conn query map usage 0
121024 12:34:38 [Note] WSREP: Shifting CLOSED -> DESTROYED (TO: 0)
121024 12:34:38 [Note] WSREP: Flushing memory map to disk...
121024 12:34:38 InnoDB: Starting shutdown...
121024 12:34:38 InnoDB: Shutdown completed; log sequence number 1595675
121024 12:34:38 [Note] /usr/sbin/mysqld: Shutdown complete
121024 12:35:13 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib/galera/libgalera_smm.so'
121024 12:35:13 [Note] WSREP: wsrep_load(): Galera 22.1.1(r95) by Codership Oy loaded succesfully.
121024 12:35:13 [Note] WSREP: Reusing existing '/var/lib/mysql//galera.cache'.
121024 12:35:13 [Note] WSREP: Passing config to GCS: gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.size = 128M; gcs.fc_debug = 0; gcs.fc_factor = 0.5; gcs.fc_limit = 16; gcs.fc_master_slave = NO; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; replicator.commit_order = 3
121024 12:35:13 [Note] WSREP: wsrep_sst_grab()
121024 12:35:13 [Note] WSREP: Start replication
121024 12:35:13 [Note] WSREP: Found saved state: dc3fe281-1d7a-11e2-0800-67dd803567ca:0
121024 12:35:13 [Note] WSREP: Assign initial position for certification: 0, protocol version: 1
121024 12:35:13 [Note] WSREP: Setting initial position to dc3fe281-1d7a-11e2-0800-67dd803567ca:0
121024 12:35:13 [Note] WSREP: protonet asio version 0
121024 12:35:13 [Note] WSREP: backend: asio
121024 12:35:13 [Note] WSREP: GMCast version 0
121024 12:35:13 [Note] WSREP: (0e7f8abf-1d7b-11e2-0800-96c021d320fe, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
121024 12:35:13 [Note] WSREP: (0e7f8abf-1d7b-11e2-0800-96c021d320fe, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
121024 12:35:13 [Note] WSREP: EVS version 0
121024 12:35:13 [Note] WSREP: PC version 0
121024 12:35:13 [Note] WSREP: gcomm: connecting to group 'my_wsrep_cluster', peer '192.168.1.60:'
121024 12:35:13 [Note] WSREP: (0e7f8abf-1d7b-11e2-0800-96c021d320fe, 'tcp://0.0.0.0:4567') cleaning up duplicate 0x2c3b940 after established 0x2c3bbd0
121024 12:35:13 [Note] WSREP: GMCast::handle_stable_view: view(view_id(PRIM,0e7f8abf-1d7b-11e2-0800-96c021d320fe,42) memb {
0e7f8abf-1d7b-11e2-0800-96c021d320fe,
132b93f1-180d-11e2-0800-0575fc423eba,
538dd5e0-1810-11e2-0800-dc3567fd7e23,
6d37d75d-bffa-11e1-0800-283a3c4de719,
} joined {
} left {
} partitioned {
})
121024 12:35:13 [Note] WSREP: declaring 132b93f1-180d-11e2-0800-0575fc423eba stable
121024 12:35:13 [Note] WSREP: declaring 538dd5e0-1810-11e2-0800-dc3567fd7e23 stable
121024 12:35:13 [Note] WSREP: declaring 6d37d75d-bffa-11e1-0800-283a3c4de719 stable
121024 12:35:14 [Note] WSREP: gcomm: connected
121024 12:35:14 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636
121024 12:35:14 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
121024 12:35:14 [Note] WSREP: Opened channel 'my_wsrep_cluster'
121024 12:35:14 [Note] WSREP: New COMPONENT: primary = yes, my_idx = 0, memb_num = 4
121024 12:35:14 [Note] WSREP: Waiting for SST to complete.
121024 12:35:14 [Note] WSREP: STATE_EXCHANGE: sent state UUID: 0ecc8624-1d7b-11e2-0800-ca2f98a05357
121024 12:35:14 [Note] WSREP: STATE EXCHANGE: sent state msg: 0ecc8624-1d7b-11e2-0800-ca2f98a05357
121024 12:35:14 [Note] WSREP: STATE EXCHANGE: got state msg: 0ecc8624-1d7b-11e2-0800-ca2f98a05357 from 1 (Test60)
121024 12:35:14 [Note] WSREP: STATE EXCHANGE: got state msg: 0ecc8624-1d7b-11e2-0800-ca2f98a05357 from 2 (Test62)
121024 12:35:14 [Note] WSREP: STATE EXCHANGE: got state msg: 0ecc8624-1d7b-11e2-0800-ca2f98a05357 from 3 (Test61)
121024 12:35:14 [Note] WSREP: STATE EXCHANGE: got state msg: 0ecc8624-1d7b-11e2-0800-ca2f98a05357 from 0 (Test63)
121024 12:35:14 [Note] WSREP: Quorum results:
version = 2,
component = PRIMARY,
conf_id = 41,
members = 3/4 (joined/total),
act_id = 18742299,
last_appl. = -1,
protocols = 0/1/1 (gcs/repl/appl),
group UUID = 9d46827d-bff9-11e1-0800-0323802b992b
121024 12:35:14 [Note] WSREP: Flow-control interval: [16, 32]
121024 12:35:14 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 18742299)
121024 12:35:14 [Note] WSREP: New cluster view: global state: 9d46827d-bff9-11e1-0800-0323802b992b:18742299, view# 42: Primary, number of nodes: 4, my index: 0, protocol version 1
121024 12:35:14 [Warning] WSREP: Gap in state sequence. Need state transfer.
121024 12:35:16 [Note] WSREP: Running: 'wsrep_sst_rsync 'joiner' '192.168.1.63' 'root:password' '/var/lib/mysql/' '10340' 2>sst.err'
121024 12:35:16 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
121024 12:35:16 [Note] WSREP: Assign initial position for certification: 18742299, protocol version: 1
121024 12:35:16 [Note] WSREP: State transfer required:
Group state: 9d46827d-bff9-11e1-0800-0323802b992b:18742299
Local state: dc3fe281-1d7a-11e2-0800-67dd803567ca:0
121024 12:35:16 [Note] WSREP: Node 0 (Test63) requested state transfer from 'Test60'. Selected 1 (Test60)(SYNCED) as donor.
121024 12:35:16 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 18742355)
121024 12:35:16 [Note] WSREP: Requesting state transfer: success, donor: 1
121024 12:41:03 [Note] WSREP: 1 (Test60): State transfer to 0 (Test63) complete.
121024 12:41:04 [Note] WSREP: SST complete, seqno: 18742358
121024 12:41:04 InnoDB: The InnoDB memory heap is disabled
121024 12:41:04 InnoDB: Mutexes and rw_locks use GCC atomic builtins
121024 12:41:04 InnoDB: Compressed tables use zlib 1.2.3.3
121024 12:41:04 InnoDB: Initializing buffer pool, size = 128.0M
121024 12:41:04 InnoDB: Completed initialization of buffer pool
121024 12:41:04 InnoDB: highest supported file format is Barracuda.
InnoDB: Log scan progressed past the checkpoint lsn 71520529374
121024 12:41:04 InnoDB: Database was not shut down normally!
InnoDB: Starting crash recovery.
InnoDB: Reading tablespace information from the .ibd files...
InnoDB: Restoring possible half-written data pages from the doublewrite
InnoDB: buffer...
InnoDB: Doing recovery: scanned up to log sequence number 71520845267
121024 12:41:05 InnoDB: Starting an apply batch of log records to the database...
InnoDB: Progress in percents: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99
InnoDB: Apply batch completed
121024 12:41:05 InnoDB: Waiting for the background threads to start
121024 12:41:06 InnoDB: 1.1.8 started; log sequence number 71520845267
121024 12:41:06 [Note] Event Scheduler: Loaded 0 events
121024 12:41:06 [Note] WSREP: Signalling provider to continue.
121024 12:41:06 [Note] WSREP: Received SST: 9d46827d-bff9-11e1-0800-0323802b992b:18742358
121024 12:41:06 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.5.17' socket: '/var/run/mysqld/mysqld.sock' port: 3306 wsrep_22.3.r3645
121024 12:41:06 [Note] WSREP: 0 (Test63): State transfer from 1 (Test60) complete.
121024 12:41:06 [Note] WSREP: Shifting JOINER -> JOINED (TO: 18750654)
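For anyone skimming the joiner log above, the state changes can be pulled out with a short script. This is only a sketch: the regular expression and the helper names (state_shifts, reached_synced) are made up for illustration, and the sample lines are copied from the second start-up of the joiner. In this excerpt the last shift logged is JOINER -> JOINED, and no JOINED -> SYNCED line follows it.

```python
import re

# Matches lines such as:
# 121024 12:41:06 [Note] WSREP: Shifting JOINER -> JOINED (TO: 18750654)
SHIFT_RE = re.compile(
    r"^(\d{6} \d{2}:\d{2}:\d{2}) \[Note\] WSREP: "
    r"Shifting (\S+) -> (\S+) \(TO: (\d+)\)"
)


def state_shifts(log_text):
    """Return (timestamp, old_state, new_state, seqno) for each 'Shifting' line."""
    shifts = []
    for line in log_text.splitlines():
        match = SHIFT_RE.match(line.strip())
        if match:
            ts, old, new, seqno = match.groups()
            shifts.append((ts, old, new, int(seqno)))
    return shifts


def reached_synced(log_text):
    """True if the most recent state shift in the log ended in SYNCED."""
    shifts = state_shifts(log_text)
    return bool(shifts) and shifts[-1][2] == "SYNCED"


# Lines copied from the joiner log (second start-up).
sample = """\
121024 12:35:14 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
121024 12:35:14 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 18742299)
121024 12:35:16 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 18742355)
121024 12:41:06 [Note] WSREP: Shifting JOINER -> JOINED (TO: 18750654)
"""

for shift in state_shifts(sample):
    print(shift)
print("reached SYNCED:", reached_synced(sample))
```

Running the same function over the full error log, rather than this excerpt, would show whether a SYNCED shift appears later.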
Here are the logs from the donor:
121024 12:29:28 [Note] WSREP: STATE EXCHANGE: got state msg: 40c31df4-1d7a-11e2-0800-57f3cc711251 from 0 (Test60)
121024 12:29:28 [Note] WSREP: STATE EXCHANGE: got state msg: 40c31df4-1d7a-11e2-0800-57f3cc711251 from 2 (Test61)
121024 12:29:28 [Note] WSREP: STATE EXCHANGE: got state msg: 40c31df4-1d7a-11e2-0800-57f3cc711251 from 1 (Test62)
121024 12:29:28 [Note] WSREP: Quorum results:
version = 2,
component = PRIMARY,
conf_id = 40,
members = 3/3 (joined/total),
act_id = 18734246,
last_appl. = 18733543,
protocols = 0/1/1 (gcs/repl/appl),
group UUID = 9d46827d-bff9-11e1-0800-0323802b992b
121024 12:29:28 [Note] WSREP: Flow-control interval: [14, 28]
121024 12:29:28 [Note] WSREP: New cluster view: global state: 9d46827d-bff9-11e1-0800-0323802b992b:18734246, view# 41: Primary, number of nodes: 3, my index: 0, protocol version 1
121024 12:29:28 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
121024 12:29:28 [Note] WSREP: Assign initial position for certification: 18734246, protocol version: 1
121024 12:29:33 [Note] WSREP: cleaning up eb8df53d-1d6f-11e2-0800-c6d0232d5206 (tcp://192.168.1.63:4567)
121024 12:35:13 [Note] WSREP: GMCast::handle_stable_view: view(view_id(PRIM,0e7f8abf-1d7b-11e2-0800-96c021d320fe,42) memb {
0e7f8abf-1d7b-11e2-0800-96c021d320fe,
132b93f1-180d-11e2-0800-0575fc423eba,
538dd5e0-1810-11e2-0800-dc3567fd7e23,
6d37d75d-bffa-11e1-0800-283a3c4de719,
} joined {
} left {
} partitioned {
})
121024 12:35:13 [Note] WSREP: New COMPONENT: primary = yes, my_idx = 1, memb_num = 4
121024 12:35:13 [Note] WSREP: declaring 0e7f8abf-1d7b-11e2-0800-96c021d320fe stable
121024 12:35:13 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.
121024 12:35:13 [Note] WSREP: declaring 538dd5e0-1810-11e2-0800-dc3567fd7e23 stable
121024 12:35:13 [Note] WSREP: declaring 6d37d75d-bffa-11e1-0800-283a3c4de719 stable
121024 12:35:13 [Note] WSREP: STATE EXCHANGE: sent state msg: 0ecc8624-1d7b-11e2-0800-ca2f98a05357
121024 12:35:13 [Note] WSREP: STATE EXCHANGE: got state msg: 0ecc8624-1d7b-11e2-0800-ca2f98a05357 from 1 (Test60)
121024 12:35:13 [Note] WSREP: STATE EXCHANGE: got state msg: 0ecc8624-1d7b-11e2-0800-ca2f98a05357 from 2 (Test62)
121024 12:35:13 [Note] WSREP: STATE EXCHANGE: got state msg: 0ecc8624-1d7b-11e2-0800-ca2f98a05357 from 3 (Test61)
121024 12:35:13 [Note] WSREP: STATE EXCHANGE: got state msg: 0ecc8624-1d7b-11e2-0800-ca2f98a05357 from 0 (Test63)
121024 12:35:13 [Note] WSREP: Quorum results:
version = 2,
component = PRIMARY,
conf_id = 41,
members = 3/4 (joined/total),
act_id = 18742299,
last_appl. = 18741534,
protocols = 0/1/1 (gcs/repl/appl),
group UUID = 9d46827d-bff9-11e1-0800-0323802b992b
121024 12:35:13 [Note] WSREP: Flow-control interval: [16, 32]
121024 12:35:13 [Note] WSREP: New cluster view: global state: 9d46827d-bff9-11e1-0800-0323802b992b:18742299, view# 42: Primary, number of nodes: 4, my index: 1, protocol version 1
121024 12:35:13 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
121024 12:35:13 [Note] WSREP: Assign initial position for certification: 18742299, protocol version: 1
121024 12:35:15 [Note] WSREP: Node 0 (Test63) requested state transfer from 'Test60'. Selected 1 (Test60)(SYNCED) as donor.
121024 12:35:15 [Note] WSREP: Shifting SYNCED -> DONOR/DESYNCED (TO: 18742355)
121024 12:35:16 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
121024 12:35:16 [Note] WSREP: Running: 'wsrep_sst_rsync 'donor' '192.168.1.63:4444/rsync_sst' 'root:password' '/var/lib/mysql/' '9d46827d-bff9-11e1-0800-0323802b992b' '18742355' '0' 2>sst.err'
121024 12:35:16 [Note] WSREP: sst_donor_thread signaled with 0
121024 12:35:16 [Note] WSREP: Flushing tables for SST...
121024 12:35:16 [Note] WSREP: Provider paused at 9d46827d-bff9-11e1-0800-0323802b992b:18742358
121024 12:35:16 [Note] WSREP: Tables flushed.
121024 12:41:03 [Note] WSREP: Provider resumed.
121024 12:41:03 [Note] WSREP: 1 (Test60): State transfer to 0 (Test63) complete.
121024 12:41:03 [Note] WSREP: Shifting DONOR/DESYNCED -> JOINED (TO: 18750575)
121024 12:41:06 [Note] WSREP: 0 (Test63): State transfer from 1 (Test60) complete.
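To put numbers on the transfer, the timestamps and seqnos in the two logs can be compared directly. The sketch below uses only values copied from the logs. It assumes that the TO value on a "Shifting" line is the group seqno at the moment of the shift, so the difference is a rough count of the write-sets the cluster committed while the SST was running. The helper name seconds_between is made up for illustration.

```python
from datetime import datetime

LOG_TS_FORMAT = "%y%m%d %H:%M:%S"  # e.g. "121024 12:35:16"


def seconds_between(start, end):
    """Whole seconds between two timestamps in the mysqld error-log format."""
    t0 = datetime.strptime(start, LOG_TS_FORMAT)
    t1 = datetime.strptime(end, LOG_TS_FORMAT)
    return int((t1 - t0).total_seconds())


# Values copied from the logs above.
paused_at = "121024 12:35:16"   # donor: "Provider paused at ...:18742358"
resumed_at = "121024 12:41:03"  # donor: "Provider resumed."
sst_seqno = 18742358            # joiner: "SST complete, seqno: 18742358"
joined_seqno = 18750654         # joiner: "Shifting JOINER -> JOINED (TO: 18750654)"

sst_seconds = seconds_between(paused_at, resumed_at)
backlog = joined_seqno - sst_seqno  # assumption: TO is the group seqno

print("SST duration (s):", sst_seconds)            # 347
print("write-sets committed meanwhile:", backlog)  # 8296
```

So, under that assumption, the joiner came out of a transfer lasting just under six minutes with roughly 8,300 write-sets still to apply.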
After what looks like a successful SST and join of the new node, the command show variables like "wsrep%"; run on the joiner gives:
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 3
Server version: 5.5.17
Copyright (c) 2000, 2011, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> use main;
ERROR 1047 (08S01): Unknown command
mysql> show variables like "wsrep%";
+--------------------------------+-----------------------------------------+
| Variable_name                  | Value                                   |
+--------------------------------+-----------------------------------------+
| wsrep_OSU_method               | TOI                                     |
| wsrep_auto_increment_control   | ON                                      |
| wsrep_causal_reads             | OFF                                     |
| wsrep_certify_nonPK            | ON                                      |
| wsrep_cluster_name             | my_wsrep_cluster                        |
| wsrep_convert_LOCK_to_trx      | OFF                                     |
| wsrep_data_home_dir            | /var/lib/mysql/                         |
| wsrep_dbug_option              |                                         |
| wsrep_debug                    | OFF                                     |
| wsrep_drupal_282555_workaround | OFF                                     |
| wsrep_forced_binlog_format     | NONE                                    |
| wsrep_max_ws_rows              | 131072                                  |
| wsrep_max_ws_size              | 1073741824                              |
| wsrep_node_name                | Test63                                  |
| wsrep_notify_cmd               |                                         |
| wsrep_on                       | ON                                      |
| wsrep_provider                 | /usr/lib/galera/libgalera_smm.so        |
| wsrep_provider_options         | evs.debug_log_mask = 0x1; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.info_log_mask = 0; evs.install_timeout = PT15S; evs.join_retrans_period = PT0.3S; evs.keepalive_period = PT1S; evs.max_install_timeouts = 1; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.use_aggregate = true; evs.user_send_window = 2; evs.version = 0; evs.view_forget_timeout = PT5M; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.size = 128M; gcs.fc_debug = 0; gcs.fc_factor = 0.5; gcs.fc_limit = 16; gcs.fc_master_slave = NO; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gmcast.listen_addr = tcp://0.0.0.0:4567; gmcast.mcast_addr = ; gmcast.mcast_ttl = 1; gmcast.peer_timeout = PT3S; gmcast.time_wait = PT5S; gmcast.version = 0; pc.checksum = true; pc.ignore_quorum = false; pc.ignore_sb = false; pc.linger = PT2S; pc.npvo = false; pc.version = 0; protonet.backend = asio; protonet.version = 0; replicator.commit_order = 3 |
| wsrep_retry_autocommit         | 1                                       |
| wsrep_slave_threads            | 1                                       |
| wsrep_sst_auth                 | ********                                |
| wsrep_sst_donor                | Test60                                  |
| wsrep_sst_method               | rsync                                   |
| wsrep_sst_receive_address      | 192.168.1.63                            |
| wsrep_start_position           | 00000000-0000-0000-0000-000000000000:-1 |
+--------------------------------+-----------------------------------------+
27 rows in set (0.00 sec)
mysql>
But the node has evidently not joined correctly, because it responds with ERROR 1047 (Unknown command) to most SQL statements, as the use main; attempt above shows.
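The output above is from SHOW VARIABLES, which lists configuration rather than the node's current state. Galera also reports status variables through SHOW STATUS LIKE 'wsrep%', for example wsrep_ready and wsrep_local_state_comment. I am assuming both are available in this wsrep version. The sketch below parses mysql client table output into a dict and applies a simple readiness check. The helper names are made up, and the sample table is illustrative only, not captured from this node.

```python
def parse_mysql_table(text):
    """Parse '| name | value |' rows from mysql client output into a dict."""
    rows = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip '+---+' borders, prompts and row counts
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) == 2 and cells[0] != "Variable_name":
            rows[cells[0]] = cells[1]
    return rows


def node_is_ready(status):
    """Hypothetical check: wsrep_ready is ON and the node reports Synced."""
    state = status.get("wsrep_local_state_comment", "")
    return status.get("wsrep_ready") == "ON" and state.startswith("Synced")


# Illustrative values only (NOT taken from the node in this post).
sample_status = """
+---------------------------+--------+
| Variable_name             | Value  |
+---------------------------+--------+
| wsrep_local_state_comment | Joined |
| wsrep_ready               | OFF    |
+---------------------------+--------+
"""

status = parse_mysql_table(sample_status)
print(status)
print("ready:", node_is_ready(status))
```

Checking for both variables instead of wsrep_on separates "replication is enabled" from "the node is currently accepting queries".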
Can you see anything in the logs that I should look at, or do you have any idea what might be wrong? I had a very similar problem earlier with this node, although other nodes joined successfully. I re-installed the OS and tried the latest versions of MySQL and Galera: galera-23.2.2rc2-amd64.deb and mysql-server-wsrep-5.5.23-23.6-amd64.deb. The result was an apparently successful connection, but the node responded with Unknown command to a USE query. For the present test I am using the same versions of MySQL and Galera as the rest of the cluster: galera-22.1.1-amd64.deb and mysql-server-wsrep-5.5.17-22.3-amd64.deb.
Anyway, I can stop my application briefly, add the node to the cluster and start my application again, so this problem is not urgent, but it is presumably not the intended behaviour. I am using Ubuntu 11.10.
Thanks for any help,
Jase