Hi,
I am trying to join a 4th node to a working cluster that is currently
receiving writes. The other 3 nodes are working fine. When I join the 4th
node, the join appears to succeed: I cannot identify any problems in the
logs, and the wsrep_on variable is set to ON on the joiner.
If I stop all writes to the cluster, the joiner can be restarted; it does
not perform another SST, but it then works properly.
Here are the logs on the joiner:
121024 12:32:50 [Warning] You need to use --log-bin to make --binlog-format
work.
121024 12:32:50 [Note] WSREP: wsrep_load(): loading provider library 'none'
121024 12:32:52 [Note] WSREP: Service disconnected.
121024 12:32:53 [Note] WSREP: Some threads may fail to exit.
121024 12:32:53 [Warning] You need to use --log-bin to make --binlog-format
work.
121024 12:32:53 [Note] WSREP: wsrep_load(): loading provider library 'none'
121024 12:32:53 [Note] WSREP: Service disconnected.
121024 12:32:54 [Note] WSREP: Some threads may fail to exit.
121024 12:33:49 [Note] WSREP: wsrep_load(): loading provider library
'/usr/lib/galera/libgalera_smm.so'
121024 12:33:49 [Note] WSREP: wsrep_load(): Galera 22.1.1(r95) by Codership
Oy loaded succesfully.
121024 12:33:49 [Note] WSREP: Preallocating 134219040/134219040 bytes in
'/var/lib/mysql//galera.cache'...
121024 12:33:49 [Note] WSREP: Passing config to GCS: gcache.dir =
/var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0;
gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M;
gcache.size = 128M; gcs.fc_debug = 0; gcs.fc_factor = 0.5; gcs.fc_limit =
16; gcs.fc_master_slave = NO; gcs.max_packet_size = 64500; gcs.max_throttle
= 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit
= 0.25; replicator.commit_order = 3
121024 12:33:49 [Note] WSREP: wsrep_sst_grab()
121024 12:33:49 [Note] WSREP: Start replication
121024 12:33:49 [Warning] WSREP: state file not found:
/var/lib/mysql//grastate.dat
121024 12:33:49 [Note] WSREP: Assign initial position for certification:
-1, protocol version: 1
121024 12:33:49 [Note] WSREP: Setting initial position to
00000000-0000-0000-0000-000000000000:-1
121024 12:33:49 [Note] WSREP: protonet asio version 0
121024 12:33:49 [Note] WSREP: backend: asio
121024 12:33:49 [Note] WSREP: GMCast version 0
121024 12:33:49 [Note] WSREP: (dc3f073b-1d7a-11e2-0800-6c1bbe3cc082,
'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
121024 12:33:49 [Note] WSREP: (dc3f073b-1d7a-11e2-0800-6c1bbe3cc082,
'tcp://0.0.0.0:4567') multicast: , ttl: 1
121024 12:33:49 [Note] WSREP: EVS version 0
121024 12:33:49 [Note] WSREP: PC version 0
121024 12:33:49 [Note] WSREP: gcomm: connecting to group
'my_wsrep_cluster', peer ''
121024 12:33:49 [Note] WSREP: GMCast::handle_stable_view:
view(view_id(PRIM,dc3f073b-1d7a-11e2-0800-6c1bbe3cc082,1) memb {
dc3f073b-1d7a-11e2-0800-6c1bbe3cc082,
} joined {
} left {
} partitioned {
})
121024 12:33:49 [Note] WSREP: gcomm: connected
121024 12:33:49 [Note] WSREP: Changing maximum packet size to 64500,
resulting msg size: 32636
121024 12:33:49 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
121024 12:33:49 [Note] WSREP: Opened channel 'my_wsrep_cluster'
121024 12:33:49 [Note] WSREP: New COMPONENT: primary = yes, my_idx = 0,
memb_num = 1
121024 12:33:49 [Note] WSREP: Waiting for SST to complete.
121024 12:33:49 [Note] WSREP: Starting new group from scratch:
dc3fe281-1d7a-11e2-0800-67dd803567ca
121024 12:33:49 [Note] WSREP: STATE_EXCHANGE: sent state UUID:
dc401389-1d7a-11e2-0800-cf32a98ae4bd
121024 12:33:49 [Note] WSREP: STATE EXCHANGE: sent state msg:
dc401389-1d7a-11e2-0800-cf32a98ae4bd
121024 12:33:49 [Note] WSREP: STATE EXCHANGE: got state msg:
dc401389-1d7a-11e2-0800-cf32a98ae4bd from 0 (Test63)
121024 12:33:49 [Note] WSREP: Quorum results:
version = 2,
component = PRIMARY,
conf_id = 0,
members = 1/1 (joined/total),
act_id = 0,
last_appl. = -1,
protocols = 0/1/1 (gcs/repl/appl),
group UUID = dc3fe281-1d7a-11e2-0800-67dd803567ca
121024 12:33:49 [Note] WSREP: Flow-control interval: [8, 16]
121024 12:33:49 [Note] WSREP: Restored state OPEN -> JOINED (0)
121024 12:33:49 [Note] WSREP: Member 0 (Test63) synced with group.
121024 12:33:49 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 0)
121024 12:33:49 [Note] WSREP: New cluster view: global state:
dc3fe281-1d7a-11e2-0800-67dd803567ca:0, view# 1: Primary, number of nodes:
1, my index: 0, protocol version 1
121024 12:33:49 [Note] WSREP: SST complete, seqno: 0
121024 12:33:49 InnoDB: The InnoDB memory heap is disabled
121024 12:33:49 InnoDB: Mutexes and rw_locks use GCC atomic builtins
121024 12:33:49 InnoDB: Compressed tables use zlib 1.2.3.3
121024 12:33:49 InnoDB: Initializing buffer pool, size = 128.0M
121024 12:33:49 InnoDB: Completed initialization of buffer pool
InnoDB: The first specified data file ./ibdata1 did not exist:
InnoDB: a new database to be created!
121024 12:33:49 InnoDB: Setting file ./ibdata1 size to 10 MB
InnoDB: Database physically writes the file full: wait...
121024 12:33:49 InnoDB: Log file ./ib_logfile0 did not exist: new to be
created
InnoDB: Setting log file ./ib_logfile0 size to 5 MB
InnoDB: Database physically writes the file full: wait...
121024 12:33:49 InnoDB: Log file ./ib_logfile1 did not exist: new to be
created
InnoDB: Setting log file ./ib_logfile1 size to 5 MB
InnoDB: Database physically writes the file full: wait...
InnoDB: Doublewrite buffer not found: creating new
InnoDB: Doublewrite buffer created
InnoDB: 127 rollback segment(s) active.
InnoDB: Creating foreign key constraint system tables
InnoDB: Foreign key constraint system tables created
121024 12:33:50 InnoDB: Waiting for the background threads to start
121024 12:33:51 InnoDB: 1.1.8 started; log sequence number 0
121024 12:33:51 [Note] Event Scheduler: Loaded 0 events
121024 12:33:51 [Note] WSREP: wsrep_notify_cmd is not defined, skipping
notification.
121024 12:33:51 [Note] WSREP: Assign initial position for certification: 0,
protocol version: 1
121024 12:33:51 [Note] WSREP: Synchronized with group, ready for connections
121024 12:33:51 [Note] WSREP: wsrep_notify_cmd is not defined, skipping
notification.
121024 12:33:51 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.5.17' socket: '/var/run/mysqld/mysqld.sock' port: 3306
wsrep_22.3.r3645
121024 12:34:36 [Note] /usr/sbin/mysqld: Normal shutdown
121024 12:34:36 [Note] WSREP: Stop replication
121024 12:34:36 [Note] WSREP: Closing send monitor...
121024 12:34:36 [Note] WSREP: Closed send monitor.
121024 12:34:36 [Note] WSREP: gcomm: terminating thread
121024 12:34:36 [Note] WSREP: gcomm: joining thread
121024 12:34:36 [Note] WSREP: gcomm: closing backend
121024 12:34:36 [Note] WSREP: GMCast::handle_stable_view: view((empty))
121024 12:34:36 [Note] WSREP: Received self-leave message.
121024 12:34:36 [Note] WSREP: gcomm: closed
121024 12:34:36 [Note] WSREP: Flow-control interval: [0, 0]
121024 12:34:36 [Note] WSREP: Received SELF-LEAVE. Closing connection.
121024 12:34:36 [Note] WSREP: Shifting SYNCED -> CLOSED (TO: 0)
121024 12:34:36 [Note] WSREP: RECV thread exiting 0: Success
121024 12:34:36 [Note] WSREP: New cluster view: global state:
dc3fe281-1d7a-11e2-0800-67dd803567ca:0, view# -1: non-Primary, number of
nodes: 0, my index: -1, protocol version 1
121024 12:34:36 [Note] WSREP: wsrep_notify_cmd is not defined, skipping
notification.
121024 12:34:36 [Note] WSREP: applier thread exiting (code:0)
121024 12:34:36 [Note] WSREP: recv_thread() joined.
121024 12:34:36 [Note] WSREP: Closing slave action queue.
121024 12:34:38 [Note] WSREP: rollbacker thread exiting
121024 12:34:38 [Note] Event Scheduler: Purging the queue. 0 events
121024 12:34:38 [Note] WSREP: dtor state: CLOSED
121024 12:34:38 [Note] WSREP: apply mon: entered 0
121024 12:34:38 [Note] WSREP: apply mon: entered 0
121024 12:34:38 [Note] WSREP: mon: entered 3 oooe fraction 0 oool fraction 0
121024 12:34:38 [Note] WSREP: cert index usage at exit 0
121024 12:34:38 [Note] WSREP: cert trx map usage at exit 0
121024 12:34:38 [Note] WSREP: deps set usage at exit 0
121024 12:34:38 [Note] WSREP: avg deps dist 0
121024 12:34:38 [Note] WSREP: wsdb trx map usage 0 conn query map usage 0
121024 12:34:38 [Note] WSREP: Shifting CLOSED -> DESTROYED (TO: 0)
121024 12:34:38 [Note] WSREP: Flushing memory map to disk...
121024 12:34:38 InnoDB: Starting shutdown...
121024 12:34:38 InnoDB: Shutdown completed; log sequence number 1595675
121024 12:34:38 [Note] /usr/sbin/mysqld: Shutdown complete
121024 12:35:13 [Note] WSREP: wsrep_load(): loading provider library
'/usr/lib/galera/libgalera_smm.so'
121024 12:35:13 [Note] WSREP: wsrep_load(): Galera 22.1.1(r95) by Codership
Oy loaded succesfully.
121024 12:35:13 [Note] WSREP: Reusing existing
'/var/lib/mysql//galera.cache'.
121024 12:35:13 [Note] WSREP: Passing config to GCS: gcache.dir =
/var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0;
gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M;
gcache.size = 128M; gcs.fc_debug = 0; gcs.fc_factor = 0.5; gcs.fc_limit =
16; gcs.fc_master_slave = NO; gcs.max_packet_size = 64500; gcs.max_throttle
= 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit
= 0.25; replicator.commit_order = 3
121024 12:35:13 [Note] WSREP: wsrep_sst_grab()
121024 12:35:13 [Note] WSREP: Start replication
121024 12:35:13 [Note] WSREP: Found saved state:
dc3fe281-1d7a-11e2-0800-67dd803567ca:0
121024 12:35:13 [Note] WSREP: Assign initial position for certification: 0,
protocol version: 1
121024 12:35:13 [Note] WSREP: Setting initial position to
dc3fe281-1d7a-11e2-0800-67dd803567ca:0
121024 12:35:13 [Note] WSREP: protonet asio version 0
121024 12:35:13 [Note] WSREP: backend:
...
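
In case it helps anyone reading a log like the one above, here is a minimal
Python sketch that pulls out the wsrep state transitions ("Shifting X -> Y")
and the membership counts ("memb_num = N"). The names summarize, SHIFT_RE and
MEMB_RE are made up for this example and are not part of Galera or MySQL;
the sample lines are copied from the log above. It only assumes the log text
has been hard-wrapped by the mail client, as it is here.

```python
import re

# Matches Galera state changes such as "Shifting CLOSED -> OPEN (TO: 0)".
SHIFT_RE = re.compile(r"Shifting (\w+) -> (\w+)")
# Matches the membership count in entries such as "memb_num = 1".
MEMB_RE = re.compile(r"memb_num\s*=\s*(\d+)")


def summarize(log_text):
    """Return (state transitions, membership counts) found in a wsrep log.

    Hypothetical helper for illustration only.
    """
    # Long log entries are wrapped across lines in the mail, so collapse
    # all whitespace into single spaces before matching.
    flat = " ".join(log_text.split())
    transitions = SHIFT_RE.findall(flat)
    member_counts = [int(n) for n in MEMB_RE.findall(flat)]
    return transitions, member_counts


if __name__ == "__main__":
    sample = """
    121024 12:33:49 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
    121024 12:33:49 [Note] WSREP: New COMPONENT: primary = yes, my_idx = 0,
    memb_num = 1
    121024 12:33:49 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 0)
    """
    transitions, member_counts = summarize(sample)
    print(transitions)
    print(member_counts)
```

Running it over the full error log gives a quick view of which states the
node went through and how many members it saw in each component, without
having to read every line.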