Hi,
I am using Galera with the following versions:
Galera Library 23.2.0 - galera-23.2.0-src.tar.gz
MySQL 5.5.20 - mysql-5.5.20_wsrep_23.4-linux-x86_64.tar.gz
The cluster has 3 nodes, and I see strange behaviour when I shut down one of the nodes and restart it:
------ Log of node1 when shutting down node2 :
120314 12:41:42 [Note] WSREP: GMCast::handle_stable_view: view(view_id(PRIM,4eb79155-6d9d-11e1-0800-4ce267baa4ec,20) memb {
4eb79155-6d9d-11e1-0800-4ce267baa4ec,
8a274b9b-6dc2-11e1-0800-46dccafcd983,
} joined {
} left {
} partitioned {
54483391-6c53-11e1-0800-e9f853ac6358,
})
120314 12:41:42 [Note] WSREP: New COMPONENT: primary = yes, my_idx = 0, memb_num = 2
120314 12:41:42 [Note] WSREP: forgetting 54483391-6c53-11e1-0800-e9f853ac6358 (tcp://192.168.0.110:4567)
120314 12:41:42 [Note] WSREP: declaring 8a274b9b-6dc2-11e1-0800-46dccafcd983 stable
120314 12:41:42 [Note] WSREP: STATE_EXCHANGE: sent state UUID: ab3a978c-6dca-11e1-0800-1298391903f0
120314 12:41:42 [Note] WSREP: STATE EXCHANGE: sent state msg: ab3a978c-6dca-11e1-0800-1298391903f0
120314 12:41:42 [Note] WSREP: STATE EXCHANGE: got state msg: ab3a978c-6dca-11e1-0800-1298391903f0 from 0 (cygnus)
120314 12:41:42 [Note] WSREP: STATE EXCHANGE: got state msg: ab3a978c-6dca-11e1-0800-1298391903f0 from 1 (vmdebian2)
120314 12:41:42 [Note] WSREP: Quorum results:
version = 2,
component = PRIMARY,
conf_id = 19,
members = 2/2 (joined/total),
act_id = 114755,
last_appl. = 114523,
protocols = 0/3/1 (gcs/repl/appl),
group UUID = 7757c866-6c09-11e1-0800-08c23d11f6b3
120314 12:41:42 [Note] WSREP: Flow-control interval: [12, 23]
120314 12:41:42 [Note] WSREP: New cluster view: global state: 7757c866-6c09-11e1-0800-08c23d11f6b3:114755, view# 20: Primary, number of nodes: 2, my index: 0, protocol version 1
120314 12:41:42 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
120314 12:41:42 [Note] WSREP: Assign initial position for certification: 114755, protocol version: 2
120314 12:41:47 [Note] WSREP: cleaning up 54483391-6c53-11e1-0800-e9f853ac6358 (tcp://192.168.0.110:4567)
-> Seems OK
------ Log of node2 when shutting down node2 :
120314 12:41:41 [Note] /usr/sbin/mysqld: Normal shutdown
120314 12:41:41 [Note] WSREP: Stop replication
120314 12:41:41 [Note] WSREP: Closing send monitor...
120314 12:41:41 [Note] WSREP: Closed send monitor.
120314 12:41:41 [Note] WSREP: gcomm: terminating thread
120314 12:41:41 [Note] WSREP: gcomm: joining thread
120314 12:41:41 [Note] WSREP: gcomm: closing backend
120314 12:41:41 [Note] WSREP: evs::proto(54483391-6c53-11e1-0800-e9f853ac6358, LEAVING, view_id(REG,4eb79155-6d9d-11e1-0800-4ce267baa4ec,19)) uuid 4eb79155-6d9d-11e1-0800-4ce267baa4ec missing from install message, assuming partitioned
120314 12:41:41 [Note] WSREP: evs::proto(54483391-6c53-11e1-0800-e9f853ac6358, LEAVING, view_id(REG,4eb79155-6d9d-11e1-0800-4ce267baa4ec,19)) uuid 8a274b9b-6dc2-11e1-0800-46dccafcd983 missing from install message, assuming partitioned
120314 12:41:41 [Note] WSREP: New COMPONENT: primary = no, my_idx = 0, memb_num = 1
120314 12:41:41 [Note] WSREP: GMCast::handle_stable_view: view(view_id(NON_PRIM,4eb79155-6d9d-11e1-0800-4ce267baa4ec,19) memb {
54483391-6c53-11e1-0800-e9f853ac6358,
} joined {
} left {
} partitioned {
4eb79155-6d9d-11e1-0800-4ce267baa4ec,
8a274b9b-6dc2-11e1-0800-46dccafcd983,
})
120314 12:41:41 [Note] WSREP: GMCast::handle_stable_view: view((empty))
120314 12:41:41 [Note] WSREP: gcomm: closed
120314 12:41:41 [Note] WSREP: Flow-control interval: [8, 16]
120314 12:41:41 [Note] WSREP: Received NON-PRIMARY.
120314 12:41:41 [Note] WSREP: Shifting SYNCED -> OPEN (TO: 114755)
120314 12:41:41 [Note] WSREP: Received self-leave message.
120314 12:41:41 [Note] WSREP: Flow-control interval: [0, 0]
120314 12:41:41 [Note] WSREP: Received SELF-LEAVE. Closing connection.
120314 12:41:41 [Note] WSREP: Shifting OPEN -> CLOSED (TO: 114755)
120314 12:41:41 [Note] WSREP: RECV thread exiting 0: Success
120314 12:41:41 [Note] WSREP: New cluster view: global state: 7757c866-6c09-11e1-0800-08c23d11f6b3:114755, view# -1: non-Primary, number of nodes: 1, my index: 0, protocol version 1
120314 12:41:41 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
120314 12:41:41 [Note] WSREP: New cluster view: global state: 7757c866-6c09-11e1-0800-08c23d11f6b3:114755, view# -1: non-Primary, number of nodes: 0, my index: -1, protocol version 1
120314 12:41:41 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
120314 12:41:41 [Note] WSREP: applier thread exiting (code:0)
120314 12:41:41 [Note] WSREP: recv_thread() joined.
120314 12:41:41 [Note] WSREP: Closing slave action queue.
120314 12:41:43 [Note] WSREP: SST kill local trx: 246733
120314 12:41:43 [Note] WSREP: rollbacker thread exiting
120314 12:41:43 [Note] Event Scheduler: Purging the queue. 0 events
120314 12:41:43 [Note] WSREP: dtor state: CLOSED
120314 12:41:43 [Note] WSREP: apply mon: entered 0
120314 12:41:43 [Note] WSREP: apply mon: entered 0
120314 12:41:44 [Note] WSREP: mon: entered 111634 oooe fraction 0 oool fraction 8.06206e-05
120314 12:41:44 [Note] WSREP: cert index usage at exit 339
120314 12:41:44 [Note] WSREP: cert trx map usage at exit 234
120314 12:41:44 [Note] WSREP: deps set usage at exit 0
120314 12:41:44 [Note] WSREP: avg deps dist 90.8376
120314 12:41:44 [Note] WSREP: wsdb trx map usage 0 conn query map usage 0
120314 12:41:44 [Note] WSREP: Shifting CLOSED -> DESTROYED (TO: 114755)
120314 12:41:44 [Note] WSREP: Flushing memory map to disk...
120314 12:41:44 InnoDB: Starting shutdown...
120314 12:41:46 InnoDB: Shutdown completed; log sequence number 617857171
120314 12:41:46 [Note] /usr/sbin/mysqld: Shutdown complete
-> Seems OK
------ Log of node1 when restarting node2 :
120314 13:03:21 [Note] WSREP: (4eb79155-6d9d-11e1-0800-4ce267baa4ec, 'tcp://0.0.0.0:4567') cleaning up duplicate 0x7fc080001120 after established 0x7fc080001810
120314 13:03:21 [Note] WSREP: (4eb79155-6d9d-11e1-0800-4ce267baa4ec, 'tcp://0.0.0.0:4567') cleaning up duplicate 0x16ad9b0 after established 0x7fc080002430
120314 13:03:21 [Note] WSREP: New COMPONENT: primary = yes, my_idx = 0, memb_num = 3
120314 13:03:21 [Note] WSREP: GMCast::handle_stable_view: view(view_id(PRIM,4eb79155-6d9d-11e1-0800-4ce267baa4ec,21) memb {
4eb79155-6d9d-11e1-0800-4ce267baa4ec,
8a274b9b-6dc2-11e1-0800-46dccafcd983,
b0452092-6dcd-11e1-0800-555f18cbadc9,
} joined {
} left {
} partitioned {
})
120314 13:03:21 [Note] WSREP: declaring 8a274b9b-6dc2-11e1-0800-46dccafcd983 stable
120314 13:03:21 [Note] WSREP: declaring b0452092-6dcd-11e1-0800-555f18cbadc9 stable
120314 13:03:21 [Note] WSREP: STATE_EXCHANGE: sent state UUID: b1e22797-6dcd-11e1-0800-8b91e67379ba
120314 13:03:21 [Note] WSREP: STATE EXCHANGE: sent state msg: b1e22797-6dcd-11e1-0800-8b91e67379ba
120314 13:03:21 [Note] WSREP: STATE EXCHANGE: got state msg: b1e22797-6dcd-11e1-0800-8b91e67379ba from 0 (cygnus)
120314 13:03:21 [Note] WSREP: STATE EXCHANGE: got state msg: b1e22797-6dcd-11e1-0800-8b91e67379ba from 1 (vmdebian2)
120314 13:03:22 [Note] WSREP: STATE EXCHANGE: got state msg: b1e22797-6dcd-11e1-0800-8b91e67379ba from 2 (vmdebian1)
120314 13:03:22 [Note] WSREP: Quorum results:
version = 2,
component = PRIMARY,
conf_id = 20,
members = 2/3 (joined/total),
act_id = 116071,
last_appl. = 116068,
protocols = 0/3/1 (gcs/repl/appl),
group UUID = 7757c866-6c09-11e1-0800-08c23d11f6b3
120314 13:03:22 [Note] WSREP: Flow-control interval: [14, 28]
120314 13:03:22 [Note] WSREP: New cluster view: global state: 7757c866-6c09-11e1-0800-08c23d11f6b3:116071, view# 21: Primary, number of nodes: 3, my index: 0, protocol version 1
120314 13:03:22 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
120314 13:03:22 [Note] WSREP: Assign initial position for certification: 116071, protocol version: 2
120314 13:03:25 [Note] WSREP: Node 2 (vmdebian1) requested state transfer from '*any*'. Selected 0 (cygnus)(SYNCED) as donor.
120314 13:03:25 [Note] WSREP: Shifting SYNCED -> DONOR/DESYNCED (TO: 116073)
120314 13:03:25 [Note] WSREP: IST request: 7757c866-6c09-11e1-0800-08c23d11f6b3:114755-116071|tcp://192.168.0.110:4568
120314 13:03:25 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
120314 13:03:25 [Note] WSREP: Running: 'wsrep_sst_rsync 'donor' '192.168.0.110:4444/rsync_sst' 'sst:5T13wPid' '/opt/mysql-galera/data/' '/etc/my-galera.cnf' '7757c866-6c09-11e1-0800-08c23d11f6b3' '114755' '1''
120314 13:03:25 [Note] WSREP: sst_donor_thread signaled with 0
120314 13:03:25 [Note] WSREP: async IST sender starting to serve tcp://192.168.0.110:4568 sending 114756-116071
------ Log of node2 when restarting node2 :
120314 13:03:07 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
120314 13:03:07 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib/galera/libgalera_smm.so'
120314 13:03:07 [Note] WSREP: wsrep_load(): Galera 23.2.0(r120) by Codership Oy <in...@codership.com> loaded succesfully.
120314 13:03:07 [Note] WSREP: Preallocating 134217728/268436768 bytes in '/var/lib/mysql//galera.cache'...
120314 13:03:18 [Note] WSREP: Passing config to GCS: gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 256M; gcache.size = 256M; gcs.fc_debug = 0; gcs.fc_factor = 0.5; gcs.fc_limit = 16; gcs.fc_master_slave = NO; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 2147483647; gcs.recv_q_soft_limit = 0.25; replicator.causal_read_timeout = PT30S; replicator.commit_order = 3
120314 13:03:19 [Note] WSREP: wsrep_sst_grab()
120314 13:03:19 [Note] WSREP: Start replication
120314 13:03:19 [Note] WSREP: Found saved state: 7757c866-6c09-11e1-0800-08c23d11f6b3:114755
120314 13:03:19 [Note] WSREP: Assign initial position for certification: 114755, protocol version: -1
120314 13:03:19 [Note] WSREP: Setting initial position to 7757c866-6c09-11e1-0800-08c23d11f6b3:114755
120314 13:03:19 [Note] WSREP: protonet asio version 0
120314 13:03:19 [Note] WSREP: backend: asio
120314 13:03:19 [Note] WSREP: GMCast version 0
120314 13:03:19 [Note] WSREP: (b0452092-6dcd-11e1-0800-555f18cbadc9, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
120314 13:03:19 [Note] WSREP: (b0452092-6dcd-11e1-0800-555f18cbadc9, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
120314 13:03:19 [Note] WSREP: EVS version 0
120314 13:03:19 [Note] WSREP: PC version 0
120314 13:03:19 [Note] WSREP: gcomm: connecting to group 'my_wsrep_cluster', peer '192.168.0.4:4567'
120314 13:03:19 [Note] WSREP: (b0452092-6dcd-11e1-0800-555f18cbadc9, 'tcp://0.0.0.0:4567') cleaning up duplicate 0x8b06ba8 after established 0x8b17e98
120314 13:03:19 [Note] WSREP: GMCast::handle_stable_view: view(view_id(PRIM,4eb79155-6d9d-11e1-0800-4ce267baa4ec,21) memb {
4eb79155-6d9d-11e1-0800-4ce267baa4ec,
8a274b9b-6dc2-11e1-0800-46dccafcd983,
b0452092-6dcd-11e1-0800-555f18cbadc9,
} joined {
} left {
} partitioned {
})
120314 13:03:19 [Note] WSREP: declaring 4eb79155-6d9d-11e1-0800-4ce267baa4ec stable
120314 13:03:19 [Note] WSREP: declaring 8a274b9b-6dc2-11e1-0800-46dccafcd983 stable
120314 13:03:20 [Note] WSREP: gcomm: connected
120314 13:03:20 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636
120314 13:03:20 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
120314 13:03:20 [Note] WSREP: Opened channel 'my_wsrep_cluster'
120314 13:03:20 [Note] WSREP: Waiting for SST to complete.
120314 13:03:20 [Note] WSREP: New COMPONENT: primary = yes, my_idx = 2, memb_num = 3
120314 13:03:20 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.
120314 13:03:20 [Note] WSREP: STATE EXCHANGE: sent state msg: b1e22797-6dcd-11e1-0800-8b91e67379ba
120314 13:03:20 [Note] WSREP: STATE EXCHANGE: got state msg: b1e22797-6dcd-11e1-0800-8b91e67379ba from 0 (cygnus)
120314 13:03:20 [Note] WSREP: STATE EXCHANGE: got state msg: b1e22797-6dcd-11e1-0800-8b91e67379ba from 1 (vmdebian2)
120314 13:03:20 [Note] WSREP: STATE EXCHANGE: got state msg: b1e22797-6dcd-11e1-0800-8b91e67379ba from 2 (vmdebian1)
120314 13:03:20 [Note] WSREP: Quorum results:
version = 2,
component = PRIMARY,
conf_id = 20,
members = 2/3 (joined/total),
act_id = 116071,
last_appl. = -1,
protocols = 0/3/1 (gcs/repl/appl),
group UUID = 7757c866-6c09-11e1-0800-08c23d11f6b3
120314 13:03:20 [Note] WSREP: Flow-control interval: [14, 28]
120314 13:03:20 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 116071)
120314 13:03:20 [Note] WSREP: State transfer required:
Group state: 7757c866-6c09-11e1-0800-08c23d11f6b3:116071
Local state: 7757c866-6c09-11e1-0800-08c23d11f6b3:114755
120314 13:03:20 [Note] WSREP: New cluster view: global state: 7757c866-6c09-11e1-0800-08c23d11f6b3:116071, view# 21: Primary, number of nodes: 3, my index: 2, protocol version 1
120314 13:03:20 [Warning] WSREP: Gap in state sequence. Need state transfer.
120314 13:03:22 [Note] WSREP: Running: 'wsrep_sst_rsync 'joiner' '192.168.0.110' 'sst:5T13wPid' '/var/lib/mysql/' '/etc/mysql/my.cnf' '1176' 2>sst.err'
120314 13:03:23 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
120314 13:03:23 [Note] WSREP: Assign initial position for certification: 116071, protocol version: 2
120314 13:03:23 [Note] WSREP: Prepared IST receiver, listening at: tcp://192.168.0.110:4568
120314 13:03:23 [Note] WSREP: Node 2 (vmdebian1) requested state transfer from '*any*'. Selected 0 (cygnus)(SYNCED) as donor.
120314 13:03:23 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 116073)
120314 13:03:23 [Note] WSREP: Requesting state transfer: success, donor: 0
120314 13:03:25 [Note] WSREP: SST complete, seqno: 114755
120314 13:03:25 [ERROR] Can't open shared library '/usr/lib/galera/semisync_master.so' (errno: 0 cannot open shared object file: No such file or directory)
120314 13:03:25 [Warning] Couldn't load plugin named 'rpl_semi_sync_master' with soname 'semisync_master.so'.
120314 13:03:25 [ERROR] Can't open shared library '/usr/lib/galera/semisync_slave.so' (errno: 0 cannot open shared object file: No such file or directory)
120314 13:03:25 [Warning] Couldn't load plugin named 'rpl_semi_sync_slave' with soname 'semisync_slave.so'.
120314 13:03:25 InnoDB: The InnoDB memory heap is disabled
120314 13:03:25 InnoDB: Mutexes and rw_locks use GCC atomic builtins
120314 13:03:25 InnoDB: Compressed tables use zlib 1.2.3.3
120314 13:03:25 InnoDB: Using Linux native AIO
120314 13:03:25 InnoDB: Initializing buffer pool, size = 64.0M
120314 13:03:25 InnoDB: Completed initialization of buffer pool
120314 13:03:25 InnoDB: highest supported file format is Barracuda.
120314 13:03:26 InnoDB: Waiting for the background threads to start
120314 13:03:27 InnoDB: 1.1.8 started; log sequence number 617857171
120314 13:03:28 [Note] Event Scheduler: Loaded 0 events
120314 13:03:28 [Note] WSREP: Signalling provider to continue.
120314 13:03:28 [Note] WSREP: Received SST: 7757c866-6c09-11e1-0800-08c23d11f6b3:114755
120314 13:03:28 [Note] WSREP: SST received: 7757c866-6c09-11e1-0800-08c23d11f6b3:114755
120314 13:03:28 [Note] WSREP: Receiving IST: 1316 writesets, seqnos 114755-116071
120314 13:03:28 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.5.20-log' socket: '/var/lib/mysql/mysql.sock' port: 3306 Source distribution, wsrep_23.4.r3713
120314 13:03:59 [Warning] IP address '192.168.0.1' could not be resolved: Name or service not known
After that, nothing more happens, even if I wait for hours. Here is the status of the two nodes as seen from the mysql shell:
node1 :
mysql> show status like 'wsrep%';
+----------------------------+--------------------------------------+
| Variable_name | Value |
+----------------------------+--------------------------------------+
| wsrep_local_state_uuid | 7757c866-6c09-11e1-0800-08c23d11f6b3 |
| wsrep_protocol_version | 3 |
| wsrep_last_committed | 117834 |
| wsrep_replicated | 6412 |
| wsrep_replicated_bytes | 12888505 |
| wsrep_received | 14039 |
| wsrep_received_bytes | 31360018 |
| wsrep_local_commits | 6375 |
| wsrep_local_cert_failures | 3 |
| wsrep_local_bf_aborts | 129 |
| wsrep_local_replays | 103 |
| wsrep_local_send_queue | 0 |
| wsrep_local_send_queue_avg | 0.000000 |
| wsrep_local_recv_queue | 0 |
| wsrep_local_recv_queue_avg | 0.000000 |
| wsrep_flow_control_paused | 0.000000 |
| wsrep_flow_control_sent | 0 |
| wsrep_flow_control_recv | 0 |
| wsrep_cert_deps_distance | 93.660767 |
| wsrep_apply_oooe | 0.000000 |
| wsrep_apply_oool | 0.000000 |
| wsrep_apply_window | 1.000000 |
| wsrep_commit_oooe | 0.000000 |
| wsrep_commit_oool | 0.000000 |
| wsrep_commit_window | 1.000000 |
| wsrep_local_state | 2 |
| wsrep_local_state_comment | Donor (+) |
| wsrep_cert_index_size | 1866 |
| wsrep_cluster_conf_id | 21 |
| wsrep_cluster_size | 3 |
| wsrep_cluster_state_uuid | 7757c866-6c09-11e1-0800-08c23d11f6b3 |
| wsrep_cluster_status | Primary |
| wsrep_connected | ON |
| wsrep_local_index | 0 |
| wsrep_provider_name | Galera |
| wsrep_provider_version | 2.0(rXXXX) |
| wsrep_ready | ON |
+----------------------------+--------------------------------------+
38 rows in set (0.00 sec)
node2 :
mysql> show status like 'wsrep%';
+----------------------------+--------------------------------------+
| Variable_name | Value |
+----------------------------+--------------------------------------+
| wsrep_local_state_uuid | 7757c866-6c09-11e1-0800-08c23d11f6b3 |
| wsrep_protocol_version | 3 |
| wsrep_last_committed | 114755 |
| wsrep_replicated | 0 |
| wsrep_replicated_bytes | 0 |
| wsrep_received | 1 |
| wsrep_received_bytes | 243 |
| wsrep_local_commits | 0 |
| wsrep_local_cert_failures | 0 |
| wsrep_local_bf_aborts | 0 |
| wsrep_local_replays | 0 |
| wsrep_local_send_queue | 0 |
| wsrep_local_send_queue_avg | 0.000000 |
| wsrep_local_recv_queue | 1766 |
| wsrep_local_recv_queue_avg | 0.000000 |
| wsrep_flow_control_paused | 0.000000 |
| wsrep_flow_control_sent | 0 |
| wsrep_flow_control_recv | 0 |
| wsrep_cert_deps_distance | 0.000000 |
| wsrep_apply_oooe | 0.000000 |
| wsrep_apply_oool | 0.000000 |
| wsrep_apply_window | 0.000000 |
| wsrep_commit_oooe | 0.000000 |
| wsrep_commit_oool | 0.000000 |
| wsrep_commit_window | 0.000000 |
| wsrep_local_state | 1 |
| wsrep_local_state_comment | Waiting for SST (4) |
| wsrep_cert_index_size | 0 |
| wsrep_cluster_conf_id | 21 |
| wsrep_cluster_size | 3 |
| wsrep_cluster_state_uuid | 7757c866-6c09-11e1-0800-08c23d11f6b3 |
| wsrep_cluster_status | Primary |
| wsrep_connected | ON |
| wsrep_local_index | 2 |
| wsrep_provider_name | Galera |
| wsrep_provider_version | 23.2.0(r120) |
| wsrep_ready | OFF |
+----------------------------+--------------------------------------+
38 rows in set (0.00 sec)
node3 is OK: its local state is 4, with comment Synced (6).
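To make the difference between the two outputs easier to see, here is a small Python sketch that parses the `show status like 'wsrep%'` tables and compares a few variables. The function name `parse_wsrep_status` and the `NODE1`/`NODE2` constants are my own and not part of Galera or MySQL; the values are copied from the outputs above.

```python
def parse_wsrep_status(text):
    """Turn the mysql shell table for `show status like 'wsrep%'` into a dict."""
    status = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line.startswith("|"):
            continue  # skip the +----+ borders and blank lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) == 2 and cells[0] != "Variable_name":
            status[cells[0]] = cells[1]
    return status


# Excerpts of the two outputs above (only the rows used below).
NODE1 = """
| wsrep_last_committed | 117834 |
| wsrep_local_recv_queue | 0 |
| wsrep_local_state_comment | Donor (+) |
| wsrep_ready | ON |
"""
NODE2 = """
| wsrep_last_committed | 114755 |
| wsrep_local_recv_queue | 1766 |
| wsrep_local_state_comment | Waiting for SST (4) |
| wsrep_ready | OFF |
"""

node1 = parse_wsrep_status(NODE1)
node2 = parse_wsrep_status(NODE2)

# How far node2's last committed seqno is behind node1's.
lag = int(node1["wsrep_last_committed"]) - int(node2["wsrep_last_committed"])
print("node2 is behind node1 by", lag, "seqnos")
print("node2 receive queue:", node2["wsrep_local_recv_queue"])
print("node2 wsrep_ready:", node2["wsrep_ready"])
```

With the values above, node2 is still at the seqno it had at shutdown (114755) while its receive queue keeps growing, which is what makes me think the received writesets are never applied.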
Some questions:
- Why does node2 seem unable to complete the IST from node1?
- If I understand correctly, node2 cannot serve requests while it waits to be synced from node1, which seems logical. But what about node1? It shows local state 2 / Donor (+). Is the node able to serve requests (select & insert) in this state or not? If not, then only 1 node out of three is available in this situation, which can be a problem under high traffic.
- I saw a Warning message about IP address 192.168.0.1 (the main IP address of eth0 on node1), but the node is listening on 192.168.0.4, which is an alias on eth0:
eth0 Link encap:Ethernet HWaddr 00:13:D4:87:35:3A
inet adr:192.168.0.1 Bcast:192.168.0.255 Masque:255.255.255.0
eth0:3 Link encap:Ethernet HWaddr 00:13:D4:87:35:3A
inet adr:192.168.0.4 Bcast:192.168.0.255 Masque:255.255.255.0
node2 has IP address 192.168.0.110:
eth0 Link encap:Ethernet HWaddr 08:00:27:28:83:d7
inet adr:192.168.0.110 Bcast:192.168.0.255 Masque:255.255.255.0
Can the alias cause a problem?
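Regarding the first question, the seqno ranges on both sides do look consistent to me. Below is a rough Python sketch of the check I did by hand; the regular expressions and function names are my own, and the two log lines are copied from the node1 and node2 logs above. I am assuming the joiner line reports its own saved position as the lower bound, so the writeset count is the difference between the two numbers.

```python
import re

# Patterns written by me to match the two log lines quoted above.
RECV = re.compile(r"Receiving IST: (\d+) writesets, seqnos (\d+)-(\d+)")
SEND = re.compile(r"sending (\d+)-(\d+)")


def parse_receiver(line):
    """Return (count, local_seqno, group_seqno) from the joiner log line."""
    match = RECV.search(line)
    if match is None:
        return None
    return tuple(int(value) for value in match.groups())


def parse_sender(line):
    """Return (first_sent, last_sent) from the donor log line."""
    match = SEND.search(line)
    if match is None:
        return None
    return tuple(int(value) for value in match.groups())


joiner_line = ("120314 13:03:28 [Note] WSREP: Receiving IST: "
               "1316 writesets, seqnos 114755-116071")
donor_line = ("120314 13:03:25 [Note] WSREP: async IST sender starting to "
              "serve tcp://192.168.0.110:4568 sending 114756-116071")

count, local_seqno, group_seqno = parse_receiver(joiner_line)
first_sent, last_sent = parse_sender(donor_line)

# The joiner expects group_seqno - local_seqno writesets.
print("joiner gap matches count:", group_seqno - local_seqno == count)
# The donor sends an inclusive range starting right after the joiner's seqno.
print("donor range matches count:", last_sent - first_sent + 1 == count)
print("donor starts after joiner position:", first_sent == local_seqno + 1)
```

So the donor announces exactly the range that the joiner says it is receiving, yet node2 never leaves the "Waiting for SST" state.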
Any help or advice would be much appreciated. Thanks in advance for your time.
BR,
Laurent