Why did a slave change to master without failover in the cluster?


cwh...@gmail.com

Aug 11, 2014, 11:45:48 PM
to redi...@googlegroups.com
The cluster has six nodes. Their information, taken from the 'cluster nodes' command, is as follows:
6f1a37d66aacbcb0128c6306b5b989ad81c160c1 192.168.1.157:6379 master - 0 1407808913042 13 connected 0-5460
876f04626f806226755c98f15a8475c950a8446b 192.168.1.155:6379 master - 0 1407808915045 5 connected 5461-10922
a9ce649f167b83dc477d195e62ca3fe9034db5e6 192.168.1.151:6379 slave 3e6c92b2137397573846ca7ce3385de19b033c13 0 1407808912040 9 connected
f7bbdafd8baee642981062877e51a7a47f85acdd 192.168.1.153:6379 slave 6f1a37d66aacbcb0128c6306b5b989ad81c160c1 0 1407808911039 13 connected
fbb5b1d8d5433be8f5a98d8854fc87b3b266cdff 192.168.1.152:6379 slave 876f04626f806226755c98f15a8475c950a8446b 0 1407808914042 5 connected
3e6c92b2137397573846ca7ce3385de19b033c13 :6379 myself,master - 0 0 9 connected 10923-16383
The IP of the 'myself' node above is 192.168.1.154.
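For reference, each node's current role can also be confirmed with a short script. This is only a sketch, assuming the redis-py client is installed; the addresses are the six nodes listed above:

# Sketch: print the replication role reported by every node (assumes redis-py).
import redis

HOSTS = ["192.168.1.151", "192.168.1.152", "192.168.1.153",
         "192.168.1.154", "192.168.1.155", "192.168.1.157"]

for host in HOSTS:
    r = redis.StrictRedis(host=host, port=6379, socket_timeout=2)
    role = r.info("replication")["role"]  # 'master' or 'slave'
    print("%s:6379 -> %s" % (host, role))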

After writing some k/v pairs into the cluster, I executed the 'flushall' command on every master node. Finally, I ran the 'cluster nodes' command again:
6f1a37d66aacbcb0128c6306b5b989ad81c160c1 192.168.1.157:6379 master - 0 1407809551053 13 connected 0-5460
876f04626f806226755c98f15a8475c950a8446b 192.168.1.155:6379 master - 0 1407809548048 5 connected 5461-10922
a9ce649f167b83dc477d195e62ca3fe9034db5e6 192.168.1.151:6379 master - 0 1407809551053 16 connected 10923-16383
f7bbdafd8baee642981062877e51a7a47f85acdd 192.168.1.153:6379 slave 6f1a37d66aacbcb0128c6306b5b989ad81c160c1 0 1407809550051 13 connected
fbb5b1d8d5433be8f5a98d8854fc87b3b266cdff 192.168.1.152:6379 slave 876f04626f806226755c98f15a8475c950a8446b 0 1407809549049 5 connected
3e6c92b2137397573846ca7ce3385de19b033c13 :6379 myself,slave a9ce649f167b83dc477d195e62ca3fe9034db5e6 0 0 9 connected
Obviously, the previous master '192.168.1.154' and its slave '192.168.1.151' have swapped roles.

I read the log of '192.168.1.154', but did not understand why this change could happen. The log fragment is as follows:
39690:M 12 Aug 09:54:56.156 # Server started, Redis version 2.9.56
39690:M 12 Aug 09:54:56.158 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
39690:M 12 Aug 09:54:56.159 * The server is now ready to accept connections on port 6379
39690:M 12 Aug 09:54:59.164 # Cluster state changed: ok
39690:M 12 Aug 09:55:27.900 * Clear FAIL state for node fbb5b1d8d5433be8f5a98d8854fc87b3b266cdff: slave is reachable again.
39690:M 12 Aug 09:55:34.521 * Clear FAIL state for node f7bbdafd8baee642981062877e51a7a47f85acdd: slave is reachable again.
39690:M 12 Aug 09:56:07.015 * Clear FAIL state for node a9ce649f167b83dc477d195e62ca3fe9034db5e6: slave is reachable again.
39690:M 12 Aug 09:56:08.010 * Slave asks for synchronization
39690:M 12 Aug 09:56:08.010 * Full resync requested by slave.
39690:M 12 Aug 09:56:08.010 * Starting BGSAVE for SYNC
39690:M 12 Aug 09:56:08.011 * Background saving started by pid 39829
39829:C 12 Aug 09:56:08.028 * DB saved on disk
39829:C 12 Aug 09:56:08.029 * RDB: 4 MB of memory used by copy-on-write
39690:M 12 Aug 09:56:08.115 * Background saving terminated with success
39690:M 12 Aug 09:56:08.115 * Synchronization with slave succeeded
39690:M 12 Aug 10:09:22.110 * Marking node 876f04626f806226755c98f15a8475c950a8446b as failing (quorum reached).
39690:M 12 Aug 10:09:22.111 # Cluster state changed: fail
39690:M 12 Aug 10:09:48.081 # Failover auth granted to fbb5b1d8d5433be8f5a98d8854fc87b3b266cdff for epoch 15
39690:M 12 Aug 10:09:48.082 * FAIL message received from 876f04626f806226755c98f15a8475c950a8446b about fbb5b1d8d5433be8f5a98d8854fc87b3b266cdff
39690:M 12 Aug 10:09:48.082 # Failover auth denied to a9ce649f167b83dc477d195e62ca3fe9034db5e6: its master is up
39690:M 12 Aug 10:09:48.082 # Configuration change detected. Reconfiguring myself as a replica of a9ce649f167b83dc477d195e62ca3fe9034db5e6
39690:S 12 Aug 10:09:48.183 * Clear FAIL state for node fbb5b1d8d5433be8f5a98d8854fc87b3b266cdff: slave is reachable again.
39690:S 12 Aug 10:09:49.085 * Connecting to MASTER 192.168.1.151:6379
39690:S 12 Aug 10:09:49.085 * MASTER <-> SLAVE sync started
39690:S 12 Aug 10:09:49.085 * Non blocking connect for SYNC fired the event.
39690:S 12 Aug 10:09:49.085 * Master replied to PING, replication can continue...
39690:S 12 Aug 10:09:49.085 * Partial resynchronization not possible (no cached master)
39690:S 12 Aug 10:09:49.086 * Full resync from master: 57beb0e3305365d9576deb5b4afd8000362655d0:514204110
39690:S 12 Aug 10:09:54.933 * MASTER <-> SLAVE sync: receiving 314225183 bytes from master
39690:S 12 Aug 10:09:56.006 * Clear FAIL state for node 876f04626f806226755c98f15a8475c950a8446b: is reachable again and nobody is serving its slots after some time.
39690:S 12 Aug 10:09:56.006 # Cluster state changed: ok
39690:S 12 Aug 10:09:57.823 * MASTER <-> SLAVE sync: Flushing old data
39690:S 12 Aug 10:09:57.824 * MASTER <-> SLAVE sync: Loading DB in memory
39690:S 12 Aug 10:11:24.815 * MASTER <-> SLAVE sync: Finished with success

I understand that the node 876f04626f806226755c98f15a8475c950a8446b (192.168.1.155) was detected as failed, but even if that triggered a failover, it should have affected 192.168.1.155. Why was a failover needed for 192.168.1.154?
Maybe I misunderstand the log. Please help me, thanks!
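(For context, the window within which an unresponsive master can be flagged as failing by its peers is controlled by the cluster-node-timeout setting. A minimal sketch for reading it on each node, again assuming the redis-py client:)

# Sketch: read the failure-detection window on every node (assumes redis-py).
import redis

for host in ["192.168.1.151", "192.168.1.152", "192.168.1.153",
             "192.168.1.154", "192.168.1.155", "192.168.1.157"]:
    r = redis.StrictRedis(host=host, port=6379, socket_timeout=2)
    timeout_ms = r.config_get("cluster-node-timeout")["cluster-node-timeout"]
    print("%s: cluster-node-timeout = %s ms" % (host, timeout_ms))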

赵方远

Jul 23, 2015, 1:15:43 PM
to Redis DB, cwh...@gmail.com
I ran into the same issue with flushall. Can anyone help? I think this is definitely not a rare case!

On Tuesday, August 12, 2014 at 11:45:48 AM UTC+8, cwh...@gmail.com wrote:

赵方远

Jul 23, 2015, 2:08:06 PM
to Redis DB, cwh...@gmail.com
Got the answer! Check it out here:
https://github.com/antirez/redis/issues/2691
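If the root cause is a flushall that blocks the master long enough for its peers to mark it as failing and promote its slave, one possible workaround is to empty each master in small batches so the node keeps answering cluster pings in between. A minimal sketch, assuming the redis-py client; the host below is only illustrative:

# Sketch: empty one master in small batches instead of a single blocking
# FLUSHALL, so the event loop stays responsive between deletions.
# Assumes the redis-py client; the host is just an example.
import redis

r = redis.StrictRedis(host="192.168.1.154", port=6379)
cursor = 0
while True:
    cursor, keys = r.scan(cursor=cursor, count=500)
    if keys:
        # Single-key DELs in a non-transactional pipeline avoid the
        # cluster's cross-slot restriction on multi-key commands.
        pipe = r.pipeline(transaction=False)
        for key in keys:
            pipe.delete(key)
        pipe.execute()
    if cursor == 0:
        break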
By the way, are you Chinese?


On Tuesday, August 12, 2014 at 11:45:48 AM UTC+8, cwh...@gmail.com wrote: