PSYNC Failed : Partial resynchronization not accepted (v 5.0.3)

Ganesh Kumar Ganesan

Jan 1, 2020, 5:31:00 AM
to Redis DB

Our application is deployed in two environments (Env - 1 and Env - 2). Each environment has two nodes: a master and a replica.

              Env - 1                                  Env - 2
+ + + + + + + + + + + + + + + + +        + + + + + + + + + + + + + + + + +    
| MASTER - - - - - - -> REPLICA |  --->  | MASTER - - - - - - -> REPLICA | 
+ + + + + + + + + + + + + + + + +        + + + + + + + + + + + + + + + + +    

Initially, Redis replication is configured from the Env - 1 (R) node to the Env - 2 (M) node, so the Env - 2 master replicates from the Env - 1 replica.

In some cases we need to reverse the direction of replication. To do this, we break the existing replication and then reconfigure it from the Env - 2 (R) node to the Env - 1 (M) node. During this step, the Env - 1 Redis servers cannot complete a partial resync (logs below).


Reason for the failure:

  1. Once we break the replication (replicaof no one) on the Env - 2 (M) node, a new master_replid is generated, and the old master_replid and master_repl_offset are shifted to master_replid2 and second_repl_offset.
  2. Meanwhile, the Env - 1 (R) node still receives PINGs from its master, the Env - 1 (M) node, at the configured interval (repl-ping-replica-period).
  3. These PINGs keep incrementing the replication offset on the Env - 1 master and replica, so it no longer matches the offset recorded on Env - 2, and the partial resync fails.
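
The three steps above can be modeled with a small sketch. It is a simplified Python rendering of the check a master applies to a PSYNC request, assuming the rule is "the previous replication ID is honored only up to the offset at which it was shifted". The class and function names are hypothetical, the backlog-range check that a real server also performs is omitted, and the IDs and offsets are copied from the logs in this post.

```python
from dataclasses import dataclass


@dataclass
class MasterState:
    """Hypothetical, simplified view of the replication fields involved."""
    replid: str                # current replication ID (master_replid)
    replid2: str               # previous replication ID (master_replid2)
    second_replid_offset: int  # highest offset still valid for replid2


def accepts_partial_resync(master: MasterState, req_replid: str, req_offset: int) -> bool:
    """Return True if PSYNC <req_replid> <req_offset> could continue partially.

    Assumption: the backlog-range check done by a real server is left out.
    """
    if req_replid == master.replid:
        return True
    # The old ID is only honored up to the point where the history diverged.
    return req_replid == master.replid2 and req_offset <= master.second_replid_offset


# Values from the logs: the Env - 2 node after "replicaof no one".
env2 = MasterState(
    replid="f3950935a9ed1da5c10f77b5ccbd7868767e59ea",
    replid2="1ad7e56ef10178e3a0be05ebb58f90b7668223a4",
    second_replid_offset=21415500,
)
old_id = "1ad7e56ef10178e3a0be05ebb58f90b7668223a4"

# Env - 1 kept receiving PINGs, so it asks for a later offset than Env - 2 holds.
print(accepts_partial_resync(env2, old_id, 21415514))  # False
# Had Env - 1 stopped at the same point, the request would pass this check.
print(accepts_partial_resync(env2, old_id, 21415500))  # True
```

Under this model, the only way to keep the partial resync possible is to stop the Env - 1 offset from advancing past the point at which Env - 2 was detached.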

A full resync is expensive. How can we avoid it?
Is there any other possible solution for this case?


Env - 2 (Replica) Node


Replica xxx.xx.xx.xx:6379 asks for synchronization
Partial resynchronization not accepted: Requested offset for second ID was 21415514, but I can reply up to 21415500
Starting BGSAVE for SYNC with target: disk
Background saving started by pid 7790
Background saving terminated with success
Synchronization with replica xxx.xx.xx.xx:6379 succeeded
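
The gap between the two offsets in the rejection message is 14 bytes, which is consistent with a single replicated PING: in the Redis protocol (RESP), a command is sent as an array of bulk strings, and PING encoded that way is 14 bytes long. A small check of that arithmetic (the helper name is hypothetical):

```python
def resp_encode(*args: str) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    out = b"*%d\r\n" % len(args)
    for arg in args:
        data = arg.encode()
        out += b"$%d\r\n%s\r\n" % (len(data), data)
    return out


ping = resp_encode("PING")    # b"*1\r\n$4\r\nPING\r\n"
requested_offset = 21415514   # offset requested by Env - 1 (from the log)
acceptable_offset = 21415500  # "I can reply up to" value (from the log)

print(len(ping))                             # 14
print(requested_offset - acceptable_offset)  # 14
```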


Env - 1 (Master) Node


Trying a partial resynchronization (request 1ad7e56ef10178e3a0be05ebb58f90b7668223a4:21415514).
Full resync from master: f3950935a9ed1da5c10f77b5ccbd7868767e59ea:21415513
Discarding previously cached master state.
MASTER <-> REPLICA sync: receiving 2057 bytes from master
MASTER <-> REPLICA sync: Flushing old data
MASTER <-> REPLICA sync: Loading DB in memory
MASTER <-> REPLICA sync: Finished with success





Ganesh KuMar Ganesan

Mar 18, 2020, 2:00:44 AM
to redi...@googlegroups.com

To achieve the reverse replication using PSYNC, I thought that raising repl-ping-replica-period from its default (10) to a higher value (>100), and finishing the reversal before the next PING can happen, would help.


While checking the source, I found that a PING is sent whenever the loop counter modulo repl_ping_slave_period equals zero. The loop counter is not reset after a PING, so even after increasing the ping period, the next PING can happen as soon as the condition is satisfied again. For example, assume the ping period is changed from 10 to 100 when the loop counter is at 90: the next PING will happen after 10 seconds, and the PSYNC will fail.
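
The timing described above can be reproduced with a short simulation. This is only a sketch: it assumes the replication cron runs once per second and increments the counter by one on each run, as described in this post, and the function names are hypothetical rather than taken from the Redis source.

```python
def seconds_until_next_ping(loops: int, ping_period: int) -> int:
    """Seconds until loops % ping_period == 0, with loops growing by 1 per second."""
    return (-loops) % ping_period


def simulate(loops: int, ping_period: int, seconds: int) -> list:
    """Return the elapsed seconds (within the window) at which a PING is sent."""
    fired = []
    for elapsed in range(seconds):
        # Same condition as described in the post: the counter is never reset.
        if (loops + elapsed) % ping_period == 0:
            fired.append(elapsed)
    return fired


# Counter is at 90 when the period is raised from 10 to 100.
print(seconds_until_next_ping(90, 100))  # 10
print(simulate(90, 100, 60))             # [10]
```

Because the wait depends on the counter value at the moment the setting changes, raising the period guarantees at most, not exactly, that many seconds without a PING.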



Ganesh KuMar Ganesan

Apr 2, 2020, 6:10:30 AM
to redi...@googlegroups.com