We're having some big problems getting replication working in 2.8.2. The setup is a little complex: it involves multi-level replication and an upgrade from a 2.6.13 install. The bridge between our primary and secondary data-centre is showing repeated "MASTER timeout: no data nor PING received..." errors (see the log below), and so it keeps re-syncing. It **seems** to have stabilised now, but it has knocked our confidence in it a bit, so I'm unsure whether to revert!
Context: a 20GB database (based on "used"), with a high-latency link to a secondary data-centre. A full sync+load over that link takes about 10 minutes, and during that time we churn about 300MB of transaction data. During our deploy we tried changing repl_backlog_size to 1GB, but it was still unstable.
At the time of writing, we're keeping our master node on 2.6.13 so we still have a revert story. Unless we can get confidence in the replication, I'm going to have to start re-deploying 2.6.13.
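For reference, here is a rough back-of-the-envelope sketch of how the figures above (about 300MB of churn over a 10-minute sync+load) translate into a backlog size. The function name and the 2x safety factor are my own assumptions for illustration, not anything from Redis itself. It only estimates how much write traffic a backlog would need to hold, and says nothing about why the timeout happens in the first place.

```python
MB = 1024 * 1024

def required_backlog_bytes(churn_bytes, window_seconds, outage_seconds,
                           safety_factor=2.0):
    """Estimate the backlog needed to cover writes made during an outage.

    churn_bytes / window_seconds gives an average write rate; the
    safety_factor (an assumed margin, not a Redis setting) pads for bursts.
    """
    rate = churn_bytes / window_seconds          # bytes per second
    return int(rate * outage_seconds * safety_factor)

# Figures quoted in this report: ~300MB of churn per 10-minute sync+load.
churn = 300 * MB
window = 10 * 60

# Assume a disconnect lasting as long as one full sync+load.
needed = required_backlog_bytes(churn, window, outage_seconds=window)
print(needed // MB, "MB")
```

On these numbers the estimate comes out well under the 1GB we tried, which is part of why the continued full resyncs are surprising to me.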
[32022] 06 Dec 19:24:56.360 * MASTER <-> SLAVE sync started
[32022] 06 Dec 19:24:56.439 * Non blocking connect for SYNC fired the event.
[32022] 06 Dec 19:24:56.550 # Error reply to PING from master: '-LOADING Redis is loading the dataset in memory'
[32022] 06 Dec 19:24:57.363 * MASTER <-> SLAVE sync started
[32022] 06 Dec 19:24:57.442 * Non blocking connect for SYNC fired the event.
[32022] 06 Dec 19:24:57.522 * Master replied to PING, replication can continue...
[32022] 06 Dec 19:24:57.602 * Partial resynchronization not possible (no cached master)
[32022] 06 Dec 19:24:57.682 * Full resync from master: 00682274becb9882c4c9741cfdb583962f4b9fae:181600
[32022] 06 Dec 19:26:32.246 * MASTER <-> SLAVE sync: receiving 6434201052 bytes from master
[32022] 06 Dec 19:32:20.430 * MASTER <-> SLAVE sync: Loading DB in memory
[32022] 06 Dec 19:36:19.497 * MASTER <-> SLAVE sync: Finished with success
[32022] 06 Dec 19:37:20.803 # MASTER timeout: no data nor PING received...
[32022] 06 Dec 19:37:20.803 * Caching the disconnected master state.
[32022] 06 Dec 19:37:20.804 * MASTER <-> SLAVE sync started
[32022] 06 Dec 19:37:20.884 * Non blocking connect for SYNC fired the event.
[32022] 06 Dec 19:37:20.963 * Master replied to PING, replication can continue...
[32022] 06 Dec 19:37:21.042 * Trying a partial resynchronization (request 00682274becb9882c4c9741cfdb583962f4b9fae:3670551).
[32022] 06 Dec 19:37:21.121 * Full resync from master: 00682274becb9882c4c9741cfdb583962f4b9fae:652792803
[32022] 06 Dec 19:37:21.121 * Discarding previously cached master state.
[32022] 06 Dec 19:38:56.056 * MASTER <-> SLAVE sync: receiving 6458924765 bytes from master
[32022] 06 Dec 19:44:31.897 * MASTER <-> SLAVE sync: Loading DB in memory
[32022] 06 Dec 19:48:45.907 * MASTER <-> SLAVE sync: Finished with success
[32022] 06 Dec 19:49:47.043 # MASTER timeout: no data nor PING received...
[32022] 06 Dec 19:49:47.043 * Caching the disconnected master state.
[32022] 06 Dec 19:49:47.044 * MASTER <-> SLAVE sync started
[32022] 06 Dec 19:49:47.123 * Non blocking connect for SYNC fired the event.
[32022] 06 Dec 19:49:47.202 * Master replied to PING, replication can continue...
[32022] 06 Dec 19:49:47.281 * Trying a partial resynchronization (request 00682274becb9882c4c9741cfdb583962f4b9fae:656844415).
[32022] 06 Dec 19:49:47.360 * Full resync from master: 00682274becb9882c4c9741cfdb583962f4b9fae:1063025761
[32022] 06 Dec 19:49:47.360 * Discarding previously cached master state.
[32022] 06 Dec 19:51:22.405 * MASTER <-> SLAVE sync: receiving 6463872589 bytes from master
[32022] 06 Dec 19:56:31.966 * MASTER <-> SLAVE sync: Loading DB in memory
[32022] 06 Dec 20:00:45.534 * MASTER <-> SLAVE sync: Finished with success
[32022] 06 Dec 20:01:46.048 # MASTER timeout: no data nor PING received...
[32022] 06 Dec 20:01:46.048 * Caching the disconnected master state.
[32022] 06 Dec 20:01:46.049 * MASTER <-> SLAVE sync started
[32022] 06 Dec 20:01:46.128 * Non blocking connect for SYNC fired the event.
[32022] 06 Dec 20:01:46.207 * Master replied to PING, replication can continue...
[32022] 06 Dec 20:01:46.286 * Trying a partial resynchronization (request 00682274becb9882c4c9741cfdb583962f4b9fae:1066551814).
[32022] 06 Dec 20:01:46.365 * Full resync from master: 00682274becb9882c4c9741cfdb583962f4b9fae:1480454340
[32022] 06 Dec 20:01:46.365 * Discarding previously cached master state.
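
To make the log above easier to reason about, here is a small sketch that pairs each "Trying a partial resynchronization" line with the "Full resync from master" line that follows it, and reports how far the master's offset was ahead of the offset the slave asked for. The helper name `resync_gaps` is hypothetical, and the sample lines are copied from the log above. Whether this gap is what causes the partial resync to be refused is an assumption on my part; the log does not confirm it.

```python
import re

REQUEST_RE = re.compile(
    r"Trying a partial resynchronization \(request ([0-9a-f]+):(\d+)\)")
FULL_RE = re.compile(r"Full resync from master: ([0-9a-f]+):(\d+)")

def resync_gaps(lines):
    """Return (same_runid, gap_in_bytes) for each refused partial resync."""
    gaps = []
    pending = None  # (runid, offset) from the most recent PSYNC request
    for line in lines:
        m = REQUEST_RE.search(line)
        if m:
            pending = (m.group(1), int(m.group(2)))
            continue
        m = FULL_RE.search(line)
        if m and pending is not None:
            runid, master_offset = m.group(1), int(m.group(2))
            gaps.append((runid == pending[0], master_offset - pending[1]))
            pending = None
    return gaps

SAMPLE = [
    "[32022] 06 Dec 19:37:21.042 * Trying a partial resynchronization (request 00682274becb9882c4c9741cfdb583962f4b9fae:3670551).",
    "[32022] 06 Dec 19:37:21.121 * Full resync from master: 00682274becb9882c4c9741cfdb583962f4b9fae:652792803",
    "[32022] 06 Dec 19:49:47.281 * Trying a partial resynchronization (request 00682274becb9882c4c9741cfdb583962f4b9fae:656844415).",
    "[32022] 06 Dec 19:49:47.360 * Full resync from master: 00682274becb9882c4c9741cfdb583962f4b9fae:1063025761",
    "[32022] 06 Dec 20:01:46.286 * Trying a partial resynchronization (request 00682274becb9882c4c9741cfdb583962f4b9fae:1066551814).",
    "[32022] 06 Dec 20:01:46.365 * Full resync from master: 00682274becb9882c4c9741cfdb583962f4b9fae:1480454340",
]

gaps = resync_gaps(SAMPLE)
for same_runid, gap in gaps:
    print(f"same run id: {same_runid}, gap: {gap} bytes ({gap / 2**20:.1f} MiB)")
```

In all three attempts the run id matches, and the slave is several hundred MiB behind the master's current offset by the time it asks for a partial resync.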