Thanks for the fast reply, Josiah.
When I talked about disabling the output buffer limits (hard and soft)
for the slave only during the initial sync, I meant setting these
limits to 0 at runtime via redis-cli, attaching the slave, letting it
sync, and then changing the limits back to more conservative values.
I believe it can be done, but I am not sure about the impact on the
master; that was my concern/question.
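For concreteness, the sequence described above could be sketched roughly as follows. This is only a sketch under assumptions: it builds `redis-cli` invocations of `CONFIG SET client-output-buffer-limit` for the `slave` class, the host and port are placeholders, and the "restore" values (256mb hard, 64mb soft for 60 seconds) are the stock redis.conf values, not necessarily the ones configured on this master. The helper names `buffer_limit_cmd` and `run` are made up for illustration. By default nothing is executed; the commands are only printed.

```python
import shlex
import subprocess
from typing import List

PARAM = "client-output-buffer-limit"


def buffer_limit_cmd(hard: str, soft: str, soft_seconds: int,
                     host: str = "127.0.0.1", port: int = 6379) -> List[str]:
    """Build a redis-cli call that sets the slave-class output buffer limits.

    host/port are placeholders; point them at the master.
    """
    value = f"slave {hard} {soft} {soft_seconds}"
    return ["redis-cli", "-h", host, "-p", str(port),
            "CONFIG", "SET", PARAM, value]


def run(cmd: List[str], dry_run: bool = True) -> str:
    """Return the shell-quoted command (dry run) or execute it and return stdout."""
    if dry_run:
        return shlex.join(cmd)
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout


# Step 1: lift both limits (0 0 0 means "no limit") before attaching the slave.
disable_cmd = buffer_limit_cmd("0", "0", 0)
# Step 2 (after the sync completes): restore conservative values.
# These numbers are an assumption (stock redis.conf), not the poster's config.
restore_cmd = buffer_limit_cmd("256mb", "64mb", 60)

if __name__ == "__main__":
    print(run(disable_cmd))
    print(run(restore_cmd))
```

With the limits at 0, the master will buffer whatever the slave has not yet consumed, so memory use on the master during the sync is the thing to watch.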
The two instances are in two different AWS availability zones, but I
don't think I have bandwidth issues.
To give more detail on my setup and the problem, this is what the
master logs as soon as I start the slave:
[3355] 29 May 21:20:34.240 * Slave asks for synchronization
[3355] 29 May 21:20:34.240 * Full resync requested by slave.
[3355] 29 May 21:20:34.240 * Waiting for next BGSAVE for SYNC
[4242] 29 May 21:21:09.248 * DB saved on disk
[4242] 29 May 21:21:10.737 * RDB: 1006 MB of memory used by copy-on-write
[3355] 29 May 21:21:12.879 * Background saving terminated with success
[3355] 29 May 21:21:18.850 * Background saving started by pid 4826
[3355] 29 May 21:22:56.027 # Client addr=10.179.58.228:45862 fd=208 name= age=142 idle=142 flags=S db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=16377 oll=4182 omem=234916512 events=r cmd=psync scheduled to be closed ASAP for overcoming of output buffer limits.
[3355] 29 May 21:23:51.688 * Slave asks for synchronization
[3355] 29 May 21:23:51.688 * Full resync requested by slave.
[3355] 29 May 21:23:51.688 * Waiting for next BGSAVE for SYNC
[4826] 29 May 21:24:18.947 * DB saved on disk
[4826] 29 May 21:24:20.333 * RDB: 1212 MB of memory used by copy-on-write
[3355] 29 May 21:24:22.610 * Background saving terminated with success
[3355] 29 May 21:24:28.603 * Background saving started by pid 5378
[3355] 29 May 21:25:56.082 # Client addr=10.179.58.228:46076 fd=313 name= age=125 idle=125 flags=S db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=16383 oll=2664 omem=149808560 events=r cmd=psync scheduled to be closed ASAP for overcoming of output buffer limits.
[3355] 29 May 21:26:47.944 * Slave asks for synchronization
[3355] 29 May 21:26:47.944 * Full resync requested by slave.
[3355] 29 May 21:26:47.944 * Waiting for next BGSAVE for SYNC
[5378] 29 May 21:27:28.177 * DB saved on disk
[5378] 29 May 21:27:29.686 * RDB: 986 MB of memory used by copy-on-write
[3355] 29 May 21:27:31.831 * Background saving terminated with success
[3355] 29 May 21:27:37.814 * Background saving started by pid 5934
........
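To read the two "scheduled to be closed" entries above more easily, here is a small sketch that splits the client info fields and converts `omem` to megabytes. The comparison thresholds are an assumption: they are the stock redis.conf values for the slave class (256mb hard, 64mb soft for 60 seconds), which may not match what this master is configured with. The helper names `parse_client_info` and `which_limit` are illustrative only, and `which_limit` compares sizes only; it does not track how long the soft limit has been exceeded.

```python
MB = 1024 * 1024


def parse_client_info(line: str) -> dict:
    """Split a Redis client info string into a {field: value} dict."""
    fields = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        if sep:  # keep only key=value tokens; empty values (name=) are allowed
            fields[key] = value
    return fields


def which_limit(omem_bytes: int, hard: int = 256 * MB, soft: int = 64 * MB) -> str:
    """Say which size threshold omem exceeds (assumed default thresholds)."""
    if hard and omem_bytes > hard:
        return "hard"
    if soft and omem_bytes > soft:
        return "soft"
    return "none"


# Client info copied from the first master log entry above.
line = ("addr=10.179.58.228:45862 fd=208 name= age=142 idle=142 flags=S db=0 "
        "sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=16377 oll=4182 "
        "omem=234916512 events=r cmd=psync")

info = parse_client_info(line)
omem = int(info["omem"])

if __name__ == "__main__":
    print(f"omem = {omem / MB:.1f} MB, age = {info['age']} s, "
          f"exceeds: {which_limit(omem)}")
```

If the stock limits are in effect, both disconnects would be consistent with the soft limit being exceeded for longer than its time window rather than with the hard limit being hit, but that depends on the actual configuration.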
During this time, on the slave:
[4640] 29 May 20:32:42.355 * Connecting to MASTER redis3.freelancer.com:6379
[4640] 29 May 20:32:42.358 * MASTER <-> SLAVE sync started
[4640] 29 May 20:32:42.360 * Non blocking connect for SYNC fired the event.
[4640] 29 May 20:32:42.362 * Master replied to PING, replication can continue...
[4640] 29 May 20:32:42.363 * Partial resynchronization not possible (no cached master)
[4640] 29 May 20:32:42.365 * Full resync from master: 621480e9295872416266e563939b4fd6724eb5b7:8823523103
[8470] 29 May 20:34:35.711 * DB saved on disk
[8470] 29 May 20:34:36.062 * RDB: 1 MB of memory used by copy-on-write
[4640] 29 May 20:34:36.636 * Background saving terminated with success
[4640] 29 May 20:36:28.710 # Timeout receiving bulk data from MASTER... If the problem persists try to set the 'repl-timeout' parameter in redis.conf to a larger value.
[4640] 29 May 20:36:28.710 * Connecting to MASTER redis3.freelancer.com:6379
[4640] 29 May 20:36:28.788 * MASTER <-> SLAVE sync started
[4640] 29 May 20:36:28.790 * Non blocking connect for SYNC fired the event.
[4640] 29 May 20:36:28.791 * Master replied to PING, replication can continue...
[4640] 29 May 20:36:28.793 * Partial resynchronization not possible (no cached master)
[4640] 29 May 20:36:28.795 * Full resync from master: 621480e9295872416266e563939b4fd6724eb5b7:10105649942
[4640] 29 May 20:39:25.067 # Timeout receiving bulk data from MASTER... If the problem persists try to set the 'repl-timeout' parameter in redis.conf to a larger value.
[4640] 29 May 20:39:25.068 * Connecting to MASTER redis3.freelancer.com:6379
[4640] 29 May 20:39:25.073 * MASTER <-> SLAVE sync started
[4640] 29 May 20:39:25.075 * Non blocking connect for SYNC fired the event.
[4640] 29 May 20:39:25.076 * Master replied to PING, replication can continue...
[4640] 29 May 20:39:25.080 * Partial resynchronization not possible (no cached master)
[4640] 29 May 20:39:25.082 * Full resync from master: 621480e9295872416266e563939b4fd6724eb5b7:10355757565
[4640] 29 May 20:41:25.421 * Background saving started by pid 8474
[4640] 29 May 20:42:31.522 # Timeout receiving bulk data from MASTER... If the problem persists try to set the 'repl-timeout' parameter in redis.conf to a larger value.
So it looks like my ~6GB dump on the master is not being sent across
to the slave fast enough. It could be a bandwidth problem, but
considering the dump is ~6GB, maybe I should just increase
"repl-timeout"? Can that be done? If so, can you please tell me what
the implications are for the master?
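As a rough back-of-the-envelope check on the bandwidth question, the sketch below measures how long each attempt in the slave log lasted (from "Full resync from master" to the following "Timeout receiving bulk data") and what average rate would be needed to move the whole dump within that window. It rests on assumptions: the dump is taken as exactly 6 GB (the post only says "~6GB"), the window includes whatever time the master spent on its BGSAVE before sending anything, and it says nothing about how repl-timeout itself is measured. `elapsed_seconds` is an illustrative helper name.

```python
from datetime import datetime

FMT = "%H:%M:%S.%f"
DUMP_MB = 6 * 1024  # "~6GB" from the post, treated as exactly 6 GB here


def elapsed_seconds(start: str, end: str) -> float:
    """Seconds between two same-day log timestamps like '20:32:42.365'."""
    return (datetime.strptime(end, FMT) - datetime.strptime(start, FMT)).total_seconds()


# (full resync started, timeout logged) pairs taken from the slave log above.
ATTEMPTS = [
    ("20:32:42.365", "20:36:28.710"),
    ("20:36:28.795", "20:39:25.067"),
    ("20:39:25.082", "20:42:31.522"),
]

if __name__ == "__main__":
    for start, end in ATTEMPTS:
        secs = elapsed_seconds(start, end)
        print(f"{start} -> {end}: {secs:.1f} s, "
              f"{DUMP_MB / secs:.1f} MB/s needed to move the full dump in that window")
```

Each attempt lasts roughly three to four minutes before the slave gives up, so moving the full dump inside such a window would take a sustained rate in the tens of MB/s between the two availability zones.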
Thanks,
Adi