Issue with Redis replication while transferring rdb file from master to slave.

305 views
Skip to first unread message

manohar ht

unread,
Apr 2, 2014, 6:45:11 AM4/2/14
to redi...@googlegroups.com
Hi,

we are facing issue with redis replication wherein the transfer of rdb file from master to slave after bgsave has been completed slows down to very low speed to ~20KB/sec, even though first 900MB of data transfer is done at 20 MB/sec, while the transfer of rdb slows down, redis takes lot of time to execute simple auth command and we start seeing clients timeouts. We have a dataset of 47GB and both master and slave are running on redis 2.6.5. Master is a production server with approx 1.5k client and with ~20k ops/sec under normal load. Client output buffer limit is disabled.

There is no issue with the network as i see no packet drops and normal file transfer over network can be done at 64MB/sec.

Has anyone come across something similar, or any suggestions ? This is a bit critical.

Thanks,
Manohar.

Josiah Carlson

unread,
Apr 2, 2014, 12:11:21 PM4/2/14
to redi...@googlegroups.com
Have you checked the memory use on your master? Have you checked the outgoing buffer sizes on your master?

Based on what you are describing, it's possible that your master is running out of memory during the sync process and is swapping. Without more information, this is what I suspect is happening:
1. An rdb is created and starts streaming to your slave
2. All write commands to your master are held in an in-memory backlog
3. The backlog grows big enough to cause Redis to start swapping
4. Network IO drops because the master is busy swapping

Now, there could certainly be something else going on. Can you get the output of INFO while this is going on and post it here?

 - Josiah


--
You received this message because you are subscribed to the Google Groups "Redis DB" group.
To unsubscribe from this group and stop receiving emails from it, send an email to redis-db+u...@googlegroups.com.
To post to this group, send email to redi...@googlegroups.com.
Visit this group at http://groups.google.com/group/redis-db.
For more options, visit https://groups.google.com/d/optout.

manohar ht

unread,
Apr 2, 2014, 1:05:49 PM4/2/14
to redi...@googlegroups.com
Hi Josiah,

Thanks for replying, Below is the output of INFO command while this data transfer to slave was going on.

# Server

redis_version:2.6.5

redis_git_sha1:00000000

redis_git_dirty:0

redis_mode:standalone

os:Linux 2.6.32-5-amd64 x86_64

arch_bits:64

multiplexing_api:epoll

gcc_version:4.7.1

process_id:3298

run_id:d580c2f14057eb4fcac5ea39bdf6df64f8affb42

tcp_port:6379

uptime_in_seconds:27884216

uptime_in_days:322

lru_clock:1230993


# Clients

connected_clients:702

client_longest_output_list:4061

client_biggest_input_buf:0

blocked_clients:97


# Memory

used_memory:47263968248

used_memory_human:44.02G

used_memory_rss:51173629952

used_memory_peak:50351466720

used_memory_peak_human:46.89G

used_memory_lua:31744

mem_fragmentation_ratio:1.08

mem_allocator:jemalloc-3.0.0


# Persistence

loading:0

rdb_changes_since_last_save:299638

rdb_bgsave_in_progress:0

rdb_last_save_time:1396429699

rdb_last_bgsave_status:ok

rdb_last_bgsave_time_sec:682

rdb_current_bgsave_time_sec:-1

aof_enabled:0

aof_rewrite_in_progress:0

aof_rewrite_scheduled:0

aof_last_rewrite_time_sec:-1

aof_current_rewrite_time_sec:-1

aof_last_bgrewrite_status:ok


# Stats

total_connections_received:58068338

total_commands_processed:373271899948

instantaneous_ops_per_sec:9803

rejected_connections:0

expired_keys:2335356

evicted_keys:0

keyspace_hits:266945144550

keyspace_misses:129989805166

pubsub_channels:2

pubsub_patterns:0

latest_fork_usec:2552065


# Replication

role:master

connected_slaves:1

slave0:[x.x.x.x],6379,send_bulk


# CPU

used_cpu_sys:18310397952.00

used_cpu_user:138492768.00

used_cpu_sys_children:223836.31

used_cpu_user_children:1814577.50


# Keyspace

db0:keys=60125716,expires=32

db1:keys=310,expires=0

(0.67s)


Process state:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                                                                                       
 3298   redis     20   0    50.5g  47g   1028 D    0      75.7     5124692h redis-server     

Master and Slave are 64bit, 64GB machines on gigabit network. The machine was not swapping while doing data transfer. Client output buffer soft and hard limit on master was set to 0. RDB snapshot is 14G on disk.

Thanks,
Manohar.

Josiah Carlson

unread,
Apr 2, 2014, 1:42:30 PM4/2/14
to redi...@googlegroups.com
I'm not able to determine the cause of the slowdown based on the INFO output. But one thing I noticed is that it takes about 11 minutes to create the snapshot, and at 20 megs/second, another 10+ minutes to transfer the snapshot. But once the snapshot is transferred, remember that the slave needs to *load* that data, which could take a long time.

Have you examined the slave to determine it's state during all of this? Thinking about it, it's possible that the sending of the snapshot has completed, and the 20k/second transfer is what it takes to sync write commands from the master of the slave. Can you check INFO output on the slave there? Does db0/db1 have a number of keys that is growing over time? What does the Redis log say for both your master and slave?

 - Josiah

manohar ht

unread,
Apr 2, 2014, 2:06:38 PM4/2/14
to redi...@googlegroups.com
Hi Josiah,

The issue is the transfer itself doesn't complete, after BGSAVE completes on master, dump transfer starts from master to slave, but slows down immediately and this leads to client disconnections and slow command execution on master and we had to break replication to recover back. We tried twice with same result once with client-output-buffer-limit enabled and later by setting no limit on it.

- Manohar.

manohar ht

unread,
Apr 2, 2014, 2:10:34 PM4/2/14
to redi...@googlegroups.com
Also tried looking into redis logs on both master and slave. I found no errors logged. The rdb transfers continues but with slow transfer rates by which time many issues started showing up on master.

Josiah Carlson

unread,
Apr 2, 2014, 2:13:20 PM4/2/14
to redi...@googlegroups.com
... what other issues are showing up on your Master?

 - Josiah

manohar ht

unread,
Apr 2, 2014, 2:42:59 PM4/2/14
to redi...@googlegroups.com
2 main issues

- Client disconnections.

Apr  2 05:39:47  redis[3298]: Reading from client: Connection reset by peer
Apr  2 05:39:47  redis[3298]: Client closed connection
Apr  2 05:39:47  redis[3298]: Reading from client: Connection reset by peer
Apr  2 05:39:47  redis[3298]: Client closed connection
Apr  2 05:39:47  redis[3298]: Reading from client: Connection reset by peer
Apr  2 05:39:47  redis[3298]: Client closed connection
Apr  2 05:39:47  redis[3298]: Client closed connection
Apr  2 05:39:47  redis[3298]: Reading from client: Connection reset by peer
Apr  2 05:39:47  redis[3298]: Client closed connection
Apr  2 05:39:47  redis[3298]: Reading from client: Connection reset by peer

- Any command execution on redis master like auth or info takes long time to complete upto 40 secs.

- Manohar.

Yiftach Shoolman

unread,
Apr 2, 2014, 3:59:53 PM4/2/14
to redi...@googlegroups.com
Can you please send the redis logs from both master and slave ?

Yiftach Shoolman
+972-54-7634621

Josiah Carlson

unread,
Apr 2, 2014, 5:51:38 PM4/2/14
to redi...@googlegroups.com
Okay, this really sounds like one or two things:
1. Your machine is swapping - can you give us the first 5 lines of the output of 'top', the lines that begin with "top", "Tasks:", "Cpu(s):", "Mem:", and "Swap:"?
2. Your machine is trying to sync data out to another machine for data storage (though this *should* have been happening during writes if you are on EBS or NFS)

There are a bunch of other things that might be the issue (hardware on the master, switch/router, someone sniffing the network, ...), but those seem less likely than the above.

 - Josiah

Alexander O. Yuriev

unread,
Apr 3, 2014, 8:34:51 AM4/3/14
to redi...@googlegroups.com
Hello,

It is neither swapping nor is it using a network storage - concurrently with that issue it is possible to fetch the
dump using SFTP at a reasonable rate ( ~ 600Mbit/sec ).

manohar ht

unread,
Apr 3, 2014, 8:37:33 AM4/3/14
to redi...@googlegroups.com
Hi Josiah,

Tasks: 231 total,   1 running, 230 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.3%us,  0.1%sy,  0.0%ni, 91.7%id,  7.8%wa,  0.0%hi,  0.1%si,  0.0%st
Mem:  66060688k total, 61098168k used,  4962520k free,   192800k buffers
Swap:  4192956k total,   463508k used,  3729448k free,  9741900k cached

There is no other process which is syncing data to other machine other than redis dump transfer. Also we don't use EBS or NFS, the data is transferred and received to and from local disk of master and slave.

Josiah Carlson

unread,
Apr 3, 2014, 12:41:07 PM4/3/14
to redi...@googlegroups.com
That output tells me that you have 463 megabytes swapped to disk, while 9.7 gigabytes are cached, 5 gigabytes are free, and 7.8% of your total CPU time is spent waiting on IO. That would suggest close to 100% of 1 core in a 12 core system is spending its time dealing with network and/or disk IO - probably disk, and it's probably Redis trying to get its data back into main memory. You can hit '1' on your keyboard to swap between the 'aggregate' CPU view and "individual"  CPU view to prove or disprove my hypothesis. You can also see what is using the most memory by typing a capital "M" (it is arguably the most useful view you can have on a machine with Redis running). You've got almost 5 gigs of free memory, which was likely used during the snapshotting process by the forked child, and part of your cache is likely caching the resulting snapshot.

Since all you *really* need to do is to get the data from swap back into Redis, and you (at least for now) have plenty of free memory, you might just be able to sequentially read your swap to get it into your cache, which would then let Redis read it much faster. A quick "sudo dd if=<path to your swap partition> of=/dev/null ibs=1M count=4k" should get it read.

My guess is that the snapshot is also taking up a bunch of your cache, which you can reduce significantly (and help with future issues) with the use of 'vmtouch': https://github.com/hoytech/vmtouch . You can see how much of the file is in cache with 'vmtouch <dump file>', and you can force it to be un-cached (releasing memory) with 'vmtouch -e <dump file>'. This may seem like a bad idea, but since the access pattern on the snapshot is completely sequential, the kernel will start doing read-ahead immediately after the file is dropped from the cache, and shouldn't suffer too horribly from hiccups.


Long-term, you've got a memory problem. Heck, you've already got a memory problem. If you can migrate some of your data from that one instance, awesome. If you can outright eliminate some of that data, even better. If you've got other things running on that machine, try to move them off. If you can kill a service that isn't useful, awesome. If you can get a bigger machine, that's great too - I know some folks that rent 512 gig memory machines for $2k US/month (as of 9 months ago) in the US.


Let me know if the 'dd' command helped, and whether the snapshot was using a bunch of cache. Note that you may want/need to use vmtouch periodically on the snapshot to ensure that it's not taking up main memory when it doesn't have to, which you might be able to use to prevent swapping in the first place. As an example, you can set up vmtouch to run once/minute while the snapshotting process is taking place, or even just toss in a crontab, which should keep its cache usage low during and after the snapshotting/slaving process.

Incidentally, having Redis make the system calls to advise the kernel to not cache the snapshot would be a good patch, and might be a good long-term solution for people in memory-constrained environments.

 - Josiah

Josiah Carlson

unread,
Apr 7, 2014, 4:30:16 PM4/7/14
to redi...@googlegroups.com

I'm going to guess that because you are also from Livestream (like Manohar) you are hopping in as another voice in the conversation.

My response is simple: just because you can transfer data through some other program at high rates of speed doesn't mean that Redis isn't swapped. Something is eating your IO. Check out: http://guichaz.free.fr/iotop/ to watch for what is and is not, paying very close attention to the "SWAPIN" column to either confirm or correct my claim.

 - Josiah


Reply all
Reply to author
Forward
0 new messages