High disk usage every few minutes between the master and slave

Dennis McEntire

Mar 20, 2013, 8:58:19 PM

to redi...@googlegroups.com

Hi,

We have a Redis master that holds about 2.6GB of data (mostly UUIDs), plus a Redis slave replicating that data. The master is running on an SSD drive.

Every few minutes both servers show high disk activity and solid network transfers. The CPU usage is high as well, and the box is very slow during this time.

Forgive my newness on this, but is there something I should look at or configure for this?

Note - I recently did a "slaveof no one" on the slave to stop replication, and the problem seems to have gone away.

One note on the live Redis box: I think we need to add some RAM. From top: Swap: 3831460k total, 353192k used

Thanks in advance for any tips,

Dennis

========================

In case it's helpful, some info from the redis-cli:

# Server
redis_version:2.6.9
redis_git_sha1:00000000
redis_git_dirty:0
redis_mode:standalone
os:Linux 2.6.29.1-desktop-4mnb i686
arch_bits:32
multiplexing_api:epoll
gcc_version:4.3.2
process_id:3290
run_id:2dc15eb96770f577c4d58462ba7d19f8b46f68b4
tcp_port:6379
uptime_in_seconds:3106698
uptime_in_days:35
lru_clock:67856

# Memory
used_memory:2800254040
used_memory_human:2.61G
used_memory_rss:2490617856
used_memory_peak:2800715496
used_memory_peak_human:2.61G
used_memory_lua:20480
mem_fragmentation_ratio:0.89
mem_allocator:jemalloc-3.2.0

# Persistence
loading:0
rdb_changes_since_last_save:0
rdb_bgsave_in_progress:0
rdb_last_save_time:1363824095
rdb_last_bgsave_status:ok
rdb_last_bgsave_time_sec:32
rdb_current_bgsave_time_sec:-1
aof_enabled:0
aof_rewrite_in_progress:0
aof_rewrite_scheduled:0
aof_last_rewrite_time_sec:-1
aof_current_rewrite_time_sec:-1
aof_last_bgrewrite_status:ok

# Stats
total_connections_received:931694
total_commands_processed:2238838
instantaneous_ops_per_sec:0
rejected_connections:0
expired_keys:21007
evicted_keys:0
keyspace_hits:1599146
keyspace_misses:242772
pubsub_channels:0
pubsub_patterns:0
latest_fork_usec:44213

# CPU
used_cpu_sys:186.30
used_cpu_user:83.79
used_cpu_sys_children:6216.39
used_cpu_user_children:61561.99

# Keyspace
db0:keys=429843,expires=10

Greg Andrews

Mar 20, 2013, 10:10:29 PM

to redi...@googlegroups.com

Under the Persistence header, your info shows:

rdb_last_save_time:1363824095
rdb_last_bgsave_status:ok
rdb_last_bgsave_time_sec:32

and under CPU:

used_cpu_sys:186.30
used_cpu_user:83.79
used_cpu_sys_children:6216.39
used_cpu_user_children:61561.99

The last save time for an RDB snapshot was not long before you sent your message to the list. And it took 32 seconds to complete, which is a long time.

RDB snapshots spawn a background/child process to perform the writes to disk, and the CPU figures show those child processes are consuming a huge amount of cpu time relative to the main process. They're laboring hard.
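The comparison above can be made concrete with a small sketch. It only re-adds the four CPU fields quoted from the INFO output; the helper name `parse_info` is made up for illustration and is not part of any Redis client.

```python
# Values copied from the "# CPU" section of the INFO output in this thread.
info_text = """\
used_cpu_sys:186.30
used_cpu_user:83.79
used_cpu_sys_children:6216.39
used_cpu_user_children:61561.99
"""


def parse_info(text):
    """Parse 'key:value' lines (hypothetical helper) into a dict of floats."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and section headers such as "# CPU"
        key, _, value = line.partition(":")
        fields[key] = float(value)
    return fields


cpu = parse_info(info_text)
# CPU seconds used by the main Redis process.
parent_cpu = cpu["used_cpu_sys"] + cpu["used_cpu_user"]
# CPU seconds used by forked children (background saves).
child_cpu = cpu["used_cpu_sys_children"] + cpu["used_cpu_user_children"]
child_to_parent = child_cpu / parent_cpu

print(f"parent: {parent_cpu:.2f}s  children: {child_cpu:.2f}s  "
      f"ratio: {child_to_parent:.0f}x")
```

With these numbers the children have used roughly 250 times the CPU time of the main process over 35 days of uptime.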

It looks like that Redis instance is configured to save snapshots to disk, and writing to disk is slow and heavy. Are these Redis instances on servers with very busy hard drives, or on virtual/cloud machines that have slow disk I/O?

-Greg


Josiah Carlson

Mar 20, 2013, 10:10:28 PM

to redi...@googlegroups.com

It definitely looks like your master is short on memory, and it's also likely that your slave is too (assuming that it's got the same amount of memory). If Redis needs to hit swap for some of your data set, that could cause disconnections, which could cause a slave disconnect/re-connect (which would induce huge network and disk IO).

I'd upgrade the memory on both machines and try it again. Also, you may want to pay attention to the size of the outgoing buffer for the slave, as a slow network could be causing swapping + disconnection too.
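As a rough illustration of the outgoing-buffer point: Redis 2.6 has a `client-output-buffer-limit` setting with a hard limit, a soft limit, and a soft-limit duration. The sketch below models that rule in plain Python. The function name and the sample data are made up, and the 256mb / 64mb / 60s figures are what I recall from the stock redis.conf for the slave class, so treat them as an assumption and check your own config.

```python
MB = 1024 * 1024


def disconnect_time(samples, hard_limit, soft_limit, soft_seconds):
    """Return the timestamp at which the slave link would be dropped, or None.

    samples: (timestamp_seconds, buffered_bytes) pairs in time order.
    Simplified model: drop at once when the hard limit is reached, or when
    the buffer has stayed at or above the soft limit for more than
    soft_seconds without dipping back below it.
    """
    soft_since = None  # when the buffer first crossed the soft limit
    for ts, buffered in samples:
        if buffered >= hard_limit:
            return ts
        if buffered >= soft_limit:
            if soft_since is None:
                soft_since = ts
            elif ts - soft_since > soft_seconds:
                return ts
        else:
            soft_since = None  # dipped below the soft limit, reset the clock
    return None


# Hypothetical samples: a slow link lets the buffer sit above 64 MB.
slow_link = [(0, 10 * MB), (10, 80 * MB), (40, 90 * MB), (75, 100 * MB)]
print(disconnect_time(slow_link, 256 * MB, 64 * MB, 60))
```

A dropped slave has to reconnect and resync, which is what triggers the fresh snapshot and the full transfer over the network.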

Regards,

- Josiah

Josiah Carlson

Mar 20, 2013, 11:04:56 PM

to redi...@googlegroups.com

To answer your last question, Redis is partially swapped out (which is why it has lower resident memory than active memory), so when it goes to fork + dump to disk, it has to read that data from swap, and write other data to swap.
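The resident-versus-used gap can be checked directly from the two memory fields in the INFO output earlier in the thread. This is only arithmetic on those quoted numbers; a ratio below 1 is a hint that part of the data set is not in RAM, not proof on its own.

```python
# Values copied from the "# Memory" section of the INFO output in this thread.
used_memory = 2800254040      # bytes Redis has allocated
used_memory_rss = 2490617856  # bytes actually resident in RAM

resident_ratio = used_memory_rss / used_memory
not_resident_bytes = used_memory - used_memory_rss

print(f"resident ratio: {resident_ratio:.2f}")
print(f"not resident:   {not_resident_bytes / (1024 * 1024):.0f} MiB")
```

The ratio rounds to the same 0.89 that INFO reports as mem_fragmentation_ratio, and the gap of roughly 295 MiB is in the same range as the 353192k of swap that top shows in use.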

- Josiah

Dennis McEntire

Mar 21, 2013, 1:48:29 AM

to redi...@googlegroups.com

Thanks for all the great info so far. Here are some quick answers to the questions that have come up:

1. First of all, I think RAM is the issue, so I'm definitely working on getting that upped.
2. The CPU is a dual core P4, standalone machine, with a 60GB SSD HD as the only drive in the system (OS and Redis).
3. The slave has been shut down, so I don't know if that's the main cause of the issue.

The comment about Redis being configured to save snapshots to disk is interesting. I haven't changed any of the settings in the conf file, including any directives related to that. Should I adjust the conf file so that it persists at a different rate or on a different schedule?
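For context, the snapshot schedule in redis.conf is controlled by `save <seconds> <changes>` lines: a background save starts once more than that many seconds have passed since the last save and at least that many keys have changed. Below is a minimal sketch of that rule in Python. The three rules listed are the defaults as I recall them from the stock 2.6 redis.conf, so treat them as an assumption and confirm with `CONFIG GET save`; the function name is made up for illustration.

```python
# Assumed stock defaults: "save 900 1", "save 300 10", "save 60 10000".
DEFAULT_SAVE_RULES = [(900, 1), (300, 10), (60, 10000)]


def snapshot_due(seconds_since_last_save, changes_since_last_save,
                 rules=DEFAULT_SAVE_RULES):
    """Return True if any save rule would trigger a background snapshot."""
    return any(
        seconds_since_last_save > seconds
        and changes_since_last_save >= changes
        for seconds, changes in rules
    )


# Ten changed keys are enough to trigger a save after five minutes...
print(snapshot_due(301, 10))
# ...but not after two minutes.
print(snapshot_due(120, 10))
```

Under these rules even a lightly written instance snapshots every few minutes, which would line up with the periodic bursts of disk activity. Fewer or longer `save` lines make snapshots less frequent, at the cost of losing more recent writes if the process dies.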

Regarding the 32 seconds: the dump file is ~1.2GB in size, so maybe that's why it takes so long?

Thanks again for everyone's input,

Dennis

Josiah Carlson

Mar 21, 2013, 3:09:36 AM

to redi...@googlegroups.com

Low memory is the cause of your swapping. Snapshotting, whether triggered automatically by the configuration or by a slave (re)connecting, touches all of the memory that Redis uses, which induces your nasty disk IO.

Assuming that your hardware isn't too awful, writing a 1.2 gigabyte snapshot shouldn't be too bad. It's all sequential IO, which means that you should be able to write anywhere from 66-300 megabytes/second (depending on your disk controller and a potentially older SSD). That said, 1.2 gigs in 32 seconds is about 37 megs/second, which is only that low because the OS needs to move data in/out of swap.
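The arithmetic behind those figures, as a quick sketch. It uses only the numbers from this thread (a ~1.2GB dump, a 32 second save, and the 66-300 MB/s range) and takes 1 GB as 1000 MB for simplicity.

```python
dump_size_mb = 1200    # ~1.2 GB dump file, taking 1 GB = 1000 MB
observed_seconds = 32  # rdb_last_bgsave_time_sec from INFO

observed_mb_per_s = dump_size_mb / observed_seconds
print(f"observed: {observed_mb_per_s:.1f} MB/s")

# How long the same dump would take at the sequential rates quoted above.
expected_seconds = {rate: dump_size_mb / rate for rate in (66, 300)}
for rate, seconds in expected_seconds.items():
    print(f"at {rate} MB/s: {seconds:.1f} s")
```

So a save that should take somewhere between about 4 and 18 seconds is taking 32, which fits the picture of swap traffic competing with the snapshot write.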

- Josiah

P.S. You're going to have to run a redis-benchmark on that P4; I'm curious to see how well it performs against processors that are 4-5 generations side-stepped (the core series was derived from the mobile Pentium chips, not the P4). Also, you may want to check out Dell refurbs (http://www.dfsdirectsales.com/), you can get Intel Core 2 duo/quads with 2-4 gigs of memory (easily upgraded) for $200-250. They are great dev/project machines, and generally support SATA2, DDR2 memory, PCI Express, etc.
