Strange growing cpu usage in 2.6.7

Matt Billenstein

unread,

Dec 21, 2012, 3:37:27 PM12/21/12

to redi...@googlegroups.com

I've noticed a couple times redis has a cpu pegged at 100% and operations against it become very slow - restarting the process brings it back down to normal 1-3% usage that I expect and makes normal ops become fast again. This seems to have started when I upgraded from 2.4.17 to 2.6.7. After restarting yesterday, I'm in the 20-25% cpu utilization range as reported by top.

I use redis mainly as a cache with some some pubsub stuff also happening through the pubsub commands and blocking list pops. At any one time I might have 200-300 clients blocked and a small number doing get/set. Memory usage is low - < 1GB and the server seems generally snappy with low load -- > 90% idle even with this process spinning.

I'm going to leave this running for now in case anyone has any ideas on how to debug what's happening here - after it gets to maybe 50% cpu, I'll probably roll back to 2.4 and try that.

Stats:

$ redis-cli info
# Server
redis_version:2.6.7
redis_git_sha1:00000000
redis_git_dirty:0
redis_mode:standalone
os:Linux 3.2.0-23-generic x86_64
arch_bits:64
multiplexing_api:epoll
gcc_version:4.6.3
process_id:29234
run_id:743277e68006d805b447558cc29df5f41dedb6b9
tcp_port:6379
uptime_in_seconds:70288
uptime_in_days:0
lru_clock:1394474

# Clients
connected_clients:277
client_longest_output_list:0
client_biggest_input_buf:0
blocked_clients:261

# Memory
used_memory:1576423696
used_memory_human:1.47G
used_memory_rss:798629888
used_memory_peak:1577030184
used_memory_peak_human:1.47G
used_memory_lua:31744
mem_fragmentation_ratio:0.51
mem_allocator:jemalloc-3.2.0

# Persistence
loading:0
rdb_changes_since_last_save:542876
rdb_bgsave_in_progress:0
rdb_last_save_time:1356051735
rdb_last_bgsave_status:ok
rdb_last_bgsave_time_sec:-1
rdb_current_bgsave_time_sec:-1
aof_enabled:0
aof_rewrite_in_progress:0
aof_rewrite_scheduled:0
aof_last_rewrite_time_sec:1
aof_current_rewrite_time_sec:-1
aof_last_bgrewrite_status:ok

# Stats
total_connections_received:1389953
total_commands_processed:5249307
instantaneous_ops_per_sec:40
rejected_connections:0
expired_keys:156224
evicted_keys:0
keyspace_hits:597436
keyspace_misses:1458886
pubsub_channels:3
pubsub_patterns:0
latest_fork_usec:28577

# Replication
role:master
connected_slaves:0

# CPU
used_cpu_sys:220.86
used_cpu_user:7416.73
used_cpu_sys_children:0.91
used_cpu_user_children:0.14

# Keyspace
db0:keys=57958,expires=57947

$ cat /proc/29234/status
Name:   redis-server
State: R (running)
Tgid:   29234
Pid:    29234
PPid:   1
TracerPid:      0
Uid:    1000    1000    1000    1000
Gid:    1000    1000    1000    1000
FDSize: 512
Groups: 4 24 27 30 46 111 112 1000
VmPeak: 1607880 kB
VmSize: 1607880 kB
VmLck:         0 kB
VmPin:         0 kB
VmHWM:    780692 kB
VmRSS:    780692 kB
VmData: 1597796 kB
VmStk:       136 kB
VmExe:       760 kB
VmLib:      2984 kB
VmPTE:      3140 kB
VmSwap:        0 kB
Threads:        3
SigQ:   1/514571
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 0000000000001001
SigCgt: 00000001800044c8
CapInh: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000
CapBnd: ffffffffffffffff
Cpus_allowed:   ffffffff,ffffffff
Cpus_allowed_list:      0-63
Mems_allowed:   00000000,00000003
Mems_allowed_list:      0-1
voluntary_ctxt_switches:        13163553
nonvoluntary_ctxt_switches:     767016

thx

m

Matt Billenstein

unread,

Dec 21, 2012, 3:38:20 PM12/21/12

to redi...@googlegroups.com

Also, I'll add, I've tried restarting all the processes which talk to redis which should disconnect all clients - cpu usage remained flat at about 20-25%.

m

Salvatore Sanfilippo

unread,

Dec 21, 2012, 3:45:14 PM12/21/12

to Redis DB

Hello Matt,

there is a CPU utilization issue with Redis 2.6, but it is 1% when
idle instead of mostly 0% with 2.4.
However this was fixed into the unstable and 2.8 development branches.

Now this instead looks very odd. Can I ask you what happens if you
stop all the clients in some way if this is possible? Does the CPU
drops to 0%? This is definitely worth investigating as indeed it could
be a bug.

Another question is if you see strange things reported into the slow log.

If this does not help to identify the problem better, I've other
ideas. Please let's continue in this thread until the problem is
fixed, or if you prever for some reason we can switch to a Github
issue.

Cheers,
Salvatore

> --
> You received this message because you are subscribed to the Google Groups
> "Redis DB" group.
> To view this discussion on the web visit
> https://groups.google.com/d/msg/redis-db/-/5kf9q3mLoasJ.
> To post to this group, send email to redi...@googlegroups.com.
> To unsubscribe from this group, send email to
> redis-db+u...@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/redis-db?hl=en.

--
Salvatore 'antirez' Sanfilippo
open source developer - VMware
http://invece.org

Beauty is more important in computing than anywhere else in technology
because software is so complicated. Beauty is the ultimate defence
against complexity.
— David Gelernter

Salvatore Sanfilippo

unread,

Dec 21, 2012, 3:46:38 PM12/21/12

to Redis DB

p.s. also some trace obtained using strace -p could help as well.

Yiftach Shoolman

unread,

Dec 21, 2012, 3:50:56 PM12/21/12

to redi...@googlegroups.com

Also, do you have any persistent storage configuration, if "yes" sharing this info can help finding the issue

Yiftach Shoolman
+972-54-7634621

Matt Billenstein

unread,

Dec 21, 2012, 4:07:20 PM12/21/12

to redi...@googlegroups.com

Hmm, perhaps a false alarm -- stopping the processes doing those keys commands brought the cpu usage down -- the reason a restart was working was because these processes were dying due to being disconnected from redis...

m

Salvatore Sanfilippo

unread,

Dec 21, 2012, 4:33:39 PM12/21/12

to Redis DB

On Fri, Dec 21, 2012 at 10:07 PM, Matt Billenstein <ma...@vazor.com> wrote:
> Hmm, perhaps a false alarm -- stopping the processes doing those keys
> commands brought the cpu usage down -- the reason a restart was working was
> because these processes were dying due to being disconnected from redis...

Another win for the Redis Slow Log :-) This calls for more
observability as many users are asking.
Thank you Matt.

Cheers,
Salvatore

Reply all

Reply to author

Forward