Hello, my team and I are new to Redis. We recently added a Redis remote cache to our application and have been live in production for a month or so. We are seeing a CPU spike virtually every hour, right at the top of the hour, which in turn causes socket timeouts in the Jedis client in our application. We normally average around 75% idle CPU, but for a few seconds at the top of each hour Redis maxes out the CPU. I can also see smaller CPU spikes at the top of each minute (i.e. 0, 1, or 2 seconds past the minute), but they are nowhere near as extreme or impactful as the spike at the top of the hour (i.e. 00 minutes past the hour).

We are not doing any replication or background saving. I also see these CPU spikes on our backup and development servers, even with no traffic or entries hitting the cache.

I'm providing the INFO stats, command stats, and sar output below in case they help. I'm losing my mind here, so any ideas or suggestions would be a huge help. Thanks in advance!
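For reference, the Jedis pool that is hitting the socket timeouts is created roughly like the sketch below (host, password, pool size, and timeout are placeholder values for illustration, not our exact configuration):

import redis.clients.jedis.JedisPool;
import redis.clients.jedis.JedisPoolConfig;

public class CachePool {
    // Placeholder values for illustration only.
    private static final String REDIS_HOST = "redis.example.com";
    private static final int REDIS_PORT = 6379;
    private static final int SOCKET_TIMEOUT_MS = 2000; // the timeout that trips when Redis stalls
    private static final String REDIS_PASSWORD = "secret";

    private static final JedisPoolConfig POOL_CONFIG = new JedisPoolConfig();
    static {
        POOL_CONFIG.setMaxTotal(700); // we typically see ~600 connected clients
    }

    public static final JedisPool POOL =
            new JedisPool(POOL_CONFIG, REDIS_HOST, REDIS_PORT, SOCKET_TIMEOUT_MS, REDIS_PASSWORD);
}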
info
# Server
redis_version:3.0.7
redis_git_sha1:00000000
redis_git_dirty:0
redis_build_id:72cf55e957a6f159
redis_mode:standalone
os:Linux 3.10.0-862.14.4.el7.x86_64 x86_64
arch_bits:64
multiplexing_api:epoll
gcc_version:4.8.5
process_id:1265
run_id:4ea2d39a518051c2934bbfe3b6079eab90713e4b
tcp_port:6379
uptime_in_seconds:1954685
uptime_in_days:22
hz:10
lru_clock:16533909
config_file:/opt/redis/conf/redis.conf
# Clients
connected_clients:604
client_longest_output_list:0
client_biggest_input_buf:125976
blocked_clients:0
# Memory
used_memory:5317756536
used_memory_human:4.95G
used_memory_rss:5719674880
used_memory_peak:6309424888
used_memory_peak_human:5.88G
used_memory_lua:36864
mem_fragmentation_ratio:1.08
mem_allocator:jemalloc-3.6.0
# Persistence
loading:0
rdb_changes_since_last_save:697187150
rdb_bgsave_in_progress:0
rdb_last_save_time:1541305880
rdb_last_bgsave_status:ok
rdb_last_bgsave_time_sec:-1
rdb_current_bgsave_time_sec:-1
aof_enabled:0
aof_rewrite_in_progress:0
aof_rewrite_scheduled:0
aof_last_rewrite_time_sec:-1
aof_current_rewrite_time_sec:-1
aof_last_bgrewrite_status:ok
aof_last_write_status:ok
# Stats
total_connections_received:7235698
total_commands_processed:1482965886
instantaneous_ops_per_sec:1552
total_net_input_bytes:41835380717205
total_net_output_bytes:3654192431288
instantaneous_input_kbps:46642.76
instantaneous_output_kbps:1763.65
rejected_connections:0
sync_full:0
sync_partial_ok:0
sync_partial_err:0
expired_keys:627574702
evicted_keys:0
keyspace_hits:68917352
keyspace_misses:697123893
pubsub_channels:0
pubsub_patterns:0
latest_fork_usec:0
migrate_cached_sockets:0
# Replication
role:master
connected_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0
# CPU
used_cpu_sys:324772.25
used_cpu_user:105355.06
used_cpu_sys_children:0.00
used_cpu_user_children:0.00
# Cluster
cluster_enabled:0
# Keyspace
db0:keys=83335,expires=83335,avg_ttl=20452
# Commandstats
cmdstat_get:calls=766297676,usec=6369187523,usec_per_call=8.31
cmdstat_setex:calls=697420911,usec=26393919537,usec_per_call=37.85
cmdstat_auth:calls=6441161,usec=81666380,usec_per_call=12.68
cmdstat_ping:calls=13303192,usec=58604787,usec_per_call=4.41
cmdstat_info:calls=434,usec=213743,usec_per_call=492.50
cmdstat_monitor:calls=4,usec=27,usec_per_call=6.75
cmdstat_config:calls=3,usec=226,usec_per_call=75.33
cmdstat_slowlog:calls=32,usec=1676,usec_per_call=52.38
sar
08:59:45 AM CPU %user %nice %system %iowait %steal %idle
08:59:55 AM all 5.68 0.00 20.45 0.00 0.00 73.86
08:59:56 AM all 5.68 0.00 18.75 0.00 0.00 75.57
08:59:57 AM all 4.92 0.00 14.21 0.00 0.00 80.87
08:59:58 AM all 4.97 0.00 14.92 0.00 0.00 80.11
08:59:59 AM all 3.87 0.00 16.02 0.00 0.00 80.11
09:00:00 AM all 4.97 0.00 16.57 0.00 0.00 78.45
09:00:01 AM all 19.58 0.00 34.97 0.00 0.00 45.45
09:00:02 AM all 6.38 0.00 24.47 0.00 0.00 69.15
09:00:05 AM all 23.38 0.00 76.62 0.00 0.00 0.00
09:00:06 AM all 27.94 0.00 72.06 0.00 0.00 0.00
09:00:07 AM all 22.73 0.00 68.18 0.00 0.00 9.09
09:00:08 AM all 40.00 0.00 60.00 0.00 0.00 0.00
09:00:09 AM all 53.93 0.00 46.07 0.00 0.00 0.00
09:00:10 AM all 45.33 0.00 54.67 0.00 0.00 0.00
09:00:11 AM all 24.79 0.00 29.91 0.00 0.00 45.30
09:00:12 AM all 13.25 0.00 56.02 0.00 0.00 30.72
09:00:13 AM all 14.20 0.00 46.15 0.00 0.00 39.64
09:00:14 AM all 17.75 0.00 53.25 0.00 0.00 28.99
09:00:15 AM all 12.74 0.00 61.15 0.00 0.00 26.11
09:00:16 AM all 10.86 0.00 52.00 0.00 0.00 37.14
09:00:17 AM all 11.05 0.00 61.63 0.00 0.00 27.33
09:00:18 AM all 11.43 0.00 53.71 0.00 0.00 34.86
09:00:19 AM all 9.04 0.00 34.46 0.00 0.00 56.50
save 900 1
save 300 10
save 60 10000
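If those save rules are still active in redis.conf, one way to rule RDB snapshotting in or out is to inspect and clear them at runtime through the Jedis client the original poster is already using (connection details are placeholders; CONFIG SET save "" disables snapshotting until the next restart):

import java.util.List;
import redis.clients.jedis.Jedis;

public class DisableSnapshots {
    public static void main(String[] args) {
        // Placeholder host/port/password for illustration.
        try (Jedis jedis = new Jedis("redis.example.com", 6379)) {
            jedis.auth("secret");

            // Shows the active rules, e.g. ["save", "900 1 300 10 60 10000"].
            List<String> save = jedis.configGet("save");
            System.out.println(save);

            // An empty value turns RDB snapshotting off until the next restart.
            jedis.configSet("save", "");
        }
    }
}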
Try configuring Redis as an LRU cache: set maxmemory to 6gb and maxmemory-policy to allkeys-lru, and replace your SETEX calls with plain SET.
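Roughly, that change would look like the following with Jedis (a sketch only; the connection details and key are placeholders, and the two CONFIG SET calls could just as well be maxmemory / maxmemory-policy lines in redis.conf):

import redis.clients.jedis.Jedis;

public class LruCacheSetup {
    public static void main(String[] args) {
        // Placeholder connection details for illustration.
        try (Jedis jedis = new Jedis("redis.example.com", 6379)) {
            jedis.auth("secret");

            // Cap memory and let Redis evict least-recently-used keys itself,
            // instead of relying on a 60 second TTL on every key.
            jedis.configSet("maxmemory", "6gb");
            jedis.configSet("maxmemory-policy", "allkeys-lru");

            // Application write path: plain SET instead of SETEX.
            jedis.set("some:key", "some value"); // was: jedis.setex("some:key", 60, "some value")
        }
    }
}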
First and foremost, thank you to everyone who responded and helped here. While we have yet to find the culprit, we will be focusing on the cron job theory. Again, these CPU spikes also occur on our backup and development servers, which have no traffic or keys in the cache.
Just to answer a few of the questions… the slow log has produced no unexpected results. The last save occurred almost a month ago, when the server was last restarted. And while I admit that we are using Redis in an unorthodox manner, it is by design and is providing a big impact/savings for our company. The cache hit rate of 9% is expected, and each key/value is intentionally set with an expiration of 60 seconds. There is no bulk loading taking place. We are using a JedisPool, but I do not have an exact understanding of how the connections are being used, so I may eventually shift my focus there.
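For illustration, our write and read path looks roughly like the sketch below (class and variable names are placeholders, not our actual code):

import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisPool;

public class CacheClient {
    private final JedisPool pool; // the shared JedisPool mentioned above

    public CacheClient(JedisPool pool) {
        this.pool = pool;
    }

    public void put(String key, String value) {
        // Borrow a connection, write with a 60 second expiration, and return it to the pool.
        try (Jedis jedis = pool.getResource()) {
            jedis.setex(key, 60, value);
        }
    }

    public String get(String key) {
        try (Jedis jedis = pool.getResource()) {
            return jedis.get(key); // most lookups miss and return null, which is expected (~9% hit rate)
        }
    }
}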
Again… thank you!