Very high CPU usage in Redis cluster


Sumit Singh

May 12, 2022, 11:15:32 AM
to Redis DB

[Attachment: Screenshot 2022-05-12 at 12.13.25 PM.png]
I am running a 12-node cluster (6 masters, 6 replicas).
I cannot figure out what is causing such high CPU usage.
  • Each node runs on its own AWS EC2 c5n.2xlarge instance (8 cores); 12 machines in total.
  • Peak network I/O is 50-60 MBps.
  • io-threads is set to 6 and the save feature (RDB snapshots) is off.
  • Instantaneous operations per second peak at about 500 per node.
  • The cluster is used only for a Pub/Sub workload.
My EC2 instances report CPU usage of up to 80% during peak periods.
Is it safe to run the cluster in this configuration, or is there something wrong with my Redis setup?

Output of the INFO command on one master:

# Server
redis_version:6.2.6
redis_git_sha1:00000000
redis_git_dirty:0
redis_build_id:ff608f42b3f8bc5a
redis_mode:cluster
os:Linux 5.11.0-1020-aws x86_64
arch_bits:64
multiplexing_api:epoll
atomicvar_api:c11-builtin
gcc_version:9.4.0
process_id:16829
process_supervised:no
run_id:590c744334a6abed6d2cc6bb5731528fa54add14
tcp_port:7000
server_time_usec:1652337925894382
uptime_in_seconds:2377260
uptime_in_days:27
hz:10
configured_hz:10
lru_clock:8170757
executable:/usr/local/bin/redis-server
config_file:/etc/redis/redis.conf
io_threads_active:1

# Clients
connected_clients:1985
cluster_connections:22
maxclients:65536
client_recent_max_input_buffer:32
client_recent_max_output_buffer:21632
blocked_clients:0
tracking_clients:0
clients_in_timeout_table:0

# Memory
used_memory:2211133432
used_memory_human:2.06G
used_memory_rss:65937408
used_memory_rss_human:62.88M
used_memory_peak:2234166104
used_memory_peak_human:2.08G
used_memory_peak_perc:98.97%
used_memory_overhead:2193319424
used_memory_startup:5121992
used_memory_dataset:17814008
used_memory_dataset_perc:0.81%
allocator_allocated:2211826304
allocator_active:2220785664
allocator_resident:2228867072
total_system_memory:21450035200
total_system_memory_human:19.98G
used_memory_lua:37888
used_memory_lua_human:37.00K
used_memory_scripts:0
used_memory_scripts_human:0B
number_of_cached_scripts:0
maxmemory:0
maxmemory_human:0B
maxmemory_policy:noeviction
allocator_frag_ratio:1.00
allocator_frag_bytes:8959360
allocator_rss_ratio:1.00
allocator_rss_bytes:8081408
rss_overhead_ratio:0.03
rss_overhead_bytes:-2162929664
mem_fragmentation_ratio:0.03
mem_fragmentation_bytes:-2145603784
mem_not_counted_for_evict:0
mem_replication_backlog:2147483648
mem_clients_slaves:20512
mem_clients_normal:40693272
mem_aof_buffer:0
mem_allocator:jemalloc-5.1.0
active_defrag_running:0
lazyfree_pending_objects:0
lazyfreed_objects:0

# Persistence
loading:0
current_cow_size:0
current_cow_size_age:0
current_fork_perc:0.00
current_save_keys_processed:0
current_save_keys_total:0
rdb_changes_since_last_save:0
rdb_bgsave_in_progress:0
rdb_last_save_time:1649960665
rdb_last_bgsave_status:ok
rdb_last_bgsave_time_sec:0
rdb_current_bgsave_time_sec:-1
rdb_last_cow_size:528384
aof_enabled:0
aof_rewrite_in_progress:0
aof_rewrite_scheduled:0
aof_last_rewrite_time_sec:-1
aof_current_rewrite_time_sec:-1
aof_last_bgrewrite_status:ok
aof_last_write_status:ok
aof_last_cow_size:0
module_fork_in_progress:0
module_fork_last_cow_size:0

# Stats
total_connections_received:106474705
total_commands_processed:458234053
instantaneous_ops_per_sec:286
total_net_input_bytes:105277842309
total_net_output_bytes:30471552871778
instantaneous_input_kbps:72.67
instantaneous_output_kbps:29856.47

rejected_connections:0
sync_full:1
sync_partial_ok:0
sync_partial_err:1
expired_keys:0
expired_stale_perc:0.00
expired_time_cap_reached_count:0
expire_cycle_cpu_milliseconds:27968
evicted_keys:0
keyspace_hits:0
keyspace_misses:4
pubsub_channels:146
pubsub_patterns:73

latest_fork_usec:362
total_forks:1
migrate_cached_sockets:0
slave_expires_tracked_keys:0
active_defrag_hits:0
active_defrag_misses:0
active_defrag_key_hits:0
active_defrag_key_misses:0
tracking_total_keys:0
tracking_total_items:0
tracking_total_prefixes:0
unexpected_error_replies:0
total_error_replies:1
dump_payload_sanitizations:0
total_reads_processed:542660612
total_writes_processed:47265437793
io_threaded_reads_processed:0
io_threaded_writes_processed:47065492223

# Replication
role:master
connected_slaves:1
slave0:ip=172.31.154.0,port=7007,state=online,offset=3294858,lag=0
master_failover_state:no-failover
master_replid:942586d3d9a105bbc13c473bfaab13fae83c35de
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:3294858
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:2147483648
repl_backlog_first_byte_offset:1
repl_backlog_histlen:3294858

# CPU
used_cpu_sys:521862.964736
used_cpu_user:7796612.363067
used_cpu_sys_children:0.000000
used_cpu_user_children:0.001205
used_cpu_sys_main_thread:111576.865254
used_cpu_user_main_thread:77105.481941

# Modules

# Errorstats
errorstat_ERR:count=1

# Cluster
cluster_enabled:1

# Keyspace
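As a rough cross-check of the counters above, here is a minimal Python sketch that turns the `# CPU` and `# Stats` fields into averages over the process lifetime. The helper names (`parse_info`, `cpu_summary`, `output_to_input_ratio`) are made up for this post; the numbers are copied from the INFO output, and the result is a lifetime average, not the peak figure reported by EC2.

```python
# Values copied from the INFO output above; only the fields used below are kept.
INFO_EXCERPT = """\
# Server
uptime_in_seconds:2377260
# Stats
total_net_input_bytes:105277842309
total_net_output_bytes:30471552871778
# CPU
used_cpu_sys:521862.964736
used_cpu_user:7796612.363067
used_cpu_sys_main_thread:111576.865254
used_cpu_user_main_thread:77105.481941
"""


def parse_info(text):
    """Parse 'key:value' lines, skipping blank lines and '# Section' headers."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, value = line.split(":", 1)
        fields[key] = value
    return fields


def cpu_summary(info):
    """Average CPU use since process start, split into main thread and the rest."""
    uptime = float(info["uptime_in_seconds"])
    total = float(info["used_cpu_sys"]) + float(info["used_cpu_user"])
    main = (float(info["used_cpu_sys_main_thread"])
            + float(info["used_cpu_user_main_thread"]))
    return {
        "avg_cores": total / uptime,                    # cores busy on average
        "main_thread_share": main / total,              # fraction spent in main thread
        "other_threads_cores": (total - main) / uptime, # everything except main thread
    }


def output_to_input_ratio(info):
    """Bytes sent per byte received over the process lifetime."""
    return int(info["total_net_output_bytes"]) / int(info["total_net_input_bytes"])


info = parse_info(INFO_EXCERPT)
summary = cpu_summary(info)
fan_out = output_to_input_ratio(info)

print(f"average cores busy since start: {summary['avg_cores']:.2f}")
print(f"main thread share of CPU time:  {summary['main_thread_share']:.1%}")
print(f"cores used outside main thread: {summary['other_threads_cores']:.2f}")
print(f"output/input byte ratio:        {fan_out:.0f}")
```

With these figures the process has averaged roughly 3.5 cores since start, and the main thread accounts for only about 2% of that CPU time. That suggests most of the CPU is spent outside the main thread, presumably in the I/O threads, while the output-to-input byte ratio of roughly 290 reflects the Pub/Sub fan-out.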

redis.conf:
bind 172.31.157.136
client-output-buffer-limit normal 0 0 0
client-output-buffer-limit pubsub 5gb 5gb 60
client-output-buffer-limit slave 5gb 5gb 60
cluster-allow-reads-when-down no
cluster-config-file nodes.conf
cluster-enabled yes
cluster-node-timeout 5000
cluster-replica-validity-factor 0
cluster-require-full-coverage yes
daemonize no
dbfilename dump.rdb
dir /var/lib/redis
disable-thp yes
io-threads 6
logfile /var/log/redis/redis-server.log
maxclients 65536
pidfile /var/run/redis/redis-server.pid
port 7000
rdbchecksum yes
rdbcompression yes
repl-backlog-size 2gb
repl-diskless-sync yes
replica-priority 100
replica-read-only yes
replica-serve-stale-data yes
save ""
stop-writes-on-bgsave-error no
tcp-backlog 65536
tcp-keepalive 300
timeout 0
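To make the threading-related part of this config easier to read, here is a small Python sketch that parses redis.conf-style lines and reports the I/O thread settings. The function names are hypothetical, and the fallback values (`io-threads 1`, `io-threads-do-reads no`) are my understanding of the Redis 6.x defaults, so treat them as assumptions.

```python
import shlex

# A few directives copied from the redis.conf above.
CONF_EXCERPT = """\
client-output-buffer-limit normal 0 0 0
client-output-buffer-limit pubsub 5gb 5gb 60
client-output-buffer-limit slave 5gb 5gb 60
io-threads 6
save ""
"""


def parse_conf(text):
    """Map each directive name to a list of argument lists (directives can repeat)."""
    directives = {}
    for line in text.splitlines():
        # shlex handles quoted arguments such as save "" and strips '#' comments.
        tokens = shlex.split(line, comments=True)
        if not tokens:
            continue
        directives.setdefault(tokens[0].lower(), []).append(tokens[1:])
    return directives


def io_thread_settings(directives):
    """Return the effective I/O thread settings; the last occurrence wins.

    Assumption: when a directive is absent, Redis 6.x uses io-threads 1 and
    io-threads-do-reads no.
    """
    threads = int(directives.get("io-threads", [["1"]])[-1][0])
    do_reads = directives.get("io-threads-do-reads", [["no"]])[-1][0]
    return {"io-threads": threads, "io-threads-do-reads": do_reads}


conf = parse_conf(CONF_EXCERPT)
settings = io_thread_settings(conf)
print(settings)
```

Since io-threads-do-reads is not set in my config, only writes should be threaded, which matches io_threaded_reads_processed:0 alongside the large io_threaded_writes_processed value in the INFO output.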
