High sys CPU vs. user CPU and low throughput

93 views
Skip to first unread message

Dean

unread,
Aug 21, 2011, 2:32:18 PM8/21/11
to Redis DB
Hi all,

Running Redis 2.2.11 on a 12 core ~2.6Ghz box in a test environment.
For our test, we pre-populate Redis using a snapshot from production,
then turn one client loose on it. We have a single Jedis-based client
doing mostly O(n) calls (RPOPLPUSH, DEL, LLEN, LREM) and one HGETALL
on a 25 member hash - We are not using any KEYS, intersection, or
union calls. The client establishes around 50 connections to Redis.
There are no other connections other than one instance of redis-cli
and our monitoring system (which connects via telnet once every 60
seconds).

We are seeing performance limits around 5k TPS with client and server
on the same box via TCP/IP, so something is clearly not happy. One
thing that looks very atypical vs. the Redis instances that we use in
production is that the ratio of user CPU to sys CPU is highly biased
toward the system.

A recent test had us pegged around 5k TPS, 86% sys CPU and 4% user
CPU. The issue seems specific to the Jedis client or the way that
we're using it. We have a developer currently chasing things from the
Jedis side, but I've volunteered to look for clues on the redis-server
end of things.

We initially thought that the issue was related to high connection
churn. While it turns out that we had some, resolving it did not
change the behavior described above. Before I dive into poor man's
profiler, I though I'd hit the group up to see if anyone had seen this
type of behavior before and could volunteer possible causes.

Cheers,

Dean

Dean

unread,
Aug 21, 2011, 2:48:19 PM8/21/11
to Redis DB
I misstated - most of our commands are O(1), with HGETALL being the
only O(n) command.

Thanks,

Dean

Pieter Noordhuis

unread,
Aug 22, 2011, 9:48:14 AM8/22/11
to redi...@googlegroups.com
Hi Dean,

The 5kops performance ceiling could be the real maximum when you have
configured Redis to use the most aggressive AOF settings and you have
a slow disk. This would also explain the high percentage sys. Can you
post the output of INFO for this node? Also: if you don't have AOF
configured you could run redis-benchmark with the same number of
connections to find out its maximum throughput. If that is close to
the numbers you see from Jedis this is probably kernel related,
otherwise it is more likely to be Jedis/JVM related.

Cheers,
Pieter

> --
> You received this message because you are subscribed to the Google Groups "Redis DB" group.
> To post to this group, send email to redi...@googlegroups.com.
> To unsubscribe from this group, send email to redis-db+u...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/redis-db?hl=en.
>
>

Dean

unread,
Aug 22, 2011, 11:19:36 AM8/22/11
to Redis DB
Hi Pieter,

Thanks for your reply. Since this instance is only used for
throughput testing, all persistence is disabled. We load a *.rdb
snapshot at startup as part of our fixturing, then don't do anything
else disk-based.

Using redis-benchmark on the same node as our test client, 50 clients,
20 byte keys...no perf problem:

MSET (10 keys): 81967.21 requests per second
SET: 138888.89 requests per second
GET: 135135.14 requests per second
INCR: 140845.06 requests per second
LPUSH: 135135.14 requests per second
LPOP: 136986.30 requests per second

I also tried the same redis-benchmark test in a loop, and observed a
more normal 42% user CPU and 58% sys CPU split.

INFO dump appears below. Signs point to Jedis/JVM, but it seems
curious that a misconfigured/misbehaved client could so significantly
impair what seems to be an otherwise healthy and very capable redis-
server instance. Put another way, I'm chasing this for two reasons--
one, because I'd like to help our JVM developer find his performance
bottleneck and two, from an ops perspective I hadn't realized that one
client, even using benign calls, could severely impact the performance
of a shared redis-server instance.

Cheers,

Dean

redis_version:2.2.11
redis_git_sha1:00000000
redis_git_dirty:0
arch_bits:64
multiplexing_api:epoll
process_id:16347
uptime_in_seconds:36985
uptime_in_days:0
lru_clock:1379121
used_cpu_sys:1166.44
used_cpu_user:26.20
used_cpu_sys_childrens:0.00
used_cpu_user_childrens:0.00
connected_clients:1
connected_slaves:0
client_longest_output_list:0
client_biggest_input_buf:0
blocked_clients:0
used_memory:1120703264
used_memory_human:1.04G
used_memory_rss:1193054208
mem_fragmentation_ratio:1.06
use_tcmalloc:1
loading:0
aof_enabled:0
changes_since_last_save:2059145
bgsave_in_progress:0
last_save_time:1313988472
bgrewriteaof_in_progress:0
total_connections_received:1898
total_commands_processed:7125181
expired_keys:4
evicted_keys:0
keyspace_hits:2031922
keyspace_misses:76307
hash_max_zipmap_entries:256
hash_max_zipmap_value:168
pubsub_channels:0
pubsub_patterns:0
vm_enabled:0
role:master
allocation_stats:
2=451557,6=1,7=2,8=3448655,9=91863,10=3877012,11=2893788,12=7722661,13=7190671,14=4486008,15=1948489,16=54707195,17=3237212,18=1170110,19=3211500,20=235093,21=287278,22=1480435,23=1203669,24=10079609,25=84741,26=215927,27=66528,28=15288,29=63428,30=141987,31=96306,32=5372416,33=16152,34=96022,35=15776,36=24026,37=21863,38=2705,39=402672,40=100808,41=18748,42=15057,43=29884,44=13604,45=106115,46=118183,47=12208,48=1739283,49=29076,50=317320,51=47183,52=90061,53=17985,54=8496,55=48353,56=92032,57=37733,58=11558,59=157767,60=48863,61=22656,62=21885,63=22127,64=272445,65=105332,66=308910,67=129241,68=20292,69=27185,70=15491,71=167286,72=603791,73=128738,74=14610,75=55334,76=17727,77=104561,78=19619,79=97816,80=18938,81=19126,82=28582,83=97010,84=76362,85=67444,86=31753,87=12532,88=300806,89=40610,90=45209,91=66709,92=72521,93=42245,94=29933,95=46089,96=100469,97=116238,98=66877,99=130024,100=45517,101=27527,102=27616,103=29307,104=50862,105=88922,106=61174,107=37111,108=77811,109=29533,110=21434,111=24996,112=13999,113=20958,114=23530,115=111981,116=26695,117=41817,118=44301,119=57606,120=72108,121=26597,122=21546,123=28813,124=17610,125=20013,126=34400,127=18971,128=267851,129=17378,130=17389,131=22439,132=15163,133=14505,134=25859,135=53945,136=37476,137=10633,138=26986,139=12542,140=11559,141=22980,142=8688,143=10352,144=19409,145=21671,146=18218,147=16443,148=28934,149=21970,150=17829,151=24994,152=15872,153=15329,154=17688,155=10820,156=10037,157=9696,158=15727,159=18307,160=19306,161=18826,162=22059,163=16983,164=11491,165=10396,166=10219,167=8145,168=7657,169=9987,170=9193,171=16361,172=8904,173=16732,174=9333,175=9498,176=5630,177=7637,178=4650,179=3755,180=4050,181=3608,182=4097,183=6462,184=7855,185=13796,186=12502,187=16452,188=28221,189=32362,190=22217,191=10053,192=6453,193=5403,194=3750,195=4677,196=4510,197=5547,198=3766,199=7937,200=11366,201=8317,202=7499,203=7077,204=4137,205=5475,206=4216,207=25837,208=5055,209=16889,210=6355,211=9726,212=9761,213=14765,214=17869,215=19835,216=8328,217=9897,218=4315,219=10332,220=5166,221=8200,222=4583,223=4932,224=4003,225=6551,226=9599,227=7405,228=7726,229=9618,230=11141,231=25402,232=13531,233=11402,234=9229,235=7644,236=7742,237=10380,238=13643,239=16801,240=17720,241=17095,242=22827,243=32857,244=35723,245=25874,246=15273,247=10181,248=9422,249=11373,250=13826,251=15594,252=17078,253=21007,254=20604,255=27643,>=256=4720106
db0:keys=403740,expires=0
db1:keys=28708,expires=0

Pieter Noordhuis

unread,
Aug 22, 2011, 11:54:24 AM8/22/11
to redi...@googlegroups.com
Hi Dean,

The percentages USR and SYS you mention belong to the Redis process
alone, correct? Such a big skew should be observable by means of
strace. The following one-liner prints system calls taking more than
0.1ms. It doesn't show epoll_wait calls because its time is of no use.

$ sudo strace -T -tt -e 'trace=!epoll_wait' -p $(pgrep redis-server)
2>&1 | ruby -e "while line = STDIN.gets; puts line if
line[/<(\d+\.\d+)>/, 1].to_f > 0.000100; end"

Note that you'll have some outliers here and there, but given the
amount of kernel time of the Redis process this snippet should have
different output (bigger outliers, higher average) when the Jedis
clients connect as opposed to the (normal) redis-benchmark test.

Cheers,
Pieter

Pieter Noordhuis

unread,
Aug 22, 2011, 11:55:19 AM8/22/11
to redi...@googlegroups.com
And be sure to concatenate that command on a single line if it gets wrapped :-)

Cheers,
Pieter

Dean

unread,
Aug 22, 2011, 1:25:12 PM8/22/11
to Redis DB
Hi Pieter,

Correct-- the USR and SYS belong to the Redis process alone, derived
from INFO by our monitoring system.

I tried strace initially at a 0.1ms threshold, but switched to a 1.0ms
threshold due to the volume of calls emitted at the 0.1ms level. What
follows is a few lines of each run. Apologies for the verbose post!

Cheers,

Dean

1.0ms threshold:

10:18:16.188622 epoll_ctl(3, EPOLL_CTL_MOD, 12, {EPOLLIN, {u32=12,
u64=12}}) = 0 <0.003865>
10:18:16.233628 write(13, "+OK\r\n+QUEUED\r\n+QUEUED\r\n*2\r\n:0\r
\n:"..., 35) = 35 <0.001364>
10:18:16.253800 write(12, "+QUEUED\r\n+QUEUED\r\n+QUEUED\r\n+QUEU"...,
1496) = 1496 <0.002405>
10:18:16.276025 epoll_ctl(3, EPOLL_CTL_MOD, 7, {EPOLLIN, {u32=7,
u64=7}}) = 0 <0.001079>
10:18:16.409464 epoll_ctl(3, EPOLL_CTL_MOD, 13, {EPOLLIN|EPOLLOUT,
{u32=13, u64=13}}) = 0 <0.001190>
10:18:16.511810 write(7, "+OK\r\n+QUEUED\r\n+QUEUED\r\n*2\r\n:0\r
\n:"..., 35) = 35 <0.001181>
10:18:16.670016 write(12, "+QUEUED\r\n+QUEUED\r\n+QUEUED\r\n+QUEU"...,
1496) = 1496 <0.001052>
10:18:18.216759 write(9, "+OK\r\n+QUEUED\r\n+QUEUED\r\n*2\r\n:0\r
\n:"..., 35) = 35 <0.001172>
10:18:18.217999 epoll_ctl(3, EPOLL_CTL_MOD, 9, {EPOLLIN, {u32=9,
u64=9}}) = 0 <0.004349>
10:18:18.245676 epoll_ctl(3, EPOLL_CTL_MOD, 18, {EPOLLIN|EPOLLOUT,
{u32=18, u64=18}}) = 0 <0.002916>
10:18:18.257650 write(19, "+OK\r\n+QUEUED\r\n+QUEUED\r\n*2\r\n:0\r
\n:"..., 35) = 35 <0.001471>
10:18:19.787420 epoll_ctl(3, EPOLL_CTL_MOD, 15, {EPOLLIN, {u32=15,
u64=15}}) = 0 <0.001252>
10:18:20.752972 epoll_ctl(3, EPOLL_CTL_MOD, 5, {EPOLLIN, {u32=5,
u64=5}}) = 0 <0.001309>
10:18:21.033418 write(10, "+OK\r\n+QUEUED\r\n+QUEUED\r\n*2\r\n:0\r
\n:"..., 35) = 35 <0.012612>
10:18:23.815541 read(10, "*1\r\n$5\r\nMULTI\r\n*4\r\n$4\r\nLREM\r\n
$7\r"..., 1024) = 107 <0.003376>
10:18:24.203077 write(16, "+OK\r\n+QUEUED\r\n+QUEUED\r\n*2\r\n:0\r
\n:"..., 35) = 35 <0.004387>
10:18:25.174018 epoll_ctl(3, EPOLL_CTL_MOD, 6, {EPOLLIN|EPOLLOUT,
{u32=6, u64=6}}) = 0 <0.007471>
10:18:26.663663 write(8, "+OK\r\n+QUEUED\r\n+QUEUED\r\n*2\r\n:0\r
\n:"..., 35) = 35 <0.001003>
10:18:26.944797 write(19, "+OK\r\n+QUEUED\r\n+QUEUED\r\n*2\r\n:0\r
\n:"..., 35) = 35 <0.001202>
10:18:28.271733 write(18, "+OK\r\n+QUEUED\r\n+QUEUED\r\n*2\r\n:0\r
\n:"..., 35) = 35 <0.011329>
10:18:28.294274 write(12, "+OK\r\n+QUEUED\r\n+QUEUED\r\n*2\r\n:0\r
\n:"..., 35) = 35 <0.001754>
10:18:31.800587 write(14, "+OK\r\n+QUEUED\r\n+QUEUED\r\n*2\r\n:0\r
\n:"..., 35) = 35 <0.005289>
10:18:31.961712 write(18, "+OK\r\n+QUEUED\r\n+QUEUED\r\n*2\r\n:0\r
\n:"..., 35) = 35 <0.001020>
10:18:34.646210 write(14, "+OK\r\n+QUEUED\r\n+QUEUED\r\n*2\r\n:0\r
\n:"..., 35) = 35 <0.001851>
10:18:35.661158 epoll_ctl(3, EPOLL_CTL_MOD, 6, {EPOLLIN, {u32=6,
u64=6}}) = 0 <0.001956>
10:18:38.229525 write(19, "+OK\r\n+QUEUED\r\n+QUEUED\r\n*2\r\n:0\r
\n:"..., 35) = 35 <0.003662>
10:18:39.047475 write(7, "+OK\r\n+QUEUED\r\n+QUEUED\r\n*2\r\n:0\r
\n:"..., 35) = 35 <0.003360>
10:18:39.377400 write(9, "+OK\r\n+QUEUED\r\n+QUEUED\r\n*2\r\n:0\r
\n:"..., 35) = 35 <0.001430>
10:18:44.789518 write(7, "+OK\r\n+QUEUED\r\n+QUEUED\r\n*2\r\n:0\r
\n:"..., 35) = 35 <0.006046>
10:18:45.134278 epoll_ctl(3, EPOLL_CTL_MOD, 7, {EPOLLIN|EPOLLOUT,
{u32=7, u64=7}}) = 0 <0.015380>
10:18:46.183777 write(9, "+OK\r\n+QUEUED\r\n+QUEUED\r\n*2\r\n:0\r
\n:"..., 35) = 35 <0.001778>
10:18:46.803998 write(18, "+OK\r\n+QUEUED\r\n+QUEUED\r\n*2\r\n:0\r
\n:"..., 35) = 35 <0.003490>
10:18:47.478052 epoll_ctl(3, EPOLL_CTL_MOD, 9, {EPOLLIN, {u32=9,
u64=9}}) = 0 <0.001012>

0.1ms threshold:

10:15:21.038459 epoll_ctl(3, EPOLL_CTL_MOD, 15, {EPOLLIN|EPOLLOUT,
{u32=15, u64=15}}) = 0 <0.000187>
10:15:21.048211 write(7, "+OK\r\n+QUEUED\r\n+QUEUED\r\n*2\r\n:0\r
\n:"..., 35) = 35 <0.000119>
10:15:21.051098 write(7, "+OK\r\n+QUEUED\r\n+QUEUED\r\n*2\r\n:0\r
\n:"..., 35) = 35 <0.000126>
10:15:21.051857 write(13, "+OK\r\n+QUEUED\r\n+QUEUED\r\n*2\r\n:0\r
\n:"..., 35) = 35 <0.000143>
10:15:21.056802 write(15, "+OK\r\n+QUEUED\r\n+QUEUED\r\n*2\r\n:0\r
\n:"..., 35) = 35 <0.000163>
10:15:21.061832 write(7, "+OK\r\n+QUEUED\r\n+QUEUED\r\n*2\r\n:0\r
\n:"..., 35) = 35 <0.000105>
10:15:21.066182 read(13, "*1\r\n$5\r\nMULTI\r\n*4\r\n$4\r\nLREM\r\n
$7\r"..., 1024) = 107 <0.000105>
10:15:21.072175 write(7, "+OK\r\n+QUEUED\r\n+QUEUED\r\n*2\r\n:0\r
\n:"..., 35) = 35 <0.000138>
10:15:21.075149 write(11, "+OK\r\n+QUEUED\r\n+QUEUED\r\n*2\r\n:0\r
\n:"..., 35) = 35 <0.000102>
10:15:21.084490 write(6, "+OK\r\n+QUEUED\r\n+QUEUED\r\n*2\r\n:0\r
\n:"..., 35) = 35 <0.000107>
10:15:21.152902 write(15, "+OK\r\n+QUEUED\r\n+QUEUED\r\n*2\r\n:0\r
\n:"..., 35) = 35 <0.000160>
10:15:21.154809 epoll_ctl(3, EPOLL_CTL_MOD, 5, {EPOLLIN|EPOLLOUT,
{u32=5, u64=5}}) = 0 <0.000104>
10:15:21.160443 write(11, "+OK\r\n+QUEUED\r\n+QUEUED\r\n*2\r\n:0\r
\n:"..., 35) = 35 <0.000134>
10:15:21.170793 write(19, "+OK\r\n+QUEUED\r\n+QUEUED\r\n*2\r\n:0\r
\n:"..., 35) = 35 <0.000113>
10:15:21.176113 write(15, "+OK\r\n+QUEUED\r\n+QUEUED\r\n*2\r\n:0\r
\n:"..., 35) = 35 <0.000122>
10:15:21.182217 read(10, "*1\r\n$5\r\nMULTI\r\n*4\r\n$4\r\nLREM\r\n
$7\r"..., 1024) = 107 <0.000296>
10:15:21.195841 write(16, "+OK\r\n+QUEUED\r\n+QUEUED\r\n*2\r\n:0\r
\n:"..., 35) = 35 <0.000101>
10:15:21.197473 read(15, "*1\r\n$5\r\nMULTI\r\n*4\r\n$4\r\nLREM\r\n
$7\r"..., 1024) = 107 <0.000110>
10:15:21.199504 write(6, "+OK\r\n+QUEUED\r\n+QUEUED\r\n*2\r\n:0\r
\n:"..., 35) = 35 <0.000125>
10:15:21.202458 write(16, "+OK\r\n+QUEUED\r\n+QUEUED\r\n*2\r\n:0\r
\n:"..., 35) = 35 <0.000207>
10:15:21.241046 read(5, "*1\r\n$5\r\nMULTI\r\n*4\r\n$4\r\nLREM\r\n
$7\r"..., 1024) = 107 <0.000130>
10:15:21.248711 write(8, "+OK\r\n+QUEUED\r\n+QUEUED\r\n*2\r\n:0\r
\n:"..., 35) = 35 <0.000132>
10:15:21.249517 read(6, "*1\r\n$5\r\nMULTI\r\n*4\r\n$4\r\nLREM\r\n
$7\r"..., 1024) = 107 <0.000136>
10:15:21.252410 read(18, "*1\r\n$5\r\nMULTI\r\n*4\r\n$4\r\nLREM\r\n
$7\r"..., 1024) = 107 <0.000123>


On Aug 22, 8:55 am, Pieter Noordhuis <pcnoordh...@gmail.com> wrote:
> And be sure to concatenate that command on a single line if it gets wrapped :-)
>
> Cheers,
> Pieter
>
>
>
>
>
>
>
> On Mon, Aug 22, 2011 at 5:54 PM, Pieter Noordhuis <pcnoordh...@gmail.com> wrote:
> > Hi Dean,
>
> > The percentages USR and SYS you mention belong to the Redis process
> > alone, correct? Such a big skew should be observable by means of
> > strace. The following one-liner prints system calls taking more than
> > 0.1ms. It doesn't show epoll_wait calls because its time is of no use.
>
> > $ sudo strace -T -tt -e 'trace=!epoll_wait' -p $(pgrep redis-server)
> > 2>&1 | ruby -e "while line = STDIN.gets; puts line if
> > line[/<(\d+\.\d+)>/, 1].to_f > 0.000100; end"
>
> > Note that you'll have some outliers here and there, but given the
> > amount of kernel time of the Redis process this snippet should have
> > different output (bigger outliers, higher average) when the Jedis
> > clients connect as opposed to the (normal) redis-benchmark test.
>
> > Cheers,
> > Pieter
>
Reply all
Reply to author
Forward
0 new messages