Different cpu consumption by scylla threads with different linux kernels after the "nodetool drain" command


Mark Barinstein

<mark.barinstein@gmail.com>
Mar 29, 2023, 7:44:42 AM
to ScyllaDB users
Hi All,

The question is about the different CPU consumption of Scylla threads under different Linux kernels after the nodetool drain command.
All results are from a single-node system, but multi-node systems behave in nearly the same way.
The version is ScyllaDB 5.1.5 Open Source.
We suspect that different kernel functions are called depending on, at least, the OS kernel version, and that this explains the difference in behavior.
The details are below.

Questions:
Is this known behavior?
Does it work as designed?

top [-1] -H -n1 -b -p $(pidof scylla)

Linux kernels 3.x / 4.x
Ubuntu 18.04, CentOS 7/8, RHEL 8.1

Threads:  12 total,   1 running,  11 sleeping,   0 stopped,   0 zombie
%Cpu(s): 14.8 us,  9.8 sy,  0.0 ni, 73.8 id,  0.0 wa,  0.0 hi,  1.6 si,  0.0 st
KiB Mem :  3861256 total,  3041724 free,   455896 used,   363636 buff/cache
KiB Swap:  4063228 total,  4063228 free,        0 used.  3171324 avail Mem

   PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND
  8993 scylla    20   0   16.0t 201992  32912 R 93.3  5.2   1:46.35 scylla    <--
  8994 scylla    20   0   16.0t 201992  32912 S  0.0  5.2   0:01.36 reactor-1
  ...


Linux kernels 5.x
Ubuntu 20.04, CentOS 7 (with a manually installed 5.x kernel)

Threads:   8 total,   0 running,   8 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.0 us,  1.7 sy,  0.0 ni, 98.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem :  3990136 total,  3286472 free,   406180 used,   297484 buff/cache
KiB Swap:  4063228 total,  4063228 free,        0 used.  3354412 avail Mem

   PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND
  1263 scylla    20   0   16.0t 239768  65260 S  0.0  6.0   0:02.38 scylla
  1265 scylla    20   0   16.0t 239768  65260 S  0.0  6.0   0:01.70 reactor-1
  ...

 
On distros where the top -1 option is available, we see that the first 1 or 2 threads are 100% busy.
The situation is slightly different in a multi-node environment:
On the drained node the reactor-1 thread consumes 100% CPU as well (2 threads are 100% busy in this case).
On the other nodes, only the main scylla thread consumes 100% CPU.
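To pick out the busy threads without reading the top output by eye, a small parser can help. The following is only a sketch: the function name busy_threads and the 50% threshold are hypothetical choices, and it assumes the default column layout shown above (with a dot as the decimal separator and no spaces in the COMMAND column).

```python
# Sketch: list threads above a CPU threshold from `top -H -b -n1` output.
# busy_threads() and the threshold are illustrative assumptions, not ScyllaDB tooling.

OLD_KERNEL_TOP = """\
   PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND
  8993 scylla    20   0   16.0t 201992  32912 R 93.3  5.2   1:46.35 scylla
  8994 scylla    20   0   16.0t 201992  32912 S  0.0  5.2   0:01.36 reactor-1
"""

NEW_KERNEL_TOP = """\
   PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND
  1263 scylla    20   0   16.0t 239768  65260 S  0.0  6.0   0:02.38 scylla
  1265 scylla    20   0   16.0t 239768  65260 S  0.0  6.0   0:01.70 reactor-1
"""


def busy_threads(top_output, threshold=50.0):
    """Return (tid, name, cpu_percent) for threads at or above the threshold."""
    busy = []
    header = None
    for line in top_output.splitlines():
        fields = line.split()
        if fields[:2] == ["PID", "USER"]:
            header = fields  # column names of the thread table
            continue
        if header is None or len(fields) < len(header):
            continue  # summary lines, blank lines, "..." placeholders
        row = dict(zip(header, fields))
        try:
            cpu = float(row["%CPU"])
            tid = int(row["PID"])
        except ValueError:
            continue  # not a thread row
        if cpu >= threshold:
            busy.append((tid, row["COMMAND"], cpu))
    return busy


print(busy_threads(OLD_KERNEL_TOP))
print(busy_threads(NEW_KERNEL_TOP))
```

With the two samples from this post, only thread 8993 (scylla) on the 3.x / 4.x kernels crosses the threshold; the 5.x sample yields an empty list.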

strace -p $(pidof scylla) -c

Linux kernels 3.x / 4.x
Ubuntu 18.04, CentOS 7/8, RHEL 8.1

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 99.58    4.569387           3   1359826           epoll_pwait
  0.22    0.010306          20       510           write
  0.16    0.007365           6      1115           timerfd_settime
  0.02    0.000973          10        97           timer_settime
  0.01    0.000638           6        95           rt_sigreturn
  0.00    0.000046           5         8           rt_sigprocmask
------ ----------- ----------- --------- --------- ----------------
100.00    4.588715               1361651           total


Linux kernels 5.x
Ubuntu 20.04, CentOS 7 (with a manually installed 5.x kernel)

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 71.14    0.692684         507      1365           io_pgetevents
  6.17    0.060087          12      4635           timerfd_settime
  5.31    0.051699          27      1859           io_submit
  4.71    0.045870          57       794           write
  3.95    0.038451          20      1901       182 read
  3.46    0.033645          22      1510           membarrier
  2.78    0.027074           9      2886           rt_sigprocmask
  2.48    0.024196          14      1702           timer_settime
------ ----------- ----------- --------- --------- ----------------
100.00    0.973706                 16652       182 total
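The two strace -c summaries can also be compared programmatically. The sketch below parses the tables as pasted above and reports which syscall dominates by time and what share of all calls it accounts for. The helper names parse_strace_summary and dominant are hypothetical, and the parser only assumes the column layout shown in this post.

```python
# Sketch: compare `strace -c` summaries. Helper names are illustrative assumptions.

OLD_KERNEL = """\
 99.58    4.569387           3   1359826           epoll_pwait
  0.22    0.010306          20       510           write
  0.16    0.007365           6      1115           timerfd_settime
  0.02    0.000973          10        97           timer_settime
  0.01    0.000638           6        95           rt_sigreturn
  0.00    0.000046           5         8           rt_sigprocmask
100.00    4.588715               1361651           total
"""

NEW_KERNEL = """\
 71.14    0.692684         507      1365           io_pgetevents
  6.17    0.060087          12      4635           timerfd_settime
  5.31    0.051699          27      1859           io_submit
  4.71    0.045870          57       794           write
  3.95    0.038451          20      1901       182 read
  3.46    0.033645          22      1510           membarrier
  2.78    0.027074           9      2886           rt_sigprocmask
  2.48    0.024196          14      1702           timer_settime
100.00    0.973706                 16652       182 total
"""


def parse_strace_summary(text):
    """Parse `strace -c` data rows into dicts; header and total rows are skipped."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have 5 fields, or 6 when the "errors" column is filled in.
        if len(fields) not in (5, 6) or fields[-1] == "total":
            continue
        try:
            seconds = float(fields[1])
            usecs_per_call = int(fields[2])
            calls = int(fields[3])
        except ValueError:
            continue  # separator line or other non-data text
        rows.append({"syscall": fields[-1], "seconds": seconds,
                     "usecs_per_call": usecs_per_call, "calls": calls})
    return rows


def dominant(rows):
    """Return (syscall with the most time, its share of all calls)."""
    top = max(rows, key=lambda r: r["seconds"])
    total_calls = sum(r["calls"] for r in rows)
    return top["syscall"], top["calls"] / total_calls


old_rows = parse_strace_summary(OLD_KERNEL)
new_rows = parse_strace_summary(NEW_KERNEL)
for label, rows in (("3.x / 4.x", old_rows), ("5.x", new_rows)):
    name, share = dominant(rows)
    print(f"{label}: {name} dominates by time, {share:.1%} of all calls")
```

On the data above, epoll_pwait makes up roughly 99.9% of all calls on the 3.x / 4.x kernels at about 3 microseconds each, while on 5.x io_pgetevents accounts for about 8% of calls at about 507 microseconds each.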

Mark Barinstein

<mark.barinstein@gmail.com>
May 5, 2023, 4:14:38 PM
to ScyllaDB users
For those who are interested, the answer to this question has been provided here:
https://forum.scylladb.com/t/different-cpu-consumption-by-scylla-threads-with-different-linux-kernels-after-the-nodetool-drain-command

On Wednesday, March 29, 2023 at 14:44:42 UTC+3, Mark Barinstein wrote: