Analysis of perf sched for simple server.

199 views
Skip to first unread message

John Hening

unread,
Apr 23, 2018, 3:33:17 PM4/23/18
to mechanica...@googlegroups.com
Hello,

1. I have a simple one-threaded tcp server written in Java. I try to measure its receive-and-process-response latency. More preceisely, the client sends (by loopback) the 128-bytes message with a timestamp in the header. The server receives a message, reads a content byte by byte and compute a difference between `now` and the timestamp from the header. The difference is more or less 6μs.

Now, I am trying to make it faster.

But, firstly I would like to examine a scheduling issue. So, I've collected results with:

perf sched record -p <PID> -- sleep 10



and then:

perf sched timehist -V



Excerpt from the result is presented below: (I've filtered it for my server thread)

               
                                                                    wait time  sch delay   run time
   time                  cpu    task name
[tid/pid]                 (msec)     (msec)     (msec)
--------------- ------  ---------  ------------------------------  ---------  ---------  ---------

   
1849.531842  [0002]    s        java[7340/7338]                     0.000      0.000     56.003
   
21849.587844 [0002]    s        java[7340/7338]                     0.000      0.000     56.001
   
21849.607838 [0002]    s        java[7340/7338]                     0.000      0.000     19.994
   
21849.615836 [0002]    s        java[7340/7338]                     0.000      0.000      7.998
   
...............................................................................................
   
21849.691834 [0002]    s        java[7340/7338]                     0.000      0.000      4.000
   
21849.703837 [0001]   s         java[7340/7338]                     0.000      0.000     38.330
   
21849.711838 [0005]       s     java[7340/7338]                     0.000      0.000      0.000
   
21849.719834 [0005]       s     java[7340/7338]                     0.000      0.000      7.996



My question is:
How is it possible that wait_time is always zero? After all, it is impossible. My CPU is multicore but there is a lot of processes (threads) that needs CPU time. How to interpret that?
From the other hand I am gonna to reduce CPU migration. Perhaps it will help :-)


2. The second issue is:

The excerpt from perf sched script

            java  
7340 [002] 21848.012360:       sched:sched_wakeup: comm=java pid=7511 prio=120 target_cpu=005
            java  
7340 [002] 21848.012375:       sched:sched_wakeup: comm=java pid=7511 prio=120 target_cpu=005
            java  
7340 [002] 21848.012391:       sched:sched_wakeup: comm=java pid=7511 prio=120 target_cpu=005
            java  
7340 [002] 21848.012406:       sched:sched_wakeup: comm=java pid=7511 prio=120 target_cpu=005
           
...................................................................................................
         swapper    
0 [007] 21848.012554:       sched:sched_wakeup: comm=java pid=7377 prio=120 target_cpu=007
         swapper    
0 [007] 21848.012554:       sched:sched_wakeup: comm=java pid=7377 prio=120 target_cpu=007
           
...................................................................................................
            java  
7340 [002] 21848.012555:       sched:sched_wakeup: comm=java pid=7511 prio=120 target_cpu=005
            java  
7377 [007] 21848.012582: sched:sched_stat_runtime: comm=java pid=7377 runtime=37420 [ns] vruntime=1628300433237 [ns]
            java  
7340 [002] 21848.012585:       sched:sched_wakeup: comm=java pid=7511 prio=120 target_cpu=005
            java  
7377 [007] 21848.012587:       sched:sched_switch: prev_comm=java prev_pid=7377 prev_prio=120 prev_state=S ==> next_comm=swapper/7 next_pid=0 next_prio=120

Why my server receives sched_wakup that looks like spurious wakeups? What is a swapper?

Please explain and help :-)

Gil Tene

unread,
Apr 23, 2018, 4:41:02 PM4/23/18
to mechanical-sympathy


On Monday, April 23, 2018 at 12:33:17 PM UTC-7, John Hening wrote:
Hello,

1. I have a simple one-threaded tcp server written in Java. I try to measure its receive-and-process-response latency. More preceisely, the client sends (by loopback) the 128-bytes message with a timestamp in the header. The server receives a message, reads a content byte by byte and compute a difference between `now` and the timestamp from the header. The difference is more or less 6μs.

That "more or less" is probably a pretty a big range ;-).  As in it's probably 6usec +/- 100000% or more. As in "that occasional 6msec glitch you'll be very lucky stay below is 1000x as big as your common case".

Avi Kivity

unread,
Apr 24, 2018, 4:45:49 AM4/24/18
to mechanica...@googlegroups.com, John Hening



On 2018-04-23 22:33, John Hening wrote:
Hello,

1. I have a simple one-threaded tcp server written in Java. I try to measure its receive-and-process-response latency. More preceisely, the client sends (by loopback) the 128-bytes message with a timestamp in the header. The server receives a message, reads a content byte by byte and compute a difference between `now` and the timestamp from the header. The difference is more or less 6μs.

Now, I am trying to make it faster.

But, firstly I would like to examine a scheduling issue. So, I've collected results with:

perf sched record -p <PID> -- sleep 10

and then:

perf sched timehist -V

Excerpt from the result is presented below: (I've filtered it for my server thread)

                

                                                                    wait time  sch delay   run time
  
time                  cpu    task name [tid/pid]                 (msec)     (msec)     (msec)
--------------- ------  ---------  ------------------------------  ---------  ---------  ---------

   
1849.531842  [0002]    s        java[7340/7338]                     0.000      0.000     56.003
   
21849.587844 [0002]    s        java[7340/7338]                     0.000      0.000     56.001
   
21849.607838 [0002]    s        java[7340/7338]                     0.000      0.000     19.994
   
21849.615836 [0002]    s        java[7340/7338]                     0.000      0.000      7.998
   ...............................................................................................
   
21849.691834 [0002]    s        java[7340/7338]                     0.000      0.000      4.000
   
21849.703837 [0001]   s         java[7340/7338]                     0.000      0.000     38.330
   
21849.711838 [0005]       s     java[7340/7338]                     0.000      0.000      0.000
   
21849.719834 [0005]       s     java[7340/7338]                     0.000      0.000      7.996

My question is:
How is it possible that wait_time is always zero? After all, it is impossible. My CPU is multicore but there is a lot of processes (threads) that needs CPU time. How to interpret that?
From the other hand I am gonna to reduce CPU migration. Perhaps it will help :-)

Maybe your setup is broken in some way. But perhaps if you unfilter the results it will make more sense.




2. The second issue is:

The excerpt from perf sched script

            java  
7340 [002] 21848.012360:       sched:sched_wakeup: comm=java pid=7511 prio=120 target_cpu=005
            java  
7340 [002] 21848.012375:       sched:sched_wakeup: comm=java pid=7511 prio=120 target_cpu=005
            java  
7340 [002] 21848.012391:       sched:sched_wakeup: comm=java pid=7511 prio=120 target_cpu=005
            java  
7340 [002] 21848.012406:       sched:sched_wakeup: comm=java pid=7511 prio=120 target_cpu=005
           
...................................................................................................
         swapper    
0 [007] 21848.012554:       sched:sched_wakeup: comm=java pid=7377 prio=120 target_cpu=007
         swapper    
0 [007] 21848.012554:       sched:sched_wakeup: comm=java pid=7377 prio=120 target_cpu=007
           
...................................................................................................
            java  
7340 [002] 21848.012555:       sched:sched_wakeup: comm=java pid=7511 prio=120 target_cpu=005
            java  
7377 [007] 21848.012582: sched:sched_stat_runtime: comm=java pid=7377 runtime=37420 [ns] vruntime=1628300433237 [ns]
            java  
7340 [002] 21848.012585:       sched:sched_wakeup: comm=java pid=7511 prio=120 target_cpu=005
            java  
7377 [007] 21848.012587:       sched:sched_switch: prev_comm=java prev_pid=7377 prev_prio=120 prev_state=S ==> next_comm=swapper/7 next_pid=0 next_prio=120

Why my server receives sched_wakup that looks like spurious wakeups? What is a swapper?




When a cpu goes idle, it switches to the swapper (an idiosyncrasy). An interrupt then wakes up the cpu, runs the TCP stack, which decides to wake up your process. Those wakeups will be seen from any process which happens to run; swapper is simply the task that runs when nothing needs the cpu.

Peter Booth

unread,
Apr 25, 2018, 2:15:11 PM4/25/18
to mechanical-sympathy
I have  a couple of suggestions:

0. It costs about 29 nanos to grab a timestamp. If you are doing it on every message sent and received you might end up with your calls to clock_gettime being your bottleneck
1. Nitsan wakart wrote a nice Java ping benchmark see http://psy-lob-saw.blogspot.com/2013/07/a-java-ping-buffet.html 
which is available in git. It's worth comparing with your test and seeing if the numbers are similar. His allows you to choose between spinning, blocking, using blocking or nonblocking select

2. How many cores do you have on your host? how much traffic are you sending?

3. If you use pidstat you can see  a per second count of the number of voluntary and involuntary context switches.
Reply all
Reply to author
Forward
0 new messages