libvma tcp client using 100% cpu


Bud

Sep 9, 2014, 11:56:48 AM
to libvm...@googlegroups.com
Hi, I have an application with a TCP client that occasionally gets into a strange state where it continually polls even though no data is arriving.  I'm looking for suggestions on what may be going on.

I'm running on Ubuntu 14.04 with VMA 6.6.4 / MLNX_OFED_LINUX-2.2-1.0.1.  VMA_RX_POLL is set to 100000.

Here is the output from vma_stats:

======================================================
Fd=[73]
- TCP, Blocked
- Bound IF  = [0.0.0.0:37471]
- Connected = [x.x.x.x:9148]
Tx Offload: 1740 KB / 5560 / 0 / 0 [bytes/packets/drops/errors]
Rx Offload: 2088 KB / 11248 / 0 / 0 [bytes/packets/eagains/errors]
Rx byte: cur 0 / max 2256 / dropped 0 / limit 0
Rx pkt : cur 0 / max 3 / dropped 0
Rx poll: 336946390 / 2011 (0.00%) [miss/hit]
======================================================
Fd=[54]
- UDP, Blocked, MC Loop Enabled 
- Bound IF  = [y.y.y.y:10048]
- Member of = [y.y.y.y]
Rx Offload: 136662 KB / 2552648 / 0 / 0 [bytes/packets/eagains/errors]
Rx byte: cur 0 / max 178777 / dropped 0 / limit 4194304
Rx pkt : cur 0 / max 3230 / dropped 0
Rx poll: 136014 / 2313542 (94.45%) [miss/hit]
======================================================
Fd=[55]
- UDP, Blocked, MC Loop Enabled 
- Bound IF  = [z.z.z.z:10048]
- Member of = [z.z.z.z]
Rx Offload: 136662 KB / 2552648 / 0 / 0 [bytes/packets/eagains/errors]
Rx byte: cur 0 / max 207126 / dropped 0 / limit 4194304
Rx pkt : cur 0 / max 3779 / dropped 0
Rx poll: 138248 / 2312221 (94.36%) [miss/hit]
======================================================
poll
Polling CPU:0%
======================================================
CQ=[0]
Packets dropped:              0    
Packets queue len:            0
Drained max:                  0
Buffer pool size:            43
Buffer disorder:          49.74%
======================================================
======================================================
RING=[0]
Packets count:          5117097    
Packets bytes:        497341835    
Interrupt requests:  41291399870    
Interrupt received:      140131    
Moderation frame count:          0
Moderation usec period:          0
======================================================
-------------------------------------------------------------------------------

The Rx poll misses are continually increasing for the TCP client fd 73, with a hit rate near zero.

If I jack up the log level and trace level, I see the following:

 Time: 2391708.000 Pid: 32219 Tid: 32241 VMA FUNC+  : si_tcp[fd=73]:2442:rx_wait_helper() 73: too many polls without data blocking=1
 Time: 2391708.000 Pid: 32219 Tid: 32241 VMA FUNC+  : cqm[0x7fbcb802da10]:793:request_notification() 
 Time: 2391708.000 Pid: 32219 Tid: 32241 VMA FUNC+  : cqm[0x7fbcb802da10]:820:request_notification() returning with 0
 Time: 2391708.000 Pid: 32219 Tid: 32241 VMA FUNC   : si_tcp[fd=73]:2389:rx_wait_helper() 
 Time: 2391708.000 Pid: 32219 Tid: 32241 VMA FUNC+  : cqm[0x7fbcb802da10]:630:poll_and_process_element_rx() 
 Time: 2391708.000 Pid: 32219 Tid: 32241 VMA FUNC+  : cqm[0x7fbcb802da10]:570:poll_and_process_helper_rx() 
 Time: 2391708.000 Pid: 32219 Tid: 32241 VMA FUNC+  : cqm[0x7fbcb802da10]:279:poll() 
 Time: 2391708.000 Pid: 32219 Tid: 32241 VMA FUNC+  : si_tcp[fd=73]:2442:rx_wait_helper() 73: too many polls without data blocking=1
 Time: 2391708.000 Pid: 32219 Tid: 32241 VMA FUNC+  : cqm[0x7fbcb802da10]:793:request_notification() 
 Time: 2391708.000 Pid: 32219 Tid: 32241 VMA FUNC+  : cqm[0x7fbcb802da10]:820:request_notification() returning with 0
 Time: 2391708.000 Pid: 32219 Tid: 32241 VMA FUNC   : si_tcp[fd=73]:2389:rx_wait_helper() 
 Time: 2391708.000 Pid: 32219 Tid: 32241 VMA FUNC+  : cqm[0x7fbcb802da10]:630:poll_and_process_element_rx() 
 Time: 2391708.000 Pid: 32219 Tid: 32241 VMA FUNC+  : cqm[0x7fbcb802da10]:570:poll_and_process_helper_rx() 
 Time: 2391708.000 Pid: 32219 Tid: 32241 VMA FUNC+  : cqm[0x7fbcb802da10]:279:poll() 
 Time: 2391708.000 Pid: 32219 Tid: 32241 VMA FUNC+  : si_tcp[fd=73]:2442:rx_wait_helper() 73: too many polls without data blocking=1
 Time: 2391708.000 Pid: 32219 Tid: 32241 VMA FUNC+  : cqm[0x7fbcb802da10]:793:request_notification() 
 Time: 2391708.000 Pid: 32219 Tid: 32241 VMA FUNC+  : cqm[0x7fbcb802da10]:820:request_notification() returning with 0
 Time: 2391708.000 Pid: 32219 Tid: 32241 VMA FUNC   : si_tcp[fd=73]:2389:rx_wait_helper() 
 Time: 2391708.000 Pid: 32219 Tid: 32241 VMA FUNC+  : cqm[0x7fbcb802da10]:630:poll_and_process_element_rx() 
 Time: 2391708.000 Pid: 32219 Tid: 32241 VMA FUNC+  : cqm[0x7fbcb802da10]:570:poll_and_process_helper_rx() 
 Time: 2391708.000 Pid: 32219 Tid: 32241 VMA FUNC+  : cqm[0x7fbcb802da10]:279:poll() 

This pattern seems to repeat 10 times every 250us. However, there is another pattern that is more like this:

 Time: 2391852.250 Pid: 32219 Tid: 32241 VMA FUNC   : si_tcp[fd=73]:2389:rx_wait_helper() 
 Time: 2391852.250 Pid: 32219 Tid: 32241 VMA FUNC+  : si_tcp[fd=73]:2442:rx_wait_helper() 73: too many polls without data blocking=1
 Time: 2391852.250 Pid: 32219 Tid: 32241 VMA FUNC   : si_tcp[fd=73]:2389:rx_wait_helper() 
 Time: 2391852.250 Pid: 32219 Tid: 32241 VMA FUNC+  : si_tcp[fd=73]:2442:rx_wait_helper() 73: too many polls without data blocking=1
 Time: 2391852.250 Pid: 32219 Tid: 32241 VMA FUNC   : si_tcp[fd=73]:2389:rx_wait_helper() 
 Time: 2391852.250 Pid: 32219 Tid: 32241 VMA FUNC+  : si_tcp[fd=73]:2442:rx_wait_helper() 73: too many polls without data blocking=1
 Time: 2391852.250 Pid: 32219 Tid: 32241 VMA FUNC   : si_tcp[fd=73]:2389:rx_wait_helper() 
 Time: 2391852.250 Pid: 32219 Tid: 32241 VMA FUNC+  : si_tcp[fd=73]:2442:rx_wait_helper() 73: too many polls without data blocking=1
 Time: 2391852.250 Pid: 32219 Tid: 32241 VMA FUNC   : si_tcp[fd=73]:2389:rx_wait_helper() 
 Time: 2391852.250 Pid: 32219 Tid: 32241 VMA FUNC+  : si_tcp[fd=73]:2442:rx_wait_helper() 73: too many polls without data blocking=1
 Time: 2391852.250 Pid: 32219 Tid: 32241 VMA FUNC   : si_tcp[fd=73]:2389:rx_wait_helper() 
 Time: 2391852.250 Pid: 32219 Tid: 32241 VMA FUNC+  : si_tcp[fd=73]:2442:rx_wait_helper() 73: too many polls without data blocking=1

To me, it seems like it wants to block but for some reason it doesn't.  Anyone have any thoughts on this?

-- Bud

Alex Rosenbaum

Sep 10, 2014, 2:16:27 AM
to libvm...@googlegroups.com

From the information you provided, I see your application has 3 sockets (1 TCP + 2 UDP multicast). I am assuming you have 3 threads, each spinning on one of these blocking sockets.

Please provide more information if I am wrong about some of these assumptions.


When you set VMA_RX_POLL to 100000, each thread that calls VMA's ‘recv()’ (or similar) will spin (100% CPU) on the HW queue until it successfully receives a matching ingress packet or the polling budget is consumed (100K polling cycles with your setting). Once the polling budget (VMA_RX_POLL) is consumed, the thread arms interrupts and blocks until the next packet arrives.

In the attached vma_stats, both of your UDP sockets received about 2.5M packets (over whatever duration the test ran), so they are very busy. That means there is a high chance that the next 100,000 HW polling cycles will catch the next ingress packet and interrupts will not need to be armed (saving IRQ handling and context switches). Each such success counts as an “Rx poll hit” for that socket, and as expected both UDP sockets show an Rx poll hit ratio of ~94% in your attached vma_stats.

For that same test duration, your TCP socket received only 11K packets, which is less than 0.5% of the UDP sockets' traffic rate. So it seems that the HW polling duration is not long enough to catch the next TCP packet, which causes an “Rx poll miss” and sends that socket into blocking mode.
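(Working from the vma_stats above: 11248 TCP Rx packets vs. ~2552648 on each UDP socket is roughly 0.44%, and the TCP socket's poll hit ratio is 2011 / (2011 + 336946390), i.e. well under 0.01%, compared to ~94% for the UDP sockets.)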


I expect you’re seeing these logs as ‘patterns’ because the default VMA configuration uses a single ring per process. Your application has 3 networking threads, and while the TCP thread is blocked, the UDP threads keep spinning on that same ring. Each high-rate ingress UDP packet triggers the interrupt, and the TCP socket wakes up only to learn that nothing new arrived for it, which causes it to start its HW polling budget over again.


You can solve this by tuning VMA to use a ring per thread.

Please read more here: https://code.google.com/p/libvma/wiki/VMA_Parameters#My_application_is_multithreaded
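For example, a minimal launch sketch (assuming a bash shell and that libvma.so is on the library path; ‘./your_app’ is just a placeholder for your binary):

# allocate a separate RX ring per thread (20 = ring per thread)
LD_PRELOAD=libvma.so VMA_RING_ALLOCATION_LOGIC_RX=20 ./your_app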


Regards,

Alex

Bud

Sep 10, 2014, 10:48:07 AM
to libvm...@googlegroups.com
That makes sense.  So I have one NUMA node with 6 cores (12 with hyperthreading) and, as you said, two multicast receive threads plus one TCP socket (with two threads, Rx/Tx), and the TCP socket doesn't do a lot.  In addition, there are a few other TCP sockets handled via the OS.

From what I am reading, setting

VMA_RING_ALLOCATION_LOGIC_RX=20
VMA_RX_BUFS=150000   # networking threads (3) x 50000

should improve the situation, in that it will allow the threads to block after polling when there isn't anything to do.
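For reference, a combined launch line with these settings might look like this (just a sketch; ‘./app’ stands in for the real binary and libvma.so is assumed to be on the library path):

LD_PRELOAD=libvma.so VMA_RX_POLL=100000 VMA_RING_ALLOCATION_LOGIC_RX=20 VMA_RX_BUFS=150000 ./app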

One more question: if I run two instances of the app, would I set VMA_RX_BUFS to 300000, or do these settings apply at the process level?

Thanks for your help Alex.  I'll report back.  -- Bud

Bud

Sep 11, 2014, 3:42:08 PM
to libvm...@googlegroups.com
Alex, that seems to have helped - the TCP client thread now goes to sleep after spinning for a brief period.  Thanks!  -- Bud