libvma tcp client using 100% cpu


Bud

Sep 9, 2014, 11:56:48 AM
to libvm...@googlegroups.com
Hi, I have an application with a TCP client that occasionally gets into a strange state where it continually polls even though no data is arriving.  I'm looking for suggestions on what may be going on.

I'm running on Ubuntu 14.04 with VMA 6.6.4 / MLNX_OFED_LINUX-2.2-1.0.1.  VMA_RX_POLL is set to 100000.

Here is the output from vma_stats:

======================================================
Fd=[73]
- TCP, Blocked
- Bound IF  = [0.0.0.0:37471]
- Connected = [x.x.x.x:9148]
Tx Offload: 1740 KB / 5560 / 0 / 0 [bytes/packets/drops/errors]
Rx Offload: 2088 KB / 11248 / 0 / 0 [bytes/packets/eagains/errors]
Rx byte: cur 0 / max 2256 / dropped 0 / limit 0
Rx pkt : cur 0 / max 3 / dropped 0
Rx poll: 336946390 / 2011 (0.00%) [miss/hit]
======================================================
Fd=[54]
- UDP, Blocked, MC Loop Enabled 
- Bound IF  = [y.y.y.y:10048]
- Member of = [y.y.y.y]
Rx Offload: 136662 KB / 2552648 / 0 / 0 [bytes/packets/eagains/errors]
Rx byte: cur 0 / max 178777 / dropped 0 / limit 4194304
Rx pkt : cur 0 / max 3230 / dropped 0
Rx poll: 136014 / 2313542 (94.45%) [miss/hit]
======================================================
Fd=[55]
- UDP, Blocked, MC Loop Enabled 
- Bound IF  = [z.z.z.z:10048]
- Member of = [z.z.z.z]
Rx Offload: 136662 KB / 2552648 / 0 / 0 [bytes/packets/eagains/errors]
Rx byte: cur 0 / max 207126 / dropped 0 / limit 4194304
Rx pkt : cur 0 / max 3779 / dropped 0
Rx poll: 138248 / 2312221 (94.36%) [miss/hit]
======================================================
poll
Polling CPU:0%
======================================================
CQ=[0]
Packets dropped:              0    
Packets queue len:            0
Drained max:                  0
Buffer pool size:            43
Buffer disorder:          49.74%
======================================================
======================================================
RING=[0]
Packets count:          5117097    
Packets bytes:        497341835    
Interrupt requests:  41291399870    
Interrupt received:      140131    
Moderation frame count:          0
Moderation usec period:          0
======================================================
-------------------------------------------------------------------------------

The Rx poll misses are continually increasing for the TCP client fd 73, with a hit rate near zero.

If I jack up the log level and trace level, I see the following:

 Time: 2391708.000 Pid: 32219 Tid: 32241 VMA FUNC+  : si_tcp[fd=73]:2442:rx_wait_helper() 73: too many polls without data blocking=1
 Time: 2391708.000 Pid: 32219 Tid: 32241 VMA FUNC+  : cqm[0x7fbcb802da10]:793:request_notification() 
 Time: 2391708.000 Pid: 32219 Tid: 32241 VMA FUNC+  : cqm[0x7fbcb802da10]:820:request_notification() returning with 0
 Time: 2391708.000 Pid: 32219 Tid: 32241 VMA FUNC   : si_tcp[fd=73]:2389:rx_wait_helper() 
 Time: 2391708.000 Pid: 32219 Tid: 32241 VMA FUNC+  : cqm[0x7fbcb802da10]:630:poll_and_process_element_rx() 
 Time: 2391708.000 Pid: 32219 Tid: 32241 VMA FUNC+  : cqm[0x7fbcb802da10]:570:poll_and_process_helper_rx() 
 Time: 2391708.000 Pid: 32219 Tid: 32241 VMA FUNC+  : cqm[0x7fbcb802da10]:279:poll() 
 Time: 2391708.000 Pid: 32219 Tid: 32241 VMA FUNC+  : si_tcp[fd=73]:2442:rx_wait_helper() 73: too many polls without data blocking=1
 Time: 2391708.000 Pid: 32219 Tid: 32241 VMA FUNC+  : cqm[0x7fbcb802da10]:793:request_notification() 
 Time: 2391708.000 Pid: 32219 Tid: 32241 VMA FUNC+  : cqm[0x7fbcb802da10]:820:request_notification() returning with 0
 Time: 2391708.000 Pid: 32219 Tid: 32241 VMA FUNC   : si_tcp[fd=73]:2389:rx_wait_helper() 
 Time: 2391708.000 Pid: 32219 Tid: 32241 VMA FUNC+  : cqm[0x7fbcb802da10]:630:poll_and_process_element_rx() 
 Time: 2391708.000 Pid: 32219 Tid: 32241 VMA FUNC+  : cqm[0x7fbcb802da10]:570:poll_and_process_helper_rx() 
 Time: 2391708.000 Pid: 32219 Tid: 32241 VMA FUNC+  : cqm[0x7fbcb802da10]:279:poll() 
 Time: 2391708.000 Pid: 32219 Tid: 32241 VMA FUNC+  : si_tcp[fd=73]:2442:rx_wait_helper() 73: too many polls without data blocking=1
 Time: 2391708.000 Pid: 32219 Tid: 32241 VMA FUNC+  : cqm[0x7fbcb802da10]:793:request_notification() 
 Time: 2391708.000 Pid: 32219 Tid: 32241 VMA FUNC+  : cqm[0x7fbcb802da10]:820:request_notification() returning with 0
 Time: 2391708.000 Pid: 32219 Tid: 32241 VMA FUNC   : si_tcp[fd=73]:2389:rx_wait_helper() 
 Time: 2391708.000 Pid: 32219 Tid: 32241 VMA FUNC+  : cqm[0x7fbcb802da10]:630:poll_and_process_element_rx() 
 Time: 2391708.000 Pid: 32219 Tid: 32241 VMA FUNC+  : cqm[0x7fbcb802da10]:570:poll_and_process_helper_rx() 
 Time: 2391708.000 Pid: 32219 Tid: 32241 VMA FUNC+  : cqm[0x7fbcb802da10]:279:poll() 

This pattern seems to repeat 10 times every 250us. However, there is another pattern that is more like this:

 Time: 2391852.250 Pid: 32219 Tid: 32241 VMA FUNC   : si_tcp[fd=73]:2389:rx_wait_helper() 
 Time: 2391852.250 Pid: 32219 Tid: 32241 VMA FUNC+  : si_tcp[fd=73]:2442:rx_wait_helper() 73: too many polls without data blocking=1
 Time: 2391852.250 Pid: 32219 Tid: 32241 VMA FUNC   : si_tcp[fd=73]:2389:rx_wait_helper() 
 Time: 2391852.250 Pid: 32219 Tid: 32241 VMA FUNC+  : si_tcp[fd=73]:2442:rx_wait_helper() 73: too many polls without data blocking=1
 Time: 2391852.250 Pid: 32219 Tid: 32241 VMA FUNC   : si_tcp[fd=73]:2389:rx_wait_helper() 
 Time: 2391852.250 Pid: 32219 Tid: 32241 VMA FUNC+  : si_tcp[fd=73]:2442:rx_wait_helper() 73: too many polls without data blocking=1
 Time: 2391852.250 Pid: 32219 Tid: 32241 VMA FUNC   : si_tcp[fd=73]:2389:rx_wait_helper() 
 Time: 2391852.250 Pid: 32219 Tid: 32241 VMA FUNC+  : si_tcp[fd=73]:2442:rx_wait_helper() 73: too many polls without data blocking=1
 Time: 2391852.250 Pid: 32219 Tid: 32241 VMA FUNC   : si_tcp[fd=73]:2389:rx_wait_helper() 
 Time: 2391852.250 Pid: 32219 Tid: 32241 VMA FUNC+  : si_tcp[fd=73]:2442:rx_wait_helper() 73: too many polls without data blocking=1
 Time: 2391852.250 Pid: 32219 Tid: 32241 VMA FUNC   : si_tcp[fd=73]:2389:rx_wait_helper() 
 Time: 2391852.250 Pid: 32219 Tid: 32241 VMA FUNC+  : si_tcp[fd=73]:2442:rx_wait_helper() 73: too many polls without data blocking=1

To me, it seems like it wants to block but for some reason it doesn't.  Anyone have any thoughts on this?

-- Bud

Alex Rosenbaum

Sep 10, 2014, 2:16:27 AM
to libvm...@googlegroups.com

From the information you provided, I see your application has 3 sockets (1 TCP + 2 UDP multicast). I am assuming you have 3 threads, each spinning on one of these blocking sockets.

Please provide more information if I am wrong about some of these assumptions.


When you set VMA_RX_POLL to 100000, each thread that calls VMA's ‘recv()’ (or similar) will spin (100% CPU) on the HW queue until it successfully receives a matching ingress packet or the polling budget is consumed (100K polling cycles with your setting). Once the polling budget (VMA_RX_POLL) is consumed, the thread arms interrupts and blocks until the next packet arrives.

In the attached vma_stats, both of your UDP sockets received about 2.5M packets (over whatever duration the test ran), so they are very busy. That means there is a high chance that the next 100,000 HW polling cycles will catch the next ingress packet and interrupts will not need to be armed (saving IRQ handling and context switches). Each such success counts as an “Rx poll hit” for that socket, and as expected both UDP sockets show an Rx poll hit ratio of ~94% in your attached vma_stats.

For that same test duration, your TCP socket received only 11K packets, which is less than 0.5% of the UDP sockets' traffic rate. So it seems that the HW polling duration is not long enough to catch the next TCP packet, which causes an “Rx poll miss” and sends that socket into blocking mode.
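(Working from the vma_stats above: 11248 TCP Rx packets vs. ~2552648 on each UDP socket is roughly 0.44%, and the TCP socket's poll hit ratio is 2011 / (2011 + 336946390), i.e. well under 0.01%, compared to ~94% for the UDP sockets.)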


I expect you’re seeing these logs as ‘patterns’ because the default VMA configuration uses a single ring per process. Your application has 3 networking threads, and while the TCP thread is blocked, the UDP threads keep spinning on that same ring. Each high-rate ingress UDP packet triggers the interrupt, and the TCP socket wakes up only to learn that nothing new arrived for it, which causes it to start its HW polling budget over again.


You can solve this by tuning VMA to use a ring per thread.

Please read more here: https://code.google.com/p/libvma/wiki/VMA_Parameters#My_application_is_multithreaded
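For example, a minimal launch sketch (assuming a bash shell and that libvma.so is on the library path; ‘./your_app’ is just a placeholder for your binary):

# allocate a separate RX ring per thread (20 = ring per thread)
LD_PRELOAD=libvma.so VMA_RING_ALLOCATION_LOGIC_RX=20 ./your_app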


Regards,

Alex

Bud

Sep 10, 2014, 10:48:07 AM
to libvm...@googlegroups.com
That makes sense.  So I have one NUMA node with 6 cores (12 with hyperthreading) and, as you said, two multicast receive threads plus one TCP socket (with two threads, Rx/Tx), and the TCP socket doesn't do a lot.  In addition, there are a few other TCP sockets handled via the OS.

From what I am reading, setting

VMA_RING_ALLOCATION_LOGIC_RX=20
VMA_RX_BUFS=150000   # networking threads (3) x 50000

should improve the situation, in that it will allow the threads to block after polling when there isn't anything to do.
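For reference, a combined launch line with these settings might look like this (just a sketch; ‘./app’ stands in for the real binary and libvma.so is assumed to be on the library path):

LD_PRELOAD=libvma.so VMA_RX_POLL=100000 VMA_RING_ALLOCATION_LOGIC_RX=20 VMA_RX_BUFS=150000 ./app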

One more question: if I run two instances of the app, would I set VMA_RX_BUFS to 300000, or do these settings apply at the process level?

Thanks for your help Alex.  I'll report back.  -- Bud

Bud

Sep 11, 2014, 3:42:08 PM
to libvm...@googlegroups.com
Alex, that seems to have helped - the TCP client thread now goes to sleep after spinning for a brief period.  Thanks!  -- Bud