Why are latency values so erratic over enp6s0f3 interface in smallLAN profile?

Gautam Thaker

Nov 2, 2023, 10:21:42 PM
to emulab-users
Hi:

I have recently been running TCP round-trip latency measurements between two d430s using a large number of samples (either 1E6 or 1E7), so the mean latencies shown in the attached graphs should be statistically significant.
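For readers who want to reproduce this kind of measurement, here is a minimal sketch of a TCP ping-pong round-trip test. It is not the tool used to produce the attached graphs; the function names, the loopback demo, and the choice to set TCP_NODELAY are all assumptions made for illustration.

```python
import socket
import threading
import time


def recv_exact(sock, n):
    """Read exactly n bytes; TCP may deliver one message in several pieces."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf += chunk
    return bytes(buf)


def echo_server(listener, msg_size, samples):
    """Accept one client and echo back `samples` messages of `msg_size` bytes."""
    conn, _ = listener.accept()
    with conn:
        conn.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        for _ in range(samples):
            conn.sendall(recv_exact(conn, msg_size))


def measure_rtt(host, port, msg_size, samples):
    """Return a list of round-trip times in microseconds."""
    payload = b"x" * msg_size
    rtts = []
    with socket.create_connection((host, port)) as s:
        s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        for _ in range(samples):
            t0 = time.perf_counter_ns()
            s.sendall(payload)
            recv_exact(s, msg_size)
            rtts.append((time.perf_counter_ns() - t0) / 1000.0)
    return rtts


def loopback_demo(msg_size=2048, samples=1000):
    """Run server and client in one process over 127.0.0.1 (demo only)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    server = threading.Thread(
        target=echo_server, args=(listener, msg_size, samples), daemon=True
    )
    server.start()
    rtts = measure_rtt("127.0.0.1", port, msg_size, samples)
    server.join()
    listener.close()
    return rtts
```

On two real nodes, `echo_server` would run on one machine and `measure_rtt` on the other, bound to the address of the experiment interface. A Python loop adds its own jitter, so a tool like this is only good for spotting gross effects, not for fine-grained tail analysis.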

I swapped in a 2-node 'smallLAN' profile experiment (it is still active) and measured latencies over the 'eno1' and 'enp6s0f3' interfaces. Why do I see such erratic mean values when the 'enp6s0f3' interface is used? Could this be because some type of software-defined network is in play, rather than hardware interfaces with a good-quality switch?

I have data from this same test going back to 2004 (PC850) and 2008 (PC3000). Networks were slower back then, but the results were not so erratic. (1E6 samples for the PC850 results, but 'only' 1E5 samples for the PC3000 results.)

Gautam

emulab_pc850_and_pc3000.png
emulab_d430_en01_vs_enp6s0f3.png

Leigh Stoller

Nov 3, 2023, 8:45:48 AM
to emulab...@googlegroups.com

> I swapped in (and is still active) a 2 node 'smallLAN' profile experiment. I measured latencies using 'eno1' and 'enp6s0f3' interfaces. Why do I see such erratic mean values when 'enp6s0f3' interface is used? Could this be due to some type of software defined network, rather than HW interfaces with good quality switch, being in play?

Hi. We would need to see the status page of your running experiment.
Please send the link.

Leigh


Gautam Thaker

Nov 3, 2023, 8:57:07 AM
to emulab-users

Mike Hibler

Nov 3, 2023, 9:44:42 AM
to emulab...@googlegroups.com
The only thing we can say with certainty is that there is no software-defined
anything involved. Those nodes have Intel X710 1Gb NICs. And if the current
experiment is the one on which you gathered those measurements, both nodes
are attached to the same Dell S3048 switch, which is not showing anything
unusual in terms of statistics on the ports in question (i.e., no errors,
overruns, etc.). Note that eno1 is the shared control network, which is on
a dumb Dell N2048 switch (and we would prefer you not use that for
high-volume experiment-related traffic).

One other observation is that the really big outlier is at 2k, which is the
first point at which you cross over the 1500-byte MTU and would require
two Ethernet packets per message.
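The MTU arithmetic can be sketched as follows. The header sizes are assumptions: IPv4 without options (20 bytes), a basic TCP header (20 bytes), and 12 bytes of TCP options for timestamps, which is a common Linux default. The message sizes in the loop are illustrative, not taken from the test.

```python
def tcp_payload_per_packet(mtu=1500, ip_hdr=20, tcp_hdr=20, tcp_opts=12):
    """Bytes of application data that fit in one full-size packet (assumed headers)."""
    return mtu - ip_hdr - tcp_hdr - tcp_opts


def tcp_segments(msg_size, **kwargs):
    """Number of on-the-wire packets needed to carry one message of msg_size bytes."""
    payload = tcp_payload_per_packet(**kwargs)
    return -(-msg_size // payload)  # ceiling division


for size in (512, 1024, 2048, 4096):
    print(size, "bytes ->", tcp_segments(size), "packet(s)")
```

Under these assumptions a 1024-byte message fits in one packet, while a 2048-byte message is the first size in a power-of-two sweep that needs two.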

Ultimately you will probably have to look at the TCP streams with Wireshark
or something to see if there is something strange going on. I don't see
anything obvious at the Ethernet level.

Gautam Thaker

Nov 3, 2023, 10:58:16 AM
to emulab-users
Thanks, Mike, for the detailed response. Yes, I prefer/need to test only on enp6s0f3, not eno1, and tested on eno1 only to confirm that more normal behavior is possible.

I have been making these measurements for some 30 years now, typically with an eye towards examining the tail end of latency distributions, and I often see strange behavior such as this at the 1-to-2 MTU crossover. It is typically some issue in the system software, such as a driver or a software-defined network switch (not in play here, as you say).

I will have to try to examine this using Wireshark, as you mention. I first saw this with the OpenStack profile and thought it was specific to the complex software in that profile. However, I was seeing it when measuring between the allocated experiment nodes, not between two VMs spun up using OpenStack, where "complex software" could perhaps be a culprit. So I swapped in the "smallLAN" profile and ran this experiment to confirm.

Gautam

Gautam Thaker

Nov 15, 2023, 7:48:22 PM
to emulab-users
@Mike Hibler (or anyone else):

Could my underlying issue have something to do with "interrupt throttling"? I know there used to be such a thing, at least with the Intel e1000 driver(s), though I am unsure whether it is even in play here. For the record, and for possible comment, I post 3 results here:

2023 - Emulab d430s over eno1 interface     => All results are reasonable
2023 - Emulab d430s over enp6s0f3 interface => Erratic, unreasonable behavior
2004 - Emulab PC850s                        => All results are reasonable

These charts are essentially histograms of latencies for each message size, drawn in vertical orientation to give a quick visual summary of the entire experiment.

I know Wireshark is my friend here, but before I start down that path I was hoping there is something in the driver (or kernel) configuration that is at play here. The needed packet capture may produce some large tcpdump files, so I would do it for just one message size and far fewer samples, hoping to catch a sequence in which a large latency is observed.

Gautam
2004_fedora_ubersock_unloaded_PCE850_linux_2.6.8-1.521_1E6.out.png
2023-11-01__10h_04m__MDT_5.15.0-86-generic_smallLAN_2_nodes_d430_enp6s0f3_1E7.ubout.png
2023-11-01__19h_31m__MDT_5.15.0-86-generic_smallLAN_2_nodes_d430_eno1_intfc_1E6.ubout.png

Gautam Thaker

Nov 16, 2023, 2:41:47 PM
to emulab-users
After some more digging around I found this note of mine from 2007:

"after considerable investigations it was determined that the problem with these
tests were related to interrupt throttling on Intel Pro/1000 ethernet drivers.
The interrupt throttle feature can be turned off by adding the following
entry in /etc/modprobe.conf file:

options e1000 InterruptThrottleRate=0,0,0,0,0,0 TxIntDelay=0,0,0,0,0,0 TxAbsIntDelay=0,0,0,0,0,0 RxAbsIntDelay=0,0,0,0,0,0"

Would anything like this be possible on the NICs that are in the d710 (or d430)? I assume I can go back to PC3000s and this setting should still do what I intend there, which is to remove the erratic round-trip latencies.
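On current Linux kernels, interrupt moderation for many NIC drivers is adjusted per interface with `ethtool -C` rather than with e1000-style module options. The sketch below only builds and prints the commands, it does not run them. It assumes the interface name enp6s0f3 and assumes the driver accepts these coalescing parameters, which not every driver does; the helper names are hypothetical.

```python
import shlex


def show_coalesce_cmd(iface):
    """Command that prints the current interrupt coalescing settings."""
    return ["ethtool", "-c", iface]


def disable_coalesce_cmd(iface):
    """Command that asks the driver to turn off adaptive moderation and delays.

    Assumption: the driver supports all four parameters. Some drivers reject
    some of them, in which case ethtool reports an error and changes nothing.
    """
    return [
        "ethtool", "-C", iface,
        "adaptive-rx", "off",
        "adaptive-tx", "off",
        "rx-usecs", "0",
        "tx-usecs", "0",
    ]


# Print the commands so they can be reviewed and run by hand as root.
for cmd in (show_coalesce_cmd("enp6s0f3"), disable_coalesce_cmd("enp6s0f3")):
    print(shlex.join(cmd))
```

Settings made this way do not survive a reboot or a driver reload, so they would need to be reapplied before each measurement run.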

Also, in the old days some PC3000s had a cross-over network connection, and thus no switch in the middle. Is that feature gone from Emulab across all node types?

Gautam

Gautam Thaker

Nov 17, 2023, 10:27:26 AM
to emulab-users
Does anyone know which file I should add this line to?

"options e1000 InterruptThrottleRate=0,0,0,0,0,0 TxIntDelay=0,0,0,0,0,0 TxAbsIntDelay=0,0,0,0,0,0 RxAbsIntDelay=0,0,0,0,0,0"

It looks like the old location, /etc/modprobe.conf, is no longer used, and I need to put the line in a file under the /etc/modprobe.d/ directory.

Does it matter what I name the file containing the single line above? I created e1000.conf with this line in it, but dmesg gives no indication that it was ever read.
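For files under /etc/modprobe.d/, the name does not matter as long as it ends in .conf. An "options e1000 ..." line, however, is applied only when the e1000 module itself is loaded, so it has no effect if the interface is bound to a different driver. A minimal sketch for checking which driver an interface uses, reading the standard sysfs symlink (the function name and the `sysfs_root` parameter are hypothetical, added for illustration and testing):

```python
import os


def nic_driver(iface, sysfs_root="/sys/class/net"):
    """Return the kernel driver name bound to iface, or None if unknown.

    On Linux, /sys/class/net/<iface>/device/driver is a symlink whose last
    path component is the driver name.
    """
    link = os.path.join(sysfs_root, iface, "device", "driver")
    try:
        return os.path.basename(os.readlink(link))
    except OSError:
        # Interface missing, virtual device with no driver link, or not Linux.
        return None


for name in ("eno1", "enp6s0f3"):
    print(name, "->", nic_driver(name))
```

If the reported driver is not e1000, the options line needs to name that driver instead, and the parameter names would have to be ones that driver actually supports.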

GHT