
Little's Law and TCP work in progress


Bob McMahon

Jul 5, 2024, 1:47:11 PM
to BBR Development
Hi All,

I'm considering adding TCP byte wait times, calculated per Little's Law, to the iperf 2 send-side outputs. A review of Little's Law is here.

I'm thinking the TCP bytes in flight are a good proxy for the work in progress (WIP, or L), but I'm not certain. I've started a thread on the iperf 2 open source site with some example outputs.

I'm also thinking TCP throughput is equal enough on input and output to use Little's Law. The throughput used is the iperf 2 write rate in units of bytes per second. The socket should also have TCP_NOTSENT_LOWAT set, so the queue size should include send-side bloat.
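A minimal sketch of the proposed calculation, assuming bytes in flight (L) can be approximated from TCP_INFO as tcpi_unacked * tcpi_snd_mss and that the application tracks its own write rate; this is illustrative, not the actual iperf 2 code:

#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Little's Law byte wait time W = L / lambda, in seconds.
 * write_rate_Bps is the app-measured write rate in bytes/sec.
 * Returns -1.0 on error. */
static double ll_wait_time(int sockfd, double write_rate_Bps)
{
    struct tcp_info ti;
    socklen_t len = sizeof(ti);

    if (write_rate_Bps <= 0.0 ||
        getsockopt(sockfd, IPPROTO_TCP, TCP_INFO, &ti, &len) < 0)
        return -1.0;
    /* Approximate bytes in flight (L) as unacked packets * sender MSS */
    double inflight_bytes = (double)ti.tcpi_unacked * ti.tcpi_snd_mss;
    return inflight_bytes / write_rate_Bps;
}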

I'd appreciate expert feedback on this, e.g., is it accurate enough, and is it useful?

Thanks,
Bob

"[Little's Law] was developed by John Little in the 1960s. At its core, the law describes the relationship between throughput, cycle times, and work in progress (WIP). Specifically, Little's Law states that the average WIP is equal to the product of the throughput rate and average cycle time."


Neal Cardwell

Jul 5, 2024, 3:00:29 PM
to Bob McMahon, BBR Development
Hi Bob,

Can you please clarify what you mean by "TCP byte wait time"?

And can you please clarify the motivation for using Little's Law in this case?

If you are using (as you mention) "TCP bytes in flight" as L, the average number of bytes in the system, then I'm not yet seeing why you need Little's Law (L = λW), since the other two quantities, the long-term average effective arrival rate (λ) and the average time that a byte spends in the system (W), can in this case be directly measured. For example, I would imagine that λ can be approximated by the average tcpi_delivery_rate (or the average delivery rate measured by the receiving application reading out of the socket), and W can be approximated by the average tcpi_rtt. Does that sound plausible?
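For illustration, a minimal sketch of that direct measurement on Linux, assuming kernel headers new enough to expose tcpi_delivery_rate (4.9+); the helper name is illustrative:

#include <stdio.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

static void print_direct_measures(int sockfd)
{
    struct tcp_info ti;
    socklen_t len = sizeof(ti);

    if (getsockopt(sockfd, IPPROTO_TCP, TCP_INFO, &ti, &len) == 0) {
        /* tcpi_delivery_rate: recent goodput estimate, bytes/sec (lambda) */
        printf("delivery rate: %llu bytes/sec\n",
               (unsigned long long)ti.tcpi_delivery_rate);
        /* tcpi_rtt: smoothed RTT in microseconds (W) */
        printf("smoothed RTT: %u us\n", ti.tcpi_rtt);
    }
}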

best regards,
neal




Bob McMahon

Jul 5, 2024, 3:25:12 PM
to Neal Cardwell, BBR Development
> Can you please clarify what you mean by "TCP byte wait time"?

The naming may be poor, but it's the byte "cycle time" as computed by Little's Law. The throughput used is the write rate into a socket. The work in progress, as currently proposed, is bytes in flight.

> And can you please clarify the motivation for using Little's Law in this case?

I'm trying to get a socket application's view of the e2e byte delay, since TCP is a byte protocol, using app-level information to the extent possible. I realize bytes in flight isn't app-level information, hence my concern there. The write rate is known by the app because it's doing the writes.

I find that RTT is a transport-layer timer, and things like quick-ACK affect it. I'm trying to avoid looking too deep into the "system" to get this "black box" measurement.

Bob

Neal Cardwell

Jul 5, 2024, 3:50:38 PM
to Bob McMahon, BBR Development
On Fri, Jul 5, 2024 at 3:25 PM Bob McMahon <bob.m...@broadcom.com> wrote:
>> Can you please clarify what you mean by "TCP byte wait time"?
>
> The naming may be poor, but it's the byte "cycle time" as computed by Little's Law. The throughput used is the write rate into a socket. The work in progress, as currently proposed, is bytes in flight.

OK, thanks for clarifying.

>> And can you please clarify the motivation for using Little's Law in this case?
>
> I'm trying to get a socket application's view of the e2e byte delay, since TCP is a byte protocol, using app-level information to the extent possible. I realize bytes in flight isn't app-level information, hence my concern there. The write rate is known by the app because it's doing the writes.
>
> I find that RTT is a transport-layer timer, and things like quick-ACK affect it. I'm trying to avoid looking too deep into the "system" to get this "black box" measurement.

If you are using "TCP bytes in flight" as the quantity of items in the system (L), and you are using the Linux TCP definition of "TCP bytes in flight", then the average time in the system (W) is (by this definition) the time "in flight": the time elapsed from a TCP data sender transmitting a data byte until that data sender receives a TCP ACK for that byte. Note that "the time elapsed from a TCP data sender transmitting a data byte until that data sender receives a TCP ACK for that byte" is also the definition of RTT in Linux TCP. So your W is basically, by definition, equal to the average TCP RTT (which should be close to the average tcpi_rtt).

Indeed, RTT is affected by quick-ACKs and delayed ACKs, but so is "TCP bytes in flight", because "in flight" means "transmitted but not yet ACKed".

best regards,
neal

Bob McMahon

Jul 5, 2024, 5:27:12 PM
to Neal Cardwell, BBR Development
> If you are using "TCP bytes in flight" as the quantity of items in the system (L), and you are using the Linux TCP definition of "TCP bytes in flight", then the average time in the system (W) is (by this definition) the time "in flight": the time elapsed from a TCP data sender transmitting a data byte until that data sender receives a TCP ACK for that byte. Note that "the time elapsed from a TCP data sender transmitting a data byte until that data sender receives a TCP ACK for that byte" is also the definition of RTT in Linux TCP. So your W is basically, by definition, equal to the average TCP RTT (which should be close to the average tcpi_rtt).

Hmm, using a socket fq rate limiter (SO_MAX_PACING_RATE) reveals a very different RTT than the LL-calculated average time, per the measurement below. Does this contradict the RTT being the same as W? (A sketch of setting this pacing cap follows the output below.)

rjmcmahon@fedora:~/Code/pyflows/iperf2-code$ src/iperf -c 192.168.1.77 -i 1 -e --tcp-cca cubic --tcp-write-prefetch 256K  -w 4M --fq-rate 100m
------------------------------------------------------------
Client connecting to 192.168.1.77, TCP port 5001 with pid 35066 (1/0 flows/load)
Write buffer size: 131072 Byte
fair-queue socket pacing set to  100 Mbit/s
TCP congestion control set to cubic using cubic
TOS defaults to 0x0 (dscp=0,ecn=0) (Nagle on)
TCP window size: 8.00 MByte (WARNING: requested 4.00 MByte)
Event based writes (pending queue watermark at 262144 bytes)
------------------------------------------------------------
[  1] local 192.168.1.103%enp4s0 port 60198 connected with 192.168.1.77 port 5001 (prefetch=262144) (cubic) (icwnd/mss/irtt=14/1448/319) (ct=0.37 ms) on 2024-06-27 12:07:24.074 (PDT)
[ ID] Interval        Transfer    Bandwidth       Write/Err  Rtry     InF(pkts)/Cwnd(pkts)/RTT(var)  fq-rate  Wait(ms)  NetPwr
[  1] 0.00-1.00 sec  12.3 MBytes   103 Mbits/sec  99/0         0      127K(90)/149K(106)/365(18) us  100 Mbit/sec   10.124 ms 35192
[  1] 1.00-2.00 sec  11.9 MBytes  99.6 Mbits/sec  95/0         0      127K(90)/149K(106)/375(15) us  100 Mbit/sec   10.444 ms 33205
[  1] 2.00-3.00 sec  11.9 MBytes  99.6 Mbits/sec  95/0         0      127K(90)/149K(106)/369(18) us  100 Mbit/sec   10.444 ms 33745
[  1] 3.00-4.00 sec  12.0 MBytes   101 Mbits/sec  96/0         0      127K(90)/149K(106)/377(11) us  100 Mbit/sec   10.335 ms 33376
[  1] 4.00-5.00 sec  11.9 MBytes  99.6 Mbits/sec  95/0         0      127K(90)/149K(106)/369(15) us  100 Mbit/sec   10.444 ms 33745
[  1] 5.00-6.00 sec  12.0 MBytes   101 Mbits/sec  96/0         0      127K(90)/149K(106)/373(12) us  100 Mbit/sec   10.335 ms 33734
[  1] 6.00-7.00 sec  11.9 MBytes  99.6 Mbits/sec  95/0         0      127K(90)/149K(106)/373(22) us  100 Mbit/sec   10.444 ms 33383
[  1] 7.00-8.00 sec  11.9 MBytes  99.6 Mbits/sec  95/0         0      127K(90)/149K(106)/368(17) us  100 Mbit/sec   10.444 ms 33837
[  1] 8.00-9.00 sec  11.9 MBytes  99.6 Mbits/sec  95/0         0      127K(90)/149K(106)/379(19) us  100 Mbit/sec   10.444 ms 32854
[  1] 9.00-10.00 sec  12.0 MBytes   101 Mbits/sec  96/0         0      127K(90)/149K(106)/380(12) us  100 Mbit/sec   10.335 ms 33113
[  1] 0.00-10.04 sec   120 MBytes   100 Mbits/sec  958/0         0      149K/345(20) us 36225
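
For reference, a minimal sketch of how such a socket pacing cap can be set on Linux (assuming the fq qdisc is installed on the egress interface; the helper name is illustrative):

#include <stdint.h>
#include <sys/socket.h>

/* Cap the socket's pacing rate; SO_MAX_PACING_RATE takes bytes/sec.
 * 100 Mbit/s => 12500000 bytes/sec. Returns 0 on success. */
static int set_pacing_rate(int sockfd, uint64_t bits_per_sec)
{
    uint32_t rate_Bps = (uint32_t)(bits_per_sec / 8);
    return setsockopt(sockfd, SOL_SOCKET, SO_MAX_PACING_RATE,
                      &rate_Bps, sizeof(rate_Bps));
}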

Bob

Bob McMahon

Jul 5, 2024, 6:14:21 PM
to Neal Cardwell, BBR Development
Or maybe this is an artifact of the write size, and InF is somehow not 127K (which is one write-size unit)?

Bob

Neal Cardwell

Jul 5, 2024, 8:55:45 PM
to Bob McMahon, BBR Development
Thanks, Bob.

In that example, the bottleneck is the pacing mechanism itself, and the inter-skb pacing delay is much longer than the RTT. In such cases, the simplified model I mentioned, where "the time elapsed from a TCP data sender transmitting a data byte until that data sender receives a TCP ACK for that byte" is also the definition of RTT, is not detailed enough to reason through what's going on.

Since Eric Dumazet switched the Linux pacing model to the EDT (earliest departure time) model in 2018, the RTT no longer includes time spent in the fq pacing layer.

So it looks like the difference between the "RTT" and "Wait" in your example is caused by the following:

+ The "RTT" is measuring the time between the EDT pacing release time and the time of the ACK.
+ The "Wait" is measuring the time between TCP transmission and the time of the ACK. This includes time queued in the fq pacing layer waiting to be released to the NIC.

Because the pacing rate is so low here (100Mbit/sec), the skbs in the fq pacing layer spend a lot of time queuing, waiting for their turn to be released. So the "Wait" time (10ms) is considerably longer than the RTT (370us).
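
A back-of-the-envelope check against the run above: draining the reported InF of 127K (130048 bytes) through a 100 Mbit/sec pacer takes 130048 * 8 / 100e6 ≈ 10.4 ms, which lines up with the reported Wait of ~10.3-10.4 ms, while the ~370 us RTT reflects only the post-pacer network path.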

thanks,
neal
