TCP Delayed ACKs

patacongo

Dec 8, 2019, 3:49:38 PM
to NuttX
I just added some support for TCP delayed ACKs to the NuttX TCP stack (part of RFC 1122).  Here is some description from RFC 1122:

4.2.3.2  When to Send an ACK Segment

            A host that is receiving a stream of TCP data segments can
            increase efficiency in both the Internet and the hosts by
            sending fewer than one ACK (acknowledgment) segment per data
            segment received; this is known as a "delayed ACK" [TCP:5].

            A TCP SHOULD implement a delayed ACK, but an ACK should not
            be excessively delayed; in particular, the delay MUST be
            less than 0.5 seconds, and in a stream of full-sized
            segments there SHOULD be an ACK for at least every second
            segment.

            DISCUSSION:

A delayed ACK gives the application an opportunity to update the window and perhaps to send an immediate response. In particular, in the case of character-mode remote login, a delayed ACK can reduce the number of segments sent by the server by a factor of 3 (ACK, window update, and echo character all combined in one segment). In addition, on some large multi-user hosts, a delayed ACK can substantially reduce protocol processing overhead by reducing the total number of packets to be processed [TCP:5]. However, excessive delays on ACK's can disturb the round-trip timing and packet "clocking" algorithms [TCP:7].
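
To make the two limits in that text concrete, here is a minimal Python sketch of a receiver-side ACK scheduler. It is an illustration only and not the NuttX implementation: the class name `DelayedAck`, its methods, and the 200 ms default delay are all assumptions; only the "every second segment" and "less than 0.5 seconds" limits come from the RFC text above.

```python
ACK_DELAY_LIMIT_MS = 500  # RFC 1122: the delay MUST be less than 0.5 seconds


class DelayedAck:
    """Hypothetical receiver-side ACK scheduler (illustration only)."""

    def __init__(self, timeout_ms=200):  # 200 ms is an assumed value
        if not 0 < timeout_ms < ACK_DELAY_LIMIT_MS:
            raise ValueError("ACK delay must be below 500 ms")
        self.timeout_ms = timeout_ms
        self.pending = 0    # segments received but not yet ACKed
        self.waited_ms = 0  # time the oldest pending segment has waited

    def on_segment(self):
        """Call for each received data segment; True means send an ACK now."""
        self.pending += 1
        if self.pending >= 2:  # ACK at least every second segment
            return self._ack()
        return False

    def on_timer(self, elapsed_ms):
        """Call periodically; True means the delay expired, send an ACK now."""
        if self.pending == 0:
            return False
        self.waited_ms += elapsed_ms
        if self.waited_ms >= self.timeout_ms:
            return self._ack()
        return False

    def _ack(self):
        # Sending an ACK covers everything received so far.
        self.pending = 0
        self.waited_ms = 0
        return True
```

In a stream, every second call to `on_segment()` returns True; a lone segment is ACKed by `on_timer()` once the assumed delay has elapsed.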

So, bottom line, it should provide better TCP RX performance. I am having problems testing this properly right now (due to some problems in the little networking test bed). If anyone has a good TCP test bed in place, I would love to get your feedback.

@Xiang You did some testing with iperf3. Did you learn anything interesting about network performance? If you still have the iperf3 setup, I would love to verify that it is still functional with the delayed ACKs.

Delayed ACK support has to be enabled in the TCP configuration.

Gregory Nutt

Dec 8, 2019, 4:46:41 PM
to nu...@googlegroups.com
On 12/8/2019 2:49 PM, patacongo wrote:
> I just added some support for TCP delayed ACKs to the NuttX TCP stack (part of RFC 1122).  Here is some description from RFC 1122:
>
> 4.2.3.2  When to Send an ACK Segment
>
>             ...
>             A TCP SHOULD implement a delayed ACK, but an ACK should not
>             be excessively delayed; in particular, the delay MUST be
>             less than 0.5 seconds, ...

There is a problem with meeting this requirement.  The current TCP timers are in units of 0.5 sec, so we cannot represent a delay of less than 0.5 seconds with that resolution.  And, worse, most network drivers only drive the TCP timing at intervals of around 1 sec.  That might make the feature less usable.
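
The arithmetic behind that concern can be sketched in a few lines of Python. This is only an illustration: the function names and the 200 ms requested delay are assumptions, while the 0.5 sec timer unit and the roughly 1 sec driver interval are the figures mentioned above.

```python
import math

RFC1122_LIMIT_MS = 500  # the ACK delay MUST be less than this


def shortest_delay_ms(requested_ms, tick_ms):
    """Shortest non-zero delay >= requested_ms that a tick-based timer can count."""
    ticks = max(1, math.ceil(requested_ms / tick_ms))
    return ticks * tick_ms


def meets_rfc1122(delay_ms):
    """True if the delay satisfies the 'less than 0.5 seconds' requirement."""
    return delay_ms < RFC1122_LIMIT_MS


# One tick of a 0.5 sec timer is already too long:
#   shortest_delay_ms(200, 500) gives 500, and meets_rfc1122(500) is False.
# A finer timer (an assumed 100 ms tick) could meet the limit:
#   shortest_delay_ms(200, 100) gives 200, and meets_rfc1122(200) is True.
```

With a driver that only polls about once per second, the same calculation yields 1000 ms, twice the allowed maximum.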


patacongo

Dec 8, 2019, 6:19:28 PM
to NuttX

> ...  I am having problems testing this properly right now (due to some problems in the little networking test bed).  ...

I finally came up with a good test.  I verified the behavior of the delayed ACK implementation using the apps/examples/tcpblaster configuration and Wireshark.  It works as intended.

I was able to compare the NuttX delayed ACK behavior with the Windows delayed ACK behavior.  NuttX waits and ACKs on every second packet (segment).  Windows behaves differently:  it ACKs after every six packets.

I think this might be due to the MTU mismatch between the target and the host.  The target was set up for a 590 byte MTU and Windows was set up for the standard 1500 byte MTU.  6 x 590 byte packets is close to the size of 2 x 1500 byte packets.

I don't know which behavior is correct.  The spec says every second segment which, to me, means two packets of any size.

Johnny Billquist

Dec 8, 2019, 7:02:48 PM
to nu...@googlegroups.com, patacongo
No, the RFC actually says two *full sized* packets.

To quote your quote:

"A TCP SHOULD implement a delayed ACK, but an ACK should not
be excessively delayed; in particular, the delay MUST be
less than 0.5 seconds, and in a stream of full-sized
segments there SHOULD be an ACK for at least every second
segment."

Note the "in a stream of full-sized segments, there SHOULD be an ACK for
at least every second segment".

Johnny

--
Johnny Billquist || "I'm on a bus
|| on a psychedelic trip
email: b...@softjar.se || Reading murder books
pdp is alive! || tryin' to stay hip" - B. Idol

Gregory Nutt

Dec 8, 2019, 7:06:25 PM
to Johnny Billquist, nu...@googlegroups.com
Depends on what a "full-sized" packet means. I don't think it is well
defined.  1500 bytes is one common MTU size, but the MTU size is
optional.  Is it the negotiated connection MTU? Or the preferred MTU of
the recipient?

Johnny Billquist

Dec 8, 2019, 7:08:16 PM
to nu...@googlegroups.com, Gregory Nutt
Well, the RFC actually doesn't talk about packets or MTU here. It talks
about segments. And the maximum segment size for each end is exchanged
as a part of the SYN packets at connection establishment.

Johnny

Johnny Billquist

Dec 8, 2019, 7:19:43 PM
to nu...@googlegroups.com, Gregory Nutt
By the way, the MTU is not negotiated, nor is it optional.
The MTU is basically just the maximum packet length accepted by the
interface.

At the TCP level, there is something called the Path MTU, but it's a
rather shaky thing that doesn't work that well, and is something each side
tries to figure out without the involvement of the other side, and it
can change dynamically.

And I spoke too hastily. MSS is actually negotiated. Each side sends
its MSS, but both sides have to use the smaller of the two sides'
suggestions. That said, it can even be clamped more. SYN packets should
contain the MSS option, but in case they don't there are also defined
defaults that should be used.

Johnny

On 2019-12-09 01:06, Gregory Nutt wrote:

Gregory Nutt

Dec 8, 2019, 7:25:52 PM
to Johnny Billquist, nu...@googlegroups.com

> Well, the RFC actually doesn't talk about packets or MTU here. It talks
> about segments. And the maximum segment size for each end is exchanged
> as a part of the SYN packets at connection establishment.

The MSS is closely related to the MTU and, I would think, it would be
the smaller of the two MSSs, which should be close to the 590 byte
maximum packet size (which is not the MTU, but the MTU plus the Ethernet
header).  So I still don't understand the Windows behavior. The target
sends a 590 byte packet.  The MSS should be around 520-ish bytes.  So
you would still think that boils down to one ACK per two full-size (590
byte) packets, not six.


Johnny Billquist

Dec 8, 2019, 7:38:46 PM
to Gregory Nutt, nu...@googlegroups.com
I think you are right in that MSS should be 536 (probably), and that is
then the full sized segment.
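
For reference, the 536 figure follows from the 590 byte packet size by subtracting the headers. A small Python sketch of that arithmetic, assuming a plain 14 byte Ethernet header (no VLAN tag) and IPv4 and TCP headers without options; the function names are made up for illustration:

```python
ETHERNET_HEADER = 14  # destination MAC, source MAC, EtherType (no VLAN tag)
IPV4_HEADER = 20      # without IP options
TCP_HEADER = 20       # without TCP options


def mtu_from_frame(frame_len):
    """IP MTU for a given maximum Ethernet frame length (header included)."""
    return frame_len - ETHERNET_HEADER


def mss_from_mtu(mtu):
    """Largest TCP payload that fits in one IPv4 packet of the given MTU."""
    return mtu - IPV4_HEADER - TCP_HEADER


# 590 byte frame -> 576 byte MTU -> 536 byte MSS
# 1500 byte MTU  -> 1460 byte MSS
```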

However, I'm not at all surprised if Windows bends the rules. And I
should warn you that Linux does too. I haven't found any newer RFC (that
I can remember) that redefines this rule, but Linux can suddenly drop
down to only sending an ACK after something like 10-15 packets, even
when MSS is 1460.

And I've seen Windows sometimes do very broken things in many places...

So you might just have hit a place where they are trying to be clever,
and are breaking the RFC.

Johnny Billquist

Dec 8, 2019, 7:41:49 PM
to Gregory Nutt, nu...@googlegroups.com
On 2019-12-09 01:38, Johnny Billquist wrote:
> On 2019-12-09 01:25, Gregory Nutt wrote:
>>
>>> Well, the RFC actually doesn't talk about packets or MTU here. It talks
>>> about segments. And the maximum segment size for each end is
>>> exchanged as a part of the SYN packets at connection establishment.
>>
>> The MSS is closely related to the MTU and, I would think, it would be
>> the smaller of the two MSSs, which should be close to the 590 byte
>> maximum packet size (which is not the MTU, but the MTU plus the
>> Ethernet header).  So I still don't understand the Windows behavior.
>> The target sends a 590 byte packet.  The MSS should be around 520-ish
>> bytes.  So you would still think that boils down to one ACK per two
>> full-size (590 byte) packets, not six.
>
> I think you are right in that MSS should be 536 (probably), and that is
> then the full sized segment.
>
> However, I'm not at all surprised if Windows bends the rules. And I
> should warn you that Linux does too. I haven't found any newer RFC (that
> I can remember) that redefines this rule, but Linux can suddenly drop
> down to only sending an ACK after something like 10-15 packets, even
> when MSS is 1460.
>
> And I've seen Windows sometimes do very broken things in many places...
>
> So you might just have hit a place where they are trying to be clever,
> and are breaking the RFC.

By the way - sorry for the noise. I had initially forgotten that MSS is
actually supposed to be the smaller of the two sides' opinions on MSS.

When I did my TCP/IP, I originally had that, but somewhere along the
way, I changed it to keep different MSS for receive and transmit, and I
no longer remember why I did the change. But looking at the RFC, it is
very clear that they should both pick the smaller MSS.

So I'll go back to being quiet on this. :-)

patacongo

Dec 8, 2019, 8:20:39 PM
to nu...@googlegroups.com

> However, I'm not at all surprised if Windows bends the rules. And I
> should warn you that Linux does too. I haven't found any newer RFC (that
> I can remember) that redefines this rule, but Linux can suddenly drop
> down to only sending an ACK after something like 10-15 packets, even
> when MSS is 1460.
>
> And I've seen Windows sometimes do very broken things in many places...
>
> So you might just have hit a place where they are trying to be clever,
> and are breaking the RFC.

It would not surprise me if there were some other RFCs that define a more adaptive way to do delayed ACKs.  RFC 1122 is very old and there has been a lot of work on network congestion since then.

There are 10 or so RFCs that update RFC 1122.  I skimmed through all of them, just searching for "delayed ACK", and found nothing.  But I might have missed something.

Johnny Billquist

Dec 8, 2019, 8:26:23 PM
to nu...@googlegroups.com, patacongo
On 2019-12-09 02:20, patacongo wrote:
>
> However, I'm not at all surprised if Windows bends the rules. And I
> should warn you that Linux does too. I haven't found any newer RFC
> (that
> I can remember) that redefines this rule, but Linux can suddenly drop
> down to only sending an ACK after something like 10-15 packets, even
> when MSS is 1460.
>
> And I've seen Windows sometimes do very broken things in many places...
>
> So you might just have hit a place where they are trying to be clever,
> and are breaking the RFC.
>
> It would not surprise me if there were some other RFCs that define a
> more adaptive way to do delayed ACKs.  RFC 1122 is very old and there
> has been a lot of work on network congestion since then.
>
> There are 10 or so RFCs that update RFC 1122.  I skimmed through all of
> them, just searching for "delayed ACK", and found nothing.  But I might
> have missed something.

I have been wondering the same in the past, but I have also failed to
find any update to that part. But it is obvious that some systems do not
follow RFC 1122 in this aspect, so one could suspect there is some update.
If you find something, I'd be very keen on hearing about it.

Anyway, segment size is, I believe, well defined, and delayed ACK also.

One more thing, while I'm at it. I would say most systems today do not
actually seem to use delayed ACK in the normal situation. It's only when
you start pushing lots of data that you'll see the rate of ACKs go down.
Otherwise, normally, I've pretty much observed other systems always
immediately send an ACK when a packet is received, no matter what I do with
the packets on the sending side. But that is just an observation, and if
so, systems are also not forbidden from doing so.

Gregory Nutt

Dec 8, 2019, 8:33:34 PM
to Johnny Billquist, nu...@googlegroups.com

> One more thing, while I'm at it. I would say most systems today do not
> actually seem to use delayed ACK in the normal situation. It's only
> when you start pushing lots of data that you'll see the rate of ACKs
> go down. Otherwise, normally, I've pretty much observed other systems
> always immediately send an ACK when a packet is received, no matter what
> I do with the packets on the sending side. But that is just an
> observation, and if so, systems are also not forbidden from doing so.

Hmmm..  I bet that is where the "full-size" packet comes in.  If we are
streaming a lot of data then we will be sending full packets (i.e.,
payload == MSS).  If the packet is not "full-size" then we are not
streaming (or at least we are falling behind) and it does make sense to
send the ACK immediately.  I think I should make that change tomorrow.
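
One possible reading of that change, sketched in Python purely as an illustration (this is not the actual NuttX code; the function name and parameters are hypothetical):

```python
SEGMENTS_PER_ACK = 2  # RFC 1122: ACK at least every second full-sized segment


def should_ack_now(payload_len, full_size, pending_full_segments):
    """Return True if the segment just received should be ACKed immediately.

    pending_full_segments counts full-sized segments that arrived earlier
    and have not been ACKed yet.
    """
    if payload_len < full_size:
        # A short segment suggests the sender is not streaming: do not delay.
        return True
    # Full-sized segment: ACK only when it is the second one outstanding.
    return pending_full_segments + 1 >= SEGMENTS_PER_ACK
```

So a 100 byte segment on a connection with a 536 byte full size is ACKed at once, the first full 536 byte segment is held back, and the second one triggers the ACK.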

Thanks!  I always learn something when we get into these long discussions.


Johnny Billquist

Dec 8, 2019, 9:07:33 PM
to Gregory Nutt, nu...@googlegroups.com
Pros and cons... The pathological counterexample is an interactive
telnet session, where you normally press a key, and the system will echo
it. If you don't have delayed ACK, you'll have a packet with 1 byte with
the key pressed. That solicits an ACK. Then the other side echoes the
pressed key, creating a new packet with 1 byte. That is then received
back by you, and you immediately send an ACK for that. If you are a fast
typist, you could pretty much halve the number of packets if you use
delayed ACK...

This also has implications for Nagle. But it all depends so much on
what the usage pattern looks like...
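
The packet counts in that example can be written down as a small Python sketch. It rests on two assumptions that are not guaranteed in practice: the server produces the echo before its ACK delay expires, and the user types the next key before the client's ACK delay expires, so that every ACK except the last one rides along with data. The function names are made up for illustration.

```python
def packets_without_delayed_ack(keystrokes):
    """Per keystroke: key, ACK of key, echo, ACK of echo."""
    return 4 * keystrokes


def packets_with_delayed_ack(keystrokes):
    """Per keystroke: key (carrying the ACK of the previous echo) and
    echo (carrying the ACK of the key). The last echo still needs one
    stand-alone ACK when the delay timer expires."""
    if keystrokes == 0:
        return 0
    return 2 * keystrokes + 1


# For 100 keystrokes: 400 packets without delayed ACK, 201 with it.
```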

Johnny Billquist

Dec 8, 2019, 9:09:28 PM
to Gregory Nutt, nu...@googlegroups.com
(And of course, with gigabit Ethernet, gigs of RAM, and processors with
multiple cores running at multiple GHz, people don't care about such
overheads anymore...)

patacongo

Dec 9, 2019, 9:19:04 AM
to NuttX

> However, I'm not at all surprised if Windows bends the rules. And I
> should warn you that Linux does too. I haven't found any newer RFC (that
> I can remember) that redefines this rule, but Linux can suddenly drop
> down to only sending an ACK after something like 10-15 packets, even
> when MSS is 1460.
 

patacongo

Dec 9, 2019, 9:27:40 AM
to NuttX

> This also has implications for Nagle. But it all depends so much on
> what the usage pattern looks like...


NuttX does not specifically support the Nagle algorithm, but kind of behaves that way when TCP write buffering is enabled.  New outgoing TCP data is concatenated at the end of the write buffer in a non-packetized byte stream.  When the network driver is able to send a packet, it will slurp up as much as it can from the write buffer into the outgoing packet, again without regard to any boundary about where data was added in the TCP stream.

So Telnet on a busy network could have multiple bytes per outgoing packet, but possibly only one on a non-busy network.
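
The write-buffer behavior described above can be modeled in a few lines of Python. This is a toy model only, not the NuttX data structure: the class `WriteBuffer` and its methods are hypothetical names.

```python
class WriteBuffer:
    """Hypothetical model of a TCP write buffer (not the NuttX structure)."""

    def __init__(self):
        self._stream = bytearray()

    def write(self, data):
        # New data is appended; write() boundaries are not remembered.
        self._stream += data

    def next_packet(self, max_payload):
        # Take as much as fits in one segment, ignoring write() boundaries.
        payload = bytes(self._stream[:max_payload])
        del self._stream[:max_payload]
        return payload
```

If three one-byte writes arrive before the driver is ready to send, a single `next_packet(536)` call returns all three bytes in one payload, which is the Nagle-like effect mentioned above.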

Delayed ACKs on incoming RX packets only really affect the sender of the packet, in that the data has to be retained longer until the ACK is received.

Johnny Billquist

Dec 9, 2019, 9:05:18 PM
to nu...@googlegroups.com, patacongo
On 2019-12-09 02:20, patacongo wrote:
>
> However, I'm not at all surprised if Windows bends the rules. And I
> should warn you that Linux does too. I haven't found any newer RFC
> (that
> I can remember) that redefines this rule, but Linux can suddenly drop
> down to only sending an ACK after something like 10-15 packets, even
> when MSS is 1460.
>
> And I've seen Windows sometimes do very broken things in many places...
>
> So you might just have hit a place where they are trying to be clever,
> and are breaking the RFC.
>
> It would not surprise me if there were some other RFCs that define a
> more adaptive way to do delayed ACKs.  RFC 1122 is very old and there
> has been a lot of work on network congestion since then.
>
> There are 10 or so RFCs that update RFC 1122.  I skimmed through all of
> them, just searching for "delayed ACK", and found nothing.  But I might
> have missed something.

Out of amusement and annoyance I decided to search some more and refresh
my memory at the same time.

A quick summary is that:

MSS is not shared between transmit and receive. It is not negotiated. I
must have seen this before, since I implemented it, but had forgotten
since. Anyway. For sending, you need to adjust based on what the remote
machine MSS is. And vice versa. There is no relation between the two,
and they can indeed be different. But of course, you also cannot send
more than whatever your stack and interface allows. So you might not
even be able to send packets that the receiver considers full sized
segments.
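
As a rough Python sketch of that per-direction rule (the function name is made up; the 536 byte fallback is the IPv4 default used when the peer's SYN carries no MSS option, and the 40 bytes assume IPv4 and TCP headers without options):

```python
IP_TCP_HEADER_BYTES = 40  # IPv4 + TCP headers without options
DEFAULT_MSS = 536         # IPv4 default when the peer's SYN has no MSS option


def send_mss(peer_mss, local_mtu):
    """Largest segment this side may send.

    peer_mss is the value from the peer's MSS option, or None if absent.
    The result is also capped by what the local interface can carry.
    """
    peer_limit = DEFAULT_MSS if peer_mss is None else peer_mss
    local_limit = local_mtu - IP_TCP_HEADER_BYTES
    return min(peer_limit, local_limit)


# Two hosts on 1500 byte MTU links, A advertising 536 and B advertising 1460:
#   A may send send_mss(1460, 1500) = 1460 byte segments to B,
#   B may send send_mss(536, 1500)  = 536 byte segments to A.
```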

Delayed ACK is certainly written as "ACK at least for every two full
sized segments received". However, it is also said that the exact way to
determine this is a bit fuzzy, and it can be done in different ways.
Probably the most appropriate is to actually count the number of bytes, and
ACK when that reaches 2 times the full sized segment. However, it is
also noted that a full sized segment does not necessarily mean MSS, even
though it often might be. PMTU might result in smaller segment sizes, as
does stuffing options into the packet. That reduces the actual size
of the payload, but MSS does not take that into account. However, for the
delayed ACK, you should.
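
A minimal Python sketch of that byte-counting approach, under stated assumptions: the names are hypothetical, the 40 bytes stand for IPv4 and TCP headers without options, and the delay timer that would ACK a partially filled count is left out for brevity.

```python
IP_TCP_HEADER_BYTES = 40  # IPv4 + TCP headers without options


def effective_full_size(mss, path_mtu, tcp_option_bytes):
    """Largest payload the sender can actually put in one segment."""
    return min(mss, path_mtu - IP_TCP_HEADER_BYTES) - tcp_option_bytes


class ByteCountingAck:
    """ACK once two full-sized segments' worth of bytes have arrived."""

    def __init__(self, full_size):
        self.threshold = 2 * full_size
        self.unacked_bytes = 0

    def on_data(self, payload_len):
        """Call per received segment; True means send an ACK now."""
        self.unacked_bytes += payload_len
        if self.unacked_bytes >= self.threshold:
            self.unacked_bytes = 0
            return True
        return False
```

For example, with an MSS of 1460, a path MTU of 1500, and 12 bytes of TCP options, the effective full size is 1448 bytes, so the threshold is 2896 bytes rather than 2920.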

Finding the relevant RFCs might be a bit tricky. But the most important
ones to read on this topic are:
https://tools.ietf.org/html/rfc2581
https://tools.ietf.org/html/rfc2525
https://tools.ietf.org/html/rfc2923

And I still would consider Windows TCP to probably be broken, and eye
the Linux implementation with suspicion as well... :-)

patacongo

Dec 10, 2019, 11:38:10 AM
to NuttX
This is one of those discussions that is really already complete, but there is always one more thing to say.  Here is mine:

One thing that I realize is that I have been misinterpreting the Wireshark output.  I have looked at hundreds of hours of Wireshark captures for NuttX networking and have become so jaded that I skim over the packets without paying attention to their details.  That is a dangerous habit, but it happens with all repetitive tasks.

When I added the delayed ACK, Wireshark began to act in a different way that I did not notice.  Without delayed ACKs, I was used to seeing one TCP RX packet followed by one ACK going the other way.  So one packet per line: RX, ACK, RX, ACK, etc.  But Wireshark behaves differently with delayed ACKs enabled:  it apparently merges all of the un-ACKed TCP RX packets into one line, so I was seeing one TCP RX transfer of a size much larger than the maximum packet size.

Then each of these big TCP transfers was followed by several ACK packets.  This asynchronous behavior is apparently due to packet buffering, and the same behavior occurs whether sending to the host or to the target.  So I completely misinterpreted the Wireshark data, and you need to discount everything I said before about the number of segments between ACKs.  I have no idea now and didn't retain the capture data that I would need to unravel that.  I could potentially determine the number of segments per ACK by looking at the RX sequence number acknowledged in each ACK.
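
That last idea could look something like the following Python sketch. It assumes the acknowledgment numbers have already been extracted from a capture into a list (in capture order, one direction only) and that the sender used a constant segment size; the function name is made up for illustration.

```python
SEQ_SPACE = 2 ** 32  # TCP sequence numbers wrap at 32 bits


def segments_per_ack(ack_numbers, segment_size):
    """Estimate how many segments each ACK covered.

    ack_numbers: acknowledgment numbers of successive ACKs.
    segment_size: payload size of a full segment, e.g. 536.
    Duplicate ACKs (no new data acknowledged) are skipped.
    """
    counts = []
    for prev, cur in zip(ack_numbers, ack_numbers[1:]):
        newly_acked = (cur - prev) % SEQ_SPACE  # handles wrap-around
        if newly_acked:
            counts.append(newly_acked / segment_size)
    return counts


# ACK numbers 1000, 2072, 3144 with 536 byte segments: each ACK covers
# 1072 bytes, i.e. two segments.
```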

patacongo

Dec 10, 2019, 11:40:21 AM
to NuttX

> By the way, the MTU is not negotiated, nor is it optional.
> The MTU is basically just the maximum packet length accepted by the
> interface.

Minus the size of the Ethernet header.

Johnny Billquist

Dec 10, 2019, 12:30:37 PM
to patacongo, NuttX
Well, for Ethernet the data payload is usually 1500 bytes, so the MTU at the interface level is also 1500. Any Ethernet headers are outside of the MTU or payload.

Johnny

patacongo <spud...@gmail.com> wrote: (10 December 2019 17:40:21 CET)

--
Sent from my Android device with K-9 Mail. Please excuse my brevity.