
TCP MSS not adhered to properly when options present


Rick Byers

Nov 29, 2001, 1:11:16 PM
to tech...@netbsd.org
Hi,

NetBSD doesn't appear to be handling the TCP MSS option properly.
According to RFC 1122, the MSS sent by a host is the maximum IP packet
size it's willing to receive minus 40. This correlates to the maximum
tcp payload when no tcp or ip options are present. However, NetBSD
appears to use it as the maximum tcp payload even when options are
present.

I first noticed this trying to ftp to ftp.netbsd.org - what version of
NetBSD does it run? (www.netbsd.org exhibits the same behaviour.) If I
open an FTP connection to ftp.netbsd.org with a TCP MSS of, say, 1360
(e.g. by setting my MTU to 1400), I get back packets with TCP payload up to
1360 bytes. However, ftp.netbsd.org uses the TCP timestamp options which
add 12 bytes to the TCP header. The result is that I get a 1412 byte IP
packet back, which would have had to be fragmented had my MTU really been
1400.
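
The arithmetic in this example can be checked with a small sketch. The
function and constant names below are made up for illustration and are not
taken from the NetBSD source; it assumes IPv4 with no IP options and a
12-byte timestamp option (10 bytes plus 2 bytes of padding).

```c
#include <assert.h>

/* Fixed header sizes with no options: the "40" in "MSS = MTU - 40". */
enum { IP_HDR_LEN = 20, TCP_HDR_LEN = 20 };

/* Assumed on-the-wire size of the TCP timestamp option, padded. */
enum { TSTAMP_OPT_LEN = 12 };

/* MSS a host would advertise for a given MTU under the RFC 1122 rule. */
int advertised_mss(int mtu)
{
    return mtu - IP_HDR_LEN - TCP_HDR_LEN;
}

/* Total IP packet length for a segment carrying `payload` data bytes
 * and `tcp_optlen` bytes of TCP options (no IP options assumed). */
int ip_packet_len(int payload, int tcp_optlen)
{
    assert(payload >= 0 && tcp_optlen >= 0);
    return IP_HDR_LEN + TCP_HDR_LEN + tcp_optlen + payload;
}
```

With an MTU of 1400, advertised_mss() gives 1360. A segment with 1360
bytes of payload plus timestamps comes to 1412 bytes, while limiting the
payload to 1360 - 12 = 1348 bytes keeps the packet at exactly 1400.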

Although this is a fabricated example, I have a real problem from my
machine at home which is connected to the net with PPPoE (mtu 1492).
In fact, I can't ftp to ftp.netbsd.org from inside my lan as it appears
ipnat isn't handling the fragments properly (it's strange because it
handles the fragments I get from www.netbsd.org properly, but I'll look
into that separately).

I have verified this behaviour under NetBSD-current (1.5Y), but only when
pmtu discovery is enabled.

For some reason if pmtu discovery is disabled, NetBSD only sends tcp
segments up to 500 bytes (552 byte IP packets) through my pppoe interface,
even though (in this case) an MSS of 1360 was received and 1452 was sent.
Is this behaviour by design, or a bug?

I haven't had a chance to look into the source for either of these
problems, but I hope to have time this weekend if someone doesn't beat me
to it.

Thanks,
Rick

Scott Barron

Nov 30, 2001, 1:55:49 AM
to tech...@netbsd.org
On Thu, Nov 29, 2001 at 01:10:58PM -0500, Rick Byers wrote:
> Hi,
>
> NetBSD doesn't appear to be handling the TCP MSS option properly.
> According to RFC 1122, the MSS sent by a host is the maximum IP packet
> size it's willing to receive minus 40. This correlates to the maximum
> tcp payload when no tcp or ip options are present. However, NetBSD
> appears to use it as the maximum tcp payload even when options are
> present.
>

I am pretty much a novice when it comes to the networking code but the
way I understand it is the MSS options are exchanged when a connection
is created. However the option length may not be constant during the
lifetime of the connection (a host that sends SACK blocks, something I'm
currently working on, is a good example of this). So I see only two
options: set it to the size that is sent, which is what I think it
currently does (looking at tcp_input.c in 1.5.2), or set it to the size
advertised minus MAX_TCPOPTLEN and go for the "worst case", so
to speak. Personally, I'm not sure which is a better idea and I don't
know how other systems handle it (I suspect the same way). Maybe
somebody more experienced can chime in here (also please chime in if
I've gotten anything incorrect).
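
The two choices above can be put side by side in a rough sketch, with a
per-segment variant added for comparison. None of this is NetBSD code: the
function names are invented, and the value 40 for MAX_TCPOPTLEN is an
assumption based on the TCP header limit (60-byte maximum header minus the
20-byte fixed part).

```c
#include <assert.h>

/* Assumed value: the most option bytes a TCP header can carry. */
#define MAX_TCPOPTLEN 40

/* Choice 1: treat the advertised MSS as the payload limit and
 * ignore whatever options the segment carries. */
int payload_ignore_options(int peer_mss, int optlen)
{
    (void)optlen;
    return peer_mss;
}

/* Choice 2: always reserve room for the largest possible option block. */
int payload_worst_case(int peer_mss, int optlen)
{
    (void)optlen;
    return peer_mss - MAX_TCPOPTLEN;
}

/* Comparison case: subtract only the options this segment carries. */
int payload_per_segment(int peer_mss, int optlen)
{
    assert(optlen >= 0 && optlen <= MAX_TCPOPTLEN);
    return peer_mss - optlen;
}
```

For a peer MSS of 1360 and 12 bytes of timestamp options, the three give
1360, 1320 and 1348 bytes of payload respectively. The worst-case choice
never exceeds the peer's limit but gives up 28 bytes per segment here.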


> I haven't had a chance to look into the source for either of these
> problems, but I hope to have time this weekend if someone doesn't beat me
> to it.
>

The sent MSS is recorded in tcp_dooptions() in tcp_input.c (grep for
TCPOPT_MAXSEG, my file is too hacked up to give you a valid line
number), to give you a starting point.

Hopefully I've gotten what little information I can provide correct :)
-Scott

Mark Allman

Nov 30, 2001, 7:28:30 AM
to Scott Barron, tech...@netbsd.org

> I am pretty much a novice when it comes to the networking code but
> the way I understand it is the MSS options are exchanged when a
> connection is created. However the option length may not be
> constant during the lifetime of the connection (a host that sends
> SACK blocks, something I'm currently working on, is a good example
> of this).

The SACK issue is pretty thin, I think. The only time that comes
into play is when you have bi-directional bulk transfer, I think.
And, that is pretty rare. I.e., SACK makes the size of the ACKs
variable. (Some people have suggested that timestamps should be
changed so that we only use them if our sending rate becomes "large"
such that we need them for PAWS, but that is just a theory at this
point.) It seems to me that the end-point that opens the connection
passively can solve this problem. That end-point knows whether or
not timestamps will be used and can adjust things accordingly.

allman


---
Mark Allman -- NASA GRC/BBN -- http://roland.grc.nasa.gov/~mallman/

Rick Byers

Nov 30, 2001, 9:30:30 AM
to Scott Barron, tech...@netbsd.org
On Fri, 30 Nov 2001, Scott Barron wrote:

> > NetBSD doesn't appear to be handling the TCP MSS option properly.
> > According to RFC 1122, the MSS sent by a host is the maximum IP packet
> > size it's willing to receive minus 40. This correlates to the maximum
> > tcp payload when no tcp or ip options are present. However, NetBSD
> > appears to use it as the maximum tcp payload even when options are
> > present.
>
> I am pretty much a novice when it comes to the networking code but the
> way I understand it is the MSS options are exchanged when a connection
> is created. However the option length may not be constant during the
> lifetime of the connection (a host that sends SACK blocks, something I'm
> currently working on, is a good example of this). So I see only two
> options: set it to the size that is sent, which is what I think it
> currently does (looking at tcp_input.c in 1.5.2), or set it to the size
> advertised minus MAX_TCPOPTLEN and go for the "worst case", so
> to speak. Personally, I'm not sure which is a better idea and I don't
> know how other systems handle it (I suspect the same way). Maybe
> somebody more experienced can chime in here (also please chime in if
> I've gotten anything incorrect).

Right. That is exactly why the RFCs define the MSS to be the maximum IP
packet size minus 40. That way, the extra IP or TCP options count against
the segment size to keep the total packet size bounded at a constant
(MSS+40).

> > I haven't had a chance to look into the source for either of these
> > problems, but I hope to have time this weekend if someone doesn't beat me
> > to it.
>
> The sent MSS is recorded in tcp_dooptions() in tcp_input.c (grep for
> TCPOPT_MAXSEG, my file is too hacked up to give you a valid line
> number), to give you a starting point.

Cool, thanks. I'll take a look this weekend and see if I can come up with
a fix. All that should be required is to subtract the length of any TCP
or IP options when deciding how much payload to include in a TCP
packet...

Thanks,
Rick

Rick Byers

Nov 30, 2001, 9:33:10 AM
to Mark Allman, tech...@netbsd.org

In my understanding of the specs, even if an end-point knew it would
always be including 12 bytes of TCP options, it's not supposed to use that
to adjust the MSS value it advertises. The MSS is the maximum IP packet
size minus 40 (i.e. doesn't include any options), and then the actual
maximum data payload must be determined on a per packet basis after taking
any options into account...

Rick


Scott Barron

Nov 30, 2001, 10:11:11 AM
to tech...@netbsd.org

Alright I broke out TCP/IP Illustrated v2 and noticed the following
snippet of code that is missing from the NetBSD code:

	/*
	 * Adjust data length if insertion of options will
	 * bump the packet length beyond the t_maxseg length.
	 */
	if (len > tp->t_maxseg - optlen) {
		len = tp->t_maxseg - optlen;
		sendalot = 1;
	}

This seems to do what you're talking about but I'm not sure why it was
removed (haven't been around that long). Would this do the trick?
(Against a -current checked out very early this morning (Nov 30)). This
built but I haven't tested it.

Index: tcp_output.c
===================================================================
RCS file: /cvsroot/syssrc/sys/netinet/tcp_output.c,v
retrieving revision 1.75
diff -r1.75 tcp_output.c
745a746,750
>
> if (len > txsegsize - optlen) {
> len = txsegsize - optlen;
> sendalot = 1;
> }

-Scott

Scott Barron

Nov 30, 2001, 10:23:48 AM
to tech...@netbsd.org
Hmmm, right after I post this I notice in tcp_segsize() that size is being
decreased by tcp_optlen() (which returns TCPOLEN_TSTAMP_APPA if the
timestamp options are set) and then uses the min of that size and the
peer mss. So it looks like it should be doing the right thing. Sorry
to reply to my own post, I will ensure that I am properly caffeinated
before future posts.

-Scott

Rick Byers

Dec 1, 2001, 1:32:13 AM
to tech...@netbsd.org
Ok, I've had a chance to look at this now...

> Hmmm, right after I post this I notice in tcp_segsize() that size is being
> decreased by tcp_optlen() (which returns TCPOLEN_TSTAMP_APPA if the
> timestamp options are set) and then uses the min of that size and the
> peer mss. So it looks like it should be doing the right thing. Sorry
> to reply to my own post, I will ensure that I am properly caffeinated
> before future posts.

It's true that size is being decreased, but txsegsizep is calculated as the
minimum of size and the mss advertised by the peer. The option
lengths have to be subtracted from the mss advertised by the peer also.
I've written a patch that appears to fix this, and filed a PR: kern/14799.
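
To make the difference concrete, here is a simplified sketch of the two
orderings. It is not the code from tcp_segsize() and not the patch in the
PR (neither is reproduced in this thread); the function names and the
reduction to three integer parameters are assumptions for illustration.

```c
#include <assert.h>

int min_int(int a, int b)
{
    return a < b ? a : b;
}

/* Ordering as described above: options are subtracted from the locally
 * derived size only, and the result is then capped by the peer's MSS. */
int txseg_described(int local_size, int peer_mss, int optlen)
{
    assert(optlen >= 0);
    return min_int(local_size - optlen, peer_mss);
}

/* Ordering that counts options against the peer's MSS as well. */
int txseg_with_peer_adjust(int local_size, int peer_mss, int optlen)
{
    assert(optlen >= 0);
    return min_int(local_size, peer_mss) - optlen;
}
```

With a local size of 1452 (MTU 1492 minus 40), a peer MSS of 1360 and 12
bytes of options, the first ordering yields 1360 bytes of payload (a 1412
byte packet) and the second yields 1348 (a 1400 byte packet). When the
local size is the smaller of the two, both orderings give the same answer,
which is consistent with the problem going unnoticed at mtu=1500.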

Since the bug causes NetBSD to send packets larger than the receiver has
said it can receive, I consider this to be a serious bug. Of course it
won't affect most people because most people have mtu=1500. Anyway, could
someone please sanity check my work and commit it if it looks good?

Thanks,
Rick
