TCP differences in 7.2 vs 7.1

David Samms

May 12, 2009, 1:41:22 PM

After upgrading to 7.2 (amd64), some customers complained of very poor
bandwidth. Upon investigation, all the affected customers were ATT DSL
clients located all over the USA, not in a single city, and no other
ISPs were affected. The server is a Supermicro with dual quad-core
processors and a single Intel fxp network card on a 100 Mbit connection.
The kernel is GENERIC for both 7.1-RELEASE and 7.2-RELEASE. Normally a
client can max out their download connection, but for ATT DSL customers
the transfer rate would be about 5-10 KB/s even though the server and
client were both idle.

Repeated tests were done from multiple clients in different
geographical locations. The problem manifested itself regardless of
whether ftp, http, smtp, pop, or scp was used, and regardless of the OS
of the client. Believing it to be a routing issue, we changed the route
and even changed the local router the server is connected to so that a
different NIC port would be used to talk to ATT DSL customers, but there
was no change in performance.

It turns out the problem is somehow related to differences between
FreeBSD 7.1 and 7.2. If I boot the same server with 7.1, all clients
work as you would expect. If 7.2 is used, all clients except ATT DSL
clients work normally, while ATT customers are limited to 5-10 KB/s.

I have no reason to believe there is anything wrong with the ATT DSL
network; it just happens to be affected by whatever causes the problem.

Any theories?

A special thanks to cybercon.com tech support for being so helpful. If
you need a data center, they have good tech support.

_______________________________________________
freebsd...@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stabl...@freebsd.org"

Xin LI

May 12, 2009, 4:41:21 PM

Hi David,

David Samms wrote:
> After upgrading to 7.2 (amd64), some customers complained of very poor
> bandwidth. Upon investigation, all the affected customers were ATT DSL
> clients located all over the USA, not in a single city, and no other
> ISPs were affected. The server is a Supermicro with dual quad-core
> processors and a single Intel fxp network card on a 100 Mbit connection.

Could you please check whether this helps:

sysctl net.inet.tcp.tso=0

Cheers,
--
Xin LI <del...@delphij.net> http://www.delphij.net/
FreeBSD - The Power to Serve!

David Samms

May 12, 2009, 5:31:01 PM

Xin LI wrote:
> Hi David,
>
> David Samms wrote:
>> After upgrading to 7.2 (amd64), some customers complained of very poor
>> bandwidth. Upon investigation, all the affected customers were ATT DSL
>> clients located all over the USA, not in a single city, and no other
>> ISPs were affected. The server is a Supermicro with dual quad-core
>> processors and a single Intel fxp network card on a 100 Mbit connection.
>
> Could you please check whether this helps:
>
> sysctl net.inet.tcp.tso=0
>
> Cheers,
> --
> Xin LI <del...@delphij.net> http://www.delphij.net/
> FreeBSD - The Power to Serve!

Xin LI,

Thank you for your help.

Setting sysctl net.inet.tcp.tso=0 resolved the issue completely. What
does sysctl net.inet.tcp.tso=0 do? Where can I read more about the
option? I captured tcpdumps of a single file transfer to 7.1, 7.2 and
7.2 with sysctl net.inet.tcp.tso=0, but they are too large to attach to
this list. Let me know if you are interested in viewing the dump files.

Thanks again for your assistance!

Rick C. Petty

May 12, 2009, 5:36:35 PM

On Tue, May 12, 2009 at 05:31:01PM -0400, David Samms wrote:
>
> Setting sysctl net.inet.tcp.tso=0 resolved the issue completely. What
> does sysctl net.inet.tcp.tso=0 do?

# sysctl -d net.inet.tcp.tso
net.inet.tcp.tso: Enable TCP Segmentation Offload

I had a similar problem with a different NIC. This option controls whether
TCP segmentation is offloaded to the NIC. My NIC seemed to be limited by the
number of interrupts that could be delivered. You can also do this on a
per-interface basis using "ifconfig <interface> -tso".

-- Rick C. Petty
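
To make "offloading segmentation" concrete: with TSO enabled, the stack
passes one large chunk of TCP payload to the NIC, which cuts it into
segments of at most MSS bytes each; with TSO disabled, the stack does that
splitting itself. A minimal sketch of the arithmetic in C follows. The
helper names segment_count and segment_payload_len are made up for
illustration and are not FreeBSD functions.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Number of TCP segments needed to carry payload_len bytes when each
 * segment holds at most mss payload bytes (ceiling division).
 * Hypothetical helper, for illustration only.
 */
size_t
segment_count(size_t payload_len, size_t mss)
{
	assert(mss > 0);
	return (payload_len + mss - 1) / mss;
}

/*
 * Payload size of the segment at zero-based index idx. Every segment is
 * full-sized except possibly the last one, which carries the remainder.
 */
size_t
segment_payload_len(size_t payload_len, size_t mss, size_t idx)
{
	size_t n = segment_count(payload_len, mss);

	assert(idx < n);
	if (idx + 1 < n)
		return mss;
	return payload_len - idx * mss;
}
```

As an example with assumed numbers, a 65536-byte send with an MSS of 1452
needs 46 segments, and the last one carries 196 bytes.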

Xin LI

May 12, 2009, 6:10:24 PM

Hi, David,

David Samms wrote:
> Xin LI wrote:
>> Hi David,
>>
>> David Samms wrote:
>>> After upgrading to 7.2 (amd64), some customers complained of very poor
>>> bandwidth. Upon investigation, all the affected customers were ATT DSL
>>> clients located all over the USA, not in a single city, and no other
>>> ISPs were affected. The server is a Supermicro with dual quad-core
>>> processors and a single Intel fxp network card on a 100 Mbit connection.
>>
>> Could you please check whether this helps:
>>
>> sysctl net.inet.tcp.tso=0
>>
>> Cheers,
>> --
>> Xin LI <del...@delphij.net> http://www.delphij.net/
>> FreeBSD - The Power to Serve!
>
> Xin LI,
>
> Thank you for your help.
>
> Setting sysctl net.inet.tcp.tso=0 resolved the issue completely. What
> does sysctl net.inet.tcp.tso=0 do? Where can I read more about the
> option? I captured tcpdumps of a single file transfer to 7.1, 7.2 and
> 7.2 with sysctl net.inet.tcp.tso=0, but they are too large to attach to
> this list. Let me know if you are interested in viewing the dump files.
>
> Thanks again for your assistance!

Thanks for the offer, but I think this is a known problem, so the dump
files are probably no longer necessary. The problem shows up when the
receiver side (usually PPPoE clients, e.g. DSL users) proposes an MSS
smaller than the one the interface MTU would give. The previous
implementation set the packet length to the interface MTU instead of the
negotiated size, which caused the problem.

Setting net.inet.tcp.tso=0 turns off TCP Segmentation Offload
completely. The previous release of FreeBSD did not include this feature.

I think yongari@ has committed a fix as revision 191867 (RELENG_7) and
190982 (HEAD):

Index: if_fxp.c
===================================================================
--- if_fxp.c (revision 190981)
+++ if_fxp.c (revision 190982)
@@ -1485,7 +1485,8 @@
                  * checksum in the first frame driver should compute it.
                  */
                 ip->ip_sum = 0;
-                ip->ip_len = htons(ifp->if_mtu);
+                ip->ip_len = htons(m->m_pkthdr.tso_segsz + (ip->ip_hl << 2) +
+                    (tcp->th_off << 2));
                 tcp->th_sum = in_pseudo(ip->ip_src.s_addr, ip->ip_dst.s_addr,
                     htons(IPPROTO_TCP + (tcp->th_off << 2) +
                     m->m_pkthdr.tso_segsz));

To re@:

Perhaps we should issue an erratum for this, or at least document it in
the errata (I can do this)?

Cheers,
--
Xin LI <del...@delphij.net> http://www.delphij.net/
FreeBSD - The Power to Serve!
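
The length arithmetic in the if_fxp.c change can be worked through with
example numbers. The sketch below is plain C, not driver code:
tso_frame_ip_len and ip_len_excess are hypothetical helpers, the
byte-order conversion (htons) is left out, and the header sizes and MSS
used in the example afterwards are assumed values.

```c
#include <assert.h>
#include <stdint.h>

/*
 * IP total length of one TSO-generated frame, mirroring the patched
 * expression tso_segsz + (ip_hl << 2) + (th_off << 2). ip_hl and th_off
 * are header lengths in 32-bit words, as stored in the IP and TCP headers.
 * Hypothetical helper, for illustration only.
 */
uint16_t
tso_frame_ip_len(uint16_t tso_segsz, uint8_t ip_hl, uint8_t th_off)
{
	uint32_t len;

	assert(ip_hl >= 5 && ip_hl <= 15);
	assert(th_off >= 5 && th_off <= 15);
	len = (uint32_t)tso_segsz + ((uint32_t)ip_hl << 2) +
	    ((uint32_t)th_off << 2);
	assert(len <= 65535);
	return (uint16_t)len;
}

/*
 * How many bytes the pre-fix value (the interface MTU) exceeds the
 * length computed from the negotiated segment size.
 */
int
ip_len_excess(uint16_t if_mtu, uint16_t tso_segsz, uint8_t ip_hl,
    uint8_t th_off)
{
	return (int)if_mtu - (int)tso_frame_ip_len(tso_segsz, ip_hl, th_off);
}
```

With a 1500-byte MTU, 20-byte IP and TCP headers (ip_hl = th_off = 5) and
a PPPoE-style MSS of 1452, the patched expression gives 1492, whereas the
old code would have written 1500, which is 8 bytes more than the
negotiated segment size allows. With an MSS of 1460 the two values agree,
which fits the observation that only some clients were affected.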

Nigel Wohlers

May 12, 2009, 6:52:34 PM

On 13/5/09 8:41 AM, Xin LI wrote:
> Hi David,
>
> David Samms wrote:
>> After upgrading to 7.2 (amd64), some customers complained of very poor
>> bandwidth. Upon investigation, all the affected customers were ATT DSL
>> clients located all over the USA, not in a single city, and no other
>> ISPs were affected. The server is a Supermicro with dual quad-core
>> processors and a single Intel fxp network card on a 100 Mbit connection.
>
> Could you please check whether this helps:
>
> sysctl net.inet.tcp.tso=0
>
> Cheers,
> --
> Xin LI <del...@delphij.net> http://www.delphij.net/

Thank you! This hint has saved me a lot of troubleshooting.

I was having the same issue as David with 3 servers recently upgraded to
7.2. Clients (MS Windows) were complaining that they were having
intermittent connectivity issues talking to these servers (https, imaps).

They too have fxp network interface cards; there are no issues with other
servers upgraded to 7.2 that have em cards.

Thanks again.

Regards,
Nigel.

Pyun YongHyeon

May 12, 2009, 8:41:31 PM

On Wed, May 13, 2009 at 10:52:34AM +1200, Nigel Wohlers wrote:
> On 13/5/09 8:41 AM, Xin LI wrote:
> >Hi David,
> >
> >David Samms wrote:
> >>After upgrading to 7.2 (amd64), some customers complained of very poor
> >>bandwidth. Upon investigation, all the affected customers were ATT DSL
> >>clients located all over the USA, not in a single city, and no other
> >>ISPs were affected. The server is a Supermicro with dual quad-core
> >>processors and a single Intel fxp network card on a 100 Mbit connection.
> >
> >Could you please check whether this helps:
> >
> > sysctl net.inet.tcp.tso=0
> >
> >Cheers,
> >--
> >Xin LI <del...@delphij.net> http://www.delphij.net/
>
>
> Thank you! This hint has saved me a lot of troubleshooting.
>
> I was having the same issue as David with 3 servers recently upgraded to
> 7.2. Clients (MS Windows) were complaining that they were having
> intermittent connectivity issues talking to these servers (https, imaps).
>
> They too have fxp network interface cards; there are no issues with other
> servers upgraded to 7.2 that have em cards.
>

Instead of disabling TSO in the network stack, just disable TSO in
fxp(4) as a workaround. The fix is already in RELENG_7 (r191867), so you
can extract the patch and apply it by hand if you want.

For instance,
# cd /tmp
# fetch -o fxp.tso.patch "http://svn.freebsd.org/viewvc/base/head/sys/dev/fxp/if_fxp.c?r1=190982&r2=188176&view=patch"
# cd /usr/src/sys/dev/fxp
# patch -p4 < /tmp/fxp.tso.patch
Then rebuild the kernel.

Lars Eggert

May 14, 2009, 3:10:12 AM

Hi,

I've been seeing similar issues ("IP bad-len 0" packets in tcpdump
traces) since 7.2-STABLE, on em interfaces. Turning off TSO seems to
do the trick here, too. So at least from where I'm sitting, this is
not only an fxp problem.

Lars

Pyun YongHyeon

May 14, 2009, 4:27:50 AM

Then you're seeing a different problem on em(4). Last time I checked,
the TSO code in em(4) didn't use m_pullup and just returned
ENXIO to the caller. I'm not sure whether that is related to your issue,
but would you tell us your network configuration? If you can easily
reproduce the issue, would you let us know?

> Lars

Lars Eggert

May 14, 2009, 4:28:43 AM

Hi,

On 2009-5-14, at 11:27, Pyun YongHyeon wrote:
> Then you're seeing a different problem on em(4). Last time I checked,
> the TSO code in em(4) didn't use m_pullup and just returned
> ENXIO to the caller. I'm not sure whether that is related to your issue,
> but would you tell us your network configuration?

This box is a Dell 2950 server/router running 7.2-STABLE. It has an
onboard bce interface and four dual-port Intel PRO/1000 NICs, giving
it 8 em interfaces. (Let me know if you want the boot dmesg.)

> If you can easily reproduce the issue, would you let us know?

Reproducing the issue is as easy as setting net.inet.tcp.tso=1.

What's interesting is that I only see the issue on one of the eight em
interfaces. That interface is connected to a D-Link DIR-655 WLAN
router. When I tcpdump on the other interfaces with TSO enabled, I see
no "IP bad-len 0" messages.

Lars

Lars Eggert

May 14, 2009, 4:52:47 AM

In my case, it's a

em4@pci0:12:0:0: class=0x020000 card=0x135e8086 chip=0x105e8086 rev=0x06 hdr=0x00
    vendor = 'Intel Corporation'
    device = 'PRO/1000 PT'
    class = network
    subclass = ethernet

Lars

On 2009-5-14, at 11:46, Lev Serebryakov wrote:
> Hello, Lars.

> You wrote on May 14, 2009, 12:28:43:
>
>> Reproducing the issue is as easy as setting net.inet.tcp.tso=1.
>> What's interesting is that I only see the issue on one of the eight
>> em interfaces. That interface is connected to a D-Link DIR-655 WLAN
>> router. When I tcpdump on the other interfaces with TSO enabled, I
>> see no "IP bad-len 0" messages.

> I have the same problem (one in every 100-200 frames) on an on-board if_em:
>
> em0@pci0:0:25:0: class=0x020000 card=0x82681043 chip=0x10bd8086 rev=0x02 hdr=0x00
>     vendor = 'Intel Corporation'
>     device = '82566DM-2 Gigabit Network Connection'
>     class = network
>     subclass = ethernet
>
>
>
> --
> // Black Lion AKA Lev Serebryakov <l...@FreeBSD.org>
>

Pyun YongHyeon

May 15, 2009, 4:58:06 AM

Would you try the attached patch? I'm using the patch on my development
box. Originally the patch was written to address checksum offload
breakage on multicast packets (r182463).
However, I didn't encounter the TSO issue without the patch. Note that the
patch has not been heavily tested, so it may have undiscovered bugs.

em.csum_tso.patch

Michael L. Squires

May 20, 2009, 5:55:29 PM

I started having speed problems after shifting from 7.1-STABLE to
7.1-PRERELEASE. They have continued with 7.2-STABLE.

Reverting to the 7.1-STABLE kernel eliminated the problem.

After downloading 7.2-STABLE from cvsup.freebsd.org at about 10:40 AM EST
on 5/20/2009 and doing a buildworld/buildkernel/installkernel/installworld
cycle, I still need to execute "sysctl net.inet.tcp.tso=0" to eliminate
throughput problems between my home system (on Comcast) and my office PC
(connected via a Time-Warner connection). This also affects connections to
other systems; downloading Web pages (ebay.com) speeds up after I change
the TSO entry.

The box in question runs NAT and has an fxp (Intel Pro100) interface connected
to a Comcast cable modem and an em (Intel Pro1000) interface connected to the
internal network.

There are no network errors in "netstat -i" on either interface.

The "if_fxp.c" code appears to be the May 7 version.

This is the dmesg entry for the card in question. The system is a dual Xeon
Supermicro 1U box, 1GB RAM, single 300GB IDE hard drive.

fxp0: <Intel 82551 Pro/100 Ethernet> port 0xe400-0xe43f mem 0xfebfd000-0xfebfdfff,0xfeb80000-0xfeb9ffff irq 27 at device 7.0 on pci0
miibus0: <MII bus> on fxp0

Mike Squires

Pyun YongHyeon

May 20, 2009, 11:45:40 PM

On Wed, May 20, 2009 at 05:55:29PM -0400, Michael L. Squires wrote:
> I started having speed problems after shifting from 7.1-STABLE to
> 7.1-PRERELEASE. They have continued with 7.2-STABLE.
>
> Reverting to the 7.1-STABLE kernel eliminated the problem.
>
> After downloading 7.2-STABLE from cvsup.freebsd.org at about 10:40 AM EST
> on 5/20/2009 and doing a buildworld/buildkernel/installkernel/installworld
> cycle, I still need to execute "sysctl net.inet.tcp.tso=0" to eliminate
> throughput problems between my home system (on Comcast) and my office PC
> (connected via a Time-Warner connection). This also affects connections to
> other systems; downloading Web pages (ebay.com) speeds up after I change
> the TSO entry.
>
> The box in question runs NAT and has an fxp (Intel Pro100) interface
> connected to a Comcast cable modem and an em (Intel Pro1000) interface
> connected to the internal network.
>
> There are no network errors in "netstat -i" on either interface.
>
> The "if_fxp.c" code appears to be the May 7 version.
>

You should have cvs rev. 1.266.2.15 of if_fxp.c.

> This is the dmesg entry for the card in question. The system is a dual Xeon
> Supermicro 1U box, 1GB RAM, single 300GB IDE hard drive.
>
> fxp0: <Intel 82551 Pro/100 Ethernet> port 0xe400-0xe43f mem
> 0xfebfd000-0xfebfdfff,0xfeb80000-0xfeb9ffff irq 27 at device 7.0 on pci0
> miibus0: <MII bus> on fxp0
>

Since you use both em(4) and fxp(4), I'd like to know which driver
has the issue. Instead of disabling TSO in the network stack, try
disabling TSO on each interface in turn. For instance,
1. Disable TSO on em(4) and check whether you still see the issue
(ifconfig em0 -tso).
2. Disable TSO on fxp(4) and check whether you still see the issue
(ifconfig fxp0 -tso).

Michael L. Squires

May 22, 2009, 3:50:07 PM

The version of if_fxp.c is in fact 1.266.2.15.

Connecting to the FreeBSD box from a PC with a bash shell under XP
SP3/Cygwin OpenSSH I find

(1) disabling "tso" on the internal "em0" interface has no effect; but

(2) disabling "tso" on the external "fxp0" interface eliminates the
throughput problem. The effect appears to be the same as using sysctl to
disable tso on all interfaces.

With "tso" enabled on the "fxp0" interface the connection (reading email
using "pine" in a large window) hung completely.

There are no errors in "netstat -i" nor in /var/log/messages.

"netstat -e" on the XP PC shows no discards or errors; however, I don't
think I've ever seen a PC under Windows admit to network errors.

The fxp0 interface connects to a Comcast cable modem, which eventually
connects to my office PC which is in the "iga.in.gov" domain hosted by
TimeWarner.

I'll be happy to run anything else you want.

Mike Squires
UN*X at home
Since 1985

Pyun YongHyeon

May 22, 2009, 9:34:52 PM

Would you capture the failing TCP session with tcpdump and mail me
the URL of the captured file (off-list)?

Lars Eggert

Sep 3, 2009, 8:11:02 AM

Hi,

Just a quick update: I still need to run with TSO off on a RELENG_7
build from Sep 1, because otherwise throughput via em interfaces is
sometimes very poor.

Lars