After upgrading to 7.2 (amd64), some customers complained of very poor bandwidth. Upon investigation, all the affected customers turned out to be ATT DSL clients located all over the USA, not in a single city, and no other ISPs were affected. The server is a Supermicro with dual quad-core processors and a single Intel fxp network card on a 100 Mbit connection. The kernel is GENERIC for both 7.1-RELEASE and 7.2-RELEASE. Normally a client can max out its download connection, but for ATT DSL customers the transfer rate was about 5-10 KB/s even though the server and client were both idle.
Repeated tests were done from multiple clients in different geographical locations. The problem manifested itself regardless of whether ftp, http, smtp, pop, or scp was used, and regardless of the client's OS. Believing it to be a routing issue, we changed the route and even changed the local router the server is connected to, so that a different NIC port would be used to talk to ATT DSL customers, but performance did not change.
It turns out the problem is somehow related to differences between FreeBSD 7.1 and 7.2. If I boot the same server with 7.1, all clients work as you would expect. With 7.2, all clients except ATT DSL clients work normally, while ATT customers are limited to 5-10 KB/s.
I have no reason to believe there is anything wrong with the ATT DSL network; it just happens to be affected by whatever causes the problem.
Any theories?
A special thanks to cybercon.com tech support for being so helpful. If you need a data center, they have good tech support.
David Samms wrote:
> After upgrading to 7.2 (amd64) some customers complained of very poor
> bandwidth. Upon investigation all the effected customers were ATT DSL
> clients located all over the USA, not in a single city, nor were other
> ISPs effected. The server is a Supermicro with dual (quad core)
> processors with a single Intel fxp network card on a 100mbit connection.
Could you please try if this would help:
sysctl net.inet.tcp.tso=0
Cheers,
--
Xin LI <delp...@delphij.net> http://www.delphij.net/
FreeBSD - The Power to Serve!
_______________________________________________
freebsd-sta...@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Setting sysctl net.inet.tcp.tso=0 resolved the issue completely. What does sysctl net.inet.tcp.tso=0 do? Where can I read more about the option? I captured tcpdumps of a single file transfer to 7.1, 7.2, and 7.2 with sysctl net.inet.tcp.tso=0, but they are too large to attach to this list. Let me know if you are interested in viewing the dump files.
I had a similar problem with a different NIC. This option controls whether we offload segmenting to the NIC. My NIC seemed to be limited by the number of interrupts which could be delivered. You can also do this on a card-by-card basis using "ifconfig <interface> -tso".
> Xin LI,
> Thank you for your help.
> Setting sysctl net.inet.tcp.tso=0 resolved the issue completely. What
> does sysctl net.inet.tcp.tso=0 do? Where can I read more about the
> option? I captured tcpdumps of a single file transfer to 7.1, 7.2 and
> 7.2 with sysctl net.inet.tcp.tso=0, but they are to large to attach to
> this list. Let me know if you are interested in viewing the dump files.
> Thanks again for your assistance!
Thanks for the offer, but I think this is a known problem, so the dump files are probably no longer necessary. The problem is triggered when the receiver side (usually PPPoE clients, e.g. DSL users) proposes an MSS smaller than what the interface MTU would allow: the previous implementation set the packet length to the interface MTU instead of the negotiated one, which causes problems.
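To make the MSS/MTU mismatch concrete, here is a rough sketch of the arithmetic. It assumes IPv4 with 20-byte IP and TCP headers (no options) and the usual 8 bytes of PPPoE framing; the function names are made up for illustration and are not part of any FreeBSD tool.

```sh
#!/bin/sh
# Illustrative arithmetic only; all function names are hypothetical.
# Assumes IPv4 with a 20-byte IP header and a 20-byte TCP header (no options).

# mss_for_mtu MTU: largest TCP payload that fits in one packet of size MTU.
mss_for_mtu() {
    echo $(( $1 - 20 - 20 ))
}

# pppoe_mtu ETHER_MTU: MTU left once the 8 bytes of PPPoE/PPP framing
# are subtracted from the Ethernet MTU.
pppoe_mtu() {
    echo $(( $1 - 8 ))
}

# oversize_by SERVER_MTU PEER_MSS: by how many bytes a segment sized from
# the server's interface MTU exceeds the MSS the peer asked for
# (zero or negative means the segment fits).
oversize_by() {
    echo $(( $(mss_for_mtu "$1") - $2 ))
}
```

With a 1500-byte Ethernet MTU on the server, "mss_for_mtu 1500" gives 1460, while a PPPoE client would typically advertise "mss_for_mtu $(pppoe_mtu 1500)", i.e. 1452. "oversize_by 1500 1452" then reports 8, meaning each segment sized from the interface MTU would be 8 bytes larger than the peer asked for.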
Setting net.inet.tcp.tso=0 turns off TCP Segmentation Offload (TSO) completely. The previous release of FreeBSD does not include this feature.
I think yongari@ has committed a fix as revision 191867 (RELENG_7) and 190982 (HEAD):
Perhaps we should issue an erratum for this, or at least document it in the errata (I can do this)?
Cheers,
--
Xin LI <delp...@delphij.net> http://www.delphij.net/
FreeBSD - The Power to Serve!
Thank you! This hint has saved me a lot of troubleshooting.
I was having the same issue as David with 3 servers recently upgraded to 7.2. Clients (MS Windows) were complaining that they were having intermittent connectivity issues talking to these servers (https, imaps).
They too have fxp network interface cards; there are no issues with other servers upgraded to 7.2 that have em cards.
On Wed, May 13, 2009 at 10:52:34AM +1200, Nigel Wohlers wrote:
> On 13/5/09 8:41 AM, Xin LI wrote:
> > Hi David,
> > [...]
>
> Thank you! This hint has saved me a lot of troubleshooting.
>
> I was having the same issue as David with 3 servers recently upgraded to
> 7.2. Clients (MS Windows) were complaining that they were having
> intermittent connectivity issues talking to these servers (https, imaps).
>
> They too have fxp network interface cards, no issues with other servers
> upgraded to 7.2 with em cards.
Instead of disabling TSO in the network stack, just disable TSO on fxp(4) as a workaround. The fix is already in RELENG_7 (r191867), so you can extract the patch and apply it by hand if you want.
I've been seeing similar issues ("IP bad-len 0" packets in tcpdump traces) with 7.2-STABLE and em interfaces. Turning off TSO seems to do the trick here, too. So at least from where I'm sitting, this is not only an fxp problem.
On Thu, May 14, 2009 at 10:10:12AM +0300, Lars Eggert wrote:
> Hi,
>
> I've been seeing similar issues ("IP bad-len 0" packets in tcpdump
> traces") since 7.2-STABLE and em interfaces. Turning off TSO seems to
> do the trick here, too. So at least from where I'm sitting, this is
> not only an fxp problem.
Then you're seeing a different problem on em(4). Last time I checked, the TSO code in em(4) didn't use m_pullup and just returned ENXIO to the caller. I'm not sure whether that is related to your issue, but would you tell us your network configuration? If you can easily reproduce the issue, would you let us know?
> Then you're seeing different problem on em(4). Last time I checked
> em(4) TSO code in em(4) didn't use m_pullup and just returned
> ENXIO to caller. I'm not sure that is related with your issue but
> would you tell us your network configuration?

This box is a Dell 2950 server/router running 7.2-STABLE. It has an onboard bce interface and four dual-port Intel PRO/1000 NICs, giving it 8 em interfaces. (Let me know if you want the boot dmesg.)

> If you can easily
> reproduce the issue would you let us know?
Reproducing the issue is as easy as setting net.inet.tcp.tso=1.
What's interesting is that I only see the issue on one of the eight em interfaces. That interface is connected to a D-Link DIR-655 WLAN router. When I tcpdump on the other interfaces with TSO enabled, I see no "IP bad-len 0" messages.
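For anyone wanting to compare interfaces the same way, a small helper along these lines could count the suspicious packets in a trace. This is only a sketch: the helper name is made up, and it simply counts lines of tcpdump text output that contain the "IP bad-len" marker mentioned above.

```sh
#!/bin/sh
# count_bad_len: read tcpdump text output on stdin and print how many
# lines contain the "IP bad-len" marker. The helper name is hypothetical.
count_bad_len() {
    # grep -c prints the number of matching lines; "|| true" keeps a
    # count of zero from being treated as a failure under "set -e".
    grep -c 'IP bad-len' || true
}
```

It could be fed from something like "tcpdump -n -i em0 -c 1000 | count_bad_len", run once per interface with TSO enabled; em0 and the packet count of 1000 are just example values.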
> Hello, Lars.
> You wrote on May 14, 2009, 12:28:43:
>
>> Reproducing the issue is as easy as setting net.inet.tcp.tso=1.
>> What's interesting is that I only see the issue on one of the eight em
>> interfaces. That interface is connected to a D-Link DIR-655 WLAN
>> router. When I tcpdump on the other interfaces with TSO enabled, I see
>> no "IP bad-len 0" messages.
>
> I have the same problem (one in every 100-200 frames) on on-board if_em:
On Thu, May 14, 2009 at 11:28:43AM +0300, Lars Eggert wrote:
> On 2009-5-14, at 11:27, Pyun YongHyeon wrote:
> > If you can easily reproduce the issue would you let us know?
>
> Reproducing the issue is as easy as setting net.inet.tcp.tso=1.
>
> What's interesting is that I only see the issue on one of the eight em
> interfaces. That interface is connected to a D-Link DIR-655 WLAN
> router. When I tcpdump on the other interfaces with TSO enabled, I see
> no "IP bad-len 0" messages.
Would you try the attached patch? I'm using it on my development box. Originally the patch was written to address checksum offload breakage on multicast packets (r182463). However, I didn't encounter the TSO issue without the patch. Note that the patch has not been heavily tested, so it may have undiscovered bugs.
I started having speed problems after shifting from 7.1-STABLE to 7.1-PRERELEASE. They have continued with 7.2-STABLE.
Reverting to the 7.1-STABLE kernel eliminated the problem.
After downloading 7.2-STABLE from cvsup.freebsd.org at about 10:40 AM EST on 5/20/2009 and doing a buildworld/buildkernel/installkernel/installworld cycle, I still need to execute "sysctl net.inet.tcp.tso=0" to eliminate throughput problems between my home system (on Comcast) and my office PC (connected via a Time-Warner connection). This also affects connections to other systems; downloading Web pages (ebay.com) speeds up after I change the TSO entry.
The box in question runs NAT and has an fxp (Intel Pro100) interface connected to a Comcast cable modem and an em (Intel Pro1000) interface connected to the internal network.
There are no network errors in "netstat -i" on either interface.
The "if_fxp.c" code appears to be the May 7 version.
This is the dmesg entry for the card in question. The system is a dual Xeon Supermicro 1U box, 1GB RAM, single 300GB IDE hard drive.
fxp0: <Intel 82551 Pro/100 Ethernet> port 0xe400-0xe43f mem 0xfebfd000-0xfebfdfff,0xfeb80000-0xfeb9ffff irq 27 at device 7.0 on pci0
miibus0: <MII bus> on fxp0
On Wed, May 20, 2009 at 05:55:29PM -0400, Michael L. Squires wrote:
> [...]
> The box in question runs NAT and has an fxp (Intel Pro100) interface
> connected to a Comcast cable modem and an em (Intel Pro1000) interface
> connected to the internal network.
> [...]
> The "if_fxp.c" code appears to be the May 7 version.
You should have cvs rev. 1.266.2.15 of if_fxp.c.
Since you use both em(4) and fxp(4), I'd like to know which driver has the issue. Instead of disabling TSO in the network stack, try disabling TSO on each interface in turn. For instance:
1. Disable TSO on em(4) and check whether you see the same issue (ifconfig em0 -tso).
2. Disable TSO on fxp(4) and check whether you see the same issue (ifconfig fxp0 -tso).
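The two steps above could be scripted roughly as follows. This is only a sketch: the function names are made up, the interface names are whatever the caller passes in, and setting IFCONFIG=echo previews the commands without touching the system.

```sh
#!/bin/sh
# Sketch of the per-interface test; all function names are hypothetical.
# IFCONFIG can be overridden (e.g. IFCONFIG=echo) to preview the commands.
IFCONFIG=${IFCONFIG:-ifconfig}

# tso_off IFACE: disable TSO on one interface only.
tso_off() {
    $IFCONFIG "$1" -tso
}

# tso_on IFACE: re-enable TSO on that interface.
tso_on() {
    $IFCONFIG "$1" tso
}

# tso_isolate IFACE...: turn TSO off on one interface at a time, wait for
# the operator to repeat the slow transfer, then turn TSO back on.
tso_isolate() {
    for ifname in "$@"; do
        tso_off "$ifname"
        printf 'TSO off on %s; repeat the transfer, then press Enter: ' "$ifname"
        read -r _answer
        tso_on "$ifname"
    done
}
```

Running "tso_isolate em0 fxp0" as root would walk through both interfaces; whichever step makes the transfer fast again points at the driver with the problem. Restoring TSO after each step keeps the two tests independent of each other.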
On Thu, 21 May 2009, Pyun YongHyeon wrote:
> You should have cvs rev. 1.266.2.15 of if_fxp.c.
> [...]
> Since you use both em(4) and fxp(4) I'd like to know which driver
> has the issue. Instead of disabling TSO of network stack try
> disabling TSO for each interface. For instance,
> 1. Disable TSO of em(4) and check you see the same issue
>    (ifconfig em0 -tso).
> 2. Disable TSO of fxp(4) and check you see the same issue
>    (ifconfig fxp0 -tso).
The version of if_fxp.c is in fact 1.266.2.15.
Connecting to the FreeBSD box from a PC with a bash shell under XP SP3/Cygwin OpenSSH, I find:
(1) disabling "tso" on the internal "em0" interface has no effect; but
(2) disabling "tso" on the external "fxp0" interface eliminates the throughput problem. The effect appears to be the same as using sysctl to disable tso on all interfaces.
With "tso" enabled on the "fxp0" interface the connection (reading email using "pine" in a large window) hung completely.
There are no errors in "netstat -i" nor in /var/log/messages.
"netstat -e" on the XP PC shows no discards or errors; however, I don't think I've ever seen a PC under Windows admit to network errors.
The fxp0 interface connects to a Comcast cable modem, which eventually connects to my office PC which is in the "iga.in.gov" domain hosted by TimeWarner.
On Fri, May 22, 2009 at 03:50:07PM -0400, Michael L. Squires wrote:
> [...]
> (1) disabling "tso" on the internal "em0" interface has no effect; but
>
> (2) disabling "tso" on the external "fxp0" interface eliminates the
> throughput problem. The effect appears to be the same as using sysctl to
> disable tso on all interfaces.
>
> With "tso" enabled on the "fxp0" interface the connection (reading email
> using "pine" in a large window) hung completely.
> [...]
> I'll be happy to run anything else you want.
Would you capture the failing TCP session with tcpdump and mail me the URL of the captured file (off-list)?
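One possible way to take such a capture is sketched below. The function name is made up, and the interface, peer address and output file are supplied by the caller; setting TCPDUMP=echo previews the command instead of running it.

```sh
#!/bin/sh
# Sketch only; the function name is hypothetical.
# TCPDUMP can be overridden (e.g. TCPDUMP=echo) to preview the command.
TCPDUMP=${TCPDUMP:-tcpdump}

# capture_session IFACE PEER FILE: write full-size packets of the TCP
# traffic to or from PEER on IFACE into FILE until interrupted.
capture_session() {
    $TCPDUMP -i "$1" -s 0 -w "$3" tcp and host "$2"
}
```

For example, "capture_session fxp0 192.0.2.10 slow-session.pcap" (run as root, with the address replaced by the real peer) records the session to a file; stop it with Ctrl-C once the stalled transfer has been reproduced.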