If it is a TSO segment, that indicates to me that the code in tcp_output() that should
generate a TSO segment no greater than 65535 bytes in length is busted.
And this would imply just about any app doing large sosend()s could cause
this, I think? (NFS read replies/write requests of 64K would be one of them.)
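Just to make "large sosend()s" concrete, here is a trivial userland sketch;
the address and port are made up for the example, and a single zero-filled
64K write stands in for an NFS read reply:

#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>
#include <unistd.h>

int
main(void)
{
	static char buf[65536];		/* one 64K send, like a 64K NFS read reply */
	struct sockaddr_in sin;
	int s;

	s = socket(AF_INET, SOCK_STREAM, 0);
	memset(&sin, 0, sizeof(sin));
	sin.sin_len = sizeof(sin);
	sin.sin_family = AF_INET;
	sin.sin_port = htons(9999);			/* made-up port */
	sin.sin_addr.s_addr = inet_addr("10.0.0.1");	/* made-up address */
	if (s != -1 && connect(s, (struct sockaddr *)&sin, sizeof(sin)) == 0)
		(void)write(s, buf, sizeof(buf));	/* may be sent as one TSO burst */
	if (s != -1)
		close(s);
	return (0);
}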
rick
>
>
>
> Just to clarify, I'm experiencing this error with NFS, but also with
> iSCSI - I turned off my NFS server in rc.conf and rebooted, and I'm
> still able to create the error. This is not just a NFS issue on my
> machine.
>
>
>
> In our case, when it happens, the problem persists for quite some time
> (minutes or hours) if we don’t interact (ifconfig or reboot).
>
>
>
> The first few times that I ran into it, I had similar issues,
> because I was keeping my system up and treating it like a temporary
> problem. The worst case resulted in reboots to reset the
> NIC. Then again, I find the ix's to be cranky if you ifconfig them
> too much.
>
> Now, I'm trying to find a root cause, so as soon as I start seeing
> any errors, I abort and reboot the machine to test the next theory.
>
>
> Additionally, I'm often able to create the problem with just 1 VM
> running iometer on the SAN storage. When the problem occurs, that
> connection is broken temporarily, taking network load off the SAN -
> That may improve my chances of keeping this running.
>
>
>
>
>
> > I am able to reproduce it fairly reliably within 15 min of a reboot
> > by
> > loading the server via NFS with iometer and some large NFS file
> > copies at
> > the same time. I seem to need to sustain ~2 Gbps for a few minutes.
>
> That’s probably why we can’t reproduce it reliably here. Although
> our blade servers have 10gig cards, the affected ones are
> connected to a 1gig switch.
>
>
>
>
>
> It seems that it needs a lot of traffic. I have a 10 gig backbone
> between my SANs and my ESXi machines, so I can saturate quite
> quickly (just now I hit a record: the error occurred within ~5 min
> of reboot and testing). In your case, I recommend firing up multiple
> VMs running iometer on different 1 gig connections and seeing if you
> can make it pop. I also often turn off ix1 to drive all traffic
> through ix0 - I've noticed it happens faster this way, but once
> again I'm not taking enough observations to make decent time
> predictions.
>
>
>
>
>
>
> Can you try this when the problem occurs?
>
> for CPU in {0..7}; do echo "CPU${CPU}"; cpuset -l ${CPU} ping -i 0.2
> -c 2 -W 1 10.0.0.1 | grep sendto; done
>
> It will tie ping to certain cpus to test the different tx queues of
> your ix interface. If the pings reliably fail only on some queues,
> then your problem is more likely to be the same as ours.
>
> Also, if you have dtrace available:
>
> kldload dtraceall
> dtrace -n 'fbt:::return / arg1 == EFBIG && execname == "ping" / {
> stack(); }'
>
> while you run pings over the interface affected. This will give you
> hints about where the EFBIG error comes from.
>
> > […]
>
>
> Markus
>
>
>
>
> Will do. I'm not sure what shell the first script was written for;
> it's not working in csh. Here's a re-write that does work in csh, in
> case others are using the default shell:
>
> #!/bin/csh
> foreach CPU (`seq 0 23`)
> echo "CPU$CPU";
> cpuset -l $CPU ping -i 0.2 -c 2 -W 1 10.0.0.1 | grep sendto;
> end
>
>
> Thanks for your input. I should have results to post to the list
> shortly.
>
>
The only explanation I can think of for this is that there might be
another net interface driver stacked on top of the ixgbe.c one and
that the setting doesn't get propagated up.
Does this make any sense?
IP_MAXPACKET can't be changed from 65535, but I can see an argument
for setting the default value of if_hw_tsomax to a smaller value.
For example, in sys/net/if.c change it from:
657	if (ifp->if_hw_tsomax == 0)
658		ifp->if_hw_tsomax = IP_MAXPACKET;
to
657	if (ifp->if_hw_tsomax == 0)
658		ifp->if_hw_tsomax = 65536 - (ETHER_HDR_LEN + ETHER_VLAN_ENCAP_LEN);
This is a slightly smaller default which won't have much impact unless
the hardware device can only handle 32 mbuf clusters for transmit of
a segment and there are several of those.
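For comparison, a driver could also pin the value itself before calling
ether_ifattach(), so that the if.c default above never applies. A minimal
sketch only; the surrounding attach code and the lladdr variable are
assumptions, not text copied from ixgbe.c:

	/* assumed attach-time fragment, not the actual ixgbe.c code */
	ifp->if_hw_tsomax = 65536 - (ETHER_HDR_LEN + ETHER_VLAN_ENCAP_LEN); /* 65518 */
	ether_ifattach(ifp, lladdr);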
Christopher, can you do your test run with IP_MAXPACKET set to 65518,
which should be the same as the above? If that gets rid of all the
EFBIG error replies, then I think the above patch will have the same
effect.
Thanks, rick
>
> > However, this is still a TSO related issue, it's just not one
> > related to
> > the setting of TSO's max size.
> >
> > A 10.0-STABLE system with tso disabled on ix0 doesn't have a single
> > packet
> > over IP_MAXPACKET in 1 hour of runtime. I'll let it go a bit longer
> > to
> > increase confidence in this assertion, but I don't want to waste
> > time on
> > this when I could be logging problem packets on a system with TSO
> > enabled.
> >
> > Comments are very welcome..
rick
>
> Markus
>
>
> >
> > rick
> >>
> >> 10.0 Code:
> >>
> >> 780 		if (len > tp->t_tsomax - hdrlen) {	!!
> >> 781 			len = tp->t_tsomax - hdrlen;	!!
> >> 782 			sendalot = 1;
> >> 783 		}
> >>
> >>
> >>
> >>
> >> I've put debugging here, set the nic's max TSO as per Rick's patch
> >> (set to, say, 32k), and have seen that tp->t_tsomax == IP_MAXPACKET.
> >> It's being set someplace else, and thus our attempts to set TSO on
> >> the nic may be in vain.
> >>
> >>
> >> It may have mattered more in 9.2, as I see the code doesn't use
> >> tp->t_tsomax in some locations, and may actually default to what
> >> the
> >> nic is set to.
> >>
> >> The NIC may still win, I didn't walk through the code to confirm,
> >> it
> >> was enough to suggest to me that setting TSO wouldn't fix this
> >> issue.
> >>
> >>
> I’m booting more systems with the test kernel and I will be watching
> all of them with dtrace to see if I find an occurrence where
> tp->t_tsomax is off. I hope that with more systems, I’ll have an
> answer more quickly.
>
> But digging around the code, I still don’t see how tp->t_tsomax
> could not have been set from if_hw_tsomax when there are no stacked
> interfaces…
>
It seems to happen where you mentioned before. Since it only gets set
from cap.tsomax, and that gets set from if_hw_tsomax, it would be 0
otherwise. Christopher sees a change when he changes IP_MAXPACKET, so
the default setting works, but for him setting it in the driver didn't,
for some reason?
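For reference, here is a tiny userland model of that propagation, just to
make the "set from if_hw_tsomax, 0 otherwise" point concrete. The struct and
field names only mirror the kernel ones; this is an illustration under that
assumption, not kernel code:

#include <stdio.h>

#define IP_MAXPACKET	65535

struct fake_ifnet { unsigned int if_hw_tsomax; };
struct fake_ifcap { unsigned int tsomax; };
struct fake_tcpcb { unsigned int t_tsomax; };

int
main(void)
{
	struct fake_ifnet ifp = { 0 };	/* driver left if_hw_tsomax unset */
	struct fake_ifcap cap;
	struct fake_tcpcb tp;

	/* attach path: apply the default when the driver didn't set one */
	if (ifp.if_hw_tsomax == 0)
		ifp.if_hw_tsomax = IP_MAXPACKET;

	/* the TCP code then copies the limit down to the connection */
	cap.tsomax = ifp.if_hw_tsomax;
	tp.t_tsomax = cap.tsomax;

	printf("t_tsomax = %u\n", tp.t_tsomax);	/* 65535 with the stock default */
	return (0);
}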
Thanks for doing the testing, rick
> > - Boot in a NON LAGG environment. ix0 only.
> >
> > ixgbe's printf is showing packets up to 65530. Haven't run long
> > enough yet
> > to see if anything will go over 65535
> >
With the Ethernet header length included, it can be <= 65536, because that
is 32 * MCLBYTES.
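A quick arithmetic check of that, as a standalone snippet with MCLBYTES
assumed to be 2048 on this hardware:

#include <stdio.h>

#define MCLBYTES		2048	/* assumed mbuf cluster size */
#define ETHER_HDR_LEN		14
#define ETHER_VLAN_ENCAP_LEN	4

int
main(void)
{
	int frame = 32 * MCLBYTES;	/* 65536: 32 clusters per transmit */
	int tcpip = frame - (ETHER_HDR_LEN + ETHER_VLAN_ENCAP_LEN);	/* 65518 */

	printf("32 * MCLBYTES = %d, minus Ethernet/VLAN header = %d\n",
	    frame, tcpip);
	return (0);
}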
rick