
icmp packets on em larger than 1472


Kirill Yelizarov

Nov 10, 2010, 7:42:24 AM
to freebsd...@freebsd.org
Hi,

All my em cards running 8.1-STABLE don't reply to ICMP echo request packets carrying more than 1472 data bytes.

On stable 7.2 the same hardware works as expected:
# ping -s 1500 192.168.64.99
PING 192.168.64.99 (192.168.64.99): 1500 data bytes
1508 bytes from 192.168.64.99: icmp_seq=0 ttl=63 time=1.249 ms
1508 bytes from 192.168.64.99: icmp_seq=1 ttl=63 time=1.158 ms

Here is the dump on em interface
15:06:31.452043 IP 192.168.66.65 > *****: ICMP echo request, id 28729, seq 5, length 1480
15:06:31.452047 IP 192.168.66.65 > ****: icmp
15:06:31.452069 IP **** > 192.168.66.65: ICMP echo reply, id 28729, seq 5, length 1480
15:06:31.452071 IP *** > 192.168.66.65: icmp

The same ping from the same source (an 8.1-STABLE machine with an fxp interface) to an em card running 8.1-STABLE:
# pciconf -lv
em0@pci0:3:4:0: class=0x020000 card=0x10798086 chip=0x10798086 rev=0x03 hdr=0x00
vendor = 'Intel Corporation'
device = 'Dual Port Gigabit Ethernet Controller (82546EB)'
class = network
subclass = ethernet

# ping -s 1472 192.168.64.200
PING 192.168.64.200 (192.168.64.200): 1472 data bytes
1480 bytes from 192.168.64.200: icmp_seq=0 ttl=63 time=0.848 ms
^C

# ping -s 1473 192.168.64.200
PING 192.168.64.200 (192.168.64.200): 1473 data bytes
^C
--- 192.168.64.200 ping statistics ---
4 packets transmitted, 0 packets received, 100.0% packet loss

And here is its dump on the em card:
15:11:15.191496 IP 192.168.66.65 > *****: ICMP echo request, id 33593, seq 0, length 1480
15:11:15.191534 IP 192.168.66.65 > *****: icmp
15:11:16.192119 IP 192.168.66.65 > *****: ICMP echo request, id 33593, seq 1, length 1480
15:11:16.192156 IP 192.168.66.65 > ******: icmp

igb cards on 8.1 stable are not affected

Regards,
Kirill



_______________________________________________
freebsd...@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stabl...@freebsd.org"

Jeremy Chadwick

Nov 10, 2010, 8:00:23 AM
to Kirill Yelizarov, freebsd...@freebsd.org, Jack Vogel

Please provide uname -a output from the machine with the emX devices, as
well as relevant emX information from "dmesg" (e.g. driver version).
"sysctl dev.em.X" might also be helpful.

Thanks.

--
| Jeremy Chadwick j...@parodius.com |
| Parodius Networking http://www.parodius.com/ |
| UNIX Systems Administrator Mountain View, CA, USA |
| Making life hard for others since 1977. PGP: 4BD6C0CB |

Kirill Yelizarov

Nov 10, 2010, 8:56:34 AM
to freebsd...@freebsd.org, Jeremy Chadwick

--- On Wed, 11/10/10, Jeremy Chadwick <fre...@jdc.parodius.com> wrote:

Here are the two examples

uname -a
FreeBSD border1 8.1-STABLE FreeBSD 8.1-STABLE #0: Thu Aug 26 16:54:15 MSD 2010 root@border1:/usr/obj/usr/src/sys/BORDER1 amd64

Oct 22 14:36:18 border1 kernel: em0: <Intel(R) PRO/1000 Legacy Network Connection 1.0.1> port 0xdc00-0xdc3f mem 0xfcfc0000-0xfcfdffff irq 54 at device 4.0 on pci3
Oct 22 14:36:18 border1 kernel: em0: [FILTER]
Oct 22 14:36:18 border1 kernel: em0: Ethernet address: 00:04:23:cc:df:ea
Oct 22 14:36:18 border1 kernel: em1: <Intel(R) PRO/1000 Legacy Network Connection 1.0.1> port 0xdc80-0xdcbf mem 0xfcfe0000-0xfcffffff irq 55 at device 4.1 on pci3
Oct 22 14:36:18 border1 kernel: em1: [FILTER]
Oct 22 14:36:18 border1 kernel: em1: Ethernet address: 00:04:23:cc:df:eb

dev.em.0.%desc: Intel(R) PRO/1000 Network Connection 7.0.5
dev.em.0.%driver: em
dev.em.0.%location: slot=0 function=0 handle=\_SB_.PCI0.MRP1.HART
dev.em.0.%pnpinfo: vendor=0x8086 device=0x10d3 subvendor=0x8086 subdevice=0x34da class=0x020000
dev.em.0.%parent: pci1
dev.em.0.nvm: -1
dev.em.0.rx_int_delay: 66
dev.em.0.tx_int_delay: 66
dev.em.0.rx_abs_int_delay: 250
dev.em.0.tx_abs_int_delay: 250
dev.em.0.rx_processing_limit: -1
dev.em.0.link_irq: 0
dev.em.0.mbuf_alloc_fail: 0
dev.em.0.cluster_alloc_fail: 0
dev.em.0.dropped: 0
dev.em.0.tx_dma_fail: 0
dev.em.0.rx_overruns: 0
dev.em.0.watchdog_timeouts: 0
dev.em.0.device_control: 1477444168
dev.em.0.rx_control: 67141634
dev.em.0.fc_high_water: 18432
dev.em.0.fc_low_water: 16932
dev.em.0.queue0.txd_head: 2757
dev.em.0.queue0.txd_tail: 2758
dev.em.0.queue0.tx_irq: 0
dev.em.0.queue0.no_desc_avail: 0
dev.em.0.queue0.rxd_head: 1419
dev.em.0.queue0.rxd_tail: 1418
dev.em.0.queue0.rx_irq: 0
dev.em.0.mac_stats.excess_coll: 0
dev.em.0.mac_stats.single_coll: 0
dev.em.0.mac_stats.multiple_coll: 0
dev.em.0.mac_stats.late_coll: 0
dev.em.0.mac_stats.collision_count: 0
dev.em.0.mac_stats.symbol_errors: 0
dev.em.0.mac_stats.sequence_errors: 0
dev.em.0.mac_stats.defer_count: 0
dev.em.0.mac_stats.missed_packets: 0
dev.em.0.mac_stats.recv_no_buff: 0
dev.em.0.mac_stats.recv_undersize: 0
dev.em.0.mac_stats.recv_fragmented: 0
dev.em.0.mac_stats.recv_oversize: 0
dev.em.0.mac_stats.recv_jabber: 0
dev.em.0.mac_stats.recv_errs: 0
dev.em.0.mac_stats.crc_errs: 0
dev.em.0.mac_stats.alignment_errs: 0
dev.em.0.mac_stats.coll_ext_errs: 0
dev.em.0.mac_stats.xon_recvd: 0
dev.em.0.mac_stats.xon_txd: 0
dev.em.0.mac_stats.xoff_recvd: 0
dev.em.0.mac_stats.xoff_txd: 0
dev.em.0.mac_stats.total_pkts_recvd: 1534369705
dev.em.0.mac_stats.good_pkts_recvd: 1534369705
dev.em.0.mac_stats.bcast_pkts_recvd: 197891
dev.em.0.mac_stats.mcast_pkts_recvd: 0
dev.em.0.mac_stats.rx_frames_64: 1528844
dev.em.0.mac_stats.rx_frames_65_127: 466039874
dev.em.0.mac_stats.rx_frames_128_255: 363351691
dev.em.0.mac_stats.rx_frames_256_511: 34424761
dev.em.0.mac_stats.rx_frames_512_1023: 53013458
dev.em.0.mac_stats.rx_frames_1024_1522: 616011077
dev.em.0.mac_stats.good_octets_recvd: 1076352218193
dev.em.0.mac_stats.good_octets_txd: 222914134983
dev.em.0.mac_stats.total_pkts_txd: 1750421340
dev.em.0.mac_stats.good_pkts_txd: 1750421339
dev.em.0.mac_stats.bcast_pkts_txd: 995
dev.em.0.mac_stats.mcast_pkts_txd: 0
dev.em.0.mac_stats.tx_frames_64: 591494
dev.em.0.mac_stats.tx_frames_65_127: 1309064841
dev.em.0.mac_stats.tx_frames_128_255: 320875656
dev.em.0.mac_stats.tx_frames_256_511: 84663967
dev.em.0.mac_stats.tx_frames_512_1023: 14057851
dev.em.0.mac_stats.tx_frames_1024_1522: 21167531
dev.em.0.mac_stats.tso_txd: 0
dev.em.0.mac_stats.tso_ctx_fail: 0
dev.em.0.interrupts.asserts: 1080194011
dev.em.0.interrupts.rx_pkt_timer: 96628
dev.em.0.interrupts.rx_abs_timer: 0
dev.em.0.interrupts.tx_pkt_timer: 36686
dev.em.0.interrupts.tx_abs_timer: 4
dev.em.0.interrupts.tx_queue_empty: 0
dev.em.0.interrupts.tx_queue_min_thresh: 0
dev.em.0.interrupts.rx_desc_min_thresh: 0
dev.em.0.interrupts.rx_overrun: 0
dev.em.1.%desc: Intel(R) PRO/1000 Network Connection 7.0.5
dev.em.1.%driver: em
dev.em.1.%location: slot=25 function=0 handle=\_SB_.PCI0.ILAN
dev.em.1.%pnpinfo: vendor=0x8086 device=0x10cc subvendor=0x8086 subdevice=0x34da class=0x020000
dev.em.1.%parent: pci0
dev.em.1.nvm: -1
dev.em.1.rx_int_delay: 66
dev.em.1.tx_int_delay: 66
dev.em.1.rx_abs_int_delay: 250
dev.em.1.tx_abs_int_delay: 250
dev.em.1.rx_processing_limit: -1
dev.em.1.link_irq: 0
dev.em.1.mbuf_alloc_fail: 0
dev.em.1.cluster_alloc_fail: 0
dev.em.1.dropped: 0
dev.em.1.tx_dma_fail: 0
dev.em.1.rx_overruns: 0
dev.em.1.watchdog_timeouts: 0
dev.em.1.device_control: 1477444160
dev.em.1.rx_control: 67141634
dev.em.1.fc_high_water: 8192
dev.em.1.fc_low_water: 6692
dev.em.1.queue0.txd_head: 3081
dev.em.1.queue0.txd_tail: 3081
dev.em.1.queue0.tx_irq: 0
dev.em.1.queue0.no_desc_avail: 0
dev.em.1.queue0.rxd_head: 2535
dev.em.1.queue0.rxd_tail: 2534
dev.em.1.queue0.rx_irq: 0
dev.em.1.mac_stats.excess_coll: 0
dev.em.1.mac_stats.single_coll: 665694
dev.em.1.mac_stats.multiple_coll: 238794
dev.em.1.mac_stats.late_coll: 591710
dev.em.1.mac_stats.collision_count: 1262634
dev.em.1.mac_stats.symbol_errors: 0
dev.em.1.mac_stats.sequence_errors: 0
dev.em.1.mac_stats.defer_count: 62957046
dev.em.1.mac_stats.missed_packets: 0
dev.em.1.mac_stats.recv_no_buff: 0
dev.em.1.mac_stats.recv_undersize: 0
dev.em.1.mac_stats.recv_fragmented: 0
dev.em.1.mac_stats.recv_oversize: 0
dev.em.1.mac_stats.recv_jabber: 0
dev.em.1.mac_stats.recv_errs: 0
dev.em.1.mac_stats.crc_errs: 0
dev.em.1.mac_stats.alignment_errs: 0
dev.em.1.mac_stats.coll_ext_errs: 0
dev.em.1.mac_stats.xon_recvd: 0
dev.em.1.mac_stats.xon_txd: 0
dev.em.1.mac_stats.xoff_recvd: 0
dev.em.1.mac_stats.xoff_txd: 0
dev.em.1.mac_stats.total_pkts_recvd: 143055129
dev.em.1.mac_stats.good_pkts_recvd: 143055129
dev.em.1.mac_stats.bcast_pkts_recvd: 19788
dev.em.1.mac_stats.mcast_pkts_recvd: 0
dev.em.1.mac_stats.rx_frames_64: 0
dev.em.1.mac_stats.rx_frames_65_127: 0
dev.em.1.mac_stats.rx_frames_128_255: 0
dev.em.1.mac_stats.rx_frames_256_511: 0
dev.em.1.mac_stats.rx_frames_512_1023: 0
dev.em.1.mac_stats.rx_frames_1024_1522: 0
dev.em.1.mac_stats.good_octets_recvd: 35245157589
dev.em.1.mac_stats.good_octets_txd: 175509471230
dev.em.1.mac_stats.total_pkts_txd: 210873641
dev.em.1.mac_stats.good_pkts_txd: 210873641
dev.em.1.mac_stats.bcast_pkts_txd: 151
dev.em.1.mac_stats.mcast_pkts_txd: 0
dev.em.1.mac_stats.tx_frames_64: 0
dev.em.1.mac_stats.tx_frames_65_127: 0
dev.em.1.mac_stats.tx_frames_128_255: 0
dev.em.1.mac_stats.tx_frames_256_511: 0
dev.em.1.mac_stats.tx_frames_512_1023: 0
dev.em.1.mac_stats.tx_frames_1024_1522: 0
dev.em.1.mac_stats.tso_txd: 0
dev.em.1.mac_stats.tso_ctx_fail: 0
dev.em.1.interrupts.asserts: 264725703
dev.em.1.interrupts.rx_pkt_timer: 0
dev.em.1.interrupts.rx_abs_timer: 0
dev.em.1.interrupts.tx_pkt_timer: 0
dev.em.1.interrupts.tx_abs_timer: 0
dev.em.1.interrupts.tx_queue_empty: 0
dev.em.1.interrupts.tx_queue_min_thresh: 0
dev.em.1.interrupts.rx_desc_min_thresh: 0
dev.em.1.interrupts.rx_overrun: 0
dev.em.1.wake: 0

uname -a
FreeBSD web2 8.1-STABLE FreeBSD 8.1-STABLE #0: Thu Oct 21 17:24:16 MSD 2010 root@flash-srv:/usr/obj/nanobsd.WEB2_C7899_H16_S63/usr/src/sys/WEB2 amd64

Oct 22 13:06:51 web2 kernel: em0: <Intel(R) PRO/1000 Network Connection 7.0.5> port 0x1000-0x101f mem 0xb1a00000-0xb1a1ffff,0xb1900000-0xb19fffff,0xb1a20000-0xb1a23fff irq 28 at device 0.0 on pci1
Oct 22 13:06:51 web2 kernel: em0: Using MSI interrupt
Oct 22 13:06:51 web2 kernel: em0: [FILTER]
Oct 22 13:06:51 web2 kernel: em0: Ethernet address: 00:15:17:ac:e5:bd
Oct 22 13:06:51 web2 kernel: em1: <Intel(R) PRO/1000 Network Connection 7.0.5> port 0x20e0-0x20ff mem 0xb1b00000-0xb1b1ffff,0xb1b43000-0xb1b43fff irq 20 at device 25.0 on pci0
Oct 22 13:06:51 web2 kernel: em1: Using MSI interrupt
Oct 22 13:06:51 web2 kernel: em1: [FILTER]
Oct 22 13:06:51 web2 kernel: em1: Ethernet address: 00:15:17:ac:e5:bc
Oct 22 13:06:51 web2 kernel: Starting Network: lo0 em0 em1.
Oct 22 13:06:51 web2 kernel: em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
Oct 22 13:06:51 web2 kernel: em1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
Oct 22 13:06:53 web2 kernel: em1: link state changed to UP
Oct 22 13:06:54 web2 kernel: em0: link state changed to UP


dev.em.0.%desc: Intel(R) PRO/1000 Legacy Network Connection 1.0.1
dev.em.0.%driver: em
dev.em.0.%location: slot=4 function=0
dev.em.0.%pnpinfo: vendor=0x8086 device=0x1079 subvendor=0x8086 subdevice=0x1079 class=0x020000
dev.em.0.%parent: pci3
dev.em.0.debug: -1
dev.em.0.stats: -1
dev.em.0.rx_int_delay: 66
dev.em.0.tx_int_delay: 66
dev.em.0.rx_abs_int_delay: 250
dev.em.0.tx_abs_int_delay: 250
dev.em.0.rx_processing_limit: -1
dev.em.1.%desc: Intel(R) PRO/1000 Legacy Network Connection 1.0.1
dev.em.1.%driver: em
dev.em.1.%location: slot=4 function=1
dev.em.1.%pnpinfo: vendor=0x8086 device=0x1079 subvendor=0x8086 subdevice=0x1079 class=0x020000
dev.em.1.%parent: pci3
dev.em.1.debug: -1
dev.em.1.stats: -1
dev.em.1.rx_int_delay: 66
dev.em.1.tx_int_delay: 66
dev.em.1.rx_abs_int_delay: 250
dev.em.1.tx_abs_int_delay: 250
dev.em.1.rx_processing_limit: -1

Kirill

Jack Vogel

Nov 10, 2010, 1:49:52 PM
to Kirill Yelizarov, freebsd...@freebsd.org, Jeremy Chadwick
Try the code from HEAD; I've run that on an 82546 and it worked OK.

Jack

Kevin Oberman

Nov 10, 2010, 11:53:28 PM
to Kirill Yelizarov, freebsd...@freebsd.org
> Date: Wed, 10 Nov 2010 04:21:12 -0800 (PST)
> From: Kirill Yelizarov <yki...@yahoo.com>
> Sender: owner-free...@freebsd.org

I'm unsure why it ever worked. Was the interface MTU set to 1500
(default) under V7?

Most ping programs (including FreeBSD's) send the specified number of DATA
bytes. Add the ICMP header (8 bytes) and the IP header (20 bytes) to
1472 and you get 1500, the largest packet that should work. It even
gives you a hint as, for a reason I have never understood, the program
includes the ICMP header in the displayed packet size (1480/1508).

Also, the IP MTU of 1500 does not include the Ethernet framing (12
bytes of addresses and two bytes of EtherType) or the CRC.

If the igb is allowing over 1500 bytes of IP packet through, assuming
the MTU has not been increased from the standard 1500, something is
clearly broken. 1472 is the right answer and 1500 (or 1473) is not.
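
The size arithmetic in this message can be checked with a short Python sketch. It assumes an IPv4 header without options (20 bytes) and an untagged Ethernet frame; the function names are made up for illustration and are not part of any FreeBSD tool.

```python
# Header sizes in bytes; the IPv4 header is assumed to carry no options.
ICMP_HEADER = 8
IPV4_HEADER = 20
ETHERNET_HEADER = 14   # 6 + 6 bytes of MAC addresses, 2 bytes of EtherType
ETHERNET_CRC = 4


def ip_packet_size(ping_data_bytes):
    """Size of the IP packet built for `ping -s ping_data_bytes`."""
    return ping_data_bytes + ICMP_HEADER + IPV4_HEADER


def fits_in_mtu(ping_data_bytes, mtu=1500):
    """True if the echo request fits in one packet without fragmentation."""
    return ip_packet_size(ping_data_bytes) <= mtu


def ethernet_frame_size(ip_size):
    """On-wire size of an untagged Ethernet frame carrying ip_size bytes."""
    return ip_size + ETHERNET_HEADER + ETHERNET_CRC


for size in (1472, 1473, 1500):
    print(size, ip_packet_size(size), fits_in_mtu(size))
```

With the default MTU of 1500, 1472 data bytes is the largest ping that fits in a single packet; anything larger has to be fragmented or, with the don't-fragment bit set, refused.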
--
R. Kevin Oberman, Network Engineer
Energy Sciences Network (ESnet)
Ernest O. Lawrence Berkeley National Laboratory (Berkeley Lab)
E-mail: obe...@es.net Phone: +1 510 486-8634
Key fingerprint:059B 2DDF 031C 9BA3 14A4 EADA 927D EBB3 987B 3751

Wilkinson, Alex

Nov 11, 2010, 12:02:40 AM
to freebsd...@freebsd.org

works fine for me:

FreeBSD 8.1-STABLE #0 r213395

em0@pci0:0:25:0: class=0x020000 card=0x3035103c chip=0x10de8086 rev=0x02 hdr=0x00
vendor = 'Intel Corporation'
device = 'Intel Gigabit network connection (82567LM-3 )'
class = network
subclass = ethernet

#ping -s 1473 host
PING host(192.168.1.1): 1473 data bytes
1481 bytes from 192.168.1.1: icmp_seq=0 ttl=253 time=31.506 ms
1481 bytes from 192.168.1.1: icmp_seq=1 ttl=253 time=31.493 ms
1481 bytes from 192.168.1.1: icmp_seq=2 ttl=253 time=31.550 ms
^C

-Alex


Kevin Oberman

Nov 11, 2010, 12:28:17 AM
to Wilkinson, Alex, freebsd...@freebsd.org
> Date: Thu, 11 Nov 2010 13:01:26 +0800
> From: "Wilkinson, Alex" <alex.wi...@dsto.defence.gov.au>
> Sender: owner-free...@freebsd.org

The reason the '-s 1500' worked was that the packets were fragmented. If
I add the '-D' option, '-s 1473' fails on v7 and v8. Are the V8 systems
where you see it failing without the '-D' on the same network segment?
If not, it is likely that an intervening device is refusing to fragment
the packet. (Some routers deliberately don't fragment ICMP Echo Request
packets.)
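
How an over-sized echo request gets split can be sketched in Python. This is a minimal model, assuming a 20-byte IPv4 header and the standard rule that every fragment except the last carries a multiple of 8 bytes; `fragment_sizes` is a hypothetical helper, not a kernel function.

```python
def fragment_sizes(ping_data_bytes, mtu=1500, ip_header=20, icmp_header=8):
    """Return (offset, length) for each fragment of the IP payload.

    Every fragment except the last must carry a multiple of 8 bytes,
    because the IPv4 fragment offset field counts 8-byte units.
    """
    payload = ping_data_bytes + icmp_header
    per_fragment = (mtu - ip_header) // 8 * 8
    fragments = []
    offset = 0
    while offset < payload:
        length = min(per_fragment, payload - offset)
        fragments.append((offset, length))
        offset += length
    return fragments


for size in (1472, 1473, 1500):
    print(size, fragment_sizes(size))
```

For both '-s 1473' and '-s 1500' the first fragment carries 1480 bytes, which matches the "length 1480" followed by a bare "icmp" line in the dumps from the original report.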


Kirill Yelizarov

Nov 11, 2010, 2:51:13 AM
to freebsd...@freebsd.org

--- On Thu, 11/11/10, Kevin Oberman <obe...@es.net> wrote:

If I set -D -s 1473, the sender side refuses to ping, and that is correct. All the machines mentioned above are behind the same router and switch. The same hardware running v7 works while v8 does not, and I never saw such problems before. Also, correct me if I'm wrong, but the dump shows that the packet arrived. I'll try the driver from HEAD and post the results here.

Kirill

Kirill Yelizarov

Nov 11, 2010, 6:48:14 AM
to freebsd...@freebsd.org

--- On Thu, 11/11/10, Kirill Yelizarov <yki...@yahoo.com> wrote:

Shame on me! It was pf: I had disabled scrubbing. Either of these two methods works:

1.
scrub in all
icmp_types = "{0, 3, 4, 8, 11 }"
pass out quick on $inside_if proto icmp from $inside_ip to any icmp-type $icmp_types no state
pass in quick on $inside_if proto icmp from any to $inside_ip icmp-type $icmp_types no state

2.
pass out quick on $inside_if proto icmp from $inside_ip to any no state
pass in quick on $inside_if proto icmp from any to $inside_ip no state
This works without scrubbing.

Keep state also works.

I disabled scrubbing because it seems to slow down NFS (I'm not sure if this is right), and I specified the ICMP types I want to use. What am I doing wrong with the firewall ICMP rules? Tcpdump shows only echo requests and replies.
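
One detail that may bear on this question: only the first fragment of a fragmented echo request carries the ICMP header, so the type field is simply not present in later fragments. The Python sketch below shows that layout. Whether pf applies the icmp-type rules to fragments in exactly this way when scrub is off is an assumption and is not confirmed anywhere in this thread; all function names are hypothetical.

```python
import struct


def icmp_echo_request(data_len, ident=1, seq=0):
    """Build an ICMP echo request: 8-byte header plus data_len data bytes.

    The checksum is left at zero because only the layout matters here.
    """
    header = struct.pack("!BBHHH", 8, 0, 0, ident, seq)  # type 8 = echo request
    return header + bytes(data_len)


def split_ip_payload(payload, mtu=1500, ip_header=20):
    """Split an IP payload into (byte_offset, chunk) fragments."""
    step = (mtu - ip_header) // 8 * 8
    return [(off, payload[off:off + step]) for off in range(0, len(payload), step)]


def visible_icmp_type(byte_offset, chunk):
    """ICMP type a per-fragment filter could read, or None if it is absent."""
    return chunk[0] if byte_offset == 0 and chunk else None


fragments = split_ip_payload(icmp_echo_request(1473))
for off, chunk in fragments:
    print(off, len(chunk), visible_icmp_type(off, chunk))
```

Under that assumption, reassembling fragments before filtering (scrub), or passing ICMP without a type match (method 2), would both let the second fragment through.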

I also compiled the new driver from HEAD. It works like the old one. And the firewall with igb has scrubbing enabled.

Kirill

Kevin Oberman

Nov 11, 2010, 11:12:11 AM
to Kirill Yelizarov, freebsd...@freebsd.org, n...@freebsd.org
> Date: Wed, 10 Nov 2010 23:49:56 -0800 (PST)
> From: Kirill Yelizarov <yki...@yahoo.com>

I did a bit more looking at this today and I see that something bogus is
going on and it MAY be the em driver.

I tried 1473 data byte pings without the DF flag. I then captured the
packets on both ends (the sending system has a bge (Broadcom GE) card
and the responding end has an em (Intel) card).

What I saw was the fragmented IP packets all being received by the
system with the em interface and an ICMP Echo Reply being sent back,
again fragmented. I saw the reply on both ends, so both interfaces were
able to fragment an over-sized packet, transmit the two pieces, and
receive the two pieces. The em device could re-assemble them properly,
but the bge device does not seem to re-assemble them correctly or else
has a problem with ICMP packets bigger than MTU size.

When I send from the em system, I see the packets and fragments all
arrive in good form, but the system never sends out a reply. Since this
is a kernel function, it may be a driver, but I suspect that it is in
the IP stack since I am seeing the problem with a Broadcom card and I
see the data all arriving.

I think Jack can probably relax, but some patch to the network stack
seems to have broken at least ICMP processing. And, since the bge system
was updated to 8-Stable on October 20 while the em system was updated
back on August 9, I suspect the flaw was not driver related and was
committed between August 9 and Oct. 20.

I think this needs to go to the network list where the folks who tinker
with that part of the kernel tend to hang out. Sorry for the cross-post.

Mike Tancsa

Nov 11, 2010, 2:02:22 PM
to freebsd...@freebsd.org


I am not sure I follow. If you do a
ping -s 1473 -D
on an interface that has the default MTU of 1500, it won't work, as the
entire packet is going to be 1501 bytes (note the data bytes)

eg.
# ping -q -s 1472 -c 1 192.168.43.219
PING 192.168.43.219 (192.168.43.219): 1472 data bytes

--- 192.168.43.219 ping statistics ---
1 packets transmitted, 1 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 1.714/1.714/1.714/0.000 ms
and on 192.168.43.219, I see

0(ich10)# tcpdump -vvvni em2 icmp
tcpdump: listening on em2, link-type EN10MB (Ethernet), capture size 96 bytes
13:49:17.564482 IP (tos 0x0, ttl 63, id 53656, offset 0, flags [none], proto ICMP (1), length 1500)
    192.168.42.11 > 192.168.43.219: ICMP echo request, id 23315, seq 0, length 1480
13:49:17.564499 IP (tos 0x0, ttl 64, id 14346, offset 0, flags [none], proto ICMP (1), length 1500)
    192.168.43.219 > 192.168.42.11: ICMP echo reply, id 23315, seq 0, length 1480


Note that the length of the packet is 1500.

That being said, if it's failing on the em NIC when you don't specify the
-D flag on the ping, then there is a bug somewhere. On certain em NICs,
I found that doing
ifconfig em0 -rxcsum
ifconfig em0 -txcsum
ifconfig em0 -tso

works around a number of bugs

---Mike

Kirill Yelizarov

Nov 11, 2010, 3:24:54 PM
to freebsd...@freebsd.org

--- On Thu, 11/11/10, Mike Tancsa <mi...@sentex.net> wrote:

> From: Mike Tancsa <mi...@sentex.net>
> Subject: Re: icmp packets on em larger than 1472 [SEC=UNCLASSIFIED]

Yes, I know. This was the first thing I tried. Sorry, I didn't mention it.

Kirill

Pyun YongHyeon

Nov 11, 2010, 4:08:59 PM
to Kevin Oberman, freebsd...@freebsd.org, Kirill Yelizarov, n...@freebsd.org
On Thu, Nov 11, 2010 at 08:10:57AM -0800, Kevin Oberman wrote:
Most Ethernet controllers, including bge(4), have a setting that
specifies how much RX buffer space is allocated to receive a
frame. When the controller receives a frame that is larger than
the size specified for the RX buffer, it drops the frame.
Because the oversized frame is silently dropped in the driver layer,
the upper stack has no chance to reply with an ICMP
fragmentation-needed response for frames that have the don't-fragment
bit set. This is where correct MTU configuration plays an important
role in the driver layer. If you want to handle oversized frames, you
also have to set the correct MTU on the interface. However, all
controllers should be able to receive standard-MTU-sized frames,
including a VLAN tag, so no special configuration is needed when you
handle standard-MTU-sized frames. Some old controllers can't handle
VLAN oversized frames, so you would have no way to send or receive them.

em(4) controllers have different receive logic that allows an
oversized frame to be chained across multiple receive buffers. So up to
a certain point, which depends on the jumbo frame size the controller
supports, em(4) can receive these oversized frames regardless of
MTU configuration with the help of the driver. The chaining is done in
the driver layer and adds overhead (chaining + multiple mbuf
allocations), but it has its own advantages.
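
The two receive policies described here can be contrasted with a toy Python model. It is only a sketch of the idea, not driver code: the buffer size (2048) and hardware limit (9018) in the usage loop are hypothetical values picked for illustration, and the function names do not exist in bge(4) or em(4).

```python
def receive_single_buffer(frame_len, rx_buffer_len):
    """Controller with one RX buffer per frame: oversized frames are dropped.

    Returns the number of bytes handed to the stack (0 means dropped).
    """
    return frame_len if frame_len <= rx_buffer_len else 0


def receive_chained(frame_len, rx_buffer_len, max_frame_len):
    """Controller that spreads one frame over several RX buffers.

    Returns the list of per-buffer byte counts the driver would chain
    together, or an empty list if the frame exceeds the hardware limit.
    """
    if frame_len > max_frame_len:
        return []
    chain = []
    remaining = frame_len
    while remaining > 0:
        used = min(rx_buffer_len, remaining)
        chain.append(used)
        remaining -= used
    return chain


# Hypothetical sizes: 2048-byte RX buffers, 9018-byte hardware frame limit.
for frame in (1518, 4000, 10000):
    print(frame, receive_single_buffer(frame, 2048), receive_chained(frame, 2048, 9018))
```

In the first policy the stack never sees the oversized frame at all, which is why it cannot answer it; in the second the cost is extra buffers per frame.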

I was not able to reproduce the issue with em(4)/bge(4) on
CURRENT; these drivers worked as expected.


Kevin Oberman

Nov 11, 2010, 4:38:33 PM
to pyu...@gmail.com, freebsd...@freebsd.org, Kirill Yelizarov, n...@freebsd.org
> From: Pyun YongHyeon <pyu...@gmail.com>
> Date: Thu, 11 Nov 2010 13:04:36 -0800

I don't have any systems running CURRENT at the moment, so I can't check
it out. I hope it is fixed there, but it needs to be fixed in
STABLE. Not fragmenting packets that will not fit in a standard frame is
a very serious issue as, when the frame is dropped, the source
re-transmits the same over-sized frame.

Of course, this should not happen if the interface is set to an MTU of
1500 as the higher layers should never pass a block of data larger than
1480 bytes to the IP layer. That's the only reason this had not already
been noticed.

