
bge problems in RELENG_9, bge0: watchdog timeout -- resetting


Anders Nordby

Jul 3, 2012, 2:57:04 PM
Hi,

I'm having lots of difficulties with BCM5719, which is the default
network card of HP Proliant DL 360 G8 servers. I can get a few ping
replies before I get a couple of these:

bge0: watchdog timeout -- resetting
bge0: watchdog timeout -- resetting

Then everything hangs, and I cannot log in using ssh.

I'm running: FreeBSD-9.0-RELENG_9-20120701-JPSNAP-amd64

Info about the NIC:

# devinfo -rv | grep phy
brgphy0 pnpinfo oui=0x1be9 model=0x22 rev=0x0 at phyno=1
brgphy1 pnpinfo oui=0x1be9 model=0x22 rev=0x0 at phyno=2
brgphy2 pnpinfo oui=0x1be9 model=0x22 rev=0x0 at phyno=3
brgphy3 pnpinfo oui=0x1be9 model=0x22 rev=0x0 at phyno=4
# grep bge /var/run/dmesg.boot
bge0: <Broadcom unknown BCM5719, ASIC rev. 0x5719001> mem
0xf6bf0000-0xf6bfffff,
0xf6be0000-0xf6beffff,0xf6bd0000-0xf6bdffff irq 32 at device 0.0 on pci3
bge0: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E
miibus0: <MII bus> on bge0
bge0: Ethernet address: 2c:76:8a:54:08:14
bge1: <Broadcom unknown BCM5719, ASIC rev. 0x5719001> mem
0xf6bc0000-0xf6bcffff,
0xf6bb0000-0xf6bbffff,0xf6ba0000-0xf6baffff irq 36 at device 0.1 on pci3
bge1: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E
miibus1: <MII bus> on bge1
bge1: Ethernet address: 2c:76:8a:54:08:15
bge2: <Broadcom unknown BCM5719, ASIC rev. 0x5719001> mem
0xf6b90000-0xf6b9ffff,
0xf6b80000-0xf6b8ffff,0xf6b70000-0xf6b7ffff irq 32 at device 0.2 on pci3
bge2: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E
miibus2: <MII bus> on bge2
bge2: Ethernet address: 2c:76:8a:54:08:16
bge3: <Broadcom unknown BCM5719, ASIC rev. 0x5719001> mem
0xf6b60000-0xf6b6ffff,
0xf6b50000-0xf6b5ffff,0xf6b40000-0xf6b4ffff irq 36 at device 0.3 on pci3
bge3: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E
miibus3: <MII bus> on bge3
bge3: Ethernet address: 2c:76:8a:54:08:17
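
For readers comparing chip revisions across reports, the three values on
the "CHIP ID" line appear to be related by simple bit shifts. The sketch
below is only an illustration: the shift widths (12 and 8 bits) are an
assumption inferred from the numbers printed above, and the function name
is hypothetical.

```python
# Sketch: relate the three values printed on the bge "CHIP ID" line.
# Assumption: ASIC REV and CHIP REV are the chip ID shifted right by
# 12 and 8 bits respectively; this matches the dmesg output above.

def decode_chip_id(chip_id: int) -> dict:
    """Split a 32-bit chip ID into the fields shown in dmesg."""
    return {
        "chip_id": chip_id,
        "asic_rev": chip_id >> 12,  # drops the low three hex digits
        "chip_rev": chip_id >> 8,   # drops the low two hex digits
    }


fields = decode_chip_id(0x05719001)
print(
    "CHIP ID 0x%08x; ASIC REV 0x%04x; CHIP REV 0x%05x"
    % (fields["chip_id"], fields["asic_rev"], fields["chip_rev"])
)
```

With the chip ID from this machine, the sketch reproduces the ASIC REV
0x5719 and CHIP REV 0x57190 values shown in the boot messages.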

Searching other bug reports and posts, I've tried:

hw.bge.allow_asf="0"
hw.pci.enable_msi="0"

But it didn't help. Any ideas?

If I don't use the loader.conf settings above, I also get (before the
watchdog timeouts):

bge0: 2 link states coalesced
bge0: 2 link states coalesced
bge0: 2 link states coalesced

Best regards,

--
Anders.
_______________________________________________
freebsd...@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stabl...@freebsd.org"

Rick Miller

Jul 3, 2012, 4:04:40 PM
Hi Anders....

I've not had good luck with the BCM5719 in stable/8 either. Not sure
if the driver has been updated in 9 or not, but I have a blog post
explaining my woes with the BCM5719 at http://blog.hostileadmin.com/
--
Sent from my mobile device

Take care
Rick Miller

YongHyeon PYUN

Jul 4, 2012, 9:01:36 PM
There is a WIP version at the following URL.
http://people.freebsd.org/~yongari/bge/if_bge.c
http://people.freebsd.org/~yongari/bge/if_bgereg.h
http://people.freebsd.org/~yongari/bge/brgphy.c

I have had some positive feedback, but it seems it still has
some issues. Let me know whether it makes any difference on your
box.

Sean Bruno

Jul 9, 2012, 1:34:21 PM
On Wed, 2012-07-04 at 18:01 -0700, YongHyeon PYUN wrote:
> There is a WIP version at the following URL.
> http://people.freebsd.org/~yongari/bge/if_bge.c
> http://people.freebsd.org/~yongari/bge/if_bgereg.h
> http://people.freebsd.org/~yongari/bge/brgphy.c
>
> I have had some positive feedback, but it seems it still has
> some issues. Let me know whether it makes any difference on your
> box.

I grabbed these updates and applied them cleanly to stable/9 on a Dell
R620 with a quad-port BCM5720. I still see watchdog timeouts and reset
indications. I am able to ping out of the box for a short amount of
time before the device hangs and times out.



-bash-4.2# ping XXX.XXX.XXX.1
PING XXX.XXX.XXX.1 (XXX.XXX.XXX.XXX): 56 data bytes
ping: sendto: Network is down
ping: sendto: Network is down
ping: sendto: Network is down
ping: sendto: Network is down
ping: sendto: Network is down
Jul 9 17:31:41 <kern.crit> x89 kernel: bge2: watchdog timeout --
resetting
Jul 9 17:31:41 <kern.notice> x89 kernel: bge2: link state changed to
DOWN
Jul 9 17:31:41 <kern.notice> x89 kernel: bge2: link state changed to
DOWN
ping: sendto: No route to host
ping: sendto: No route to host
ping: sendto: No route to host
ping: sendto: No route to host
64 bytes from XXX.XXX.XXX.1: icmp_seq=9 ttl=64 time=1.408 ms
Jul 9 17:31:45 <kern.notice> x89 kernel: bge2: link state changed to UP
Jul 9 17:31:45 <kern.notice> x89 kernel: bge2: link state changed to UP
64 bytes from 10.73.149.1: icmp_seq=10 ttl=64 time=1.697 ms
64 bytes from XXX.XXX.XXX.1: icmp_seq=11 ttl=64 time=1.835 ms
64 bytes from XXX.XXX.XXX.1: icmp_seq=12 ttl=64 time=1.390 ms
64 bytes from XXX.XXX.XXX.1: icmp_seq=13 ttl=64 time=1.392 ms
64 bytes from XXX.XXX.XXX.1: icmp_seq=14 ttl=64 time=1.392 ms
64 bytes from XXX.XXX.XXX.1: icmp_seq=15 ttl=64 time=1.848 ms
64 bytes from XXX.XXX.XXX.1: icmp_seq=16 ttl=64 time=1.389 ms
64 bytes from XXX.XXX.XXX.1: icmp_seq=17 ttl=64 time=1.541 ms
64 bytes from XXX.XXX.XXX.1: icmp_seq=18 ttl=64 time=1.575 ms

The stats counters don't really show much here, but here they are
regardless.
dev.bge.2.%desc: Broadcom NetXtreme Gigabit Ethernet, ASIC rev.
0x5720000
dev.bge.2.%driver: bge
dev.bge.2.%location: slot=0 function=0 handle=\_SB_.PCI0.PE1C.NDX0
dev.bge.2.%pnpinfo: vendor=0x14e4 device=0x165f subvendor=0x1028
subdevice=0x1f5b class=0x020000
dev.bge.2.%parent: pci1
dev.bge.2.forced_collapse: 0
dev.bge.2.msi: 1
dev.bge.2.forced_udpcsum: 0
dev.bge.2.stats.FramesDroppedDueToFilters: 0
dev.bge.2.stats.DmaWriteQueueFull: 0
dev.bge.2.stats.DmaWriteHighPriQueueFull: 0
dev.bge.2.stats.NoMoreRxBDs: 0
dev.bge.2.stats.InputDiscards: 0
dev.bge.2.stats.InputErrors: 0
dev.bge.2.stats.RecvThresholdHit: 0
Jul 9 17:33:35 <kern.notice> x89 kernel: bge2: link state changed to
DOWN
dev.bge.2.stats.rx.ifHCInOctets: 109580
dev.bge.2.stats.rx.Fragments: 0
dev.bge.2.stats.rx.UnicastPkts: 212
dev.bge.2.stats.rx.MulticastPkts: 282
dev.bge.2.stats.rx.BroadcastPkts: 543
dev.bge.2.stats.rx.FCSErrors: 0
dev.bge.2.stats.rx.AlignmentErrors: 0
dev.bge.2.stats.rx.xonPauseFramesReceived: 0
dev.bge.2.stats.rx.xoffPauseFramesReceived: 0
dev.bge.2.stats.rx.ControlFramesReceived: 0
dev.bge.2.stats.rx.xoffStateEntered: 0
dev.bge.2.stats.rx.FramesTooLong: 0
dev.bge.2.stats.rx.Jabbers: 0
dev.bge.2.stats.rx.UndersizePkts: 0
dev.bge.2.stats.tx.ifHCOutOctets: 30916
dev.bge.2.stats.tx.Collisions: 0
dev.bge.2.stats.tx.XonSent: 0
dev.bge.2.stats.tx.XoffSent: 0
dev.bge.2.stats.tx.InternalMacTransmitErrors: 0
dev.bge.2.stats.tx.SingleCollisionFrames: 0
dev.bge.2.stats.tx.MultipleCollisionFrames: 0
dev.bge.2.stats.tx.DeferredTransmissions: 0
dev.bge.2.stats.tx.ExcessiveCollisions: 0
dev.bge.2.stats.tx.LateCollisions: 0
dev.bge.2.stats.tx.UnicastPkts: 203
dev.bge.2.stats.tx.MulticastPkts: 0
dev.bge.2.stats.tx.BroadcastPkts: 3

Anders Nordby

Aug 23, 2012, 12:15:05 PM
Hi,

On Wed, Jul 04, 2012 at 06:01:36PM -0700, YongHyeon PYUN wrote:
> There is a WIP version at the following URL.
> http://people.freebsd.org/~yongari/bge/if_bge.c
> http://people.freebsd.org/~yongari/bge/if_bgereg.h
> http://people.freebsd.org/~yongari/bge/brgphy.c
>
> I have had some positive feedback, but it seems it still has
> some issues. Let me know whether it makes any difference on your
> box.

I tried these bge source files in 9.1-PRERELEASE this week, and it does
not help. If I try to log in with SSH I get:

Aug 23 17:30:32 login: ROOT LOGIN (root) ON ttyu0
bge0: watchdog timeout -- resetting
Aug 23 17:31:31 kernel: bge0: watchdog timeout -- resetting
Aug 23 17:31:31 kernel: bge0: link state changed to DOWN
Aug 23 17:31:35 kernel: bge0: link state changed to UP
bge0: watchdog timeout -- resetting
Aug 23 17:33:24 kernel: bge0: watchdog timeout -- resetting
Aug 23 17:33:24 kernel: bge0: link state changed to DOWN
Aug 23 17:33:28 kernel: bge0: link state changed to UP
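
To quantify how often the interface resets and how long the link stays
down each time, the timestamped lines can be tallied. This is a rough
sketch under stated assumptions: the regular expression only covers the
syslog format shown above, the year is supplied by hand because syslog
lines omit it, and the function name is hypothetical.

```python
import re
from datetime import datetime

# Sketch: count watchdog resets and measure DOWN -> UP gaps per interface.
# Only timestamped syslog lines are counted, so the untimestamped console
# copies of the same message are not counted twice.
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) .*?"
    r"(?P<ifname>bge\d+): (?P<msg>.+)$"
)


def summarize(log_text: str, year: int = 2012):
    resets, outages, down_since = {}, {}, {}
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
        ifname, msg = m["ifname"], m["msg"]
        if msg.startswith("watchdog timeout"):
            resets[ifname] = resets.get(ifname, 0) + 1
        elif msg == "link state changed to DOWN":
            down_since.setdefault(ifname, ts)
        elif msg == "link state changed to UP" and ifname in down_since:
            gap = (ts - down_since.pop(ifname)).total_seconds()
            outages.setdefault(ifname, []).append(gap)
    return resets, outages


sample = """\
Aug 23 17:30:32 login: ROOT LOGIN (root) ON ttyu0
bge0: watchdog timeout -- resetting
Aug 23 17:31:31 kernel: bge0: watchdog timeout -- resetting
Aug 23 17:31:31 kernel: bge0: link state changed to DOWN
Aug 23 17:31:35 kernel: bge0: link state changed to UP
bge0: watchdog timeout -- resetting
Aug 23 17:33:24 kernel: bge0: watchdog timeout -- resetting
Aug 23 17:33:24 kernel: bge0: link state changed to DOWN
Aug 23 17:33:28 kernel: bge0: link state changed to UP
"""

resets, outages = summarize(sample)
print(resets, outages)
```

For the excerpt above this gives two resets on bge0, each followed by
about four seconds with the link down.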

I tried setting hw.bge.allow_asf to 0, but it did not help.

During boot I get:

pcib3: <ACPI PCI-PCI bridge> at device 2.0 on pci0
pci3: <ACPI PCI bus> on pcib3
pci0:3:0:0: failed to read VPD data.
bge0: <Broadcom unknown BCM5719, ASIC rev. 0x5719001> mem
0xf6bf0000-0xf6bfffff,0xf6be0000-0xf6beffff,0xf6bd0000-0xf6bdffff irq 32
at device 0.0 on pci3
bge0: APE FW version: NCSI v1.0.80.0
bge0: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E
miibus0: <MII bus> on bge0
brgphy0: <BCM5719C 1000BASE-T media interface> PHY 1 on miibus0
brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT,
1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
bge0: Ethernet address: 2c:76:8a:54:08:14
pci0:3:0:1: failed to read VPD data.
bge1: <Broadcom unknown BCM5719, ASIC rev. 0x5719001> mem
0xf6bc0000-0xf6bcffff,0xf6bb0000-0xf6bbffff,0xf6ba0000-0xf6baffff irq 36
at device 0.1 on pci3
bge1: APE FW version: NCSI v1.0.80.0
bge1: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E
miibus1: <MII bus> on bge1
brgphy1: <BCM5719C 1000BASE-T media interface> PHY 2 on miibus1
brgphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT,
1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
bge1: Ethernet address: 2c:76:8a:54:08:15
pci0:3:0:2: failed to read VPD data.
bge2: <Broadcom unknown BCM5719, ASIC rev. 0x5719001> mem
0xf6b90000-0xf6b9ffff,0xf6b80000-0xf6b8ffff,0xf6b70000-0xf6b7ffff irq 32
at device 0.2 on pci3
bge2: APE FW version: NCSI v1.0.80.0
bge2: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E
miibus2: <MII bus> on bge2
brgphy2: <BCM5719C 1000BASE-T media interface> PHY 3 on miibus2
brgphy2: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT,
1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
bge2: Ethernet address: 2c:76:8a:54:08:16
pci0:3:0:3: failed to read VPD data.
bge3: <Broadcom unknown BCM5719, ASIC rev. 0x5719001> mem
0xf6b60000-0xf6b6ffff,0xf6b50000-0xf6b5ffff,0xf6b40000-0xf6b4ffff irq 36
at device 0.3 on pci3
bge3: APE FW version: NCSI v1.0.80.0
bge3: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E
miibus3: <MII bus> on bge3
brgphy3: <BCM5719C 1000BASE-T media interface> PHY 4 on miibus3
brgphy3: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT,
1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
bge3: Ethernet address: 2c:76:8a:54:08:17

Regards,

--
Anders.

YongHyeon PYUN

Aug 24, 2012, 1:52:48 PM
On Thu, Aug 23, 2012 at 06:15:05PM +0200, Anders Nordby wrote:
> Hi,
>
> On Wed, Jul 04, 2012 at 06:01:36PM -0700, YongHyeon PYUN wrote:
> > There is a WIP version at the following URL.
> > http://people.freebsd.org/~yongari/bge/if_bge.c
> > http://people.freebsd.org/~yongari/bge/if_bgereg.h
> > http://people.freebsd.org/~yongari/bge/brgphy.c
> >
> > I have had some positive feedback, but it seems it still has
> > some issues. Let me know whether it makes any difference on your
> > box.
>
> I tried these bge source files in 9.1-PRERELEASE this week, and it does
> not help. If I try to log in with SSH I get:
>
> Aug 23 17:30:32 login: ROOT LOGIN (root) ON ttyu0
> bge0: watchdog timeout -- resetting
> Aug 23 17:31:31 kernel: bge0: watchdog timeout -- resetting
> Aug 23 17:31:31 kernel: bge0: link state changed to DOWN
> Aug 23 17:31:35 kernel: bge0: link state changed to UP
> bge0: watchdog timeout -- resetting
> Aug 23 17:33:24 kernel: bge0: watchdog timeout -- resetting
> Aug 23 17:33:24 kernel: bge0: link state changed to DOWN
> Aug 23 17:33:28 kernel: bge0: link state changed to UP
>
> I tried setting hw.bge.allow_asf to 0, but it did not help.

That loader tunable has no effect on controllers with an
APE (Application Processor Engine).

>
> During boot I get:
>
> pcib3: <ACPI PCI-PCI bridge> at device 2.0 on pci0
> pci3: <ACPI PCI bus> on pcib3
> pci0:3:0:0: failed to read VPD data.
> bge0: <Broadcom unknown BCM5719, ASIC rev. 0x5719001> mem
> 0xf6bf0000-0xf6bfffff,0xf6be0000-0xf6beffff,0xf6bd0000-0xf6bdffff irq 32
> at device 0.0 on pci3
> bge0: APE FW version: NCSI v1.0.80.0
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
It seems your APE runs slightly newer NC-SI firmware. I was able to
reproduce watchdog timeouts on a Dell R820, but I'm not sure you're
seeing the same issue here. For some unknown reason, programming the
RX MTU register seems to have no effect with the BCM5720 on the R820,
and receiving frames larger than 175(?) bytes seems to hang the
controller. The current workaround is to set the MTU of the sender
(i.e. the link partner or switch) to some low value, 128 for example.
That will give poor performance but should make your controller work.
I have asked Broadcom for help and am waiting for answers or hints
from them.
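
To put numbers on the performance cost of that workaround, here is a
back-of-the-envelope sketch. It assumes plain IPv4 and TCP headers with
no options and an untagged Ethernet frame (14-byte header plus 4-byte
FCS); the 175-byte limit is the tentative figure from the message above,
and the function name is hypothetical.

```python
# Sketch: compare per-frame payload and frame counts for two MTU values.
# Assumptions: IPv4 + TCP without options, no VLAN tag.
ETH_HEADER = 14    # destination MAC, source MAC, EtherType
ETH_FCS = 4        # frame check sequence
IPV4_HEADER = 20
TCP_HEADER = 20
SUSPECTED_LIMIT = 175  # tentative frame size that hangs the controller


def frame_stats(mtu: int, payload_bytes: int) -> dict:
    mss = mtu - IPV4_HEADER - TCP_HEADER       # TCP payload per frame
    frame_size = mtu + ETH_HEADER + ETH_FCS    # largest frame on the wire
    frames = -(-payload_bytes // mss)          # ceiling division
    return {
        "mss": mss,
        "frame_size": frame_size,
        "frames": frames,
        "efficiency": mss / frame_size,
        "under_limit": frame_size <= SUSPECTED_LIMIT,
    }


for mtu in (128, 1500):
    stats = frame_stats(mtu, payload_bytes=1024 * 1024)
    print(mtu, stats)
```

Under these assumptions an MTU of 128 keeps the largest frame at 146
bytes, below the suspected limit, but carries only 88 bytes of TCP
payload per frame, so a 1 MiB transfer needs 11916 frames instead of 719
at the usual MTU of 1500.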