Ethernet fails 3% of the time


Dave Tucker

Mar 19, 2018, 4:47:48 PM
to BeagleBoard
Ethernet on my BeagleBone Black fails to come online about 3% of the time after power-up.  Attached are dmesg outputs from a boot where Ethernet works (good_dmesg.txt) and from one of the rarer boots where it fails (bad_dmesg.txt).  I've also attached full_bad_boot.txt, which is the debug serial output from a boot that failed.  Here is a diff of the two dmesg outputs (good boot on the "<" lines, bad boot on the ">" lines), which I think highlights the relevant areas:
70c70
< raid6: int32x1  xor()   140 MB/s
---
> raid6: int32x1  xor()   143 MB/s
76c76
< raid6: int32x8  xor()   118 MB/s
---
> raid6: int32x8  xor()   117 MB/s
124c124
< davinci_mdio 4a101000.mdio: detected phy mask fffffffe
---
> davinci_mdio 4a101000.mdio: detected phy mask fffffffb
126c126
< davinci_mdio 4a101000.mdio: phy[0]: device 4a101000.mdio:00, driver SMSC LAN8710/LAN8720
---
> davinci_mdio 4a101000.mdio: phy[2]: device 4a101000.mdio:02, driver SMSC LAN8710/LAN8720
169a170,172
> EXT4-fs (mmcblk1p3): INFO: recovery required on readonly filesystem
> EXT4-fs (mmcblk1p3): write access will be enabled during recovery
> EXT4-fs (mmcblk1p3): recovery complete
174a178
> EXT4-fs (mmcblk1p2): recovery complete
175a180
> EXT4-fs (mmcblk1p4): recovery complete
183c188,189
< SMSC LAN8710/LAN8720 4a101000.mdio:00: attached PHY driver [SMSC LAN8710/LAN8720] (mii_bus:phy_addr=4a101000.mdio:00, irq=-1)
---
> libphy: PHY 4a101000.mdio:00 not found
> net eth0: phy "4a101000.mdio:00" not found on slave 0, err -19
185,186c191
< cpsw 4a100000.ethernet eth0: Link is Up - 100Mbps/Full - flow control rx/tx
< IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
---
> random: crng init done

I don't know what any of this means.  I do know I'm not doing anything different for the times that fail.  I've automated the entire process of turning it on and off and seeing if the network interface works.
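
A pass/fail check of this kind could also be run against saved dmesg captures. The following is a minimal sketch, assuming a failed boot always logs the "not found on slave" line seen in the diff above and a good boot logs "Link is Up" for eth0. The function name and return labels are illustrative and are not part of my actual test setup.

```python
import re

# Patterns copied from the dmesg lines in the diff above.
PHY_NOT_FOUND = re.compile(r'phy "([^"]+)" not found on slave (\d+), err (-?\d+)')
LINK_UP = re.compile(r"eth0: Link is Up")


def classify_boot(dmesg_text):
    """Return 'phy_missing', 'link_up', or 'unknown' for one dmesg capture."""
    # Check for the failure line first: it is the clearest failure marker.
    if PHY_NOT_FOUND.search(dmesg_text):
        return "phy_missing"
    if LINK_UP.search(dmesg_text):
        return "link_up"
    return "unknown"


good = "cpsw 4a100000.ethernet eth0: Link is Up - 100Mbps/Full - flow control rx/tx"
bad = 'net eth0: phy "4a101000.mdio:00" not found on slave 0, err -19'
print(classify_boot(good), classify_boot(bad))
```

Counting the "phy_missing" results over many power cycles would give the failure rate without relying on a network ping alone.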

I'm running a custom build put together with Yocto (Rocko, 2.4), and I've compiled U-Boot separately.  I can provide my device tree source (dts) and uEnv.txt if needed.

Is this a known problem?  What can I do to fix this?

Thanks,

Dave
bad_dmesg.txt
good_dmesg.txt
full_bad_boot.txt

Dave Tucker

Mar 19, 2018, 4:50:19 PM
to BeagleBoard
Oh, and I should note that the kernel is 4.12.12-yocto-standard.

Robert Nelson

Mar 19, 2018, 4:56:08 PM
to Beagle Board, theaet...@gmail.com
Yes... Apply this patch, and it'll fix it 95% of the time...

https://github.com/RobertCNelson/bb-kernel/blob/am33x-v4.12/patches/drivers/ti/cpsw/0001-cpsw-search-for-phy.patch

Later boards (am43/etc) added a GPIO to reset the PHY, because of what was
seen on this design.
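
For context on the dmesg diff earlier in the thread: the good boot reports phy mask fffffffe with the PHY at address 0, while the bad boot reports fffffffb with the PHY at address 2, yet the driver still asks for 4a101000.mdio:00. Going by those pairs, the cleared bit in the mask appears to mark the address where the PHY answered. Here is a minimal sketch that decodes the mask under that assumption; the function name is illustrative and is not taken from the kernel or from the patch.

```python
def phy_addresses(mask_hex):
    """List MDIO addresses (0-31) whose bit is cleared in the printed mask."""
    mask = int(mask_hex, 16)
    return [addr for addr in range(32) if not (mask >> addr) & 1]


print(phy_addresses("fffffffe"))  # good boot: [0]
print(phy_addresses("fffffffb"))  # bad boot:  [2]
```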

Regards,

--
Robert Nelson
https://rcn-ee.com/

Dave Tucker

Mar 28, 2018, 10:15:35 AM
to BeagleBoard
Thank you so much for the patch.  It did seem to fix my problem at first, but I have since found that it works on some BeagleBone Blacks and not on others, even among boards from the same manufacturer (though potentially from different lots).  What causes this issue?  Is there something that would explain why the patch fixes the problem on some boards but not others?  Are some manufacturers likely to be better than others?

Thanks,

Dave