Boot failure "external abort on non-linefetch" in cpsw_probe with any image after Wi-Fi install

lorena...@gmail.com

Jan 4, 2014, 3:44:27 PM
to beagl...@googlegroups.com
Bought a BBB [0 047132904547 A6] last month, and had about two hours of delight, followed by literal days of fruitless struggle Googling for clues. I'm about out of ideas. 

Contacting Beagleboard produced:
Did you send the same info to beagl...@googlegroups.com?

So here it is...  Maybe it will help someone...



There are lots of lines in my boot log that look like errors, such as:
-----
OMAP SD/MMC: 0
mmc_send_cmd : timeout: No status update
reading u-boot.img
...
WARNING: Caches not enabled
NAND:  No NAND device found!!!
0 MiB
MMC:   OMAP SD/MMC: 0, OMAP SD/MMC: 1
*** Warning - readenv() failed, using default environment
...
Net:   <ethaddr> not set. Validating first E-fuse MAC
Phy not found
PHY reset timed out
...
Uncompressing Linux... done, booting the kernel.
[    0.196219] omap2_mbox_probe: platform not supported
[    0.206814] tps65217-bl tps65217-bl: no platform data provided
[    0.283357] bone-capemgr bone_capemgr.8: slot #0: No cape found
[    0.320463] bone-capemgr bone_capemgr.8: slot #1: No cape found
[    0.357572] bone-capemgr bone_capemgr.8: slot #2: No cape found
[    0.394681] bone-capemgr bone_capemgr.8: slot #3: No cape found
[    0.414455] bone-capemgr bone_capemgr.8: slot #6: BB-BONELT-HDMIN conflict P8.45 (#5:BB-BONELT-HDMI)
[    0.424062] bone-capemgr bone_capemgr.8: slot #6: Failed verification
[    0.444551] omap_hsmmc mmc.4: of_parse_phandle_with_args of 'reset' failed
[    0.451844] bone-capemgr bone_capemgr.8: loader: failed to load slot-6 BB-BONELT-HDMIN:00A0 (prio 2)
[    0.518153] pinctrl-single 44e10800.pinmux: pin 44e10854 already requested by 44e10800.pinmux; cannot claim for gpio-leds.7
[    0.529856] pinctrl-single 44e10800.pinmux: pin-21 (gpio-leds.7) status -22
[    0.537166] pinctrl-single 44e10800.pinmux: could not request pin 21 on device pinctrl-single
-----



But I've decided all of them are normal except this one:
-----
[    2.502401] Detected MACID = 90:59:af:4d:71:eb
[    2.506978] Unhandled fault: external abort on non-linefetch (0x1008) at 0xe089e000
[    2.515192] Internal error: : 1008 [#1] SMP ARM
[    2.519936] Modules linked in:
[    2.523139] CPU: 0    Not tainted  (3.8.13-bone30 #1)
[    2.528444] PC is at cpsw_probe+0x528/0xbbc
[    2.532830] LR is at ioremap_page_range+0x10c/0x164
[    2.537939] pc : [<c03fbf40>]    lr : [<c02eddb4>]    psr: a0000113
[    2.537939] sp : df051e20  ip : df469a0c  fp : 00000001
[    2.549962] r10: df7f8800  r9 : df7f8d90  r8 : 00000000
[    2.555433] r7 : df0d3600  r6 : df0d3610  r5 : e089e000  r4 : df7f8800
[    2.562268] r3 : df0ca840  r2 : c03fbf24  r1 : 4a100e13  r0 : e089e000
[    2.569105] Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
[    2.576760] Control: 10c5387d  Table: 80004019  DAC: 00000015
[    2.582777] Process swapper/0 (pid: 1, stack limit = 0xdf050240)
[    2.589067] Stack: (0xdf051e20 to 0xdf052000)
...
[    2.722176] [<c03fbf40>] (cpsw_probe+0x528/0xbbc) from [<c0386064>] (platform_drv_probe+0x14/0x18)
[    2.731569] [<c0386064>] (platform_drv_probe+0x14/0x18) from [<c0384f80>] (driver_probe_device+0xb0/0x1dc)
[    2.741687] [<c0384f80>] (driver_probe_device+0xb0/0x1dc) from [<c038510c>] (__driver_attach+0x60/0x84)
[    2.751530] [<c038510c>] (__driver_attach+0x60/0x84) from [<c0383994>] (bus_for_each_dev+0x50/0x84)
[    2.761010] [<c0383994>] (bus_for_each_dev+0x50/0x84) from [<c038470c>] (bus_add_driver+0x9c/0x20c)
[    2.770490] [<c038470c>] (bus_add_driver+0x9c/0x20c) from [<c03855dc>] (driver_register+0x9c/0x138)
[    2.779973] [<c03855dc>] (driver_register+0x9c/0x138) from [<c0008880>] (do_one_initcall+0x90/0x160)
[    2.789556] [<c0008880>] (do_one_initcall+0x90/0x160) from [<c08ef8f4>] (kernel_init_freeable+0xf8/0x1c4)
[    2.799589] [<c08ef8f4>] (kernel_init_freeable+0xf8/0x1c4) from [<c0609ed4>] (kernel_init+0x8/0xe4)
[    2.809076] [<c0609ed4>] (kernel_init+0x8/0xe4) from [<c000d618>] (ret_from_fork+0x14/0x3c)
[    2.817827] Code: e59f1644 ebfe1a5b ea000150 e58455c0 (e5953000)
[    2.824213] ---[ end trace ee1f1f895243fb40 ]---
[    2.829303] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[    2.829303]
[    2.838880] drm_kms_helper: panic occurred, switching back to text console
-----

Makes sense since cpsw_probe is about ethernet and it was called while loading drivers, and I apparently killed the board trying to install a Wi-Fi adapter. (But the adapter was working! Until I tried to reboot...)



I loaded Ubuntu 12.04 onto my uSD, and it fails with the same error as Angstrom in the eMMC or Angstrom on the uSD - but after minutes of boot messages, not seconds like Angstrom. 

Here's all the ethernet related clues I can find: 
-----
*** Warning - readenv() failed, using default environment
...
Net:   <ethaddr> not set. Validating first E-fuse MAC
Could not get PHY for cpsw: addr 0
cpsw, usb_ether
...
[    0.116207] hw-breakpoint: debug architecture 0x4 unsupported.
[    0.117564] cpsw.0: No hwaddr in dt. Using 90:59:af:4d:71:be from efuse
[    0.117585] cpsw.1: No hwaddr in dt. Using 90:59:af:4d:71:ed from efuse
...
[    2.472570] davinci_mdio: probe of 4a101000.mdio failed with error -5
[    2.479609] Detected MACID = 90:59:af:4d:71:be
[    2.484177] Unhandled fault: external abort on non-linefetch (0x1008) at 0xe089e000
[    2.492394] Internal error: : 1008 [#1] SMP ARM
-----



With the Ubuntu uSD boot, it clearly reads a different uEnv (Angstrom eMMC is 26 bytes, uSD is 14 bytes):
-----
mmc0 is current device
SD/MMC found on device 0
reading uEnv.txt
340 bytes read in 3 ms (110.4 KiB/s)  <-- not 14 or 26 bytes  [see below]
Loaded environment from uEnv.txt
Importing environment from mmc ...  <-- is it importing the poison that kills Angstrom?
Running uenvcmd ...
-----


The only ethernet related items I see in the environment are: 
-----
U-Boot#  printenv
ethact=cpsw
ethaddr=90:59:af:4d:71:be  <-- eth0 except for last "be"
usbnet_devaddr=90:59:af:4d:71:be  <-- eth0 except for last "be"
-----

In the log above it found two variants of eth0:
Using 90:59:af:4d:71:be from efuse  <-- eth0 except for last "be"
Using 90:59:af:4d:71:ed from efuse  <-- eth0 except for last "ed"

The original MAC addresses from ifconfig, before the boot failures: 
eth0      Link encap:Ethernet  HWaddr 90:59:AF:4D:71:EB
ra0       Link encap:Ethernet  HWaddr 00:0C:43:00:7D:7F
usb0      Link encap:Ethernet  HWaddr 6E:5A:F6:F0:F3:45

I'm familiar with Linux routers having multiple similar MAC addresses, but they have multiple similar interfaces. As far as I know the BBB has only eth0 and usb0 (until I install my Wi-Fi adapter). Why does eth0 seem to have three different MACs here? 
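
To make the differences between those three addresses concrete, here is a small Python sketch. It only does arithmetic on the strings shown above; the function names (`parse_mac`, `compare_macs`) are made up for this illustration, and it does not explain why the values differ.

```python
def parse_mac(text):
    """Return the six octets of a colon-separated MAC address as ints."""
    octets = [int(part, 16) for part in text.strip().split(":")]
    if len(octets) != 6 or any(not 0 <= o <= 255 for o in octets):
        raise ValueError(f"not a MAC address: {text!r}")
    return octets


def compare_macs(reference, other):
    """Describe how `other` differs from `reference`."""
    ref, oth = parse_mac(reference), parse_mac(other)
    # Positions (0..5) where the two addresses disagree.
    differing = [i for i in range(6) if ref[i] != oth[i]]
    # Signed distance between the addresses treated as 48-bit numbers.
    offset = int.from_bytes(bytes(oth), "big") - int.from_bytes(bytes(ref), "big")
    # Octets of `other` that equal the reference octet with its two hex digits swapped.
    swapped = [
        i for i in differing
        if oth[i] == (((ref[i] << 4) & 0xF0) | (ref[i] >> 4))
    ]
    return {"differing_octets": differing, "offset": offset,
            "nibble_swapped_octets": swapped}


if __name__ == "__main__":
    ifconfig_eth0 = "90:59:AF:4D:71:EB"
    for candidate in ("90:59:af:4d:71:be", "90:59:af:4d:71:ed"):
        print(candidate, compare_macs(ifconfig_eth0, candidate))
```

With the values from this post, the "ed" address is the ifconfig address plus 2, while the "be" address has the same last octet as "eb" with its two hex digits swapped.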



So by installing the Wi-Fi adapter I seem to have broken something that persists even in a different Linux distro booted from a different device. My BBB and Wi-Fi vendor claims they should work together:

I do have the release they say works:
BBB-eMMC-flasher-2013.09.04
and
Angstrom-Cloud9-IDE-GNOME-eglibc-ipk-v2012.12-beaglebone-2013.09.05.img.xz

Just before their install procedure I did the update, and installed VNC:
-----
root@beaglebone:/# opkg install x11vnc
Package x11vnc (0.9.13-r0.8) installed in root is up to date.
root@beaglebone:/# x11vnc -bg -o %HOME/.x11vnc.log.%VNCDISPLAY -auth /var/run/gdm/auth-for -gdm*/database -display :0  -forever
01/01/2000 01:20:29 Expanded logfile to '%HOME/.x11vnc.log.beaglebone:5900'
01/01/2000 01:20:29 Expanded logfile to '/home/root/.x11vnc.log.beaglebone:5900'
PORT=5900
-----

And set the timezone and timeserver:
-----
root@beaglebone:/# timedatectl set-timezone America/Los_Angel;e  es
root@beaglebone:/# /usr/lib/connman/test/set-global-timeservers pool.ntp.org
Setting timeserver to ['pool.ntp.org']
root@beaglebone:/# shutdown -r now
-----

Those could hardly be a problem, since it rebooted successfully there. The Wi-Fi was successfully shown in ifconfig with Tx and Rx packets, but was not the main interface. After I unplugged the wired ethernet and USB connection, the Wi-Fi did not become the active interface. I tried their command:

If Wifi is not working you can try restarting the connman service: systemctl restart connman.service

After that, the board failed to boot. 



There was one additional issue at that point. To unplug the USB, I had to provide 5V power, and the first power supply I grabbed sagged too far to run the board properly. I'm not seeing reports of that causing permanent problems, but could it? What is the fancy TPS65217C power management chip for if it can't cope with a momentary power sag? 



I see two kinds of interpretation for an ARM external abort. The ARM Architecture Reference Manual says they are due to faulty external hardware. But most TI links say they happen because somebody hasn't turned on some subsystem clock. My error seems to happen right after recognizing the proper ifconfig MAC for eth0:
-----
[    2.502401] Detected MACID = 90:59:af:4d:71:eb
[    2.506978] Unhandled fault: external abort on non-linefetch (0x1008) at 0xe089e000
-----

But maybe that means it has probed for ra0? Still, if the Ralink adapter isn't found or its clock isn't enabled, shouldn't cpsw_probe be able to report not found, or at least fail gracefully? 



Stuck here with only U-Boot access, I'm not sure what else to try. Is this a hardware problem? 

On Friday, January 03, 2014 at 8:26 AM, RMA Account <r...@beagleboard.org>
wrote:

> If I may, I suggest you follow the flashing procedure found at
> written to reflash the board with latest Angstrom image. Do not skip any
> steps. Make sure you hold the button down for at least 2 seconds after
> power is applied. That is all it takes. Do not use USB power. In fact do
> not have the USB installed at all. Just the DC power cable.

Did that. Same exact error, except at a slightly different load address. 


S2 does make a difference:
-----
1c1
< U-Boot SPL 2013.04-dirty (Jul 10 2013 - 14:02:53)  <-- original eMMC version
---
> U-Boot SPL 2013.04-dirty (Jun 19 2013 - 09:57:14)  <-- Flasher uSD version
13,14d13
< mmc_send_cmd : timeout: No status update
-----


Make BBB U-Boot logs comparable (in TextPad) by hiding differing timestamps: 
Find
^\[[0-9. ]\{12\}\]
Replace
[ time ]

Difference from original eMMC to Flasher boot:
-----
Compare: (<)C:\Users\Loren\Documents\Projects\Computing\BeagleBone Black\BBB Linux - U-Boot\First Angstrom eMMC boot fail compare.txt (7047 bytes)
  with: (>)C:\Users\Loren\Documents\Projects\Computing\BeagleBone Black\BBB Linux - U-Boot\First Flasher eMMC boot fail compare.txt (6974 bytes)
1,3c1
< First Boot from eMMC (?):
< U-Boot SPL 2013.04-dirty (Jul 10 2013 - 14:02:53)
---
> U-Boot SPL 2013.04-dirty (Jun 19 2013 - 09:57:14)
15d13
< mmc_send_cmd : timeout: No status update
20c17
< U-Boot 2013.04-dirty (Jul 10 2013 - 14:02:53)
---
> U-Boot 2013.04-dirty (Jun 19 2013 - 09:57:14)
56,58c53,55
< 4385024 bytes read in 766 ms (5.5 MiB/s)
< gpio: pin 56 (gpio 56) value is 1
< 24808 bytes read in 42 ms (576.2 KiB/s)
---
> 4270840 bytes read in 744 ms (5.5 MiB/s)
> gpio: pin 56 (gpio 56) value is 1
> 24125 bytes read in 40 ms (588.9 KiB/s)
63c60
<    Data Size:    4384960 Bytes = 4.2 MiB
---
>    Data Size:    4270776 Bytes = 4.1 MiB
71c68
<    Using Device Tree in place at 80f80000, end 80f890e7
---
>    Using Device Tree in place at 80f80000, end 80f88e3c
78,87c75,84
< [ time ] bone-capemgr bone_capemgr.8: slot #0: No cape found
< [ time ] bone-capemgr bone_capemgr.8: slot #1: No cape found
< [ time ] bone-capemgr bone_capemgr.8: slot #2: No cape found
< [ time ] bone-capemgr bone_capemgr.8: slot #3: No cape found
< [ time ] bone-capemgr bone_capemgr.8: slot #6: BB-BONELT-HDMIN conflict P8.45 (#5:BB-BONELT-HDMI)
< [ time ] bone-capemgr bone_capemgr.8: slot #6: Failed verification
< [ time ] omap_hsmmc mmc.4: of_parse_phandle_with_args of 'reset' failed
< [ time ] bone-capemgr bone_capemgr.8: loader: failed to load slot-6 BB-BONELT-HDMIN:00A0 (prio 2)
< [ time ] pinctrl-single 44e10800.pinmux: pin 44e10854 already requested by 44e10800.pinmux; cannot claim for gpio-leds.7
< [ time ] pinctrl-single 44e10800.pinmux: pin-21 (gpio-leds.7) status -22
---
> [ time ] bone-capemgr bone_capemgr.9: slot #0: No cape found
> [ time ] bone-capemgr bone_capemgr.9: slot #1: No cape found
> [ time ] bone-capemgr bone_capemgr.9: slot #2: No cape found
> [ time ] bone-capemgr bone_capemgr.9: slot #3: No cape found
> [ time ] bone-capemgr bone_capemgr.9: slot #6: BB-BONELT-HDMIN conflict P8.45 (#5:BB-BONELT-HDMI)
> [ time ] bone-capemgr bone_capemgr.9: slot #6: Failed verification
> [ time ] bone-capemgr bone_capemgr.9: loader: failed to load slot-6 BB-BONELT-HDMIN:00A0 (prio 2)
> [ time ] omap_hsmmc mmc.4: of_parse_phandle_with_args of 'reset' failed
> [ time ] pinctrl-single 44e10800.pinmux: pin 44e10854 already requested by 44e10800.pinmux; cannot claim for gpio-leds.8
> [ time ] pinctrl-single 44e10800.pinmux: pin-21 (gpio-leds.8) status -22
...
[plus all the lines with register addresses]
-----


So it is using the Flasher to boot in the "hold S2, connect power" tries. It sees a different U-Boot version, different sized boot files, and a different length device tree. But it still says:
 Loaded environment from uEnv.txt
 Importing environment from mmc ...
 
Does it use "mmc" there to mean the uSD, or is it really reading a possibly corrupt file from the actual eMMC before flashing over it? 


Anyway, the same error happens at the same addresses using the Flasher: 
-----
[    0.760803] Unhandled fault: external abort on non-linefetch (0x1008) at 0xe09fe000
[    0.768818] Internal error: : 1008 [#1] SMP THUMB2
[    0.773825] Modules linked in:
[    0.777030] CPU: 0    Not tainted  (3.8.13 #1)
[    0.781693] PC is at cpsw_probe+0x3fc/0x8a2
-----

No reply yet from the RMA people. 



Meanwhile, here are some notes from my struggle. Maybe they will help someone...  


The third boot log in this Google post seems a lot like mine, with the same cpsw_probe error at the same address and offset:
-----
booting with BBB-eMMC-flasher-2013.06.20.img.:
...
mmc0 is current device
micro SD card found
mmc0 is current device
gpio: pin 54 (gpio 54) value is 1
SD/MMC found on device 0
reading uEnv.txt
14 bytes read in 4 ms (2.9 KiB/s)  <-- same size uEnv
Loaded environment from uEnv.txt
Importing environment from mmc ...
gpio: pin 55 (gpio 55) value is 1
4270840 bytes read in 842 ms (4.8 MiB/s)  <-- different boot files
gpio: pin 56 (gpio 56) value is 1
24125 bytes read in 83 ms (283.2 KiB/s)  <-- different boot files
Booting from mmc ...
...
Using Device Tree in place at 80f80000, end 80f88e3c  <-- different length device tree
...
[    0.760819] Unhandled fault: external abort on non-linefetch (0x1008) at 0xe09fe000
[    0.768836] Internal error: : 1008 [#1] SMP THUMB2
[    0.773842] Modules linked in:
[    0.777048] CPU: 0    Not tainted  (3.8.13 #1)
[    0.781711] PC is at cpsw_probe+0x3fc/0x8a2  <-- exact same cpsw address and offset!
[    0.786094] LR is at ioremap_page_range+0xc3/0x100
...
[    0.983733] [<c023f752>] (cpsw_probe+0x3fc/0x8a2) from [<c01f2cb9>] (platform_drv_probe+0xd/0xe)
...
[    1.088807] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
-----
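
Checking "same error, same address and offset" by eye across several logs is tedious, so here is a rough Python sketch that pulls out the fault line and the "PC is at" line for comparison. The regular expressions are written against the message formats shown in the logs in this post only, and `fault_signature` is a name made up for this example.

```python
import re

# Matches e.g. "Unhandled fault: external abort on non-linefetch (0x1008) at 0xe09fe000"
FAULT = re.compile(
    r"Unhandled fault: (?P<reason>.+?) \((?P<fsr>0x[0-9a-fA-F]+)\) "
    r"at (?P<addr>0x[0-9a-fA-F]+)"
)
# Matches e.g. "PC is at cpsw_probe+0x3fc/0x8a2"
PC = re.compile(
    r"PC is at (?P<symbol>\w+)\+(?P<offset>0x[0-9a-fA-F]+)/(?P<size>0x[0-9a-fA-F]+)"
)


def fault_signature(log_text):
    """Return (reason, fsr, address, symbol, offset, size), or None if not found."""
    fault = FAULT.search(log_text)
    pc = PC.search(log_text)
    if fault is None or pc is None:
        return None
    return (fault["reason"], fault["fsr"], fault["addr"],
            pc["symbol"], pc["offset"], pc["size"])
```

Two logs whose signatures compare equal failed at the same instruction with the same fault address; the timestamps are ignored because they are not part of the tuple.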

The other two logs from that board, like my Ubuntu log, show the MACID probe just before the kernel panic:
-----
[    1.935973] davinci_mdio 4a101000.mdio: no live phy, scanning all
[    1.942727] mmcblk0boot1: mmc1:0001 MMC02G partition 2 1.00 MiB
[    1.949451] davinci_mdio: probe of 4a101000.mdio failed with error -5
[    1.957562] Detected MACID = c8:a0:30:b3:32:45[    1.962278]  mmcblk0: p1 p2
[    1.965340] Unhandled fault: external abort on non-linefetch (0x1008) at 0xe09d8000
-----

and
-----
[    1.839815] TCP: cubic registered
[    1.843409] Initializing XFRM netlink socket
[    1.848032] mmcblk0: mmc0:1234 SA04G 3.63 GiB 
[    1.853882] NET: Registered protocol family 17
[    1.858813]  mmcblk0: p1 p2
[    1.861883] NET: Registered protocol family 15
[    1.868043] Key type dns_resolver registered
[    1.872871] VFP support v0.3: implementor 41 architecture 3 part 30 variant c rev 3
[    1.881063] ThumbEE CPU extension supported.
[    1.885622] Registering SWP/SWPB emulation handler
[    1.891664] registered taskstats version 1
[    1.947718] davinci_mdio 4a101000.mdio: davinci mdio revision 1.6
[    1.954135] davinci_mdio 4a101000.mdio: no live phy, scanning all
[    1.960995] davinci_mdio: probe of 4a101000.mdio failed with error -5
[    1.968060] Detected MACID = c8:a0:30:b3:32:45
[    1.972694] Unhandled fault: external abort on non-linefetch (0x1008) at 0xe09d8000
-----



I found this: 
-----
cpsw_probe
Defined as a function in:
drivers/net/ethernet/ti/cpsw.c, line 1906
Referenced (in 1 files total) in:
drivers/net/ethernet/ti/cpsw.c:
line 1907
line 2282
-----

Here's that code:
-----
1907 static int cpsw_probe(struct platform_device *pdev)
-----
(But it isn't much help without a debugger view.)



-----
Virtual Memory System Architecture 
B4-14 Copyright © 1996-1998, 2000, 2004, 2005 ARM Limited. All rights reserved. ARM DDI 0100I

B4.5 Aborts
Mechanisms that can cause the ARM processor to take an exception because of a memory access are:
MMU fault The MMU detects the restriction and signals the processor.
Debug abort Monitor debug-mode is enabled and a breakpoint or a watchpoint has been detected.
External abort The external memory system signals an illegal or faulting memory access.
Collectively, these are called aborts. Accesses that cause aborts are said to be aborted, and use Fault Address 
and Fault Status registers to record associated context information. The FAR and FSR registers are 
described in Fault Address and Fault Status registers on page B4-19

B4.5.3 External aborts
External memory errors are defined as those that occur in the memory system other than those that are 
detected by an MMU. External memory errors are expected to be rare and are likely to be fatal to the running 
process. An example of an event that could cause an external memory error is an uncorrectable parity or 
ECC failure on a Level 2 Memory structure.

Status (bits[1:0]) 
00 = Idle
01 = Queued
10 = Running
11 = Complete/Error
-----



-----
kernel panic which causes these three errors to be printed to screen

Unhandled fault: external abort on non-linefetch (0x008)
Unhandled fault: imprecise external abort (0xc06)
Kernel panic - not syncing: Fatal exception in interrupt

The values in parentheses are the ifsr (instruction fault status) register. There are many causes for aborts and these give a specific cause. There are some tables in the kernel that handle particular fault causes and others have a handler which does a printk and aborts a task or can panic() the kernel. 

Note: Each fault status register has an abort.S file which is different for the particular ARM CPU. For example see abort-ev7.S v7_early_abort. This is put in a processor table which is matched at boot time.

Unhandled fault - trying to read memory that is not mapped (via MMU).
Kernel panic - an unhandled fault occurred in code deemed un-recoverable.

You may have device mapping set up properly. A common case is where the clocks for a peripheral are not enabled and the device does not respond to a bus request; especially external abort type messages may be due to a missing clk_prepare_enable().
-----
"external abort on non-linefetch" is the explanation automatically read from the error file table...
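
As a sanity check on that number in parentheses, here is a hedged Python sketch that splits a fault status value into fields. It assumes the ARMv7 short-descriptor layout (status in bits [3:0] plus bit 10, domain in bits [7:4], write-not-read in bit 11, external abort type in bit 12); the name table contains only the two messages quoted in this post and is not the kernel's full table. `decode_fsr` and `describe_fsr` are names invented for the example.

```python
# Only the two messages quoted in this thread; other codes are left unnamed.
FAULT_NAMES = {
    0x08: "external abort on non-linefetch",
    0x16: "imprecise external abort",
}


def decode_fsr(fsr):
    """Split a fault status value into fields (ARMv7 short-descriptor layout assumed)."""
    # The 5-bit status code is bits [3:0] plus bit 10 moved down to bit 4.
    status = (fsr & 0xF) | ((fsr >> 6) & 0x10)
    return {
        "status": status,
        "domain": (fsr >> 4) & 0xF,
        "write": bool(fsr & (1 << 11)),          # True if the access was a write
        "external_type": bool(fsr & (1 << 12)),  # implementation-defined ExT bit
    }


def describe_fsr(fsr):
    """Return the quoted message for a status code, or a placeholder."""
    status = decode_fsr(fsr)["status"]
    return FAULT_NAMES.get(status, f"status code {status:#04x} (not in this table)")


if __name__ == "__main__":
    for value in (0x1008, 0x1018, 0x008, 0xC06):
        print(hex(value), decode_fsr(value), describe_fsr(value))
```

Under that assumed layout, 0x1008 from my logs decodes to status 8 on a read access, which matches the "external abort on non-linefetch" text, and 0x1018 from the McSPI report differs only in the domain field.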



-----
mmap works fine if I don't access any of the registers. But when I try to print out the register values I got
Unhandled fault: external abort on non-linefetch (0x1018) at 0x4002100c Bus error

You need to make sure that the appropriate McSPI functional (CM_FCLKEN1_CORE)  and interface (CM_ICLKEN1_CORE) clocks are enabled before reading the McSPI registers.
-----



-----
A Panda board does not have any onboard flash, where many other development or evaluation boards keep their bootloader. Rather, code onboard the board (presumably in ROM) reads the second-stage bootloaders from the MMC (SD card).

The first-stage bootloader runs directly on the board from power-up. I don't know the name of this bootloader (the TI official wiki calls it the Boot ROM). This bootloader initializes a minimal amount of CPU and board hardware, then accesses the first partition of the SD card (which must be in FAT format), and loads a file called "MLO", and executes it. "MLO" is the second-stage bootloader.

The second-stage bootloader can apparently be one of either the X-loader or SPL. This bootloader apparently also just reads the first partition of the SD card, and loads a file called "u-boot.bin", and executes it. "u-boot.bin" is the third-stage bootloader.

The third-stage bootloader is U-boot, which is a popular bootloader for many different embedded boards and products. This bootloader has lots of different features, including an interactive shell, variables, ability to access the SD card and show its contents, etc. What happens next depends on the version of U-boot you have for the Panda board, and how it is configured. In a very simple configuration, U-Boot will look for the file "uImage" in the root of the first partition of the SD card (which, again, must be formatted as a FAT partition), and execute that. This is the Linux kernel. U-Boot passes the kernel a command line argument. Depending on how the kernel is configured it may accept the command line from U-Boot, or use one that was compiled into it when it was built.
-----


-----
by default the BBB boots from eMMC, see page 6 of the schematics. To force a boot from SD you need to remove power from the board completely, hold down S2 and then re-apply power. Keep holding the button until the four leds start turning on.  You have to do this at power on, and once you've done it the board will continue to boot from SD on a reboot or reset, only removing power will change the behaviour.
You could also move R68 to R93 if you want to make the board boot from SD by default.
 
Also note the boot sequence in the table on page 6 of the schematics, by default if MLO can't be found on the eMMC, it'll look for it on the SD card. So deleting MLO normally causes the board to boot from SD if the appropriate files are present.

It's worth a short bit of background info here too.  The boot device will always be seen by the OS as /dev/mmcblk0, there's some un-explained requirement for this to be the case. Some code in U-Boot checks a gpio for the presence of an uSD card and swaps the ordering if it's found.  So be careful... you can't make the assumption that eMMC is always going to be /dev/mmcblk0 and uSD is mmcblk1. 

The way to tell which is which is that when you run fdisk -l you'll see two additional devices /dev/mmcblk0boot0 & /dev/mmcblk0boot1. Exactly what these are isn't really important, but they should only appear for the eMMC, not a normal uSD card, so you can use that to work out what device is the eMMC. Knowing which is which means you don't accidentally delete the files from the wrong one.
-----
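
The "look for the boot partitions" trick described above could be sketched in Python like this. It is only an illustration: `emmc_devices` is a made-up name, and feeding it the directory listing of /sys/block assumes a typical Linux system where the boot partitions show up there as separate block devices.

```python
import re

# Names like "mmcblk0boot0" or "mmcblk1boot1"; group 1 is the parent device.
BOOT_PARTITION = re.compile(r"^(mmcblk\d+)boot\d+$")


def emmc_devices(block_names):
    """Return the mmcblk devices that have bootN partitions (likely the eMMC)."""
    found = set()
    for name in block_names:
        match = BOOT_PARTITION.match(name)
        if match:
            found.add(match.group(1))
    return sorted(found)
```

On a running board this might be called as `emmc_devices(os.listdir("/sys/block"))` after `import os`; any mmcblk device not in the result would then be the uSD card.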


I'm still confused about the S2 boot switch...  

uSD removed:
--
mmc0(part 0) is current device
Card did not respond to voltage select!
No micro SD card found, setting mmcdev to 1
mmc1(part 0) is current device
SD/MMC found on device 1
reading uEnv.txt
26 bytes read in 3 ms (7.8 KiB/s)  <-- Read from eMMC
--

uSD replaced, with any of these:
  simple reset
  S2 boot button released after "U-Boot SPL" ends, before "U-Boot"
  S2 boot button released at first user LED
  S2 boot button held through the entire failed boot
--
mmc0 is current device
micro SD card found
mmc0 is current device
gpio: pin 54 (gpio 54) value is 1
SD/MMC found on device 0
reading uEnv.txt
14 bytes read in 2 ms (6.8 KiB/s)  <-- read from uSD
--

Looks like it boots from the uSD (or at least uses its uEnv) whenever it is present!
At least when Ubuntu is on the uSD...
But not when the Angstrom Flasher is on the uSD! It only tries to boot the Flasher if you connect power while holding S2. 


Confirming where it is getting these file sizes:

U-Boot# fatls mmc 0:1  <-- looks like the uSD
       72   id.txt
    99976   mlo
            .trashes/
       14   uenv.txt  <-- uSD

From the eMMC via SSH terminal (before uSD or Wi-Fi):
./media/BEAGLEBONE:
total 536
drwx------  5 root root   1024 Jan  1  1970 .
drwxr-xr-x 11 root root   4096 Jan  1 00:00 ..
-rw-r--r--  1 root root  99976 Mar 18  2013 MLO
-rw-r--r--  1 root root 379428 Mar 18  2013 u-boot.img
-rw-r--r--  1 root root     26 Mar 18  2013 uEnv.txt  <-- the one that gets read? 
...
./boot:
total 4592
-rwxr-xr-x  1 root root      33 Mar  4  2013 uEnv.txt  <-- supposedly the "real" one, but...
lrwxrwxrwx  1 root root      13 Mar 18  2013 uImage -> uImage-3.8.13
-rw-r--r--  1 root root   18940 Sep  4  2013 omap4-panda.dtb
-rw-r--r--  1 root root   24379 Sep  4  2013 am335x-bone.dtb



U-Boot#  printenv
with Ubuntu on the uSD:
Compare: (<)C:\Users\Loren\Documents\Projects\Computing\BeagleBone Black\printenv Ubuntu uSD (booted).txt (4164 bytes)
   with: (>)C:\Users\Loren\Documents\Projects\Computing\BeagleBone Black\printenv Angstrom eMMC (Ubuntu uSD present not booted).txt (3868 bytes)
-----
23d23
< filesize=154

29,30c28
< loadfdt=load mmc ${mmcdev}:${mmcpart} ${fdtaddr} ${bootdir}/dtbs/${fdtfile}
< loadimage=load mmc ${mmcdev}:${mmcpart} ${loadaddr} ${bootdir}/${bootfile}
---
> loadfdt=load mmc ${bootpart} ${fdtaddr} ${bootdir}/${fdtfile}

36d34
< mmcpart=2

50d47
< optargs=fixrtc

69d65
< uenvcmd=i2c mw 0x24 1 0x3e; kd=0; if test $mmcdev -eq 1; then mmc dev 0; if mmc rescan; then kd=1; fi; mmc dev 1; fi; setenv mmcroot /dev/mmcblk${kd}p${mmcpart} ro

74c69
< Environment size: 4178/131068 bytes
---
> Environment size: 3877/131068 bytes
-----
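
The same printenv comparison can be done without a diff tool. The Python sketch below is a minimal version, assuming each captured line has the form name=value as in the output above; `parse_printenv` and `env_diff` are names invented for this example.

```python
def parse_printenv(text):
    """Turn captured U-Boot printenv output into a {name: value} dict."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks and the trailing "Environment size: ..." summary.
        if not line or line.startswith("Environment size:"):
            continue
        # Split on the first '=' only; values such as uenvcmd contain '=' too.
        name, sep, value = line.partition("=")
        if sep:
            env[name] = value
    return env


def env_diff(first, second):
    """Return (only_in_first, only_in_second, changed) for two environments."""
    only_first = {k: first[k] for k in first.keys() - second.keys()}
    only_second = {k: second[k] for k in second.keys() - first.keys()}
    changed = {
        k: (first[k], second[k])
        for k in first.keys() & second.keys()
        if first[k] != second[k]
    }
    return only_first, only_second, changed
```

Comparing by variable name rather than by line number avoids spurious differences when one environment simply has extra variables in the middle of the list.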



LED pattern of boot from eMMC at 0:38:
LED pattern of boot from uSD at 1:15:

User LEDs:
-----
USER0 is the heartbeat indicator from the Linux kernel.
USER1 turns on when the SD card is being accessed
USER2 is an activity indicator. It turns on when the kernel is not in the idle loop.
USER3 turns on when the onboard eMMC is being accessed.
-----
But do those apply during U-Boot?



Loren

Robert Nelson

Jan 5, 2014, 5:33:52 PM
to Beagle Board
On Sat, Jan 4, 2014 at 2:44 PM, <lorena...@gmail.com> wrote:
> Bought a BBB [0 047132904547 A6] last month, and had about two hours of
> delight, followed by literal days of fruitless struggle Googling for clues.
> I'm about out of ideas.
>
> Contacting Beagleboard produced:
> Did you send the same info to beagl...@googlegroups.com?
>
> So here it is... Maybe it will help someone...
>
> But I've decided all of them are normal except this one:
> -----
> [ 2.502401] Detected MACID = 90:59:af:4d:71:eb
> [ 2.506978] Unhandled fault: external abort on non-linefetch (0x1008) at
> 0xe089e000
> [ 2.515192] Internal error: : 1008 [#1] SMP ARM
> [ 2.519936] Modules linked in:
> [ 2.523139] CPU: 0 Not tainted (3.8.13-bone30 #1)

Your kernel is out of date, please upgrade to "v3.8.13-bone35" first..

Contact "whoever" you got the image from for directions..

Regards,

--
Robert Nelson
http://www.rcn-ee.com/

Vaibhav Bedia

Jan 5, 2014, 7:05:52 PM
to beagl...@googlegroups.com
On Sat, Jan 4, 2014 at 3:44 PM, <lorena...@gmail.com> wrote:
> Bought a BBB [0 047132904547 A6] last month, and had about two hours of
> delight, followed by literal days of fruitless struggle Googling for clues.
> I'm about out of ideas.
>
> Contacting Beagleboard produced:
> Did you send the same info to beagl...@googlegroups.com?
>
> So here it is... Maybe it will help someone...
>
>
>
> There are lots of lines in my boot log that look like errors, such as:
> -----
> OMAP SD/MMC: 0
> mmc_send_cmd : timeout: No status update
> reading u-boot.img
> ...
> WARNING: Caches not enabled
> NAND: No NAND device found!!!
> 0 MiB
> MMC: OMAP SD/MMC: 0, OMAP SD/MMC: 1
> *** Warning - readenv() failed, using default environment
> ...
> Net: <ethaddr> not set. Validating first E-fuse MAC
> Phy not found
> PHY reset timed out
I guess ^^^ that's your problem. I can't see why the WiFi will interfere
with the CPSW. After you update the image as suggested by Robert
check if you get a similar PHY timeout message. If yes, you want
to look at https://groups.google.com/forum/#!topic/beagleboard/9mctrG26Mc8

The CPSW driver should handle the error gracefully but then that's another
problem.

Loren Amelang

Jan 6, 2014, 1:33:38 AM
to beagl...@googlegroups.com
@RobertCNelson:  "Your kernel is out of date, please upgrade to "v3.8.13-bone35" first..."

That kernel version is from attempts to boot Ubuntu. Maybe I could find a newer Ubuntu image, but I have the same kernel panic problem trying to boot Angstrom. It only says:
## Booting kernel from Legacy Image at 80007fc0 ...
   Image Name:   Angstrom/3.8.13/beaglebone
   Image Type:   ARM Linux Kernel Image (uncompressed)
   Data Size:    4270776 Bytes = 4.1 MiB
How do I tell if that is out of date? It is specifically the image the RMA people told me to use - BBB-eMMC-flasher-2013.09.04.img.xz

There seem to be a maddening variety of images available, but I haven't found any newer production versions. 


@Vaibhav: 
> PHY reset timed out 
I guess ^^^ that's your problem. I can't see why the WiFi will interfere with the CPSW.

I've seen lots of boot logs show those lines and still boot successfully. Your links and others I've found are about the ethernet dying after an hour or more, or failing to start maybe 1/50 times while the rest of the board boots successfully. 

My Ubuntu boots do show a later version of phy not found - just before the fatal error:
-----
[    2.459389] davinci_mdio 4a101000.mdio: davinci mdio revision 1.6
[    2.465825] davinci_mdio 4a101000.mdio: no live phy, scanning all
[    2.472570] davinci_mdio: probe of 4a101000.mdio failed with error -5
[    2.479609] Detected MACID = 90:59:af:4d:71:eb
[    2.484177] Unhandled fault: external abort on non-linefetch (0x1008) at 0xe089e000
[    2.492394] Internal error: : 1008 [#1] SMP ARM
-----
But right after saying the probe failed, it displays the proper MAC address for the interface it couldn't find! 


Something is obviously screwed up, but my kernel booted fine before I tried to install Wi-Fi, and it is the one Beagleboard RMA says I should use. Wish I had boot logs from back then, but my first console port adapter arrived with a bad cable. I'm not seeing anyone else (except for the one link I posted) say cpsw_probe totally prevents booting. And in that one instance, the kernel that failed in one board worked in the other five...  


Loren Amelang

Jan 6, 2014, 1:57:52 AM
to beagl...@googlegroups.com
@Vaibhav,

Just found this:

about a problem with clocks not being enabled before mdio: probe - which seems directly applicable to my Ubuntu fail!

"Make the driver control the device clocks. Apparently, the Davinci
platform probes this driver with the clock all powered up, but on OMAP,
this isn't the case."

"Certainly, with respect to CPSW & MDIO, this patch is not enough and
requires further investigation. I have started looking at this and
hopefully will have some solution soon..."

But that was over a year ago - was it resolved? 

Robert Nelson

Jan 6, 2014, 8:54:54 AM
to Beagle Board
On Mon, Jan 6, 2014 at 12:33 AM, Loren Amelang <lorena...@gmail.com> wrote:
> @RobertCNelson: "Your kernel is out of date, please upgrade to
> "v3.8.13-bone35" first..."
>
> That kernel version is from attempts to boot Ubuntu. Maybe I could find a
> newer Ubuntu image,

http://elinux.org/BeagleBoardUbuntu

Kept updated once a month, and kernel's can easily be upgraded as they
are released...

Loren Amelang

unread,
Jan 6, 2014, 4:17:02 PM1/6/14
to beagl...@googlegroups.com
Robert,

Sorry if I seem dense or argumentative, but I can't find any prebuilt images using the 3.8.13-bone35 kernel. The newest images say bone32. I chose the Ubuntu 12.04 image for the LTS, which is important to my eventual project. But at this point I'd try anything that might actually boot. 

I don't suppose swapping your bone35 kernel into an image I have is simple...  I found 
and
about creating images, and they are way beyond my current understanding. Plus I'm out in "northwest nowhere" with slow, expensive internet so pulling big things from git is painful. And my only operating Linux machine is a tiny netbook with Ubuntu Intrepid...  

Is it worth trying one of the bone-32 images? 

Is it unrealistic to think I could have an Ubuntu LTS image that would work on a BBB? Maybe one must grab it at just the right point in its lifecycle, and I'm too late for 12.04? 

"kernels can easily be upgraded as they 
are released... "

Have I missed some simple procedure? 

Loren

Robert Nelson

unread,
Jan 6, 2014, 4:20:14 PM1/6/14
to Beagle Board
On Mon, Jan 6, 2014 at 3:15 PM, Loren Amelang <lorena...@gmail.com> wrote:
> Robert,
>
> Sorry if I seem dense or argumentative, but I can't find any prebuilt images
> using the 3.8.13-bone35 kernel. The newest images say bone32. I chose the
> Ubuntu 12.04 image for the LTS, which is important to my eventual project.

LOL! "LTS" doesn't mean crap on arm, good luck getting anything fixed
that does not affect x86. Canonical does not have enough arm
developers to support it.. They only support the current release...

> But at this point I'd try anything that might actually boot.
>
> I don't suppose swapping your bone35 kernel into an image I have is
> simple... I found

Example install saucy: http://elinux.org/BeagleBoardUbuntu#Saucy_13.10

then run:
wget http://rcn-ee.net/deb/saucy-armhf/v3.8.13-bone35/install-me.sh
sudo /bin/bash install-me.sh

sudo reboot

Loren Amelang

unread,
Jan 6, 2014, 6:26:42 PM1/6/14
to beagl...@googlegroups.com
Robert,

Sorry! Me again...  

I have your Saucy image. Verified. But it appears it requires your setup_sdcard script to be run, from a Linux machine. I've been using Win32DiskImager...  So I've copied it to the netbook, and I see the script, but when I plug in my uSD card it says my system can't mount ext4. I suppose that means Intrepid can't directly create the uSD image either. Is there some option to write your output back to a local file and use Win32DiskImager to write it from Windows? 

And of course if the bone32 kernel doesn't boot my BBB, I won't be able to update to bone35, so the whole exercise might be moot. I thought the whole idea was that I needed bone35 to solve my cpsw_probe error that keeps me from booting at all. (But bone30 booted fine for the first two hours!!!) 

Thanks for clueing me in about LTS. Maybe you can provide the same insight about the whole idea of using the BBB? I've now spent almost a month struggling with what was supposed to be a $50 "black box" component of a much larger project. I didn't want another hobby, nor do I want to go buy a competent Linux development system to support my $50 hardware. Would I be smart to give up now? Maybe it is just my own strange karma? I see thousands of "makers" out there using these little boards without all this hassle...  

Loren

Robert Nelson

unread,
Jan 6, 2014, 6:30:43 PM1/6/14
to Beagle Board


On Jan 6, 2014 5:27 PM, "Loren Amelang" <lorena...@gmail.com> wrote:
>>
>> Robert,
>>
>> Sorry! Me again...  
>>
>> I have your Saucy image. Verified. But it appears it requires your setup_sdcard script to be run, from a Linux machine. I've been using Win32DiskImager...  So I've copied it to the netbook, and I see the script, but when I plug in my uSD card it says my system can't mount ext4. I suppose that means Intrepid can't directly create the uSD image either. Is there some option to write your output back to a local file and use Win32DiskImager to write it from Windows? 

Scroll down and you'll find both a flasher and a microSD version for non-Linux users.

>>
>> And of course if the bone32 kernel doesn't boot my BBB, I won't be able to update to bone35, so the whole exercise might be moot. I thought the whole idea was that I needed bone35 to solve my cpsw_probe error that keeps me from booting at all. (But bone30 booted fine for the first two hours!!!) 
>>
>> Thanks for clueing me in about LTS. Maybe you can provide the same insight about the whole idea of using the BBB? I've now spent almost a month struggling with what was supposed to be a $50 "black box" component of a much larger project. I didn't want another hobby, nor do I want to go buy a competent Linux development system to support my $50 hardware. Would I be smart to give up now? Maybe it is just my own strange karma? I see thousands of "makers" out there using these little boards without all this hassle...  
>>
>> Loren
>>

> --
> For more options, visit http://beagleboard.org/discuss
> ---
> You received this message because you are subscribed to the Google Groups "BeagleBoard" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to beagleboard...@googlegroups.com.
> For more options, visit https://groups.google.com/groups/opt_out.

Vaibhav Bedia

unread,
Jan 6, 2014, 7:11:22 PM1/6/14
to beagl...@googlegroups.com
As per the commit logs of the mainline kernel a patch adding pm_runtime_*
calls in the driver went in around 3.7-rc3. I believe the images that you
have which are based on v3.8 will have them in place.

I still can't think of a reason why it would have worked earlier. Could be a race
condition or could even be a hardware thing. Try changing the kernel build
to narrow it down. And yes, the error that you see would typically come up
when the init fails (typically clock).

Loren Amelang

unread,
Jan 6, 2014, 11:24:40 PM1/6/14
to beagl...@googlegroups.com
Vaibhav,

I've now been through six different images and they all end up with the same panic. I don't know what else to do except push on the RMA people for a hardware change. But then I'd be afraid I'd create the same problem on a new board...  

As for command timing, I've seen vastly different command sequences leading up to the panic, so unless you mean the timing within cpsw_probe itself it is hard to imagine how that could be a factor. And this is 100% failure, with any image, on either memory, power cycled or reset in any order. With or without console to USB or console to level converter connection, with or without USB or ethernet connected. 

Obviously cpsw_probe needs to fail more gracefully. Hopefully with a clue as to what is wrong. Could that be the "mdio failed with error -5" line? 


Here's a 1 in 100 boots problem with that error:

But he gets a "random" MAC - I get the correct one:
| davinci_mdio: probe of 4a101000.mdio failed with error -5
| Random MACID = 16:74:44:51:f1:0f
| gpio-keys volume_keys.6: Unable to claim irq 0; error -22
| gpio-keys: probe of volume_keys.6 failed with error -22


The guy mentioned way above with my same problem on only one of six BBB boards shows the -5 error:


Found (maybe?) the error doc:
-----
enum   { MDIO_ERROR = -1, MDIO_LENGTH_ERRMSG = 240 }
...
Error numbers.
Enumeration values:
MDIO_ERROR_NONE No error.
MDIO_ERROR_WARN A warning.
MDIO_ERROR_BADVAL An illegal value pertaining to file.
MDIO_ERROR_NOMEM Memory cannot be allocated.
MDIO_ERROR_OPEN Cannot open a file.
MDIO_ERROR_CLOSE Cannot close a file.
MDIO_ERROR_READ Unable to read from file.
MDIO_ERROR_WRITE Unable to write to file.
MDIO_ERROR_SEEK Unable to perform a byte seek within file.
MDIO_ERROR_SYNTAX Syntax error occurred with file.
MDIO_ERROR_UNXEOF The end-of-file marker occurred unexpectedly.
-----

But that doesn't make much sense to me...  It is all about opening and closing files, not probing devices. Yes, I know "everything in Unix is a file", but those errors really do sound like file errors. 
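
For what it's worth, the "-5" in "probe of 4a101000.mdio failed with error -5" is most likely not from that enum, which appears to belong to an unrelated file-handling library. Linux drivers conventionally return negative errno values from their probe functions, so -5 would be EIO (an I/O error talking to the device) and the -22 in the gpio-keys lines would be EINVAL. A minimal Python sketch of that lookup, assuming the codes really are standard errno values (decode_kernel_error is a name made up for illustration):

```python
import errno
import os


def decode_kernel_error(code):
    """Map a negative kernel return code such as -5 to (errno name, description)."""
    number = abs(code)
    # errno.errorcode maps numeric values to names like 'EIO' on the host platform.
    name = errno.errorcode.get(number, "UNKNOWN")
    return name, os.strerror(number)


# The two codes that show up in the boot logs in this thread.
for code in (-5, -22):
    name, text = decode_kernel_error(code)
    print(f"{code}: {name} ({text})")
```

Read that way, the -5 would fit a PHY that did not answer on the MDIO bus much better than any of the file errors listed above.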

Guess I'm in over my depth. 

Loren

Loren Amelang

unread,
Jan 6, 2014, 11:29:35 PM1/6/14
to beagl...@googlegroups.com
Robert,

Tried the prebuilt Ubuntu 13.10 Flasher image:
BBB-eMMC-flasher-ubuntu-13.10-2013-12-17-2gb.img.xz

Pretty much the same result as the Ubuntu uSD, and all the other images I've tried. Lots of apparently nonfatal errors:

U-Boot SPL 2013.10-00015-gab7a95a (Nov 08 2013 - 16:01:27)
reading args
spl: error reading image args, err - -1
...
WARNING: Caches not enabled
NAND:  0 MiB
MMC:   OMAP SD/MMC: 0, OMAP SD/MMC: 1
*** Warning - readenv() failed, using default environment

Net:   <ethaddr> not set. Validating first E-fuse MAC
Could not get PHY for cpsw: addr 0
cpsw, usb_ether


Different boot file sizes:

reading uEnv.txt
1313 bytes read in 3 ms (426.8 KiB/s)
Importing environment from mmc ...  <-- still this line that I fear is copying my problem
gpio: pin 55 (gpio 55) value is 1
Checking if uenvcmd is set ...
gpio: pin 56 (gpio 56) value is 1
Running uenvcmd ...
reading zImage
3334336 bytes read in 313 ms (10.2 MiB/s)
reading initrd.img
2996231 bytes read in 282 ms (10.1 MiB/s)
reading /dtbs/am335x-boneblack.dtb
24884 bytes read in 8 ms (3 MiB/s)
Kernel image @ 0x80200000 [ 0x000000 - 0x32e0c0 ]
## Flattened Device Tree blob at 815f0000
   Booting using the fdt blob at 0x815f0000
   Using Device Tree in place at 815f0000, end 815f9133


Still the same MAC addresses:

[    0.117178] cpsw.0: No hwaddr in dt. Using 90:59:af:4d:71:eb from efuse
[    0.117199] cpsw.1: No hwaddr in dt. Using 90:59:af:4d:71:ed from efuse


Still the same kernel panic:

[    2.683230] davinci_mdio 4a101000.mdio: davinci mdio revision 1.6
[    2.689662] davinci_mdio 4a101000.mdio: no live phy, scanning all
[    2.696404] davinci_mdio: probe of 4a101000.mdio failed with error -5
[    2.703442] Detected MACID = 90:59:af:4d:71:eb
[    2.707998] Unhandled fault: external abort on non-linefetch (0x1008) at 0xe089e000
[    2.716213] Internal error: : 1008 [#1] SMP ARM
[    2.720957] Modules linked in:
[    2.724162] CPU: 0    Not tainted  (3.8.13-bone32 #1)
[    2.729465] PC is at cpsw_probe+0x528/0xbc8
[    2.733849] LR is at ioremap_page_range+0xd8/0x16c
...
[    2.923102] [<c03f245c>] (cpsw_probe+0x528/0xbc8) from [<c037c4e0>] (driver_probe_device+0xa4/0x1e4)
[    2.932676] [<c037c4e0>] (driver_probe_device+0xa4/0x1e4) from [<c037c6cc>] (__driver_attach+0x68/0x8c)
[    2.942529] [<c037c6cc>] (__driver_attach+0x68/0x8c) from [<c037ad1c>] (bus_for_each_dev+0x70/0x84)



I don't think any image is going to boot this board. I am concerned that every one of them displays the line: 

Importing environment from mmc ... 

If that means what it says, it seems like it is copying whatever configuration problem is killing my cpsw_probe to each new attempt at booting or flashing. 

The only environment lines that jump out at me are: 

ethact=cpsw
ethaddr=90:59:af:4d:71:eb
...
usbnet_devaddr=90:59:af:4d:71:eb

The original MAC addresses from ifconfig, before the boot failures: 
eth0      Link encap:Ethernet  HWaddr 90:59:AF:4D:71:EB
ra0       Link encap:Ethernet  HWaddr 00:0C:43:00:7D:7F
usb0      Link encap:Ethernet  HWaddr 6E:5A:F6:F0:F3:45


That "importing" line is echoed directly from the env:

fdt_high=0xffffffff
fdtaddr=0x80F80000
fdtfile=am335x-boneblack.dtb
findfdt=if test $board_name = A33515BB; then setenv fdtfile am335x-evm.dtb; fi; if test $board_name = A335X_SK; then setenv fdtfile am335x-evmsk.dtb; fi;if test $board_name = A335BONE; then setenv fdtfile am335x-bone.dtb; fi; if test $board_name = A335BNLT; then setenv fdtfile am335x-boneblack.dtb; fi
importbootenv=echo Importing environment from mmc ...; env import -t $loadaddr $filesize
kloadaddr=0x80007fc0
loadaddr=0x80200000

Help! I need some better suggestions than to try yet another image. 

Loren

Gerald Coley

unread,
Jan 7, 2014, 9:30:23 AM1/7/14
to beagl...@googlegroups.com
Send the board in under an RMA and get it looked at.

Gerald

Loren Amelang

unread,
Jan 7, 2014, 4:01:03 PM1/7/14
to beagl...@googlegroups.com
RMA initiated. Hope they reveal what went wrong with my board!

Found one clue suggesting my fears about importbootenv are probably unfounded: 
---
On Thu, 12 Sep 2013 18:41:43 -0500, Ryan Barnett wrote:
> Some boards in u-boot support the ability to modify the environment
> by placing a plain text file as uEnv.txt in the root of the partition
> of an SD card. For the exact placement of where the uEnv.txt should
> be, consult your u-boot environment. Your board supports this
> overwriting of environment variables if "loadbootenv" and
> "importbootenv" are defined in the board's environment.

loadbootenv and importbootenv are just U-Boot scripts that are specific
to certain board configurations.

All that loadbootenv does is load a file into memory, and all that
importbootenv does is call 'env import -t <addr> <size>' to load the
environment into U-Boot.
---

So I have:
loadbootenv=load mmc ${mmcdev} ${loadaddr} ${bootenv}
and
importbootenv=echo Importing environment from mmc ...; env import -t $loadaddr $filesize
and
loadaddr=0x80200000
bootenv=uEnv.txt

Filesize is not defined, but it doesn't seem to be defined in any environments I can find: 

Several pages say tftp automatically sets $filesize:
but I haven't found any mention of loadbootenv setting it. It must...  

So my fear of importing another environment from a hosed eMMC seems unfounded. All of this apparently only loads one environment:
---
bootcmd=gpio set 53; i2c mw 0x24 1 0x3e; run findfdt; mmc dev 0; if mmc rescan ; then echo micro SD card found;setenv mmcdev 0;else echo No micro SD card found, setting mmcdev to 1;setenv mmcdev 1;fi;setenv bootpart ${mmcdev}:2;mmc dev ${mmcdev}; if mmc rescan; then gpio set 54; echo SD/MMC found on device ${mmcdev};if run loadbootenv; then echo Loaded environment from ${bootenv};run importbootenv;fi;if test -n $uenvcmd; then echo Running uenvcmd ...;run uenvcmd;fi;gpio set 55; if run loaduimage; then gpio set 56; run loadfdt;run mmcboot;fi;fi;
---
micro SD card found
mmc0 is current device
SD/MMC found on device 0
reading uEnv.txt
1313 bytes read in 3 ms (426.8 KiB/s)
Loaded environment from uEnv.txt
Importing environment from mmc ...
Running uenvcmd ...
---
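
The text format that 'env import -t' reads is just one name=value pair per line, which is why importing uEnv.txt only merges the handful of variables in that one file rather than pulling in a whole saved environment. A simplified Python sketch of that parsing follows. It is not U-Boot's actual implementation, and parse_uenv_text and the sample contents are illustrative assumptions:

```python
def parse_uenv_text(text):
    """Parse 'name=value' lines from a text-format environment file.

    Simplified: blank lines and lines starting with '#' are skipped,
    and everything after the first '=' is taken as the value.
    """
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep:
            env[name] = value
    return env


# Hypothetical uEnv.txt contents, for illustration only.
sample = "# comment line\noptargs=quiet drm.debug=7\n"
print(parse_uenv_text(sample))
```

Only the names that appear in the file get set; every other variable keeps whatever value the default environment gave it.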

Off to pack up the board for RMA...

Loren

Gerald Coley

unread,
Jan 7, 2014, 4:21:15 PM1/7/14
to beagl...@googlegroups.com
We will check out the HW and if it is bad, we will repair it.

Gerald



Vaibhav Bedia

unread,
Jan 7, 2014, 5:57:59 PM1/7/14
to beagl...@googlegroups.com
On Tue, Jan 7, 2014 at 4:21 PM, Gerald Coley <ger...@beagleboard.org> wrote:
We will check out the HW and if it is bad, we will repair it.


Hi Gerald,

In case it's not a logistics nightmare (which it could very well be), could you let the list know if it's really a HW thing? Based on the multiple things that Loren has tried out, my hunch is it might be. However, there's always the possibility of this being due to some subtle s/w bug that someone would need to chase down and make things more reliable.

Gerald Coley

unread,
Jan 8, 2014, 8:43:49 AM1/8/14
to beagl...@googlegroups.com
Depends on if we get the board to look at. 

Gerald

Víctor MV

unread,
Jan 9, 2014, 4:35:32 PM1/9/14
to beagl...@googlegroups.com
Hi,

I'm experiencing a similar behavior in a board based on the BeagleBone (the BB itself hasn't been modified):

The boot log:

...
[    2.155945] mmc0: new high speed SDHC card at address aaaa
[    2.162581] mmcblk0: mmc0:aaaa SU08G 7.40 GiB 
[    2.169448]  mmcblk0: p1 p2
[    2.172606] davinci_mdio 4a101000.mdio: davinci mdio revision 1.6
[    2.179049] davinci_mdio 4a101000.mdio: no live phy, scanning all
[    2.187284] davinci_mdio: probe of 4a101000.mdio failed with error -5
[    2.194505] Detected MACID = bc:6a:29:84:8d:3a
[    2.199173] Unhandled fault: external abort on non-linefetch (0x1008) at 0xd0898000
[    2.207399] Internal error: : 1008 [#1] SMP ARM
[    2.212149] Modules linked in:
[    2.215364] CPU: 0    Not tainted  (3.8.13-bone35 #1)
[    2.220681] PC is at cpsw_probe+0x530/0xbcc
[    2.225077] LR is at ioremap_page_range+0xd8/0x16c
[    2.230101] pc : [<c03f77d8>]    lr : [<c02ead08>]    psr: a0000113
[    2.230101] sp : cf05fe38  ip : cf04d260  fp : cf43aa98
[    2.242123] r10: 00000001  r9 : cf43ad40  r8 : d0898000
[    2.247598] r7 : cf0d4800  r6 : 00000000  r5 : cf0d4810  r4 : cf43a800
[    2.254436] r3 : 00000000  r2 : 00000000  r1 : 4a100e13  r0 : d0898000
[    2.261279] Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
[    2.268937] Control: 10c5387d  Table: 80004019  DAC: 00000015
[    2.274959] Process swapper/0 (pid: 1, stack limit = 0xcf05e240)
[    2.281254] Stack: (0xcf05fe38 to 0xcf060000)
[    2.285822] fe20:                                                       00000000 00000000
[    2.294398] fe40: cf439e08 cf43ad40 00000000 c014cff0 22222222 00000020 00000000 cf439e88
[    2.302974] fe60: cf439e08 cf439e08 00000008 c014cee0 00000000 cf439e08 cf0d1488 cf4329c0
[    2.311548] fe80: 00000000 c014d8a0 cf0474b8 c005e690 00000000 00000003 cf0d1488 00000000
[    2.320123] fea0: c0a2342c cf0d4810 cf0d4818 cf0d4810 cf0d4844 c0a2342c c0998794 c09b6000
[    2.328698] fec0: 00000000 cf05e008 00000000 c0381024 00000000 cf0d4810 cf0d4844 c0998794
[    2.337272] fee0: 00000000 c0381210 00000000 c0998794 c03811a8 c037f860 cf047478 cf0d0c80
[    2.345848] ff00: c0998794 cf4329c0 c098e038 c03807e8 c07f9eea c07f9eea 00000000 c0998794
[    2.354421] ff20: c091ac78 c092d984 c090df10 c038175c 00000007 c091ac78 c092d984 c090df10
[    2.362995] ff40: c09b6000 c0008894 c090df10 0000f434 c092d9b0 00000008 00000007 c091ac78
[    2.371568] ff60: c092d984 c09b6000 c09b6000 000000f3 c091ac80 c08e8918 00000007 00000007
[    2.380140] ff80: c08e8270 00000000 00000000 c0605bbc 00000000 00000000 00000000 00000000
[    2.388711] ffa0: 00000000 c0605bc4 00000000 c000d478 00000000 00000000 00000000 00000000
[    2.397282] ffc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[    2.405854] ffe0: 00000000 00000000 00000000 00000000 00000013 00000000 e7feefe6 e6bafaae
[    2.414456] [<c03f77d8>] (cpsw_probe+0x530/0xbcc) from [<c0381024>] (driver_probe_device+0xa4/0x1e4)
[    2.424036] [<c0381024>] (driver_probe_device+0xa4/0x1e4) from [<c0381210>] (__driver_attach+0x68/0x8c)
[    2.433888] [<c0381210>] (__driver_attach+0x68/0x8c) from [<c037f860>] (bus_for_each_dev+0x70/0x84)
[    2.443374] [<c037f860>] (bus_for_each_dev+0x70/0x84) from [<c03807e8>] (bus_add_driver+0xdc/0x218)
[    2.452861] [<c03807e8>] (bus_add_driver+0xdc/0x218) from [<c038175c>] (driver_register+0x9c/0x124)
[    2.462350] [<c038175c>] (driver_register+0x9c/0x124) from [<c0008894>] (do_one_initcall+0x8c/0x150)
[    2.471936] [<c0008894>] (do_one_initcall+0x8c/0x150) from [<c08e8918>] (kernel_init_freeable+0x108/0x1cc)
[    2.482071] [<c08e8918>] (kernel_init_freeable+0x108/0x1cc) from [<c0605bc4>] (kernel_init+0x8/0xe4)
[    2.491657] [<c0605bc4>] (kernel_init+0x8/0xe4) from [<c000d478>] (ret_from_fork+0x14/0x3c)
[    2.500394] Code: e59f164c ebfe1b48 ea0000d1 e58485c0 (e5982000) 
[    2.506770] ---[ end trace 9974d47096abe9bf ]---
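
The lines that matter most in a trace like this are the fault type and address, the kernel version, and the function the PC was in. A small Python sketch for pulling those out of a captured console log, so that several boots can be compared quickly. The regular expressions are written against the lines shown in this thread and are an assumption about the format, not a general oops parser:

```python
import re

FAULT_RE = re.compile(
    r"Unhandled fault: (?P<kind>.+?) \((?P<fsr>0x[0-9a-fA-F]+)\) at (?P<addr>0x[0-9a-fA-F]+)"
)
PC_RE = re.compile(r"PC is at (?P<func>\w+)\+(?P<offset>0x[0-9a-fA-F]+)/0x[0-9a-fA-F]+")
VERSION_RE = re.compile(r"\((?P<version>\d+\.\d+\.\d+[^\s)]*) #\d+\)")


def summarize_oops(log_text):
    """Collect fault, kernel version and faulting function from an ARM oops."""
    summary = {}
    for line in log_text.splitlines():
        fault = FAULT_RE.search(line)
        if fault:
            summary["fault"] = fault.group("kind")
            summary["fault_addr"] = fault.group("addr")
        pc = PC_RE.search(line)
        if pc:
            summary["function"] = pc.group("func")
            summary["offset"] = pc.group("offset")
        version = VERSION_RE.search(line)
        if version:
            summary["kernel"] = version.group("version")
    return summary


# Three lines copied from the trace above.
SAMPLE_LOG = """\
[    2.199173] Unhandled fault: external abort on non-linefetch (0x1008) at 0xd0898000
[    2.215364] CPU: 0    Not tainted  (3.8.13-bone35 #1)
[    2.220681] PC is at cpsw_probe+0x530/0xbcc
"""
print(summarize_oops(SAMPLE_LOG))
```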



Víctor MV

unread,
Jan 10, 2014, 6:17:08 PM1/10/14
to beagl...@googlegroups.com
Some more information: I decided to give it a try with the image Angstrom-Cloud9-IDE-eglibc-ipk-v2012.02-core-beaglebone-2012.02.14.img.xz available at http://www.angstrom-distribution.org/demo/beaglebone/

Surprisingly, this is what I'm getting:

U-Boot 2011.09-00010-g81c8c79 (Feb 13 2012 - 14:48:03)

I2C:   ready
DRAM:  256 MiB
No daughter card present
NAND:  HW ECC Hamming Code selected
16 MiB
MMC:   OMAP SD/MMC: 0
*** Warning - readenv() failed, using default environment

Net:   cpsw
Hit any key to stop autoboot:  0 
SD/MMC found on device 0
reading uEnv.txt

33 bytes read
Loaded environment from uEnv.txt
Importing environment from mmc ...
reading uImage

3137440 bytes read
## Booting kernel from Legacy Image at 80007fc0 ...
   Image Name:   Angstrom/3.2/beaglebone
   Image Type:   ARM Linux Kernel Image (uncompressed)
   Data Size:    3137376 Bytes = 3 MiB
   Load Address: 80008000
   Entry Point:  80008000
   Verifying Checksum ... OK
   XIP Kernel Image ... OK
OK

Starting kernel ...

Uncompressing Linux... done, booting the kernel.
[    0.068294] _omap_mux_get_by_name: Could not find signal uart1_cts.uart1_cts
[    0.068315] omap_hwmod_mux_init: Could not allocate device mux entry
[    0.068465] _omap_mux_get_by_name: Could not find signal uart2_cts.uart2_cts
[    0.068483] omap_hwmod_mux_init: Could not allocate device mux entry
[    0.068644] _omap_mux_get_by_name: Could not find signal uart3_cts_rctx.uart3_cts_rctx
[    0.068728] omap_hwmod_mux_init: Could not allocate device mux entry
[    0.106314] cpuidle-am33xx cpuidle-am33xx.0: failed to register driver
[    0.261139] _omap_mux_get_by_name: Could not find signal leds-gpio
[    0.651272] omap2_set_init_voltage: unable to get clk dpll1_ck
[    0.657458] omap2_set_init_voltage: unable to set vdd_mpu_iva
[    0.663475] omap2_set_init_voltage: unable to get clk l3_ick
[    0.669409] omap2_set_init_voltage: unable to set vdd_core
[    0.927024] hub 1-0:1.0: over-current condition on port 1
[    1.186994] hub 1-0:1.0: over-current condition on port 1
[    1.447002] hub 1-0:1.0: over-current condition on port 1
[    1.706975] hub 1-0:1.0: over-current condition on port 1
[    1.966988] hub 1-0:1.0: over-current condition on port 1
[    2.226976] hub 1-0:1.0: over-current condition on port 1
[    2.486983] hub 1-0:1.0: over-current condition on port 1
[    2.746982] hub 1-0:1.0: over-current condition on port 1
[    3.006980] hub 1-0:1.0: over-current condition on port 1
[    3.266980] hub 1-0:1.0: over-current condition on port 1
[    3.526979] hub 1-0:1.0: over-current condition on port 1
[    3.786975] hub 1-0:1.0: over-current condition on port 1
[    4.046994] hub 1-0:1.0: over-current condition on port 1
[    4.306975] hub 1-0:1.0: over-current condition on port 1
[    4.566981] hub 1-0:1.0: over-current condition on port 1
[    4.827045] hub 1-0:1.0: over-current condition on port 1
[    5.086988] hub 1-0:1.0: over-current condition on port 1
[    5.346982] hub 1-0:1.0: over-current condition on port 1
[    5.606985] hub 1-0:1.0: over-current condition on port 1
[    5.866990] hub 1-0:1.0: over-current condition on port 1
[    6.126976] hub 1-0:1.0: over-current condition on port 1
[    6.386980] hub 1-0:1.0: over-current condition on port 1
[    6.646976] hub 1-0:1.0: over-current condition on port 1
[    6.906974] hub 1-0:1.0: over-current condition on port 1
[    7.166987] hub 1-0:1.0: over-current condition on port 1
[    7.427043] hub 1-0:1.0: over-current condition on port 1
[    7.687078] hub 1-0:1.0: over-current condition on port 1
[    7.947139] hub 1-0:1.0: over-current condition on port 1
[    8.207116] hub 1-0:1.0: over-current condition on port 1
[    8.467097] hub 1-0:1.0: over-current condition on port 1
[    8.727113] hub 1-0:1.0: over-current condition on port 1
[    8.987448] hub 1-0:1.0: over-current condition on port 1
systemd-fsck[56]: Angstrom-Cloud9-: clean, 28959/874496 files, 748256/3494137 blocks
[    9.247089] hub 1-0:1.0: over-current condition on port 1
[    9.507206] hub 1-0:1.0: over-current condition on port 1

From what I've read online about this:
  1. too much current is being consumed by whatever is plugged into that port, or 
  2. the power FET (U13?) for the USB port has been damaged, causing it to indicate an over-current condition
On my design (BB-based), P1 (I'm assuming it's referring to the P10 Ethernet connector) is not soldered. And I doubt it's due to the power FET.

Maybe if I compile the kernel without davinci_mdio?

Víctor MV

unread,
Jan 10, 2014, 7:22:06 PM1/10/14
to beagl...@googlegroups.com

Loren Amelang

unread,
Jan 10, 2014, 11:09:22 PM1/10/14
to beagl...@googlegroups.com
@Victor:

In your first post, you had exactly my error report. So do I understand correctly you tried the two older images listed below and received the totally different error reports? I've never seen any of those messages, but I've only tried newer images, like Angstrom-Cloud9-IDE-GNOME-eglibc-ipk-v2012.12-beaglebone-2013.09.05.img.xz. 
 

[    0.651272] omap2_set_init_voltage: unable to get clk dpll1_ck
[    0.657458] omap2_set_init_voltage: unable to set vdd_mpu_iva
[    0.663475] omap2_set_init_voltage: unable to get clk l3_ick

Looks like those are complaining about clocks? Have you seen this:

about a problem with clocks not being enabled before mdio: probe

"Make the driver control the device clocks. Appearantly, the Davinci
platform probes this driver with the clock all powered up, but on OMAP,
this isn't the case."

And Vaibhav's reply:
As per the commit logs of the mainline kernel a patch adding pm_runtime_* 
calls in the driver went in around 3.7-rc3.


And for the over-current messages, have you seen this:
-----
And if you go to 6V, it will blow up the power control switch for the USB host, rated at 5.5VDC maximum. 
Gerald
-----
Looks like the TPS2051 is one of the few components that isn't protected from overvoltage by the TPS65217C.

Víctor MV

unread,
Jan 11, 2014, 8:11:35 AM1/11/14
to beagl...@googlegroups.com
Thanks for your comments @Loren.

It's quite odd that the board boots fine with old kernels and not with new ones, isn't it? I tried the images available at http://www.armhf.com/index.php/boards/beaglebone-black/:
pretty much the same result as you get:

....
[    2.477446] registered taskstats version 1
[    2.538597] davinci_mdio 4a101000.mdio: davinci mdio revision 1.6
[    2.545147] davinci_mdio 4a101000.mdio: no live phy, scanning all
[    2.552441] davinci_mdio: probe of 4a101000.mdio failed with error -5
[    2.559885] Detected MACID = bc:6a:29:84:8d:3a
[    2.565340] Unhandled fault: external abort on non-linefetch (0x1008) at 0xd0894000
[    2.573651] Internal error: : 1008 [#1] SMP ARM
[    2.578448] Modules linked in:
[    2.581705] CPU: 0    Not tainted  (3.8.13-bone30 #1)
[    2.587076] PC is at cpsw_probe+0x528/0xbc8
[    2.591518] LR is at ioremap_page_range+0xd8/0x16c
[    2.596595] pc : [<c03f23f4>]    lr : [<c02e6168>]    psr: a0000113
[    2.596595] sp : cf05de38  ip : cf04d250  fp : cf42f298
[    2.608723] r10: 00000001  r9 : cf42f540  r8 : d0894000
[    2.614252] r7 : cf113800  r6 : 00000000  r5 : cf113810  r4 : cf42f000
[    2.621154] r3 : 00000000  r2 : 00000000  r1 : 4a100e13  r0 : d0894000
[    2.628061] Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
[    2.635788] Control: 10c5387d  Table: 80004019  DAC: 00000015
[    2.641865] Process swapper/0 (pid: 1, stack limit = 0xcf05c240)
[    2.648216] Stack: (0xcf05de38 to 0xcf05e000)
[    2.652831] de20:                                                       00000000 00000000
[    2.661486] de40: cf447c08 cf42f540 00000000 c014c2f0 22222222 00000020 00000000 cf447c88
[    2.670140] de60: cf447c08 cf447c08 00000008 c014c1e0 00000000 cf447c08 cf112488 cf446d40
[    2.678794] de80: 00000000 c014cba0 cf0474b8 c005e608 00000000 00000003 cf112488 00000000
[    2.687446] dea0: c0a171ec cf113810 cf113818 cf113810 cf113844 c0a171ec c098c5b4 c09a9dc0
[    2.696099] dec0: 00000000 cf05c008 00000000 c037c480 00000000 cf113810 cf113844 c098c5b4
[    2.704751] dee0: 00000000 c037c66c 00000000 c098c5b4 c037c604 c037acbc cf047478 cf111c80
[    2.713405] df00: c098c5b4 cf446d40 c0981ff0 c037bc44 c07ee6aa c07ee6aa 00000000 c098c5b4
[    2.722060] df20: c090ebe8 c09218d4 c0901ed4 c037cbb8 00000007 c090ebe8 c09218d4 c0901ed4
[    2.730716] df40: c09a9dc0 c0008894 c0901ed4 0000f442 c0921900 00000008 00000007 c090ebe8
[    2.739369] df60: c09218d4 c09a9dc0 c09a9dc0 000000f1 c090ebf0 c08dc918 00000007 00000007
[    2.748020] df80: c08dc270 00000000 00000000 c05fd740 00000000 00000000 00000000 00000000
[    2.756672] dfa0: 00000000 c05fd748 00000000 c000d478 00000000 00000000 00000000 00000000
[    2.765323] dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[    2.773974] dfe0: 00000000 00000000 00000000 00000000 00000013 00000000 ffeffae6 5fbfaaaa
[    2.782667] [<c03f23f4>] (cpsw_probe+0x528/0xbc8) from [<c037c480>] (driver_probe_device+0xa4/0x1e4)
[    2.792340] [<c037c480>] (driver_probe_device+0xa4/0x1e4) from [<c037c66c>] (__driver_attach+0x68/0x8c)
[    2.802304] [<c037c66c>] (__driver_attach+0x68/0x8c) from [<c037acbc>] (bus_for_each_dev+0x70/0x84)
[    2.811885] [<c037acbc>] (bus_for_each_dev+0x70/0x84) from [<c037bc44>] (bus_add_driver+0xdc/0x218)
[    2.821463] [<c037bc44>] (bus_add_driver+0xdc/0x218) from [<c037cbb8>] (driver_register+0x9c/0x124)
[    2.831044] [<c037cbb8>] (driver_register+0x9c/0x124) from [<c0008894>] (do_one_initcall+0x8c/0x150)
[    2.840729] [<c0008894>] (do_one_initcall+0x8c/0x150) from [<c08dc918>] (kernel_init_freeable+0x108/0x1cc)
[    2.850979] [<c08dc918>] (kernel_init_freeable+0x108/0x1cc) from [<c05fd748>] (kernel_init+0x8/0xe4)
[    2.860665] [<c05fd748>] (kernel_init+0x8/0xe4) from [<c000d478>] (ret_from_fork+0x14/0x3c)
[    2.869505] Code: e59f1650 ebfe1d58 ea0000d1 e58485c0 (e5982000) 
[    2.875956] ---[ end trace 85aa0dcf7be9c2ab ]---
[    2.881634] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b

I'll try to compile the kernel myself and play a bit with the modules to see if there's something that can be done.

Víctor MV

unread,
Jan 12, 2014, 1:49:21 PM1/12/14
to beagl...@googlegroups.com
I ended up compiling the kernel again and deactivating the TI MDIO driver from menuconfig. After this tweak the 3.8 kernel boots fine.
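
For anyone wanting to check what a given kernel build has enabled before and after a change like this, the .config file can be inspected directly. A rough Python sketch, assuming the option in question is CONFIG_TI_DAVINCI_MDIO (that symbol name is a guess at what "TI MDIO driver" refers to in menuconfig, and the sample config text is hypothetical). Other options, such as the CPSW driver, may select the MDIO driver, in which case it cannot be switched off on its own:

```python
def config_state(config_text, symbol):
    """Return 'y' or 'm' if set, 'n' if explicitly unset, None if absent."""
    for line in config_text.splitlines():
        line = line.strip()
        # Disabled options appear as a comment in this exact form.
        if line == f"# {symbol} is not set":
            return "n"
        if line.startswith(symbol + "="):
            return line.split("=", 1)[1]
    return None


# Hypothetical excerpt of a kernel .config, for illustration only.
SAMPLE_CONFIG = """\
CONFIG_TI_DAVINCI_MDIO=y
# CONFIG_TI_DAVINCI_EMAC is not set
"""

for symbol in ("CONFIG_TI_DAVINCI_MDIO", "CONFIG_TI_DAVINCI_EMAC", "CONFIG_TI_CPSW"):
    print(symbol, config_state(SAMPLE_CONFIG, symbol))
```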

I briefly described what I did here. Hope somebody can benefit from it.

Loren Amelang

unread,
Jan 21, 2014, 6:01:18 PM1/21/14
to beagl...@googlegroups.com
My board is back from RMA, with what looks like a new ethernet chip. It is ever so slightly raised up along one edge, though the pin alignment and soldering job are perfect. I guess it could have been that way before, but usually that's a sign of manual replacement. The only info I was able to get from the RMA Team was: 
---
After running the diagnostic tests, we found that there was an Ethernet malfunction. We have fixed the issue and everything is properly working. 
---

The board was carefully solvent cleaned after the repair; a little glob of glue or rosin I had noticed before is now gone. But I noticed lots of tiny solder splashes on the bottom of the board, mostly along the expansion connector pins. A couple of them could have been a real problem if the board coating hadn't protected the traces. All popped off easily with a fingernail or blunt plastic tool. 

So far, the board boots fine and works as expected. 


The differences between booting and panic:

< cpsw, usb_ether
---
> Phy not found  <-- with bad ethernet, just before reading uEnv.txt
> PHY reset timed out
> cpsw, usb_ether

< [ time ] pinctrl-single 44e10800.pinmux: could not request pin 21 on device pinctrl-single
< systemd-fsck[85]: Angstrom: clean, 49509/112672 files, 354728/449820 blocks
< [ time ] libphy: PHY 4a101000.mdio:01 not found  <-- with good ethernet!
< [ time ] net eth0: phy 4a101000.mdio:01 not found on slave 1  <-- last line before logo
< .---O---.                                           
< |       |                  .-.           o o        
< |   |   |-----.-----.-----.| |   .----..-----.-----.
< |       |     | __  |  ---'| '--.|  .-'|     |     |
< |   |   |  |  |     |---  ||  --'|  |  |  '  | | | |
< '---'---'--'--'--.  |-----''----''--'  '-----'-'-'-'
<                 -'  |
<                 '---'
---
> [ time ] pinctrl-single 44e10800.pinmux: could not request pin 21 on device pinctrl-single
> [ time ] Unhandled fault: external abort on non-linefetch (0x1008) at 0xe09fe000

So in both conditions it complains about "phy not found"! With a bad chip, it complains near the beginning of U-Boot. With working ethernet, it complains at the very end of kernel boot. It seems like someone who knows the details of cpsw_probe needs to figure out how to make it report a failed ethernet chip gracefully. And why libphy still reports an error when the ethernet is good and boot is successful. 
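
A comparison like the one above is easier to repeat if the kernel timestamps are stripped first, since they differ on every boot and make an ordinary diff useless. A minimal Python sketch using the standard difflib module. The function names are made up for illustration, and the timestamp on the sample "good" eth0 line is invented because the real one was not recorded:

```python
import difflib
import re

# Matches the leading "[    2.703442] " stamp printed by the kernel.
TIMESTAMP_RE = re.compile(r"^\[\s*\d+\.\d+\]\s*")


def strip_timestamp(line):
    """Remove the leading kernel timestamp so lines from two boots compare equal."""
    return TIMESTAMP_RE.sub("", line.rstrip("\n"))


def diff_boot_logs(good_lines, bad_lines):
    """Return unified-diff lines between two boot logs, ignoring timestamps."""
    good = [strip_timestamp(line) for line in good_lines]
    bad = [strip_timestamp(line) for line in bad_lines]
    return list(
        difflib.unified_diff(good, bad, fromfile="good", tofile="bad", lineterm="", n=0)
    )


good_boot = [
    "[    2.703442] Detected MACID = 90:59:af:4d:71:eb",
    "[    9.100000] net eth0: phy 4a101000.mdio:01 not found on slave 1",
]
bad_boot = [
    "[    2.479609] Detected MACID = 90:59:af:4d:71:eb",
    "[    2.484177] Unhandled fault: external abort on non-linefetch (0x1008) at 0xe089e000",
]
for line in diff_boot_logs(good_boot, bad_boot):
    print(line)
```

The MACID line drops out of the result because it is identical once the timestamp is gone, leaving only the lines that actually differ between the two boots.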


I'm finally able to login and view files. I'm wondering if these are standard, or are they leftover from the RMA testing:
---
root@beaglebone:/# cat /media/BEAGLEBONE/uEnv.txt
optargs=quiet drm.debug=7
root@beaglebone:/# cat /media/BEAGLEBONE/uEnv.txtboot
optargs=run_hardware_tests quiet
---


After receiving the board back, I couldn't use VNC or SSH, though I could ping the ethernet ports. In both cases Wireshark showed my external request followed by an immediate RST from the BBB. I tried re-installing the previous VNC package, but it said "Package x11vnc (0.9.13-r0.8) installed in root is up to date." Still, the trick to make it load itself didn't seem to work. I found 
and that installed and worked immediately after a restart. The "netstat -lntu" command did not see it until after it was active, even though it did seem to see all the other open ports immediately after booting. 
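
An immediate RST in Wireshark generally means the host is reachable but nothing is listening on that port. A small sketch of checking that from the client side without a packet capture follows; the function name and the three state labels are illustrative only.

```python
import socket

def tcp_port_state(host, port, timeout=3.0):
    """Classify a TCP port as 'open', 'refused', or 'unreachable'.

    'refused' corresponds to the peer answering the SYN with an RST,
    i.e. the machine is up but no service is listening on the port.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "refused"
    except OSError:
        # Timeout, no route to host, name lookup failure, and similar.
        return "unreachable"
```

Calling it with the board's address and port 22 (SSH) or 5900 (the usual VNC port) would distinguish "service not running" from "board not reachable"; substitute whatever address your BBB actually has.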

SSH was trickier. I finally found
-----
"ssh_exchange_identification: Connection closed by remote host"
From looking at the script above (/etc/init.d/dropbear) it seems like the identity file in /etc/dropbear/dropbear_rsa_host_key might be causing the problem and the script recreates them if they don't exist.  So I removed it and started dropbear (/etc/init.d/dropbear start) again and it generated new keys and then I could ssh in.  It now works!  (The side effect of doing this is you also have to remove a line in the client's ~/.ssh/known_hosts because the identity of the beaglebone has changed.)
-----
My /etc/dropbear/dropbear_rsa_host_key file was zero-length, so I removed it. The "dropbear start" command didn't work for me; a BBB restart was required after I manually deleted the key file. I also unchecked the "History" box in TeraTerm, and it saved a new RSA fingerprint. SSH now works with the default password choice and a blank password field, and also works with Tunnelier. 
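
A zero-length key file is easy to miss in a directory listing. As a rough sketch (the helper name is made up for illustration), something like this would flag any empty files in a directory such as /etc/dropbear:

```python
import os

def empty_files(directory):
    """Return the sorted paths of regular files in `directory` that are zero bytes long."""
    found = []
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        if os.path.isfile(path) and os.path.getsize(path) == 0:
            found.append(path)
    return sorted(found)
```

It only reports; deleting the file and restarting is still a manual step, as described above.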


Other random things I just learned...  

At least on Windows, when the USB cable is connected, there is a "Gadget Serial" device USBSER000 from "Linux Developer Community" available as a COM port (ttyGS0 in the BBB), alongside the "USB Serial Port" VCP0 from FTDI which is my debug console adapter COM port (ttyO0 in the BBB). The "gadget" port is only active after boot is complete, so I didn't have much opportunity to see it before! But it claimed a lower COM port number, so I assume it installed along with the ethernet gadget when I first connected via USB. 


That leaves the question, could I have somehow fried my ethernet chip? I checked my incoming cable and it is fully DC isolated. The connector on the BBB is fully DC isolated. It is not a POE-capable connector, there is no diode array that could feed power into the grounded pin 8. So if I did something to cause my failure, it was not through the ethernet cable. 

Vaibhav Bedia

Jan 24, 2014, 7:04:06 PM
to beagl...@googlegroups.com
Thanks for updating the thread. Good to know it's working now.



That leaves the question, could I have somehow fried my ethernet chip? I checked my incoming cable and it is fully DC isolated. The connector on the BBB is fully DC isolated. It is not a POE-capable connector, there is no diode array that could feed power into the grounded pin 8. So if I did something to cause my failure, it was not through the ethernet cable.

Hmm, this could be a one-off case. I guess if there are more instances like this then someone needs to dig deeper. For now just hack away ;)

Loren Amelang

Feb 9, 2014, 9:13:42 PM
to beagl...@googlegroups.com
My BBB is still working great, wireless and all, after the RMA repair. But one silly detail has been bothering me...  

The ifconfig-reported MAC address of the usb0 port changed after the repair.
Before RMA repair:
usb0      Link encap:Ethernet  HWaddr 6E:5A:F6:F0:F3:45
After RMA repair:
usb0      Link encap:Ethernet  HWaddr 06:57:23:9E:EA:C7

Both of those are locally administered addresses, so they probably aren't read from any hardware, and thus probably don't suggest anything about what was done to my board during the RMA repair. But why the change? 
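
The "locally administered" claim can be checked from the first octet alone: bit 0x02 marks a locally administered address and bit 0x01 marks multicast. A minimal sketch follows; the function name is just for illustration.

```python
def mac_flags(mac):
    """Report the locally-administered and multicast bits of a MAC address.

    Accepts ':' or '-' separated addresses, e.g. '6E:5A:F6:F0:F3:45'.
    """
    first_octet = int(mac.replace("-", ":").split(":")[0], 16)
    return {
        "locally_administered": bool(first_octet & 0x02),
        "multicast": bool(first_octet & 0x01),
    }
```

Both usb0 addresses above have the 0x02 bit set in their first octet (0x6E and 0x06), while an address starting with 0x90 does not.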

-----
When using a USB-Ethernet dongle a valid MAC address must be set in the environment. To create a valid address please read this page. Then issue the following command:
U-Boot # setenv usbethaddr value:from:link:above
-----

I guess that "dongle" setting doesn't apply to the usb0 "Linux USB Ethernet/RNDIS Gadget", because none of my environments have ever included it. What I do find in printenv, before and after repair, is:
 usbnet_devaddr=90:59:af:4d:71:eb
Which is the same as the eth0 MAC!

The usb0 MACs shown by ifconfig don't appear in any printenv or boot log, only in ifconfig! They also don't show in externally connected systems:

Ethernet - 140120:
Windows ipconfig /all
---
Ethernet adapter Local Area Connection 5:
   Description . . . . . . . . . . . : Linux USB Ethernet/RNDIS Gadget
   Physical Address. . . . . . . . . : 90-59-AF-4D-71-ED
---

That is one of the two MAC addresses stored in BBB hardware. It, and the ifconfig eth0 MAC address, stayed the same after the repair: 

-----
The values read from Control Module (Base address 0x44E1_0000)  MAC_ID0_LO register (Offset 0x630), MAC_ID0_HI register (Offset 0x634), MAC_ID1_LO register (Offset 0x638), and MAC_ID1_HI register (Offset 0x63C) represent  unique MAC addresses assigned to each AM335x device.  The values in these registers are programmed into each AM335x device by TI and can not be changed.

Software can be configured to read and use the MAC addresses programmed into the AM335x device or non-volatile memory devices attached to AM335x. 
-----
-----
You can read them out by using devmem2:
root@beaglebone:/dev# devmem2 0x44e10630  [and 34/38/3C]
These values are read out in am33xx_cpsw_init() in arch/arm/mach-omap2/devices.c. 
-----
-->  Mine show the eth0 address, and another numerically two higher (...:EB versus ...:ED) that Windows shows as the usb0 port. 
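
For reference, a sketch of turning a HI/LO register pair into a printable MAC address. The byte order here is an assumption based on how U-Boot's AM335x code unpacks these registers (lowest byte of HI is the first octet, and LO holds the last two octets); verify it against the TRM before relying on it. The register values in the usage note below are constructed from the MAC shown in this post, not copied from a real devmem2 readout.

```python
def mac_from_registers(mac_hi, mac_lo):
    """Assemble a MAC string from MAC_IDx_HI / MAC_IDx_LO register values.

    Assumed layout: HI bits 7:0 hold the first octet, HI bits 31:24 the
    fourth; LO bits 7:0 hold the fifth octet and LO bits 15:8 the sixth.
    """
    octets = [
        mac_hi & 0xFF,
        (mac_hi >> 8) & 0xFF,
        (mac_hi >> 16) & 0xFF,
        (mac_hi >> 24) & 0xFF,
        mac_lo & 0xFF,
        (mac_lo >> 8) & 0xFF,
    ]
    return ":".join(f"{octet:02x}" for octet in octets)

def mac_to_int(mac):
    """Convert a ':' or '-' separated MAC string to a 48-bit integer."""
    return int(mac.replace(":", "").replace("-", ""), 16)
```

Under that assumed layout, a HI value of 0x4DAF5990 with a LO value of 0x0000EB71 would decode to 90:59:af:4d:71:eb, and subtracting that address from 90-59-AF-4D-71-ED with mac_to_int gives a difference of 2.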

So...  

Where do the usb0 MAC addresses reported by ifconfig come from?

Why did they change after the repair?

Why do ifconfig and Windows disagree about the usb0 MAC, when they agree about the eth0 MAC? 