Re: [Opencompute-onie] can't boot ONL after ONIE update

Curt Brune

Aug 8, 2014, 11:01:40 AM
to Nikolay Shopik, opennetw...@googlegroups.com
Hello Nikolay -

The ONIE boot up and NOS install look OK to me. Adding the ONL folks
for more help.

Cheers,
Curt

On Fri Aug 08 11:46, Nikolay Shopik wrote:
> Hi Curt,
>
> This is Quanta LB9, here is full console output
> https://gist.github.com/nshopik/e466ae6b0e760062a841
>
> On 07/08/14 23:49, Curt Brune wrote:
> > Hello Nikolay -
> >
> > What hardware platform are you using?
> >
> > If you could attach a complete console log from power up to the
> > failure that would be very helpful.
> >
> > Cheers,
> > Curt
> >
> > On Thu Aug 07 16:34, Nikolay Shopik wrote:
> >> Hey,
> >>
> >> I've updated to ONIE 2014.08-rc1. After the update and a reboot ONL was
> >> able to boot, but after a power off/on it gets stuck on error messages.
> >>
> >> I re-ran the installer, but got the same results.
> >>
> >> ** No partition table - ide 0 **
> >> WARNING: adjusting available memory to 30000000
> >> Wrong Image Format for bootm command
> >> ERROR: can't get kernel image!
> >>
> >> Can anyone give me a clue what's wrong? I'm unsure whether it's an ONL
> >> re-partition issue or some ONIE change.
> >>
> >> Thanks
> >> _______________________________________________
> >> Opencompute-onie mailing list
> >> Opencomp...@lists.opencompute.org
> >> http://lists.opencompute.org/mailman/listinfo/opencompute-onie

Rob Sherwood

Aug 11, 2014, 1:16:56 PM
to Curt Brune, Nikolay Shopik, opennetw...@googlegroups.com
Hi Nikolay,

Sorry to see you're running into problems here -- though I have to admit that I'm confused as to why you're seeing problems here.  Can you please send us a complete dump of your uboot environment?  I'm trying to understand if there is some incompatibility with uboot, with the new ONIE, with the software, or even if the hardware components have changed.  Fwiw, this is the system that I personally test my ONL builds with and it works for me :-/

Let's see if we can get some more information and hopefully something shakes loose from there.

Thanks for the interest,

- Rob

Nikolay Shopik

Sep 1, 2014, 2:35:36 PM
to opennetw...@googlegroups.com
Hi Rob,

Here is the output: https://gist.github.com/nshopik/cfb3d25e49c22c488a44

We only recently received these for evaluation. You may also notice
that we are running the console at 9600 baud; we found 115200 rather
unstable, sometimes with lots of garbage. We tried replacing the USB-COM
adapter with an onboard COM port, but that didn't help.

Have any of you had problems with 115200?

Rob Sherwood

Sep 1, 2014, 2:53:01 PM
to Nikolay Shopik, opennetw...@googlegroups.com
Hi Nikolay,

Sorry for the delay in replies.

So, the error you're seeing is quite strange.  It is uboot saying that it doesn't have an IDE partition in the right place, which should have been set up by the installer.  Do you have any way of capturing the output of the installer?  There must be some problem during the installation, and the log should show it.

Also, if you could try running the following commands from uboot and sending me the output (command by command), it would be appreciated:

bdinfo

flinfo

Thanks,

- Rob

Rob Sherwood

Sep 1, 2014, 2:55:43 PM
to Nikolay Shopik, opennetw...@googlegroups.com
Actually, if you could also send me the output of:

diskboot 0x10000000 0:1     (from uboot)


That should help as well.  I'm trying to understand why your system is different from my reference system.


Thanks,


- Rob


Nikolay Shopik

Sep 8, 2014, 5:08:55 AM
to Rob Sherwood, opennetw...@googlegroups.com
Hi Rob,

Here is the output:

=> diskboot 0x10000000 0:1
** No partition table - ide 0 **

=> bdinfo
memstart    = 0x00000000
memsize     = 0x40000000
flashstart  = 0xFE000000
flashsize   = 0x04000000
flashoffset = 0x00000000
sramstart   = 0x00000000
sramsize    = 0x00000000
immr_base   = 0xE0000000
bootflags   = 0x00000000
vco         =    660 MHz
sccfreq     =    165 MHz
brgfreq     =    165 MHz
intfreq     =    825 MHz
cpmfreq     =    330 MHz
busfreq     =    330 MHz
ethaddr     = 00:E0:0C:00:00:FD
IP addr     = 192.168.2.1
baudrate    =  38400 bps
relocaddr   = 0x3FF30000

Rob Sherwood

Sep 10, 2014, 2:19:50 AM
to Nikolay Shopik, opennetw...@googlegroups.com
Hi Nikolay,

The good news is we've found the most basic problem -- your IDE partition did not get created.  The bad news is we don't know why.  Any chance you can run the installer script again and send me the output?  Maybe a simpler way to handle this would be to get some additional info from uboot:

From uboot, try:
ide info
ide device
ide device 0

My guess is that one or more of the above commands will fail and that will give us more information.  That said, the complete output of running the installer would be the most useful thing.

Thanks for keeping with it -- we'll debug this :-)

- Rob

Nikolay Shopik

Sep 10, 2014, 7:29:45 AM
to Rob Sherwood, opennetw...@googlegroups.com
=> ide info

IDE device 0: Model: 4GB CompactFlash Card Firm: Ver6.04J Ser#:
CDE207331D0100001270
Type: Hard Disk
Capacity: 3811.9 MB = 3.7 GB (7806960 x 512)
=> ide device

IDE device 0: Model: 4GB CompactFlash Card Firm: Ver6.04J Ser#:
CDE207331D0100001270
Type: Hard Disk
Capacity: 3811.9 MB = 3.7 GB (7806960 x 512)
=> ide device 0

IDE device 0: Model: 4GB CompactFlash Card Firm: Ver6.04J Ser#:
CDE207331D0100001270
Type: Hard Disk
Capacity: 3811.9 MB = 3.7 GB (7806960 x 512)
... is now current device


The output looks good. Regarding the installer script, take a look at
line 273 here:
https://gist.github.com/nshopik/e466ae6b0e760062a841#file-gistfile1-txt-L273
It seems strange that busybox shows me the help for umount during the
install; possibly it does not accept some values passed to it?

But these messages also appeared when I was on the old uboot and was able
to install without problems; still, they are worth noting.

Rob Sherwood

Sep 11, 2014, 8:42:58 PM
to Nikolay Shopik, opennetw...@googlegroups.com
Hi Nikolay,

First - did you say this was working with your old version of uboot but is now no longer working?  I didn't see that in the initial mail, and it might explain a lot of things.  From your mail, you're running "2013.01.01-g40d0967" and my reference system is running "U-Boot 2010.12 (Oct 08 2013 - 17:11:37)".  That's at least one difference.

I just re-installed my box and did a side-by-side diff of the install scripts (see attached for the full working version) and it looks like things are effectively identical.  I'm going to try to upgrade my version of uboot and see if I can reproduce your problem.  

Just for my sanity, where did you get your new uboot image?  Did it ship that way or did you build it yourself?  Also, can you send me the output of `help bootm` from the uboot prompt to make sure it takes the same format commands.

Thanks,

- Rob



ONIE-installer-working.txt

Nikolay Shopik

Sep 12, 2014, 3:50:50 AM
to Rob Sherwood, opennetw...@googlegroups.com
Hi Rob,

Sorry, in my first mail I didn't mention which version I upgraded from
to 2014.08-rc1. IIRC I had a version similar to yours (U-Boot 2010.12
(Oct 08 2013 - 17:11:37)) from the factory.

I've built ONIE+uboot from git.

=> help bootm
bootm - boot application image from memory

Usage:
bootm [addr [arg ...]]
- boot application image stored in memory
passing arguments 'arg ...'; when booting a Linux kernel,
'arg' can be the address of an initrd image
When booting a Linux kernel which requires a flat device-tree
a third argument is required which is the address of the
device-tree blob. To boot that kernel without an initrd image,
use a '-' for the second argument. If you do not pass a third
a bd_info struct will be passed instead

For the new multi component uImage format (FIT) addresses
must be extened to include component or configuration unit name:
addr:<subimg_uname> - direct component image specification
addr#<conf_uname> - configuration specification
Use iminfo command to get the list of existing component
images and configurations.

Sub-commands to do part of the bootm sequence. The sub-commands must be
issued in the order below (it's ok to not issue all sub-commands):
start [addr [arg ...]]
loados - load OS image
ramdisk - relocate initrd, set env initrd_start/initrd_end
fdt - relocate flat device tree
cmdline - OS specific command line processing/setup
bdt - OS specific bd_t processing
prep - OS specific prep before relocation or go
go - start OS

Nikolay Shopik

Oct 6, 2014, 4:52:58 AM
to opencomp...@lists.opencompute.org, opennetw...@googlegroups.com
Hi guys,

So after a few more days of debugging ONL and doing a manual install, I am
quite sure it's not an ONL issue. Everything is set in place correctly
during the install.

The partitions are set up correctly and the image is loaded, but soon after
a reboot, once uboot kicks in, I can no longer mount any partition. I can
boot into ONIE, but I am not able to mount any of the just-created
partitions, while fdisk still says they exist.

So far my only idea is that the endianness is wrong in the patch from
Quanta, right here:
https://github.com/opencomputeproject/onie/commit/b3150829e133bac27cccf38f450af883fb4afc90#diff-880e9b040369e0eec42c62860791445cR1433

The PPC platform is big endian, but I can barely read large amounts of C
code. While I highly doubt that this is the case, I just don't understand
what goes wrong with the partitions after the ONL installation completes
and the box reboots, leaving them not mountable.

I've even checked whether it's a uboot bug. I found one or two partition
bugs in uboot, but these are hardly my case.
http://thread.gmane.org/gmane.comp.boot-loaders.u-boot/155794/focus=156173
http://thread.gmane.org/gmane.comp.boot-loaders.u-boot/188580/focus=188692

I've also seen lots of people complain about the "No partition table"
error in uboot, but most of them are using MMC/SD cards.

I'd appreciate any pointers.

Rob Sherwood

Oct 6, 2014, 12:56:05 PM
to Nikolay Shopik, opencomp...@lists.opencompute.org, opennetw...@googlegroups.com
Thanks for continuing to dig into this Nikolay.

Can you say a little more about what testing you've done?  It would be ideal to have a partition table dump after ONL has done the install but before it reboots.  Then we can all be fairly certain the problem is elsewhere.

For others who have not been following this, the whole process (uboot + ONIE + ONL installer) worked for Nikolay until he upgraded uboot, so at least for my money, that's where the likely culprit is.

- Rob

Curt Brune

Oct 6, 2014, 1:16:51 PM
to Rob Sherwood, Nikolay Shopik, opencomp...@lists.opencompute.org, opennetw...@googlegroups.com
I agree with Rob. Perhaps something changed with the LB9 u-boot. I
would reach out to the Quanta ONIE maintainer. IIRC the LB9 has a IDE
compact flash card -- there may be some inits or resets required that
only work right after a power cycle (a warm boot may not be thorough
enough).

As a simple experiment you could:

1. power cycle
2. use fdisk from ONIE to manually create a couple of partitions
3. reboot (warm boot) -- do you see the problem here?
4. now power cycle -- do you see the problem here?

Cheers,
Curt

Nikolay Shopik

Oct 6, 2014, 1:17:32 PM
to Rob Sherwood, opencomp...@lists.opencompute.org, opennetw...@googlegroups.com
Well, basically I ran all the commands of installer.sh by hand and made
sure the partitions are created, working (mountable), and contain
data until the reboot.

After the reboot, uboot complains that it can't get the kernel image and
boots into ONIE, where I can no longer mount any of the partitions. fdisk
says they still exist, so the partition table is intact but something is
wrong with the FS?

But if I run mkdosfs /dev/sda1, mount it, create a file and reboot, I can
mount /dev/sda1 and see this file.

Also, today I tried zeroing my CF card with dd if=/dev/zero of=/dev/sda.
After the command completed I ran hexdump -C /dev/sda and looked at the
head and tail, and it looks like the data wasn't even zeroed, which is
kind of strange.

Additionally, I've rebuilt ONIE on an x86-64 platform (originally my ONIE
was built on an i386 PC), just in case.
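
The head-and-tail check described above (done by eye with hexdump) can also be scripted. The following is a minimal sketch, not something from the thread: the function name and the 4096-byte span are made up for illustration, and the demo runs against a scratch file rather than a real /dev/sda.

```python
import os
import tempfile


def head_tail_zero(path, span=4096):
    """Return (head_is_zero, tail_is_zero) for the first and last `span` bytes."""
    with open(path, "rb") as dev:
        size = dev.seek(0, os.SEEK_END)   # seek() returns the new offset
        dev.seek(0)
        head = dev.read(min(span, size))
        dev.seek(max(size - span, 0))
        tail = dev.read(span)
    # any() is True as soon as one byte is non-zero
    return not any(head), not any(tail)


# Demo on a scratch file standing in for the card:
# 16 KiB of zeros followed by one stray non-zero byte.
with tempfile.NamedTemporaryFile(delete=False) as scratch:
    scratch.write(bytes(16384) + b"\x01")
    scratch_path = scratch.name

result = head_tail_zero(scratch_path)
os.unlink(scratch_path)
print(result)  # (True, False)
```

On a real system the path would be the block device itself, read as root; that usage is an assumption and was not tested here.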

Nikolay Shopik

Oct 7, 2014, 12:25:34 PM
to Curt Brune, Rob Sherwood, opencomp...@lists.opencompute.org, opennetw...@googlegroups.com
So another update: the partitions are created normally and work correctly;
nothing on them is corrupted after a warm or cold reboot. Previously I had
only tried the first partition, but it was never meant to be mountable as
it's just a gzipped kernel, silly me.

The original factory uboot 2010.12 assumes the IDE disk is at /dev/hda,
while the newer one uses /dev/sda, so uboot always complains about no
partition for the ide part command. I've seen newer uboot builds using sda
on some devices.

The device tree structure has only sda in the patch from Quanta:
https://github.com/opencomputeproject/onie/commit/b3150829e133bac27cccf38f450af883fb4afc90#diff-f948998caf65ecb861227ac9fa1ff009R69

An example of uboot changing the naming for some devices (hde->sda):
http://lists.denx.de/pipermail/u-boot/2010-August/075028.html

Rob's log clearly says it's loading from hda1:
https://gist.github.com/anonymous/5a523d8fce3f656ccb02#file-installer-latest-out-L570

Thoughts?

Rob Sherwood

Oct 9, 2014, 11:43:57 PM
to Nikolay Shopik, Curt Brune, opencomp...@lists.opencompute.org, opennetw...@googlegroups.com
I'm not too certain that this name change matters.  The installer runs from ONIE and ONIE has always thought the device was 'sda'.  The way ONL references the device is by the partition number.  Specifically, the "0:1" in the nos_bootcmd is "disk 0, partition 1", not referenced by the name:

platform_bootcmd='diskboot 0x10000000 0:1 ; setenv bootargs console=$consoledev,$baudrate onl_platform=powerpcquanta-lb9-r0; bootm 0x10000000'

I still think it would be worthwhile to interrupt the install after it's done (or boot back into ONIE afterwards) and send the output of fdisk just to make sure everything got set up correctly, e.g.:

ONIE:/ # fdisk /dev/sda

Command (m for help): p

Disk /dev/sda: 3997 MB, 3997163520 bytes
255 heads, 63 sectors/track, 485 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks  Id System
/dev/sda1               1           3       24066  83 Linux
/dev/sda2               4          12       72292+ 83 Linux
/dev/sda3              13         485     3799372+ 83 Linux

- Rob
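
As a cross-check of the fdisk listing above, the Blocks column can be recomputed from the cylinder ranges. This is a minimal sketch that was not part of the original thread. It assumes the classic DOS layout in which the first partition starts one track (63 sectors) into the disk; that assumption is chosen because it reproduces the 24066 figure. The function name is made up for the example.

```python
HEADS = 255
SECTORS_PER_TRACK = 63
SECTOR_BYTES = 512
SECTORS_PER_CYLINDER = HEADS * SECTORS_PER_TRACK  # 16065, as fdisk reports


def blocks_column(start_cyl, end_cyl, skipped_sectors=0):
    """Return the fdisk-style Blocks value (1 KiB units) for a cylinder range.

    fdisk appends '+' when the partition holds an odd number of
    512-byte sectors, i.e. half a block is left over.
    """
    sectors = (end_cyl - start_cyl + 1) * SECTORS_PER_CYLINDER - skipped_sectors
    blocks, leftover = divmod(sectors * SECTOR_BYTES, 1024)
    return f"{blocks}{'+' if leftover else ''}"


print(SECTORS_PER_CYLINDER * SECTOR_BYTES)                       # 8225280
print(blocks_column(1, 3, skipped_sectors=SECTORS_PER_TRACK))    # 24066
print(blocks_column(4, 12))                                      # 72292+
print(blocks_column(13, 485))                                    # 3799372+
```

All three values match the listing, which suggests the table in the example is internally consistent.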

Nikolay Shopik

Oct 10, 2014, 4:42:04 AM
to Rob Sherwood, Curt Brune, opencomp...@lists.opencompute.org, opennetw...@googlegroups.com
That is exactly what I did in my previous update: I checked the partitions
right after the install.

ONIE is working correctly; this is a u-boot issue. I don't know the
internal structure of u-boot, but my guess is that it may reference
disks by name internally, just like Linux, when you run the diskboot
command. It does detect the IDE device but won't recognize partitions with
the "ide part 0" command. So this hints at either different internal
naming or an endianness issue.

I strongly believe it's endianness, because the LB9 is the *ONLY* platform
with a u-boot patch that makes changes to cmd_ide.c.

I also got an image from Pica8 to test their software, and ran into the
same issue. Although the Pica8 NOS only asks for one partition, it still
has the same problem.

Regardless, I was able to work around the booting issue for ONL and the
Pica8 NOS and can boot into both. I merely replaced diskboot with
tftpboot.

Here is an example of the original Pica8 nos_bootcmd.

loadaddr=0x08000000
fdtaddr=0x400000
setenv bootargs root=/dev/hda1 rw noinitrd
console=ttyS0,$baudrate;ext2load ide 0:1 $loadaddr boot/uImage;ext2load
ide 0:1 $fdtaddr boot/LB9.dtb;bootm $loadaddr - $fdtaddr

I've replaced it with these commands in u-boot

setenv bootargs root=/dev/hda1 rw noinitrd console=ttyS0,$baudrate;
tftpboot 0x08000000 192.168.20.69:uImage
tftpboot 0x400000 192.168.20.69:LB9.dtb
bootm 0x08000000 - 0x400000

For ONL it's even simpler: I just need to tftpboot onl.loader into
0x10000000 and then bootm 0x10000000.

The iminfo command also lets me check that I loaded the correct image
into memory.

So I'm still questioning this changed line:
https://github.com/opencomputeproject/onie/blob/master/machine/quanta/quanta_lb9/u-boot/platform-quanta-lb9.patch#L1435

And I rather wonder whether Quanta actually tested the patch with ONIE as
a whole; maybe they tested with a different u-boot and didn't notice the
issue?
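
For readers without a console handy, the kind of check iminfo performs can be approximated on a host. The following is a minimal sketch that assumes the image is a legacy uImage (64-byte big-endian header, magic 0x27051956). FIT images are not covered, the header CRC is not verified, and the helper names and the synthetic demo image are made up for illustration.

```python
import struct
import zlib

UIMAGE_MAGIC = 0x27051956          # legacy uImage magic number
HEADER_FORMAT = ">7I4B32s"         # big-endian fields, 64 bytes in total
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)


def parse_uimage(blob: bytes) -> dict:
    """Parse a legacy uImage header and verify the payload CRC."""
    if len(blob) < HEADER_SIZE:
        raise ValueError("blob is shorter than a uImage header")
    (magic, _hcrc, _time, size, load, entry, dcrc,
     _os, _arch, _type, _comp, name) = struct.unpack_from(HEADER_FORMAT, blob)
    if magic != UIMAGE_MAGIC:
        raise ValueError(f"bad magic 0x{magic:08x}, not a legacy uImage")
    payload = blob[HEADER_SIZE:HEADER_SIZE + size]
    return {
        "name": name.rstrip(b"\0").decode("ascii", "replace"),
        "size": size,
        "load": load,
        "entry": entry,
        "data_crc_ok": len(payload) == size and zlib.crc32(payload) == dcrc,
    }


def make_demo_image(payload: bytes, name: bytes) -> bytes:
    """Build a synthetic image; header CRC and OS/arch/type fields stay 0."""
    header = struct.pack(HEADER_FORMAT, UIMAGE_MAGIC, 0, 0, len(payload),
                         0, 0, zlib.crc32(payload), 0, 0, 0, 0, name)
    return header + payload


info = parse_uimage(make_demo_image(b"pretend kernel", b"demo-image"))
print(info["name"], info["data_crc_ok"])  # demo-image True

try:
    parse_uimage(bytes(HEADER_SIZE))      # all zeros, e.g. nothing was loaded
except ValueError as exc:
    print(exc)                            # bad magic 0x00000000, not a legacy uImage
```

A buffer that fails the magic check here is the sort of input for which bootm would be expected to refuse to boot, which is consistent with the "Wrong Image Format" message seen after the failed disk read.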

quantasw...@quantatw.com

Oct 16, 2014, 7:49:25 AM
to opennetw...@googlegroups.com, rob.sh...@bigswitch.com, cu...@cumulusnetworks.com, opencomp...@lists.opencompute.org, sho...@inblock.ru
Hi Nikolay,
    Please take a look at this patch.

If this works, I will create a new pull request for this fix.


On Friday, October 10, 2014 at 4:42:04 PM UTC+8, Nikolay Shopik wrote:

Nikolay Shopik

Oct 17, 2014, 3:51:16 AM
to quantasw...@quantatw.com, opennetw...@googlegroups.com, rob.sh...@bigswitch.com, cu...@cumulusnetworks.com, opencomp...@lists.opencompute.org
Yep, this patch is working! Please create a new pull request.

On a side note, I thought the T3048-LY2 and T1048-LB9 were the same, with
the ASIC as the only difference. Why doesn't the T3048-LY2 require any
byte-swap patch at all? It still uses the same Freescale P2020 with a
big-endian architecture, or am I mistaken and the T3048-LY2 differs in
some way?

Curt, can we push this fix for 2014.11?

Thanks to the guys at Quanta for fixing this.

Curt Brune

Oct 17, 2014, 10:44:09 AM
to Nikolay Shopik, quantasw...@quantatw.com, opennetw...@googlegroups.com, rob.sh...@bigswitch.com, opencomp...@lists.opencompute.org
On Fri Oct 17 11:51, Nikolay Shopik wrote:
> Yep, this patch is working! Please create new pull request.
>
> Curt, can we push this fix for 2014.11?

Pushed to master:

- f43fcbb QuantaMesh 1000 Series T1048-LB9: Fix Compact Flash Card read fail while ide interface byte swap is not correct

Cheers,
Curt
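
The commit title above attributes the failure to an incorrect IDE byte swap. One way such a swap could surface as "No partition table" is sketched below: a DOS/MBR sector ends with the signature bytes 0x55 0xAA at offsets 510 and 511, and swapping the two bytes of every 16-bit word moves them out of place. This is an illustration under that assumption, not a trace of the actual U-Boot code path, and the helper names are made up for the example.

```python
SECTOR_SIZE = 512
MBR_SIGNATURE = b"\x55\xaa"  # bytes at offsets 510 and 511 of a DOS/MBR sector


def make_mbr_sector() -> bytes:
    """Build a minimal sector that carries only the MBR boot signature."""
    sector = bytearray(SECTOR_SIZE)
    sector[510:512] = MBR_SIGNATURE
    return bytes(sector)


def swap16(data: bytes) -> bytes:
    """Swap the two bytes inside every 16-bit word of an even-length buffer."""
    swapped = bytearray(len(data))
    swapped[0::2] = data[1::2]
    swapped[1::2] = data[0::2]
    return bytes(swapped)


def has_mbr_signature(sector: bytes) -> bool:
    """Check for the 0x55 0xAA signature a DOS partition parser looks for."""
    return sector[510:512] == MBR_SIGNATURE


good = make_mbr_sector()
bad = swap16(good)                      # what a wrongly swapped read would return

print(has_mbr_signature(good))          # True
print(has_mbr_signature(bad))           # False
print(has_mbr_signature(swap16(bad)))   # True
```

The same disk reads correctly from Linux because the kernel uses its own driver, which would fit the observation that ONIE and fdisk saw valid partitions while U-Boot did not.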