The first reboot after installing a ZFS-only (version 5/28) system
reports an error:
Attempting Boot From Hard Drive (C:)
gptzfsboot: error 1 lba 32
gptzfsboot: error 1 lba 1
gptzfsboot: No ZFS pools located, can't boot
The same installation procedure on an older ProLiant with a Compaq Smart Array 5i
does not cause any problems.
The system has been installed based on FreeBSD 8.2-20110731-SNAP i386 802510.
The P410i Controller presents two units, and the disk da0 has been partitioned as follows:
gpart destroy -F /dev/da0
dd if=/dev/zero of=/dev/da0 bs=1024 count=10000
gpart create -s GPT /dev/da0
gpart add -b 32K -s 64K -t freebsd-boot -l disk0boot /dev/da0
gpart add -s 30G -t freebsd-zfs -l disk0 /dev/da0
gpart add -s 4G -t freebsd-swap -l swap0 /dev/da0
gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 /dev/da0
gpart set -a bootme -i 1 /dev/da0
Early experimentation shows the following trace:
Attempting Boot From CD-ROM
Attempting Boot From Hard Drive (C:)
probe_drive(360): drive 0x0: type 0: unit 0: slice 0: part 0: <-- dsk.drive=0 instead of 0x80 ?
vdev_probe(): off=16384, sizeof(vdev_phys_t)=114688
vdev_read_phys(): reading 114688 bytes at 0x4000 to <-- *buf is empty
gptzfsboot: error 1 lba 32 <-- why lba is not zero ?
drvsize(): packet.count=16, off=0, seg=8192, lba=32
drvsize(): dsk->drive=0, type=0, unit=0, slice=0, part=0, init=0, start=0
vdev_read_phys(): rc from vdev->v_phys_read =4294967295 <-- -1
gptzfsboot: error 1 lba 1
drvsize(): packet.count=1, off=0, seg=8704, lba=1
drvsize(): dsk->drive=0, type=0, unit=0, slice=0, part=0, init=0, start=0
main(): retun from probe_drive(): spa_name=: kname=: drive=0:
probe_drive(360): drive 0x81: type 0: unit 1: slice 0: part 0: <-- disk da1 is empty
vdev_probe(): off=16384, sizeof(vdev_phys_t)=114688
vdev_read_phys(): reading 114688 bytes at 0x4000 to <-- *buf is empty
vdev_read_phys(): rc from vdev->v_phys_read =0
probe_drive(390): drive 0x81: type 0: unit 1: slice 0: part 0:
main(): spa_name=, kname=,drive=129: <-- da1 (0x81) does not contain any ZFS information
gptzfsboot: No ZFS pools located, can't boot
Best regards,
Christoph
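Two of the annotated values in the trace above follow from simple arithmetic. The sketch below assumes 512-byte sectors and a disk start offset of 0 (as printed by drvsize()); the helper names offset_to_lba() and as_unsigned() are made up for illustration and are not part of zfsboot.c.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SIZE 512u	/* assumed sector size */

/* Convert a byte offset on the disk into a logical block address. */
static uint64_t
offset_to_lba(uint64_t off, uint64_t start)
{
	assert(off % SECTOR_SIZE == 0);	/* expect sector-aligned reads */
	return (start + off / SECTOR_SIZE);
}

/* Show how a signed return code looks when it is printed as unsigned. */
static uint32_t
as_unsigned(int32_t rc)
{
	return ((uint32_t)rc);
}
```

With off=16384 (0x4000) and start=0 this yields LBA 32, which matches "error 1 lba 32", and a return code of -1 viewed as a 32-bit unsigned value is 4294967295. The read at LBA 1 is presumably the GPT header, which is located at LBA 1.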
The system boots successfully only if the OS is installed on
the second drive or higher (0x81 and above).
Attempting Boot From CD-ROM
Attempting Boot From Hard Drive (C:)
probe_drive(): drive 0x0: type 0: unit 0: slice 0: part 0: <-- 0x0 instead of 0x80 ?
gptzfsboot: error 1 lba 32
gptzfsboot: error 1 lba 1
probe_drive(): drive 0x81: type 0: unit 1: slice 0: part 0: <-- already 0x81, 0x80 is missing
BTX loader 1.00 BTX version is 1.02
Console: internal video/keyboard
BIOS drive A: is disk0
BIOS drive C: is disk1
BIOS drive D: is disk2
BIOS 637kB/3658940kB available memory
FreeBSD/x86 ZFS enabled bootstrap loader, Revision 1.1
[…]
Even though there is no floppy drive on this system, the BIOS reports one as drive A.
This is mapped as 0x80 and gptzfsboot reports an error. The next drive to probe
is 0x81, after zfsboot increments it at line 500.
Any comments would be appreciated.
Best regards,
Christoph
Are you using clang? If so, you should try either using GCC or using this
patch with clang as a workaround from the previous thread on zfsboot issues:
Index: sys/boot/i386/zfsboot/Makefile
===================================================================
--- sys/boot/i386/zfsboot/Makefile (revision 224653)
+++ sys/boot/i386/zfsboot/Makefile (working copy)
@@ -20,7 +20,6 @@
-fomit-frame-pointer \
-fno-unit-at-a-time \
-mno-align-long-strings \
- -mrtd \
-DBOOT2 \
-DSIOPRT=${BOOT_COMCONSOLE_PORT} \
-DSIOFMT=${B2SIOFMT} \
Index: sys/boot/i386/gptzfsboot/Makefile
===================================================================
--- sys/boot/i386/gptzfsboot/Makefile (revision 224653)
+++ sys/boot/i386/gptzfsboot/Makefile (working copy)
@@ -22,7 +22,6 @@
-fomit-frame-pointer \
-fno-unit-at-a-time \
-mno-align-long-strings \
- -mrtd \
-DGPT -DBOOT2 \
-DSIOPRT=${BOOT_COMCONSOLE_PORT} \
-DSIOFMT=${B2SIOFMT} \
--
John Baldwin
No, I am not using clang.
My problem persists even when I apply the patch.
As a workaround I have to put the OS on the second LUN presented by the
P410i Controller.
Regards,
Christoph
--
Christoph Hoffmann
Regardless of the BIOS information about the nonexistent floppy, the zfsboot.c code
prevents booting from the first HDD if a floppy is reported as the first available device.
Drive 0x0 (floppy) is probed before the code below and an error occurs:
[…]
gptzfsboot: error 1 lba 32
gptzfsboot: error 1 lba 1
[…]
The continue statement will skip the rest of the iteration because
if ((i | DRV_HARD) == *(uint8_t *)PTOV(ARGS))
is true if the drive equals 0x80. As a result we do not call probe_drive()
for this drive.
Eliminating
if ((i | DRV_HARD) == *(uint8_t *)PTOV(ARGS))
continue;
would help.
Any comments will be appreciated.
Best Regards,
Christoph
i386/zfsboot/zfsboot.c
int
main(void)
{
[…]
/*
* Probe the rest of the drives that the bios knows about. This
* will find any other available pools and it may fill in missing
* vdevs for the boot pool.
*/
for (i = 0; i < *(unsigned char *)PTOV(BIOS_NUMDRIVES); i++) {
if ((i | DRV_HARD) == *(uint8_t *)PTOV(ARGS))
continue;
if (!int13probe(i | DRV_HARD))
break;
[…]
probe_drive(dsk, NULL);
}
[…]
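To make the effect of the continue statement concrete, here is a small model of the loop quoted above. It is only a sketch: it assumes int13probe() succeeds for every drive, uses DRV_HARD = 0x80 (consistent with the drive numbers in the trace), and the function name drives_probed_by_loop() is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DRV_HARD 0x80	/* hard-disk bit in a BIOS drive number */

/*
 * Record every drive number the loop would hand to probe_drive().
 * boot_drive stands in for *(uint8_t *)PTOV(ARGS).
 * Returns the number of entries written to 'probed'.
 */
static size_t
drives_probed_by_loop(uint8_t boot_drive, unsigned ndrives,
    uint8_t *probed, size_t cap)
{
	size_t n = 0;
	unsigned i;

	assert(ndrives <= 0x80);	/* keep i | DRV_HARD within one byte */
	for (i = 0; i < ndrives; i++) {
		uint8_t drive = (uint8_t)(i | DRV_HARD);

		if (drive == boot_drive)
			continue;	/* skipped: expected to be probed earlier */
		if (n < cap)
			probed[n++] = drive;
	}
	return (n);
}
```

With a boot drive of 0x80 and two drives, the loop probes only 0x81, so 0x80 is reached only through the earlier probe of the boot drive itself. If the value at ARGS were 0x0, no hard drive would be skipped.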
On 08/01/11 06:07, Christoph Hoffmann wrote:
> Hello,
>
> The initial reboot followed the installation of ZFS-only version 5/28
> system reports error:
>
> Attempting Boot From Hard Drive (C:)
> gptzfsboot: error 1 lba 32
> gptzfsboot: error 1 lba 1
> gptzfsboot: No ZFS pools located, can't boot
>
> The same installation procedure on older ProLiant with Compaq Smart
> Array 5i do not cause any problems.
It looks like, for some reason, the drive number (%dl) didn't get passed
through ARGS (by pmbr.s).
Note that we shouldn't really pass the whole %dx here, as pmbr.s has a
different understanding of %dh. You may want to add an xor %dh, %dh
before the store line but after the main.2 label.
Please let us know if this helps or not. Thanks in advance!
Cheers,
--
Xin LI <del...@delphij.net> https://www.delphij.net/
FreeBSD - The Power to Serve! Live free or die
MBR boot loaders aren't defined to do that. They pass %dl directly via the
register. For gptboot and gptzfsboot, sys/boot/i386/gptboot/gptldr.S
already stores the saved value of %dl in MEM_ARG.
--
John Baldwin
But that shouldn't happen if ARGS has a drive number of 0. (In that case 0x80
!= 0x0, so it shouldn't match.)
This shows that PTOV(ARGS) actually has a %dl value of 0x80 which is correct.
The question is how your initial 'dsk' ended up using 0x0 instead of 0x80.
Note that your 'type' is 0, so that means that it was ok initially (TYPE_AD is
0):
dsk->drive = *(uint8_t *)PTOV(ARGS);
dsk->type = dsk->drive & DRV_HARD ? TYPE_AD : TYPE_FD;
Somewhere between where 'dsk' is initialized in main() and before probe_drive()
is called in main() for 'dsk', 'dsk->drive' is getting clobbered. Can you add
some additional printfs to nail down where that is happening?
Thank you very much indeed for your reply.
pmbr.s passes ARGS, set to 0x900, to main() in zfsboot.c, and
*(uint8_t *)PTOV(ARGS) is 0x80.
In zfsboot.c main(), before the line
bootinfo.bi_version = BOOTINFO_VERSION;
is executed, dsk->drive still holds the right value; just after
it executes, dsk->drive is equal to zero.
Adding
printf("hello\n");
before
dsk = malloc(sizeof(struct dsk));
keeps dsk->drive at 0x80 and the box boots.
Any comments will be appreciated.
Best Regards,
Christoph
--
Christoph Hoffmann
That is odd indeed. Can you print out a few things:
1) if high_heap_size is > 0
2) the value of 'dsk' and '&bootinfo' (try this both with a printf
before the first call to malloc() and without).
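One possible way to produce such diagnostic lines is sketched below. It is written against the standard C library rather than the boot block's own printf, and the function name format_debug_line() and the "LINE: name=value; ..." layout are assumptions for illustration. Pointers are passed as integers so the sketch does not depend on "%p" formatting.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format one diagnostic line holding the source line number, the heap
 * size, and the two addresses of interest.  Returns the snprintf() result.
 */
static int
format_debug_line(char *buf, size_t len, int line, unsigned long heap_size,
    uintptr_t dsk_addr, uintptr_t bootinfo_addr)
{
	assert(buf != NULL && len > 0);
	return (snprintf(buf, len,
	    "%d: high_heap_size=0x%lx; dsk=0x%lx; &bootinfo=0x%lx",
	    line, heap_size, (unsigned long)dsk_addr,
	    (unsigned long)bootinfo_addr));
}
```

In real code the line argument would typically be __LINE__, and the call would be placed once before and once after the first malloc() so the two outputs can be compared.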
First with printf() before
dsk = malloc(sizeof(struct dsk));
Attempting Boot From CD-ROM
Attempting Boot From Hard Drive (C:)
464: high_heap_size=0x300000; dsk=0x0; &bootinfo=0x8714 <--- the malloc() is next.
474: high_heap_size=0x300000; dsk=0xdf325000; &bootinfo=0x8714
probe_drive(): drive 0x80: type 0: unit 0: slice 1: part 0:
probe_drive(): drive 0x81: type 0: unit 1: slice 0: part 0:
BTX loader 1.00 BTX version is 1.02
Consoles: internal video/keyboard
BIOS drive A: is disk0
BIOS drive C: is disk1
BIOS drive D: is disk2
and now without printf() at line 464
Attempting Boot From CD-ROM
Attempting Boot From Hard Drive (C:)
474: high_heap_size=0x300000; dsk=0xdf325000; &bootinfo=0x86b4
probe_drive(): drive 0x0: type 0: unit 0: slice 0: part 0:
gptzfsboot: error 1 lba 32
gptzfsboot: error 1 lba 1
probe_drive(): drive 0x81: type 0: unit 1: slice 0: part 0:
gptzfsboot: No ZFS pools located, can't boot
Regards,
Christoph
--
Christoph Hoffmann
Hmm, so the entire 'dsk' structure gets zero'd it seems. What if you force
high_heap_size to 0?
Attempting Boot From CD-ROM
Attempting Boot From Hard Drive (C:)
474: high_heap_size=0x0; dsk=0x1a000; &bootinfo=0x8694
malloc failure
--
Christoph Hoffmann
Hmm, I am really at a loss for what is trashing 'dsk'. You could
possibly try adjusting bios_getmem() to force it to use
high_heap_size from bios_extmem (that is at the bottom of the
function) perhaps. However, that is mostly a bit of a guess that
some part of your BIOS is randomly zero'ing dsk.
I'm at a loss as to how the assignments to bootinfo would trash
'dsk'. :(
Thank you very much indeed for the hints.
I am under the impression that we are facing a problem with synchronisation
of CPU local caches. I also wasn't able to find any problem with memory
allocation.
This box is equipped with:
1 Processor(s) detected, 4 total cores enabled, Hyperthreading is enabled
Proc 1: Intel(R) Xeon(R) CPU E5630 @ 2.53GHz
QPI Speed: 5.8 GT/s
Changing the order of execution in the main() function of zfsboot.c to
[…]
int
main(void)
{
[…]
bios_getmem();
if (high_heap_size > 0) {
[…]
bootinfo.bi_version = BOOTINFO_VERSION;
bootinfo.bi_size = sizeof(bootinfo);
bootinfo.bi_basemem = bios_basemem / 1024;
bootinfo.bi_extmem = bios_extmem / 1024;
bootinfo.bi_memsizes_valid++;
/* bootinfo.bi_bios_dev = dsk->drive; */
bootinfo.bi_bios_dev = *(uint8_t *)PTOV(ARGS);
dsk = malloc(sizeof(struct dsk));
dsk->drive = *(uint8_t *)PTOV(ARGS);
dsk->type = dsk->drive & DRV_HARD ? TYPE_AD : TYPE_FD;
dsk->unit = dsk->drive & DRV_MASK;
dsk->slice = *(uint8_t *)PTOV(ARGS + 1) + 1;
dsk->part = 0;
dsk->start = 0;
dsk->init = 0;
bootdev = MAKEBOOTDEV(dev_maj[dsk->type],
dsk->slice, dsk->unit, dsk->part),
[…]
fixes the problem.
Any comments will be appreciated.
Best Regards,
Christoph
--
Christoph Hoffmann
What if you leave the order as-is but just change this one line to use
PTOV(ARGS) directly here instead of 'dsk->drive'?
Unfortunately not; we still need 4 additional instructions or some sort of memory
barrier [like mb() in Tru64 :)].
Regards,
Christoph
--
Christoph Hoffmann
> Hello,
>
> The initial reboot followed the installation of ZFS-only version 5/28 system
> reports error:
>
> Attempting Boot From Hard Drive (C:)
> gptzfsboot: error 1 lba 32
> gptzfsboot: error 1 lba 1
> gptzfsboot: No ZFS pools located, can't boot
The bug may be unrelated but try clang patches, too.
http://lists.freebsd.org/pipermail/freebsd-current/2011-August/026263.html
http://lists.freebsd.org/pipermail/freebsd-current/2011-August/026338.html
Thank you very much for your information.
Even with both of them applied, I still have to reorder
the zfsboot.c main() function to get it working.
Regards,
Christoph
--
Christoph Hoffmann
Well, x86 CPUs generally don't need memory barriers assuming the compiler
hasn't done something invalid, especially for operations that are only on a
single CPU. However, if the compiler were broken, presumably zfsboot would be
broken everywhere.
Can you please use -save-temps to save the intermediate zfsboot.s files,
both before and after you change this order, then post them here? It's
easiest to just do:
DEBUG_FLAGS=-save-temps make -C /usr/src/sys/boot/i386/gptzfsboot clean all
then save /usr/obj/usr/src/sys/boot/i386/gptzfsboot/zfsboot.s somewhere.
As per Dimitry's request, please find attached a set of gzip'ed zfsboot.s files (61223 bytes).
The original message was rejected because the attachments exceeded 200 KB.
Thank you very much indeed for your help, and I am sorry for the spam.
Regards,
Christoph
Daniel
The last time I checked on the issue was on the 23rd of September;
it was not fixed then.
I was able to boot from drive 0x80 after adding:
*** zfsboot.c.orig Fri Sep 23 18:03:26 2011
--- zfsboot.c Fri Sep 23 18:47:44 2011
***************
*** 459,464 ****
--- 459,465 ----
heap_end = (char *) PTOV(bios_basemem);
}
+ printf("Hello! I am a hack.\n");
dsk = malloc(sizeof(struct dsk));
dsk->drive = *(uint8_t *)PTOV(ARGS);
dsk->type = dsk->drive & DRV_HARD ? TYPE_AD : TYPE_FD;
I am inclined to think that this is related to the way we compile this code,
especially when it runs on the following particular processor:
1 Processor(s) detected, 4 total cores enabled, Hyperthreading is enabled
Proc 1: Intel(R) Xeon(R) CPU E5630 @ 2.53GHz
QPI Speed: 5.8 GT/s.
Regards,
Christoph
Can you try the latest code in head?
I've removed all the optimization/pessimization compiler flags for gpt/zfs boot
blocks that at times seemed to do more harm than good.
--
Andriy Gapon
On 13.10.11 00:33, Christoph Hoffmann wrote:
> I am inclined to think that this is related to the way how we compile this code,
> especially when run on the following particular processor:
>
> 1 Processor(s) detected, 4 total cores enabled, Hyperthreading is enabled
> Proc 1: Intel(R) Xeon(R) CPU E5630 @ 2.53GHz
> QPI Speed: 5.8 GT/s.
For me, this happens on
CPU: Intel(R) Xeon(R) CPU E5620 @ 2.40GHz (2400.10-MHz
K8-class CPU)
On HP DL360 G7
I am trying to boot -stable.
This is still happening with FreeBSD 9.0-RELEASE, as I have just
discovered. The hack works like a charm, but seems kind of odd... :)
Any progress in getting a "real" fix into the repository? Are there any risks with
the hack? Is it likely to fail suddenly or sporadically?
Cheers,
Palle
Christoph Hoffmann wrote:
I think this bug was fixed by John Baldwin (see below) after he found that HP
implemented 'e09127r3 EDD-4 Hybrid MBR boot code annex' dated
4 January 2010.
Maybe John could shed some light on it?
Regards,
Christoph
Author: jhb
Date: Wed Nov 9 18:26:19 2011
New Revision: 227400
URL:
http://svn.freebsd.org/changeset/base/227400
Log:
MFC 226748:
- Add a new header for the x86 boot code that defines various structures
and constants related to the BIOS Enhanced Disk Drive Specification.
- Use this header instead of magic numbers and various duplicate structure
definitions for doing I/O.
- Use an actual structure for the request to fetch drive parameters in
drvsize() rather than a gross hack of a char array with some magic
size. While here, change drvsize() to only pass the 1.1 version of
the structure and not request device path information. If we want
device path information you have to set the length of the device
path information as an input (along with probably checking the actual
EDD version to see which size one should use as the device path
information is variable-length). This fixes data smashing problems
from passing an EDD 3 structure to BIOSes supporting EDD 4.
Approved by: re (kib)
--
Christoph Hoffmann
Hmm, this fix should be in 9.0, so I don't have an explanation for why booting
on 9.0 would still be broken.
--
John Baldwin
On 5 Mar 2012 at 18:39, John Baldwin <j...@freebsd.org> wrote:
> On Saturday, March 03, 2012 7:06:14 pm Christoph Hoffmann wrote:
>> Hello,
>>
>> I think this bug has been fix by John Baldwin (see below) after he found that HP
>> implemented 'e09127r3 EDD-4 Hybrid MBR boot code annex' dated
>> 4 January 2010.
>>
>> Maybe John could shade some light on it?
>
> Hmm, this fix should be in 9.0, so I don't have an explanation for why booting
> on 9.0 would still be broken.
Ok, that's odd. I tried 9.0, it does fail, and the printf actually makes it work.
Palle
Can you try editing sys/boot/i386/common/drv.c and adding some additional padding after
the edd_params? Perhaps the BIOS is assuming it always gets the full thing even if
we pass in a 1.1-sized structure. Just try putting an edd_params_v4 structure after the
normal one.
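The idea behind the padding experiment can be illustrated with a toy model. This is not the real memory layout of gptzfsboot: the sizes REQ_LEN and BIOS_LEN are illustrative assumptions, and neighbour_survives() is a hypothetical helper. The model places a short request buffer, optional padding, and a "neighbour" byte in one array, then lets a simulated BIOS write more bytes than the request length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define REQ_LEN   30	/* illustrative size of the short request structure */
#define BIOS_LEN  74	/* illustrative size a careless BIOS might write */
#define ARENA_LEN 128

/*
 * Lay out "request | padding | neighbour" in one byte array, let the
 * simulated BIOS zero bios_len bytes from the start, and report whether
 * the neighbour byte survived (1) or was overwritten (0).
 */
static int
neighbour_survives(size_t padding, size_t bios_len)
{
	uint8_t arena[ARENA_LEN];
	size_t neighbour = REQ_LEN + padding;

	assert(neighbour < ARENA_LEN && bios_len <= ARENA_LEN);
	memset(arena, 0xaa, sizeof(arena));
	arena[neighbour] = 0x80;	/* stands in for a value such as dsk->drive */
	memset(arena, 0, bios_len);	/* simulated BIOS write */
	return (arena[neighbour] == 0x80);
}
```

Without padding, a write of BIOS_LEN bytes zeroes the neighbour; with BIOS_LEN bytes of padding the neighbour survives. If the real boot block behaves the same way once padding is added, that would point at the BIOS writing past the length it was given.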
> On Monday, March 05, 2012 2:35:59 pm Palle Girgensohn wrote:
>>
>> On 5 Mar 2012 at 18:39, John Baldwin <j...@freebsd.org> wrote:
>>
>>> On Saturday, March 03, 2012 7:06:14 pm Christoph Hoffmann wrote:
>>>> Hello,
>>>>
>>>> I think this bug has been fix by John Baldwin (see below) after he found that HP
>>>> implemented 'e09127r3 EDD-4 Hybrid MBR boot code annex' dated
>>>> 4 January 2010.
>>>>
>>>> Maybe John could shade some light on it?
>>>
>>> Hmm, this fix should be in 9.0, so I don't have an explanation for why booting
>>> on 9.0 would still be broken.
>>
>>
>> Ok, that's odd. I tried 9.0, it does fail, and the printf actually makes it work.
>
> Can you try editing sys/boot/i386/common/drv.c and adding some additional padding after
> the edd_params? Perhaps the BIOS is assuming it always gets the full thing even if
> we pass in a 1.1-sized structure. Just try putting a edd_params_v4 structure after the
> normal one.
Yes, I'll try that. It might take a couple of days, since I'm quite busy for the next day or two, but I will check it out.
Cheers,
Palle
I still have the HW and will test this next week.
Thanks,
Björn