Re: netboot issues, 8.0, mfsroot mount failure

Jeremy Chadwick

Feb 16, 2010, 9:27:34 PM

to freebsd...@freebsd.org

On Tue, Feb 16, 2010 at 06:11:36PM -0800, Jeremy Chadwick wrote:
> The NFS root mount you see happening later is a result of the root
> filesystem not being available. This is normal if mfsroot fails.

A follow-up to my own post:

The above paragraph is incorrect. The NFS root mount is proper,
expected, and required -- but the failures you see later on (e.g.
/etc/fstab missing, etc. -- which are in your private mail to me but not
here), I believe, are a result of the mfsroot stuff not working.

Sorry for the confusion; I should really be asleep right now...

Jeremy Chadwick

Feb 16, 2010, 9:11:36 PM

to freebsd...@freebsd.org

On Tue, Feb 16, 2010 at 08:28:03PM -0500, Charles Sprickman wrote:
> Howdy,
>
> I'm having some problems getting 8.0 to install over the network.
> I've got my dhcp, tftp and nfs server working well, and I've tested
> all three services from this host before attempting to boot over the
> network.
>
> pxeboot seems to work, and I see it get loaded via tftp. The kernel
> boots, and parses the options in loader.conf that exist in my
> nfs-exported 8.0 DVD fileset:
>
> [root@archive /home/spork/tmp]# cat
> /usr/local/netboot/freebsd8/boot/loader.conf
> mfsroot_load="YES"
> mfsroot_type="mfs_root"
> mfsroot_name="/boot/mfsroot"
> boot_multicons="YES"
> boot_serial="YES"
> console="comconsole,vidconsole"
> vfs.root.mountfrom="ufs:/dev/md0c"
>
> I see the kernel does find mfsroot and attaches it:
>
> md0: Preloaded image </boot/mfsroot> 4423680 bytes at 0xc0f6dfe0
>
> But then when it's ready to mount the root filesystem, I get this:
>
> SMP: AP CPU #1 Launched!
> Trying to mount root from ufs:/dev/md0c
> ROOT MOUNT ERROR:
>
> If you have invalid mount options, reboot, and first try the
> following from the loader prompt:
>
> set vfs.root.mountfrom.options=rw
>
> and then remove invalid mount options from /etc/fstab.
>
> It doesn't really state what the error is. It's hinting that it's
> read-only, but that seems odd. Even if it couldn't mount r/w,
> shouldn't it just drop to single-user at this point?
>
> Next it tries nfs:
>
> Trying to mount root from nfs:
> NFS ROOT: 192.168.1.111:/usr/local/netboot/freebsd8
> em0: link state changed to UP
>
> And there it sits. Remotely I can't do anything. If I'm local, I
> can ctrl-alt-del a few times and then about a minute later it does
> an orderly restart.

We've been talking off-list about this (and other things), but at this
point I'm pretty sure the problem is that the local slice naming
convention has changed in RELENG_8 from what it was in RELENG_7.

This is a side effect of the "GEOM overhaul" (or whatever it is; I
don't know what to call it. Is it libdisk changes? GEOM? Both? I
really don't know). Basically, the way the full size of the disk gets
handled differs now from RELENG_7 (see the footnote for the "fun"). So,
try changing this:

vfs.root.mountfrom="ufs:/dev/md0c"

...to this (look closely):

vfs.root.mountfrom="ufs:/dev/md0"

Remember: the mfsroot image is essentially a UFS filesystem that's
mounted as memory disk. Since you re-created mfsroot (like you're
supposed to :-) ) on a RELENG_8 box, the layout is different.
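
For anyone scripting this across several netboot trees, here is a minimal sketch of the one-character change. It is not from the original thread: `fix_mountfrom` is a hypothetical helper name, and the demonstration runs against a throwaway file created with mktemp rather than a real loader.conf. It only rewrites a vfs.root.mountfrom line that points at /dev/md<N>c and leaves every other line alone.

```sh
#!/bin/sh
# Hypothetical helper (not part of FreeBSD): drop the trailing "c" from an
# md<N> root device in a loader.conf-style file passed as the first argument.
fix_mountfrom() {
    conf="$1"
    tmp="$conf.tmp.$$"
    # Only touch vfs.root.mountfrom lines that point at /dev/md<N>c.
    sed 's|^\(vfs\.root\.mountfrom="ufs:/dev/md[0-9][0-9]*\)c"|\1"|' "$conf" > "$tmp" &&
        mv "$tmp" "$conf"
}

# Demonstration on a scratch file, not on the real loader.conf.
demo_conf=$(mktemp)
printf '%s\n' 'mfsroot_load="YES"' 'vfs.root.mountfrom="ufs:/dev/md0c"' > "$demo_conf"
fix_mountfrom "$demo_conf"
cat "$demo_conf"
```

Writing to a temporary file and renaming it avoids relying on `sed -i`, whose syntax differs between BSD and GNU sed.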

The NFS root mount you see happening later is a result of the root
filesystem not being available. This is normal if mfsroot fails.

Please let me (on the list) know if this fixes your problem.

Footnote: This is why I tell folks to zero out the first 8192 bytes of
any disk they've previously installed FreeBSD on (even if the disk has
no filesystems/slices on it). The way FreeBSD determines the size of
the disk differs in RELENG_8; I believe GEOM "figures it out" on its own
now, while previous releases relied on the "c" slice. The method I've
recommended: do dd if=/dev/zero of=/dev/adX bs=512 count=16.
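
The arithmetic behind that command is 16 blocks of 512 bytes, i.e. 8192 bytes. As a hedged illustration (not from the original thread), the sketch below rehearses the same dd parameters on a scratch file instead of a real disk, so nothing is destroyed. The scratch file and its "x" filler are assumptions made purely for the demonstration; conv=notrunc is only needed because the target is a regular file.

```sh
#!/bin/sh
# Rehearse the wipe on a scratch file instead of a real disk.
img=$(mktemp)
# Fill 64 sectors (32768 bytes) with non-zero bytes to stand in for old metadata.
dd if=/dev/zero bs=512 count=64 2>/dev/null | tr '\0' 'x' > "$img"

# Same parameters as the recommended command: 16 sectors * 512 bytes = 8192 bytes.
# conv=notrunc keeps the rest of the scratch file intact (regular files only).
dd if=/dev/zero of="$img" bs=512 count=16 conv=notrunc 2>/dev/null

# Count the non-NUL bytes left in the first 8192 bytes and in the whole file.
head_nonzero=$(dd if="$img" bs=512 count=16 2>/dev/null | tr -d '\0' | wc -c | tr -d ' ')
total_nonzero=$(tr -d '\0' < "$img" | wc -c | tr -d ' ')
echo "non-zero bytes in first 8192: $head_nonzero"
echo "non-zero bytes overall: $total_nonzero"
```

On a real device the of= argument would be the disk itself (adX above), and the command is destructive, so double-check the device name first.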

Garrett Cooper

Feb 16, 2010, 9:49:32 PM

to Charles Sprickman, FreeBSD-STABLE Mailing List

> I'm not aware of a good way to snoop on nfs traffic, but tcpdump shows nfs
> traffic between the two hosts, which appears to be the client stat-ing a
> file or directory. tcpdump also shows some checksum errors, but I recall a
> few threads here mentioning that on Intel cards that generally is not a
> cause for concern.
>
> From another host, I have no issues mounting that nfs filesystem r/w:
>
> root@h10[/home/spork]# mount_nfs 192.168.1.111:/usr/local/netboot/freebsd8 /mnt
> root@h10[/home/spork]# ls /mnt/
> .cshrc          HARDWARE.TXT    boot.catalog    media           sbin
> .profile        README.HTM      cdrom.inf       mnt             stand
> 8.0-RELEASE     README.TXT      dev             packages        sys
> COPYRIGHT       RELNOTES.HTM    docbook.css     proc            tmp
> ERRATA.HTM      RELNOTES.TXT    etc             rescue          usr
> ERRATA.TXT      bin             lib             root            var
> HARDWARE.HTM    boot            libexec         rr_moved
> root@h10[/home/spork]# touch /mnt/foo
> root@h10[/home/spork]# rm /mnt/foo
> root@h10[/home/spork]# umount /mnt
>
> Any ideas? I've got about a dozen remote boxes to upgrade, so I want to
> totally nail down this procedure. I've been putting off learning this for a
> few years, and now I've got an actual need for it.

I'll be in your shoes in a little bit... I ran into some issues
when I last tried with NFSv3 on a Solaris server so we'll see how
things go in the second go-around [with a FreeBSD nfs rootfs server],
but 7.x netbooted and 8.x didn't when I tried last; the 7.x images
have some secret sauce fixes for PXE booting -- the ones I know of are
as follows (apply to loader.conf as you see fit):

vfs.root.mountfrom="nfs"
boot.nfsroot.path="/absolute/path/to/netboot/dir"
boot.nfsroot.server="nfs-server-ip-addr"

There were also some code changes made to `fix' netbooting with
pxeloader, but I'm not sure if they're applicable or needed in 8.x,
and I'm not sure what the actual changes are TBH...
Cheers,
-Garrett

Charles Sprickman

Feb 16, 2010, 11:43:30 PM

to Jeremy Chadwick, freebsd...@freebsd.org

I made the change on the server, but the box is stuck until I can get over
there again. Serial consoles are nice, but not being able to send
"ctrl-alt-del" is a sad limitation. :)

> Remember: the mfsroot image is essentially a UFS filesystem that's
> mounted as memory disk. Since you re-created mfsroot (like you're
> supposed to :-) ) on a RELENG_8 box, the layout is different.

In this case, I'm still using the stock 8.0 mfsroot. The only change was
to un-gzip it. But this particular issue is probably due to the geom
change you noted, so we'll see what happens on reboot.

> The NFS root mount you see happening later is a result of the root
> filesystem not being available. This is normal if mfsroot fails.

I'm still stumped as to why it hangs there. I do have something for it to
mount there via NFS (the 8.0 dvd contents), and it appears to try, but
then it just sits there. Not locked up, just waiting...

> Please let me (on the list) know if this fixes your problem.

As soon as she boots, I'll report back.

> Footnote: This is why I tell folks to zero out the first 8192 bytes of
> any disk they've previously installed FreeBSD on (even if the disk has
> no filesystems/slices on it). The way FreeBSD determines the size of
> the disk differs in RELENG_8; I believe GEOM "figures it out" on its own
> now, while previous releases relied on the "c" slice. The method I've
> recommended: do dd if=/dev/zero of=/dev/adX bs=512 count=16.

Is it also advisable to blot out any old glabel stuff at the end of the
disk? What's the math to get that? Get a sector count for the whole
disk, set "bs" to 512 and "skip" to (sector count - 1)?

Thanks,

Charles


Charles Sprickman

Feb 16, 2010, 11:47:10 PM

to Garrett Cooper, FreeBSD-STABLE Mailing List

In my case the server is FreeBSD (7.2). It's running with default nfsd
flags, so I suppose it's offering both tcp and udp. No idea what version.
It seems to work enough for the loader to fetch the loader files and
kernel...

> but 7.x netbooted and 8.x didn't when I tried last; the 7.x images
> have some secret sauce fixes for PXE booting -- the ones I know of are
> as follows (apply to loader.conf as you see fit):
>
> vfs.root.mountfrom="nfs"
> boot.nfsroot.path="/absolute/path/to/netboot/dir"
> boot.nfsroot.server="nfs-server-ip-addr"

Is this documented somewhere?

> There were also some code changes made to `fix' netbooting with
> pxeloader, but I'm not sure if they're applicable or needed in 8.x,
> and I'm not sure what the actual changes are TBH...

Ugh. With all these variables AND the general nuttiness of PC hardware I
see many reboot cycles in my future.

Charles

> Cheers,
> -Garrett
>

Adam Vande More

Feb 16, 2010, 11:51:43 PM

to Charles Sprickman, Garrett Cooper, FreeBSD-STABLE Mailing List

On Tue, Feb 16, 2010 at 10:47 PM, Charles Sprickman <sp...@bway.net> wrote:

> Is this documented somewhere?
>

Here:

http://www.nber.org/sys-admin/FreeBSD-diskless.html

Whoever wrote that, thank you.

--
Adam Vande More

Jeremy Chadwick

Feb 17, 2010, 2:22:53 PM

to freebsd...@freebsd.org

On Tue, Feb 16, 2010 at 11:43:30PM -0500, Charles Sprickman wrote:
> >Footnote: This is why I tell folks to zero out the first 8192 bytes of
> >any disk they've previously installed FreeBSD on (even if the disk has
> >no filesystems/slices on it). The way FreeBSD determines the size of
> >the disk differs in RELENG_8; I believe GEOM "figures it out" on its own
> >now, while previous releases relied on the "c" slice. The method I've
> >recommended: do dd if=/dev/zero of=/dev/adX bs=512 count=16.
>
> Is it also advisable to blot out any old glabel stuff at the end of
> the disk? What's the math to get that? Get a sector count for the
> whole disk, set "bs" to 512 and "skip" to (sector count - 1)?

I don't think the glabel data (which goes at the end of the disk) is
relevant to the above problem I described. You can erase it if you
want, but I doubt it's responsible for warnings like "Disk geometry does
not match label!" or situations where a user is re-using a disk (that
had its slices created on RELENG_7) on RELENG_8 and experiences
problems. An alternative to the dd method might be to try "gpart
destroy"; I haven't tried to see if it relieves the problem.

As far as how to erase the glabel metadata -- "glabel clear" is supposed
to do this for you. What I don't know is whether or not addition of a
glabel decreases what GEOM thinks the total size of the disk is, so I
can't say for certain doing some math + zeroing the last sector of the
disk would actually work.

Antony Mawer

Feb 17, 2010, 11:17:39 PM

to Jeremy Chadwick, freebsd...@freebsd.org

On Thu, Feb 18, 2010 at 6:22 AM, Jeremy Chadwick
<fre...@jdc.parodius.com> wrote:
> On Tue, Feb 16, 2010 at 11:43:30PM -0500, Charles Sprickman wrote:
>> >Footnote: This is why I tell folks to zero out the first 8192 bytes of
>> >any disk they've previously installed FreeBSD on (even if the disk has
>> >no filesystems/slices on it). The way FreeBSD determines the size of
>> >the disk differs in RELENG_8; I believe GEOM "figures it out" on its own
>> >now, while previous releases relied on the "c" slice. The method I've
>> >recommended: do dd if=/dev/zero of=/dev/adX bs=512 count=16.
>>
>> Is it also advisable to blot out any old glabel stuff at the end of
>> the disk? What's the math to get that? Get a sector count for the
>> whole disk, set "bs" to 512 and "skip" to (sector count - 1)?
>
> I don't think the glabel data (which goes at the end of the disk) is
> relevant to the above problem I described. You can erase it if you
> want, but I doubt it's responsible for warnings like "Disk geometry does
> not match label!" or situations where a user is re-using a disk (that
> had its slices created on RELENG_7) on RELENG_8 and experiences
> problems. An alternative to the dd method might be to try "gpart
> destroy"; I haven't tried to see if it relieves the problem.
>
> As far as how to erase the glabel metadata -- "glabel clear" is supposed
> to do this for you. What I don't know is whether or not addition of a
> glabel decreases what GEOM thinks the total size of the disk is, so I
> can't say for certain doing some math + zeroing the last sector of the
> disk would actually work.

I have recently been using the following snippet in an install script
to zero out any existing gmirror/etc metadata before the install
proceeds (and potentially reconfigures a new gmirror etc):

# Specify the disk device to clear
diskdev="da0"

# Clear metadata from the last sector on the drive
echo "Clearing any GEOM metadata on drive..."
diskinfo=`diskinfo /dev/$diskdev`
# $diskinfo is left unquoted on purpose so its fields end up space-separated
sector_size=`echo $diskinfo | cut -f2 -d' '`
size_in_sectors=`echo $diskinfo | cut -f4 -d' '`
geom_offset=$(($size_in_sectors-1))
dd if=/dev/zero of=/dev/$diskdev bs=$sector_size \
    oseek=$geom_offset count=1 2> /dev/null

In preliminary testing it seems to do the job...

-- Antony
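
The arithmetic in the snippet above (last sector index = size in sectors - 1) can be checked without touching a disk. What follows is a hedged dry-run sketch, not something posted in the thread: `last_sector` is a hypothetical helper, and the sample line is made-up data that only imitates the field order diskinfo prints (device name, sector size, size in bytes, size in sectors). It prints the dd command it would run instead of running it.

```sh
#!/bin/sh
# Hypothetical helper: given one line of diskinfo-style output, print
# "<sector_size> <index_of_last_sector>".
last_sector() {
    # $1 is the whole line; leaving it unquoted splits it on whitespace.
    set -- $1
    echo "$2 $(( $4 - 1 ))"
}

# Made-up sample line for illustration only (a 1 MiB device, 512-byte sectors).
sample="/dev/da0 512 1048576 2048 0 0"
set -- $(last_sector "$sample")
sector_size=$1
geom_offset=$2

# Dry run: show the command rather than executing it.
echo "would run: dd if=/dev/zero of=/dev/da0 bs=$sector_size oseek=$geom_offset count=1"
```

Whether zeroing that one sector is sufficient depends on which provider the metadata was written to, which is the open question Jeremy raises above; treat this as a way to sanity-check the numbers, not as a guarantee.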