Jumpstart on LINUX boot hangs

Manfred Durban

May 4, 2004, 2:31:54 PM
Hi everybody!

I am currently trying to port a Jumpstart server to Linux (SLES 8) and
am running into a major problem.

I set up all the services (bootparamd, tftpd, rarpd, nfsd) and copied
the bits to the appropriate directories. That should all be OK!

The Sun machine boots (gets its IP config and fetches the boot kernel
via TFTP) and then hangs, saying
"Requesting Internet address for x:x:x:x:x:x
panic - boot: Could not mount filesystem"

The last syslog entry on the Linux box is:
"bootparamd whoami got question from x:x:x:x:x:x:x
bootparamd This is host hostname
bootparamd Returning hostname SERVERIP"

The weird thing is that sometimes it boots several steps further. Then
it actually starts to mount the filesystem, sometimes even configures
/dev and the network interfaces, and then hangs at "whoami: no domain name".

As I said, the weird thing is that it sometimes boots further than at
other times.

Has anybody tried this before and experienced similar problems?

THX for any help!

-- Manfred

Darren Dunham

May 4, 2004, 8:32:43 PM
Manfred Durban <manni....@myskoda.de> wrote:
> Hi everybody!

> I am currently trying to port a Jumpstart server to Linux (SLES 8) and
> am running into a major problem.

> I set up all the services (bootparamd, tftpd, rarpd, nfsd) and copied
> the bits to the appropriate directories. That should all be OK!

> The Sun machine boots (gets its IP config and fetches the boot kernel
> via TFTP) and then hangs, saying
> "Requesting Internet address for x:x:x:x:x:x
> panic - boot: Could not mount filesystem"

Looks to me like it requested (and received) a server:/path for the
root filesystem, but the NFS mount for that failed.

Snoop/tcpdump the wire and look for NFS mount requests.

> The weird thing is that sometimes it boots several steps further. Then
> it actually starts to mount the filesystem, sometimes even configures
> /dev and the network interfaces, and then hangs at "whoami: no domain name".

Ugh. If it hangs after NFS starts up, there may be a problem with how
the NFS server handles device files. Make sure you have all patches for
the NFS stuff on the Linux server.

> As I said, the weird thing is that it sometimes boots further than at
> other times.

Two servers?

--
Darren Dunham ddu...@taos.com
Senior Technical Consultant TAOS http://www.taos.com/
Got some Dr Pepper? San Francisco, CA bay area
< This line left intentionally blank to confuse you. >

Manfred Durban

May 5, 2004, 9:48:01 AM
Darren Dunham wrote:
> Looks to me like it requested (and received) a server:/path for the
> root filesystem, but the NFS mount for that failed.
>
> Snoop/tcpdump the wire and look for NFS mount requests.

It mounts it!

>
> Ugh. If it hangs after NFS starts up, there may be a problem with how
> the NFS server handles device files. Make sure you have all patches for
> the NFS stuff on the Linux server.
>

My server is at the latest patch level. Is there a way to modify the
server so that it handles those files correctly?

Thx for help!
Manfred

Darren Dunham

May 5, 2004, 11:34:48 AM
Manfred Durban <manni....@myskoda.de> wrote:
> Darren Dunham wrote:
>> Looks to me like it requested (and received) a server:/path for the
>> root filesystem, but the NFS mount for that failed.
>>
>> Snoop/tcpdump the wire and look for NFS mount requests.

> It mounts it!

Then what crosses the wire just before the panic? NFS packets?

You might try configuring the server to offer only TCP or only UDP, or
only v2 or only v3, and see if some combination works better.

Manfred Durban

May 5, 2004, 12:29:38 PM
Darren Dunham wrote:

>
> Then what crosses the wire just before the panic? NFS packets?

YES

>
> You might try configuring the server to offer only TCP or only UDP, or
> only v2 or only v3, and see if some combination works better.
>

Sorry... but how? :(

Darren Dunham

May 5, 2004, 12:48:03 PM
Manfred Durban <manni....@myskoda.de> wrote:
> Darren Dunham wrote:

>>
>> Then what crosses the wire just before the panic? NFS packets?

> YES

Well, what kind? What are the contents? Use -v and look. It should be
readable with the client attempting to find a file and the server either
returning data or an error message. I think reading that should be a
good first step.

*something* happens just before the panic. If you knew what, it might
point to a problem or misconfiguration that you can resolve.

>> You might try configuring the server to offer only TCP or only UDP, or
>> only v2 or only v3, and see if some combination works better.
>>
> Sorry... but how? :(

You've got a linux server. I have no idea. Presumably running a man on
nfsd might return something useful.

Manfred Durban

May 5, 2004, 1:53:10 PM
Darren Dunham wrote:

> Well, what kind? What are the contents? Use -v and look. It should be
> readable with the client attempting to find a file and the server either
> returning data or an error message. I think reading that should be a
> good first step.
>
> *something* happens just before the panic. If you knew what, it might
> point to a problem or misconfiguration that you can resolve.

Well... there is data crossing the wire! Shortly before it freezes,
there are "normal" reads, lookups, access etc... then it suddenly stops
- no error message!

> You've got a linux server. I have no idea. Presumably running a man on
> nfsd might return something useful.
>

Well.. I tested around with that, but obviously the client uses both
protocols (V2 at the beginning and V3 later on). So I finally
recompiled my kernel (included all available NFS options) - didn't help!

Any further ideas??? Somebody must have done this before!

thx,
-- Manfred

Leach

May 5, 2004, 4:24:18 PM

This help any? It's for Solaris 7, but shows specifying an rsize option
in the rootopts.

http://www.unixpeople.com/HOWTO/jumpstart_on_linux.html

David Williams

May 5, 2004, 5:30:50 PM

"Manfred Durban" <manni....@myskoda.de> wrote in message
news:aS9mc.915$Nm2...@news.cpqcorp.net...

> Darren Dunham wrote:
>
> > Well, what kind? What are the contents? Use -v and look. It should be
> > readable with the client attempting to find a file and the server either
> > returning data or an error message. I think reading that should be a
> > good first step.
> >
> > *something* happens just before the panic. If you knew what, it might
> > point to a problem or misconfiguration that you can resolve.
>
> Well... there is data crossing the wire! Shortly before it freezes,
> there are "normal" reads, lookups, access etc... then it suddenly stops
> - no error message!
>

I was getting the same.

> > You've got a linux server. I have no idea. Presumably running a man on
> > nfsd might return something useful.
> >
> Well.. I tested around with that, but obviously the client uses both
> protocols (V2 at the beginning and V3 later on). So I finally
> recompiled my kernel (included all available NFS options) - didn't help!
>
> Any further ideas??? Somebody must have done this before!
>

Me, the other day.. I have it working now!!! tcpdump and even ethereal did
not help much!

If the Sun box already has Solaris on it and it has not been overwritten,
run

bpgetfile -v

from the Sun box to check that the root path is correct.

On the Linux box, change the /etc/rc2.d or /etc/rc3.d or /etc/xinetd.d
script which starts bootparamd.

For Fedora Core 1 I changed

daemon bootparamd

to

daemon bootparamd -d -s

Then run the script with a stop argument and again with a start
argument. The bootparamd script will hang there with bootparamd running.

I think the script hangs because bootparamd does not fork and go into
the background but runs in the foreground. It is still working, though
(use ps -ef to see that it is running).

Then run

boot net - install

again and look in the syslog on the Linux machine.

My Linux machine has two network cards in it (192.168.0.4 and 10.0.0.4).
My Sun box is on the 192.168.0.0 network, but bootparamd was sending
10.0.0.4 to the Sun box (as something like the default route!!).

Changing the bootparamd startup to

bootparamd -r 192.168.0.4

to force the correct IP address in the conversation between bootparamd
and the Sun box worked!

(I think the reason tcpdump and ethereal did not help was that the Sun
box was trying to send packets to the wrong IP address and finding no
route to my 10.0.0.0 network!)
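
The multi-homed problem described above comes down to picking the server
address that shares a subnet with the client. A minimal Python sketch of
that check follows. The /24 prefix lengths and the client address
192.168.0.10 are assumptions for illustration (the post only gives the two
server addresses), and pick_router_address is a hypothetical helper, not
part of bootparamd.

```python
import ipaddress


def pick_router_address(server_interfaces, client_ip):
    """Return the server address whose subnet contains client_ip, or None.

    server_interfaces: iterable of "address/prefix" strings, one per NIC.
    """
    client = ipaddress.ip_address(client_ip)
    for entry in server_interfaces:
        iface = ipaddress.ip_interface(entry)
        if client in iface.network:
            return str(iface.ip)
    return None


# The two NICs from the post; the /24 masks are assumed, not stated there.
interfaces = ["10.0.0.4/24", "192.168.0.4/24"]

# Hypothetical client address on the 192.168.0.0 network.
print(pick_router_address(interfaces, "192.168.0.10"))
```

The address it returns is the one that would go after bootparamd -r.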


> thx,
> -- Manfred


Darren Dunham

May 5, 2004, 5:30:11 PM
Manfred Durban <manni....@myskoda.de> wrote:
> Well... there is data crossing the wire! Shortly before it freezes,
> there are "normal" reads, lookups, access etc... then it suddenly stops
> - no error message!

Hmm.. That shouldn't cause a "Could not mount filesystem" panic.

>> You've got a linux server. I have no idea. Presumably running a man on
>> nfsd might return something useful.
>>
> Well.. I tested around with that, but obviously the client uses both
> protocols (V2 at the beginning and V3 later on). So I finally
> recompiled my kernel (included all available NFS options) - didn't help!

The client will start high (TCP rather than UDP, v3 rather than v2). But
if the server doesn't support those, it will fall back to v2 and UDP. The
server should be configurable in what it will advertise to clients.
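
The fallback order described above can be pictured as the client walking a
preference list until it hits something the server offers. This is only a
toy model in Python: the exact preference order and the negotiate helper
are assumptions for illustration, not the actual Solaris client logic.

```python
# Client preference, highest first: (NFS version, transport).
# This ordering is an assumption made for illustration.
CLIENT_PREFERENCE = [(3, "tcp"), (3, "udp"), (2, "tcp"), (2, "udp")]


def negotiate(server_offers):
    """Return the first client preference the server also offers, else None."""
    offered = set(server_offers)
    for choice in CLIENT_PREFERENCE:
        if choice in offered:
            return choice
    return None


# A server restricted to v2 over UDP forces the lowest combination.
print(negotiate([(2, "udp")]))
# A server offering both versions, but over UDP only.
print(negotiate([(2, "udp"), (3, "udp")]))
```

Restricting what the server offers, one combination at a time, is the
experiment suggested earlier in the thread.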

> Any further ideas??? Somebody must have done this before!

Yes. In general I assume that issues people run into are not the ones
you are seeing. The howtos I see on google don't go into the server
configuration, so it must not normally be a problem.

http://www.unixpeople.com/HOWTO/jumpstart_on_linux.html

Bernd Raschke

May 6, 2004, 6:03:32 AM
Leach <le...@none.invalid> wrote:
>
>http://www.unixpeople.com/HOWTO/jumpstart_on_linux.html

I don't know when or with what kernel versions (of Linux) this ever worked; i
couldn't get it to work with a vanilla kernel. After lots of debugging, i
finally found the difference between Sun and Linux NFS (or better: the one
difference that kept a Sun from netbooting off a Linux NFS server):
When determining access rights to a special file, Solaris apparently uses
different criteria than Linux. In other words: Although the root directory
is exported as read-only, when the client requests access rights to a
special file (/dev/console or the like) and the file attributes are -rw-rw-rw
write access is granted. IMHO that's ok, because special files only have
meaning on the client anyway.
Linux, on the other hand, just sees the read-only exported file systems and
tells the client 'No write access here', which causes the booting Sun to
coredump (at least try, it's r/o after all) and die. If you need it, i can
send you the patch (it's just three lines in vfs.c)
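
To make the difference concrete, here is a small Python model of the two
behaviours as described above. It is a sketch, not the kernel code: it
ignores the permission bits and models only the read-only export check.
write_allowed and its lenient flag are hypothetical names; "lenient"
stands for the Solaris-like behaviour, applied here to block devices,
character devices and FIFOs (the three file types the patch later in this
thread tests for).

```python
import stat


def write_allowed(mode, export_read_only, lenient):
    """Decide whether a write-access check on a file should pass.

    mode: st_mode of the file on the server.
    export_read_only: True if the filesystem is exported ro.
    lenient: True models the Solaris-like behaviour described above,
             False models the stock Linux behaviour.
    """
    if not export_read_only:
        return True
    is_special = stat.S_ISBLK(mode) or stat.S_ISCHR(mode) or stat.S_ISFIFO(mode)
    return lenient and is_special


console = stat.S_IFCHR | 0o666   # a character device such as /dev/console
regular = stat.S_IFREG | 0o666   # an ordinary file

print(write_allowed(console, True, lenient=False))  # stock Linux model
print(write_allowed(console, True, lenient=True))   # Solaris-like model
print(write_allowed(regular, True, lenient=True))   # regular file on ro export
```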

Ciao,
Arty


--
Bernd Raschke NetAge Solutions GmbH
Opinions expressed herein are my own and may not represent those of my employer.

Manfred Durban

May 6, 2004, 6:52:10 AM
Thank you for the answer!

Please send me the patch

Darren Dunham

May 6, 2004, 11:36:02 AM
Bernd Raschke <bdra...@despammed.com> wrote:

> When determining access rights to a special file, Solaris apparently uses
> different criteria than Linux. In other words: Although the root directory
> is exported as read-only, when the client requests access rights to a
> special file (/dev/console or the like) and the file attributes are -rw-rw-rw
> write access is granted. IMHO that's ok, because special files only have
> meaning on the client anyway.
> Linux, on the other hand, just sees the read-only exported file systems and
> tells the client 'No write access here', which causes the booting Sun to
> coredump (at least try, it's r/o after all) and die. If you need it, i can
> send you the patch (it's just three lines in vfs.c)

Very interesting analysis! That fits perfectly with the anecdotal
evidence I've seen over the years, especially with the fact that the
blowups happen when accessing device files on the root filesystem.

Of course I've also read lots of reports of this working on pretty much
out-of-the-box Linux machines around the beginning of the 2.4 kernel
timeframe.

Do you know of any versions of Linux NFS that would *not* be subject to
the behavior you mention?

David Williams

May 6, 2004, 5:13:19 PM

"Darren Dunham" <ddu...@redwood.taos.com> wrote in message
news:CXsmc.45553$9Y6....@newssvr29.news.prodigy.com...

> Bernd Raschke <bdra...@despammed.com> wrote:
>
> > When determining access rights to a special file, Solaris apparently uses
> > different criteria than Linux. In other words: Although the root directory
> > is exported as read-only, when the client requests access rights to a
> > special file (/dev/console or the like) and the file attributes are -rw-rw-rw
> > write access is granted. IMHO that's ok, because special files only have
> > meaning on the client anyway.
> > Linux, on the other hand, just sees the read-only exported file systems and
> > tells the client 'No write access here', which causes the booting Sun to
> > coredump (at least try, it's r/o after all) and die. If you need it, i can
> > send you the patch (it's just three lines in vfs.c)
>
> Very interesting analysis! That fits perfectly with the anecdotal
> evidence I've seen over the years, especially with the fact that the
> blowups happen when accessing device files on the root filesystem.
>
> Of course I've also read lots of reports of this working on pretty much
> out-of-the-box Linux machines around the beginning of the 2.4 kernel
> timeframe.
>
> Do you know of any versions of Linux NFS that would *not* be subject to
> the behavior you mention?
>

I have it working on Fedora Core 1

Bernd Raschke

May 7, 2004, 3:17:07 AM
David Williams <d...@smooth1.fsnet.co.uk> wrote:
>
> I have it working on Fedora Core 1

A coworker of mine had a jumpstart server running on RedHat 7.2, but this
was with an r/w NFS export, so the symptom did not show. As r/w is supposed
to frell your boot environment (although i do not know _how_ exactly), i
tried to do it r/o and had to come up with the patch. Why it works with
different setups (and there are many reports that it does), i do not know.
Maybe things changed from Solaris 7 to 8 and Linux 2.2 to 2.4? I started
with Sol8 12/02 and Linux 2.4.21

Cheers,

Manfred Durban

May 7, 2004, 6:06:10 AM
Maybe we should compare the individual setups in more detail.

Which Linux (kernel, NFS version...) on which machine?
Which Solaris?
Which Sun client?
Which boot PROM version?

Maybe that helps to reproduce the error.

Here are my setups, which did NOT work
(JS = Jumpstart server, JC = Jumpstart client, SW = software):

JS: ProLiant DL360, SLES 8, kernel 2.4.21, latest patch level / shipped NFS
JC: Sun Ultra 1 SBus (UltraSPARC 167 MHz), OpenBoot 3.11
SW: Solaris 8 & 9

JS: ProLiant DL360, SLES 8, kernel 2.4.21, latest patch level / shipped NFS
JC: 8-slot Sun Enterprise 4000/5000, OpenBoot 3.2.27
SW: Solaris 8 & 9

JS: ProLiant DL360, RH 7.3, kernel 2.4.18 / shipped NFS
JC: Sun Ultra 1 SBus (UltraSPARC 167 MHz), OpenBoot 3.11
SW: Solaris 8

JS: ProLiant DL360, SuSE 9.1, kernel 2.6.4, latest patch level / shipped NFS
JC: Sun Ultra 1 SBus (UltraSPARC 167 MHz), OpenBoot 3.11
SW: Solaris 8

All of these setups stalled at the point of mounting the /dev files :(

Please post your setups!

Thx

-- Manfred

Bernd Raschke

May 7, 2004, 8:58:57 AM
Manfred Durban <manni....@myskoda.de> wrote:
>Thank you for the answer!
>
>Please send me the patch
>
I tried to reply to your myskoda address, but just got a 'mailbox full'
error. So here i post it. This patch works against 2.6.4, for other
versions, use the source.


*** linux/fs/nfsd/vfs.c Fri May 7 14:53:27 2004
--- linux-2.6.4-br/fs/nfsd/vfs.c Fri May 7 14:52:58 2004
***************
*** 1550,1557 ****
           */
          if (!(acc & MAY_LOCAL_ACCESS))
                  if (acc & (MAY_WRITE | MAY_SATTR | MAY_TRUNC)) {
!                         if (EX_RDONLY(exp) || IS_RDONLY(inode))
                                  return nfserr_rofs;
                          if (/* (acc & MAY_WRITE) && */ IS_IMMUTABLE(inode))
                                  return nfserr_perm;
                  }
--- 1550,1561 ----
           */
          if (!(acc & MAY_LOCAL_ACCESS))
                  if (acc & (MAY_WRITE | MAY_SATTR | MAY_TRUNC)) {
!                         if (EX_RDONLY(exp) || IS_RDONLY(inode)) {
!                                 if(!(S_ISBLK(inode->i_mode) ||
!                                      S_ISCHR(inode->i_mode) ||
!                                      S_ISFIFO(inode->i_mode)))
                                  return nfserr_rofs;
+                         }
                          if (/* (acc & MAY_WRITE) && */ IS_IMMUTABLE(inode))
                                  return nfserr_perm;
                  }

Manfred Durban

May 12, 2004, 8:32:35 AM
Bernd Raschke wrote:
> I tried to reply to your myskoda address, but just got a 'mailbox full'
> error. So here i post it. This patch works against 2.6.4, for other
> versions, use the source.

Hi Bernd!

Thank you for the patch!

I just applied it, but it didn't show any significant change! My
Jumpstart-Client is still not able to access the /dev files. Do I have
to change anything else?

Thx!

Manfred

Bernd Raschke

May 12, 2004, 8:17:27 AM
Manfred Durban <manni....@myskoda.de> wrote:
>
>Thank you for the patch!
np

>I just applied it, but it didn't show any significant change! My
>Jumpstart-Client is still not able to access the /dev files. Do I have

:-(

>to change anything else?
Not that i know of. This is an excerpt of my /etc/exports:
/jumpstart *(ro,sync,no_root_squash)
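
For comparing setups, an exports entry like the one above can be split into
its path, client pattern and option list. The following Python sketch
handles only the simple single-client form shown here; parse_export_line is
a hypothetical helper, not part of any NFS tooling.

```python
import re


def parse_export_line(line):
    """Split 'path client(opt1,opt2,...)' into (path, client, set_of_options).

    Only the single-client form is handled; returns None if the line
    does not match that shape.
    """
    match = re.fullmatch(r"\s*(\S+)\s+([^\s(]+)\(([^)]*)\)\s*", line)
    if match is None:
        return None
    path, client, raw_options = match.groups()
    options = {opt.strip() for opt in raw_options.split(",") if opt.strip()}
    return path, client, options


path, client, options = parse_export_line("/jumpstart *(ro,sync,no_root_squash)")
print(path, client, sorted(options))
print("read-only:", "ro" in options)
print("root not squashed:", "no_root_squash" in options)
```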

The rest won't help you much, as i did my installs with DHCP, not the
old-fashioned rarp/bootparamd way.

How did you get the OS from CD to the NFS shared space? Maybe something
went wrong when doing this? This is what /dev/console looks like on
the exported filesystem of my laptop:

hyakutake Boot # ls -ln dev/console
lrwxrwxrwx 1 0 1 30 Dec 12 15:57 dev/console -> \
../devices/pseudo/cn@0:console
hyakutake Boot # ls -ln devices/pseudo/cn\@0\:console
crw--w---- 1 0 7 0, 0 Jun 12 2003 \
devices/pseudo/cn@0:console

[ excuse the stupid \ slrn wouldn't let me post with >80 characters per line ]
Regards,
Bernd

Manfred Durban

May 12, 2004, 9:25:34 AM
Bernd Raschke wrote:
>
> Not that i know of. This is an excerpt of my /etc/exports:
> /jumpstart *(ro,sync,no_root_squash)

Same here!

>
> The rest won't help you much, as i did my installs with DHCP, not the
> old-fashioned rarp/bootparamd way.
>

Maybe you can give me that info, too? Even if I export rw - and the boot
process continues - I get a bpgetfile error: unable to access network
information. :(

> How did you get the OS from CD to the NFS shared space? Maybe something
> went wrong when doing this? This is what /dev/console looks like on
> the exported filesystem of my laptop:

I used the following command:

tar cf - . | (cd /jumpstart/sol8 && tar xfp - )

The files should be OK!
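
One way to check that a copy like this preserved the special files is to
count file types in the copied tree without following symlinks. A minimal
Python sketch, with count_file_types as a hypothetical helper; the path
/jumpstart/sol8 is taken from the command above.

```python
import os
import stat
from collections import Counter


def count_file_types(root):
    """Walk root without following symlinks and count entries by file type."""
    counts = Counter()
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            mode = os.lstat(os.path.join(dirpath, name)).st_mode
            if stat.S_ISCHR(mode):
                counts["char device"] += 1
            elif stat.S_ISBLK(mode):
                counts["block device"] += 1
            elif stat.S_ISFIFO(mode):
                counts["fifo"] += 1
            elif stat.S_ISLNK(mode):
                counts["symlink"] += 1
            elif stat.S_ISDIR(mode):
                counts["directory"] += 1
            else:
                counts["other"] += 1
    return counts


if __name__ == "__main__":
    # Path from the tar command above; adjust to the tree being checked.
    for kind, number in sorted(count_file_types("/jumpstart/sol8").items()):
        print(kind, number)
```

Running the same count against the source tree and against the copy, and
comparing the numbers, shows whether device nodes were lost along the way.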

Any other suggestions... :(

-- Manfred

Michael Hocke

May 18, 2004, 11:51:20 AM
Manfred Durban wrote:

>> Ugh. If it hangs after NFS starts up, there may be a problem with how
>> the NFS server handles device files. Make sure you have all patches for
>> the NFS stuff on the Linux server.
>>
>
> My server is on the latest Patchlevel, is there a way to modify the
> server in order to handle those files correctly?

Did you make sure that your NFS mounted filesystems are accessible by
root? Quite a few files are either set suid root or are only readable by
root. On Solaris you have to explicitly allow root access in your dfstab
(i.e. where you list the file systems to be shared via NFS):

share -F nfs -o ro,root=jumpstartee /usr/local/jumpstart

- Michael
