booting off raid1 /dev/mdX, lilo problems

Giulio Orsero

05.05.2001, 04:27:13
System details at the end of the email.

PROBLEM:
I understand the latest lilo should be able to boot off a raid device, but I'm
having trouble doing so. Booting with the complete mirror is ok, booting with
just the first disk (hda) is ok, but booting with just the 2nd (hdc) stops at LI.
For each test I power off, disconnect the IDE cable of one disk, and reboot.
All BIOS settings are on auto.

I see there are some weird things with CHS (I'm using old HDs on a test system,
before doing this on a "real" system), could they be the cause of my problems?

When running "lilo" I see a * (star) next to the "linux" image for hda, but
not next to the one for hdc. I tried setting the bootable flag on hdc2, as the
partition table below shows, but it does not change anything.

I'm sure the BIOS can boot with just hdc, as I tried this in the past with
non-mirrored disks.


SYSTEM DETAILS:
RedHat6.x system with 2.2.19 kernel (from RH + some minor patches not related to
raid, so with 0.90 raid).

== /proc/mdstat
Personalities : [raid1]
read_ahead 1024 sectors
md0 : active raid1 hdc2[1] hda2[0] 819392 blocks [2/2] [UU]
md1 : active raid1 hdc3[1] hda3[0] 205568 blocks [2/2] [UU]
md2 : active raid1 hdc4[1] hda4[0] 177344 blocks [2/2] [UU]
unused devices: <none>

== lilo version
LILO version 21.4-4

== lilo.conf
boot = /dev/md0
message = /boot/message
lba32

timeout = 200

image = /boot/vmlinuz
label = linux
root = /dev/md0
read-only

== output of lilo -v (no * for the hdc "linux")
LILO version 21.4-4, Copyright (C) 1992-1998 Werner Almesberger
'lba32' extensions Copyright (C) 1999,2000 John Coffman

boot = /dev/hda, map = /boot/map.0302
Reading boot sector from /dev/hda
Merging with /boot/boot.b
Mapping message file /boot/message
Boot image: /boot/vmlinuz
Added linux *
/boot/boot.0300 exists - no backup copy made.
Writing boot sector.
boot = /dev/hdc, map = /boot/map.1602
Reading boot sector from /dev/hdc
Merging with /boot/boot.b
Mapping message file /boot/message
Boot image: /boot/vmlinuz
Added linux
/boot/boot.1600 exists - no backup copy made.
Writing boot sector.

== partition tables
Partition Table for /dev/hda
         ---Starting---      ----Ending----    Start Number of
 # Flags Head Sect  Cyl   ID Head Sect  Cyl   Sector   Sectors
-- ----- ---- ---- ---- ---- ---- ---- ---- -------- ---------
 1  0x00    1    1    0 0x82  127   63   32       63    266049
 2  0x00    0    1   33 0xFD  127   63  236   266112   1645056
 3  0x00    0    1  237 0xFD  127   63  287  1911168    411264
 4  0x00    0    1  288 0xFD  127   63  614  2322432   2636928

Partition Table for /dev/hdc
         ---Starting---      ----Ending----    Start Number of
 # Flags Head Sect  Cyl   ID Head Sect  Cyl   Sector   Sectors
-- ----- ---- ---- ---- ---- ---- ---- ---- -------- ---------
 1  0x00    1    1    0 0x82   63   63   24       63    100737
 2  0x80    0    1   25 0xFD   63   63  431   100800   1641024
 3  0x00    0    1  432 0xFD   63   63  533  1741824    411264
 4  0x00    0    1  534 0xFD   63   63  621  2153088    354816

== kernel boot messages regarding partition
hda: SAMSUNG WU32543A (2.54GB), ATA DISK drive
hdc: FUJITSU M1636TAU, ATA DISK drive
hdd: PCRW804, ATAPI CDROM drive
hda: SAMSUNG WU32543A (2.54GB), 2423MB w/109kB Cache, CHS=615/128/63
hdc: FUJITSU M1636TAU, 1226MB w/128kB Cache, CHS=2491/16/63

Partition check:
hda: hda1 hda2 hda3 hda4
hdc: [PTBL] [622/64/63] hdc1 hdc2 hdc3 hdc4


== /etc/raidtab
raiddev /dev/md0
raid-level 1
nr-raid-disks 2
nr-spare-disks 0
chunk-size 4
persistent-superblock 1
device /dev/hda2
raid-disk 0
device /dev/hdc2
raid-disk 1

raiddev /dev/md1
raid-level 1
nr-raid-disks 2
nr-spare-disks 0
chunk-size 4
persistent-superblock 1
device /dev/hda3
raid-disk 0
device /dev/hdc3
raid-disk 1

raiddev /dev/md2
raid-level 1
nr-raid-disks 2
nr-spare-disks 0
chunk-size 4
persistent-superblock 1
device /dev/hda4
raid-disk 0
device /dev/hdc4
raid-disk 1

== df
Filesystem 1k-blocks Used Available Use% Mounted on
/dev/md0 792800 203145 548686 27% /
/dev/md1 199015 13 188724 0% /data1
/dev/md2 171711 13 162831 0% /data2

hda1 and hdc1 are used for swap, not on raid.

Thanks

--
giu...@pobox.com
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majo...@vger.kernel.org

Al Hooton

05.05.2001, 12:28:36
At 10:27 AM 5/5/2001 +0200, Giulio Orsero wrote:
>PROBLEM:
>I understand the latest lilo should allow to boot off a raid device, however
>I'm having trouble doing so. Booting with the complete mirror is ok, booting
>with the first disk (hda) is ok, booting with just the 2nd (hdc) stops at LI.
>When doing tests I power off, and disconnect the IDE cable and reboot.
>BIOS is all to auto.

I had this exact problem with Mandrake 8.0 (2.4.3-20mdk), and
decided to do the following. Now that I've done it, I even like it better,
because I think it's simpler (less to go wrong).

Summary: since I couldn't get LILO to write out bootloaders that
would successfully boot off /dev/mdX and cope with /dev/mdX running in
degraded mode (your question, actually), I now
write a slightly different boot loader to the MBR of each disk in the
array. This way the machine will come up off whichever physical disk is
the first it finds at boot time, and raid does not come into play
until the kernel mounts the root fs from /dev/mdX (which *does*
work in degraded mode).

Details:

- Make one copy of /etc/lilo.conf for each disk, i.e., lilo-sda.conf,
lilo-sdb.conf, etc.

- Edit each copy as such:
- Make the "boot=" line match the device, i.e., boot=/dev/sda,
boot=/dev/sdb, etc.
- Keep a different map file for each bootloader by editing the
"map=" entry in each file, i.e., map=map-sda, map=map-sdb, etc.

- Write a boot loader to each disk, i.e., "lilo -C /etc/lilo-sda.conf",
"lilo -C /etc/lilo-sdb.conf", etc.

Now, the box can correctly boot off *any* disk in the array
without worrying about raid support.
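
A minimal sketch of the steps above, assuming a base config like the one posted in the first message. It only builds the text of each per-disk config; it does not write files or run lilo. The helper name per_disk_conf and the /boot/map-<disk> path are illustrative assumptions, not something from the setup described here.

```python
import re

def per_disk_conf(base_conf: str, disk: str) -> str:
    """Return a copy of base_conf whose boot= and map= lines target one disk."""
    name = disk.rsplit("/", 1)[-1]            # "/dev/sda" -> "sda"
    kept = []
    for line in base_conf.splitlines():
        # Drop any existing boot=/map= lines; they are re-added below.
        if re.match(r"\s*(boot|map)\s*=", line):
            continue
        kept.append(line)
    # Global options must come before the first image= section.
    header = [f"boot = {disk}", f"map = /boot/map-{name}"]
    return "\n".join(header + kept) + "\n"

# Base config modelled on the lilo.conf from the first message.
BASE = """boot = /dev/md0
message = /boot/message
lba32
timeout = 200
image = /boot/vmlinuz
label = linux
root = /dev/md0
read-only
"""

for disk in ("/dev/sda", "/dev/sdb"):
    name = disk.rsplit("/", 1)[-1]
    print(f"# would be saved as /etc/lilo-{name}.conf")
    print(per_disk_conf(BASE, disk))
```

Each generated file would then be installed with "lilo -C <file>", as in the last step above. Note that root= stays on /dev/md0; only the place where the boot sector and map file go changes per disk.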

Caveat: I've only done all this with a raid-1 set, and have no
first-hand knowledge for other raid configs.

Hope that helps!

-Al


-------------------------------------------------------------------
                                      | voice: 503.247.9256
Lots of folks confuse bad management  | email: a...@teleport.com
with destiny.                         | cell: 503.709.0028
                                      | email to my cell:
                   - Kin Hubbard      | 50370...@mobile.att.net
-------------------------------------------------------------------

Giulio Orsero

06.05.2001, 06:39:00

>), I now
>write a slightly different boot loader to the MBR of each disk in the
>array. This way the machine will come up off whichever physical disk is
>...

>- Write a boot loader to each disk, i.e., "lilo -C /etc/lilo-sda.conf",
> "lilo -C /etc/lilo-sdb.conf", etc.
Thanks, this way it works; now my mirror boots off hdc just fine.

However, I had to run lilo for hdc with hda disconnected.
With hda connected, lilo gave me a warning about "hdc not first disk", and at
boot I got "LI".
I tried adding this to the hdc lilo.conf:
disk = /dev/hdc
bios = 0x80
This eliminated the warning, but booting still stopped at "LI".

Did lilo work for you with both hda and hdc connected?

Thanks.

Al Hooton

06.05.2001, 11:03:20
At 12:39 PM 5/6/2001 +0200, Giulio Orsero wrote:
> >), I now
> >write a slightly different boot loader to the MBR of each disk in the
> >array. This way the machine will come up off whichever physical disk is
> >...
> >- Write a boot loader to each disk, i.e., "lilo -C /etc/lilo-sda.conf",
> > "lilo -C /etc/lilo-sdb.conf", etc.
>
>Thanks, this way it works; now my mirror boots off hdc just fine.
>
>However, I had to do the lilo for hdc with hda disconnected.
>With hda connected lilo gave me a warning about "hdc not first disk" and
>at boot I got "LI".
>I tried adding to the hdc lilo.conf
>disk = /dev/hdc
>bios = 0x80
>this eliminated the warning, but still "LI".
>
>Did lilo work for you with both hda and hdc connected?

Giulio,

No, I didn't have to disconnect any drives when writing out the
different bootloaders, but I'm using SCSI drives, not IDE drives (I'm
assuming you're using IDE drives because your device names are
"hdX"). It's possible this is an issue for lilo when using IDE, but not SCSI.

-Al


Dave Meythaler

07.05.2001, 14:13:02
The boot hanging at "LI" seems to be a "feature" of the LILO which shipped
with RedHat 6.2.

The simple workaround for it is to add a "default=" line to your lilo.conf
file.

I haven't heard of more recent versions of lilo having this problem, but
have not tried any of them myself (or later RH versions).


-----Original Message-----
From: Giulio Orsero [mailto:giu...@pobox.com]
Sent: Saturday, May 05, 2001 1:27 AM
To: linux...@vger.kernel.org
Subject: booting off raid1 /dev/mdX, lilo problems


System details at the end of the email.

PROBLEM:
I understand the latest lilo should allow to boot off a raid device, however I'm
having trouble doing so. Booting with the complete mirror is ok, booting with
the first disk (hda) is ok, booting with just the 2nd (hdc) stops at LI.
When doing tests I power off, and disconnect the IDE cable and reboot.
BIOS is all to auto.

<cut ...>

<cut ...>

Giulio Orsero

07.05.2001, 15:36:28
On Mon, 7 May 2001 11:13:02 -0700, you wrote:

>The boot hanging at "LI" seems to be a "feature" of the LILO which shipped
>with RedHat 6.2.
>The simple workaround for it is to add a "default=" line to your lilo.conf
>file.
>I haven't heard of more recent versions of lilo having this problem, but
>have not tried any of them myself (or later RH versions).

I've tried the lilo from RH 7.1 (21.4.4), and even 21.7.

Yes, adding "default = linux" made the * appear next to the hdc "linux" image,
but booting still stopped at LI.

I ran some tests (maybe the following is very obvious, but since the
howto has not been updated for the boot=/dev/md0 setup, I missed it):
it seems that for a single lilo.conf with boot=/dev/md0 to work, the 2 disks
need the same heads and sectors (the H and S of CHS), and the /dev/mdX device
you boot from must start at the same cylinder (C) on both.
Otherwise you need the 2 lilo.conf.hdx setup that Al suggested in the previous
email and/or explicit CHS values in lilo.conf.
Basically, the 2 lilo.conf's setup will _always_ work, while boot=/dev/md0 has
more initial requirements (CHS, "default" in lilo.conf, ...).
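
To illustrate why the H and S values matter, here is a small worked example using the standard CHS-to-linear-sector formula and the partition tables from the first message. It is only arithmetic on the posted numbers, not a description of what LILO does internally; the function name chs_to_lba is an illustrative choice.

```python
def chs_to_lba(c, h, s, heads, sectors):
    """Linear sector number of CHS address (c, h, s) for a given geometry."""
    return (c * heads + h) * sectors + (s - 1)

# Start of the second partition (the md0 member) on each disk, using the
# geometry each partition table was written with (128 and 64 heads).
hda2 = chs_to_lba(33, 0, 1, heads=128, sectors=63)
hdc2 = chs_to_lba(25, 0, 1, heads=64, sectors=63)
print(hda2, hdc2)        # 266112 100800, matching the "Start Sector" column

# The same hdc2 CHS triple read with the 16-head geometry the kernel
# reported for the drive itself (CHS=2491/16/63) lands somewhere else.
hdc2_16 = chs_to_lba(25, 0, 1, heads=16, sectors=63)
print(hdc2_16)           # 25200
```

The same cylinder/head/sector triple points at a different sector as soon as the head count changes, which fits the observation that a boot sector built for one geometry stops at LI on a disk that is addressed with another.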

The problem is that fdisk/bios/linux like to assign different default CHS even
to equal HDs. Today I tried with 2 _identical_ 10GB IDE disks: I didn't pay
attention to CHS (I made MB-based partitions) and hda was recognized as
H255,S63,CX, while hdc was something like H63,S63 or similar. Raid1 worked, boot
did not. I had to force the hdc disk to H255,S63, and then both raid1 and
booting off raid1 worked.

I was able to make my initial setup (2 completely different HDs) work with
boot=/dev/md0 by forcing the 2nd HD to H128,S63,C311 (it was H64,S63,C622, as in
the kernel's [622/64/63] partition check above, while the 1st is H128,S63,C615).
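
A quick check of the numbers, assuming the 622/64/63 geometry the kernel printed in its partition check for hdc: halving the cylinders while doubling the heads addresses exactly the same number of sectors, so the forced geometry still fits the drive. The function name total_sectors is illustrative.

```python
def total_sectors(cylinders, heads, sectors):
    """Number of sectors addressable with a given CHS geometry."""
    return cylinders * heads * sectors

old_hdc    = total_sectors(622, 64, 63)    # geometry from the hdc partition table
forced_hdc = total_sectors(311, 128, 63)   # geometry hdc was forced to
native_hdc = total_sectors(2491, 16, 63)   # geometry the drive itself reported
hda        = total_sectors(615, 128, 63)   # first disk

print(old_hdc, forced_hdc)   # 2507904 2507904
print(native_hdc, hda)       # 2510928 4959360
```

After the change both disks share 128 heads and 63 sectors per track; only the cylinder counts differ, because the disks have different sizes.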

Thanks.

--
giu...@pobox.com

Jim Meyer

07.05.2001, 14:33:26
Howdy!

A friend revealed this to me some time ago: the letters "LILO" actually
come up individually at boot time, each after successfully completing a
major phase of the bootstrap. Here's the official word from the LILO
README:

When LILO loads itself, it displays the word "LILO". Each letter is printed
before or after performing some specific action. If LILO fails at some
point, the letters printed so far can be used to identify the problem. This
is described in more detail in the technical overview.

Note that some hex digits may be inserted after the first "L" if a
transient disk problem occurs. Unless LILO stops at that point, generating
an endless stream of error codes, such hex digits do not indicate a severe
problem.

(<nothing>) No part of LILO has been loaded. LILO either isn't installed
  or the partition on which its boot sector is located isn't active.
L <error> ... The first stage boot loader has been loaded and started,
  but it can't load the second stage boot loader. The two-digit error
  codes indicate the type of problem. (See also section "Disk error
  codes".) This condition usually indicates a media failure or a geometry
  mismatch (e.g. bad disk parameters, see section "Disk geometry").
LI The first stage boot loader was able to load the second stage boot
  loader, but has failed to execute it. This can either be caused by a
  geometry mismatch or by moving /boot/boot.b without running the map
  installer.
LIL The second stage boot loader has been started, but it can't load
  the descriptor table from the map file. This is typically caused by a
  media failure or by a geometry mismatch.
LIL? The second stage boot loader has been loaded at an incorrect
  address. This is typically caused by a subtle geometry mismatch or by
  moving /boot/boot.b without running the map installer.
LIL- The descriptor table is corrupt. This can either be caused by a
  geometry mismatch or by moving /boot/map without running the map
  installer.
LILO All parts of LILO have been successfully loaded.
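
The list above can be read as a simple lookup table. The sketch below condenses the README wording into a dictionary and maps whatever is left on screen to its explanation; the names STAGES and diagnose are illustrative, and the hex-code pattern is an assumption about how the "L <error> ..." case looks on screen.

```python
import re

# Meanings condensed from the README excerpt quoted above.
STAGES = {
    "": "nothing loaded: LILO is not installed or its partition is not active",
    "LI": "second stage loaded but not executed: geometry mismatch, "
          "or /boot/boot.b moved without rerunning the map installer",
    "LIL": "second stage started but cannot load the descriptor table: "
           "media failure or geometry mismatch",
    "LIL?": "second stage loaded at an incorrect address: subtle geometry "
            "mismatch, or /boot/boot.b moved without rerunning the map installer",
    "LIL-": "descriptor table corrupt: geometry mismatch, "
            "or /boot/map moved without rerunning the map installer",
    "LILO": "all parts of LILO loaded successfully",
}

def diagnose(shown: str) -> str:
    """Map the characters left on screen to the README's explanation."""
    shown = shown.strip()
    if shown in STAGES:
        return STAGES[shown]
    # "L" alone or followed by two-digit hex error codes, e.g. "L 01 01".
    if re.fullmatch(r"L(\s*[0-9A-Fa-f]{2})*", shown):
        return ("first stage cannot load the second stage: "
                "media failure or geometry mismatch")
    return "not a prompt described in the README excerpt"

print(diagnose("LI"))
```

For the "LI" seen in this thread, both causes the README names are a geometry mismatch or a moved /boot/boot.b.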

Cheers!

--j

Dave Meythaler wrote:
>
> The boot hanging at "LI" seems to be a "feature" of the LILO which shipped
> with RedHat 6.2.
>
> The simple workaround for it is to add a "default=" line to your lilo.conf
> file.
>
> I haven't heard of more recent versions of lilo having this problem, but
> have not tried any of them myself (or later RH versions).
>


--
Jim Meyer, Geek At Large pu...@wildbrain.com

Michael

07.05.2001, 17:48:45
> On Mon, 7 May 2001 11:13:02 -0700, you wrote:
>
> >The boot hanging at "LI" seems to be a "feature" of the LILO which shipped
> >with RedHat 6.2.
> >The simple workaround for it is to add a "default=" line to your lilo.conf
> >file.
> >I haven't heard of more recent versions of lilo having this problem, but
> >have not tried any of them myself (or later RH versions).
>
> I've tried rh71 21.4.4, and even 21.7.
>
> Yes, adding "default = linux" made the * near hdc "linux" image to
> appear, but LI was still there.
>

<snip>

I suspect that the problem has to do with the raid aware LILO not
being smart enough to figure out the individual disk geometries. You
can circumvent the problem entirely by always specifying the disk
geometry and the BIOS device number you boot from. This is particularly
useful because not all systems automatically boot from the next
available device, and some systems will not boot at all from device
0x81. Most, but not all, SCSI subsystems will boot from the next
device, however, if the failed 1st disk does not answer the SCSI bus probe.

The simplest solution is to pretend you do not have a raid aware LILO
and do it the old way, as outlined in the Boot+Root+Raid+LILO howto.
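
As a rough sketch of "specifying the disk geometry and the BIOS number", the snippet below prints a lilo.conf disk section. The keywords disk=, bios=, sectors=, heads= and cylinders= are lilo.conf disk-section options; the values are the ones mentioned earlier in this thread for hdc and are illustrative only, and disk_section is a hypothetical helper. Check lilo.conf(5) for your LILO version before relying on it.

```python
def disk_section(device, bios, cylinders, heads, sectors):
    """Render a lilo.conf disk section with explicit BIOS number and geometry."""
    return "\n".join([
        f"disk = {device}",
        f"    bios = 0x{bios:02x}",
        f"    sectors = {sectors}",
        f"    heads = {heads}",
        f"    cylinders = {cylinders}",
    ]) + "\n"

# hdc as it would look when it is the only disk the BIOS sees (0x80),
# with the geometry it was forced to earlier in the thread.
print(disk_section("/dev/hdc", 0x80, cylinders=311, heads=128, sectors=63))
```

The printed stanza would go into the global part of the lilo.conf used for that disk, before the first image= line.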

Michael
Mic...@Insulin-Pumpers.org
