
Software VS Hardware Raid


From: jo...@iteso.mx
Date: Jan 30, 2002, 1:56:25 AM

On Mon, 28 Jan 2002, Jason Lim wrote:

> detected the drive, but during the part that "lilo: " is supposed to come
> up, nothing did. The disk kept grinding and grinding, and eventually asked
> for a floppy. I was hoping that the 2nd, working drive in the raid array
> would kick in any moment, but that didn't happen. Everything stalled right
> there.

Lilo would have to know about your RAID setup (and of course it doesn't),
that's why it's not recommended to use software RAID on the root partition.

I'd say software RAID should be used on data partitions, and keep a backup
of your root partition somewhere, so that when the disk holding it fails,
you just swap in a new one and recover your root backup. When a disk holding
the data partition (on sw/raid) fails I assume it'd work as advertised.
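
(Purely as an illustrative sketch of the data-partition case: with the raidtools
of this era, a two-disk RAID-1 data array is described in /etc/raidtab roughly as
below. The device names are examples only, substitute whatever partitions you
actually mirror.)

# /etc/raidtab -- example two-disk RAID-1 data array (raidtools)
# /dev/hdb1 and /dev/hdd1 are placeholder partitions on separate disks
raiddev /dev/md0
    raid-level              1
    nr-raid-disks           2
    nr-spare-disks          0
    persistent-superblock   1
    chunk-size              4
    device                  /dev/hdb1
    raid-disk               0
    device                  /dev/hdd1
    raid-disk               1

After "mkraid /dev/md0" you put a filesystem on it and mount it like any other
block device; cat /proc/mdstat shows the state of the array.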

You can't be 24x7-high-availability with software raid only, there's always
some down time involved with it, or at least a higher risk of downtime than
with hardware raid.

> If the bad drive is put in by itself, after a while the disk is
> failed and it tries to boot by floppy.

Does Lilo ever appear? or does the BIOS ask itself for a floppy disk?
if LILO Loading Linux...... does not appear, then your HD is never
going to make it as a root partition holder.


> cable btw. The BIOS had the usual settings allowing me to set the boot
> order (Floppy first, CDrom next, hard disk 0, then network (no, i can't
> put hard disk 1, I wish i could), and finally had "Boot other devices" set
> to yes.

What would happen if you plug the faulty drive on the second HD instead
of the first one? so that lilo boots... ??

>
> My question: if this was hardware RAID 1... would this have happened?
> Would the hardware RAID controller recognise the problem, and only stop
> briefly, then try the second disk automatically and transparently?

In my experience (ICP-Vortex fibrechannel and scsi), yes the hardware RAID does
spot the faulty drive and kicks in with the sane one immediately, the OS is
alerted that a drive is at fault in the array, but apart from that everything
runs smoothly.

Depending on your syslog configuration, it whines that you should change the
faulty drive with a good one until you do.


> Case 2)
> I simulated errors by connecting a flaky IDE cable to one of the drives. I
> was hoping the software RAID would either compensate by doing most of it's
> reading from the good drive (with a good cable) or labelling the flaky
> cable/drive as bad, but instead it started slowing down, and writing to
> the array was taking much longer and strange errors starting occurring
> during writing.
>
> My question: would hardware raid have handled this situation any better?


Again, in my experience: definitely yes.

>
> And as for Hardware IDE raid, which is better... Promise or HighPoint?
> promise seems to be better supported in the kernel, but I'm not so sure.
> What happens when (for example) a disk in the array fails? How do you
> control the hardware raid so you can control a rebuild? And for Promise,
> HighPoint, etc., what are the devices going to be called (/dev/hde? or
> maybe /dev/raid/array1?)

Dunno about IDE RAID, but with ICP-Vortex (both FC and SCSI) you get a nifty
little console application (icpcon) which allows you to manage every feature of
the hardware: you can add/remove/modify arrays, change raid levels in an array,
monitor IO and cache in physical/host/array drives, rescan the bus for new
disks/devices, change cluster/non-shared settings, and so on. Basically icpcon
does everything that the controller BIOS allows, with the same 'interface' but
from the shell.

I assume that Promise or whatever would have an application that would allow
you to mangle the arrays or at least monitor them... but then again, if you don't
have hot-swap capability there isn't much that you can change once the system
is up and running.

I think I saw some IDE RAID boxes with hot-swap bays at Comdex, though I don't
know how widely available those might be, as opposed to SCA/hotswap scsi, which
seems to be everywhere now.


Jose



From: Jason Lim
Date: Jan 30, 2002, 2:02:55 AM

> > If the bad drive is put in by itself, after a while the disk is
> > failed and it tries to boot by floppy.
>
> Does Lilo ever appear? or does the BIOS ask itself for a floppy disk?
> if LILO Loading Linux...... does not appear, then your HD is never
> going to make it as a root partition holder.

Nope, Lilo never gets loaded. Because hda is not working, it gets stuck
there trying to access it continually. I can hear it grinding away, but
nothing happens.

>
> > cable btw. The BIOS had the usual settings allowing me to set the boot
> > order (Floppy first, CDrom next, hard disk 0, then network (no, i can't
> > put hard disk 1, I wish i could), and finally had "Boot other devices" set
> > to yes.
>
> What would happen if you plug the faulty drive on the second HD instead
> of the first one? so that lilo boots... ??
>

Yeap... if you plug the faulty drive into hdc (you know what i mean) and the
working one into hda, then it boots.

> > My question: if this was hardware RAID 1... would this have happened?
> > Would the hardware RAID controller recognise the problem, and only stop
> > briefly, then try the second disk automatically and transparently?
>
> In my experience (ICP-Vortex fibrechannel and scsi), yes the hardware RAID does
> spot the faulty drive and kicks in with the sane one immediately, the OS is
> alerted that a drive is at fault in the array, but apart from that everything
> runs smoothly.
>
> Depending on your syslog configuration, it whines that you should change the
> faulty drive with a good one until you do.

That's what I want then... ;-)

> > Case 2)
> > I simulated errors by connecting a flaky IDE cable to one of the drives. I
> > was hoping the software RAID would either compensate by doing most of it's
> > reading from the good drive (with a good cable) or labelling the flaky
> > cable/drive as bad, but instead it started slowing down, and writing to
> > the array was taking much longer and strange errors starting occurring
> > during writing.
> >
> > My question: would hardware raid have handled this situation any better?
>
>
> Again, in my experience: definitely yes.

No choice but hardware raid then ... despite the extra costs.

The 3ware 7xxx IDE RAID cards and associated hot-swap bays are hot-swappable...
at a price.

wypiwyg (what you pay is what you get) in this case.

From: jo...@iteso.mx
Date: Jan 30, 2002, 2:55:05 AM

On Wed, 30 Jan 2002, Jason Lim wrote:

> > Depending on your syslog configuration, it whines that you should change the
> > faulty drive with a good one until you do.
>
> Thats want I want then... ;-)

[..and from 'closest to debian']
>>Unfortunately, if this happens at 3am in the morning, no one wants to go

I forgot to mention a kewl feature of these controllers (maybe some others also
have it, but when I did my research before buying RAID I don't remember anyone
else's having it): you can define a 'hot-spare' drive and assign it to a given
array (or several), so that when a disk in an array gets fsckd up, the controller
automagically starts to rebuild the array with this 'hot-spare', which then
becomes part of it and thus is no longer available for other arrays... so you get
redundancy back as soon as it finishes rebuilding the array, by the morning :).
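
(For what it's worth, Linux software RAID has the same idea: with raidtools you
declare a spare in /etc/raidtab and the md driver starts rebuilding onto it
automatically when a member dies. A rough fragment, with placeholder device
names only:)

raiddev /dev/md0
    raid-level              1
    nr-raid-disks           2
    nr-spare-disks          1
    persistent-superblock   1
    chunk-size              4
    device                  /dev/hdb1
    raid-disk               0
    device                  /dev/hdd1
    raid-disk               1
    device                  /dev/hde1
    spare-disk              0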

> The 3ware 7xxx IDE RAID cards and associated hot-swap bays are hot-swappable...
> at a price.
>
> wypiwyg (what you pay is what you get) in this case.
>

Yup, I agree; the dual-channel 64bit FC RAIDs + JBODs + disks are the most
expensive thing we've got here (pricier than the not-so-shameful debian boxes
they are attached to, actually). But the single-channel SCSI/Ultra160s are not
that expensive, considering they have basically all the features of the first
ones and are only missing the fibrechannel-related stuff.

The only problem we had with them was with those pesky Intel GX mobos, but
that changed when the 'new' STL2s came in.


Jose

From: Russell Coker
Date: Jan 30, 2002, 3:20:17 AM

On Wed, 30 Jan 2002 17:54, jo...@iteso.mx wrote:
> > detected the drive, but during the part that "lilo: " is supposed to come
> > up, nothing did. The disk kept grinding and grinding, and eventually
> > asked for a floppy. I was hoping that the 2nd, working drive in the raid
> > array would kick in any moment, but that didn't happen. Everything
> > stalled right there.
>
> Lilo would have to know about your RAID setup (and of course it doesn't),
> that's why it's not recommended to use software RAID on the root
> partition.

Who recommends that you don't use software RAID on the root file system?

Not me (lilo maintainer and user of this), not the lilo author, not the
software RAID kernel maintainer.

> I'd say software RAID should be used on data partitions, and keep a
> backup of your root partition somewhere, so that when the disk holding it
> fails, you just swap in a new one and recover your root backup. When a disk
> holding the data partition (on sw/raid) fails I assume it'd work as
> advertised.

If the primary disk fails and the BIOS and boot loader don't allow booting
from the second disk then you just have to physically swap disks (which is
much less effort than swapping disks and restoring from backup).

> You can't be 24x7-high-availability with software raid only, there's
> always some down time involved with it, or at least a higher risk of
> downtime than with hardware raid.

Actually LinuxBIOS could solve this issue...

--
http://www.coker.com.au/bonnie++/ Bonnie++ hard drive benchmark
http://www.coker.com.au/postal/ Postal SMTP/POP benchmark
http://www.coker.com.au/projects.html Projects I am working on
http://www.coker.com.au/~russell/ My home page

From: Jose Alberto Guzman
Date: Jan 31, 2002, 6:24:03 PM

Russell Coker wrote:

>On Wed, 30 Jan 2002 17:54, jo...@iteso.mx wrote:
>
>>>detected the drive, but during the part that "lilo: " is supposed to come
>>>up, nothing did. The disk kept grinding and grinding, and eventually
>>>asked for a floppy. I was hoping that the 2nd, working drive in the raid
>>>array would kick in any moment, but that didn't happen. Everything
>>>stalled right there.
>>>
>> Lilo would have to know about your RAID setup (and of course it doesn't),
>> that's why it's not recommended to use software RAID on the root
>>partition.
>>
>

>Who recommends that you don't use software RAID on the root file system?
>
>Not me (lilo maintainer and user of this), not the lilo author, not the
>software RAID kernel maintainer.
>

Sorry, I'm not up to date on the newest features of LILO (it's cool that it
supports SW/RAID now, btw); I stated this because of what I read in the
Software-RAID-HOWTO.

http://www.ibiblio.org/pub/Linux/docs/HOWTO/other-formats/html_single/Software-RAID-HOWTO.html

'The latest official lilo distribution (Version 21) doesn't handle RAID
devices, and thus the kernel cannot be loaded at boot-time from a RAID
device. If you use this version, your /boot filesystem will have to
reside on a non-RAID device. A way to ensure that your system boots no
matter what is, to create similar /boot partitions on all drives in
your RAID, that way the BIOS can always load data from eg. the first
drive available. This requires that you do not boot with a failed disk
in your system.'

It is stated there also that you can boot root RAID filesystems, but it
requires some tweaking (applying some RedHat patches to lilo,
installing on a spare disk, then copying the installation on the RAID
fs...), which is less straightforward than having the / partition on a
normal device.
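
(If your LILO is recent enough to understand md devices, the lilo.conf side is
roughly the sketch below. This assumes /dev/md0 is a RAID-1 root built from
/dev/hda1 and /dev/hdc1; check your LILO version's documentation for the exact
raid-extra-boot semantics before relying on it.)

# lilo.conf sketch for a RAID-1 root with a RAID-aware LILO
# boot= installs the boot loader via the md device;
# raid-extra-boot also writes boot records to both member disks.
boot=/dev/md0
raid-extra-boot="/dev/hda,/dev/hdc"
root=/dev/md0
image=/vmlinuz
        label=Linux
        read-only

That way, if hda dies, hdc still carries a usable boot record (assuming the
BIOS will actually try booting from it).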

Btw, while searching for the howto, I found several of them dealing with
the issue:
http://www.ibiblio.org/pub/Linux/docs/HOWTO/other-formats/html_single/Root-RAID-HOWTO.html
http://www.ibiblio.org/pub/Linux/docs/HOWTO/other-formats/html_single/Boot+Root+Raid+LILO.html

>
>> I'd say software RAID should be used on data partitions, and keep a
>>backup of your root partition somewhere, so that when the disk holding it
>>fails, you just swap in a new one and recover your root backup. When a disk
>>holding the data partition (on sw/raid) fails I assume it'd work as
>>advertised.
>>
>

>If the primary disk fails and the BIOS and boot loader don't allow booting
>from the second disk then you just have to physically swap disks (which is
>much less effort than swapping disks and restoring from backup).
>

>> You can't be 24x7-high-availability with software raid only, there's
>>always some down time involved with it, or at least a higher risk of
>>downtime than with hardware raid.
>>
>

>Actually LinuxBIOS could solve this issue...
>


From: Jason Lim
Date: Jan 31, 2002, 9:44:16 PM

> > drive available. This requires that you do not boot with a failed disk
> > in your system.'
>

> Which won't necessarily work with the most recent LILO because it relies on
> the BIOS detecting the disk as bad and skipping it (which may not happen).
>

I think that may have been the problem I encountered... the motherboard
BIOS did not recognize the disk as bad/failed, because the disk was sort
of moving and working (grinding sounds could be heard from it)... the
drive responded during the disk identification bootup phase. Maybe some
other motherboards might work better, but I tested with an Asus and
Magic-Pro motherboard, and had the same thing happen.

I think (hopefully) that a Hardware IDE Raid card should solve this
problem. I am in the process of buying a couple of 3ware cards right now
(especially after Promise said outright that they do not support Debian,
and Adaptec had no response, only 3ware replied with help). I will connect
the failed drive to it, and see if it does anything, and let you know.
Perhaps enough of the disk's chipsets and such are responding that it
could even trick the RAID card into thinking it is sort of working
(although one would imagine that a specialized RAID card would have more
intelligence than a regular IDE motherboard).

From: Russell Coker
Date: Jan 31, 2002, 10:47:47 PM

On Fri, 1 Feb 2002 13:36, Jason Lim wrote:
> > > drive available. This requires that you do not boot with a failed disk
> > > in your system.'
> >
> > Which won't necessarily work with the most recent LILO because it relies on
> > the BIOS detecting the disk as bad and skipping it (which may not happen).
>
> I think that may have been the problem I encountered... the motherboard
> BIOS did not recognize the disk as bad/failed, because the disk was sort
> of moving and working (grinding sounds could be heard from it)... the
> drive responded during the disk identification bootup phase. Maybe some
> other motherboards might work better, but I tested with an Asus and
> Magic-Pro motherboard, and had the same thing happen.

There are some motherboards which have software RAID in the BIOS. This
allows them to deal with that problem at boot time, and then the kernel does
software RAID with the same mapping once it's loaded.

> I think (hopefully) that a Hardware IDE Raid card should solve this
> problem. I am in the process of buying a couple of 3ware cards right now
> (especially after Promise said outright that they do not support Debian,
> and Adaptec had no response, only 3ware replied with help). I will connect
> the failed drive to it, and see if it does anything, and let you know.

I look forward to it.

Also perhaps you should post a photo-copy of your 3ware receipt to Promise
with an explanation of why you could never purchase or recommend their
products.

--
http://www.coker.com.au/bonnie++/ Bonnie++ hard drive benchmark
http://www.coker.com.au/postal/ Postal SMTP/POP benchmark
http://www.coker.com.au/projects.html Projects I am working on
http://www.coker.com.au/~russell/ My home page

From: Jason Lim
Date: Feb 1, 2002, 2:00:48 AM

>
> There are some motherboards which have software RAID in the BIOS. This
> allows them to deal with that problem at boot time, and then the kernel does
> software RAID with the same mapping once it's loaded.

Mmm... those boards use the Highpoint chip... that's not real good at
anything ;-) Performance is lackluster, and reliability... well, let's put
it this way: there are IDE RAID cards here with Highpoint chips for US$25
with change ;-)

>
> > I think (hopefully) that a Hardware IDE Raid card should solve this
> > problem. I am in the process of buying a couple of 3ware cards right now
> > (especially after Promise said outright that they do not support Debian,
> > and Adaptec had no response, only 3ware replied with help). I will connect
> > the failed drive to it, and see if it does anything, and let you know.
>
> I look forward to it.
>
> Also perhaps you should post a photo-copy of your 3ware receipt to Promise
> with an explanation of why you could never purchase or recommend their
> products.
>

3ware, Promise, and Adaptec are all priced at roughly the same level. Had
Promise even been willing to recompile their binary for Debian, I would have
considered them... but their total lack of help means they weren't even in the
running. In these hard economic times, you'd imagine Promise would be a bit
"nicer". Oh well.
