A question about RAID1

Kelledin

Mar 29, 2001, 12:29:59 PM

I'm not sure exactly how the kernel RAID driver works...but I can apply some
of my own knowledge of hardware RAID.

First of all, in the failure scenario you mention, the first disk in the
array is not always the first one that has data written to it. A good RAID
driver will start both write operations in parallel if possible, and which
operation will complete first is never known for certain.

This is often a problem with low-end hardware RAID as well. If the power
fails in the middle of a write operation, then the RAID data may have slight
inconsistencies. Running a RAID array through a verify/fix operation will
fix inconsistencies but may not actually restore the data to what was
supposed to be written. Consider this scenario:

1. Some given location on a RAID 1 array contains a byte of value 0x12.
2. A RAID driver gets a request to write value 0x21 to this location.
3. The RAID driver starts both writes in parallel. The second disk gets
   written first.
4. The power fails suddenly. Since the first disk didn't get written, the
   RAID data is inconsistent.
5. A verify/fix is started on the RAID array next time the system is powered
   up. Maybe it starts automatically, or maybe a sysadmin starts it.
6. The verify/fix algorithm notices the inconsistency at our given location
   on disk. Not knowing which write completed, it assumes the first disk
   holds the correct data and copies that over the second disk.
7. Now our RAID array contains value 0x12 at the given location on both
   disks. Even though the RAID data is consistent, the value at our given
   location is wrong. Data corruption has occurred, and the RAID driver has
   no way of detecting it.
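
The scenario above can be sketched as a toy simulation in Python. This is
only an illustration, not how any real RAID driver is written: the two
"disks" are one-byte bytearrays, and the names write_with_crash and
naive_resync are made up for this example. The resync policy (always trust
disk 0) is the assumption from the scenario.

```python
def write_with_crash(disks, offset, value, completed):
    """Apply a mirrored write, but only on the disks listed in `completed`.

    Disks not listed model writes that were still in flight when power failed.
    """
    for i in completed:
        disks[i][offset] = value


def naive_resync(disks, offset):
    """Make the mirrors agree by blindly trusting disk 0.

    Returns True if the two copies disagreed before the fix.
    """
    was_inconsistent = disks[0][offset] != disks[1][offset]
    disks[1][offset] = disks[0][offset]
    return was_inconsistent


# Both mirrors start out holding 0x12.
disks = [bytearray([0x12]), bytearray([0x12])]

# The write of 0x21 reaches only the second disk (index 1) before power fails.
write_with_crash(disks, 0, 0x21, completed=[1])

# Verify/fix on the next boot: the array becomes consistent again,
# but the newly written value is silently discarded.
was_inconsistent = naive_resync(disks, 0)
print(was_inconsistent, hex(disks[0][0]), hex(disks[1][0]))
```

Both copies end up holding 0x12, so the mismatch is gone and nothing is left
to indicate that 0x21 was ever written.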

The way high-end hardware RAID controllers often handle this is by having a
battery backup module built into the controller and recording which disk
writes have or have not been completed in some form of non-volatile RAM
(basically, you have a journaling RAID controller). When power fails, the
battery keeps the RAID controller up long enough to keep its journaling info
up-to-date. The writes are then completed next time all components of the
RAID array are functional. This solution proves to be near-perfect (nothing
is absolutely perfect, not even in computers).
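
As a rough sketch of the journaling idea (not the firmware of any actual
controller), the Python below keeps a dict standing in for the
battery-backed NVRAM. The class name JournalingMirror and the order and
crash_after parameters are invented for the example; crash_after is just a
way to simulate power failing after that many disk writes.

```python
class JournalingMirror:
    """Toy RAID 1 with a journal that survives a simulated power failure."""

    def __init__(self, size, copies=2):
        self.disks = [bytearray(size) for _ in range(copies)]
        # Stand-in for NVRAM: offset -> (value, set of disks still pending).
        self.nvram = {}

    def write(self, offset, value, order=(0, 1), crash_after=None):
        # Record the intended write before touching any disk.
        self.nvram[offset] = (value, set(order))
        for n, i in enumerate(order):
            if crash_after is not None and n >= crash_after:
                return  # power lost; the journal entry stays in NVRAM
            self.disks[i][offset] = value
            self.nvram[offset][1].discard(i)
        del self.nvram[offset]  # every copy written, entry no longer needed

    def recover(self):
        # Finish any writes that were still pending when power failed.
        for offset, (value, pending) in list(self.nvram.items()):
            for i in pending:
                self.disks[i][offset] = value
            del self.nvram[offset]


m = JournalingMirror(size=1)
m.write(0, 0x12)
# The second disk is written first, then power fails before the first disk.
m.write(0, 0x21, order=(1, 0), crash_after=1)
m.recover()
print(hex(m.disks[0][0]), hex(m.disks[1][0]))
```

Because the journal holds both the data and the list of disks still owed
the write, recovery can complete the write instead of guessing which copy
to trust.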

AFAIK, software RAID has no solution as near-perfect as this. One possible
workaround for this sort of problem is to use a journaling filesystem such
as ReiserFS, although it still isn't quite as near-perfect. This brings to
mind another question: does the RAID driver cache disk writes, reporting a
disk write as complete before it is actually complete?

Kelledin, the Dreaming Minstrel
http://kelledin.tripod.com/scovsms.jpg

> For a write operation to a 2-disks RAID 1 volume, the raid driver should
> copy the data to both disks. What will happen if the write operation to
> the first disk succeeds and the second fails (due to kernel panic or power
> down)? This will create inconsistence between 2 disks, but does any part
> of RAID code detect this condition?
>
> Thanks
>
> Hsing
>
>

Hamish Marson

Apr 3, 2001, 9:27:01 AM

The AIX LVM has a thing called mirror write consistency. Basically it keeps an
area of the disk as a transaction log to record which mirror copies have
completed their writes, and which haven't. Writes can be either parallel or
serial (selectable on a per-LV basis).

On restart, when the resync kicks in to update any stale mirror copies, it uses
this area of disk to choose which of the two copies is the correct one to copy
over the other.
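
A minimal sketch of that resync step, in Python, going only by the
description in this post and not by the actual AIX on-disk format: the log
is modelled as a dict from offset to the set of mirror copies whose write
completed, and resync_with_log is a made-up name. A crash between a disk
write and its log update is not modelled.

```python
def resync_with_log(disks, log):
    """Repair stale mirror copies using a log of completed writes.

    `log` maps offset -> set of disk indexes whose write finished.
    """
    for offset, done in log.items():
        if not done:
            continue  # no copy was written, so the old data is still consistent
        source = min(done)  # any copy recorded as complete is a valid source
        for i in range(len(disks)):
            if i not in done:
                disks[i][offset] = disks[source][offset]
    log.clear()  # all logged regions are back in sync


# The first copy still holds the old value; only the second completed the write.
disks = [bytearray([0x12]), bytearray([0x21])]
log = {0: {1}}
resync_with_log(disks, log)
print(hex(disks[0][0]), hex(disks[1][0]))
```

The difference from a blind verify/fix is that the copy direction comes
from the log rather than from a fixed guess about which disk is right.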

It takes a little bit of a performance hit, but not enough to notice (i.e. on
disk intensive stuff I've turned it off, and have got maybe a 3-5% difference in
write speed max).
