Can't Recover from RAID5 Failure

Cress, Andrew R

Nov 8, 2001, 3:33:19 PM

What kernel version are you running?
What does 'cat /proc/mdstat' report? Is the raid started at all?
Apparently the raid5 is not your boot disk.

I'm not sure how the system handles reassigning device names between steps 6
& 7.
It may be that sdd became sdc in step 7, then when the old sdc is inserted
into the middle again in step 8, it doesn't adjust the raid5 configuration
to account for the shift, so that it now only finds sda & sdb as valid, but
can't understand which of sdc & sdd should be valid.
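
The renaming described above can be modelled with a short sketch. It assumes
the kernel hands out sda, sdb, ... strictly in detection order, and the
disk0..disk3 labels are hypothetical names for the four physical drives; it
is an illustration, not the kernel's actual probing code.

```python
import string


def assign_names(present_disks):
    """Name detected disks sda, sdb, ... in detection order (simplified model)."""
    return {
        "sd" + string.ascii_lowercase[i]: disk
        for i, disk in enumerate(present_disks)
    }


# Hypothetical labels for the four physical drives, in bus order.
all_disks = ["disk0", "disk1", "disk2", "disk3"]

# Steps 1-5: all four drives powered.
normal = assign_names(all_disks)

# Steps 6-7: the third drive is unpowered, so the fourth one moves up a letter.
degraded = assign_names([d for d in all_disks if d != "disk2"])

print("all drives present:", normal)
print("third drive removed:", degraded)
```

In this model the name sdc refers to a different physical drive in step 7
than it did in steps 1 through 5.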

Andy

-----Original Message-----
From: Gary Huntress [mailto:ghun...@mediaone.net]
Sent: Tuesday, October 30, 2001 5:43 PM
To: linux...@vger.kernel.org
Subject: Can't Recover from RAID5 Failure

==============================================
[...]

1) Configured a very generic /etc/raidtab pointing to /dev/sda1 through sdd1
with no spares
2) Created array /dev/md0 with no problems (120 minutes! wow!)
3) Mounted /dev/md0 and began using it as a data partition for postgres
4) Beat on full 4 drive array for a while using psql
5) Shutdown system normally
6) Unplugged /dev/sdc1 power
7) Rebooted and came up fine, remounted /dev/md0 again and also beat on it
with psql
8) Shutdown normally again and plugged /dev/sdc power back in

Now I want to recover my array after simulating this failure. I guess I'm
supposed to use raidhotadd /dev/md0 /dev/sdc1 but that doesn't work because
it appears that the array is not online.

raidstart /dev/md0 complains that there is a problem with /dev/sda1,
therefore 2 of 4 drives are offline and it can't start the array. /dev/sda1
is definitely physically fine because I'm rebuilding (wiping out) the array
and restarting from scratch to repeat this.

[...]
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majo...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

Gary Huntress

Nov 8, 2001, 8:09:13 PM

Andrews (and everyone),

I won't repeat my entire problem here; it's laid out in the included email
below. Refer to steps 1 through 8 for what I have done so far. I am still
left trying to recover an array of 4 drives after physically removing one
device, running degraded, and then returning the device to its original
position (this was just a test).

To answer your questions, I am running 2.2.16-22 as part of a plain vanilla
RH 7.0 installation. I believe the raidtools are 0.90.
No, this is not my boot disk.

/proc/mdstat reports:

Personalities : [raid5]
read_ahead not set
unused devices: <none>

Your statement about my sdd becoming sdc is exactly what I think is the
problem; I just don't know how to correct it. I would think that my
situation would be typical (rebooting and having the /dev/sd? drive
assignments change due to lost devices).

Here is my very generic /etc/raidtab:
raiddev /dev/md0
    raid-level              5
    nr-raid-disks           4
    nr-spare-disks          0
    persistent-superblock   1
    parity-algorithm        left-symmetric
    chunk-size              32
    device                  /dev/sda1
    raid-disk               0
    device                  /dev/sdb1
    raid-disk               1
    device                  /dev/sdc1
    raid-disk               2
    device                  /dev/sdd1
    raid-disk               3
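
The device/raid-disk pairs in a raidtab like the one above amount to a fixed
mapping from array slot to device name. As a minimal sketch (the function
name parse_raidtab_devices is hypothetical and this is not how raidtools
itself parses the file), the mapping can be read out like this:

```python
def parse_raidtab_devices(text):
    """Return {raid-disk slot: device name} from raidtab-style text."""
    slots = {}
    current_device = None
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank lines and anything that is not "key value"
        key, value = parts
        if key == "device":
            current_device = value
        elif key == "raid-disk" and current_device is not None:
            slots[int(value)] = current_device
            current_device = None
    return slots


raidtab = """
raiddev /dev/md0
raid-level 5
nr-raid-disks 4
device /dev/sda1
raid-disk 0
device /dev/sdb1
raid-disk 1
device /dev/sdc1
raid-disk 2
device /dev/sdd1
raid-disk 3
"""

slots = parse_raidtab_devices(raidtab)
for slot in sorted(slots):
    print(slot, slots[slot])
```

The mapping is keyed on names such as /dev/sdc1, so it says nothing about
which physical drive currently carries that name.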

Here is what I get when I try to start the array after putting the fourth
drive back in:
(read) sda1's sb offset: 8909696 [events: 0000000d]
(read) sdb1's sb offset: 8909696 [events: 0000000d]
(read) sdc1's sb offset: 8909696 [events: 0000000b]
autorun ...
considering sdc1 ...
adding sdc1 ...
adding sdb1 ...
adding sda1 ...
created md0
bind<sda1,1>
bind<sdb1,2>
bind<sdc1,3>
running: <sdc1><sdb1><sda1>
now!
sdc1's event counter: 0000000b
sdb1's event counter: 0000000d
sda1's event counter: 0000000d
md: superblock update time inconsistency -- using the most recent one
freshest: sdb1
md: kicking non-fresh sdc1 from array!
unbind<sdc1,2>
export_rdev(sdc1)
md0: former device sdc1 is unavailable, removing from array!
md0: max total readahead window set to 384k
md0: 3 data-disks, max readahead per data-disk: 128k
raid5: device sdb1 operational as raid disk 1
raid5: device sda1 operational as raid disk 0
raid5: not enough operational devices for md0 (2/4 failed)
RAID5 conf printout:
--- rd:4 wd:2 fd:2
disk 0, s:0, o:1, n:0 rd:0 us:1 dev:sda1
disk 1, s:0, o:1, n:1 rd:1 us:1 dev:sdb1
disk 2, s:0, o:0, n:2 rd:2 us:1 dev:[dev 00:00]
disk 3, s:0, o:0, n:3 rd:3 us:1 dev:[dev 00:00]
disk 4, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
disk 5, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
disk 6, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
disk 7, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
disk 8, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
disk 9, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
disk 10, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
disk 11, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
raid5: failed to run raid set md0
pers->run() failed ...
do_md_run() returned -22
unbind<sdb1,1>
export_rdev(sdb1)
unbind<sda1,0>
export_rdev(sda1)
md0 stopped.
... autorun DONE.
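
The decision visible in the log above can be restated as a small sketch. It
is a simplified model of the behaviour shown in this log, not the md driver's
code: any device whose event counter is lower than the highest one is treated
as stale, and a RAID5 set is assumed to need at least all but one of its
members. The function name select_fresh is hypothetical.

```python
def select_fresh(events, total_disks):
    """Split devices into fresh and stale by event counter (simplified model).

    Returns (fresh, stale, can_run), where can_run tells whether enough
    fresh members remain for a RAID5 set of total_disks members.
    """
    freshest = max(events.values())
    fresh = sorted(dev for dev, ev in events.items() if ev == freshest)
    stale = sorted(dev for dev, ev in events.items() if ev != freshest)
    # RAID5 can run with at most one member missing.
    can_run = len(fresh) >= total_disks - 1
    return fresh, stale, can_run


# Event counters as reported in the log; sdd1 was never read, so it is absent.
events = {"sda1": 0x0D, "sdb1": 0x0D, "sdc1": 0x0B}

fresh, stale, can_run = select_fresh(events, total_disks=4)
print("fresh:", fresh)
print("stale:", stale)
print("can run:", can_run)
```

With only sda1 and sdb1 counted as fresh, two of the four members are
missing, which matches the "2/4 failed" line in the log.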

I've pretty much run out of ideas by myself and I'd appreciate any help!

Regards,
Gary "SuperID" Huntress
=======================================================
FreeSQL.org offering free database hosting to developers
Visit http://www.freesql.org

Andrew Klaassen

Nov 8, 2001, 9:19:14 PM

On Thu, Nov 08, 2001 at 09:02:03PM -0500,
Gary Huntress wrote:

> Yes, clean shutdowns, yes healthy drives, yes I get
> exactly what you would expect when I "fdisk -l /dev/sdd1" (and
> all 4 drives have the type "raid autodetect")
>
> I will try reordering the SCSI IDs as you suggest...but I
> don't think that will work because I already tried reordering
> them in /etc/raidtab. Even if that does work I would think
> that is not really a proper solution anyway.

I understand that you're not actually trying to recover data,
but if all else fails you could always mark the disk with the
0000000b event counter as a failed disk and run mkraid -f...

...this would, though, be even less a "proper solution"...

> Very strange.

Indeed. I've no idea why the RAID code isn't even attempting to
access sdd1... ...unless there's something stored in the
superblock that says "only these partitions/drives are part of
this particular RAID device", and that info got "updated" down
to the first three devices when you pulled out sdc. (That's why
I suggested switching SCSI IDs around - just on the off
chance...)

Andrew Klaassen

Gary Huntress

Nov 8, 2001, 9:02:03 PM

Hi Andrew,

Yes, clean shutdowns, yes healthy drives, yes I get exactly what you
would expect when I run "fdisk -l /dev/sdd1" (and all 4 drives have the type
"raid autodetect").

I will try reordering the SCSI IDs as you suggest, but I don't think
that will work because I already tried reordering them in /etc/raidtab.
Even if that does work I would think that is not really a proper solution
anyway.

Very strange. I appreciate the help :)

Regards,
Gary "SuperID" Huntress
=======================================================
FreeSQL.org offering free database hosting to developers
Visit http://www.freesql.org

----- Original Message -----
From: "Andrew Klaassen" <a...@dkp.com>
To: <linux...@vger.kernel.org>
Sent: Thursday, November 08, 2001 8:52 PM
Subject: Re: Can't Recover from RAID5 Failure

> On Thu, Nov 08, 2001 at 08:09:13PM -0500,
> Gary Huntress wrote:
>
> > Andrews (and everyone),
>
> I'm a different Andrew, but I've been doing a little RAID voodoo
> recently, so...

>
> > Your statement about my sdd becoming sdc is exactly what I
> > think is the problem, I just don't know how to correct it. I
> > would think that my situation would be typical (rebooting and
> > having the drive /dev/sd? assignments change due to lost
> > devices)
>

> > here is what I get when I try to start the array after putting
> > the fourth drive back in:
> > (read) sda1's sb offset: 8909696 [events: 0000000d]
> > (read) sdb1's sb offset: 8909696 [events: 0000000d]
> > (read) sdc1's sb offset: 8909696 [events: 0000000b]
> > autorun ...
>

> If you shut down the system cleanly each time - and you say you
> did - then I'd say that the sdc1 you've put back in - the one
> with 0000000b in its event counter - is the same old sdc that
> you started with, based on the event counts.
>
> This is probably a stupid question, but where's sdd1? Can you
> see it with fdisk -l/less -f/whatever?
>
> What happens if you re-order SCSI IDs so that what was sdd shows
> up as sdc, and the current sdc becomes sdd?

Andrew Klaassen

Nov 8, 2001, 8:52:27 PM

Gary Huntress

Nov 8, 2001, 9:31:45 PM

OK, I did reorder the IDs as suggested, and the 3-drive degraded array came
right up, and I was able to raidhotadd the drive that I had originally
unplugged (sdc1, which is now sdd1).

Thanks for the help and suggestions....

Hopefully I'll be able to understand this well enough in the future to be
able to recover in the event of an actual failure :)
(this still feels kludgy to me)

Regards,
Gary "SuperID" Huntress
=======================================================
FreeSQL.org offering free database hosting to developers
Visit http://www.freesql.org
----- Original Message -----
From: "Andrew Klaassen" <a...@dkp.com>
To: <linux...@vger.kernel.org>
Sent: Thursday, November 08, 2001 9:19 PM
Subject: Re: Can't Recover from RAID5 Failure

> On Thu, Nov 08, 2001 at 09:02:03PM -0500,
> Gary Huntress wrote:
>

> > Yes, clean shutdowns, yes healthy drives, yes I get
> > exactly what you would expect when I "fdisk -l /dev/sdd1" (and
> > all 4 drives have the type "raid autodetect")
> >
> > I will try reordering the SCSI IDs as you suggest...but I
> > don't think that will work because I already tried reordering
> > them in /etc/raidtab. Even if that does work I would think
> > that is not really a proper solution anyway.
>

> I understand that you're not actually trying to recover data,
> but if all else fails you could always mark the disk with the
> 0000000b event counter as a failed disk and run mkraid -f...
>
> ...this would, though, be even less a "proper solution"...
>
> > Very strange.
>
> Indeed. I've no idea why the RAID code isn't even attempting to
> access sdd1... ...unless there's something stored in the
> superblock that says "only these partitions/drives are part of
> this particular RAID device", and that info got "updated" down
> to the first three devices when you pulled out sdc. (That's why
> I suggested switching SCSI IDs around - just on the off
> chance...)
>

Neil Brown

Nov 9, 2001, 1:29:13 AM

On Thursday November 8, ghun...@mediaone.net wrote:
> Ok, I did reorder the IDs as suggested, and the 3 drive degraded array came
> right up and I was able to raidhotadd the drive that I had originally
> unplugged (sdc1 which is now sdd1)
>
> Thanks for the help and suggestions....
>
> Hopefully I'll be able to understand this well enough in the future to be
> able to recover in the event of an actual failure :)
> (this still feels kludgy to me)

May I recommend "mdctl" to you?
http://www.cse.unsw.edu.au/~neilb/source/mdctl/

It is a replacement for raidtools that works well for me.
To assemble an array from component discs (which may have changed
names and device numbers) you use:

mdctl --assemble /dev/md0 /dev/sda1 /dev/sda2 /dev/sda3

or whatever.
"raidstart" in raidtools uses a flawed model for starting raid arrays
that fails badly if device numbers change, or even if the first device
listed doesn't exist.

Without mdctl, the only reliable way to start a RAID array is to mark
all the component partitions as "Linux RAID" and have them
autodetected.
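
The idea of assembling from whatever the component discs are currently called
can be sketched as follows. This is an illustration only, not mdctl's
implementation: the Superblock fields, the assemble function, the UUID string
and the event count assumed for sdd1 are all hypothetical, chosen to mirror
the situation earlier in this thread.

```python
from dataclasses import dataclass


@dataclass
class Superblock:
    array_uuid: str  # identifies which array the device belongs to
    slot: int        # position of the device within the array
    events: int      # event counter recorded on the device


def assemble(scanned, array_uuid, total_disks):
    """Place each fresh member by the slot in its superblock, not by its name."""
    members = {n: sb for n, sb in scanned.items() if sb.array_uuid == array_uuid}
    layout = [None] * total_disks
    if not members:
        return layout
    freshest = max(sb.events for sb in members.values())
    for name, sb in members.items():
        if sb.events == freshest:
            layout[sb.slot] = name
    return layout


# Hypothetical scan after the unplugged drive is reconnected: the stale drive
# is back as sdc1, and the drive that ran degraded is assumed to sit at sdd1.
scanned = {
    "sda1": Superblock("example-uuid", 0, 0x0D),
    "sdb1": Superblock("example-uuid", 1, 0x0D),
    "sdc1": Superblock("example-uuid", 2, 0x0B),
    "sdd1": Superblock("example-uuid", 3, 0x0D),
}

layout = assemble(scanned, "example-uuid", total_disks=4)
print(layout)
```

Under these assumptions three of the four slots are filled with fresh
members, so the array could start degraded and the stale drive could then be
added back.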

NeilBrown