Raid not shutting down when disks are lost?

Pierre Ossman

Oct 8, 2009, 10:41:07 AM

to ne...@suse.de, LKML

Today one RAID6 array I manage decided to lose four out of eight disks.
Oddly enough, the array did not shut down but instead I got
intermittent read and write errors from the filesystem.

It's been some time since I had a failure of this magnitude, but I seem
to recall that once the array lost too many disks, it would shut down
and refuse to write a single byte. The nice effect of this was that if
it was a temporary error, you could just reboot and the array would
start nicely (albeit in degraded mode).

Has something changed? Is this perhaps an effect of using RAID6 (I used
to run RAID5 arrays)? Or was I simply lucky in the previous instances?

Related, it would be nice if you could control how it handles lost
disks. E.g. I'd like it to go read-only when it goes into fully
degraded mode. In case the last disk lost was only a temporary glitch,
the array could be made to recover without a lengthy resync.

Rgds
--
-- Pierre Ossman

WARNING: This correspondence is being monitored by the
Swedish government. Make sure your server uses encryption
for SMTP traffic and consider using PGP for end-to-end
encryption.

Pierre Ossman

Nov 21, 2009, 11:09:14 AM

to ne...@suse.de, LKML

Neil?

On Thu, 8 Oct 2009 16:39:52 +0200
Pierre Ossman <pierr...@ossman.eu> wrote:

> Today one RAID6 array I manage decided to lose four out of eight disks.
> Oddly enough, the array did not shut down but instead I got
> intermittent read and write errors from the filesystem.
>
> It's been some time since I had a failure of this magnitude, but I seem
> to recall that once the array lost too many disks, it would shut down
> and refuse to write a single byte. The nice effect of this was that if
> it was a temporary error, you could just reboot and the array would
> start nicely (albeit in degraded mode).
>
> Has something changed? Is this perhaps an effect of using RAID6 (I used
> to run RAID5 arrays)? Or was I simply lucky the previous instances I've
> had?
>
> Related, it would be nice if you could control how it handles lost
> disks. E.g. I'd like it to go read-only when it goes in to fully
> degraded mode. In case the last disk lost was only a temporary glitch,
> the array could be made to recover without a lengthy resync.
>
> Rgds

--
-- Pierre Ossman

WARNING: This correspondence is being monitored by FRA, a
Swedish intelligence agency. Make sure your server uses

Dan Williams

Nov 21, 2009, 2:22:10 PM

to Pierre Ossman, ne...@suse.de, LKML

On Sat, Nov 21, 2009 at 9:03 AM, Pierre Ossman <pierr...@ossman.eu> wrote:
> Neil?
>
> On Thu, 8 Oct 2009 16:39:52 +0200
> Pierre Ossman <pierr...@ossman.eu> wrote:
>
>> Today one RAID6 array I manage decided to lose four out of eight disks.
>> Oddly enough, the array did not shut down but instead I got
>> intermittent read and write errors from the filesystem.

This is expected.

The array can't shut down when there is a mounted filesystem. Reads
may still be serviced from the survivors, but all writes should be
aborted with an error.

>>
>> It's been some time since I had a failure of this magnitude, but I seem
>> to recall that once the array lost too many disks, it would shut down
>> and refuse to write a single byte. The nice effect of this was that if
>> it was a temporary error, you could just reboot and the array would
>> start nicely (albeit in degraded mode).
>>
>> Has something changed? Is this perhaps an effect of using RAID6 (I used
>> to run RAID5 arrays)? Or was I simply lucky the previous instances I've
>> had?

It should not come back up nicely in this scenario. You need
"--force" to attempt to reassemble a failed array.
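
For reference, a forced reassembly is an ordinary "mdadm --assemble" with
"--force" added. The Python sketch below only builds and prints such a command
line; it does not run it. The array name and member partitions are
hypothetical placeholders, and the helper forced_assemble_command() is
invented for illustration, not part of mdadm.

```python
import shlex


def forced_assemble_command(array, members):
    """Build (but do not run) an mdadm command that force-assembles an array.

    `array` and `members` are device paths supplied by the caller; the ones
    used below are made-up examples.
    """
    if not members:
        raise ValueError("at least one member device is required")
    return ["mdadm", "--assemble", "--force", array, *members]


# Hypothetical device names, for illustration only.
cmd = forced_assemble_command(
    "/dev/md0", ["/dev/sdb1", "/dev/sdc1", "/dev/sdd1", "/dev/sde1"]
)
print(shlex.join(cmd))
```

Printing the command rather than executing it leaves room to check the member
list by hand, which seems prudent for an operation that overrides what the
superblocks record.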

>>
>> Related, it would be nice if you could control how it handles lost
>> disks. E.g. I'd like it to go read-only when it goes in to fully
>> degraded mode. In case the last disk lost was only a temporary glitch,
>> the array could be made to recover without a lengthy resync.
>>

When you say "fully-degraded" do you mean "failed"? In general the
bitmap mechanism provides fast resync after temporary disk outages.

--
Dan

Pierre Ossman

Nov 21, 2009, 2:37:03 PM

to Dan Williams, ne...@suse.de, LKML

On Sat, 21 Nov 2009 12:21:58 -0700
Dan Williams <dan.j.w...@intel.com> wrote:

> On Sat, Nov 21, 2009 at 9:03 AM, Pierre Ossman <pierr...@ossman.eu> wrote:
> > Neil?
> >
> > On Thu, 8 Oct 2009 16:39:52 +0200
> > Pierre Ossman <pierr...@ossman.eu> wrote:
> >
> >> Today one RAID6 array I manage decided to lose four out of eight disks.
> >> Oddly enough, the array did not shut down but instead I got
> >> intermittent read and write errors from the filesystem.
>
> This is expected.
>
> The array can't shut down when there is a mounted filesystem. Reads
> may still be serviced from the survivors, but all writes should be
> aborted with an error.
>

It could "shut down" in the sense that it refuses to touch the
underlying hardware and just reports errors to the upper layers, i.e. it
would not update the md superblocks to mark more disks as failed.

> >>
> >> It's been some time since I had a failure of this magnitude, but I seem
> >> to recall that once the array lost too many disks, it would shut down
> >> and refuse to write a single byte. The nice effect of this was that if
> >> it was a temporary error, you could just reboot and the array would
> >> start nicely (albeit in degraded mode).
> >>
> >> Has something changed? Is this perhaps an effect of using RAID6 (I used
> >> to run RAID5 arrays)? Or was I simply lucky the previous instances I've
> >> had?
>
> It should not come back up nicely in this scenario. You need
> "--force" to attempt to reassemble a failed array.
>

If the last disk is thrown out either because of a read error, or
because of the first write of a stripe (i.e. what's on the platters is
still in sync) then a force would not be needed. This requires the md
code to not mark that last disk as failed in the superblocks of the
remaining disks though.

> >>
> >> Related, it would be nice if you could control how it handles lost
> >> disks. E.g. I'd like it to go read-only when it goes in to fully
> >> degraded mode. In case the last disk lost was only a temporary glitch,
> >> the array could be made to recover without a lengthy resync.
> >>
>
> When you say "fully-degraded" do you mean "failed"? In general the
> bitmap mechanism provides fast resync after temporary disk outages.
>

Fully degraded means still working but without any redundancy. I.e. one
lost disk with RAID 5 or two with RAID 6. And the bitmap mechanism
seems to be broken in that case since I always experience full,
multi-hour resyncs whenever a disk is lost.
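
To pin down the terminology used in this thread, here is a small Python sketch
that classifies an array by how many members it has lost. The tolerance
figures (one disk for RAID 5, two for RAID 6) are the ones given above; the
state names and the function array_state() are illustrative labels for this
discussion, not values reported by md or mdadm.

```python
# Number of member disks each level can lose and still serve data.
TOLERATED_FAILURES = {"raid5": 1, "raid6": 2}


def array_state(level, failed):
    """Classify an array by the number of failed member disks."""
    if failed < 0:
        raise ValueError("failed must be non-negative")
    tolerated = TOLERATED_FAILURES[level]
    if failed == 0:
        return "clean"
    if failed < tolerated:
        return "degraded"        # still has some redundancy left
    if failed == tolerated:
        return "fully degraded"  # working, but no redundancy left
    return "failed"              # more disks lost than the level tolerates


for level, failed in [("raid5", 1), ("raid6", 1), ("raid6", 2), ("raid6", 4)]:
    print(level, failed, array_state(level, failed))
```

Under this definition, the RAID6 array from the first message, with four of
eight disks gone, is "failed" rather than "fully degraded".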

Or is there perhaps some magic mdadm command to add a lost disk and get
it up to speed without a complete sync?

Dan Williams

Nov 21, 2009, 3:59:13 PM

to Pierre Ossman, ne...@suse.de, LKML

On Sat, Nov 21, 2009 at 12:36 PM, Pierre Ossman <pierr...@ossman.eu> wrote:
> Or is there perhaps some magic mdadm command to add a lost disk and get
> it up to speed without a complete sync?

If you care about resync speed then look into adding a write intent
bitmap to your configuration; otherwise, look into the --assume-clean
option to recreate the array.
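
As a rough illustration of why a write-intent bitmap shortens resync, the
Python sketch below models the idea only: the array is divided into
fixed-size chunks, writes mark the chunks they touch, and a returning disk
needs just the marked chunks copied. The class, its method names and the
sizes are all invented for this example; the real md bitmap also clears bits
once every member has the data, which is not modelled here. On an existing
array, an internal bitmap is normally added with
"mdadm --grow --bitmap=internal <array>".

```python
class WriteIntentBitmap:
    """Toy model: one dirty flag per fixed-size chunk of the array."""

    def __init__(self, array_size, chunk_size):
        self.chunk_size = chunk_size
        self.nchunks = -(-array_size // chunk_size)  # ceiling division
        self.dirty = set()

    def record_write(self, offset, length):
        """Mark every chunk touched by a write as dirty."""
        if length <= 0:
            raise ValueError("length must be positive")
        first = offset // self.chunk_size
        last = (offset + length - 1) // self.chunk_size
        self.dirty.update(range(first, last + 1))

    def chunks_to_resync(self):
        """Chunks a returning member must have rewritten, in order."""
        return sorted(self.dirty)


# Writes that happen while one member is temporarily missing.
bitmap = WriteIntentBitmap(array_size=1024, chunk_size=64)
bitmap.record_write(offset=100, length=50)
bitmap.record_write(offset=900, length=10)

todo = bitmap.chunks_to_resync()
print(f"resync {len(todo)} of {bitmap.nchunks} chunks: {todo}")
```

Without such a record, nothing says which regions changed during the outage,
so the whole device has to be resynced.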

Quoting from the man page:
--assume-clean
Tell mdadm that the array pre-existed and is known to be clean.
It can be useful when trying to recover from a major failure as you
can be sure that no data will be affected unless you actually write to
the array. It can also be used when creating a RAID1 or RAID10 if you
want to avoid the initial resync, however this practice, while
normally safe, is not recommended. Use this only if you really know
what you are doing.