ZFS Pool Always Faulted After Reboot

Gordan Bobic

Mar 21, 2011, 9:32:29 PM
to KQStor ZFS Discussion
I seem to be experiencing a strange but consistent behaviour of ZFS on
my system. I can create a zpool and zfs without any problems:

# zpool create zfs raidz2 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1
# zfs set dedup=on zfs
# zfs create zfs/test

Then I shut down, remove one of the disks, and reboot. ZFS fails to
reassemble the redundant stripe.

# zpool list
NAME   SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH   ALTROOT
zfs       -      -      -      -      -  FAULTED  -

# zpool status zfs
  pool: zfs
 state: UNAVAIL
status: One or more devices could not be used because the label is missing
        or invalid. There are insufficient replicas for the pool to continue
        functioning.
action: Destroy and re-create the pool from a backup source.
   see: http://www.sun.com/msg/ZFS-8000-5E
  scan: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        zfs           UNAVAIL      0     0     0  insufficient replicas
          raidz2-0    UNAVAIL      0     0     0  insufficient replicas
            sda1      FAULTED      0     0     0  corrupted data
            sdb1      FAULTED      0     0     0  corrupted data
            sdc1      FAULTED      0     0     0  corrupted data
            sdd1      FAULTED      0     0     0  corrupted data
            sde1      UNAVAIL      0     0     0

None of the remaining disks were moved, but for some reason the zpool
doesn't get reassembled. Am I missing a step somewhere? My suspicion
is that, because the first disk was removed, the remaining disks'
device names all shifted by one. Is this normal? Shouldn't ZFS
reassemble the pool correctly even if the device names have changed?
Is there a recommended way around this?

TIA.

Gordan

Gordan Bobic

Mar 21, 2011, 9:35:58 PM
to KQStor ZFS Discussion
Just wanted to add - re-adding the disk doesn't fix the problem:


# zpool status
  pool: zfs
 state: UNAVAIL
status: One or more devices could not be used because the label is missing
        or invalid. There are insufficient replicas for the pool to continue
        functioning.
action: Destroy and re-create the pool from a backup source.
   see: http://www.sun.com/msg/ZFS-8000-5E
  scan: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        zfs           UNAVAIL      0     0     0  insufficient replicas
          raidz2-0    UNAVAIL      0     0     0  insufficient replicas
            sda1      FAULTED      0     0     0  corrupted data
            sdb1      FAULTED      0     0     0  corrupted data
            sdc1      FAULTED      0     0     0  corrupted data
            sdd1      FAULTED      0     0     0  corrupted data
            sde1      FAULTED      0     0     0  corrupted data

Florian Franzen

Mar 21, 2011, 9:47:59 PM
to KQStor ZFS Discussion
Hello Gordan,

have you tried exporting and re-importing your zpool? That worked for
me in such a case.
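
Roughly, with the pool name from your earlier commands (adjust to
yours); as far as I understand, the import re-scans the device labels,
so the shifted names shouldn't matter:

# zpool export zfs
# zpool import zfs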

> Should ZFS not reassemble
> the pool correctly even if the device names have changed?

You are right. I also think ZFS should not depend on the udev drive
mapping. This is especially annoying if you plug in a USB device
before booting and are then surprised by a message that your whole
pool seems to be corrupted beyond repair.

Greetings
Florian

Stone

Mar 22, 2011, 3:03:43 AM
to kqstor-zf...@googlegroups.com
Hi,

Use /dev/disk/by-id/... I think it should be consistent.
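
For a pool that already exists, I think you can switch it over to the
by-id names by exporting it and re-importing with -d pointed at that
directory (the pool name "zfs" here just matches the earlier example):

# zpool export zfs
# zpool import -d /dev/disk/by-id zfs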

Stone

Gordan Bobic

Mar 23, 2011, 2:25:26 PM
to KQStor ZFS Discussion
> Use /dev/disk/by-id/... I think it should be consistent.

I just tried that. I created a raidz2 zpool from 5 devices by WWN in
/dev/disk/by-id, shut down, removed one of the devices, and the array
came up as faulted with 4 devices present and one unavailable. Surely
the zpool should come up as long as no more than two devices are
missing in raidz2? Is the Linux implementation really so incomplete
that it doesn't include fault tolerance, or is something else going on?

Gordan

robsee

Mar 23, 2011, 5:13:54 PM
to KQStor ZFS Discussion
Gordan,

If you zpool export the faulted pool and then re-import it, it will
come up. Before you export it, what does it say for the device names
when you run zpool status?

-Rob

Gordan Bobic

Mar 25, 2011, 11:04:31 AM
to KQStor ZFS Discussion
Hmm, I must have done something wrong last time, because I rebuilt the
whole thing from scratch and now I cannot get it to recur. I'll keep
an eye on it and see if it happens again.

Gordan

Robert Greenwalt

Sep 8, 2013, 10:27:13 AM
to kqstor-zf...@googlegroups.com, gordan...@gmail.com
I have a similar issue where every other or every third boot has a device unavailable "because the label is missing or invalid." I've found that "zpool offline <pool> <drive>" followed by "zpool online <pool> <drive>" fixes the problem, and a full scrub shows no issues. It's almost as if the drive isn't quite ready when ZFS starts checking things; perhaps a delayed spin-up problem?
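
For reference, this is roughly the sequence I run; the pool and drive
names below are only placeholders borrowed from the earlier example,
not my actual setup:

# zpool offline zfs sde1
# zpool online zfs sde1
# zpool scrub zfs
# zpool status zfs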

R