
cannot change zpool from "suspended" though all devs are OK


andyw...@gmail.com

Aug 23, 2014, 9:06:42 AM
Hello All,
Running Solaris 11.1 on an x86 box, I seem to have overloaded one of the power supplies (by playing with USB hot-swap disks), and this resulted in some of the permanent disks going offline. However, *all* disks in the zpool were marked UNAVAIL, even those powered from another PSU. This happened at a very quiet time, with no IO expected on that zpool, so I still believe the data should be recoverable.
I have since upgraded all PSUs, effectively doubling the power available to the motherboard, the rpool, and all other pools. All disks are visible and editable in "format"; however, I cannot find a way to make ZFS recognize them as repaired.

root@fuji:~# fmadm repaired zfs://pool=77a0d6db6629a235/vdev=6d0ef3a462561d1d/pool_name=junk/vdev_name=id1,sd@SATA_____Hitachi_HDS72202______JK1101YAJ90USS/a
fmadm: recorded repair to of zfs://pool=77a0d6db6629a235/vdev=6d0ef3a462561d1d/pool_name=junk/vdev_name=id1,sd@SATA_____Hitachi_HDS72202______JK1101YAJ90USS/a

I did that for all the disks. Also tried "zpool clear" - no luck.
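For reference, the clear attempts were along these lines (the per-device form is standard zpool syntax; whether it was run per-disk here is an illustration, not a transcript):

# zpool clear junk           <-- pool-wide: clear error counts and resume IO if possible
# zpool clear junk c9t2d0    <-- per-device variant, repeated for each UNAVAIL disk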

root@fuji:~# zpool status -lvx junk
  pool: junk
 state: SUSPENDED
status: One or more devices are unavailable in response to IO failures.
        The pool is suspended.
action: Make sure the affected devices are connected, then run 'zpool clear' or
        'fmadm repaired'.
   see: http://support.oracle.com/msg/ZFS-8000-HC
  scan: resilvered 61K in 0h0m with 1 errors on Sat Aug 23 13:54:26 2014
config:

        NAME         STATE     READ WRITE CKSUM
        junk         UNAVAIL       0     0     0
          mirror-0   UNAVAIL       0     0     0
            c9t2d0   UNAVAIL       0     0     0
            c8t0d0   UNAVAIL       0     0     0
          mirror-1   UNAVAIL       0     0     0
            c10t0d0  UNAVAIL       0     0     0
            c9t4d0   UNAVAIL       0     0     0
          mirror-2   UNAVAIL       0     0     0
            c9t1d0   UNAVAIL       0     0     0
            c8t1d0   UNAVAIL       0     0     0

device details:

        c9t2d0   UNAVAIL       experienced I/O failures
        status: ZFS detected errors on this device.
                The pool experienced I/O failures.

        c8t0d0   UNAVAIL       experienced I/O failures
        status: ZFS detected errors on this device.
                The pool experienced I/O failures.

        c10t0d0  UNAVAIL       experienced I/O failures
        status: FMA has faulted this device.
        action: Run 'fmadm faulty' for more information. Clear the errors
                using 'fmadm repaired'.
           see: http://support.oracle.com/msg/ZFS-8000-FD for recovery

        c9t4d0   UNAVAIL       experienced I/O failures
        status: FMA has faulted this device.
        action: Run 'fmadm faulty' for more information. Clear the errors
                using 'fmadm repaired'.

        c9t1d0   UNAVAIL       experienced I/O failures
        status: ZFS detected errors on this device.
                The pool experienced I/O failures.

        c8t1d0   UNAVAIL       experienced I/O failures
        status: ZFS detected errors on this device.
                The pool experienced I/O failures.
           see: http://support.oracle.com/msg/ZFS-8000-QJ for recovery

root@fuji:~#

andyw...@gmail.com

Aug 23, 2014, 9:35:24 AM
Oops, another hard reboot fixed it!

cindy.sw...@gmail.com

Aug 25, 2014, 5:28:02 PM
On Saturday, August 23, 2014 7:35:24 AM UTC-6, andyw...@gmail.com wrote:
> Oops, another hard reboot fixed it!

Success or failure of USB hot swapping live pool devices usually depends on whether the hardware generates or fabricates devIDs. On Oracle hardware, ZFS can follow the devIDs if the controller changes, for example. Some hardware doesn't generate or fabricate devIDs, so with a hardware change ZFS can't follow the devIDs, and this makes live ZFS storage pools sad because it can't find the pool devices.

A reboot helped in this case, which is good. I would recommend exporting pools before making significant hardware changes so the pool device info can be re-read on import. Another check to ensure that ZFS is following device changes before or after a hardware change (even if the pool is exported) is with zdb.

# zdb -l /dev/dsk/c2t1d0s0 <-- note the s0 syntax for zdb even if the pool doesn't use slices
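A minimal sketch of that export/import cycle, assuming the pool name junk from this thread (zpool export and zpool import are the standard commands; the -d form only matters if the devices end up outside the default search path):

# zpool export junk              <-- close the pool cleanly; labels are re-read on import
(make the hardware changes here)
# zpool import junk              <-- re-scans devices, so changed paths/devIDs are picked up
# zpool import -d /dev/dsk junk  <-- if the default device search path doesn't find them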

Thanks, Cindy

andyw...@gmail.com

Sep 1, 2014, 4:13:16 PM
Hello Cindy,
Thanks for the explanation.
I did not intend to use those USB sticks in any pool; the pool in question is 4x2 2TB SATA3 fixed disks. Only 2 of them went offline as a result of the PSU overload; however, all 8 disks were marked UNAVAIL.
"touch /reconfigure && init 6" alone did not fix the problem, only "fmadm repaired && reboot" did.
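For the record, the sequence that finally worked was roughly this (the full zfs:// vdev FMRIs are abbreviated here; they come from 'fmadm faulty', as in my first post):

# fmadm faulty                         <-- list the faulted zfs://pool=.../vdev=... FMRIs
# fmadm repaired 'zfs://pool=.../vdev=...'   <-- once per faulted vdev
# reboot                               <-- the pool came back only after this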
Regards
Andrei