Hi,
I was originally trying to get ZFS to use entries from the /dev/disk/by-id
tree rather than the /dev/sdX entries, which were changing on me regularly.
I did this because I was running into a problem where, on almost every
reboot, I had to delete the zpool.cache and reimport the pools to get them
to come up. I've since replaced the eSATA controllers and that problem
seems to have rectified itself. First I offlined the disk I was trying to
switch, and then I tried to zpool replace the old name with the new name.
I did this with 2 disks of my raidz2 volume before I decided that something
was wrong and I should probably stop.
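For reference, the switch attempt for each disk was roughly along these
lines (the device names below are placeholders, not the exact disks or
by-id entries I used):

    # take the disk offline under its old /dev/sdX name
    zpool offline eightbay2 sdX
    # replace the old name with the /dev/disk/by-id entry for the same physical disk
    zpool replace eightbay2 sdX /dev/disk/by-id/<id-of-the-same-disk>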
Here is what my current zpool status looks like:
  pool: eightbay2
 state: DEGRADED
status: One or more devices is currently being resilvered. The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Fri Mar 11 23:08:36 2011
        45.0G scanned out of 9.64T at 77.4M/s, 36h6m to go
        4.01G resilvered, 0.46% done
config:

        NAME                        STATE     READ WRITE CKSUM
        eightbay2                   DEGRADED     0     0     0
          raidz2-0                  DEGRADED     0     0     0
            sdd                     ONLINE       0     0     0
            sdf                     ONLINE       0     0     0
            sdg                     ONLINE       0     0     0  (resilvering)
            sdh                     ONLINE       0     0     0
            sdc                     ONLINE       0     0     0
            sdb                     ONLINE       0     0     0
            replacing-6             DEGRADED     0     0  200K
              sde                   ONLINE       0     0     0  (resilvering)
              17131933909712789936  UNAVAIL      0     0     0  was /dev/sdh1
              14275510670410547429  UNAVAIL      0     0     0  was /dev/sda1
            replacing-7             UNAVAIL      0     0     0  insufficient replicas
              475396521707658792    UNAVAIL      0     0     0  was /dev/sdh
              15235888937654799190  UNAVAIL      0     0     0  was /dev/sdh1

errors: No known data errors
Here is the zdb tree:
eightbay2:
    version: 28
    name: 'eightbay2'
    state: 0
    txg: 8177712
    pool_guid: 17389717143361898507
    hostid: 8323329
    hostname: 'eonfbsd'
    vdev_children: 1
    vdev_tree:
        type: 'root'
        id: 0
        guid: 17389717143361898507
        children[0]:
            type: 'raidz'
            id: 0
            guid: 8240857012312489910
            nparity: 2
            metaslab_array: 23
            metaslab_shift: 36
            ashift: 9
            asize: 12002376286208
            is_log: 0
            children[0]:
                type: 'disk'
                id: 0
                guid: 7046159470888568936
                path: '/dev/sdd'
                whole_disk: 1
                DTL: 378
            children[1]:
                type: 'disk'
                id: 1
                guid: 8000225162792066463
                path: '/dev/sdf'
                whole_disk: 1
                DTL: 377
            children[2]:
                type: 'disk'
                id: 2
                guid: 5229947180552272116
                path: '/dev/sdg'
                whole_disk: 1
                DTL: 372
            children[3]:
                type: 'disk'
                id: 3
                guid: 16384725385907175025
                path: '/dev/sdh'
                whole_disk: 1
                DTL: 376
            children[4]:
                type: 'disk'
                id: 4
                guid: 504467111533001616
                path: '/dev/sdc'
                whole_disk: 1
                DTL: 329
            children[5]:
                type: 'disk'
                id: 5
                guid: 11937532446586896064
                path: '/dev/sdb'
                whole_disk: 1
                DTL: 375
            children[6]:
                type: 'replacing'
                id: 6
                guid: 14334310672528416367
                whole_disk: 0
                children[0]:
                    type: 'disk'
                    id: 0
                    guid: 8750903116896420211
                    path: '/dev/sde'
                    whole_disk: 1
                    DTL: 374
                children[1]:
                    type: 'disk'
                    id: 1
                    guid: 17131933909712789936
                    path: '/dev/sdh1'
                    whole_disk: 0
                    not_present: 1
                    DTL: 435
                    resilvering: 1
                children[2]:
                    type: 'disk'
                    id: 2
                    guid: 14275510670410547429
                    path: '/dev/sda1'
                    whole_disk: 0
                    not_present: 1
                    DTL: 438
                    resilvering: 1
            children[7]:
                type: 'replacing'
                id: 7
                guid: 8589017169731279389
                whole_disk: 0
                children[0]:
                    type: 'disk'
                    id: 0
                    guid: 475396521707658792
                    path: '/dev/sdh'
                    whole_disk: 1
                    not_present: 1
                    DTL: 373
                children[1]:
                    type: 'disk'
                    id: 1
                    guid: 15235888937654799190
                    path: '/dev/sdh1'
                    whole_disk: 0
                    not_present: 1
                    DTL: 436
                    resilvering: 1
I'm also getting this stack dump, but I don't know whether it is
connected to my problem:
[  352.713234] SPL: Showing stack for process 3109
[  352.713239] Pid: 3109, comm: txg_sync Tainted: P            2.6.35-22-server #35-Ubuntu
[  352.713241] Call Trace:
[  352.713255]  [<ffffffffa050b607>] spl_debug_dumpstack+0x27/0x40 [spl]
[  352.713263]  [<ffffffffa050f67d>] kmem_alloc_debug+0x11d/0x130 [spl]
[  352.713296]  [<ffffffffa05b3a21>] dsl_scan_setup_sync+0x1e1/0x210 [zfs]
[  352.713322]  [<ffffffffa05b604c>] dsl_scan_sync+0x1dc/0x3a0 [zfs]
[  352.713351]  [<ffffffffa060b50c>] ? zio_destroy+0xac/0xf0 [zfs]
[  352.713378]  [<ffffffffa05c179a>] spa_sync+0x3fa/0x9a0 [zfs]
[  352.713384]  [<ffffffff8107f096>] ? autoremove_wake_function+0x16/0x40
[  352.713388]  [<ffffffff8104d203>] ? __wake_up+0x53/0x70
[  352.713416]  [<ffffffffa05d2bf5>] txg_sync_thread+0x215/0x3a0 [zfs]
[  352.713444]  [<ffffffffa05d29e0>] ? txg_sync_thread+0x0/0x3a0 [zfs]
[  352.713452]  [<ffffffffa05100f8>] thread_generic_wrapper+0x78/0x90 [spl]
[  352.713459]  [<ffffffffa0510080>] ? thread_generic_wrapper+0x0/0x90 [spl]
[  352.713462]  [<ffffffff8107eb26>] kthread+0x96/0xa0
[  352.713466]  [<ffffffff8100aee4>] kernel_thread_helper+0x4/0x10
[  352.713469]  [<ffffffff8107ea90>] ? kthread+0x0/0xa0
[  352.713472]  [<ffffffff8100aee0>] ? kernel_thread_helper+0x0/0x10
If you still need my zpool history I can send that to you directly (it
is large). On a side note, when I try to run zpool history on that
array, I get the following stack dump:
[ 1166.957630] SPL: Showing stack for process 4404
[ 1166.957633] Pid: 4404, comm: zpool Tainted: P            2.6.35-22-server #35-Ubuntu
[ 1166.957635] Call Trace:
[ 1166.957643]  [<ffffffffa050b607>] spl_debug_dumpstack+0x27/0x40 [spl]
[ 1166.957650]  [<ffffffffa050f67d>] kmem_alloc_debug+0x11d/0x130 [spl]
[ 1166.957678]  [<ffffffffa05f395f>] zfs_ioc_pool_get_history+0x9f/0x110 [zfs]
[ 1166.957683]  [<ffffffffa026dd4e>] ? pool_namecheck+0x5e/0x180 [zcommon]
[ 1166.957711]  [<ffffffffa05f425f>] zfsdev_ioctl+0xef/0x1c0 [zfs]
[ 1166.957715]  [<ffffffff81162e1d>] vfs_ioctl+0x3d/0xd0
[ 1166.957718]  [<ffffffff811635b1>] do_vfs_ioctl+0x81/0x3d0
[ 1166.957721]  [<ffffffff815a2569>] ? do_page_fault+0x159/0x350
[ 1166.957724]  [<ffffffff81163981>] sys_ioctl+0x81/0xa0
[ 1166.957728]  [<ffffffff8100a0f2>] system_call_fastpath+0x16/0x1b
Thanks,
-Rob