Unplugging and replugging a device in a raidz = corrupt data?


Cyron

Jun 5, 2007, 12:02:23 PM
to zfs-fuse
Hi,

I want to move one of my Serial ATA drives to a different slot in my
tower, but I don't know which logical device it is under Linux.

But I have a RAIDZ1 for testing, so I thought: no problem, just remove
the drive; a RAIDZ is built to cope with that.

So I unplugged the drive's power, checked with "zpool status pool",
and was confused ...
>   pool: pool
>  state: ONLINE
>  scrub: scrub stopped with 0 errors on Sun Jun 3 17:20:31 2007
> config:
>
>         NAME        STATE     READ WRITE CKSUM
>         pool        ONLINE       0     0     0
>           raidz1    ONLINE       0     0     0
>             sdc2    ONLINE       0     0     0
>             sdd2    ONLINE       0     0     0
>
> errors: No known data errors

It was still online ... so I removed the hard disk from the tower and
put it back in. I checked the status several times; for minutes,
zfs-fuse didn't notice that the device had been removed.
So I tried to look at the I/O status:
>connect: Connection refused
>Please make sure that the zfs-fuse daemon is running.
>internal error: failed to initialize ZFS library
I don't know if I caused zfs-fuse to crash, but now the status command
didn't work either:
>Please make sure that the zfs-fuse daemon is running.
>internal error: failed to initialize ZFS library

So I restarted it with "sudo zfs-fuse" and tried to remount my pool
with "sudo zfs mount -a". Then I checked the status:
>   pool: pool
>  state: UNAVAIL
> status: One or more devices could not be opened. There are insufficient
>         replicas for the pool to continue functioning.
> action: Attach the missing device and online it using 'zpool online'.
>    see: http://www.sun.com/msg/ZFS-8000-D3
>  scrub: none requested
> config:
>
>         NAME        STATE     READ WRITE CKSUM
>         pool        UNAVAIL      0     0     0  insufficient replicas
>           raidz1    UNAVAIL      0     0     0  insufficient replicas
>             sdc2    UNAVAIL      0     0     0  cannot open
>             sdd2    UNAVAIL      0     0     0  cannot open
But the second drive was available ...

I replugged the drive again, and now both devices appeared as online
in the status overview:
>   pool: pool
>  state: UNAVAIL
>  scrub: none requested
> config:
>
>         NAME        STATE     READ WRITE CKSUM
>         pool        UNAVAIL      0     0     0  insufficient replicas
>           raidz1    UNAVAIL      0     0     0  corrupted data
>             sdc2    ONLINE       0     0     0
>             sdd2    ONLINE       0     0     0
Corrupted data? Why? The second drive should have been online and
running the whole time. So I tried to scrub the pool:
>cannot scrub 'pool': pool is currently unavailable
I tried to take offline the drive that I think is the one I had
unplugged and replugged:
>cannot open 'pool': pool is unavailable
I tried to import the pool:
>cannot import 'pool': no such pool available
I tried to export the pool, but one of my bash shells still had its
working directory inside the pool:
>umount: /data/cyron: device is busy
>umount: /data/cyron: device is busy
>cannot unmount '/data/cyron': umount failed
I changed the working directory of that shell, and then the export
worked fine, but the import didn't find a pool:
>cannot import 'pool': no such pool available
And the status command says:
>no pools available

What's going wrong here?

There's no important data on the pool, because I was only experimenting
with zfs-fuse. I wanted to switch completely to ZFS for the data
safety, but now ...

Did I do something wrong? :/

My System:
4 × 250 GiB Hitachi Drives
2 × 224 GB second primary partitions on two of the HDDs
as one pool, named "pool"
with 3 filesystems
pool/home
pool/home/cyron
pool/home/cyron/data

If you want, I can test some things with the pool. I just want to make
sure zfs-fuse is reliable, because I want to use it for my whole home
directory.

Thanks in advance :)

Ricardo Correia

Jun 11, 2007, 3:33:03 PM
to zfs-...@googlegroups.com, Cyron
Hi Cyron,

First of all, sorry for taking so long to answer.

On Tuesday 05 June 2007 17:02:23 Cyron wrote:
> >   pool: pool
> >  state: ONLINE
> >  scrub: scrub stopped with 0 errors on Sun Jun 3 17:20:31 2007
> > config:
> >
> >         NAME        STATE     READ WRITE CKSUM
> >         pool        ONLINE       0     0     0
> >           raidz1    ONLINE       0     0     0
> >             sdc2    ONLINE       0     0     0
> >             sdd2    ONLINE       0     0     0
> >
> > errors: No known data errors

Just a note: there's no point in using RAIDZ1 with only 2 disks. I suppose
that you're only testing, though :)

> It was still online ... so I removed the hard disk from the tower and
> put it back in. I checked the status several times; for minutes,
> zfs-fuse didn't notice that the device had been removed.

That is expected. Since zfs-fuse is a userspace application, it can't easily
detect device removal, so it will only detect it when it tries to read or
write to it. But I think we should add notification support (through HAL) in
the future.
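
To illustrate the point, here is a minimal Python sketch of what "detecting it on read" amounts to from userspace. This is only an illustration, not what zfs-fuse actually does internally; the function name `device_readable` and the 512-byte probe size are made up for the example. Note also that a buffered read like this one may be served from the page cache, so it can still succeed for a while after the drive is gone.

```python
import os


def device_readable(path: str, nbytes: int = 512) -> bool:
    """Return True if `nbytes` can be read from the start of `path`.

    Hypothetical helper: a userspace process has no removal event to
    wait for here, it only finds out when an open or read fails.
    """
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError:
        return False  # the device node is gone or cannot be opened
    try:
        return len(os.read(fd, nbytes)) == nbytes
    except OSError:
        return False  # open succeeded but the read itself failed
    finally:
        os.close(fd)
```

For example, `device_readable("/dev/sdc2")` would need read permission on the device node; without it the function returns False even though the drive is present.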

> So I tried to look at the I/O status:
> >connect: Connection refused
> >Please make sure that the zfs-fuse daemon is running.
> >internal error: failed to initialize ZFS library
>
> I don't know if I caused zfs-fuse to crash, but now the status command
> didn't work either:
> >Please make sure that the zfs-fuse daemon is running.
> >internal error: failed to initialize ZFS library

OK, that is not expected. Apparently the zfs-fuse daemon crashed.

No, you didn't do anything wrong. That looks like a (serious) bug. I'll try to
reproduce it and get back to you.

Thanks for the report.

Cyron

Jun 11, 2007, 5:13:03 PM
to zfs-fuse
Hi Ricardo,

> First of all, sorry for taking so long to answer.

No problem, I know that phenomenon, no time for much work ;)

> Just a note: there's no point in using RAIDZ1 with only 2 disks. I suppose
> that you're only testing, though :)

Sure, there is a point in using RAIDZ1 with two disks.

I have four disks and want to move all their content to one big RAIDZ1.
Two of the disks are nearly full, so I created a RAIDZ1 and copy the
content to it. Then (if all works) I'll delete the original data from
the single disk, add that disk to the RAID, move the next batch of
data, and so on ...

> No, you didn't do anything wrong. That looks like a (serious) bug. I'll try to
> reproduce it and get back to you.

Ok :)
> Thanks for the report.
You're welcome.

Thanks in advance :)

Ricardo Correia

Jun 11, 2007, 5:21:34 PM
to zfs-...@googlegroups.com, Cyron
On Monday 11 June 2007 22:13:03 Cyron wrote:
> Sure, there is a point in using RAIDZ1 with two disks.
>
> I have four disks and want to move all their content to one big RAIDZ1.
> Two of the disks are nearly full, so I created a RAIDZ1 and copy the
> content to it. Then (if all works) I'll delete the original data from
> the single disk, add that disk to the RAID, move the next batch of
> data, and so on ...

Unfortunately that won't work, because ZFS doesn't support expanding/adding
devices to existing raidz vdevs.

The best you can do is either create the RAIDZ-1 with all the devices at once,
or create a mirror vdev with 2 disks and then add another mirrored vdev with
the other 2 disks, essentially creating a RAID-10 pool (with half the storage
capacity).
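
To put rough numbers on the two layouts, here is a small Python sketch. It assumes four equally sized 224 GB partitions (the size is taken from the first post; that all four would be equal is an assumption), and it ignores metadata and other overhead, so treat the results as approximations. The function names are made up for the example.

```python
def raidz1_usable(disks: int, size_gb: float) -> float:
    """Approximate usable space of one raidz1 vdev (one disk's worth is parity)."""
    if disks < 2:
        raise ValueError("raidz1 needs at least 2 disks")
    return (disks - 1) * size_gb


def mirror_pairs_usable(disks: int, size_gb: float) -> float:
    """Approximate usable space of a pool striped over 2-way mirrors."""
    if disks < 2 or disks % 2:
        raise ValueError("need an even number of disks")
    return (disks // 2) * size_gb


size = 224  # GB per partition; assumed identical on all four disks

print(raidz1_usable(4, size))        # 672
print(mirror_pairs_usable(4, size))  # 448
print(raidz1_usable(2, size))        # 224, same as a single 2-way mirror
```

The last line shows why a 2-disk RAIDZ1 gains nothing in capacity over a mirror.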

Regards,
Ricardo Correia

Ruben Wisniewski

Jun 28, 2007, 6:46:13 PM
to zfs-...@googlegroups.com
On Mon, 11 Jun 2007 22:21:34 +0100,
Ricardo Correia <rcor...@wizy.org> wrote:

Hi Ricardo,

Well ... what about this:

Create a ZFS pool (raidz) from four sparse files, two on each of the
two free disks. Then move the content of one of the full disks to the
raidz.
Now unmount the raidz, move one sparse file to the now-empty disk, and
mount the raidz again. On the disk that still holds two sparse files,
take the second sparse file out of the raidz (as a defective disk) and
delete it. Now move the content of the last full disk to the raidz, and
add the now-empty disk to the raidz.
Then take one of the sparse files out of the raidz (as a defective
disk) and add its drive to the raidz as a whole hard disk, and do the
same with the last sparse file.

----

Or create four sparse files, one on each disk, and set them up as a
raidz. Spread the content of the two full disks evenly across all four
disks. Then move a bit of data into the raidz from the first disk, then
the second, the third, and the fourth, and repeat for as long as data
remains outside the raidz.

After that, remove one of the sparse files from the raidz and add the
hard disk in its place, and repeat for the other three.
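
For reference, this is roughly what I mean by a sparse file, as a small Python sketch. The helper names, path and size are made up for the example, and it assumes a Linux filesystem that supports holes. It says nothing about whether the scheme above works with zfs-fuse.

```python
import os


def make_sparse_file(path: str, size_bytes: int) -> None:
    """Create (or overwrite) `path` with the given size without writing data."""
    with open(path, "wb") as f:
        # Extending an empty file with truncate() normally leaves a hole,
        # so almost no disk blocks are allocated until data is written.
        f.truncate(size_bytes)


def allocated_bytes(path: str) -> int:
    """Bytes actually allocated on disk (st_blocks counts 512-byte units)."""
    return os.stat(path).st_blocks * 512
```

For example, `make_sparse_file("/mnt/free1/vdev0.img", 224 * 10**9)` would give a file whose apparent size is 224 GB while `allocated_bytes()` stays far below that until data is written to it.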


Do you think this would work?

Thanks in advance

Best regards,

Cyron
