aborting zfs commands does not work (?)


Maximilian Mehnert

Jun 19, 2010, 1:21:38 PM
to zfs-...@googlegroups.com
I just noticed some odd behaviour that I did not expect.

I tried a zfs send|zfs receive on my machine.
Then I noticed that I only wanted to send incremental data and canceled
the command with CTRL+C. So far so good.
I continued with the incremental data, which worked out fine, and tried to
export the target pool afterwards.

That did not work. Zpool claimed that the pool was still busy.
I checked to see that it was already unmounted. Then I noticed that
there was still I/O activity on the disk. According to iostat, it was
actually the full bandwidth that was possible for that disk.

The same happens with CTRL+C while doing a "zfs rollback". Activity
continues until the rollback is finished.

I noticed similar behaviour earlier while resilvering. If the disk
that is being resilvered is suddenly removed, heavy reading activity
continues on the source disk.

For the first case I would expect zfs to just cancel activity on both
the source and target pool. The same for cancelling a rollback.

If the target fails while resilvering, that should probably be detected...

I don't have solaris on any machine to test those cases there.
Does someone know what should really happen? Did someone else experience
these problems?

All the best,
Maximilian.


sgheeren

Jun 19, 2010, 2:11:20 PM
to zfs-...@googlegroups.com
On 06/19/2010 07:21 PM, Maximilian Mehnert wrote:
> I just noticed some odd behaviour that I did not expect.
>
> I tried a zfs send|zfs receive on my machine.
> Then I noticed that I only wanted to send incremental data and canceled
> the command with CTRL+C. So far so good.
> I continued with the incremental data, which worked out fine, and tried to
> export the target pool afterwards.
>
> That did not work. Zpool claimed that the pool was still busy.
> I checked to see that it was already unmounted. Then I noticed that
> there was still I/O activity on the disk. According to iostat, it was
> actually the full bandwidth

Read or write?

> that was possible for that disk.
  
This would make sense if you just interrupted the 'receive' side of things. On some shells, CTRL+C will only interrupt parts of the pipe. Can you state the exact invocation environment (distro, shell brand and version, traps, functions, shell opts etc. if applicable)?
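
Whether both ends of a pipe actually receive the interrupt can be checked outside the shell. The following is a minimal sketch, not something tested against zfs-fuse: the helper name `interrupt_pipeline` is hypothetical, and `sleep` is only a stand-in for the `zfs send` and `zfs receive` commands. It starts both commands in one process group, sends SIGINT to the group (roughly what a terminal does on CTRL+C), and reports how each side exited.

```python
import os
import signal
import subprocess
import time


def _new_group():
    # Runs in the child before exec: lead a fresh process group and make
    # sure SIGINT has its default action (an ignored SIGINT survives exec).
    os.setpgrp()
    signal.signal(signal.SIGINT, signal.SIG_DFL)


def interrupt_pipeline(producer_cmd, consumer_cmd, delay=0.5, timeout=5.0):
    """Run `producer | consumer` in one process group, send SIGINT to the
    group, and return (producer_returncode, consumer_returncode)."""
    producer = subprocess.Popen(producer_cmd, stdout=subprocess.PIPE,
                                preexec_fn=_new_group)
    pgid = producer.pid  # the producer is the group leader

    def _join_group():
        # Runs in the child before exec: join the producer's group.
        os.setpgid(0, pgid)
        signal.signal(signal.SIGINT, signal.SIG_DFL)

    consumer = subprocess.Popen(consumer_cmd, stdin=producer.stdout,
                                stdout=subprocess.DEVNULL,
                                preexec_fn=_join_group)
    producer.stdout.close()  # the consumer holds the read end now

    time.sleep(delay)               # let both commands get going
    os.killpg(pgid, signal.SIGINT)  # interrupt every member of the group

    return producer.wait(timeout), consumer.wait(timeout)


if __name__ == "__main__":
    # Stand-in commands; substitute the real pipeline halves to test them.
    print(interrupt_pipeline(["sleep", "30"], ["sleep", "30"]))
```

A negative return code -N means that side was terminated by signal N. Note that this only observes the two client processes; it says nothing about work that a separate daemon may continue on their behalf.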

> The same happens with CTRL+C while doing a "zfs rollback". Activity
> continues until the rollback is finished.
>
> I noticed similar behaviour earlier while resilvering. In case the disk
> that is being resilvered is suddenly removed, heavy reading activity
> continues on the source disk.
  
I noticed issue #20 (by you) is about the same kind of issue. http://zfs-fuse.net/issues/20

Can you state which version this new report applies to? Could you also state the exact command used?

E.g. 'zfs send ... | zfs recv' has only recently started working on zfs-fuse, and
'zfs send -D' has worked for an even shorter time.

'zfs rollback -R' does noticeably different things than plain 'zfs rollback', etc.

> For the first case I would expect zfs to just cancel activity on both
> the source and target pool.

Agreed

> The same for cancelling a rollback.
  
Not agreed. I would expect ZFS to complete the currently running atomic transaction, and then abort the rollback. This could take some time, especially under e.g. dedup.

> If the target fails while resilvering, that should probably be detected...

Well, now you are referring (without mentioning it?) to issue #20, I am _guessing_? Otherwise this quip means little to me. Do you mean you _know_ it detects it, or you just hope it will; do you have any particular reason to mention it? If so, please share more details :)

> I don't have solaris on any machine to test those cases there.

I do.

> Does someone know what should really happen?

Sun does :)

> Did someone else experience
> these problems?

No.

> All the best,
> Maximilian.


  

Maximilian Mehnert

Jun 19, 2010, 2:51:29 PM
to zfs-...@googlegroups.com
On 19/06/10 20:11, sgheeren wrote:
>> That did not work. Zpool claimed that the pool was still busy.
>> I checked to see that it was already unmounted. Then I noticed that
>> there was still I/O activity on the disk. According to iostat, it was
>> actually the full bandwidth
> Read or write?

I'm not sure anymore. But it was definitely the disk with the receiving
pool.
If needed I'll try to reproduce it.

>> that was possible for that disk.
>>
> This would make sense if you just interrupted the 'receive' side of
> things. On some shells, CTRL+C will only interrupt parts of the pipe.
> Can you state the exact invocation environment (distro, shell brand and
> version, traps, functions, shell opts etc. if applicable)?

Latest zfs from the rudd-o repository.
Debian, bash 4.1.5(1). I'm not sure about the exact invocation of zfs.
Probably zfs send pool@snapshot | zfs receive -F target, then
zfs send -i earlier pool@snapshot | zfs receive -F target


>> I noticed similar behaviour earlier while resilvering. In case the disk
>> that is being resilvered is suddenly removed, heavy reading activity
>> continues on the source disk.
>>
> I noticed issue #20 (by you) is about the same kind of issue.
> http://zfs-fuse.net/issues/20

That's correct.


> Can you state which version this new report applies to?

commit 281dcc3aea76fa371a55af83869049bef159a9af
Author: Seth Heeren <sghe...@hotmail.com>
Date: Thu Jun 3 21:06:11 2010 +0200


>> The same for cancelling a rollback.
> Not agreed. I would expect ZFS to complete the currently running atomic
> transaction, and then abort the rollback. This could take some time,
> especially under e.g. dedup.

Agreed ;-)


>> If the target fails while resilvering, that should probably be detected...
>>
> Well, now you are referring (without mentioning it?) to issue #20, I am
> _guessing_? Otherwise this quip means little to me.

Sorry for not making that clearer. It looked very similar to me, in that
those were all cases where I'd expect zfs to stop doing something.

> Do you mean you
> _know_ it detects it, or you just hope it will; do you have
> any particular reason to mention it? If so, please share more details :)

OK, I'll try. I have two devices in a raidz pool, both via device mapper.
Let's say /dev/mapper/part1 and /dev/mapper/part2.
part1 is OK and part2 is resilvering. When part2 is unplugged, zpool
status still shows an ongoing resilvering process, and read activity on
part1 continues (I guess until the end of the supposed resilvering
process; I never waited that long).


sgheeren

Jun 19, 2010, 3:01:27 PM
to zfs-...@googlegroups.com
Good

The most important news is that you found the 'latest' version (testing is advancing further still, but nothing spectacular yet).

I'll be adding this info to issue #20 so we can further investigate

Seth