[Fwd: [ZFS for Linux Issue Tracker] #13 - Re: tools hang on zfs]


sgheeren

Jan 15, 2010, 6:20:23 AM1/15/10
to zfs-...@googlegroups.com
for info

-------- Original Message --------
Subject: [ZFS for Linux Issue Tracker] #13 - Re: tools hang on zfs
Date: Fri, 15 Jan 2010 06:19:52 -0500
From: Webmaster at Rudd-O.com <webm...@rudd-o.com>
To: zfs-...@sehe.nl


A new response has been given to the issue tools hang on zfs ioctl due to bug in put_nvlist in the tracker Issue tracker by Seth Heeren.

Response Information

Issue
tools hang on zfs ioctl due to bug in put_nvlist (http://zfs-fuse.net/issues/13)
Issue state
unconfirmed -> open
Severity
Important -> Critical

Response Details:

Ok, people, this thing is a major stopper for anything but the
simplest pool layouts (a higher number of pools, vdevs, or longer
vdev names will quickly make put_nvlist run out of buffer room,
hanging userland tools, not to mention server threads). At least
4 bugs with this symptom have been reported on the user group;
2 of them confirmed the problem went away after upgrading to
Emmanuel's repo/master. Anyone wishing to check/confirm the
solution: switch to a later version (notably one including
37af7fac0524a741b44734057e8f635e40155cb4). So e.g. do

git remote add rainemu http://rainemu.swishparty.co.uk/git/zfs
git fetch rainemu
git checkout -b test-0.7.0-pre-alpha 37af7fac0524a741b44734057e8f635e40155cb4

then rebuild, install, and test. Beware: you'll be running an
unstable version. Don't do anything more than necessary on
valuable pools.

* This is an automated email, please do not reply - Webmaster at Rudd-O.com

Rudd-O

Jan 15, 2010, 2:50:03 PM1/15/10
to zfs-...@googlegroups.com
Now that we're on this, I tried to set up a mail account to be a sender
for the list, forwarding any bug tracker activity to the mailing list,
but it turns out that webm...@rudd-o.com and webm...@zfs-fuse.net are
SWIFTLY rejected by Google Groups as invalid addresses. Even when forced
to be added.

Dang.

> --
> To post to this group, send email to zfs-...@googlegroups.com
> To visit our Web site, click on http://zfs-fuse.net/


sgheeren

Jan 15, 2010, 3:03:28 PM1/15/10
to zfs-...@googlegroups.com
I say it needs to be tested out a bit on various systems. I suggest it
be tested by the victims of the problems, so we know it at least
_fixes_ their problems (which I fully expect).

To be honest, I escaped this whole big unappetizing bad-news party by
simply merging the fix myself. I might have merged a lot more, but I'm
afraid I'm such a nitwit I can't work out the differences between my
working 'production' versions and the 0.6.0 release. Can you help me
find that out?

My findings: with official/0.6.0 I get the hang when doing e.g.

root@karmic:/tmp/ztest_15346# killall -9 zfs-fuse; rm -rfv /etc/zfs/zpool.cache; sleep 1; zfs-fuse
removed `/etc/zfs/zpool.cache'
root@karmic:/tmp/ztest_15346# zpool status
no pools available
root@karmic:/tmp/ztest_15346# r=$RANDOM; devdir=/tmp/ztest_$r; mkdir -pv $devdir; cd $devdir; for a  in a-particular-long-named-device-backing-file-named-atrociously-raw{1,2,3,4,5,6,7,8,9,10}; do dd of=$a bs=1024M seek=1024 count=0 2>/dev/null; done; zpool create pool-$r raidz $devdir/a-*
mkdir: created directory `/tmp/ztest_7140'
root@karmic:/tmp/ztest_7140# zpool status
  pool: pool-7140
 state: ONLINE
 scrub: none requested
config:

    NAME                                                                                     STATE     READ WRITE CKSUM
    pool-7140                                                                                ONLINE       0     0     0
      raidz1                                                                                 ONLINE       0     0     0
        /tmp/ztest_7140/a-particular-long-named-device-backing-file-named-atrociously-raw1   ONLINE       0     0     0
        /tmp/ztest_7140/a-particular-long-named-device-backing-file-named-atrociously-raw10  ONLINE       0     0     0
        /tmp/ztest_7140/a-particular-long-named-device-backing-file-named-atrociously-raw2   ONLINE       0     0     0
        /tmp/ztest_7140/a-particular-long-named-device-backing-file-named-atrociously-raw3   ONLINE       0     0     0
        /tmp/ztest_7140/a-particular-long-named-device-backing-file-named-atrociously-raw4   ONLINE       0     0     0
        /tmp/ztest_7140/a-particular-long-named-device-backing-file-named-atrociously-raw5   ONLINE       0     0     0
        /tmp/ztest_7140/a-particular-long-named-device-backing-file-named-atrociously-raw6   ONLINE       0     0     0
        /tmp/ztest_7140/a-particular-long-named-device-backing-file-named-atrociously-raw7   ONLINE       0     0     0
        /tmp/ztest_7140/a-particular-long-named-device-backing-file-named-atrociously-raw8   ONLINE       0     0     0
        /tmp/ztest_7140/a-particular-long-named-device-backing-file-named-atrociously-raw9   ONLINE       0     0     0

errors: No known data errors
root@karmic:/tmp/ztest_7140# r=$RANDOM; devdir=/tmp/ztest_$r; mkdir -pv $devdir; cd $devdir; for a  in a-particular-long-named-device-backing-file-named-atrociously-raw{1,2,3,4,5,6,7,8,9,10}; do dd of=$a bs=1024M seek=1024 count=0 2>/dev/null; done; zpool create pool-$r raidz $devdir/a-*
mkdir: created directory `/tmp/ztest_21314'
root@karmic:/tmp/ztest_21314# zpool status
^C

The second pool exceeds the standard buffer requirements (on my system) and hangs the userland (and presumably the server thread) indefinitely.

After a

git checkout -b test-hotfix 0.6.0
git cherry-pick -n efdd33ca
# vim zfs-fuse/zfs_ioctl.c for a trivial conflict resolution
cd src
scons -c; scons -j10; sudo scons install

The problem has disappeared. I'll spare you the boring repetitive output; the following should suffice as proof:
root@karmic:/tmp/ztest_394# zpool status|wc -l
104
root@karmic:/tmp# zpool iostat
               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
pool-11966   144K  10,0T      0      1     92  1,64K
pool-16354   144K  10,0T      0      1     98  1,75K
pool-30378   144K  10,0T      0      1     97  1,72K
pool-394     144K  10,0T      0      1    100  1,77K
pool-4856    144K  10,0T      0      1     99  1,76K
----------  -----  -----  -----  -----  -----  -----


-------- Original Message --------
Subject: [ZFS for Linux Issue Tracker] #13 - Re: tools hang on zfs
Date: Fri, 15 Jan 2010 14:25:45 -0500
From: Webmaster at Rudd-O.com <webm...@rudd-o.com>
To: zfs-...@sehe.nl


A new response has been given to the issue tools hang on zfs ioctl due to bug in put_nvlist in the tracker Issue tracker by Rudd-O.

Response Information

Issue
tools hang on zfs ioctl due to bug in put_nvlist (http://zfs-fuse.net/issues/13)

Response Details:

I have merged Emmanuel's rainemu repository and pushed it into the
official master, so you can continue using the official repo as long
as you use the master branch rather than one of the stable tags.
Guys, advise -- can we merge the quoted commits that fix the
issues into stable, to have a 0.6.1 release?

sgheeren

Jan 15, 2010, 3:05:46 PM1/15/10
to zfs-...@googlegroups.com
Rudd-O wrote:
> Now that we're on this, I tried to set up a mail account to be a sender
> for the list, sending any bug tracker activity to the mailing list, but
> it turns out that webm...@rudd-o.com or webm...@zfs-fuse.net are
> SWIFTLY rejected by Google Groups as valid addresses. Even if forced to
> be added.
>
May I warn/remind you that the mail addresses @zfs-fuse.net are still
unreliable/not working because of rotten DNS confusion.
I don't know whether my ISP (which provides the DNS) is in error, or
just some DNS clients/forwarders. Anyway, I haven't gotten it working
since the day we resolved the problem affecting the git clone.

Rudd-O

Jan 15, 2010, 4:41:44 PM1/15/10
to zfs-...@googlegroups.com

> To be honest, I escaped all this big unapetizing bad-news-party by
> simply merging the fix myself. I might have merged a lot more, but I'm
> afraid I'm such a nitwit I can't work out the differences of my
> working 'production' versions to the release 0.6.0 version. Can you
> help me find that out?

You should be able to use a variant of git log 0.6.0..HEAD to tell
you what changesets you have atop 0.6.0. You can also use giggle to
visually traverse the commit tree from 0.6.0 to what you have.
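For illustration, here is a throwaway sketch in a scratch repository (the tag and commit messages are invented, not the zfs-fuse history) showing the shape of the output:

```shell
# Scratch repo; the 0.6.0 tag and commit messages are made up for
# illustration. The point: `git log 0.6.0..HEAD` lists exactly the
# commits you carry on top of the 0.6.0 tag.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git -c user.email=you@example.com -c user.name=you \
    commit -q --allow-empty -m "0.6.0 release"
git tag 0.6.0
git -c user.email=you@example.com -c user.name=you \
    commit -q --allow-empty -m "my local put_nvlist fix"
# Shows only "my local put_nvlist fix":
git log --oneline 0.6.0..HEAD
```

Swap in the real tag and your real branch, and the one-line-per-commit output is a readable list of what you're carrying locally.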

It would be awesome if you could share the commits on the list so we
know what's working and we can do 0.6.1.

sgheeren

Jan 15, 2010, 4:49:04 PM1/15/10
to zfs-...@googlegroups.com
Rudd-O wrote:
> It would be awesome if you could share the commits on the list so we
> know what's working and we can do 0.6.1.
>
Yeah it would, wouldn't it. Hang on:
> [...snipped...]

>> After a
>>
>> git checkout -b test-hotfix 0.6.0
>> git cherry-pick -n efdd33ca
>> # vim zfs-fuse/zfs_ioctl.c for a trivial conflict resolution
>> cd src
>> scons -c; scons -j10; sudo scons install
>>
>>
Incidentally, that commit was named in the original bug report (#13) on
December 23rd... Search for 'cherry' and you'll see it.
Now I'm getting the impression that you don't have time to read any of
the mails/issue reports. Feel free to ask for a summary next time!

Seth

sgheeren

Jan 15, 2010, 5:15:09 PM1/15/10
to zfs-...@googlegroups.com
Rudd-O wrote:
>> To be honest, I escaped all this big unapetizing bad-news-party by
>> simply merging the fix myself. I might have merged a lot more, but I'm
>> afraid I'm such a nitwit I can't work out the differences of my
>> working 'production' versions to the release 0.6.0 version. Can you
>> help me find that out?
>>
>
> You should be able to use a variant of the git log 0.6.0..HEAD to tell
> you what changesets you have atop 0.6.0. You can also use giggle to
> visually traverse the commit tree from 0.6.0 to what you have.
>
Sorry, but both approaches give oodles of confusing log entries and
even useless info.

I'm afraid part of it has to do with giggle showing all branches at
once. It's not easy to see which commits go on which branch, let alone
see the lineage of a branch: <giggle.png>

The git log 0.6.0..HEAD thingie gives a lot of spurious log entries:
I'm afraid you may have rebased Emmanuel's commits instead of
pulling/fast-forwarding them? In case you still had that 'old problem'
with the merge, please look back at my reply from back then[1]; it
didn't have to be a problem. If there is some other reason why the
commits get different IDs, please educate me, because I really don't
get it, and I hate git for it :)

In our case, I'm almost sure my 'production' branch (mubi:
http://zfs-fuse.sehe.nl/?p=zfs-fuse;a=commit;h=2b8d8565b1b4abd15c150c0b6b79e94325e76138)
is the result of doing

git checkout -b mubi 0.6.0
git merge 37af7fac

which should be awfully close to what you have done lately. I really hate to say I find it next to impossible to tell the exact differences without resorting to a bloody diff. And even the diff tells me nothing:

git diff mubi rudd-o/master | diffstat
...
...
162 files changed, 10644 insertions(+), 2300 deletions(-)


[1] http://groups.google.com/group/zfs-fuse/msg/4d69ec53ae43742b

giggle.png

Manuel Amador (Rudd-O)

unread,
Jan 16, 2010, 4:59:27 AM1/16/10
to zfs-...@googlegroups.com
I *have* been following the list with less intensity. They're riding me hard
at work :-/ but the work itself is fun so I don't complain.

jafo

Jan 17, 2010, 3:47:41 PM1/17/10
to zfs-fuse
For another data point, I've been able to populate my pool with a
couple of hundred GBs of data, but I seem to run into the hang again
when I do a "zpool scrub data". This is with 0.6.0. After the scrub,
"zpool status" hangs, but the file system can continue to be accessed;
"zfs umount -a" works.

I did a restart of zfs-fuse, and "zpool status" completed.

Then I tried a "zpool scrub" and it hung. "zpool status" at that point
also hung.

I tried another restart of zfs-fuse, and now "zpool status" is hanging
with 80+% CPU use on zfs-fuse.

Sean

sgheeren

Jan 17, 2010, 5:08:12 PM1/17/10
to zfs-...@googlegroups.com
It would certainly help to know what changes you made to get this behaviour.

Did you only make the device names somewhat shorter? Or did you change
to a version of the code that fixes the bug in put_nvlist/zpool status?

In the former case, you're simply playing Russian roulette. Anything
that causes the result of a ZFS ioctl to exceed the buffer size
allocated by the client will hang the client (zpool status). In your
case, apparently having a scrub status to report is enough to
trigger it?

In the other case (code modification), please do file a new bug report
with the relevant details. Please be very specific about the code used
(preferably a commit id).

$0.02

jafo

Jan 19, 2010, 3:40:44 AM1/19/10
to zfs-fuse
On Jan 17, 3:08 pm, sgheeren <sghee...@hotmail.com> wrote:
> It would certainly help to know what changes you made to get this behaviour.

The only change I made was to reduce the size of the device names by
something like half, hoping that would be enough.

As far as code changes, it wasn't clear to me what changes to make to
get something like a 0.6.0 with this fix. I'd like my storage server to
stay on a stable release rather than tracking the trunk.

I've just tried doing the:

git checkout -b mubi 0.6.0
git merge 37af7fac

but this seems to be a lot more than just the put_nvlist fix, as it's
also saying that the zpool version needs to be upgraded.

FWIW, before I did this change, my pool had become completely
non-operational, everything was hanging.

Thanks,
Sean

sgheeren

Jan 19, 2010, 4:01:57 AM1/19/10
to zfs-...@googlegroups.com
jafo wrote:
> On Jan 17, 3:08 pm, sgheeren <sghee...@hotmail.com> wrote:
>> It would certainly help to know what changes you made to get this behaviour.
>
> I've just tried doing the:
>
>    git checkout -b mubi 0.6.0
>    git merge 37af7fac
Sidenote [1]
You may want to replace the merge with a cherry-pick. Anyhow, I noticed indeed that 37af7fac contains many more changes, which seem related to onnv synchronisation (notably: boot pools and snapshot holds), so I can see why you wouldn't want that. Here's my minimal-change patch, which I'll also submit with issue #13 for inclusion in 'stable' (official/master) -- see also attached.
diff --git a/src/lib/libzfs/libzfs_dataset.c b/src/lib/libzfs/libzfs_dataset.c
index fea911d..17df500 100644
--- a/src/lib/libzfs/libzfs_dataset.c
+++ b/src/lib/libzfs/libzfs_dataset.c
@@ -2330,6 +2330,8 @@ zfs_iter_filesystems(zfs_handle_t *zhp, zfs_iter_f func, void *data)
                        zcmd_free_nvlists(&zc);
                        return (ret);
                }
+               // next child is allowed to use the full size !!!
+               zc.zc_nvlist_dst_size = 4096;
        }
        zcmd_free_nvlists(&zc);
        return ((ret < 0) ? ret : 0);
diff --git a/src/zfs-fuse/zfs_ioctl.c b/src/zfs-fuse/zfs_ioctl.c
index a8bd40d..91f96b5 100644
--- a/src/zfs-fuse/zfs_ioctl.c
+++ b/src/zfs-fuse/zfs_ioctl.c
@@ -834,11 +834,7 @@ put_nvlist(zfs_cmd_t *zc, nvlist_t *nvl)
                  syslog(LOG_WARNING,"put_nvlist: error %s on xcopyout",strerror(error));
        }
 
-       /* zc->zc_nvlist_dst_size = size; */
-       /* This commented allocation was probably some kind of optimization
-       since this zc is sent to the socket. Except that put_nvlist is sometimes
-       called recursively and in this case we get very fast an out of memory error
-       in this function. Simply commenting out the allocation fixes the problem */
+       zc->zc_nvlist_dst_size = size;
        return (error);
 }


[1] Coollll... now it confuses me why you would name the branch that. This can only lead to confusion, as my mubi branch is not for public consumption[2].
[2] It contains a subset of Emmanuel's repo; it is, so to speak, a label of the version _I_ use on my servers. Mind you: it _is_ unstable, and yes, I have paranoid backups (to OpenSolaris ZFS in the cloud and to XFS on external disks).

> FWIW, before I did this change, my pool had become completely
> non-operational, everything was hanging.

I hope you have been able to follow my explanation of that in my previous response?

> Thanks,
> Sean

patch.put_nvlist_hotfix

sgheeren

Jan 19, 2010, 4:06:48 AM1/19/10
to zfs-...@googlegroups.com
For info (http://zfs-fuse.net/issues/13)

-------- Original Message --------
Subject: [ZFS for Linux Issue Tracker] #13 - Re: tools hang on zfs
Date: Tue, 19 Jan 2010 04:06:02 -0500
From: Webmaster at Rudd-O.com <webm...@rudd-o.com>
To: zfs-...@sehe.nl

A new response has been given to the issue *tools hang on zfs ioctl due
to bug in put_nvlist* in the tracker *Issue tracker* by *Seth Heeren*.


Response Information

Issue
tools hang on zfs ioctl due to bug in put_nvlist
(http://zfs-fuse.net/issues/13)

*Response Details*:

As jafo correctly pointed out on the list, the 37af7fac commit
introduces a lot more change than might be warranted for stable
deployments. It certainly does _not_ lend itself to being
cherry-picked, like I suggested earlier. Sorry for my mistake[1]. I
have reduced things to a minimal fix, related only to this
particular problem (attached). Can we test this / get it merged
instead of rolling master forward to include all the other
upstream changes?

[1] This must have happened because in actual life I'm ahead of
the 0.6.0 release myself, so I didn't give that commit the due
attention, because I have it anyway.

Manuel Amador (Rudd-O)

Jan 19, 2010, 3:19:12 PM1/19/10
to zfs-...@googlegroups.com
> Sidenote [1]
> You may want to replace the merge with a cherry-pick.

Correct. Merge brings in ALL commits up to (and including) the one you
gave. Cherry-pick applies only that exact commit.
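To make the difference concrete, a scratch-repo sketch (the commit messages are invented; in practice you'd substitute real ids like 37af7fac or efdd33ca):

```shell
# Scratch repo demonstrating merge vs cherry-pick. Commit messages are
# made up for illustration.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
c() { git -c user.email=you@example.com -c user.name=you "$@"; }
c commit -q --allow-empty -m "base (think: 0.6.0)"
git branch pick-branch
git branch merge-branch
c commit -q --allow-empty -m "unrelated upstream churn"
c commit -q --allow-empty -m "the actual fix"
fix=$(git rev-parse HEAD)

git checkout -q pick-branch
c cherry-pick --allow-empty "$fix"   # applies ONLY "the actual fix"
git rev-list --count HEAD            # -> 2 (base + the fix)

git checkout -q merge-branch
c merge -q "$fix"                    # fast-forwards: brings the churn too
git rev-list --count HEAD            # -> 3 (base + churn + fix)
```

So cherry-picking keeps a stable branch minimal, while merging drags in every ancestor of the named commit, which is exactly why 37af7fac pulled in the onnv-sync changes.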


jafo

Jan 22, 2010, 6:12:07 PM1/22/10
to zfs-fuse
Thanks for the pointers, guys. As you've noticed, I'm not much of a
git-er. I need to get more familiar with it, but right now I'm kind of
just stuck in the svn and bzr worlds.

I've applied the patches above, and so far so good. I'll put them on
one of my test systems and start up a stress test on that.
Unfortunately, one looks to have had a power supply die, and the other
I had to swap to another machine, and apparently it needs some love
before it comes back up. :-(

On my main storage server it's looking good though -- I've been able to
start a scrub and run several "zpool status" commands and all. It was
definitely dying before this point, so it's at least a good early
indicator.

Thanks,
Sean
