Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

[GIT BISECT] first bad commit: 1f36f774 Switch !O_CREAT case to use of do_last()

5 views
Skip to first unread message

Boaz Harrosh

unread,
Mar 24, 2010, 11:50:02 AM3/24/10
to
1f36f774b22a0ceb7dd33eca626746c81a97b6a5 is the first bad commit
commit 1f36f774b22a0ceb7dd33eca626746c81a97b6a5
Author: Al Viro <vi...@zeniv.linux.org.uk>
Date: Sat Dec 26 10:56:19 2009 -0500

Switch !O_CREAT case to use of do_last()

... and now we have all intents crap well localized

Signed-off-by: Al Viro <vi...@zeniv.linux.org.uk>

:040000 040000 53c1effc5b22746bb83cdbc6a419bf898067882d 9606f7275fde188a056fd174aa1d622ac0630893 M fs

I have a file corruption regression bisected to above. (The one step before def4af30cf945a3735ffca865788ea84b30b25d9
was fine)

The test:

- I have an exofs filesystem mounted on /mnt/exofs
- []$ cd /mnt/exofs/some_linux_git; git status;
All is fine
- []$ mount -t nfs4 -o minorversion=0 localhost:/ /mnt/nfs
(Where etc/exports will export /mnt/exofs via nfs4.1)
- []$ cd /mnt/nfs/some_linux_git; git status;
This will fail and will corrupt the .git/index file. Sometimes the file would be
too short, and sometimes the file will become a directory (Yes really)

Simple revert of this patch over 2.6.34-rc2 will not work so I've not checked it.
A single threaded read like: "diff -r /mnt/nfs/some_linux_git /mnt/nfs/some_linux_git.back"
or even "diff -r /mnt/exofs/some_linux_git /mnt/nfs/some_linux_git" works fine.
A simple dd write test works fine as well.

So it looks like a race that only something like git can exercise, with multi threaded
reads and writes, through nfs/nfsd.

I know Benny also had failures with files export ontop of 2.6.34-rc2
(I looked at the patch, it is totally out of my league, please help)

I will try to reproduce this with a more conventional FS, or any thing else
I should test?

Thanks
Boaz
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majo...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Al Viro

unread,
Mar 24, 2010, 12:10:03 PM3/24/10
to
On Wed, Mar 24, 2010 at 06:04:56PM +0200, Boaz Harrosh wrote:
> > Bloody impressive... Does that happen to underlying fs or to what you
> > are seeing via NFS?
>
> Only via NFS. All local access is fine.
>
> After the corruption above I can cd to the local mount cp a fresh copy
> of .git/index file and play around just fine.
> Once I return to the NFS mounted directory, a git status will do it.
> It does not matter if caches are cold (Takes a long time) or hot it happens
> every time.
>
> Weird I know, I'm playing some more with it as we speak

What happens if you export to box running older kernel *or* from box
running older kernel? IOW, is that nfsd or nfs client getting unhappy?
I'd suspect the latter, but...

Boaz Harrosh

unread,
Mar 24, 2010, 12:10:03 PM3/24/10
to
On 03/24/2010 06:00 PM, Al Viro wrote:

> On Wed, Mar 24, 2010 at 05:49:39PM +0200, Boaz Harrosh wrote:
>> - I have an exofs filesystem mounted on /mnt/exofs
>> - []$ cd /mnt/exofs/some_linux_git; git status;
>> All is fine
>> - []$ mount -t nfs4 -o minorversion=0 localhost:/ /mnt/nfs
>> (Where etc/exports will export /mnt/exofs via nfs4.1)
>> - []$ cd /mnt/nfs/some_linux_git; git status;
>> This will fail and will corrupt the .git/index file. Sometimes the file would be
>> too short, and sometimes the file will become a directory (Yes really)
>
> Bloody impressive... Does that happen to underlying fs or to what you
> are seeing via NFS?

Only via NFS. All local access is fine.

After the corruption above I can cd to the local mount cp a fresh copy
of .git/index file and play around just fine.
Once I return to the NFS mounted directory, a git status will do it.
It does not matter if caches are cold (Takes a long time) or hot it happens
every time.

Weird I know, I'm playing some more with it as we speak

Thanks

Al Viro

unread,
Mar 24, 2010, 12:10:03 PM3/24/10
to
On Wed, Mar 24, 2010 at 05:49:39PM +0200, Boaz Harrosh wrote:
> - I have an exofs filesystem mounted on /mnt/exofs
> - []$ cd /mnt/exofs/some_linux_git; git status;
> All is fine
> - []$ mount -t nfs4 -o minorversion=0 localhost:/ /mnt/nfs
> (Where etc/exports will export /mnt/exofs via nfs4.1)
> - []$ cd /mnt/nfs/some_linux_git; git status;
> This will fail and will corrupt the .git/index file. Sometimes the file would be
> too short, and sometimes the file will become a directory (Yes really)

Bloody impressive... Does that happen to underlying fs or to what you
are seeing via NFS?

Boaz Harrosh

unread,
Mar 24, 2010, 12:20:03 PM3/24/10
to
On 03/24/2010 06:07 PM, Al Viro wrote:
> On Wed, Mar 24, 2010 at 06:04:56PM +0200, Boaz Harrosh wrote:
>>> Bloody impressive... Does that happen to underlying fs or to what you
>>> are seeing via NFS?
>>
>> Only via NFS. All local access is fine.
>>
>> After the corruption above I can cd to the local mount cp a fresh copy
>> of .git/index file and play around just fine.
>> Once I return to the NFS mounted directory, a git status will do it.
>> It does not matter if caches are cold (Takes a long time) or hot it happens
>> every time.
>>
>> Weird I know, I'm playing some more with it as we speak
>
> What happens if you export to box running older kernel *or* from box
> running older kernel? IOW, is that nfsd or nfs client getting unhappy?
> I'd suspect the latter, but...


Good question, I'm just getting to that because currently it's all
over localhost (same kernel, BTW inside a UML)

I will try what you said. Please through any other tests on me, if needed.

Boaz

Al Viro

unread,
Mar 24, 2010, 12:50:02 PM3/24/10
to
On Wed, Mar 24, 2010 at 06:10:52PM +0200, Boaz Harrosh wrote:
> On 03/24/2010 06:07 PM, Al Viro wrote:
> > On Wed, Mar 24, 2010 at 06:04:56PM +0200, Boaz Harrosh wrote:
> >>> Bloody impressive... Does that happen to underlying fs or to what you
> >>> are seeing via NFS?
> >>
> >> Only via NFS. All local access is fine.
> >>
> >> After the corruption above I can cd to the local mount cp a fresh copy
> >> of .git/index file and play around just fine.
> >> Once I return to the NFS mounted directory, a git status will do it.
> >> It does not matter if caches are cold (Takes a long time) or hot it happens
> >> every time.
> >>
> >> Weird I know, I'm playing some more with it as we speak
> >
> > What happens if you export to box running older kernel *or* from box
> > running older kernel? IOW, is that nfsd or nfs client getting unhappy?
> > I'd suspect the latter, but...
>
>
> Good question, I'm just getting to that because currently it's all
> over localhost (same kernel, BTW inside a UML)
>
> I will try what you said. Please through any other tests on me, if needed.

Very interesting... Just to see which path we are hitting: add
if (IS_ERR(nd->intent.open.file))
printk("foo: %s", pathname);
right after
error = do_lookup(nd, &nd->last, path);
if (error)
goto exit;
in fs/namei.c:do_last() and see whether we are hitting it or not on objects
that get corrupted.

Boaz Harrosh

unread,
Mar 24, 2010, 1:20:02 PM3/24/10
to
On 03/24/2010 06:39 PM, Al Viro wrote:
> On Wed, Mar 24, 2010 at 06:10:52PM +0200, Boaz Harrosh wrote:
>> On 03/24/2010 06:07 PM, Al Viro wrote:
>>> On Wed, Mar 24, 2010 at 06:04:56PM +0200, Boaz Harrosh wrote:
>>>>> Bloody impressive... Does that happen to underlying fs or to what you
>>>>> are seeing via NFS?
>>>>
>>>> Only via NFS. All local access is fine.
>>>>
>>>> After the corruption above I can cd to the local mount cp a fresh copy
>>>> of .git/index file and play around just fine.
>>>> Once I return to the NFS mounted directory, a git status will do it.
>>>> It does not matter if caches are cold (Takes a long time) or hot it happens
>>>> every time.
>>>>
>>>> Weird I know, I'm playing some more with it as we speak
>>>
>>> What happens if you export to box running older kernel *or* from box
>>> running older kernel? IOW, is that nfsd or nfs client getting unhappy?
>>> I'd suspect the latter, but...
>>
>>
>> Good question, I'm just getting to that because currently it's all
>> over localhost (same kernel, BTW inside a UML)
>>
>> I will try what you said. Please through any other tests on me, if needed.
>

As you suspected old-server+new-client fails. any-thing+old-client is
fine. (two separate machines this time)

> Very interesting... Just to see which path we are hitting: add
> if (IS_ERR(nd->intent.open.file))
> printk("foo: %s", pathname);
> right after
> error = do_lookup(nd, &nd->last, path);
> if (error)
> goto exit;
> in fs/namei.c:do_last() and see whether we are hitting it or not on objects
> that get corrupted.

Sorry was busy shifting setups, didn't see your mail, will do that next ...

Thanks
Boaz

Boaz Harrosh

unread,
Mar 24, 2010, 1:40:04 PM3/24/10
to


Below is what I changed. (I hope its what you meant)
It does not get hit, just that git corruption as before but I don't see the prints.
I'll try running with nfs dbg-prints on see what it does around the time gits complains

Boaz

---
diff --git a/fs/namei.c b/fs/namei.c
index 1c0fca6..d1c96f0 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1650,6 +1650,12 @@ static struct file *do_last(struct nameidata *nd, struct path *path,


error = do_lookup(nd, &nd->last, path);
if (error)
goto exit;

+
+ if (IS_ERR(nd->intent.open.file)) {
+ printk(KERN_ERR "foo: %s", pathname);
+ WARN_ON(1);
+ }
+
error = -ENOENT;
if (!path->dentry->d_inode)
goto exit_dput;

Boaz Harrosh

unread,
Mar 24, 2010, 2:00:02 PM3/24/10
to
On 03/24/2010 07:47 PM, Boaz Harrosh wrote:
>>> On 03/24/2010 06:39 PM, Al Viro wrote:
>>>> On Wed, Mar 24, 2010 at 06:10:52PM +0200, Boaz Harrosh wrote:
>>>>> On 03/24/2010 06:07 PM, Al Viro wrote:
>>>>>>>> Bloody impressive... Does that happen to underlying fs or to what you
>>>>>>>> are seeing via NFS?
>>>>>>>
>>>>>>> Only via NFS. All local access is fine.
>>>>>>>
<snip>

Al hi

Would you like to attempt a revert of this patch (or group of patches)
Just to get rid of the thought that git bisect was just peeking the
wrong guy. Maybe it's just something else? Can you understand the
relevance of all this?

(I'll try other setups as well but tomorrow, it's getting late out here)
Thanks for your help
Boaz

Trond Myklebust

unread,
Mar 24, 2010, 2:10:02 PM3/24/10
to
On Wed, 2010-03-24 at 19:47 +0200, Boaz Harrosh wrote:
> Attached is an output of when I:
> $ echo $((0x7fff)) > /proc/sys/sunrpc/nfs_debug
> and then run git status. (On a new client)
>
> We can see the complains after things got broken but what broke it
> that's hard for me to see.
>
> (If the file is too big I'll put it on the web somewhere, see if it arrives)
>
> Boaz

Something weird is going on in your trace:

NFS: open file(5b/46ff70a61cf4e159a0339df0e02113bf35f805)
NFS: permission(0:12/323044), mask=0x24, res=0
NFS: revalidating (0:12/323044)
--> nfs4_setup_sequence clp 00000000791f3000 session (null) sr_slotid
128
<-- nfs4_setup_sequence status=0
encode_compound: tag=
decode_attr_type: type=00
decode_attr_change: change attribute=10077553255782547456
decode_attr_size: file size=921
decode_attr_fsid: fsid=(0x0/0x0)
decode_attr_fileid: fileid=0
decode_attr_fs_locations: fs_locations done, error = 0
decode_attr_mode: file mode=00
decode_attr_nlink: nlink=1
decode_attr_owner: uid=-2
decode_attr_group: gid=-2
decode_attr_rdev: rdev=(0x0:0x0)
decode_attr_space_used: space used=0
decode_attr_time_access: atime=0
decode_attr_time_metadata: ctime=1269422731
decode_attr_time_modify: mtime=1269422731
decode_attr_mounted_on_fileid: fileid=0
decode_getfattr: xdr returned 0

A file type of '0' in the above trace is just wrong, and probably
indicates that the server didn't even return that attribute.

I'd say you have a corruption issue either on the server side or on your
client.

Trond

Al Viro

unread,
Mar 24, 2010, 2:10:01 PM3/24/10
to
On Wed, Mar 24, 2010 at 07:58:00PM +0200, Boaz Harrosh wrote:
> Al hi
>
> Would you like to attempt a revert of this patch (or group of patches)
> Just to get rid of the thought that git bisect was just peeking the
> wrong guy. Maybe it's just something else? Can you understand the
> relevance of all this?

If you see breakage at that commit and do not see it on its parent, we
do have the right guy...

As for reverting, try reverting 781b16775ba0bb55fac0e1757bf0bd87c8879632
first, then this commit.

How consistent are the effects you are seeing from test to test on the same
kernel? This one was very interesting, since it seemed to fail with
-EISDIR while opening .git/objects/pack. Which is a directory and which
should fail with -EISDIR if and only if we pass O_CREAT to open(). And
passing O_CREAT on that one is probably not an intended behaviour of git...

Does anybody else see NFS breakage starting at that commit, BTW? Other
testcases would be useful...

Trond Myklebust

unread,
Mar 24, 2010, 2:20:02 PM3/24/10
to
On Wed, 2010-03-24 at 14:02 -0400, Trond Myklebust wrote:
> I'd say you have a corruption issue either on the server side or on your
> client.

By the way. Does this issue happen with NFSv4 too, or is it only
NFSv4.1?

Doug Nazar

unread,
Mar 24, 2010, 2:30:02 PM3/24/10
to
On 2010-03-24 2:06 PM, Al Viro wrote:
>
> How consistent are the effects you are seeing from test to test on the same
> kernel? This one was very interesting, since it seemed to fail with
> -EISDIR while opening .git/objects/pack. Which is a directory and which
> should fail with -EISDIR if and only if we pass O_CREAT to open(). And
> passing O_CREAT on that one is probably not an intended behaviour of git...
>
> Does anybody else see NFS breakage starting at that commit, BTW? Other
> testcases would be useful...
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majo...@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>

I also see it....didn't bisect it yet since I was in a rush and had a
simple work around. I was doing a git pull between two repositories on a
NFS 4 w/krb5 security. Got the error about .git/objects/pack (error:
unable to open object pack directory: .git/objects/pack: Is a
directory). If I then run 'ls .git/objects/pack', and then the 'git
pull' again it works.

Doug

Al Viro

unread,
Mar 24, 2010, 3:00:03 PM3/24/10
to
On Wed, Mar 24, 2010 at 02:26:29PM -0400, Doug Nazar wrote:
> I also see it....didn't bisect it yet since I was in a rush and had
> a simple work around. I was doing a git pull between two
> repositories on a NFS 4 w/krb5 security. Got the error about
> .git/objects/pack (error: unable to open object pack directory:
> .git/objects/pack: Is a directory). If I then run 'ls
> .git/objects/pack', and then the 'git pull' again it works.

Very interesting... The damn thing *is* a directory, which should have
made NFS skip all lookup_instantiate_filp() tricks completely. IOW, it's
not hitting anything intent-related at that case.

I really wonder where the hell does EISDIR come from; no matter how screwed
the cached attributes are, there's not a lot of places where we can return
that sucker. It either comes from something in NFS itself, or it's
may_open() getting MAY_WRITE on that object or it has to see O_CREAT in
open_flag. And acc_mode is not modified after it's set in do_filp_open().
If that calculation (acc_mode by open_flag) would be buggered for O_RDONLY,
we'd be seeing a lot more breakage, starting with ls(1) ;-)

One possibility is that shit hits the fan a bit earlier and git really passes
something odd to that open() as the result of bogus stat(), etc.

Folks, could you try the following: in do_last() move case LAST_DOT:
to immediately after follow_dotdot(nd); and see if that changes anything?
It shouldn't, but if nfs is playing odd tricks with ->d_revalidate() for
directories acting differently depending on LOOKUP_PARENT in flags...

IOW, replace
case LAST_DOTDOT:
follow_dotdot(nd);
dir = nd->path.dentry;
if (nd->path.mnt->mnt_sb->s_type->fs_flags & FS_REVAL_DOT) {
if (!dir->d_op->d_revalidate(dir, nd)) {
error = -ESTALE;
goto exit;
}
}
/* fallthrough */
case LAST_DOT:
case LAST_ROOT:
with
case LAST_DOTDOT:
follow_dotdot(nd);
/* fallthrough */
case LAST_DOT:
dir = nd->path.dentry;
if (nd->path.mnt->mnt_sb->s_type->fs_flags & FS_REVAL_DOT) {
if (!dir->d_op->d_revalidate(dir, nd)) {
error = -ESTALE;
goto exit;
}
}
/* fallthrough */
case LAST_ROOT:
and see what'll change.

Boaz Harrosh

unread,
Mar 25, 2010, 5:20:01 AM3/25/10
to
On 03/24/2010 08:10 PM, Trond Myklebust wrote:
> On Wed, 2010-03-24 at 14:02 -0400, Trond Myklebust wrote:
>> I'd say you have a corruption issue either on the server side or on your
>> client.
>

It is surely a corruption on the client. I've tested with an old server
and it fails just the same.

> By the way. Does this issue happen with NFSv4 too, or is it only
> NFSv4.1?
>
> Trond
>

-o minorversion=0 or =1 does not matter.
nfsv3 does not have that problem. (Same client different server though)

Boaz

Boaz Harrosh

unread,
Mar 25, 2010, 5:40:01 AM3/25/10
to

(Below is what I changed)

It makes no difference, fails just the same. Would an "strace" help?

I will now setup an ext2 export. exofs is an heavy ext2 copy paste in every
thing meta-data and nfs-export. I hope to strimline to the basics.

You have asked before, if the failure is reproducible. Yes it fails every
time in exactly the same way. "git status" on that tree has never succeeded
from the 2.6.34-rc2 client.

Boaz
---
diff --git a/fs/namei.c b/fs/namei.c

index 1c0fca6..e8a6a7b 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1618,8 +1618,10 @@ static struct file *do_last(struct nameidata *nd, struct path *path,
int error = -EISDIR;

switch (nd->last_type) {
- case LAST_DOTDOT:
+ case LAST_DOTDOT:
follow_dotdot(nd);
+ /* fallthrough */
+ case LAST_DOT:


dir = nd->path.dentry;
if (nd->path.mnt->mnt_sb->s_type->fs_flags & FS_REVAL_DOT) {
if (!dir->d_op->d_revalidate(dir, nd)) {

@@ -1628,7 +1630,6 @@ static struct file *do_last(struct nameidata *nd, struct path *path,
}
}
/* fallthrough */
- case LAST_DOT:
case LAST_ROOT:
if (open_flag & O_CREAT)
goto exit;
@@ -1650,6 +1651,12 @@ static struct file *do_last(struct nameidata *nd, struct path *path,


error = do_lookup(nd, &nd->last, path);
if (error)
goto exit;
+
+ if (IS_ERR(nd->intent.open.file)) {
+ printk(KERN_ERR "foo: %s", pathname);
+ WARN_ON(1);
+ }
+
error = -ENOENT;
if (!path->dentry->d_inode)
goto exit_dput;

Al Viro

unread,
Mar 25, 2010, 6:20:02 AM3/25/10
to
On Thu, Mar 25, 2010 at 11:39:38AM +0200, Boaz Harrosh wrote:

> It makes no difference, fails just the same. Would an "strace" help?

It might, especially if you ran it for identical repositories on local
fs and on NFS; at least that way it would be possible to see where do
they diverge...

Benny Halevy

unread,
Mar 25, 2010, 6:20:02 AM3/25/10
to
On Mar. 24, 2010, 20:10 +0200, Trond Myklebust <trond.m...@fys.uio.no> wrote:
> On Wed, 2010-03-24 at 14:02 -0400, Trond Myklebust wrote:
>> I'd say you have a corruption issue either on the server side or on your
>> client.
>
> By the way. Does this issue happen with NFSv4 too, or is it only
> NFSv4.1?

I reproduced this both over v4 and v4.1 with ext3 exported on the server.

Benny

>
> Trond
>
> _______________________________________________
> pNFS mailing list
> pN...@linux-nfs.org
> http://linux-nfs.org/cgi-bin/mailman/listinfo/pnfs

Benny Halevy

unread,
Mar 25, 2010, 6:30:02 AM3/25/10
to
On Mar. 25, 2010, 12:12 +0200, Al Viro <vi...@ZenIV.linux.org.uk> wrote:
> On Thu, Mar 25, 2010 at 11:39:38AM +0200, Boaz Harrosh wrote:
>
>> It makes no difference, fails just the same. Would an "strace" help?
>
> It might, especially if you ran it for identical repositories on local
> fs and on NFS; at least that way it would be possible to see where do
> they diverge...
> _______________________________________________
> pNFS mailing list
> pN...@linux-nfs.org
> http://linux-nfs.org/cgi-bin/mailman/listinfo/pnfs

ext3 fs exported on the server
client mount -t nfs4 localhost:/ /mnt/localhost
$ strace git status
...
open(".git/objects/pack", O_RDONLY|O_NONBLOCK|O_DIRECTORY|0x80000) = -1 EISDIR (Is a directory)

Benny Halevy

unread,
Mar 25, 2010, 6:40:01 AM3/25/10
to
On Mar. 25, 2010, 12:22 +0200, Benny Halevy <bha...@panasas.com> wrote:
> On Mar. 25, 2010, 12:12 +0200, Al Viro <vi...@ZenIV.linux.org.uk> wrote:
>> On Thu, Mar 25, 2010 at 11:39:38AM +0200, Boaz Harrosh wrote:
>>
>>> It makes no difference, fails just the same. Would an "strace" help?
>> It might, especially if you ran it for identical repositories on local
>> fs and on NFS; at least that way it would be possible to see where do
>> they diverge...
>> _______________________________________________
>> pNFS mailing list
>> pN...@linux-nfs.org
>> http://linux-nfs.org/cgi-bin/mailman/listinfo/pnfs
>
> ext3 fs exported on the server
> client mount -t nfs4 localhost:/ /mnt/localhost
> $ strace git status
> ...
> open(".git/objects/pack", O_RDONLY|O_NONBLOCK|O_DIRECTORY|0x80000) = -1 EISDIR (Is a directory)

FWIW, This could be timing related as retrying with nfs debugging on
failed to reproduce the problem.

Benny

> _______________________________________________
> pNFS mailing list
> pN...@linux-nfs.org
> http://linux-nfs.org/cgi-bin/mailman/listinfo/pnfs

Al Viro

unread,
Mar 25, 2010, 6:50:02 AM3/25/10
to
On Thu, Mar 25, 2010 at 12:22:31PM +0200, Benny Halevy wrote:
> On Mar. 25, 2010, 12:12 +0200, Al Viro <vi...@ZenIV.linux.org.uk> wrote:
> > On Thu, Mar 25, 2010 at 11:39:38AM +0200, Boaz Harrosh wrote:
> >
> >> It makes no difference, fails just the same. Would an "strace" help?
> >
> > It might, especially if you ran it for identical repositories on local
> > fs and on NFS; at least that way it would be possible to see where do
> > they diverge...
> > _______________________________________________
> > pNFS mailing list
> > pN...@linux-nfs.org
> > http://linux-nfs.org/cgi-bin/mailman/listinfo/pnfs
>
> ext3 fs exported on the server
> client mount -t nfs4 localhost:/ /mnt/localhost
> $ strace git status
> ...
> open(".git/objects/pack", O_RDONLY|O_NONBLOCK|O_DIRECTORY|0x80000) = -1 EISDIR (Is a directory)

Interesting. It is a directory, indeed, but why the hell does that call
fail with -EISDIR?

Does that happen with nfsv3 or is that v4-only? I'm going to set up v4
server and client and see what happens, but that information could be
useful...

Benny Halevy

unread,
Mar 25, 2010, 7:00:03 AM3/25/10
to
On Mar. 25, 2010, 12:49 +0200, Al Viro <vi...@ZenIV.linux.org.uk> wrote:
> On Thu, Mar 25, 2010 at 12:22:31PM +0200, Benny Halevy wrote:
>> On Mar. 25, 2010, 12:12 +0200, Al Viro <vi...@ZenIV.linux.org.uk> wrote:
>>> On Thu, Mar 25, 2010 at 11:39:38AM +0200, Boaz Harrosh wrote:
>>>
>>>> It makes no difference, fails just the same. Would an "strace" help?
>>> It might, especially if you ran it for identical repositories on local
>>> fs and on NFS; at least that way it would be possible to see where do
>>> they diverge...
>>> _______________________________________________
>>> pNFS mailing list
>>> pN...@linux-nfs.org
>>> http://linux-nfs.org/cgi-bin/mailman/listinfo/pnfs
>> ext3 fs exported on the server
>> client mount -t nfs4 localhost:/ /mnt/localhost
>> $ strace git status
>> ...
>> open(".git/objects/pack", O_RDONLY|O_NONBLOCK|O_DIRECTORY|0x80000) = -1 EISDIR (Is a directory)
>
> Interesting. It is a directory, indeed, but why the hell does that call
> fail with -EISDIR?
>
> Does that happen with nfsv3 or is that v4-only? I'm going to set up v4
> server and client and see what happens, but that information could be
> useful...

We've seen this with v4 only so far.

BTW I added this WARN_ON:
@@ -1656,8 +1659,10 @@ static struct file *do_last(struct nameidata *nd, struct path *path,
if (path->dentry->d_inode->i_op->follow_link)
return NULL;
error = -ENOTDIR;
- if (*want_dir && !path->dentry->d_inode->i_op->lookup)
+ if (*want_dir && !path->dentry->d_inode->i_op->lookup) {
+ WARN_ON(1);
goto exit_dput;
+ }
path_to_nameidata(path, nd);
audit_inode(pathname, nd->path.dentry);
goto ok;

but it is NOT tripping for this scenario.

(for some reason I saw it tripping when building the kernel over nfs
but it's benign)

Benny

Al Viro

unread,
Mar 25, 2010, 7:00:03 AM3/25/10
to
On Thu, Mar 25, 2010 at 10:12:31AM +0000, Al Viro wrote:
> On Thu, Mar 25, 2010 at 11:39:38AM +0200, Boaz Harrosh wrote:
>
> > It makes no difference, fails just the same. Would an "strace" help?
>
> It might, especially if you ran it for identical repositories on local
> fs and on NFS; at least that way it would be possible to see where do
> they diverge...

I wonder... What happens if you add
if (error == -EISDIR)
printk("blah: %s", pathname);
right after do_lookup() call in do_last()? That would separate -EISDIR
coming from NFS from the same thing coming from fs/namei.c...

Al Viro

unread,
Mar 25, 2010, 7:10:01 AM3/25/10
to
On Thu, Mar 25, 2010 at 12:56:40PM +0200, Benny Halevy wrote:
> - if (*want_dir && !path->dentry->d_inode->i_op->lookup)
> + if (*want_dir && !path->dentry->d_inode->i_op->lookup) {
> + WARN_ON(1);
> goto exit_dput;
> + }
> path_to_nameidata(path, nd);
> audit_inode(pathname, nd->path.dentry);
> goto ok;
>
> but it is NOT tripping for this scenario.
>
> (for some reason I saw it tripping when building the kernel over nfs
> but it's benign)

It's ENOTDIR, not EISDIR, anyway. Happens if you ask to open foo/ or
foo with O_DIRECTORY when foo is not a directory.

Benny Halevy

unread,
Mar 25, 2010, 7:20:01 AM3/25/10
to
On Mar. 25, 2010, 13:12 +0200, Benny Halevy <bha...@panasas.com> wrote:

> On Mar. 25, 2010, 13:00 +0200, Al Viro <vi...@ZenIV.linux.org.uk> wrote:
>> On Thu, Mar 25, 2010 at 12:56:40PM +0200, Benny Halevy wrote:
>>> - if (*want_dir && !path->dentry->d_inode->i_op->lookup)
>>> + if (*want_dir && !path->dentry->d_inode->i_op->lookup) {
>>> + WARN_ON(1);
>>> goto exit_dput;
>>> + }
>>> path_to_nameidata(path, nd);
>>> audit_inode(pathname, nd->path.dentry);
>>> goto ok;
>>>
>>> but it is NOT tripping for this scenario.
>>>
>>> (for some reason I saw it tripping when building the kernel over nfs
>>> but it's benign)
>> It's ENOTDIR, not EISDIR, anyway. Happens if you ask to open foo/ or
>> foo with O_DIRECTORY when foo is not a directory.
>
> Hmm, not according to my strace (or am I blind today? :)

Duh, you were referring to the hunk I added a WARN_ON to.
Sure. you're right!

>
>>> open(".git/objects/pack", O_RDONLY|O_NONBLOCK|O_DIRECTORY|0x80000) = -1 EISDIR (Is a directory)
>

> The odd thing is the app incorrectly gets EISDIR despite
> the fact that it is opening a directory with O_DIRECTORY.


> _______________________________________________
> pNFS mailing list
> pN...@linux-nfs.org
> http://linux-nfs.org/cgi-bin/mailman/listinfo/pnfs

Benny Halevy

unread,
Mar 25, 2010, 7:20:01 AM3/25/10
to
On Mar. 25, 2010, 13:00 +0200, Al Viro <vi...@ZenIV.linux.org.uk> wrote:
> On Thu, Mar 25, 2010 at 12:56:40PM +0200, Benny Halevy wrote:
>> - if (*want_dir && !path->dentry->d_inode->i_op->lookup)
>> + if (*want_dir && !path->dentry->d_inode->i_op->lookup) {
>> + WARN_ON(1);
>> goto exit_dput;
>> + }
>> path_to_nameidata(path, nd);
>> audit_inode(pathname, nd->path.dentry);
>> goto ok;
>>
>> but it is NOT tripping for this scenario.
>>
>> (for some reason I saw it tripping when building the kernel over nfs
>> but it's benign)
>
> It's ENOTDIR, not EISDIR, anyway. Happens if you ask to open foo/ or
> foo with O_DIRECTORY when foo is not a directory.

Hmm, not according to my strace (or am I blind today? :)

> > open(".git/objects/pack", O_RDONLY|O_NONBLOCK|O_DIRECTORY|0x80000) = -1 EISDIR (Is a directory)

The odd thing is the app incorrectly gets EISDIR despite


the fact that it is opening a directory with O_DIRECTORY.

Benny Halevy

unread,
Mar 25, 2010, 7:20:02 AM3/25/10
to
On Mar. 25, 2010, 12:54 +0200, Al Viro <vi...@ZenIV.linux.org.uk> wrote:
> On Thu, Mar 25, 2010 at 10:12:31AM +0000, Al Viro wrote:
>> On Thu, Mar 25, 2010 at 11:39:38AM +0200, Boaz Harrosh wrote:
>>
>>> It makes no difference, fails just the same. Would an "strace" help?
>> It might, especially if you ran it for identical repositories on local
>> fs and on NFS; at least that way it would be possible to see where do
>> they diverge...
>
> I wonder... What happens if you add
> if (error == -EISDIR)
> printk("blah: %s", pathname);

I'll add that check too.

Benny

> right after do_lookup() call in do_last()? That would separate -EISDIR
> coming from NFS from the same thing coming from fs/namei.c...

> _______________________________________________
> pNFS mailing list
> pN...@linux-nfs.org
> http://linux-nfs.org/cgi-bin/mailman/listinfo/pnfs

Al Viro

unread,
Mar 25, 2010, 8:00:02 AM3/25/10
to
On Thu, Mar 25, 2010 at 10:49:24AM +0000, Al Viro wrote:

> Does that happen with nfsv3 or is that v4-only? I'm going to set up v4
> server and client and see what happens, but that information could be
> useful...

*grumble*
I'd managed to forget that rpc.imapd in its infinite wisdom relies on
the FPOS I normally disable. Oh, well... rebuilding testbox kernel
with dnotify enabled ;-/

Benny Halevy

unread,
Mar 25, 2010, 8:10:01 AM3/25/10
to
On Mar. 25, 2010, 12:54 +0200, Al Viro <vi...@ZenIV.linux.org.uk> wrote:
> On Thu, Mar 25, 2010 at 10:12:31AM +0000, Al Viro wrote:
>> On Thu, Mar 25, 2010 at 11:39:38AM +0200, Boaz Harrosh wrote:
>>
>>> It makes no difference, fails just the same. Would an "strace" help?
>> It might, especially if you ran it for identical repositories on local
>> fs and on NFS; at least that way it would be possible to see where do
>> they diverge...
>
> I wonder... What happens if you add
> if (error == -EISDIR)
> printk("blah: %s", pathname);
> right after do_lookup() call in do_last()? That would separate -EISDIR
> coming from NFS from the same thing coming from fs/namei.c...

Bingo.
It is returned from do_lookup.

@@ -1648,6 +1654,8 @@ static struct file *do_last(struct nameidata *nd, struct path *path,
/* just plain open? */
if (!(open_flag & O_CREAT)) {


error = do_lookup(nd, &nd->last, path);

+ if (error == -EISDIR)
+ printk("%s: do_lookup returned -EISDIR %s\n", __func__, pathname);
if (error)
goto exit;
error = -ENOENT;


> _______________________________________________
> pNFS mailing list
> pN...@linux-nfs.org
> http://linux-nfs.org/cgi-bin/mailman/listinfo/pnfs

Benny Halevy

unread,
Mar 25, 2010, 8:20:02 AM3/25/10
to
On Mar. 25, 2010, 14:07 +0200, Benny Halevy <bha...@panasas.com> wrote:
> On Mar. 25, 2010, 12:54 +0200, Al Viro <vi...@ZenIV.linux.org.uk> wrote:
>> On Thu, Mar 25, 2010 at 10:12:31AM +0000, Al Viro wrote:
>>> On Thu, Mar 25, 2010 at 11:39:38AM +0200, Boaz Harrosh wrote:
>>>
>>>> It makes no difference, fails just the same. Would an "strace" help?
>>> It might, especially if you ran it for identical repositories on local
>>> fs and on NFS; at least that way it would be possible to see where do
>>> they diverge...
>> I wonder... What happens if you add
>> if (error == -EISDIR)
>> printk("blah: %s", pathname);
>> right after do_lookup() call in do_last()? That would separate -EISDIR
>> coming from NFS from the same thing coming from fs/namei.c...
>
> Bingo.
> It is returned from do_lookup.

Indeed this error is coming from the server:

nfsd_dispatch: vers 4 proc 1
nfsv4 compound op #1/7: 22 (OP_PUTFH)
nfsd: fh_verify(16: 01010001 00000000 000e6592 345b9f25 00000000 00000000)
nfsv4 compound op ffff880076734078 opcnt 7 #1: 22: status 0
nfsv4 compound op #2/7: 32 (OP_SAVEFH)
nfsv4 compound op ffff880076734078 opcnt 7 #2: 32: status 0
nfsv4 compound op #3/7: 18 (OP_OPEN)
NFSD: nfsd4_open filename pack op_stateowner (null)
renewing client (clientid 4bab503e/00000002)
nfsd: nfsd_lookup(fh 16: 01010001 00000000 000e6592 345b9f25 00000000 00000000, pack)
nfsd: fh_verify(16: 01010001 00000000 000e6592 345b9f25 00000000 00000000)
nfsd: fh_compose(exp 08:05/106497 objects/pack, ino=943508)
nfsd: fh_verify(16: 01010001 00000000 000e6594 345b9f26 00000000 00000000)
nfsv4 compound op ffff880076734078 opcnt 7 #3: 18: status 21
nfsv4 compound returned 21

Al Viro

unread,
Mar 25, 2010, 9:10:03 AM3/25/10
to
On Thu, Mar 25, 2010 at 02:18:56PM +0200, Benny Halevy wrote:

> Indeed this error is coming from the server:
>
> nfsd_dispatch: vers 4 proc 1
> nfsv4 compound op #1/7: 22 (OP_PUTFH)
> nfsd: fh_verify(16: 01010001 00000000 000e6592 345b9f25 00000000 00000000)
> nfsv4 compound op ffff880076734078 opcnt 7 #1: 22: status 0
> nfsv4 compound op #2/7: 32 (OP_SAVEFH)
> nfsv4 compound op ffff880076734078 opcnt 7 #2: 32: status 0
> nfsv4 compound op #3/7: 18 (OP_OPEN)
> NFSD: nfsd4_open filename pack op_stateowner (null)
> renewing client (clientid 4bab503e/00000002)
> nfsd: nfsd_lookup(fh 16: 01010001 00000000 000e6592 345b9f25 00000000 00000000, pack)
> nfsd: fh_verify(16: 01010001 00000000 000e6592 345b9f25 00000000 00000000)
> nfsd: fh_compose(exp 08:05/106497 objects/pack, ino=943508)
> nfsd: fh_verify(16: 01010001 00000000 000e6594 345b9f26 00000000 00000000)
> nfsv4 compound op ffff880076734078 opcnt 7 #3: 18: status 21
> nfsv4 compound returned 21

Ho-hum... So it hits the "let's try to open it atomically" path and
gets told to FOAD by server (as it should, of course).

And if we see different behaviour after ls -l, presumably that's a
difference between ->lookup() and ->d_revalidate() paths on client...

OK, I think I see what's going on in this case. However, it doesn't
explain everything; my current theory is that we used to get LOOKUP_DIRECTORY
on the last components in O_DIRECTORY opens and we don't do that now.
That used to derail the is_atomic_open(), now it's hit and there we go.

It's not hard to verify (and it might take care of this testcase), but
I still have questions about the way this code used to work *without*
O_DIRECTORY.

Let's try this: before do_lookup() call there add
if (*want_dir)
nd->flags |= LOOKUP_DIRECTORY;
and see how does it behave.

However, even if it does help, it doesn't explain everything. Normal
open() on a directory without O_DIRECTORY if flags shouldn't fail with
-EISDIR. How did that manage to avoid it all along?

Boaz Harrosh

unread,
Mar 25, 2010, 9:10:03 AM3/25/10
to
On 03/25/2010 01:55 PM, Al Viro wrote:
> On Thu, Mar 25, 2010 at 10:49:24AM +0000, Al Viro wrote:
>
>> Does that happen with nfsv3 or is that v4-only? I'm going to set up v4
>> server and client and see what happens, but that information could be
>> useful...
>
> *grumble*
> I'd managed to forget that rpc.imapd in its infinite wisdom relies on
> the FPOS I normally disable. Oh, well... rebuilding testbox kernel
> with dnotify enabled ;-/

I just wanted to make sure, on top of 2.6.34-rc2 i did:
git revert 781b16775ba0bb55fac0e1757bf0bd87c8879632
Revert "Fix a dumb typo - use of & instead of &&"
and:
git revert 1f36f774b22a0ceb7dd33eca626746c81a97b6a5
Revert "Switch !O_CREAT case to use of do_last()"

The revert is now clean and the problem does not show.
Are you able to reproduce this? Anything I can do to help?

Thanks
Boaz

Boaz Harrosh

unread,
Mar 25, 2010, 9:20:02 AM3/25/10
to
On 03/25/2010 03:00 PM, Boaz Harrosh wrote:
> On 03/25/2010 01:55 PM, Al Viro wrote:
>> On Thu, Mar 25, 2010 at 10:49:24AM +0000, Al Viro wrote:
>>
>>> Does that happen with nfsv3 or is that v4-only? I'm going to set up v4
>>> server and client and see what happens, but that information could be
>>> useful...
>>
>> *grumble*
>> I'd managed to forget that rpc.imapd in its infinite wisdom relies on
>> the FPOS I normally disable. Oh, well... rebuilding testbox kernel
>> with dnotify enabled ;-/
>
> I just wanted to make sure, on top of 2.6.34-rc2 i did:
> git revert 781b16775ba0bb55fac0e1757bf0bd87c8879632
> Revert "Fix a dumb typo - use of & instead of &&"
> and:
> git revert 1f36f774b22a0ceb7dd33eca626746c81a97b6a5
> Revert "Switch !O_CREAT case to use of do_last()"
>
> The revert is now clean and the problem does not show.
> Are you able to reproduce this? Anything I can do to help?
>

I'm not sure I mentioned this. If you are able to reproduce the
error, then after the error in "git status" the tree becomes unusable.
Not even from the local_mount. I have not found a method to fix the tree.
So what I do is:
- Before I start playing I save "cp some_tree/.git/index index.back"
- then after the error and tree damage On local mount I:
"rm some_tree/.git/index"
"cp index.back cp some_tree/.git/index"
"cd some_tree; git status"

this will fix it so I can try a new iteration

Boaz Harrosh

unread,
Mar 25, 2010, 9:40:03 AM3/25/10
to

Yes this fixes it!!
2.6.34-rc2 plus above, now works, horay. (diff attached)

> and see how does it behave.
>
> However, even if it does help, it doesn't explain everything. Normal
> open() on a directory without O_DIRECTORY if flags shouldn't fail with
> -EISDIR. How did that manage to avoid it all along?

---
diff --git a/fs/namei.c b/fs/namei.c
index 1c0fca6..434ad2a 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1647,6 +1647,8 @@ static struct file *do_last(struct nameidata *nd, struct path *path,



/* just plain open? */
if (!(open_flag & O_CREAT)) {

+ if (*want_dir)
+ nd->flags |= LOOKUP_DIRECTORY;


error = do_lookup(nd, &nd->last, path);

if (error)
goto exit;

Al Viro

unread,
Mar 25, 2010, 9:40:02 AM3/25/10
to
On Thu, Mar 25, 2010 at 03:30:22PM +0200, Boaz Harrosh wrote:
> > Let's try this: before do_lookup() call there add
> > if (*want_dir)
> > nd->flags |= LOOKUP_DIRECTORY;
>
> Yes this fixes it!!
> 2.6.34-rc2 plus above, now works, horay. (diff attached)
>
> > and see how does it behave.
> >
> > However, even if it does help, it doesn't explain everything. Normal
> > open() on a directory without O_DIRECTORY if flags shouldn't fail with
> > -EISDIR. How did that manage to avoid it all along?

Does open() of directory _without_ O_DIRECTORY work in e.g. vanilla 2.6.33?
It certainly does for local filesystems and it does for NFSv3; does it work
for NFSv4?

Boaz Harrosh

unread,
Mar 25, 2010, 9:50:03 AM3/25/10
to
On 03/25/2010 03:37 PM, Al Viro wrote:
> On Thu, Mar 25, 2010 at 03:30:22PM +0200, Boaz Harrosh wrote:
>>> Let's try this: before do_lookup() call there add
>>> if (*want_dir)
>>> nd->flags |= LOOKUP_DIRECTORY;
>>
>> Yes this fixes it!!
>> 2.6.34-rc2 plus above, now works, horay. (diff attached)
>>
>>> and see how does it behave.
>>>
>>> However, even if it does help, it doesn't explain everything. Normal
>>> open() on a directory without O_DIRECTORY if flags shouldn't fail with
>>> -EISDIR. How did that manage to avoid it all along?
>
> Does open() of directory _without_ O_DIRECTORY work in e.g. vanilla 2.6.33?
> It certainly does for local filesystems and it does for NFSv3; does it work
> for NFSv4?

In my tests. Every thing is the same safe the client with the above change.

So I guess NFSv4 does something different when asked for directory lookup
as opposed to files lookup. I guess there is something added/removed to
the compound depending on that flag. But I wouldn't know, I am not familiar
with this code. NFSv4 someone?

Boaz

Benny Halevy

unread,
Mar 25, 2010, 10:00:02 AM3/25/10
to
On Mar. 25, 2010, 15:37 +0200, Al Viro <vi...@ZenIV.linux.org.uk> wrote:
> On Thu, Mar 25, 2010 at 03:30:22PM +0200, Boaz Harrosh wrote:
>>> Let's try this: before do_lookup() call there add
>>> if (*want_dir)
>>> nd->flags |= LOOKUP_DIRECTORY;
>> Yes this fixes it!!
>> 2.6.34-rc2 plus above, now works, horay. (diff attached)
>>
>>> and see how does it behave.
>>>
>>> However, even if it does help, it doesn't explain everything. Normal
>>> open() on a directory without O_DIRECTORY if flags shouldn't fail with
>>> -EISDIR. How did that manage to avoid it all along?
>
> Does open() of directory _without_ O_DIRECTORY work in e.g. vanilla 2.6.33?
> It certainly does for local filesystems and it does for NFSv3; does it work
> for NFSv4?

No, it doesn't.

# mount localhost:/usr0/nfs4export /mnt/localhost; strace cat /mnt/localhost/server 2>&1 | grep 'open.*server'; umount /mnt/localhost
open("/mnt/localhost/server", O_RDONLY) = 3

# mount -t nfs4 localhost:/ /mnt/localhost; strace cat /mnt/localhost/server 2>&1 | grep 'open.*server'; umount /mnt/localhost
open("/mnt/localhost/server", O_RDONLY) = -1 EISDIR (Is a directory)

Al Viro

unread,
Mar 25, 2010, 10:10:02 AM3/25/10
to
On Thu, Mar 25, 2010 at 03:45:53PM +0200, Boaz Harrosh wrote:
> > Does open() of directory _without_ O_DIRECTORY work in e.g. vanilla 2.6.33?
> > It certainly does for local filesystems and it does for NFSv3; does it work
> > for NFSv4?
>
> In my tests. Every thing is the same safe the client with the above change.
>
> So I guess NFSv4 does something different when asked for directory lookup
> as opposed to files lookup. I guess there is something added/removed to
> the compound depending on that flag. But I wouldn't know, I am not familiar
> with this code. NFSv4 someone?

OK, what happens if you do the following:

mount the same fs from two clients
on one client:
mkdir /mnt/weird_name_69
on another:
echo 'main() {open("/mnt/weird_name_69", 0);}' >/tmp/a.c
gcc /tmp/a.c
strace ./a.out
ls -l /mnt/weird_name_69
strace ./a.out

Will the first strace show EISDIR and the second succeed?

From my reading of that code (2.6.33, before all that stuff got merged),
we have different behaviour depending on which codepath do we hit.
If we go through ->d_revalidate(), it sees that it's not S_ISREG() and
doesn't try to play with atomic open. If we go through ->lookup(), we
tell the server to open it, and when it tells us to bugger off (it's a
directory, NFSv4 doesn't support atomic open for those), -EISDIR is
passed to caller. Which leads to open() failing.

It definitely looks like a bug. Masked by O_DIRECTORY in 2.6.33. Bug
in fs/namei.c patch has exposed that crap both for O_DIRECTORY and !O_DIRECTORY
cases.

So immediate fix will need to be along the lines of "add LOOKUP_DIRECTORY
even on the last step if we have *want_dir" (and I'd probably get rid of
want_dir then and just abuse nd->flags), but there's a real NFS bug as
well.

Al Viro

unread,
Mar 25, 2010, 10:10:03 AM3/25/10
to
On Thu, Mar 25, 2010 at 03:52:25PM +0200, Benny Halevy wrote:
> On Mar. 25, 2010, 15:37 +0200, Al Viro <vi...@ZenIV.linux.org.uk> wrote:
> > On Thu, Mar 25, 2010 at 03:30:22PM +0200, Boaz Harrosh wrote:
> >>> Let's try this: before do_lookup() call there add
> >>> if (*want_dir)
> >>> nd->flags |= LOOKUP_DIRECTORY;
> >> Yes this fixes it!!
> >> 2.6.34-rc2 plus above, now works, horay. (diff attached)
> >>
> >>> and see how does it behave.
> >>>
> >>> However, even if it does help, it doesn't explain everything. Normal
> >>> open() on a directory without O_DIRECTORY if flags shouldn't fail with
> >>> -EISDIR. How did that manage to avoid it all along?
> >
> > Does open() of directory _without_ O_DIRECTORY work in e.g. vanilla 2.6.33?
> > It certainly does for local filesystems and it does for NFSv3; does it work
> > for NFSv4?
>
> No, it doesn't.
>
> # mount localhost:/usr0/nfs4export /mnt/localhost; strace cat /mnt/localhost/server 2>&1 | grep 'open.*server'; umount /mnt/localhost
> open("/mnt/localhost/server", O_RDONLY) = 3
>
> # mount -t nfs4 localhost:/ /mnt/localhost; strace cat /mnt/localhost/server 2>&1 | grep 'open.*server'; umount /mnt/localhost
> open("/mnt/localhost/server", O_RDONLY) = -1 EISDIR (Is a directory)

Gets better - if you do ls -l /mnt/localhost/server and then repeat that
open(), it'll succeed.

Benny Halevy

unread,
Mar 25, 2010, 10:10:04 AM3/25/10
to
On Mar. 25, 2010, 16:06 +0200, Al Viro <vi...@ZenIV.linux.org.uk> wrote:
> On Thu, Mar 25, 2010 at 03:52:25PM +0200, Benny Halevy wrote:
>> On Mar. 25, 2010, 15:37 +0200, Al Viro <vi...@ZenIV.linux.org.uk> wrote:
>>> On Thu, Mar 25, 2010 at 03:30:22PM +0200, Boaz Harrosh wrote:
>>>>> Let's try this: before do_lookup() call there add
>>>>> if (*want_dir)
>>>>> nd->flags |= LOOKUP_DIRECTORY;
>>>> Yes this fixes it!!
>>>> 2.6.34-rc2 plus above, now works, horay. (diff attached)
>>>>
>>>>> and see how does it behave.
>>>>>
>>>>> However, even if it does help, it doesn't explain everything. Normal
>>>>> open() on a directory without O_DIRECTORY if flags shouldn't fail with
>>>>> -EISDIR. How did that manage to avoid it all along?
>>> Does open() of directory _without_ O_DIRECTORY work in e.g. vanilla 2.6.33?
>>> It certainly does for local filesystems and it does for NFSv3; does it work
>>> for NFSv4?
>> No, it doesn't.
>>
>> # mount localhost:/usr0/nfs4export /mnt/localhost; strace cat /mnt/localhost/server 2>&1 | grep 'open.*server'; umount /mnt/localhost
>> open("/mnt/localhost/server", O_RDONLY) = 3
>>
>> # mount -t nfs4 localhost:/ /mnt/localhost; strace cat /mnt/localhost/server 2>&1 | grep 'open.*server'; umount /mnt/localhost
>> open("/mnt/localhost/server", O_RDONLY) = -1 EISDIR (Is a directory)
>
> Gets better - if you do ls -l /mnt/localhost/server and then repeat that
> open(), it'll succeed.

Correct :)

Boaz Harrosh

unread,
Mar 25, 2010, 10:30:02 AM3/25/10
to

NFS bug, you mean by not being consistent when opening a directory from
cache or when having to fetch it for the first time? I guess...

But I still think it is a problem that the application, git in this case,
did an open(,O_DIRECTORY...) and because of code reuse some low-level did
not see that flag, relaying on the fact that in most systems they don't care.

The nd->flags things looks like the right thing to me.

And about the NFS thing, let misbehaving applications byte there own bullets.
But a good app like git, does put an O_DIRECTORY on the directory open, and
therefore will receive consistent results, right?

Please push a fix, meanwhile I'm running with this one, but I will need it
in linux-next later.

Thanks for everything
Boaz

Benny Halevy

unread,
Mar 25, 2010, 10:40:01 AM3/25/10
to

So, previously (pre 1f36f77), do_last called do_path_lookup with
lookup_flags(open_flag)|LOOKUP_OPEN and lookup_flags did
- if (f & O_DIRECTORY)
- retval |= LOOKUP_DIRECTORY;

and this was dropped in 1f36f77, right?

Benny

> _______________________________________________
> pNFS mailing list
> pN...@linux-nfs.org
> http://linux-nfs.org/cgi-bin/mailman/listinfo/pnfs

Al Viro

unread,
Mar 25, 2010, 11:30:02 AM3/25/10
to
On Thu, Mar 25, 2010 at 04:27:13PM +0200, Boaz Harrosh wrote:

> And about the NFS thing, let misbehaving applications byte there own bullets.
> But a good app like git, does put an O_DIRECTORY on the directory open, and
> therefore will receive consistent results, right?

What the hell do you mean, "misbehaving"? O_DIRECTORY means one thing:
"refuse to open if it turns out to be a non-directory". If open() succeed
with it and fails without, it's a definite bug. Application has every
right to call the damn thing without that flag.

It *can* be used as a hint by fs code, and it's a useful hint, so yes,
we need to restore LOOKUP_DIRECTORY (until we get rid of the entire
d_revalidate/lookup+open mess and pass struct file + open flags directly
to fs method for the last step of pathname resolution; basically, most
of the do_last() guts would become that method, killing the intents
crap for good; new method would simply check for O_DIRECTORY presence
instead of looking at LOOKUP_DIRECTORY).

However, NFS behaviour in cases when O_DIRECTORY is absent is a separate
(and quite real) bug. If nfs_atomic_lookup() sees -EISDIR from attempt
to open on server, it should either issue plain lookup or try to use the
information it got in response to open attempt (if there's enough of it
there). Passing -EISDIR to ->lookup() caller is simply wrong.

I'm going to send a fix for O_DIRECTORY case (restoring the behaviour
we had in 2.6.33) today, but NFS side of things also needs to be dealt
with.

Trond Myklebust

unread,
Mar 25, 2010, 11:50:02 AM3/25/10
to
On Thu, 2010-03-25 at 11:13 +0200, Boaz Harrosh wrote:
> On 03/24/2010 08:10 PM, Trond Myklebust wrote:
> > On Wed, 2010-03-24 at 14:02 -0400, Trond Myklebust wrote:
> >> I'd say you have a corruption issue either on the server side or on your
> >> client.
> >
>
> It is surely a corruption on the client. I've tested with an old server
> and it fails just the same.

Yes, but we should not be seeing crap like that in the XDR decoding of a
revalidate request. I strongly suspect the server should be returning an
error here, probably NFS4ERR_BAD_HANDLE... After we fix the client side,
I'd like to take a look at the server and see why you are getting bogus
info for your GETATTR.

Trond

Boaz Harrosh

unread,
Mar 25, 2010, 1:30:02 PM3/25/10
to
On 03/25/2010 05:25 PM, Al Viro wrote:
> On Thu, Mar 25, 2010 at 04:27:13PM +0200, Boaz Harrosh wrote:
>
>> And about the NFS thing, let misbehaving applications byte there own bullets.
>> But a good app like git, does put an O_DIRECTORY on the directory open, and
>> therefore will receive consistent results, right?
>
> What the hell do you mean, "misbehaving"? O_DIRECTORY means one thing:
> "refuse to open if it turns out to be a non-directory". If open() succeed
> with it and fails without, it's a definite bug. Application has every
> right to call the damn thing without that flag.
>

Ha, OK, sorry I thought exactly the opposite. I assumed it meant
"don't allow a directory to be opened, unless set". As an application
hint of saying I'm going to do file things to this handle, don't want
any dirs.

So yes definitely it looks like an NFS bug masked by the fact that everyone
is using opendir() form libc for this.

> It *can* be used as a hint by fs code, and it's a useful hint, so yes,
> we need to restore LOOKUP_DIRECTORY (until we get rid of the entire
> d_revalidate/lookup+open mess and pass struct file + open flags directly
> to fs method for the last step of pathname resolution; basically, most
> of the do_last() guts would become that method, killing the intents
> crap for good; new method would simply check for O_DIRECTORY presence
> instead of looking at LOOKUP_DIRECTORY).
>
> However, NFS behaviour in cases when O_DIRECTORY is absent is a separate
> (and quite real) bug. If nfs_atomic_lookup() sees -EISDIR from attempt
> to open on server, it should either issue plain lookup or try to use the
> information it got in response to open attempt (if there's enough of it
> there). Passing -EISDIR to ->lookup() caller is simply wrong.
>
> I'm going to send a fix for O_DIRECTORY case (restoring the behaviour
> we had in 2.6.33) today, but NFS side of things also needs to be dealt
> with.

Thank you Al for fixing this, I hope some capable NFS person will take
that issue to heart.

Boaz

Trond Myklebust

unread,
Mar 25, 2010, 2:00:02 PM3/25/10
to
On Thu, 2010-03-25 at 19:28 +0200, Boaz Harrosh wrote:
> On 03/25/2010 05:25 PM, Al Viro wrote:

> > I'm going to send a fix for O_DIRECTORY case (restoring the behaviour
> > we had in 2.6.33) today, but NFS side of things also needs to be dealt
> > with.
>
> Thank you Al for fixing this, I hope some capable NFS person will take
> that issue to heart.

The NFSv4 fix is a trivial 1 liner. Pushed to bugfixes.

Now to understand that screwed up xdr decode...

Trond
------------------------------------------------------------------------------------------
NFSv4: Fall back to ordinary lookup if nfs4_atomic_open() returns EISDIR
From: Trond Myklebust <Trond.M...@netapp.com>

Signed-off-by: Trond Myklebust <Trond.M...@netapp.com>
---

fs/nfs/dir.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)


diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index c6f2750..be46f26 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -1025,12 +1025,12 @@ static struct dentry *nfs_atomic_lookup(struct inode *dir, struct dentry *dentry
res = NULL;
goto out;
/* This turned out not to be a regular file */
+ case -EISDIR:
case -ENOTDIR:
goto no_open;
case -ELOOP:
if (!(nd->intent.open.flags & O_NOFOLLOW))
goto no_open;
- /* case -EISDIR: */
/* case -EINVAL: */
default:
goto out;

Boaz Harrosh

unread,
Mar 25, 2010, 2:10:02 PM3/25/10
to
On 03/25/2010 07:59 PM, Trond Myklebust wrote:
> On Thu, 2010-03-25 at 19:28 +0200, Boaz Harrosh wrote:
>> On 03/25/2010 05:25 PM, Al Viro wrote:
>
>>> I'm going to send a fix for O_DIRECTORY case (restoring the behaviour
>>> we had in 2.6.33) today, but NFS side of things also needs to be dealt
>>> with.
>>
>> Thank you Al for fixing this, I hope some capable NFS person will take
>> that issue to heart.
>
> The NFSv4 fix is a trivial 1 liner. Pushed to bugfixes.
>

Thanks

> Now to understand that screwed up xdr decode...
>

Tell me if you need that I run any tests. I still have that
failing setup if you need it.

> Trond

Boaz

Trond Myklebust

unread,
Mar 25, 2010, 2:20:02 PM3/25/10
to
On Thu, 2010-03-25 at 20:06 +0200, Boaz Harrosh wrote:
> On 03/25/2010 07:59 PM, Trond Myklebust wrote:
> > Now to understand that screwed up xdr decode...
> >
>
> Tell me if you need that I run any tests. I still have that
> failing setup if you need it.

Do you have a wireshark (binary) dump that includes that particular
getattr call and reply? That would help.

Trond

Boaz Harrosh

unread,
Mar 25, 2010, 2:40:02 PM3/25/10
to
On 03/25/2010 08:18 PM, Trond Myklebust wrote:
> On Thu, 2010-03-25 at 20:06 +0200, Boaz Harrosh wrote:
>> On 03/25/2010 07:59 PM, Trond Myklebust wrote:
>>> Now to understand that screwed up xdr decode...
>>>
>>
>> Tell me if you need that I run any tests. I still have that
>> failing setup if you need it.
>
> Do you have a wireshark (binary) dump that includes that particular
> getattr call and reply? That would help.
>

Wahoo sorry, it's already 20:31 out here It'll have to wait til Sunday
(You know that Hebrew weekend).

Tell me if you still need it sure, NP, I will make one first thing Sunday
morning.

> Trond
>

Boaz

0 new messages