
kernel nfsd


Stephan von Krawczynski

Mar 18, 2003, 9:58:47 AM
to linux-kernel, Trond Myklebust
Hello Trond, hello all,

can you explain what this means:

kernel: nfsd-fh: found a name that I didn't expect: <filename>

Should something be done about it, or is it simply informative?

It comes up quite often on a 2.4.20-kernel-based NFS server. The exported
filesystem is reiserfs, about 500 GB in size.

--
Regards,
Stephan
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majo...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Trond Myklebust

Mar 18, 2003, 10:32:38 AM
to Stephan von Krawczynski, linux-kernel, Neil Brown
>>>>> " " == Stephan von Krawczynski <sk...@ithnet.com> writes:

> Hello Trond, hello all, can you explain what this means:

> kernel: nfsd-fh: found a name that I didn't expect: <filename>

> Should something be done against it, or is it simply
> informative?

The comment in the code just above the printk() reads

/* Now that IS odd. I wonder what it means... */

Looks like you and Neil (and possibly the ReiserFS team) might want to
have a chat...

Cheers,
Trond

Stephan von Krawczynski

Mar 18, 2003, 10:43:47 AM
to trond.m...@fys.uio.no, linux-...@vger.kernel.org, ne...@cse.unsw.edu.au
On Tue, 18 Mar 2003 16:31:43 +0100
Trond Myklebust <trond.m...@fys.uio.no> wrote:

> >>>>> " " == Stephan von Krawczynski <sk...@ithnet.com> writes:
>
> > Hello Trond, hello all, can you explain what this means:
>
> > kernel: nfsd-fh: found a name that I didn't expect: <filename>
>
> > Should something be done against it, or is it simply
> > informative?
>
> The comment in the code just above the printk() reads
>
> /* Now that IS odd. I wonder what it means... */
>
> Looks like you and Neil (and possibly the ReiserFS team) might want to
> have a chat...

I'm all for it. Who has a clue? I have in fact tons of these messages; it's a
pretty large NFS server.

--
Regards,
Stephan

Oleg Drokin

Mar 18, 2003, 11:09:04 AM
to Stephan von Krawczynski, trond.m...@fys.uio.no, linux-...@vger.kernel.org, ne...@cse.unsw.edu.au
Hello!

On Tue, Mar 18, 2003 at 04:42:04PM +0100, Stephan von Krawczynski wrote:

> > The comment in the code just above the printk() reads
> > /* Now that IS odd. I wonder what it means... */
> > Looks like you and Neil (and possibly the ReiserFS team) might want to
> > have a chat...
> I'm all for it. Who has a clue? I have in fact tons of these messages; it's a
> pretty large NFS server.

What is the typical usage pattern for files whose names are printed?
Are they created/deleted often by multiple clients/processes by any chance?

Bye,
Oleg

Stephan von Krawczynski

Mar 18, 2003, 11:29:49 AM
to Oleg Drokin, trond.m...@fys.uio.no, linux-...@vger.kernel.org, ne...@cse.unsw.edu.au
On Tue, 18 Mar 2003 19:07:33 +0300
Oleg Drokin <gr...@namesys.com> wrote:

> Hello!
>
> On Tue, Mar 18, 2003 at 04:42:04PM +0100, Stephan von Krawczynski wrote:
>
> > > The comment in the code just above the printk() reads
> > > /* Now that IS odd. I wonder what it means... */
> > > Looks like you and Neil (and possibly the ReiserFS team) might want to
> > > have a chat...
> > I'm all for it. Who has a clue? I have in fact tons of these messages; it's
> > a pretty large NFS server.
>
> What is the typical usage pattern for files whose names are printed?
> Are they created/deleted often by multiple clients/processes by any chance?

This is an NFS server that serves web servers (Apache). I find a lot of these
messages, but (up to now) they point to only 3 different filenames, and these
are in fact all directories. The box has never crashed and currently has 20 days
of uptime. It is a dual P-III with 6 GB of RAM.
The directories in question were created long before they first showed this
message and have never been renamed. Their contents have possibly changed, but
certainly not often: no more than once a day or once a week.
It may well be that multiple NFS client systems _read_ them, as well as
multiple processes on one client.
The NFS clients are 2.4.19 boxes and one 2.2.21 box.

--
Regards,
Stephan

Stephan von Krawczynski

Mar 18, 2003, 11:42:45 AM
to Stephan von Krawczynski, gr...@namesys.com, trond.m...@fys.uio.no, linux-...@vger.kernel.org, ne...@cse.unsw.edu.au

And one addition:
They are all second-level paths, meaning they look like:

kernel: nfsd-fh: found a name that I didn't expect: libyen2000/pics

(where pics is a directory, too)

Stephan von Krawczynski

Mar 18, 2003, 11:47:19 AM
to Stephan von Krawczynski, gr...@namesys.com, trond.m...@fys.uio.no, linux-...@vger.kernel.org, ne...@cse.unsw.edu.au
On Tue, 18 Mar 2003 17:41:06 +0100

Please ignore this rather silly comment. One should read code before commenting ;-)

Bernd Schubert

Mar 18, 2003, 12:30:12 PM
to Oleg Drokin, Stephan von Krawczynski, trond.m...@fys.uio.no, linux-...@vger.kernel.org, ne...@cse.unsw.edu.au
On Tuesday 18 March 2003 17:07, Oleg Drokin wrote:
> Hello!
>
> On Tue, Mar 18, 2003 at 04:42:04PM +0100, Stephan von Krawczynski wrote:
> > > The comment in the code just above the printk() reads
> > > /* Now that IS odd. I wonder what it means... */
> > > Looks like you and Neil (and possibly the ReiserFS team) might want to
> > > have a chat...
> >
> > I'm all for it. Who has a clue? I have in fact tons of these messages;
> > it's a pretty large NFS server.
>
> What is the typical usage pattern for files whose names are printed?
> Are they created/deleted often by multiple clients/processes by any chance?
>


Hi,

We also sometimes see those messages. In our case it seems to appear rather
often for the local/share/perl directory of our /usr/local directory:

nfsd-fh: found a name that I didn't expect: share/perl

This directory is certainly never deleted when this message appears; in fact,
data is very, very seldom written to it.

Once this message also appeared for a file:
servicetypes/kdeveloplanguagesupport.desktop

I can't tell you how often KDE deletes this file.

Please ask if you need more information.

Bernd

Neil Brown

Mar 18, 2003, 5:11:17 PM
to Stephan von Krawczynski, trond.m...@fys.uio.no, linux-...@vger.kernel.org
On Tuesday March 18, sk...@ithnet.com wrote:
> On Tue, 18 Mar 2003 16:31:43 +0100
> Trond Myklebust <trond.m...@fys.uio.no> wrote:
>
> > >>>>> " " == Stephan von Krawczynski <sk...@ithnet.com> writes:
> >
> > > Hello Trond, hello all, can you explain what this means:
> >
> > > kernel: nfsd-fh: found a name that I didn't expect: <filename>
> >
> > > Should something be done against it, or is it simply
> > > informative?
> >
> > The comment in the code just above the printk() reads
> >
> > /* Now that IS odd. I wonder what it means... */
> >
> > Looks like you and Neil (and possibly the ReiserFS team) might want to
> > have a chat...
>
> I'm all for it. Who has a clue? I have in fact tons of these messages; it's a
> pretty large NFS server.

When knfsd gets a request for a filehandle which refers to an object
that isn't in the dcache, it needs to get it into the dcache. This
involves finding its name and splicing it in.

It gets hold of an inode for the parent directory (don't worry how)
and reads through that directory looking for a name with the right
inode number. When it finds the name, it checks to see that the name
isn't already in the dcache under that directory. As the object with
that name isn't in the dcache, you would expect the name not to be
there either. This message indicates that the name was there.

I think there is enough locking in place so that a race between one
process adding the name and another process looking up the name for an
object should not stumble over each other - both hold i_sem for the
directory. So I don't think that would be the cause.

Maybe this is reiserfs specific. Has anyone seen it on a non-reiserfs
filesystem? Possibly reiserfs does something funny with inode numbers
that is confusing the name lookup.

If it doesn't seem to correlate with other symptoms, I probably
wouldn't worry about it.

2.5 does all this quite differently so shouldn't have the same problem
(it certainly doesn't contain the same error message).

NeilBrown
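
The lookup Neil describes can be pictured with a toy model. This is only a sketch and not the kernel code: the `Directory` class, its `entries` and `dcache` dictionaries, and the `splice_by_inode` function are hypothetical names made up for illustration, and the real knfsd works on dentries and inodes while holding i_sem. The sketch only shows where the "found a name that I didn't expect" branch sits in the logic.

```python
from dataclasses import dataclass, field


@dataclass
class Directory:
    """Toy stand-in for a parent directory (hypothetical, not a kernel structure)."""
    entries: dict = field(default_factory=dict)  # names on disk -> inode number
    dcache: dict = field(default_factory=dict)   # names already cached -> inode number


def splice_by_inode(parent, wanted_ino):
    """Find the name that carries wanted_ino and put it into the toy dcache.

    Returns (name, unexpected). unexpected is True when the name was
    already cached under the parent, which is the case the message reports.
    """
    for name, ino in parent.entries.items():
        if ino != wanted_ino:
            continue
        # The object was not cached, so its name should not be cached either.
        unexpected = name in parent.dcache
        if unexpected:
            print(f"nfsd-fh: found a name that I didn't expect: {name}")
        parent.dcache[name] = ino
        return name, unexpected
    return None, False


if __name__ == "__main__":
    # Assumed example data: "pics" is cached under a different inode number.
    parent = Directory(entries={"pics": 42, "index.html": 43}, dcache={"pics": 7})
    print(splice_by_inode(parent, 43))  # name was not cached: the normal case
    print(splice_by_inode(parent, 42))  # name was cached already: the odd case
```

In this model the odd case needs a cached name whose object is not the one being looked up, which is one way to read Neil's remark about inode numbers confusing the name lookup. Whether that is what happens on reiserfs is not established in this thread.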

Oleg Drokin

Mar 19, 2003, 1:46:51 AM
to Bernd Schubert, Stephan von Krawczynski, trond.m...@fys.uio.no, linux-...@vger.kernel.org, ne...@cse.unsw.edu.au
Hello!

On Tue, Mar 18, 2003 at 06:28:59PM +0100, Bernd Schubert wrote:

> we also sometimes see those messages. In our case it seems to appear rather
> often for the local/share/perl directory of our /usr/local directory:
> nfsd-fh: found a name that I didn't expect: share/perl

Do you also use reiserfs for your /usr/local filesystem?

Bye,
Oleg

Stephan von Krawczynski

Mar 19, 2003, 6:02:15 AM
to Neil Brown, trond.m...@fys.uio.no, linux-...@vger.kernel.org, gr...@namesys.com
On Wed, 19 Mar 2003 09:09:49 +1100
Neil Brown <ne...@cse.unsw.edu.au> wrote:

> Maybe this is reiserfs specific. Has anyone seen it on a non-reiserfs
> filesystem? Possibly reiserfs does something funny with inode numbers
> that is confusing the name lookup.
>
> If it doesn't seem to correlate with other symptoms, I probably
> wouldn't worry about it.

I re-checked the logfile and it looks like the read request (or open request)
is in fact failing, so something should be done. The apache-log looks like:

[Mon Mar 17 22:55:56 2003] [crit] [client w.x.y.z] (17)File exists: /a/b/c/d/e pcfg_openfile: unable to check htaccess file, ensure it is readable

The corresponding nfs message is:

Mar 17 22:55:55 me kernel: nfsd-fh: found a name that I didn't expect: c/d

--
Regards,
Stephan
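
To check whether the two logs line up regularly and not just this once, the entries can be matched by timestamp. A minimal sketch, assuming the two line formats quoted above. The function names and the 2-second window are arbitrary choices for illustration, and the year has to be passed in because syslog lines do not carry one.

```python
import re
from datetime import datetime, timedelta

# Apache error-log line: "[Mon Mar 17 22:55:56 2003] [crit] ..."
APACHE_RE = re.compile(r"^\[(\w{3} \w{3} +\d+ \d\d:\d\d:\d\d \d{4})\] \[crit\]")
# Syslog line: "Mar 17 22:55:55 host kernel: nfsd-fh: found a name ...: c/d"
SYSLOG_RE = re.compile(
    r"^(\w{3} +\d+ \d\d:\d\d:\d\d) \S+ kernel: nfsd-fh: "
    r"found a name that I didn't expect: (\S+)"
)


def apache_time(line):
    """Return the timestamp of an Apache [crit] line, or None."""
    m = APACHE_RE.match(line)
    if not m:
        return None
    return datetime.strptime(m.group(1), "%a %b %d %H:%M:%S %Y")


def nfsd_event(line, year):
    """Return (timestamp, name) for an nfsd-fh syslog line, or None."""
    m = SYSLOG_RE.match(line)
    if not m:
        return None
    when = datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
    return when, m.group(2)


def correlate(apache_lines, syslog_lines, year, window=timedelta(seconds=2)):
    """Pair each Apache [crit] line with nfsd-fh messages close in time."""
    events = [e for e in (nfsd_event(l, year) for l in syslog_lines) if e]
    pairs = []
    for line in apache_lines:
        t = apache_time(line)
        if t is None:
            continue
        for when, name in events:
            if abs(t - when) <= window:
                pairs.append((name, line))
    return pairs
```

A match in time does not prove the kernel message causes the failed open, but a consistent pairing over many entries would make the link harder to dismiss.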

Bernd Schubert

Mar 19, 2003, 6:53:09 AM
to Oleg Drokin, Stephan von Krawczynski, trond.m...@fys.uio.no, linux-...@vger.kernel.org, ne...@cse.unsw.edu.au
On Wednesday 19 March 2003 07:43, Oleg Drokin wrote:
> Hello!
>
> On Tue, Mar 18, 2003 at 06:28:59PM +0100, Bernd Schubert wrote:
> > we also sometimes see those messages. In our case it seems to appear
> > rather often for the local/share/perl directory of our /usr/local
> > directory: nfsd-fh: found a name that I didn't expect: share/perl
>
> Do you also use reiserfs for your /usr/local filesystem?
>

Yes of course, otherwise I wouldn't have replied to this thread.

Best regards,
Bernd

Vladimir Serov

Mar 20, 2003, 11:23:56 AM
to linux-kernel, Trond Myklebust

Hello Trond, hello all,

I'm suffering from a long-standing bug in the NFS client.
This bug causes programs reading from an NFS volume to get stuck in D state
forever.
The bug shows up only when the client talks to an NFS server with 3COM 3C905
NICs (well, I've triggered it with an Intel eepro card too, but you have to
wait) and never with cheap, slower cards like RTLxxxx or NE2000 clones. It
happens infrequently but inevitably. It happens more frequently on the
2.4.17 kernel than on 2.4.21-pre5, and more when compiled with gcc 3.2.1 than
with gcc 2.95.3. Trond's NFS patches don't help on either kernel. It's not due
to packet loss (OK, that happens sometimes, but rarely), and it happens at
both 10 and 100 Mbps. This happens only on my StrongARM board (similar
to Brutus) with SMC's LAN91C111 Ethernet chip. I've not been able to
reproduce this on a PC, but I've heard about a very similar case:
http://www.uwsg.iu.edu/hypermail/linux/kernel/0206.0/0066.html
It is triggered simply by 'ls -lR /home>/dev/null&' and takes about half a
minute to happen.
If I insert a few printk's in the interrupt handler for the NIC, it's gone !!!
IMHO this is due to a race in the NFS client.

Look at some logs from my system:

sh-2.03# mount
rootfs on / type rootfs (rw)
/dev/mtdblock4 on / type jffs2 (rw)
none on /proc type proc (rw)
none on /tmp type tmpfs (rw)
none on /dev/pts type devpts (rw)
infracvs:/group on /group type nfs
(rw,v2,rsize=4096,wsize=4096,soft,intr,udp,lock,addr=infracvs)
serov:/home on /home type nfs
(rw,v3,rsize=4096,wsize=4096,soft,intr,udp,lock,addr=serov)

sh-2.03# ps
PID Uid Stat Command
1 root S init
2 root S [keventd]
3 root S [ksoftirqd_CPU0]
4 root S [kswapd]
5 root S [bdflush]
6 root S [kupdated]
7 root S [mtdblockd]
8 root S [jffs2_gcd_mtd4]
102 root S dhcpcd
111 bin S portmap
113 root S [rpciod]
114 root S [lockd]
124 root S klogd
138 root S /usr/sbin/inetd
143 root S /www/sbin/sshd -f /www/etc/sshd_config
152 root S init
153 root S sh -login -i
158 root D ls -lR /home
159 root D ls -lR /home
179 root D ls -lR /home
183 root R ps

Part of output from Magic SysRq t with decoded symbols:

ls D C001EB8C 3216 165 153 (NOTLB)
Function entered at [<c001e990>] from [<c0109b14>]
schedule __rpc_execute

I've used /proc/sys/sunrpc/rpc_debug and /proc/sys/sunrpc/nfs_debug to
get some info. There was nothing interesting in it except the fact that the RPC
request which was constantly reused after 'ls' got stuck appears in the
following message in the --rqstp- column.
sh-2.03# echo 1 > /proc/sys/sunrpc/rpc_debug
sh-2.03# dmesg -c -s 66666
-pid- proc flgs status -client- -prog- --rqstp- -timeout -rpcwait -action- --exit--
20429 0001 0000 000000 c0eda960 100003 c8f89218 00000000   <NULL> c0105d5c        0
10052 0001 0000 000000 c0eda960 100003 c8f8918c 00000000   <NULL> c0105d5c        0
06851 0001 0000 000000 c0eda960 100003 c8f89100 00000000   <NULL> c0105d5c        0
00673 0004 0000 000000 c0eda960 100003 c8f89074 00000000   <NULL> c0105d5c        0
00368 0000 0081 -00110 c0eda960 100003        0 00003000 nfs_flushd c006e290 c006e3c8
00002 0000 0081 -00110 c0e310a0 100003        0 00003000 nfs_flushd c006e290 c006e3c8

c006e290 t nfs_flushd
c006e3c8 t nfs_flushd_exit
c0105d5c t call_status
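
Matching the -action- addresses in the task list against the symbol lines by hand gets tedious with longer dumps. A small sketch that does the same lookup, assuming the dump starts with the 11-column header shown above and that the symbols come as "address type name" lines. The function names are made up for illustration. The parser splits on whitespace and regroups every 11 tokens, so it does not matter whether the mail client wrapped the rows.

```python
FIELDS = ["pid", "proc", "flgs", "status", "client", "prog",
          "rqstp", "timeout", "rpcwait", "action", "exit"]


def parse_rpc_tasks(dump):
    """Turn an rpc_debug task dump (header first) into a list of dicts."""
    tokens = dump.split()
    width = len(FIELDS)
    rows = [tokens[i:i + width] for i in range(0, len(tokens), width)]
    # rows[0] is the header; drop it and any incomplete trailing row.
    return [dict(zip(FIELDS, row)) for row in rows[1:] if len(row) == width]


def parse_symbols(text):
    """Map address -> symbol name from 'address type name' lines."""
    symbols = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3:
            addr, _kind, name = parts
            symbols[addr] = name
    return symbols


def annotate(tasks, symbols):
    """Return (pid, rqstp, action) with the action address resolved if known."""
    return [(t["pid"], t["rqstp"], symbols.get(t["action"], t["action"]))
            for t in tasks]
```

With the dump and the three symbol lines above, the four tasks that hold a request slot all resolve to call_status, and the two nfs_flushd tasks have no request attached.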

Trond Myklebust

Mar 20, 2003, 11:30:36 AM
to Vladimir Serov, linux-kernel
>>>>> " " == Vladimir Serov <vse...@infratel.com> writes:

> interrupt handler for NIC, it's gone !!! IMHO this is due to
> the race in the nfs client.

Why would an NFS race show up only on PPC? Do you have a tcpdump?

Cheers,
Trond

Vladimir Serov

Mar 21, 2003, 4:32:54 AM
to trond.m...@fys.uio.no, linux-kernel
Trond Myklebust wrote:

>>>>>>" " == Vladimir Serov <vse...@infratel.com> writes:
>>>>>>
>>>>>>
>
> > interrupt handler for NIC, it's gone !!! IMHO this is due to
> > the race in the nfs client.
>
>Why would an NFS race show up only on PPC? Do you have a tcpdump?
>
>

Hi, Trond.
As I wrote, another person has similar problems on PCs. To me it
was a big surprise to see such a problem in NFS, because I've been using it for
over 10 years in different setups, OSes, etc. Yes, I have a tcpdump,
and as I wrote, nothing wrong is going on with packet reception: there
are no corrupted packets, no error messages, NOTHING !!!! The RPC
request just gets lost; I mean it is not correctly connected to some queue or
caller. This has lasted for over a year, and it is a big pain in the ass for
the company I'm working for now.

With best regards, Vladimir.

Trond Myklebust

Mar 21, 2003, 6:17:30 AM
to Vladimir Serov, linux-kernel
>>>>> " " == Vladimir Serov <vse...@infratel.com> writes:

> Trond Myklebust wrote:
>>>>>>> " " == Vladimir Serov <vse...@infratel.com> writes:
>>>>>>>

>>
>> > interrupt handler for NIC, it's gone !!! IMHO this is due to
>> > the race in the nfs client.
>>
>> Why would an NFS race show up only on PPC? Do you have a
>> tcpdump?
>>

> Hi , Trond As I wrote , another persone has similar problems on
> PC's, as to me it was a big suprise to see such a problem in

No that wasn't the same problem. IIRC that other person had faulty
hardware. To my knowledge, there are no outstanding problems with
hangs under 2.4.x.

> nfs, cause i'm using it for over 10 yers in a different
> setups's OS's , etc. Yes I have tcpdump , and as i wrote,
> nothing wrong is going on with packet receiption, where is now
> corrupted packets , no error messages, NOTHING !!!! Just RPC

Can I see that tcpdump in order to judge that for myself?

Cheers,
Trond
