[NFS] ESTALE and "getfh failed" problems with wildcard exports


Anne Milicia

Apr 6, 2001, 3:33:40 PM
I am seeing "rpc.mountd: getfh failed: Operation not permitted" and
ESTALE returns on clients with the newer utilities (2.1 and above, I
think) when the utilities manage to get multiple client (svc_client)
structures in the kernel for a single client machine: one with the FQDN
as the cl_ident client name, and the other with the IP address string as
the cl_ident client name. This happens with wildcard and netgroup
exports and old state left around in /var/lib/nfs. (An export to all
and an export to *.local.domain should cause it.)

For example:
[milicia@zaphod milicia]$ grep /proc/fs/nfs/exports
/nfs,root_squash,async,wdelay) #
/nfs/projects,root_squash,async,wdelay) #
/home1 dictionary.lowell.mclinux.com(rw,no_root_squash,async,wdelay) #
/home2 dictionary.lowell.mclinux.com(rw,no_root_squash,async,wdelay) #

When an nfsd finds the svc_client in the kernel for a file handle in
routine exp_getclient(), the lookup is by IP address rather than by the
cl_ident client name. Both svc_client structures have the same IP
address, so the svc_client structure found is the first one on the
kernel global clnt_hash queue for that IP address. If NFS mounts are
working fine for a client machine with a single svc_client entry and
then rpc.mountd adds another svc_client entry for the same IP address
with a different export, the old NFS mounts suddenly start to get ESTALE
returns: their file handles now match the new svc_client structure,
which lacks the exports held by the original svc_client.

It looks like a change to utils/mountd/auth.c
auth_authenticate_internal() with the comment /* First try it w/o doing
a hostname lookup... */ may be contributing to this problem, although
saving the lookup seems like a good idea.

The question is how best to solve this problem. The kernel might be
patched to fix the problem, the utilities might be forced to always use
the IP address string as the cl_ident to guarantee a consistent name, or
the utilities could be changed to always use the same utility nfs_client
structure to keep a consistent name. Since both exportfs and rpc.mountd
add exports to the kernel, they must be kept consistent with each
other. For the exportfs command that would mean reading the
/proc/fs/nfs/exports entries prior to processing any new command line
exports to find any existing client names.
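
As a rough sketch of that last idea, the following Python collects the
client names already present in /proc/fs/nfs/exports style lines. The
line shape "<path> <client>(<options>) #" is assumed from the listing
earlier in this message, existing_client_names is a made-up helper, and
this is not how exportfs is implemented. Lines that do not fit the
assumed shape are skipped rather than guessed at.

```python
import re

# Assumed line shape: <path> <client>(<options>) # ...
_LINE = re.compile(r"^(\S+)\s+([^\s(]+)\(([^)]*)\)")


def existing_client_names(lines):
    """Map each client name found to the list of paths exported to it."""
    names = {}
    for line in lines:
        m = _LINE.match(line)
        if not m:
            continue                   # not in the assumed shape; ignore
        path, client, _opts = m.groups()
        names.setdefault(client, []).append(path)
    return names


sample = [
    "/home1 dictionary.lowell.mclinux.com(rw,no_root_squash,async,wdelay) #",
    "/home2 dictionary.lowell.mclinux.com(rw,no_root_squash,async,wdelay) #",
    "/nfs,root_squash,async,wdelay) #",
]
clients = existing_client_names(sample)
print(clients)
```

A utility could consult such a map before adding an export, and reuse
the name already known to the kernel instead of creating a second entry
under a different name for the same machine.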

Anybody have a fix for this problem or an opinion about the proper place
to fix this?

The attached utils patch seemed to keep the names consistent when
rebooting and then mounting, but it hasn't changed the exportfs order,
and is pretty kludgy so I'm looking for other suggestions. Or, better
yet, a fix.

Thanks for the help!
