[Lustre-discuss] rc -43: Identifier removed

Per Lundqvist

Feb 11, 2008, 11:04:59 AM
to Lustre Discuss
I got this error today when testing a newly set up 1.6 filesystem:

n50 1% cd /mnt/test
n50 2% ls
ls: reading directory .: Identifier removed

n50 3% ls -alrt
total 8
?--------- ? ? ? ? ? dir1
?--------- ? ? ? ? ? dir2
drwxr-xr-x 4 root root 4096 Feb 8 15:46 ../
drwxr-xr-x 4 root root 4096 Feb 11 15:11 ./

n50 4% stat .
File: `.'
Size: 4096 Blocks: 8 IO Block: 4096 directory
Device: b438c888h/-1271347064d Inode: 27616681 Links: 2
Access: (0755/drwxr-xr-x) Uid: ( 1120/ faxen) Gid: ( 500/ nsc)
Access: 2008-02-11 16:11:48.336621154 +0100
Modify: 2008-02-11 15:11:27.000000000 +0100
Change: 2008-02-11 15:11:31.352841294 +0100

This seems to happen almost all the time when I am running as a
specific user on this system. Note that the stat call always works... I
haven't yet been able to reproduce this problem when running as my own
user.

dmesg from client:

LustreError: 9000:0:(dir.c:406:ll_readdir()) error reading dir 27583921/2381382571 page 0: rc -43
LustreError: 9019:0:(dir.c:406:ll_readdir()) error reading dir 27583921/2381382571 page 0: rc -43
LustreError: 9020:0:(dir.c:406:ll_readdir()) error reading dir 27583921/2381382571 page 0: rc -43
LustreError: 9021:0:(dir.c:406:ll_readdir()) error reading dir 27583921/2381382571 page 0: rc -43
LustreError: 9022:0:(dir.c:406:ll_readdir()) error reading dir 4848481/4054352687 page 0: rc -43
LustreError: 9127:0:(file.c:2413:ll_inode_revalidate_fini()) failure -43 inode 27616681
LustreError: 9128:0:(file.c:2413:ll_inode_revalidate_fini()) failure -43 inode 27616681
LustreError: 9129:0:(file.c:2413:ll_inode_revalidate_fini()) failure -43 inode 27616681
...

where error 43 means: Identifier removed.
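
For anyone decoding a similar rc value: the number in the client log is a negated errno. Below is a minimal Python sketch of the lookup, assuming it runs on a Linux host, since errno numbering differs on other platforms. `describe_rc` is a hypothetical helper name, not part of Lustre.

```python
import errno
import os


def describe_rc(rc: int) -> str:
    """Turn a kernel-style negative return code into 'NAME (message)'."""
    code = abs(rc)
    # errno.errorcode maps numeric values to symbolic names on this platform.
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name} ({os.strerror(code)})"


if __name__ == "__main__":
    # On Linux, 43 is EIDRM; other platforms number errnos differently.
    print(describe_rc(-43))
```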

No error messages from the MDS or OSS:s.

setup:
Client: 2.6.9-55.0.9.EL_lustre.1.6.3smp (rhel4)
1 x MDS: 2.6.18-8.1.14.el5_lustre.1.6.4.2smp (rhel5)
4 x OSS with 2 OST:s each: 2.6.18-8.1.14.el5_lustre.1.6.4.2smp (rhel5)

thanks,
Per Lundqvist

--
Per Lundqvist

National Supercomputer Centre
Linköping University, Sweden

http://www.nsc.liu.se

Andreas Dilger

Feb 11, 2008, 4:11:45 PM
to Per Lundqvist, Lustre Discuss
On Feb 11, 2008 17:04 +0100, Per Lundqvist wrote:
> I got this error today when testing a newly set up 1.6 filesystem:
>
> n50 1% cd /mnt/test
> n50 2% ls
> ls: reading directory .: Identifier removed
>
> n50 3% ls -alrt
> total 8
> ?--------- ? ? ? ? ? dir1
> ?--------- ? ? ? ? ? dir2
> drwxr-xr-x 4 root root 4096 Feb 8 15:46 ../
> drwxr-xr-x 4 root root 4096 Feb 11 15:11 ./
>
> n50 4% stat .
> File: `.'
> Size: 4096 Blocks: 8 IO Block: 4096 directory
> Device: b438c888h/-1271347064d Inode: 27616681 Links: 2
> Access: (0755/drwxr-xr-x) Uid: ( 1120/ faxen) Gid: ( 500/ nsc)
> Access: 2008-02-11 16:11:48.336621154 +0100
> Modify: 2008-02-11 15:11:27.000000000 +0100
> Change: 2008-02-11 15:11:31.352841294 +0100
>
> This seems to happen almost all the time when I am running as a
> specific user on this system. Note that the stat call always works... I
> haven't yet been able to reproduce this problem when running as my own
> user.

EIDRM (Identifier removed) means that your MDS has a user database
(/etc/passwd and /etc/group) that is missing the particular user ID.
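
One way to check this diagnosis is to run a lookup on the MDS for the UID and GID that `stat` reported on the client (1120 and 500 above). The sketch below uses Python's standard `pwd` and `grp` modules. The helper names are made up for illustration, and the lookups go through whatever name service the host is configured with, which may be more than the flat files.

```python
import grp
import pwd


def uid_known(uid: int) -> bool:
    """Return True if this host's user database resolves the numeric UID."""
    try:
        pwd.getpwuid(uid)
        return True
    except KeyError:
        return False


def gid_known(gid: int) -> bool:
    """Return True if this host's group database resolves the numeric GID."""
    try:
        grp.getgrgid(gid)
        return True
    except KeyError:
        return False


if __name__ == "__main__":
    # UID and GID taken from the stat output in the original report.
    checks = (("uid 1120", uid_known(1120)), ("gid 500", gid_known(500)))
    for label, known in checks:
        print(label, "found" if known else "MISSING on this host")
```

If either ID is reported missing on the MDS but resolves on the client, that matches the explanation above.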


Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

_______________________________________________
Lustre-discuss mailing list
Lustre-...@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss

Steden Klaus

Feb 11, 2008, 7:18:31 PM
to adi...@sun.com, pe...@nsc.liu.se, lustre-...@lists.lustre.org

Is this an error one would see on orphaned files with stat, ls -l, etc?

Klaus

Aaron Knister

Feb 11, 2008, 9:05:48 PM
to Steden Klaus, adi...@sun.com, lustre-...@lists.lustre.org
I had the same issue with my lustre setup. I think this should fix it --

tunefs.lustre --param mdt.group_upcall=NONE /dev/mdt/device

Aaron Knister
Associate Systems Analyst
Center for Ocean-Land-Atmosphere Studies

(301) 595-7000
aa...@iges.org

Per Lundqvist

Feb 12, 2008, 4:06:25 AM
to Aaron Knister, adi...@sun.com, Steden Klaus, lustre-...@lists.lustre.org
On Mon, 11 Feb 2008, Aaron Knister wrote:

> I had the same issue with my lustre setup. I think this should fix it --
>
> tunefs.lustre --param mdt.group_upcall=NONE /dev/mdt/device

Thanks Andreas and Aaron, but then I wonder why the MDS needs to have all
the users in its own passwd/group file? And what are the implications of
setting the above mdt.group_upcall=NONE on the MDT?

/Per

Kit Westneat

Feb 12, 2008, 11:27:27 AM
to Per Lundqvist, adi...@sun.com, Steden Klaus, lustre-...@lists.lustre.org
The group upcall fetches all the secondary groups for the client user.
There isn't enough room in the LNET message to send them all, so the MDS
has to look them up in its /etc/group. If you don't care about secondary
groups at all, there is no harm in clearing the group_upcall param.
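
To make the lookup concrete, here is a rough Python sketch of what resolving a user's secondary groups from a group database involves. It only illustrates the idea and is not Lustre's upcall code. `secondary_groups` and `GroupEntry` are hypothetical names, and the sample entries in the demo are invented. When no entries are passed, the function reads the local group database through the standard `grp` module.

```python
import grp
from typing import Iterable, List, NamedTuple, Optional


class GroupEntry(NamedTuple):
    """Stand-in for a group database record, used for the demo data."""
    gr_name: str
    gr_gid: int
    gr_mem: List[str]


def secondary_groups(username: str, primary_gid: int,
                     entries: Optional[Iterable] = None) -> List[int]:
    """Return sorted GIDs of groups that list username as a member,
    excluding the primary GID."""
    if entries is None:
        # Fall back to the group database of the host running this code.
        entries = grp.getgrall()
    gids = {e.gr_gid for e in entries if username in e.gr_mem}
    gids.discard(primary_gid)
    return sorted(gids)


if __name__ == "__main__":
    # Invented sample data; only the names faxen/nsc come from the thread.
    sample = [
        GroupEntry("nsc", 500, ["faxen"]),
        GroupEntry("project", 600, ["faxen", "other"]),
        GroupEntry("ops", 700, ["other"]),
    ]
    print(secondary_groups("faxen", 500, sample))
```

Run against the real database on the MDS (no `entries` argument), an empty or unexpected result for a user would suggest the MDS does not see the same group memberships as the clients.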

In theory, there also shouldn't be any harm in having different passwd
and group files on the MDS and OSSes than on the clients. It's highly
important, however, that all the clients have the same passwd and group
files. Otherwise the clients could interpret the same UID as different
users, and people could go mucking around in each other's files.
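
A small consistency check along these lines could look like the Python sketch below. It assumes you have copied the passwd files from two clients and want to find UIDs that map to different login names. The function names and the sample entries are invented for illustration.

```python
from typing import Dict, List, Tuple


def parse_passwd(text: str) -> Dict[int, str]:
    """Map UID -> login name from passwd-format text (name:pw:uid:gid:...)."""
    mapping: Dict[int, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        if len(fields) < 3 or not fields[2].isdigit():
            continue  # skip malformed or non-numeric entries
        # Keep the first name seen for a UID, like a first-match lookup.
        mapping.setdefault(int(fields[2]), fields[0])
    return mapping


def uid_conflicts(a: Dict[int, str],
                  b: Dict[int, str]) -> List[Tuple[int, str, str]]:
    """Return (uid, name_on_a, name_on_b) for UIDs bound to different names."""
    shared = a.keys() & b.keys()
    return sorted((uid, a[uid], b[uid]) for uid in shared if a[uid] != b[uid])


if __name__ == "__main__":
    # Invented sample contents standing in for two clients' passwd files.
    client1 = "root:x:0:0::/root:/bin/sh\nalice:x:1120:500::/home/alice:/bin/sh\n"
    client2 = "root:x:0:0::/root:/bin/sh\nbob:x:1120:500::/home/bob:/bin/sh\n"
    for uid, n1, n2 in uid_conflicts(parse_passwd(client1), parse_passwd(client2)):
        print(f"UID {uid}: {n1} on client1, {n2} on client2")
```

Any line this prints is a UID that two clients would attribute to different people, which is the situation described above.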

- Kit


Per Lundqvist

Feb 14, 2008, 9:12:28 AM
to Kit Westneat, adi...@sun.com, Steden Klaus, lustre-...@lists.lustre.org
On Tue, 12 Feb 2008, Kit Westneat wrote:

> The group upcall fetches all the secondary groups for the client user.
> There isn't enough room in the LNET message to send them all, so the MDS
> has to look them up in its /etc/group. If you don't care about secondary
> groups at all, there is no harm in clearing the group_upcall param.
>
> In theory, there also shouldn't be any harm in having different passwd and
> group files on the MDS and OSSes than on the clients. It's highly important,
> however, that all the clients have the same passwd and group files. Otherwise
> the clients could interpret the same UID as different users, and people could
> go mucking around in each other's files.

ok, thanks for clarifying this
