[Lustre-discuss] filesystem UIDs/GIDs

Brock Palen

Apr 11, 2008, 9:01:39 AM

to lustre-...@lists.lustre.org

Is an /etc/passwd containing all the filesystem users' UIDs required only
on the MDS? Or do the OSTs need it as well?

Testing for me shows only the MDS, but I could be wrong.
We don't use LDAP or anything like that at the moment for UID/GID
mapping.

Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
bro...@umich.edu
(734)936-1985

_______________________________________________
Lustre-discuss mailing list
Lustre-...@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss

Jakob Goldbach

Apr 11, 2008, 9:07:23 AM

to Brock Palen, Lustre discuss

On Fri, 2008-04-11 at 09:01 -0400, Brock Palen wrote:
> Is an /etc/passwd containing all the filesystem users' UIDs required only
> on the MDS? Or do the OSTs need it as well?

MDS only. But this is not needed either if you

mds# echo NONE > /proc/fs/lustre/mds/<fsname>-MDT0000/group_upcall

Then you only need uid/gid to be available on all clients.
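
For anyone scripting this, here is a minimal sketch of a wrapper that prints the current group_upcall value and then sets it to NONE. The function name disable_group_upcall, the optional second argument (an alternate proc root, included only so the function can be tried against a scratch directory), and the fsname "testfs" are assumptions for illustration. None of them come from Lustre itself.

```shell
#!/bin/sh
# Sketch: disable the MDS group upcall for one filesystem.
# disable_group_upcall is a hypothetical helper, not a Lustre tool.

disable_group_upcall() {
    fsname=$1
    # Second argument is an assumption for testing; defaults to the
    # proc directory named in this thread.
    proc_root=${2:-/proc/fs/lustre/mds}
    target="$proc_root/${fsname}-MDT0000/group_upcall"

    # Bail out if the tunable is not present on this node.
    if [ ! -e "$target" ]; then
        echo "no such file: $target" >&2
        return 1
    fi

    echo "was: $(cat "$target")"
    echo NONE > "$target"
    echo "now: $(cat "$target")"
}

# Usage on the MDS, as root, with a hypothetical fsname "testfs":
#   disable_group_upcall testfs
```

Printing the old value first makes it easy to restore the previous upcall later if users turn out to rely on secondary groups.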

/Jakob

D. Marc Stearman

Apr 11, 2008, 10:53:18 AM

to Lustre discuss

On Apr 11, 2008, at 6:07 AM, Jakob Goldbach wrote:
>
> On Fri, 2008-04-11 at 09:01 -0400, Brock Palen wrote:
>> Is an /etc/passwd containing all the filesystem users' UIDs required only
>> on the MDS? Or do the OSTs need it as well?
>
> MDS only. But this is not needed either if you
>
> mds# echo NONE > /proc/fs/lustre/mds/<fsname>-MDT0000/group_upcall
>
> Then you only need uid/gid to be available on all clients.
>
> /Jakob

Jakob is correct. The group_upcall is needed if you want to support
large numbers of secondary groups. We have users at LLNL that belong
to >16 groups, and the group_upcall is needed to support permission
checks against all those groups. I think by default Lustre will only
check the first two groups you belong to.

If your users don't use additional groups, then you can do as Jakob
suggested.
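
To check whether any of your users belong to many groups before turning the upcall off, something like the following rough sketch may help. It reads passwd-format lines on stdin and prints each user whose total group count (primary plus secondary) exceeds a limit. The function name users_over_group_limit is a hypothetical helper, and the limit of 16 in the usage line is only borrowed from the figure above, not a documented Lustre threshold.

```shell
#!/bin/sh
# Sketch: list users whose total group count exceeds a limit.
# users_over_group_limit is a hypothetical helper name.

users_over_group_limit() {
    limit=$1
    # Read passwd-format lines; only the first field (user name) is used.
    while IFS=: read -r user rest; do
        # id -G prints all group IDs for the user; count the words.
        # Unknown users produce no output and so count as 0.
        n=$(( $(id -G "$user" 2>/dev/null | wc -w) ))
        if [ "$n" -gt "$limit" ]; then
            echo "$user $n"
        fi
    done
}

# Usage, on a node that can resolve all filesystem users:
#   getent passwd | users_over_group_limit 16
```

If this prints nothing, nobody is over the limit as far as that node's user database is concerned.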

-Marc

----
D. Marc Stearman
LC Lustre Administration Lead
ma...@llnl.gov
925.423.9670
Pager: 1.888.203.0641

Peter Avakian

Apr 11, 2008, 12:29:41 PM

to lustre-...@lists.lustre.org

Hi,

Do you see any problem in having each compute node, within a grid,
acting as an OSS server via the separate IB channel on the fabric? My
compute nodes have built-in raid controllers.
Any feedback and comments are really appreciated.
Regards,
-Peter

Brian J. Murrell

Apr 11, 2008, 12:34:41 PM

to lustre-...@lists.lustre.org

On Fri, 2008-04-11 at 20:29 +0400, Peter Avakian wrote:
>
> Do you see any problem in having each compute node, within a grid,
> acting as an OSS server via the separate IB channel on the fabric? My
> compute nodes have built-in raid controllers.

If by compute nodes you mean Lustre clients, then yes, this is a problem
and an unsupported configuration. The reason is because memory
pressures on a client/OSS machine can cause a deadlock.

The client tries to flush pages to an OST to relieve memory pressure.
An OST needs to allocate memory in order to process page flushes from a
client. If a client trying to relieve memory pressure tries to flush
pages to an OST on the same node, the OST will get failures trying to
allocate memory (which is already under pressure) to fulfill the request
from the client. Deadlock.

b.


rishi pathak

Apr 13, 2008, 2:35:55 PM

to Peter Avakian, lustre-...@lists.lustre.org

If by compute node you mean a node that is part of a compute cluster (parallel computing), then it would be a very bad idea.
I tried it on my 16-node cluster using both IB and Ethernet. It consistently showed that the nodes running an OSS were overutilized, and users reported the same thing.
In my case, though, there was no RAID controller; I was using a partition of the OS disk.
You could certainly try it in an experimental environment and run some benchmarks, both for Lustre (using a compute-node OSS and then a separate OSS) and with lmbench. Mix and match the configurations and compare the results. If you are successful, please post the results back to the list.

--
Regards--
Rishi Pathak

Chris Worley

Apr 13, 2008, 5:06:26 PM

to lustre-discuss

On Fri, Apr 11, 2008 at 10:34 AM, Brian J. Murrell
<Brian....@sun.com> wrote:
> On Fri, 2008-04-11 at 20:29 +0400, Peter Avakian wrote:
> >
> > Do you see any problem in having each compute node, within a grid,
> > acting as an OSS server via the separate IB channel on the fabric? My
> > compute nodes have built-in raid controllers.
>
> If by compute nodes you mean Lustre clients, then yes, this is a problem
> and an unsupported configuration. The reason is because memory
> pressures on a client/OSS machine can cause a deadlock.

What if either the compute node or (most likely) the OSS was in a VM
(and make sure to not overcommit processors)?

Chris

>
> The client tries to flush pages to an OST to relieve memory pressure.
> An OST needs to allocate memory in order to process page flushes from a
> client. If a client trying to relieve memory pressure tries to flush
> pages to an OST on the same node, the OST will get failures trying to
> allocate memory (which is already under pressure) to fulfill the request
> from the client. Deadlock.
>
> b.
>
>
