[slurm-dev] slurm on NFS for a cluster?


Jeff Layton

Mar 24, 2015, 3:44:00 PM
to slurm-dev

Good afternoon,

I apologize for the newb question, but I'm setting up slurm
for the first time in a very long time. I've got a small cluster
of a master node and 4 compute nodes. I'd like to install
slurm on an NFS file system that is exported from the master
node and mounted on the compute nodes. I've been reading
a bit about this but does anyone have recommendations on
what to watch out for?

Thanks!

Jeff

Paul Edmon

Mar 24, 2015, 3:50:02 PM
to slurm-dev

Yup, that's exactly what we do. We make sure to export it read only and
make sure that it is synced and hard mounted. Not much else to it.
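For reference, a minimal sketch of what that export and mount could look like; the path, subnet, and hostname here are placeholders, not Paul's actual values:

# /etc/exports on the master: read-only, synchronous
/opt/slurm  10.0.0.0/24(ro,sync)

# /etc/fstab entry on each compute node: hard mount
master:/opt/slurm  /opt/slurm  nfs  ro,hard  0 0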

-Paul Edmon-

Uwe Sauter

Mar 24, 2015, 4:06:03 PM
to slurm-dev

And if you are planning on using cgroups, don't use NFSv4. There are problems that cause the NFS client process to freeze (and with it the node) when the cgroup removal script is called.
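A hedged mitigation sketch, assuming a Slurm of this vintage whose cgroup.conf still takes a CgroupReleaseAgentDir parameter: keep the release-agent scripts and this config on each node's local disk, even if the rest of the install lives on NFS.

# /etc/slurm/cgroup.conf, kept on local disk on every node (not the NFS share)
CgroupAutomount=yes
CgroupReleaseAgentDir="/etc/slurm/cgroup"
ConstrainCores=yes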

Regards,

Uwe Sauter

Eduardo Almeida Costa

Mar 24, 2015, 4:09:57 PM
to slurm-dev
You can tune your NFS too.
We noticed slow jobs in our clusters: our queue had a lot of jobs, and with all the RPC connections the traffic was so high that some jobs were simply taking too long to complete.
An example for you:

http://www.emc.com/collateral/white-paper/h12712-wp-nfs-tuning-and-best-practices-for-ngs.pdf
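A few commonly tuned knobs along those lines on EL-family systems; the values here are illustrative starting points, not recommendations:

# client: more concurrent RPC slots (set before the mount)
echo 128 > /proc/sys/sunrpc/tcp_slot_table_entries

# client: larger transfer sizes via mount options
mount -o hard,rsize=1048576,wsize=1048576 master:/data /data

# server: more nfsd threads (RPCNFSDCOUNT in /etc/sysconfig/nfs on EL6)
rpc.nfsd 32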

Paul Edmon

Mar 24, 2015, 4:12:02 PM
to slurm-dev

Interesting. Yeah we use v3 here. Hadn't tried out v4, and good thing
we didn't then.

-Paul Edmon-

Uwe Sauter

Mar 24, 2015, 4:18:18 PM
to slurm-dev

It took me about 4 weeks to get to the bottom of that, as the runtime or cgroup usage seemed to affect whether the node actually froze. Short jobs
were OK, but longer ones reliably caused the kernel to hang with those annoying "task didn't react for more than 120 sec"
messages. The effect was that Slurm wasn't able to communicate with the node, which then had to be drained.
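If you suspect the same problem, those messages land in the kernel log; the actual kernel wording is "blocked for more than 120 seconds", and the threshold is a sysctl:

dmesg | grep -i 'blocked for more than 120 seconds'
sysctl kernel.hung_task_timeout_secs   # the 120 s threshold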

Uwe

Eduardo Almeida Costa

Mar 24, 2015, 4:18:28 PM
to slurm-dev
Hmm... I always feel lost choosing between NFSv4 and NFSv3. I shall test this.

Jason Bacon

Mar 24, 2015, 4:22:57 PM
to slurm-dev


I ran one of our CentOS clusters this way for about a year and found it
to be more trouble than it was worth.

I recently reconfigured it to run all system services from local disks
so that nodes are as independent of each other as possible. Assuming you
have ssh keys on all the nodes, syncing slurm.conf and other files is a
snap using a simple shell script. We only use NFS for data files and
user applications at this point.
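A minimal sketch of such a script, with node names invented for illustration:

#!/bin/sh
# Push slurm.conf from the master to each compute node, then
# tell the running daemons to re-read it.
for node in node01 node02 node03 node04; do
    scp /etc/slurm/slurm.conf ${node}:/etc/slurm/
done
scontrol reconfigure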

Of course, if your compute nodes don't have local disks, that's another
story.

Jason

Christopher Samuel

Mar 24, 2015, 6:53:55 PM
to slurm-dev

On 25/03/15 06:49, Paul Edmon wrote:

> Yup, that's exactly what we do. We make sure to export it read only and
> make sure that it is synced and hard mounted. Not much else to it.

Hmm, doesn't the sync NFS mount option only make a difference for writes
to an NFS mount, which isn't going to happen if it's also ro?

cheers!
Chris
--
Christopher Samuel Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.org.au/ http://twitter.com/vlsci

Christopher Samuel

Mar 24, 2015, 7:00:17 PM
to slurm-dev
I've done this with Torque and Slurm for more than a decade without
many issues (see below for the biggest issue we've hit yet).

One thing we do to make things easier is to have Slurm installs
go into a version specific directory with the configuration in a
common directory, thus:

./configure --prefix=/usr/local/slurm/${slurm_ver} --sysconfdir=/usr/local/slurm/etc

We then have a symlink that points from /usr/local/slurm/latest to
our currently blessed version of Slurm, currently:

[root@merri-m SLURM]# ls -l /usr/local/slurm/latest
lrwxrwxrwx 1 root root 8 Dec 19 14:08 /usr/local/slurm/latest -> 14.03.11
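An upgrade then amounts to installing the new version alongside the old and repointing the symlink, along these lines (the version number is illustrative):

ln -sfn 14.11.4 /usr/local/slurm/latest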

The biggest issue we've hit was with "scontrol reconfigure",
where some nodes could report that their configuration file
does not exist (which is rubbish), so I suspect an NFS issue
there.

It used to work, so I suspect a kernel bug, but it's pretty
impossible for us to try to replicate without losing production
jobs. :-(

This is on RHEL6 FWIW - we did similar with Torque on everything
from RH7.3 (yes, pre RHEL), SLES9, SLES10, RHEL 3, 4 & 5 (we moved
to Slurm when we went to RHEL6).

All the best,

Trey Dockendorf

Mar 24, 2015, 7:21:18 PM
to slurm-dev

I'd recommend avoiding NFSv4, as previously mentioned. We tried v4 and it caused lots of issues. I think the primary problem was that our cgroup release agents were on NFSv4, and when jobs finished the delay from NFSv4 caused kernel locks which made nodes go unresponsive. We switched to v3 and the issue was gone.

- Trey

Paul Edmon

Mar 24, 2015, 9:46:56 PM
to slurm-dev

Yeah, we've been running CentOS 6 and Slurm in this fashion for about a
year and a half on about a thousand machines and haven't really had a
problem with it. Though I don't know if this method scales
indefinitely. We just have a symlink from /etc/slurm/slurm.conf back to
our conf on the share. We then control the version via RPM installs.
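Concretely, that symlink looks something like this (the NFS path is a placeholder):

ln -s /nfs/slurm/etc/slurm.conf /etc/slurm/slurm.conf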

-Paul Edmon-

Paddy Doyle

Mar 26, 2015, 5:24:06 AM
to slurm-dev

+1 for local installs.

We build the RPMs and put them in a local repo (Scientific Linux 6), and so
installing/upgrading via Salt/Puppet/Ansible etc is quite scalable.
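A sketch of such a client-side repo definition, with the repo id and URL as placeholders:

# /etc/yum.repos.d/local-slurm.repo
[local-slurm]
name=Local Slurm packages
baseurl=http://repo.example.org/slurm/el6/
enabled=1
gpgcheck=0

# then, on each node (package names as produced by Slurm's own rpmbuild):
yum install slurm slurm-munge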

It works for us, but of course YMMV.

Paddy
--
Paddy Doyle
Trinity Centre for High Performance Computing,
Lloyd Building, Trinity College Dublin, Dublin 2, Ireland.
Phone: +353-1-896-3725
http://www.tchpc.tcd.ie/

Jason Bacon

Mar 26, 2015, 11:47:07 AM
to slurm-dev


+1 for using package managers in general.

On our CentOS clusters, I do the munge and slurm installs using pkgsrc
(+ pkgsrc-wip).

http://acadix.biz/pkgsrc.php

I use Yum for most system services and for libraries required by
commercial software, and pkgsrc for all the latest open source software.

I rsync a small pkgsrc tree to the local drive of each node for munge,
slurm, and a few basic tools, and keep a separate, more extensive tree on
the NFS share for scientific software.
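For instance, something along these lines (pkgsrc's default /usr/pkg prefix assumed, node names invented):

for node in node01 node02 node03 node04; do
    rsync -a --delete /usr/pkg/ ${node}:/usr/pkg/
done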

The current slurm package is pretty outdated, but we'll bring it
up-to-date soon.

Regards,

Jason
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Jason W. Bacon
jwb...@tds.net

If a problem can be solved,
there's no need to worry.

If it cannot be solved, then
worrying will do no good.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Paul Edmon

Mar 26, 2015, 11:51:11 AM
to slurm-dev

Yeah, we use puppet and yum to manage our stack. Works pretty well and
scales nicely.

-Paul Edmon-