With PBS Pro, it was customary and necessary to modify /etc/init.d/pbs_mom to add the ulimit commands shown in (a) below, but I haven’t seen a need for that with SLURM yet. Recently, some users at my organization have been focusing on process-limit issues as a potential source of application failures.
I was wondering if anyone has found a need to be concerned with change (a) below if change (b) has already been taken care of. In the user reports, I think they are using OpenMPI 1.6.5 and did not configure it with --without-slurm, so I’m not certain of the launching mechanism for the child processes, but I think it may leverage SLURM as opposed to using ssh.
(a) adding the following lines to /etc/init.d/slurm
ulimit -l unlimited
ulimit -s unlimited
(b) adding these lines to /etc/security/limits.conf
* soft memlock unlimited
* hard memlock unlimited
* soft stack unlimited
* hard stack unlimited
* soft nofile 1000000
* hard nofile 1000000
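One way to check whether the limits in (b) actually reach job processes might be to print them from inside a job step and compare against a login shell. This is only a minimal sketch, assuming a POSIX-style shell on the compute nodes; show_limits is a hypothetical helper name, not something SLURM provides.

```shell
#!/bin/sh
# show_limits is a hypothetical helper (not part of SLURM).
# It prints the soft limits seen by the current shell for the
# three resources set in (a) and (b).
show_limits() {
    printf 'memlock: %s\n' "$(ulimit -S -l)"   # max locked memory (KB or "unlimited")
    printf 'stack: %s\n'   "$(ulimit -S -s)"   # stack size (KB or "unlimited")
    printf 'nofile: %s\n'  "$(ulimit -S -n)"   # max open file descriptors
}

show_limits
```

Running the same checks through the scheduler, for example with srun -N1 sh -c 'ulimit -l; ulimit -s; ulimit -n', and comparing against the output on a login node should show whether job steps see different values. Keep in mind that SLURM can propagate the submitting shell's limits to the job (see PropagateResourceLimits in slurm.conf), so the result may depend on where the job was submitted from as well as on how slurmd was started.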
Thanks for your help,
Ed