I'm working on getting slurm running on a small academic cluster. I looked through the list archives and didn't see an answer to the following questions. I apologize in advance for their naivety - I'd be grateful for any suggestions.
(1) There needs to be a "slurm" user on each of the nodes that slurm will allocate jobs to (and on the controller node). Can this be done via NIS/yp, or should I manually create a user/group on each machine with commands like:
groupadd --gid 777 slurm
useradd -g 777 -u 777 slurm
I just grabbed this example from the list archives. Is there anything special about UID/GID 777 (I don't think so, but...)?
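In case NIS is an acceptable route, here's my best guess at the NIS side (this assumes the stock /var/yp Makefile setup and that 777 is unused on every node - the commands below are just my sketch, not something I found in the docs):

# on the NIS master only (assuming the stock /var/yp Makefile):
groupadd --gid 777 slurm
useradd -g 777 -u 777 slurm
cd /var/yp && make    # rebuild and push the passwd/group maps to clients

Does that look sane, or does slurm expect a local account on each node?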
Along these lines, should the slurm "user" have login privileges or a shared home directory (this doesn't sound much like the munge "user")? Right now the corresponding lines in /etc/passwd look like:
munge:x:102:157:Runs Uid 'N' Gid Emporium:/var/run/munge:/sbin/nologin
slurm:x:777:777::/home/slurm:/bin/bash
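If the answer is that slurm should be locked down like munge, I imagine the entry would end up looking something like this instead (the GECOS, home directory, and shell below are my guesses, not from any documentation):

# hypothetical locked-down entry, modeled on munge's:
slurm:x:777:777:SLURM resource manager:/var/lib/slurm:/sbin/nologin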
(2) Again, I'm an inexperienced sysadmin. With most services/daemons, I normally use "service httpd restart" and "chkconfig httpd on" (with apache, for example) to start a service and have it start on boot. As far as I can tell, although slurmd/slurmctld "make install" just fine on my RHEL 5.8 systems, they don't seem to be set up with the "service" (/etc/init.d?) system.
How do people normally start/stop slurm and have it start on boot? Is there a configuration step I'm missing?
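For concreteness, here's the pattern I'm used to, next to what I naively expected to work after "make install" (the slurm service name below is purely my guess):

# what I normally do, with apache as the example:
service httpd restart
chkconfig httpd on

# what I hoped would work for slurm (service name is a guess):
service slurm start      # or separate slurmd / slurmctld scripts?
chkconfig slurm on

Is there an init script somewhere in the distribution that I should be installing into /etc/init.d by hand?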
Thanks for reading - slurm seems like a wonderful system, and I'm looking forward to having it available.
regards,
Nathan Moore
Winona, MN