CentOS7 run hiccups


Tomasz Skowron

Feb 15, 2021, 6:13:04 PM
to elasticluster
Hi Riccardo,

I am making some progress with the installation and setup. For compatibility reasons, I need to run SLURM on CentOS 7. A few comments from my side:


A minor adjustment is needed for the my.cnf path in
elasticluster/share/playbooks/roles/slurm-master/tasks/db.yml:

- name: Ensure InnoDB parameters are large enough for SLURM DBD (Ubuntu)
  tags:
    - slurm
    - slurmdbd
  blockinfile:
    # FIXME: Path is correct on Debian/Ubuntu; is it also for CentOS/RHEL?
    path: '/etc/mysql/my.cnf'
    state: present
    backup: yes
    insertafter: EOF
    content: |

      # See https://wiki.fysik.dtu.dk/niflheim/Slurm_database#id5
      [mysqld]
      innodb_buffer_pool_size=1024M
      innodb_log_file_size=64M
      innodb_lock_wait_timeout=900
  when: 'is_debian_or_ubuntu and not (is_debian_8_or_later or is_ubuntu_14_04_or_later)'


- name: Ensure InnoDB parameters are large enough for SLURM DBD (RHEL)
  tags:
    - slurm
    - slurmdbd
  blockinfile:
    # On CentOS/RHEL the MySQL/MariaDB server configuration lives in /etc/my.cnf
    path: '/etc/my.cnf'
    state: present
    backup: yes
    insertafter: EOF
    content: |

      # See https://wiki.fysik.dtu.dk/niflheim/Slurm_database#id5
      [mysqld]
      innodb_buffer_pool_size=1024M
      innodb_log_file_size=64M
      innodb_lock_wait_timeout=900
  when: 'is_rhel7_compatible'

(Alternatively, /etc/mysql/my.cnf can be made a symlink to /etc/my.cnf.)
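
A minimal sketch of the symlink alternative, assuming the real configuration file is /etc/my.cnf and the playbook expects /etc/mysql/my.cnf. The helper name link_mycnf is hypothetical and not part of ElastiCluster; both paths are passed as arguments so it can be tried on scratch files first.

```shell
# Hypothetical helper: make the path the playbook expects point at the
# file that actually exists on CentOS/RHEL.
link_mycnf() {
    target="$1"   # existing config file, e.g. /etc/my.cnf
    link="$2"     # path to create as a symlink, e.g. /etc/mysql/my.cnf

    # Refuse to create a dangling link.
    [ -f "$target" ] || { echo "missing $target" >&2; return 1; }

    # Create the parent directory if it does not exist yet.
    mkdir -p "$(dirname "$link")"

    # -s: symbolic link, -f: replace an existing link, -n: treat an
    # existing symlink at $link as a file rather than following it.
    ln -sfn "$target" "$link"
}
```

On a real node this would be run as root, for example: link_mycnf /etc/my.cnf /etc/mysql/my.cnf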

Additionally, the default playbook does not set the permissions of /etc/slurm/slurmdbd.conf to 600. A brief look at the logs shows:

slurmdbd[15118]: fatal: slurmdbd.conf file /etc/slurm/slurmdbd.conf should be 600 is 444 accessible for group or others

(I did not have the chance to find which playbook is responsible for that. After a manual fix, re-running elasticluster.sh was successful.)
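
The manual fix is not spelled out in the message; one way it might look is sketched here. The helper name secure_slurmdbd_conf is hypothetical. The sketch only changes the mode; file ownership is left untouched, on the assumption that the file already belongs to the user slurmdbd runs as.

```shell
# Hypothetical helper: restrict a slurmdbd.conf file to owner read/write,
# which is the mode the slurmdbd error message asks for.
secure_slurmdbd_conf() {
    conf="$1"   # e.g. /etc/slurm/slurmdbd.conf

    # Fail loudly rather than silently doing nothing.
    [ -f "$conf" ] || { echo "missing $conf" >&2; return 1; }

    # 600 = read/write for the owner, nothing for group or others.
    chmod 600 "$conf"
}
```

On the frontend node this would be run as root, for example: secure_slurmdbd_conf /etc/slurm/slurmdbd.conf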

There are also minor errors related to FastSchedule and OpenMPI:

Feb 15 22:53:17 slurm5-frontend001 slurmctld[15051]: error: Ignoring obsolete FastSchedule=1 option. Please remove from your configuration.
Feb 15 22:53:17 slurm5-frontend001 slurmctld[15051]: error: Translating obsolete 'MpiDefault=openmpi' option to 'MpiDefault=none'. Please update your configuration.
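
A sketch of how these two messages might be silenced by editing the generated configuration by hand. The helper name clean_slurm_conf and the location /etc/slurm/slurm.conf are assumptions (the latter inferred from the slurmdbd.conf path above). The replacement value 'none' is simply the one slurmctld reports translating to; whether it suits a given MPI setup needs checking.

```shell
# Hypothetical helper: remove the obsolete FastSchedule line and replace
# the obsolete MpiDefault=openmpi with MpiDefault=none.
clean_slurm_conf() {
    conf="$1"   # assumed to be e.g. /etc/slurm/slurm.conf

    [ -f "$conf" ] || { echo "missing $conf" >&2; return 1; }

    # Keep a copy of the original, preserving mode and timestamps.
    cp -p "$conf" "$conf.bak"

    # First expression deletes any FastSchedule=... line.
    # Second expression rewrites only the exact value 'openmpi'.
    # (-i edits in place; this is the GNU sed form.)
    sed -i \
        -e '/^[[:space:]]*FastSchedule=/d' \
        -e 's/^\([[:space:]]*\)MpiDefault=openmpi[[:space:]]*$/\1MpiDefault=none/' \
        "$conf"
}
```

A re-run of the playbooks would presumably regenerate the file, so a lasting fix belongs in the playbook template rather than on the node.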

Apart from that, the cluster is up and ready for testing.

I am going to be exploring SLURM on OpenStack over the next couple of weeks; if you need anything tested, I am happy to help.



Kind Regards,

Tomasz

Riccardo Murri

Feb 16, 2021, 11:03:02 AM
to Tomasz Skowron, elasticluster
Hello Tomasz,

many thanks for the suggestions -- I've released a new Docker image for ElastiCluster which I think fixes them all.  Please let me know if you have time to test it.

Since you're using SLURM on CentOS 7, you might be interested in this PR which installs SLURM 20.02 instead of 18.02: https://github.com/elasticluster/elasticluster/pull/691

Thanks,
Riccardo

Tomasz Skowron

Feb 19, 2021, 12:37:30 PM
to elasticluster
Hi,

I could test it and perhaps add/remove some of the Ansible steps. On CentOS, environment-modules and the /etc/profile.d modification it makes interfere with the system's default behaviour, affecting how "out of the box" the cluster is.

I am able to successfully rebuild the Docker container with the script in the tools directory. I was wondering if there is an easier way of getting modularity into the setup. I have not customised Dockerfiles much, so please correct me if I am wrong. I was thinking that, apart from copying stuff from:

VOLUME /home/.ssh
VOLUME /home/.elasticluster

it might be possible to add a mounted volume for:

elasticluster/share/playbooks?

In any case, thank you for the help and thumbs up for the development efforts! Compared to manual installation of slurm, elasticluster is a breeze.


Tomasz

Tomasz Skowron

Feb 19, 2021, 12:42:22 PM
to elasticluster
And yes, the MySQL and permission fixes work!

Tomasz Skowron

Feb 23, 2021, 11:51:14 AM
to elasticluster
Hi,

I had a look at the scripts and the available options for making the playbook sources modular. A minor adjustment to the elasticluster.sh file:

if [ -d "$HOME/elasticluster/elasticluster/share" ]; then
    volumes="${volumes} -v $HOME/elasticluster/elasticluster/share:/home/elasticluster/share"
fi

gives exactly what I needed to start slimming down the playbooks, so that only the bare minimum of required roles is deployed, without adding any other dependencies.

Kind Regards,

Tomasz