cd /root/installations/
tar --bzip -x -f slurm-20.11.5.tar.bz2
cd slurm-20.11.5/
./configure --enable-debug --prefix=/usr/local --sysconfdir=/usr/local/etc
make
make install
mkdir /var/spool/slurmctld /var/log/slurm
chown johnsy /var/spool/slurmctld
chown johnsy /var/log/slurm
chmod 755 /var/spool/slurmctld /var/log/slurm
cp /var/run/slurmctld.pid /var/run/slurmd.pid
touch /var/log/slurm/slurmctld.log
chown johnsy /var/log/slurm/slurmctld.log
touch /var/log/slurm/slurm_jobacct.log /var/log/slurm/slurm_jobcomp.log
chown johnsy /var/log/slurm/slurm_jobacct.log /var/log/slurm/slurm_jobcomp.log
ldconfig -n /usr/lib64
srun /proc/cpuinfo
srun: error: Unable to allocate resources: Unable to contact slurm controller (connect failure)
################################################################################
#
# slurm.conf file generated by configurator.html.
# Put this file on all nodes of your cluster.
# See the slurm.conf man page for more information.
#
SlurmctldHost=homepc
#SlurmctldHost=
#
#DisableRootJobs=NO
#EnforcePartLimits=NO
#Epilog=
#EpilogSlurmctld=
#FirstJobId=1
#MaxJobId=999999
#GresTypes=
#GroupUpdateForce=0
#GroupUpdateTime=600
#JobFileAppend=0
#JobRequeue=1
#JobSubmitPlugins=1
#KillOnBadExit=0
#LaunchType=launch/slurm
#Licenses=foo*4,bar
#MailProg=/bin/mail
#MaxJobCount=5000
#MaxStepCount=40000
#MaxTasksPerNode=128
MpiDefault=none
#MpiParams=ports=#-#
#PluginDir=
#PlugStackConfig=
#PrivateData=jobs
ProctrackType=proctrack/cgroup
#Prolog=
#PrologFlags=
#PrologSlurmctld=
#PropagatePrioProcess=0
#PropagateResourceLimits=
#PropagateResourceLimitsExcept=
#RebootProgram=
ReturnToService=1
SlurmctldPidFile=/var/run/slurmctld.pid
SlurmctldPort=6817
SlurmdPidFile=/var/run/slurmd.pid
SlurmdPort=6818
SlurmdSpoolDir=/var/spool/slurmd
SlurmUser=johnsy
#SlurmdUser=root
#SrunEpilog=
#SrunProlog=
StateSaveLocation=/var/spool
SwitchType=switch/none
#TaskEpilog=
TaskPlugin=task/affinity
#TaskProlog=
#TopologyPlugin=topology/tree
#TmpFS=/tmp
#TrackWCKey=no
#TreeWidth=
#UnkillableStepProgram=
#UsePAM=0
#
#
# TIMERS
#BatchStartTimeout=10
#CompleteWait=0
#EpilogMsgTime=2000
#GetEnvTimeout=2
#HealthCheckInterval=0
#HealthCheckProgram=
InactiveLimit=0
KillWait=30
#MessageTimeout=10
#ResvOverRun=0
MinJobAge=300
#OverTimeLimit=0
SlurmctldTimeout=120
SlurmdTimeout=300
#UnkillableStepTimeout=60
#VSizeFactor=0
Waittime=0
#
#
# SCHEDULING
#DefMemPerCPU=0
#MaxMemPerCPU=0
#SchedulerTimeSlice=30
SchedulerType=sched/backfill
SelectType=select/cons_tres
SelectTypeParameters=CR_Core
#
#
# JOB PRIORITY
#PriorityFlags=
#PriorityType=priority/basic
#PriorityDecayHalfLife=
#PriorityCalcPeriod=
#PriorityFavorSmall=
#PriorityMaxAge=
#PriorityUsageResetPeriod=
#PriorityWeightAge=
#PriorityWeightFairshare=
#PriorityWeightJobSize=
#PriorityWeightPartition=
#PriorityWeightQOS=
#
#
# LOGGING AND ACCOUNTING
#AccountingStorageEnforce=0
#AccountingStorageHost=
#AccountingStoragePass=
#AccountingStoragePort=
AccountingStorageType=accounting_storage/none
#AccountingStorageUser=
AccountingStoreJobComment=YES
ClusterName=cluster
#DebugFlags=
#JobCompHost=
#JobCompLoc=
#JobCompPass=
#JobCompPort=
JobCompType=jobcomp/none
#JobCompUser=
#JobContainerType=job_container/none
JobAcctGatherFrequency=30
JobAcctGatherType=jobacct_gather/none
SlurmctldDebug=info
#SlurmctldLogFile=
SlurmdDebug=info
#SlurmdLogFile=
#SlurmSchedLogFile=
#SlurmSchedLogLevel=
#
#
# POWER SAVE SUPPORT FOR IDLE NODES (optional)
#SuspendProgram=
#ResumeProgram=
#SuspendTimeout=
#ResumeTimeout=
#ResumeRate=
#SuspendExcNodes=
#SuspendExcParts=
#SuspendRate=
#SuspendTime=
#
#
# COMPUTE NODES
NodeName=localhost CPUs=12 Sockets=1 CoresPerSocket=6 ThreadsPerCore=2 State=UNKNOWN
PartitionName=debug Nodes=localhost Default=YES MaxTime=INFINITE State=UP
################################################################################
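Most of the generated file is commented-out defaults, so it can help to look only at the settings that are actually in effect. The following is a minimal sketch, not part of the original thread: the function name active_settings is hypothetical, and the default path simply assumes the --sysconfdir=/usr/local/etc used in the configure step above.

```shell
#!/bin/sh
# Print only the active settings of a slurm.conf, i.e. every line that is
# neither blank nor a comment.
# The default path is an assumption based on --sysconfdir=/usr/local/etc.
active_settings() {
    conf="${1:-/usr/local/etc/slurm.conf}"
    # -v inverts the match; the pattern matches comment lines and blank lines.
    grep -Ev '^[[:space:]]*(#|$)' "$conf"
}
```

Calling active_settings with no argument reads /usr/local/etc/slurm.conf; passing a path reads that file instead. The short list it prints is easier to compare against the directories and owners created earlier than the full file is.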
I don't have an active support contract. I have just started learning Slurm by installing it on my Fedora machine. This is the first time I have installed and experimented with this kind of software.
I did: systemctl start slurmctld and got this message: Failed to start slurmctld.service: Unit slurmctld.service not found.
systemctl start slurmd
Failed to start slurmd.service: Unit slurmd.service not found.
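The "Unit ... not found" message means systemd has no unit file for that service in the directories it searches. A small sketch for checking where, if anywhere, such a file exists is shown below. The function name find_unit is hypothetical, and the build-tree path used in the usage note is an assumption: a Slurm source tree normally generates slurmctld.service and slurmd.service under its etc/ directory during configure, but that should be verified on the machine itself.

```shell
#!/bin/sh
# Report every given directory that contains a unit file with the given name.
# Usage: find_unit NAME DIR...
# Returns 0 if at least one copy was found, 1 otherwise.
find_unit() {
    unit="$1"
    shift
    found=1
    for dir in "$@"; do
        if [ -f "$dir/$unit" ]; then
            echo "$dir/$unit"
            found=0
        fi
    done
    return "$found"
}
```

For example, find_unit slurmctld.service /etc/systemd/system /usr/lib/systemd/system /root/installations/slurm-20.11.5/etc lists the locations that hold the file. If it only turns up in the build tree, copying it into /etc/systemd/system and running systemctl daemon-reload is one way to make the service visible to systemd.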
Similar to slurmctld
Yes. Here is the status:
[johnsy@homepc ~]$ systemctl status munge
munge.service - MUNGE authentication service
Loaded: loaded (/usr/lib/systemd/system/munge.service; enabled; vendor preset: disabled)
Active: active (running) since Mon 2021-04-19 07:49:13 EDT; 13min ago <-- it is always enabled after restart. This log is just after a restart.
Docs: man:munged(8)
Process: 1070 ExecStart=/usr/sbin/munged (code=exited, status=0/SUCCESS)
Main PID: 1072 (munged)
Tasks: 4 (limit: 76969)
Memory: 1.4M
CPU: 8ms
CGroup: /system.slice/munge.service
└─1072 /usr/sbin/munged
I tried to follow some instructions mentioned in: https://wiki.fysik.dtu.dk/niflheim/Slurm_configuration#copy-slurm-conf-to-all-nodes
I thought that, since I am installing Slurm as root, the user "johnsy" has to have ownership permissions.
I tried : srun hostname and got the following error message:
srun: error: Unable to allocate resources: Unable to contact slurm controller (connect failure)
Also tried:
systemctl status slurmctld
Unit slurmctld.service could not be found.
You'll definitely need to get slurmd and slurmctld working before proceeding further. slurmctld is the Slurm controller mentioned when you do the srun.
Though there are probably some other steps you can take to make the slurmd and slurmctld system services available, it might be simpler to run the rpmbuild and rpm commands listed on https://slurm.schedmd.com/quickstart_admin.html , right below the instructions you were following. Those two commands will together run steps 3-8 of your original procedure, and will almost certainly put the systemd service files in the correct location.
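The RPM route can be sketched roughly as follows. This is an illustration under assumptions, not a tested procedure: the function name rpm_route and the RUN variable are hypothetical, and the RPM output directory is rpmbuild's usual default for an x86_64 machine and may differ on yours. By default the sketch only prints the commands, so the plan can be reviewed before anything runs.

```shell
#!/bin/sh
# Dry run by default: each command is printed instead of executed.
# Call with RUN= (set but empty) to actually execute them, as root.
rpm_route() {
    tarball="${1:-slurm-20.11.5.tar.bz2}"
    run="${RUN-echo}"
    # Build binary RPMs directly from the release tarball.
    $run rpmbuild -ta "$tarball"
    # Install the resulting packages. The directory below is an assumption
    # (rpmbuild's common default); adjust it to wherever the RPMs were written.
    $run rpm --install "$HOME"/rpmbuild/RPMS/x86_64/slurm-*.rpm
}
```

Running rpm_route with no arguments prints the two commands for the 20.11.5 tarball used in this thread. Any build prerequisites the quickstart page lists need to be installed before the rpmbuild step will succeed.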
From: slurm-users <slurm-use...@lists.schedmd.com> on behalf of Johnsy K. John <johns...@gmail.com>
Date: Monday, April 19, 2021 at 7:18 AM
To: Slurm User Community List <slurm...@lists.schedmd.com>, fzil...@lenovo.com <fzil...@lenovo.com>, johnsy john <johns...@gmail.com>
Subject: Re: [slurm-users] [External] Slurm Configuration assistance: Unable to use srun after installation (slurm on fedora 33)