[slurm-dev] Jobs stuck in CF state


John Hearns

Nov 27, 2015, 4:53:00 AM
to slurm-dev

Yesterday I thought I would do some investigations of the suspend and resume scripts on my in-house test cluster.

 

As my Mum would have said, “See what thought done…”

 

I have backed out the changes to slurm.conf (or have I…??)

I have restarted slurm on the head node and all compute nodes, and whacked the debug level up to 7 (not going to 11 just yet…).

When I start a job it just sits in the CF state, even a simple ‘srun hostname’

The slurmctld log says the job is allocated to a node, then nothing more.

 

In the words of Penelope Pitstop,  “Haaayulp”

 




Werner Saar

Nov 27, 2015, 10:27:09 AM
to slurm-dev
Hi,

you can download http://sourceforge.net/projects/slurm-roll/files/release-6.2-15.08.4/slurm-roll.pdf
and read chapter 5. It is written for a Rocks cluster, but the documentation may still help.

Best regards
Werner

John Hearns

Nov 29, 2015, 9:26:05 AM
to slurm-dev
Thank you, Werner.

The compute nodes were all in the idle~ state. I now know this means powered down,
but the nodes were actually up and running.
I restarted slurm completely, and things are OK now.
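
For anyone hitting the same thing, here is a rough sketch of how one might spot nodes that slurmctld believes are powered down (a trailing `~` on the state, as in idle~). It assumes the output of `sinfo -N -h -o "%N %t"`, which prints one "node state" pair per line. The node names in SAMPLE and the helper names are made up for illustration, and this is not what was actually run on the cluster above.

```python
import subprocess


def parse_sinfo(output):
    """Map node name -> compact state from `sinfo -N -h -o "%N %t"` output."""
    states = {}
    for line in output.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or unexpected lines
        node, state = parts
        # A node in several partitions is listed once per partition;
        # the dict keeps a single entry per node.
        states[node] = state
    return states


def power_saved_nodes(states):
    """Return nodes whose state carries the '~' (power saving) suffix."""
    return sorted(node for node, state in states.items() if state.endswith("~"))


def query_sinfo():
    """Run sinfo on a real cluster (assumes sinfo is on PATH)."""
    result = subprocess.run(
        ["sinfo", "-N", "-h", "-o", "%N %t"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# Hypothetical sample output, so the sketch runs without a cluster.
SAMPLE = """\
node001 idle~
node002 idle~
node003 alloc
node003 alloc
node004 down*
"""

if __name__ == "__main__":
    print(power_saved_nodes(parse_sinfo(SAMPLE)))
```

On a live system, `power_saved_nodes(parse_sinfo(query_sinfo()))` would give the list to compare against the nodes that are physically up. If a node shows up here while it is actually running, slurmctld's view of its power state is stale, which would be consistent with jobs sitting in CF.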