
Slurm, cgroups, and emspring


David Hoover

Apr 28, 2017, 9:56:18 AM
to emspring
I've installed emspring and have it running. However, we use Slurm as our batch system, and Slurm uses cgroups (control groups) to pin processes to a limited set of CPUs. If I allocate a single CPU, spring launches and everything is fine. If I allocate multiple CPUs, however, spring stalls indefinitely. If I cheat by allocating all the CPUs of a node as an interactive session and then ssh-ing directly to the node, I can launch spring and everything works.

What is spring doing at the start that would prevent it from launching?  Please note that we can't turn off cgroups for a single job.
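One way to see what a cgroup-confined job is actually allowed to use is to compare the node-wide CPU count with the CPU affinity of the current process. The following is a minimal diagnostic sketch, not a description of what spring does at startup. The function name `cpu_report` is hypothetical. `os.sched_getaffinity` is only available on Linux with Python 3.3 or later, so the sketch falls back to the node-wide count elsewhere (including under Python 2.7). `SLURM_CPUS_ON_NODE` is an environment variable that Slurm sets inside an allocation.

```python
import multiprocessing
import os


def cpu_report():
    """Compare the node-wide CPU count with the CPUs this process may use."""
    total = multiprocessing.cpu_count()
    # os.sched_getaffinity exists on Linux (Python 3.3+); fall back otherwise.
    if hasattr(os, "sched_getaffinity"):
        allowed = sorted(os.sched_getaffinity(0))
    else:
        allowed = list(range(total))
    return {
        "total_cpus": total,
        "allowed_cpus": allowed,
        # Set by Slurm inside an allocation; None outside of one.
        "slurm_cpus_on_node": os.environ.get("SLURM_CPUS_ON_NODE"),
    }


if __name__ == "__main__":
    for key, value in cpu_report().items():
        print("{}: {}".format(key, value))
```

Running this inside the Slurm allocation and again in the direct ssh session would show whether the two environments expose a different set of CPUs to the process.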

David Hoover

May 2, 2017, 9:43:14 AM
to emspring
Just to let you know that I am making progress: I can now get spring to start up on a Slurm interactive node with multiple processors. I got a micexam job to complete with 4 CPUs in an MPI run. However, I got an error at the end:

Traceback (most recent call last):
  File "/usr/local/apps/SPRING/spring_v0-84-1470/lib/python2.7/site-packages/emspring-0.84.1470-py2.7.egg/spring/csinfrastr/csgui.py", line 800, in jobDone
    os.rename(self.prgparfile, self.directory + os.sep + 'parameters.par')
OSError: [Errno 2] No such file or directory

Is this a serious error or just noise?

David Hoover

May 2, 2017, 9:45:55 AM
to emspring
Oh, and here is another oddity:

$ spring --version && springenv e2version.py
Spring environment loaded.
GUI from package Emspring-0.84.1470
Spring environment loaded.
EMAN 2.1 alpha2 (CVS 2013/08/07 17:01:09)
Your EMAN2 is running on: 
Traceback (most recent call last):
  File "/usr/local/apps/SPRING/spring_v0-84-1470/parts/EMAN2/bin/e2version.py", line 90, in <module>
    main()
  File "/usr/local/apps/SPRING/spring_v0-84-1470/parts/EMAN2/bin/e2version.py", line 53, in main
    print 'Your EMAN2 is running on: ', result.split('"')[1], os.uname()[2], os.uname()[-1]    
IndexError: list index out of range



Carsten Sachse

May 2, 2017, 3:41:07 PM
to emspring
Thank you for pointing out the issue. It seems that newer operating systems are no longer compatible with e2version.py.

I will update the instructions so that the following is reported instead:
% spring --version && springenv python -c 'import platform; print platform.platform()'
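The IndexError in the traceback comes from indexing `result.split('"')[1]` when `result` contains no double-quoted field. The sketch below illustrates that failure mode and a `platform`-based report along the lines of the command above. The helper names `quoted_field` and `system_summary` are hypothetical, and the sample strings are made up for illustration; what e2version.py actually stores in `result` is not shown in the thread.

```python
import platform


def quoted_field(text, default="unknown"):
    """Return the text between the first pair of double quotes, or default."""
    parts = text.split('"')
    # A closed pair of quotes yields at least three parts:
    # text before, text inside, text after.
    return parts[1] if len(parts) >= 3 else default


def system_summary():
    """Describe the running system without parsing any external output."""
    return platform.platform()


if __name__ == "__main__":
    print(quoted_field('NAME="Example Linux"'))  # quoted field present
    print(quoted_field(""))                      # no quotes: falls back to default
    print(system_summary())
```

The single-argument `print(...)` calls behave the same under Python 2.7 and Python 3.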

Best wishes,


Carsten

Carsten Sachse

May 2, 2017, 3:44:02 PM
to emspring
Dear David,

Indeed, this is just noise: the GUI prints it when the running directory has been renamed while the job was running. There is nothing to worry about if the job itself finished correctly.
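For illustration, a rename that tolerates a missing path could look like the sketch below. This is not the code in csgui.py; `rename_if_present` is a hypothetical helper that swallows only the "No such file or directory" case (errno 2, `ENOENT`) seen in the traceback and re-raises everything else.

```python
import errno
import os


def rename_if_present(src, dst):
    """Rename src to dst; return False instead of raising if a path is gone."""
    try:
        os.rename(src, dst)
    except OSError as exc:
        # ENOENT covers a missing source file and a missing target directory.
        if exc.errno != errno.ENOENT:
            raise
        return False
    return True
```

A caller could then log a short notice when the function returns False rather than printing a full traceback.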

Best wishes,


Carsten