Submit additional jobs from within a container


Brian Puchala

unread,
Jul 16, 2017, 2:44:00 PM7/16/17
to singularity
Hi,

I'm trying to familiarize myself with how Singularity might work for our application.  We have components that submit additional jobs through the host job manager (TORQUE or SLURM). Is it possible to run these within their own container?  Is there an example that shows how?

Thanks!

Oleksandr Moskalenko

unread,
Jul 16, 2017, 2:52:27 PM7/16/17
to singu...@lbl.gov
Hi Brian,

Someone will likely tell you about any native scheduler support. I just wanted to note that it is possible to submit Torque or SLURM jobs (we did this with SLURM) from within Singularity containers without any additional special or native support. You can bind-mount the SLURM directory tree inside the container, which automatically provides access to the up-to-date config, and place the SLURM bin directory in $SINGULARITYENV_PATH. Once that's done, applications that create their own workflows can submit jobs.
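For example, something along these lines (a rough sketch only; the SLURM prefix /opt/slurm, the config and munge paths, and the image and script names are assumptions about the host layout, so adjust them to your site):

```bash
# Put the host's SLURM client binaries on PATH inside the container.
export SINGULARITYENV_PATH="/opt/slurm/bin:/usr/local/bin:/usr/bin:/bin"

# Bind-mount the SLURM tree, its config, and the munge socket so the client
# sees the live cluster configuration and can authenticate; depending on the
# install, shared libraries such as libmunge may also need to be made visible.
singularity exec \
  --bind /opt/slurm \
  --bind /etc/slurm \
  --bind /var/run/munge \
  my_container.img \
  sbatch my_job.sh
```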

Regards,

Alex


vanessa s

unread,
Jul 16, 2017, 3:13:54 PM7/16/17
to singu...@lbl.gov
+1! And if you already have some other method for doing this, you can use the same thing (and think of singularity just as the executable being called in the submission script).


--
Vanessa Villamia Sochat
Stanford University '16

Paolo Di Tommaso

unread,
Jul 16, 2017, 6:47:20 PM7/16/17
to singu...@lbl.gov
A better approach is to separate the application logic from the scheduling logic; by doing that you will be able to isolate your job executions with singularity and submit them to SLURM or any other cluster.

Hope it helps. 


vanessa s

unread,
Jul 16, 2017, 7:03:28 PM7/16/17
to singu...@lbl.gov
yes huuuge +1! If you think about the singularity container like any other executable, this would do the trick :)


Brian Puchala

unread,
Jul 17, 2017, 2:06:32 AM7/17/17
to singu...@lbl.gov
Thank you for the response, but I'm not sure I understand it. I can bind-mount directories from the host, and I can read the files inside the directories, via 'cat' or 'vi' or what have you, but I can't execute them. So if I have a shell script or Python subprocess call to 'qsub', for instance, it always says file not found. Should I be able to do that? Or are you describing a different approach?

Also, I've installed vagrant and singularityware/singularity-2.3.1 as instructed here: http://singularity.lbl.gov/install-mac, but using $SINGULARITYENV_PATH doesn't affect my PATH inside the container. I can set other variables via $SINGULARITYENV_WHATEVER, but not PATH.

Thanks again for the help,
Brian



Brian Puchala

unread,
Jul 17, 2017, 2:15:09 AM7/17/17
to singu...@lbl.gov
Thank you for the response.  The purpose of a significant part of our software package is to decide what jobs are necessary and submit them. I imagine this is not such an unusual potential use case.

Cheers,
Brian


--
>>
>> whoami
Brian Puchala
Assistant Research Scientist
Materials Science and Engineering
University of Michigan
>>

Kim Wong

unread,
Feb 7, 2018, 4:33:07 PM2/7/18
to singularity
Hi Brian,

Did you ever find a solution to this question?  This is a functionality we would like to use as well.  Thanks.



Jonathan Greenberg

unread,
Feb 16, 2018, 12:36:08 PM2/16/18
to singularity
I had a similar question about a month ago that we didn't quite get figured out:


 -- a job running within a singularity container that passes an sbatch command (similar to qsub) to the "global" scheduler.  It has to do with the container interacting with its "global" environment -- we can mount files within the container, but haven't figured out how to submit them.

I think one of the issues that came up is that this might be bad practice for portable code, but in our case we don't necessarily care about sharing the container with someone else -- our HPC REQUIRES that we use singularity, but we need to have a singularity container create jobs and then submit them.  In my case, what I end up doing is having the container build the jobs, but then I have to submit them manually (in the "outside" environment).

I think the basic request is allowing a container to execute something (anything) in the "containing" environment as if the user was typing it on the command line in that environment.

Bennet Fauber

unread,
Feb 16, 2018, 12:43:33 PM2/16/18
to singu...@lbl.gov
Wouldn't you need to configure the container as a slurm/torque (what have you) client? If, internal to the container, it can run the slurm client commands, and it knows the correct scheduler node name, wouldn't that work? That seems like it would be required to stick with the purpose of containing the application to the environment inside the container. What am I missing?

Brian Puchala

unread,
Feb 16, 2018, 12:47:40 PM2/16/18
to singularity
I never figured this out and gave up.  If someone gives an easy-to-use example of how to call the host's "sbatch" or "qsub" executable from within the container, I would try singularity again.

Jonathan Greenberg

unread,
Feb 16, 2018, 12:59:53 PM2/16/18
to singularity
Basically, I wasn't able to figure out how to do that (it sounds like Brian couldn't either) -- how do I call an "external" sbatch with all the needed environment variables?

A straightforward example would be perfect, if someone has solved this!

Jason Stover

unread,
Feb 16, 2018, 1:33:38 PM2/16/18
to singu...@lbl.gov
I was able to do this with slurm ... I did an install of the slurm package into the container and pointed the configuration at the scheduler.

If you have, for example, Torque installed somewhere like /software/torque/[version], you could bind mount that into the container. I think the only variable that needs to be set is PBS_SERVER or PBS_MASTER ...

Once that's set, you should be able to talk to it ... though if you're bind mounting the install location, and the spool directory if needed, then you shouldn't need to set the variable at all, as it'll be read from the config in the spool.
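For illustration, that might look roughly like this (an untested sketch; the version path, spool location, and image name are assumptions about the site layout):

```bash
# Bind the Torque install tree and its spool so the client commands can
# read the server name from the config in the spool directory.
singularity exec \
  --bind /software/torque/6.1.2 \
  --bind /var/spool/torque \
  centos-7.img \
  /software/torque/6.1.2/bin/qstat -q
```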

-J

Jason Stover

unread,
Feb 16, 2018, 1:48:49 PM2/16/18
to singu...@lbl.gov
As an addition ...

Having the scheduler in the container, installed or bind-mounted, should "just work" if the nodes are set up to be able to submit jobs. Since we don't have Network Namespaces, we're using the host network, so the scheduler will see the traffic as coming from the host, and it _should_ be allowed.

You'd also want to be sure the workdir is consistent between the path in the container and outside. So if you're submitting from a directory `/scratch/job_setup/` in the container, that should exist outside the container too, or you should pass an option when submitting to point at the correct location (e.g. `-d` in Torque).
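Concretely, something like this (a sketch; the scratch path, image name, and job script are placeholders):

```bash
# Run from a directory that exists at the same path on the host and in the
# container, so the job's working directory resolves the same way on both sides.
cd /scratch/job_setup
singularity exec --bind /scratch centos-7.img qsub -d /scratch/job_setup job.pbs
```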

-J

Bennet Fauber

unread,
Feb 16, 2018, 1:54:49 PM2/16/18
to singu...@lbl.gov
That's pretty much exactly what I was thinking should work, Jason.
Thanks for saying it more elegantly and concisely than I could. :-)

Kim Wong

unread,
Feb 16, 2018, 2:05:02 PM2/16/18
to singularity
After some trial and error last week, I got it working under SLURM.  The Singularity file is attached.  It was adapted from the centos example.  Originally, I attempted to install SLURM and munge within the container.  Package installation was a success, but getting munge working was a no-go: using systemd to initialize the process inside the container was prohibited.  In the end, the comment about bind paths led down the useful path.

To build the container, issue

sudo $(which singularity) build centos.sim singularity-centos

To test the container, issue

$(which singularity) shell --bind /etc/munge --bind /var/log/munge --bind /var/run/munge --bind /usr/bin --bind /etc/slurm --bind /usr/lib64 --bind /ihome/crc/wrappers --bind /ihome/crc/install centos.sim --login

You will need to adapt the above bind paths to your specific host environment.  The --login at the end is important for initializing our lmod environment as the inits are done within profile.d.  In the singularity file, the slurm and munge user creation parts should use the appropriate gid/uid corresponding to what you have on the host.  The yum install for lua-* was needed for lmod.  I tested everything by submitting an Amber MPI job from within the container.

Good luck.  You might be able to adapt this to Ubuntu but I have not tried.  Our host OS is RHEL7, so I might have benefited from "free" compatibility by selecting CentOS for the image.



singularity-centos

Brian Puchala

unread,
Feb 24, 2018, 10:15:56 AM2/24/18
to singularity
With Bennet's help I was able to submit jobs from inside a test container, so in case it's helpful here's how I did it. The main difference from the previous approach is that it doesn't modify the image.

On our cluster singularity is configured to mount our home directory by default (in singularity.conf: mount home = yes), so starting from there:

# get a CentOS 7 image (that's what our cluster runs, which minimizes the number of libs that differ):
singularity pull docker://centos:7

# create a place for the host programs I need and copy them over;
# because our home directories are always bound, no additional
# bind commands are needed when starting the container
mkdir -p ~/.local/host/bin
cp /usr/local/bin/qstat ~/.local/host/bin
# etc.

# ends up with:
$ tree /home/bpuchala/.local/host/bin
/home/bpuchala/.local/host/bin
├── qalter
├── qdel
├── qhold
├── qrls
├── qselect
├── qstat
└── qsub

# create a place for the host libs I need:
mkdir -p ~/.local/host/lib

# use ldd on the programs inside and outside the container to
# find which libraries need to be copied.

# inside container:
singularity shell centos-7.img
> ldd /home/bpuchala/.local/host/bin/qstat
linux-vdso.so.1 =>  (0x00007ffc5cf42000)
libtorque.so.2 => not found
libtcl8.5.so => not found
...

# outside container:
$ ldd /usr/local/bin/qstat
linux-vdso.so.1 =>  (0x00007fff90391000)
libtorque.so.2 => /usr/local/lib/libtorque.so.2 (0x00002b2567edb000)
libtcl8.5.so => /lib64/libtcl8.5.so (0x00002b2568808000)
...

cp /usr/local/lib/libtorque.so.2 ~/.local/host/lib
# etc.

# ends up with:
$ tree /home/bpuchala/.local/host/lib
/home/bpuchala/.local/host/lib
├── libhwloc.so.5
├── libltdl.so.7
├── libnuma.so.1
├── libtcl8.5.so
└── libtorque.so.2

# test job submission and management from inside container:
singularity shell centos-7.img

# set PATH inside the container to find the host programs we copied
export PATH=$HOME/.local/host/bin:$PATH

# set LD_LIBRARY_PATH inside the container to find the host libs we copied
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HOME/.local/host/lib

# test
qstat
qsub ~/submit_scripts/hello_world.sh

Paolo Di Tommaso

unread,
Feb 24, 2018, 10:35:46 AM2/24/18
to singu...@lbl.gov
What's the advantage of this approach? Wouldn't it be much easier to just run `qsub singularity exec ...etc` instead of submitting the jobs from within the container?
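That is, something along these lines (a sketch; the image and script names are placeholders):

```bash
#!/bin/bash
#PBS -l walltime=01:00:00
# submit.sh: the scheduler launches the container, rather than the container
# launching scheduler commands.
singularity exec centos-7.img ./my_analysis.sh
```

submitted with `qsub submit.sh`.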




Bennet Fauber

unread,
Feb 24, 2018, 10:49:43 AM2/24/18
to singu...@lbl.gov
Paolo,

The container is persistent and the jobs are transient. The end goal is to have additional software inside the container that evaluates the many jobs that have run and determines which jobs should be run next, creates the submission script(s), and submits them. That is iterative.

The container is, essentially, acting as an interactive 'shell' for
managing a workflow.

I think. Brian will correct me if I have misinterpreted.

v

unread,
Feb 24, 2018, 11:14:00 AM2/24/18
to singu...@lbl.gov
To each his own... there's no wrong way to eat a Reese's! :)



Paolo Di Tommaso

unread,
Feb 24, 2018, 11:18:05 AM2/24/18
to singu...@lbl.gov
Surely it's my fault, but I still don't see why this cannot be done by having `qsub` launch singularity and not the other way around.

p



Bennet Fauber

unread,
Feb 24, 2018, 11:25:37 AM2/24/18
to singu...@lbl.gov
Maybe it can be, but it still needs to be able to qsub jobs of its
own, monitor their progress, delete jobs that are going astray, etc.

It's a job manager; it needs to manage jobs running on the host
cluster outside itself.





Brian Puchala

unread,
Feb 24, 2018, 11:26:06 AM2/24/18
to singularity
A primary purpose of our software is to decide which jobs need to be run, set up input files, and collect results so we don't have to do that ourselves.

Kim Wong

unread,
Feb 24, 2018, 12:01:09 PM2/24/18
to singularity
The ability to submit jobs from within a container is important for certain domain applications where future job submissions depend on the progress of the currently running job.  This capability is useful for adaptive parameter space sampling, for example, where you want to automatically launch additional jobs from within a running job so as to get coverage over a region that has insufficient sampling.  Typically, these additional jobs are launched not at the conclusion of the running job and not at uniform intervals, but at arbitrary intervals determined adaptively by some built-in logic.  Furthermore, the additional jobs that were launched from within a container can themselves launch jobs, and so on.  This cascade of job submissions automates a lot of the grunt work, so that a researcher can start the project, walk away for coffee or to another project, and come back later to do the analysis workup when the data becomes available.

Also, if you can tailor each job to be short (< 4 hours, for example), you can take advantage of an HPC center's backfill capability and use cycles that would otherwise be wasted while nodes sit in the drain state preparing for jobs with long wallclock requirements.



Jeff Kriske

unread,
Feb 25, 2018, 9:07:33 AM2/25/18
to singularity
I'll chime in with how I've done this...

Our environment uses environment modules and Univa Grid Engine (UGE). I simply bind the installation directory and the directory containing the module file. From within the container, as long as I've sourced /etc/profile, I can module load our cluster setup module and start using qsub against the host without any issue.

The main point isn't necessarily to run qsub manually from within the container: many workflows are built on libraries that assume qsub is available, or that DRMAA libraries are present with the correct configs. Running qsub outside the container would break these kinds of workflows and wouldn't make sense.
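Roughly like this (a sketch rather than my exact setup; the UGE prefix, modulefile directory, image, and module name are placeholders):

```bash
# Bind the scheduler install and the modulefiles, then use a login shell so
# /etc/profile (and the module system) gets initialized inside the container.
singularity exec \
  --bind /opt/uge \
  --bind /opt/modulefiles \
  my_container.img \
  bash -l -c 'module load cluster-setup && qsub my_job.sh'
```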

Jonathan Greenberg

unread,
May 21, 2018, 12:39:35 PM5/21/18
to singularity
We're still trying to get a definition file working "right" (see my latest post) but we are seeing a promising lead by simply mapping the slurm commands to an ssh remote execution command, e.g.:

  alias squeue="ssh $USER@$HOSTNAME squeue"

As long as you set up ssh keypairs, and your host system allows this, this will work.

I'm trying to get this formalized in a definition file.  Any suggestions would be welcome.

--j

Jonathan Greenberg

unread,
May 22, 2018, 11:35:49 AM5/22/18
to singularity
So we solved this as follows:
1) Create a file named after the slurm command, e.g.:

echo '#!/bin/bash
ssh $USER@$HOSTNAME squeue $1' >> /usr/local/bin/squeue

2) chmod 755 /usr/local/bin/squeue

This solved it!  Note that you need passwordless ssh and it assumes you are running this on the same system with slurm ($USER and $HOSTNAME), but could easily be modified.

Here's a full definition file for this solution:


--j

Matthew Strasiotto

unread,
Apr 24, 2020, 6:55:13 AM4/24/20
to singularity
The link you gave is dead now - Here's a permalink https://github.com/gearslaboratory/gears-singularity/blob/d7823b98e4c823a8747b048f6f53e2b3d4f061d5/singularity-definitions/development/Singularity.gears-rfsrc-openmpi#L90-L153

I think this is one of my favourite solutions to this problem, because it circumvents a LOT of the complexity & caveats implicit in the binding solutions, and it seems to "just work".

I'd probably suggest a little bit of `bash` metaprogramming to make this a bit more DRY- 

Something like 

```bash
function invoke_host {
  # Generate /usr/local/bin/<command> as a wrapper that forwards the call to
  # the host over ssh. $USER, $HOSTNAME, and the arguments are expanded when
  # the wrapper runs, not when it is generated.
  local host_exe=$1
  printf '#!/bin/bash\nssh $USER@$HOSTNAME %s "$@"\n' "$host_exe" > /usr/local/bin/${host_exe}
  chmod +x /usr/local/bin/${host_exe}
}

# For example:
invoke_host qsub
# etc.
```

(Untested - I'll update it when I see how it performs)

Matthew Strasiotto

unread,
Apr 24, 2020, 8:32:22 AM4/24/20
to singularity
Here's how to make that a little less tedious to implement :) 
invoke_host.bash

Alan Sill

unread,
Apr 24, 2020, 9:08:14 AM4/24/20
to singu...@lbl.gov, Alan Sill
If you do something like this, you probably want to set up a restricted set of keys to use for this purpose. This is easy to do by prepending a "command=…" option to that specific key's line in the authorized_keys file, forcing the key to be limited to executing the given command. You can also use "from=…" to ensure the connection is coming from your host and not just anywhere. The full list of restrictions you can force to be in effect when a key is used to connect via ssh is in the sshd man page (man 8 sshd; search for AUTHORIZED_KEYS).
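For example, an entry in ~/.ssh/authorized_keys on the host might look like this (illustrative only; the source pattern, wrapper path, and key material are placeholders):

```
from="10.1.*",command="/usr/local/bin/sbatch-wrapper",no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAA... container-submit-key
```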

Note that you can make the forced command a wrapper script that sets up the logic of what you are trying to do, including locations of input and output files, etc. Just be sure not to invoke a command that provides any ability to escape to a shell, which would eliminate the advantage of using the forced command.

There's more on this topic in chapter 8 of the O'Reilly book "SSH, The Secure Shell: The Definitive Guide" by Daniel Barrett and Richard Silverman, or chapter 12 of "SSH Mastery: OpenSSH, PuTTY, Certificates, Tunnels, and Keys" by Michael W. Lucas. Note that I am not advocating for ssh automation in this case, but if you're going to use it, it would probably be good to use some of the features of ssh to set up a more restricted environment rather than leave keys around that can get into your account if they get loose in the world.

Alan



Matthew Strasiotto

unread,
Apr 28, 2020, 1:42:37 AM4/28/20
to singularity, alan...@nsfcac.org
That's broadly speaking good advice, although in this instance both halves of the keypair likely live in the same host's ~/.ssh, so you just need to add your own ~/.ssh/id_rsa.pub to your ~/.ssh/authorized_keys.

I do not view this as an appreciable vulnerability, at least in my use-case, as the singularity container will be running on the HPC host, and the host's $HOME directory is mounted on the container's $HOME directory, which implies that if a bad actor were to access this private key, they would already have access to your user account.


Erik Sjölund

unread,
Oct 1, 2020, 1:27:17 PM10/1/20
to singu...@lbl.gov, alan...@nsfcac.org
I wonder if it's possible to do a variation of this, but relying on Unix Domain sockets and file permissions to block others from executing commands on your behalf. (The solution would not necessarily need to use sshd.)

Something like this:

Before invoking Singularity, start a server process from your normal user account and let it listen on a Unix Domain socket. The socket would be placed in a directory that only your user account has access to, and that directory would be bind-mounted when you start Singularity to make the socket accessible from within the container. If the server process had functionality to receive slurm commands via this socket and execute them, we would have something similar to the SSH server solution.

I don't know what possibilities there are for implementing such a server. Maybe:

* Implementing a web app (running on a web server that listens on a Unix Domain socket).
* Running sshd as your normal user (non-root) and having it listen on a Unix Domain socket. (I haven't checked whether that is even supported by sshd.)
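For a rough illustration of the idea, socat could probably play the server role. An untested sketch (socat would need to be installed on the host and in the image; the paths are placeholders, and a real version should validate what it reads before executing anything):

```bash
# --- on the host, as your normal user, before starting Singularity ---
mkdir -p -m 700 "$HOME/.submit"                 # socket directory only you can access

cat > "$HOME/.submit/run-sbatch.sh" <<'EOF'
#!/bin/bash
# Read one line (a path to a batch script) from the connection and submit it.
read -r script
sbatch "$script"
EOF
chmod 700 "$HOME/.submit/run-sbatch.sh"

# Hand each connection on the Unix Domain socket to the wrapper script.
socat UNIX-LISTEN:"$HOME/.submit/sock",fork EXEC:"$HOME/.submit/run-sbatch.sh" &

# --- inside the container (with $HOME bind-mounted) ---
echo "$HOME/jobs/hello.sh" | socat - UNIX-CONNECT:"$HOME/.submit/sock"
```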

Any other ideas?

Best regards,
Erik Sjölund