--
You received this message because you are subscribed to the Google Groups "singularity" group.
To unsubscribe from this group and stop receiving emails from it, send an email to singularity...@lbl.gov.
There are several issues with a Job dependencies approach. The first is that the job to download and cache the image would have to be run on all nodes. We don't explicitly assign jobs to a select group of hosts. It's a batch processing environment. Every CPU compute node can run every CPU compute job, and the same is true for our GPU nodes. There is no guarantee that the node that runs the download job will take the tasks.
From the admin side, we are trying to slide in containerization between the existing system image and the applications without requiring a massive change to the user workflows. A lot of our users are researchers who just run a tool provided to them from the tool builders. They are not cluster aware application developers. We are working with our toolbuilders on this project, but their focus is on new work, not trying to rework old workflows. So we have to make this as seamless as possible to the existing job submission workflows. Introducing job dependencies will likely break these workflows.
Our prolog wrapper is how we are going to force the workflows into containerization. We have two "default" container images (one for CPU jobs, one for GPU jobs) that are almost 100% identical to our current images. That 9 GB container image I mentioned in our collection is one of those defaults. Our toolbuilders have already pushed back against the slow download times. The efforts to enable caching, object storage backends, etc. were attempts to reduce those download times.
I had hoped that enabling shub caching in Singularity would help, and it does, so long as we don't hit this corner case. But knowing our workflows, users, and operational procedures, we are going to hit it regularly. That's why I am trying to figure out alternatives short of putting our containers into a shared directory. The shared directory model would work for today, but we have to move to the cloud, and running a shared directory that is not object-based is costly there.
HPC and cloud are very different use cases, and that seems to be the edge we are hitting here!
> From the admin side, we are trying to slide in containerization between the existing system image and the applications without requiring a massive change to the user workflows. [...] Introducing job dependencies will likely break these workflows.

Singularity was (first) optimized for this use case (HPC), so a shared cache would be a good solution. The only other alternative would be to educate your users to pull to a file first and then submit the file to the jobs. You don't need to use dependencies exactly, just the simple example that @gmk gave of pulling first. Are you saying this isn't an option?
> Our prolog wrapper is how we are going to force the workflows into containerization. [...] The efforts to enable caching, object storage backends, etc. were attempts to reduce the download times.

A shared filesystem cache, so you could download once, would be a typical HPC use case (sharing binaries between jobs) and would help with this, no? I see that you opened a PR to do that here: https://github.com/sylabs/singularity/pull/2776
> I had hoped that by enabling shub caching in Singularity that it would help, and it does so long as we don't run into this corner case. [...] But we have to move to the cloud and having a running shared directory that is not object based is costly.

If you need to run a file, and the file is on some external server, it seems logical that you either download a gazillion copies of the same thing (not ideal) or share a few copies via shared filesystems (more ideal). The cloud is a different use case because the download won't be too costly on the instance, but it would be a burden on the registry. That is good rationale for having a service that can handle that kind of concurrency, whether that means deploying multiple front ends to your own storage or using something like Google Storage that can handle it. What could other solutions be? You could set up something complicated with Globus and a shared folder "somewhere else", but that just adds complexity. You could provide a wrapper for users that enforces a pull first, but it sounds like you don't want to do that.
Yes, that PR enables Singularity to check the local cache for shub downloads; this feature was missing. But even with that fix, if two tasks are released to the same node nearly simultaneously and the required container is not in the cache, the first task starts the download to the cache, while the second task just sees the file name in the cache and tries to run the partially downloaded image. If the download were sufficiently fast, this would be less of an issue. If the image is already cached, it is not an issue at all.
I don't have a problem with the wrapper doing the pull. The problem is the corner case where one download is running while another job on the same node starts trying to use the same image. Some of this may be our own fault, because we moved the Singularity cache out of ${HOME} and into a shared local directory. We did this because 1) the GPFS home directory is very limited and is read-only on the compute/GPU nodes (it is only meant for creating your compute environment), and 2) a shared cache reduces the amount of local storage used for image caching. I would just have to figure out a synchronization method to hold jobs while the image is being actively downloaded. The wrapper could do that.
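One way the wrapper could do that synchronization is an exclusive file lock around the download, so the second task on a node blocks until the first finishes instead of running a half-written image. A minimal sketch, not the actual prolog: the cache directory and image name are hypothetical, and download_image is a stand-in for the real transfer (e.g. `singularity pull` or an object-store client).

```shell
#!/bin/bash
# Serialize concurrent downloads of the same image on one node.
CACHE_DIR="${CACHE_DIR:-/tmp/sing-cache}"      # hypothetical shared local cache
IMAGE_PATH="$CACHE_DIR/default-cpu.sif"        # hypothetical default image
LOCK_PATH="$IMAGE_PATH.lock"
mkdir -p "$CACHE_DIR"

download_image() {
    # Stand-in for the real transfer. Write to a temp name, then move
    # into place atomically, so a reader never sees a partial image.
    echo "image-bytes" > "$1.part"
    mv "$1.part" "$1"
}

(
    flock -x 9                                 # block until we hold the lock
    if [ ! -f "$IMAGE_PATH" ]; then
        download_image "$IMAGE_PATH"           # only the first task downloads
    fi
) 9>"$LOCK_PATH"

echo "ready: $IMAGE_PATH"
```

Every task funnels through the same lock file, so a second job released to the node while a download is in flight simply waits, then finds the image already cached and proceeds.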
So I guess my next question would be: does Singularity itself support pulling directly from an object store using an S3 or Swift client? I know it handles docker/OCI, Singularity Library, Singularity Hub, and local file system paths. Direct object store support would probably be a better fit overall than moving to a shared file system, and the transition to a public cloud would be easier with the container store in object storage.
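Even without native object-store support, Singularity accepts plain file paths, so a wrapper could stage the image with a standard client first and then run the local file. A minimal sketch under assumed names (the bucket, key, and paths are hypothetical; the actual transfer line is shown commented out since it requires the AWS CLI and credentials):

```shell
#!/bin/bash
# Stage a container image from object storage, then run it as a file path.
BUCKET="s3://my-container-store"        # hypothetical bucket
KEY="images/default-cpu.sif"            # hypothetical object key
LOCAL="/tmp/$(basename "$KEY")"         # local staging path

# Actual transfer (needs a configured S3 client; swift could be used similarly):
#   aws s3 cp "$BUCKET/$KEY" "$LOCAL"
# The job then runs the staged file directly:
#   singularity exec "$LOCAL" <command>

echo "would stage: $BUCKET/$KEY -> $LOCAL"
```

Combined with the locking above, this keeps the object store as the source of truth while each node holds only a local copy.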
curl https://singularityhub.github.io/registry-org/singularityhub/busybox/manifests/latest/