Any way to use a local Docker image on AWS Batch instead of pulling from a repo?


Sam Tischfield

Jul 6, 2018, 2:12:01 PM
to Nextflow
I am running a process 10K times using NF on AWS Batch. Each instance pulls the Docker image from the Docker repo, which takes a while to download. I created a custom AMI that contains the Docker image, but NF always tries to pull from the Docker repo. Is there a way to specify a local Docker image?

Paolo Di Tommaso

Jul 6, 2018, 2:43:09 PM
to nextflow
NF is not aware of whether the container image is remote or local; that depends only on the Docker engine. Also, AFAIK, once the image has been downloaded Docker does not try to download it again, provided you don't use a different image/tag name or use the `pull` command.
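
A minimal sketch of how one could check, on a given instance, whether an image is already in the local Docker cache. It assumes the `docker` CLI is on the PATH and relies on `docker image inspect` exiting with a non-zero status when the image is not present locally. The function name `image_is_local` and its `runner` parameter are hypothetical and only there for illustration.

```python
import subprocess


def image_is_local(image, runner=subprocess.run):
    """Return True if `image` (name:tag) is already in the local Docker cache.

    `docker image inspect` only looks at the local image store: it exits
    with status 0 when the image exists locally and non-zero otherwise.
    """
    result = runner(
        ["docker", "image", "inspect", image],
        stdout=subprocess.DEVNULL,  # discard the JSON description
        stderr=subprocess.DEVNULL,  # discard the "No such image" message
    )
    return result.returncode == 0
```

Calling `image_is_local("myorg/mytool:1.0")` (a made-up image name) on an instance started from the custom AMI would show whether the image baked into the AMI is visible to the Docker engine under exactly the name and tag the jobs use.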


p


--
You received this message because you are subscribed to the Google Groups "Nextflow" group.
To unsubscribe from this group and stop receiving emails from it, send an email to nextflow+unsubscribe@googlegroups.com.
Visit this group at https://groups.google.com/group/nextflow.
For more options, visit https://groups.google.com/d/optout.

Francesco Strozzi

Jul 7, 2018, 6:59:07 AM
to next...@googlegroups.com
Hi Sam,
It is Batch that manages container deployment, not Nextflow. Normally, once a container image is staged on a running ECS instance, it should not be re-downloaded for other jobs. What resources do your jobs request? Which instance types is Batch running for your workloads? One possibility, if your 10k jobs are not very demanding, is to create a Batch compute environment that allows only a few large instance types. That way you will have fewer instances running and far fewer Docker image pulls.
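
The effect of instance size on the number of image pulls can be sketched with simple arithmetic. This is only an illustration: it assumes each job needs one vCPU, that jobs pack perfectly onto instances, and that each instance pulls the image once. The concurrency figure of 256 jobs is hypothetical, and `instances_needed` is a made-up helper name.

```python
import math


def instances_needed(concurrent_jobs, vcpus_per_job, vcpus_per_instance):
    """Instances required to run `concurrent_jobs` at once, assuming perfect packing."""
    jobs_per_instance = vcpus_per_instance // vcpus_per_job
    if jobs_per_instance < 1:
        raise ValueError("job does not fit on this instance type")
    return math.ceil(concurrent_jobs / jobs_per_instance)


# Hypothetical workload: 256 single-vCPU jobs running at the same time.
CONCURRENT_JOBS = 256
small = instances_needed(CONCURRENT_JOBS, 1, 2)    # m4.large has 2 vCPUs
large = instances_needed(CONCURRENT_JOBS, 1, 16)   # m4.4xlarge has 16 vCPUs
print(small, large)  # 128 16
```

Under these assumptions the same workload triggers 128 first-time pulls on m4.large instances but only 16 on m4.4xlarge instances.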

Hope it helps 

Cheers 
Francesco 



Sam Tischfield

Jul 7, 2018, 1:29:01 PM
to Nextflow
Thanks for the input. I could get by with small EC2 nodes; however, Batch doesn't seem to offer the smaller types, so the jobs are running on m4.large instances. The Docker image is relatively small (<400 MB) and appears to load on an instance in about a minute. Over all the jobs, if each instance downloaded the image, that would cost less than about $20, so not a huge increase, but it seems like there could be a better solution. Also, it's quite possible that the image won't be downloaded again if it is already loaded, but I don't know how to verify this.
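
The rough cost figure above can be reproduced with a back-of-the-envelope calculation. This sketch takes the worst case in which every one of the 10,000 jobs waits about one minute for a pull. The hourly price of $0.10 is an assumed on-demand rate for m4.large, not a figure from this thread, and `pull_overhead_cost` is a hypothetical helper.

```python
def pull_overhead_cost(n_pulls, seconds_per_pull, hourly_price_usd):
    """Cost in USD of the instance time spent pulling the image."""
    hours = n_pulls * seconds_per_pull / 3600
    return hours * hourly_price_usd


# Worst case: all 10,000 jobs each wait ~60 s for the image to download.
# The $0.10/hour price is an assumption for illustration only.
cost = pull_overhead_cost(n_pulls=10_000, seconds_per_pull=60, hourly_price_usd=0.10)
print(f"${cost:.2f}")  # $16.67
```

If instances reuse a cached image for later jobs, the number of pulls is the number of instances instead of the number of jobs, and the overhead shrinks accordingly.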

Francesco Strozzi

Jul 8, 2018, 4:16:37 AM
to next...@googlegroups.com
m4.large instances are very small, with 2 vCPUs if I recall correctly. Try specifying only larger types, such as m4.4xlarge or m4.10xlarge, when you create a compute environment, so that Batch is forced to use only those. Larger instances also have faster networking, so the container download will take less time.
To get an idea of how much time is spent downloading the Docker image, look in the Batch dashboard at how long the jobs stay in the Starting state. When you already have running ECS instances that have staged the image, jobs should pass almost instantly from Starting to Running.
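
As a rough programmatic alternative to the dashboard, the job timestamps returned by the AWS Batch DescribeJobs API can be compared. The sketch below assumes `boto3` is installed and configured with credentials; the helper names are hypothetical. Note that the interval it computes runs from job creation to the start of the Running state, so it also includes any time spent queued and is only an upper bound on the time spent in Starting.

```python
def seconds_before_running(job):
    """Seconds between job creation and the start of the RUNNING state.

    `job` is one entry of the `jobs` list returned by DescribeJobs, whose
    `createdAt` and `startedAt` fields are Unix timestamps in milliseconds.
    Returns None if the job has not started yet.
    """
    if "startedAt" not in job or "createdAt" not in job:
        return None
    return (job["startedAt"] - job["createdAt"]) / 1000.0


def fetch_jobs(job_ids):
    """Fetch job descriptions with boto3 (assumed installed and configured)."""
    import boto3  # imported lazily so the helper above works without it

    batch = boto3.client("batch")
    jobs = []
    # DescribeJobs accepts at most 100 job IDs per call.
    for i in range(0, len(job_ids), 100):
        jobs.extend(batch.describe_jobs(jobs=job_ids[i:i + 100])["jobs"])
    return jobs
```

Jobs that land on an instance which has already staged the image should show a noticeably shorter interval than the first job on a fresh instance.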

Cheers 
Francesco 