Dear all,
We have a workflow in which we want to process all of the 1000 Genomes CRAMs, so the disk requirements are fairly substantial and AWS Batch is ideal for us. However, we've run into a problem with the disk space available on our custom AMI.
The problem is that when multiple containers are scheduled on the same instance, the instance runs out of disk space. With maxForks = 1 or 2 there's no problem, but with anything higher the disk fills up.
For this particular run we can work around it by tuning our memory and CPU requirements so that at most two containers run per instance. (Obviously I could make the disk bigger, but that would just change the n of the problem from 2 to 4 or whatever -- we need to do some calculations to work out a reasonable configuration, as this is likely to be an expensive run.) That would be a satisfactory solution, since we hope this will be a one-off run and we are not building the workflow for other people to use.
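For concreteness, the kind of thing I have in mind is below -- just a sketch, where the process name and the figures are placeholders; the idea is that by requesting roughly half an instance's vCPUs and memory per task, Batch can only pack two such tasks onto each instance:

    // nextflow.config -- sketch only; 'align' and the numbers are placeholders
    process {
        withName: align {
            maxForks = 200        // overall parallelism across the run
            cpus     = 16         // roughly half the vCPUs of the target instance type,
            memory   = '30 GB'    // so Batch can place at most two of these per instance
        }
    }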
My question is not so much how to solve our specific problem as what the best way of handling this would be in future. Could we mount an EFS volume to our containers and use it as the local staging directory? (Of course, as soon as a job has completed we want to delete the local CRAM copy, because we don't want to pay for tens of TBs of disk storage.) Is there another way of doing things?
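In case it helps frame the question, this is roughly what I was imagining -- purely a sketch, not tested: the paths are invented, it assumes the EFS filesystem is already mounted on the host at /mnt/efs (e.g. via the custom AMI or a launch template), and I'm not sure whether the per-task staging area can actually be redirected to such a mount:

    // nextflow.config -- sketch only, not tested
    // assumes EFS is already mounted on the instance at /mnt/efs
    aws {
        batch {
            // make the host mount visible inside every task container
            volumes = ['/mnt/efs/scratch:/scratch']
        }
    }

Whether each task's scratch data could then be pointed there, and cleaned up as the task completes, is exactly the part I'd like advice on.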
As an aside, the AWS Batch documentation may need some updating. The limitation on Docker image sizes and the base device size no longer seems to apply to the current version of Docker on the default AWS ECS instance (and the instructions no longer work, since e.g. docker info does not show the base device size).
Many thanks
Scot