Running out of disk during STAR genomeGenerate / specify disk size for AWS batch


Olga Botvinnik

May 27, 2019, 10:48:30 PM
to Nextflow
Hello,
I'm running out of disk space when running STAR's genomeGenerate command in a pipeline built on nf-core/rnaseq. Below is an excerpt of the log.

ERROR ~ Error executing process > 'makeSTARindex (ERCC__GRCm38.fa)'

Caused by:
  Process `makeSTARindex (ERCC__GRCm38.fa)` terminated with an error exit status (110)

Command executed:

  mkdir star
  STAR \
      --runMode genomeGenerate \
      --runThreadN 10 \
      --sjdbGTFfile ERCC__GRCm38.gtf \
      --genomeDir star/ \
      --genomeFastaFiles ERCC__GRCm38.fa \
      --limitGenomeGenerateRAM 85799345920

Command exit status:
  110

Command output:
  May 24 20:42:09 ..... started STAR run
  May 24 20:42:09 ... starting to generate Genome files
  May 24 20:43:20 ... starting to sort Suffix Array. This may take a long time...
  May 24 20:43:37 ... sorting Suffix Array chunks and saving them to disk...

Command wrapper:
  nxf-scratch-dir ip-172-31-16-85:/tmp/nxf.G2dSVzU2DF
  May 24 20:42:09 ..... started STAR run
  May 24 20:42:09 ... starting to generate Genome files
  May 24 20:43:20 ... starting to sort Suffix Array. This may take a long time...
  May 24 20:43:37 ... sorting Suffix Array chunks and saving them to disk...

  Genome_genomeGenerate.cpp:339:genomeGenerate: exiting because of *OUTPUT FILE* error: could not write the output file star//SA_8
  fail()=1 ; bad()=1
  Error while trying to write chunk # 2; 4294967294 bytes
  File size full = 4479743408 bytes
  File size on disk = 2437038080 bytes
  Solution: check that you have enough space on the disk
  Empty space on disk = 0 bytes

  May 24 20:57:17 ...... FATAL ERROR, exiting
  tee: .command.err: No space left on device

Work dir:
  s3://czb-nextflow/rnaseq/5d/2db41175c4a38cb4871b8f71b91f33

Tip: when you have fixed the problem you can continue the execution appending to the nextflow command line the option `-resume`



I tried setting the `disk` process directive with `disk "100.GB"`, even though the docs say it isn't propagated to AWS Batch (should I submit a feature request for this?).
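What I set looks roughly like this in `nextflow.config` (a sketch; per the docs, `disk` is not honored by the AWS Batch executor):

```groovy
// nextflow.config -- sketch; the docs say the disk directive is
// not propagated to AWS Batch, so this may have no effect there
process {
    withName: makeSTARindex {
        disk "100.GB"
    }
}
```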

Is this really a Docker image space issue, as described here? I'm not sure how that applies in my case, since I only build Docker images, not AMIs. I couldn't work out how to adjust the image size without making every image on my Mac balloon, so instead I added the `containerOptions` directive to specify the base image size:

    process makeSTARindex {
        tag "$fasta"
        publishDir path: { params.saveReference ? "${params.outdir}/reference_genome" : params.outdir },
                   saveAs: { params.saveReference ? it : null }, mode: 'copy'
        containerOptions "--storage-opt dm.basesize=100G"
        // ... rest of the process as in nf-core/rnaseq
    }


Not sure if it's working yet, but wanted to get my question in now just in case.

Thank you!
Warmest,
Olga

Paolo Di Tommaso

May 29, 2019, 1:54:56 AM
to nextflow
The `containerOptions` directive is not applied to Batch jobs, so it has no effect here.

I guess you need to properly configure the base size as explained in the docs, or mount an EBS volume using the latest edge release.
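(For reference: on the ECS-optimized Amazon Linux AMI with the devicemapper storage driver, the base size is set through the Docker daemon options on the host instance. A sketch only; the exact file and storage driver vary by AMI:)

```shell
# /etc/sysconfig/docker on the host AMI -- sketch, devicemapper driver only;
# raises each container's base filesystem size; restart the daemon afterwards
OPTIONS="${OPTIONS} --storage-opt dm.basesize=100G"
```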


Hope it helps 

p

--
You received this message because you are subscribed to the Google Groups "Nextflow" group.
To unsubscribe from this group and stop receiving emails from it, send an email to nextflow+u...@googlegroups.com.
Visit this group at https://groups.google.com/group/nextflow.
To view this discussion on the web visit https://groups.google.com/d/msgid/nextflow/cffc29a2-c374-4d5b-b57d-dbaa49a0ae28%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Olga Botvinnik

May 29, 2019, 8:54:00 PM
to Nextflow
Thank you for your response! I was finally able to regenerate an AMI that allows a larger Docker image base size. Is there a way to have a process-specific allocation, e.g. adding 1000 GB only for STAR genomeGenerate?



Sean Davis

May 30, 2019, 7:12:58 AM
to next...@googlegroups.com
Hi, Olga.

You might want to look at using ebs-autoscaling. https://docs.opendata.aws/genomics-workflows/core-env/create-custom-compute-resources/

Sean




--
Sean Davis, MD, PhD
Center for Cancer Research
National Cancer Institute
National Institutes of Health
Bethesda, MD 20892

Olga Botvinnik

May 30, 2019, 2:28:37 PM
to Nextflow
Hi Sean,
Wow, this is amazing! Thank you so much, this is exactly what I needed.
Warmest,
Olga



Olga Botvinnik

May 30, 2019, 2:31:24 PM
to Nextflow
Using this EBS autoscaling, would the mount point appropriate for Nextflow be `/tmp`, as in the docs?

Paolo Di Tommaso

May 31, 2019, 9:14:26 AM
to nextflow
Yes, the task work dir should be allocated in the `/tmp` container dir.


p


Anand Venkatraman

Mar 13, 2020, 11:21:24 AM
to Nextflow
Hi Olga,

I am also running into problems with the very same workflow at the STAR step. Could you share how you fixed this: how much space you allocated for the Docker base size, and any other recommendations or best practices?

Thanks in advance.

Felix Schlesinger

Jun 10, 2021, 5:49:06 PM
to Nextflow
Hi Paolo,

I don't understand why the instance (scratch) storage is needed in /tmp. Should it not be local Docker storage (/var/lib/docker)? That's also what the Tower Forge AWS EC2 launch template seems to use (plus some extra dance with the Docker daemon).

Should it work to just use a simple local path (e.g. /tmp) instead? How/where is that path set?

Felix

Paolo Di Tommaso

Jun 11, 2021, 10:33:39 AM
to nextflow
Not sure I understand your point; /var/lib/docker is the storage path used by the driver on the host, afaik. The job is executed in the container file system. /tmp is a conventional choice.

p

Felix Schlesinger

Jun 11, 2021, 5:58:35 PM
to Nextflow
I was wondering specifically where to mount extra space on the EC2 instance running in AWS Batch. I had misunderstood that Nextflow was by default mounting /tmp from the host into the container to use as scratch space. But it's just writing inside the container, which by default goes to /var/lib/docker on the host, right?

So the options are:
1) Mount the extra EBS volume so that /var/lib/docker is on it.
2) Use the `aws.batch.volumes` option to mount some other host path into the container and use that for intermediate files.
Right?

It looks like Tower Forge and the AWS Genomics template both do 1), but there seem to be some extra complications around restarting the Docker daemon to use the extra space.
For 2), it is unclear to me where (in the container) Nextflow stores input files staged from S3. In the work dir (-w)? But it seems possible to set that to an S3 location (nextflow run -w s3://...), so there must be another local path?
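(For what it's worth, a minimal sketch of option 2, assuming the extra EBS volume is already mounted at `/scratch` on the Batch compute instances; the `scratch` directive can point task execution at the mounted path:)

```groovy
// nextflow.config -- sketch for option 2; assumes an EBS volume is
// already mounted at /scratch on the Batch compute instances
aws {
    batch {
        volumes = '/scratch'  // expose the host path inside task containers
    }
}
process {
    scratch = '/scratch'      // run tasks (and stage inputs) under this path
}
```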

Thanks
  Felix  