BWA mem + nextflow best practice


Elinor L

Feb 15, 2021, 8:55:51 AM
to Nextflow
Hello all!

BWA mem requires a lot of memory, and by default Nextflow runs processes in parallel. What is the best practice for handling that? Do you run the samples one by one?
Our end goal is to deploy the Nextflow pipeline in the cloud (AWS or Azure), and we don't want it to be super expensive :)

Best regards, Elinor

aliaks...@gmail.com

Feb 15, 2021, 10:16:31 AM
to Nextflow
Hi Elinor,

I'd be interested to see what others do. From my perspective, since BWA supports multi-threading itself, we usually allocate all available cores to the BWA process and pass that value to BWA's number-of-threads parameter (minus 1-2 threads to allow for overhead processing, e.g. unzipping FASTQ files). Effectively, we are explicitly asking Nextflow to run one sample at a time during the BWA process.
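
For illustration, the thread arithmetic could look roughly like the sketch below. It is not our actual pipeline code: `bwa_threads`, the default reserve of 2 cores, and the file names are hypothetical placeholders. In a Nextflow script block the core count would typically come from `task.cpus`.

```shell
# Sketch: derive the BWA thread count from the cores allocated to the task,
# holding a few back for overhead such as decompressing FASTQ files.
# bwa_threads and its arguments are hypothetical names for illustration.
bwa_threads() {
    local cpus="$1"          # cores allocated to the process
    local reserve="${2:-2}"  # cores held back for overhead (assumed default: 2)
    local threads=$(( cpus - reserve ))
    # Never drop below one thread, even on very small allocations.
    if [ "$threads" -lt 1 ]; then
        threads=1
    fi
    echo "$threads"
}

# Dry run: print (not execute) the command for a 16-core allocation.
# ref.fa and the FASTQ names are placeholders.
echo "bwa mem -t $(bwa_threads 16) ref.fa sample_R1.fastq.gz sample_R2.fastq.gz"
```

With 16 cores and the assumed reserve of 2, this prints a command using `-t 14`.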

Hope this helps,

Best wishes,

Aliaksei.

Elinor L

Feb 17, 2021, 3:35:36 AM
to next...@googlegroups.com
Hi Aliaksei,

It seems like a good solution for BWA. Thank you for your answer and for pointing me in the right direction, definitely helps :)

Best regards, Elinor

--
You received this message because you are subscribed to a topic in the Google Groups "Nextflow" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/nextflow/medtFlMi39k/unsubscribe.
To unsubscribe from this group and all its topics, send an email to nextflow+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/nextflow/f94ff3c6-fd56-4693-8ae9-244a3eb83314n%40googlegroups.com.

Alan Hoyle

Feb 17, 2021, 9:28:21 AM
to next...@googlegroups.com
We almost always pipe the BWA output directly into a sort/index process (samtools sort or bamsormadup) to avoid having an uncompressed SAM file saved in our working space.

In our experience, processing is “bursty” between the two processes: BWA uses most of the cores for a second, and then samtools/bamsormadup uses most of the cores the next second. So we “undersubscribe” the CPU allocation, i.e. we request fewer CPUs than the total number of threads given to BWA plus the downstream tool.
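
As a rough sketch of that idea (not our production command): the helper below prints a piped `bwa mem | samtools sort` command in which the combined thread count exceeds the cores requested. `align_sort_cmd`, the choice of giving samtools half as many threads as BWA, and all file names are assumptions made for illustration.

```shell
# Sketch: build a piped align-and-sort command whose total thread count
# exceeds the cores requested, relying on the two tools being busy at
# different moments. align_sort_cmd is a hypothetical helper; it prints
# the pipeline instead of running it.
align_sort_cmd() {
    local cpus="$1"     # cores requested from the scheduler
    local sample="$2"   # sample name used for input and output files
    local bwa_t="$cpus"            # assumption: BWA is given every requested core
    local sort_t=$(( cpus / 2 ))   # assumption: samtools sort gets half as many on top
    if [ "$sort_t" -lt 1 ]; then
        sort_t=1
    fi
    # "-" makes samtools sort read the SAM stream from stdin, so no
    # uncompressed SAM file is written to the working directory.
    echo "bwa mem -t $bwa_t ref.fa ${sample}_R1.fastq.gz ${sample}_R2.fastq.gz | samtools sort -@ $sort_t -o ${sample}.bam -"
}

# Dry run for an 8-core request: 8 BWA threads plus 4 sort threads.
align_sort_cmd 8 sampleA
```

The right ratio depends on the data and the downstream tool, so it is worth checking actual CPU usage before settling on numbers.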
