QIIME 1.9.1 Amazon Web Services instance specifications for 25 GiB input


iosif....@gmail.com

Dec 31, 2015, 7:50:56 PM
to Qiime 1 Forum
Does anyone have a rough estimate of the Amazon instance configuration needed to run a pipeline to completion with a qual file of around 25 GiB and a fasta file of around 9 GiB?

On a c3.8xlarge [0] instance (60 GiB of RAM) running the Amazon Web Services machine image "QIIME 1.9.1 AMI: ami-1918ff72", we were getting the mysterious "Killed" error during split_libraries.py, similar to what is described here [1] and here [2].
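
For anyone hitting the same bare "Killed" message: on Linux a shell prints it when a process dies from SIGKILL, which is what the kernel's out-of-memory killer sends. A minimal sketch for confirming that a command died this way, assuming a POSIX system; the function name `run_and_check_sigkill` is hypothetical, and a SIGKILL exit is only consistent with an out-of-memory kill, it does not prove one:

```python
import signal
import subprocess
import sys


def run_and_check_sigkill(cmd):
    """Run cmd and return (returncode, killed_by_sigkill).

    On POSIX, subprocess reports a child that died from signal N
    with returncode -N, so SIGKILL shows up as -9.
    """
    proc = subprocess.run(cmd)
    killed = proc.returncode == -signal.SIGKILL
    return proc.returncode, killed


if __name__ == "__main__" and len(sys.argv) > 1:
    # Wrap whatever command was being run, passed as arguments to this script.
    rc, killed = run_and_check_sigkill(sys.argv[1:])
    if killed:
        print("Child died from SIGKILL; check the kernel log for OOM messages.")
    else:
        print("Child exited with return code", rc)
```

If the child was killed this way, the kernel log usually records which process the out-of-memory killer selected.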

Thanks for any advice, and for the work on the toolset.

[0] https://aws.amazon.com/ec2/details/

[1] https://groups.google.com/forum/#!topic/qiime-forum/DZvwHcz-Cts

[2] https://groups.google.com/forum/#!topic/qiime-forum/9oblyuXLoeU

Colin Brislawn

Jan 1, 2016, 11:50:19 PM
to Qiime 1 Forum
Hello there,

c3.8xlarge should be more than enough for the large steps like OTU picking and taxonomy assignment. It should be plenty for the initial steps like splitting libraries. 

May I ask what kind of sequencing generated this 25 GB .qual file and 9 GB .fasta file? Knowing whether you are using 454 or IonTorrent will help me come up with a game plan. I have not encountered the "Killed" error and am not sure how to approach it, but I mostly use Illumina data these days.

Keep in touch!
Colin


iosif....@gmail.com

Jan 2, 2016, 8:29:19 PM
to Qiime 1 Forum
We ended up using a previous-generation [0] hs1.8xlarge instance: "Storage optimized", 64-bit, 16 CPUs, with 117 GB of memory. This was the only instance that would run the QIIME 1.9.1 AMI and had enough memory that split_libraries.py wouldn't run out of memory and die with "Killed".

During the split_libraries.py run, memory usage reached at least 73 GB, possibly more. During a parallel run of beta_diversity_through_plots.py with 16 jobs, memory usage hit at least 91 GB (78%).
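
For sizing an instance ahead of time, peak memory can be recorded rather than watched by hand. A rough sketch using only the Python standard library, assuming Linux or macOS; the function names are hypothetical. Note that `ru_maxrss` for children reports the largest single waited-for child, not the sum over a process tree, so it would likely under-report a 16-job parallel run:

```python
import resource
import subprocess
import sys


def run_and_report_peak_rss(cmd):
    """Run cmd and return (returncode, peak RSS in GiB of the largest child)."""
    proc = subprocess.run(cmd)
    peak = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    # ru_maxrss is in KiB on Linux and in bytes on macOS.
    divisor = 1024 ** 3 if sys.platform == "darwin" else 1024 ** 2
    return proc.returncode, peak / divisor


def percent_of_total(used, total):
    """Return used as a rounded percentage of total (same units for both)."""
    return round(100 * used / total)


if __name__ == "__main__" and len(sys.argv) > 1:
    rc, peak_gib = run_and_report_peak_rss(sys.argv[1:])
    print("return code:", rc)
    print("peak RSS of largest child: %.2f GiB" % peak_gib)
```

The percentage quoted above follows from the instance size: `percent_of_total(91, 117)` gives 78.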

[0] https://aws.amazon.com/ec2/previous-generation/

Colin Brislawn

Jan 2, 2016, 9:16:03 PM
to Qiime 1 Forum
Hey, I'm glad that worked for you.

May I ask how large your .biom file is?

Colin

iosif....@gmail.com

Jan 2, 2016, 10:00:09 PM
to Qiime 1 Forum
The .biom file ended up at 113 megabytes.

Colin Brislawn

Jan 2, 2016, 10:14:31 PM
to Qiime 1 Forum
Wow. 113 MB is huge. May I ask how many samples and how many OTUs?

Sorry to be so nosy. Part of the reason I ask is that you mentioned you are using fasta + qual files. These are common formats for 454 data, which is error-prone, and that could lead to more 'noise' or 'error' OTUs and, in turn, a large .biom table.

Thanks for telling me more about your project. Let me know how I can help,
Colin

