HiSeq run times

Andrew Gagne

Aug 13, 2010, 5:07:40 PM

to sol...@googlegroups.com

Does anyone have some real-life run times from their HiSeq?

We're trying to estimate throughput.

thanks,

andrew

Abhishek Pratap

Aug 13, 2010, 5:19:31 PM

to sol...@googlegroups.com

Hi Andrew

We are still battling to process our first HiSeq run through the
Illumina pipeline. I think that, because of the 4x yield per lane, the
pipeline is not able to digest all the data that well.

I am not sure what you meant by "estimate throughput" (yield / time?).

Thanks!
-Abhi

-----------------------------
Abhishek Pratap
Bioinformatics Software Engineer II
Genomics Resource Center
Institute for Genome Sciences
School of Medicine, Univ of Maryland
801, W. Baltimore Street, Baltimore, MD 21209
Ph: (+1)-410-706-2296
www.igs.umaryland.edu/

Andrew Gagne

Aug 13, 2010, 5:24:13 PM

to sol...@googlegroups.com

We're mostly concerned about instrument run times: if Illumina says a run takes X, does it take longer in reality or not?

What is your pipeline setup? Our HiSeq is supposed to arrive in the next month or so.

Abhishek Pratap

Aug 13, 2010, 5:26:15 PM

to sol...@googlegroups.com

I think we finished the test 101 PE run in 8 days. That is roughly
1 hour/cycle.

We are using the CASAVA v1.7 pipeline, but for now it is not digesting
HiSeq data easily. Maybe it's just us, but we are experiencing a
significant increase in data processing time.

-Abhi
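
As a rough sanity check on the figures above, here is a minimal Python sketch that turns a per-cycle time into an instrument run time. It assumes a paired-end 101 run means 2 x 101 = 202 sequencing cycles and takes the "roughly 1 hour/cycle" figure at face value; the function name and the `index_cycles` parameter are hypothetical, and cluster generation, washes, and any paired-end turnaround time are not modelled.

```python
def estimate_run_days(read_length, paired_end=True, hours_per_cycle=1.0, index_cycles=0):
    """Return (total_cycles, estimated_days) for a sequencing run.

    Assumption: run time scales linearly with the number of cycles.
    """
    reads = 2 if paired_end else 1
    cycles = read_length * reads + index_cycles
    hours = cycles * hours_per_cycle
    return cycles, hours / 24.0


# The 101 PE test run from the thread, at roughly 1 hour/cycle.
cycles, days = estimate_run_days(101)
print(f"{cycles} cycles, about {days:.1f} days")  # 202 cycles, about 8.4 days
```

At 1 hour/cycle this gives a little over 8 days, which is in line with the 8 days reported for the test run.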

Leath Tonkin

Aug 13, 2010, 5:45:46 PM

to sol...@googlegroups.com

Are you using ELAND_SET_SIZE in your Gerald config files? From the documentation:

The maximum number of tiles analyzed by each ELAND process, needed to ensure that the memory usage stays below 2 GB. The optimal value is such that there are approximately 10 to 13 million lines (reads) in one set. Only available for ANALYSIS eland_extended, ANALYSIS eland_pair, and ANALYSIS eland_rna. Specifying a value is mandatory.

For the HiSeq it should be set at:

4 for 425K clusters/mm2 (40 for the GA)

3 for 550K clusters/mm2 (30 for the GA)

2 for 750K clusters/mm2 (20 for the GA)

Of course, this means it will take longer, since you are only doing a subset of the sequences from each lane at a time, but at least it won't crash on you if that is one of your problems with CASAVA. Gerald would crash on our GAIIx data sets with 1.6 and 1.7 until we learned about this new parameter. I wish our FAS had said something, since I missed this in the documentation when it was released.
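
The lookup above can be sketched in Python as follows. This is only an illustration of the values quoted in this post: the function name is hypothetical, and treating each quoted density as the upper bound of a bin (so that, say, 500K clusters/mm2 maps to the 550K row) is an assumption, not something the documentation excerpt states. Densities above 750K raise an error, since no value is given for them here.

```python
# (max density in K clusters/mm2, HiSeq set size, GA set size), as quoted above.
SET_SIZE_TABLE = [
    (425, 4, 40),
    (550, 3, 30),
    (750, 2, 20),
]


def suggest_eland_set_size(density_k_per_mm2, instrument="hiseq"):
    """Pick an ELAND_SET_SIZE value from the quoted table.

    Assumption: each quoted density is treated as the top of a bin.
    """
    if instrument not in ("hiseq", "ga"):
        raise ValueError("instrument must be 'hiseq' or 'ga'")
    for max_density, hiseq_size, ga_size in SET_SIZE_TABLE:
        if density_k_per_mm2 <= max_density:
            return hiseq_size if instrument == "hiseq" else ga_size
    raise ValueError("no ELAND_SET_SIZE quoted for densities above 750K clusters/mm2")


# Print the line as it would appear in a Gerald config file.
print("ELAND_SET_SIZE", suggest_eland_set_size(550))  # ELAND_SET_SIZE 3
```

Whatever value you pick, the documentation's own target of roughly 10 to 13 million reads per set is the thing to check against your actual cluster counts.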

Leath

Leath Tonkin, PhD

Manager, Vincent J. Coates Genomics Sequencing Laboratory (GSL)

QB3/University of California, Berkeley

B206 Stanley Hall

MC 3220

Berkeley, CA 94720-3220

(510) 666-3372

lto...@berkeley.edu

http://qb3.berkeley.edu/gsl/ <--Calendar, Submission Requirements, Current Forms, Blog Updates & Pricing.
