HiSeq data size


Lingsheng Dong

Dec 14, 2010, 2:55:29 PM
to sol...@googlegroups.com
Hello all,
We are getting a HiSeq machine next month and want to know how much data it will generate. Illumina told me that for a 100 bp paired-end run, the basecall files (including bcl, position, and other related files) will be 250 GB. How much data will there be in qseq format from these 250 GB of files after we run the BCL converter? Please advise.
Best,
Lingsheng

hemant kelkar

Dec 14, 2010, 3:46:46 PM
to sol...@googlegroups.com
If you are looking for the sequence.txt file sizes, they would likely be up to 25 GB (depending on your cluster density) per read per lane.
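As a rough illustration of what that per-read, per-lane figure implies for a whole flowcell, here is a minimal sketch. The 25 GB value is the upper bound quoted above, the two reads come from the paired-end run in the question, and the lane count of 8 is an assumption about the flowcell rather than something stated in this thread. The function name is made up for the example.

```python
# Upper-bound estimate of sequence.txt output for one flowcell.
GB_PER_READ_PER_LANE = 25  # "up to" figure from the reply; depends on cluster density
READS_PER_RUN = 2          # paired-end run, as in the original question
LANES_PER_FLOWCELL = 8     # ASSUMPTION: 8 lanes per flowcell (not stated in the thread)


def sequence_txt_upper_bound_gb(gb_per_read_per_lane=GB_PER_READ_PER_LANE,
                                reads=READS_PER_RUN,
                                lanes=LANES_PER_FLOWCELL):
    """Return the upper-bound sequence.txt size in GB for one flowcell."""
    return gb_per_read_per_lane * reads * lanes


if __name__ == "__main__":
    print(sequence_txt_upper_bound_gb(), "GB per flowcell (upper bound)")
```

With lower cluster density the real number would come in under this bound, so treat it as a ceiling for planning rather than an expected value.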

Abhishek Pratap

Dec 14, 2010, 6:10:09 PM
to sol...@googlegroups.com
Typically, a good, fully processed HiSeq run with the FASTQ files generated will range from 3.1-3.3 TB per flowcell, so for a full HiSeq run you can double this number (2 flowcells per HiSeq). From this data we usually delete the CIF files, the qseqs, and some other files (such as the per-lane *_anomaly.txt) from the GERALD folder. We do this once we are satisfied with the raw reads and don't foresee any reprocessing from qseq. In case we do need to redo something, we still have the bcl files stored and can rerun the standard pipeline steps.

I think our final storage footprint after cleanup is ~600-700 GB per flowcell.
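To make the doubling explicit, here is a small sketch that scales the per-flowcell ranges above to a full two-flowcell run. The numbers are the ones quoted in this message; the function name is only illustrative.

```python
# Scale per-flowcell storage ranges to a full HiSeq run (2 flowcells).
FLOWCELLS_PER_RUN = 2

FULL_TB_PER_FLOWCELL = (3.1, 3.3)    # fully processed, before cleanup (TB)
TRIMMED_GB_PER_FLOWCELL = (600, 700)  # after deleting CIF, qseq, etc. (GB)


def per_run(range_per_flowcell, flowcells=FLOWCELLS_PER_RUN):
    """Multiply a (low, high) per-flowcell range by the number of flowcells."""
    low, high = range_per_flowcell
    return (low * flowcells, high * flowcells)


if __name__ == "__main__":
    lo_tb, hi_tb = per_run(FULL_TB_PER_FLOWCELL)
    lo_gb, hi_gb = per_run(TRIMMED_GB_PER_FLOWCELL)
    print(f"Before cleanup: {lo_tb:.1f}-{hi_tb:.1f} TB per run")
    print(f"After cleanup:  {lo_gb}-{hi_gb} GB per run")
```

So you would want roughly 6-7 TB of working space per run during processing, dropping to a bit over 1 TB per run once the intermediate files are removed.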

hth
-Abhi