HiSeq data processing with CASAVA1.7

yongmei zhao

Oct 14, 2010, 9:18:30 PM

to sol...@googlegroups.com

Hi,

For those of you who have a HiSeq, would you please share some information about the computing requirements for data processing with CASAVA1.7? How long does it usually take to process one flowcell from a 100-cycle PE run with the GERALD pipeline in your computing environment? How many cores (CPUs), and how much RAM, disk space, and tmp space do you allocate for the job?

We see much longer data processing times for HiSeq data than for GAIIx data, and would like to know how other groups handle the increased volume of data from the HiSeq.

Thank you very much,

Yongmei

Andrew Gagne

Oct 18, 2010, 11:21:55 AM

to sol...@googlegroups.com

Hi Yongmei,

Our validation run is still going. Would you share your environment and run times?

andrew

Sivakumar Gowrisankar

Oct 18, 2010, 3:44:30 PM

to sol...@googlegroups.com

A 100PE run takes 1-2 days of processing. We use ELANDv2 and run each lane as a separate job. Each GERALD process runs on an 8-core node with 16 GB of memory (total) and about 50 GB for /tmp.

Hope this helps.

yongmei zhao

Oct 18, 2010, 8:23:36 PM

to sol...@googlegroups.com

Thank you for the responses.

In our case, a 100PE flowcell takes about 6 days of GERALD run time (using eland_pair) on a single node. The process ran on an 8-core node with 16 GB of memory and 100 GB of /tmp. If we had 8 nodes for running the jobs, the processing time would be very close to what Sivakumar reported: about 1-2 days.
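
For anyone who wants to redo this comparison with their own numbers, here is a rough back-of-the-envelope sketch in Python. It is not part of CASAVA. The function name estimated_wall_days is made up for illustration, and it assumes that the 8 lanes of a flowcell are independent jobs of roughly equal cost, that each node works on one lane at a time, and that there is no scheduling or I/O overhead. Real runs will likely come in slower than this ideal figure.

```python
import math


def estimated_wall_days(single_node_days, lanes=8, nodes=1):
    """Idealized wall-clock estimate for per-lane GERALD jobs.

    Assumptions (not from CASAVA itself): every lane costs the same,
    each node processes one lane at a time, and overhead is ignored.
    """
    if lanes < 1 or nodes < 1:
        raise ValueError("lanes and nodes must both be at least 1")
    # Cost of one lane, derived from the measured single-node total.
    per_lane_days = single_node_days / lanes
    # Number of sequential batches of lanes needed across the nodes.
    rounds = math.ceil(lanes / nodes)
    return per_lane_days * rounds


if __name__ == "__main__":
    # 6 days on a single node is the figure reported in this thread.
    for n in (1, 2, 4, 8):
        print(f"{n} node(s): ~{estimated_wall_days(6, nodes=n):.2f} days")
```

With the 6-day single-node figure, the ideal estimate for 8 nodes is under a day, so landing in the 1-2 day range after overhead seems plausible.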