Bis-SNP running time


yaping liu

Aug 15, 2012, 3:38:52 PM
to bissn...@googlegroups.com

Genotype calling takes multiple steps, which are described in our user manual.

Here is the running time of each step on our test data set (about 1250M single-end reads) on our cluster (12 Intel CPU cores, 16G memory):


1. Indel realignment step:

about 12 hours


2. Base quality recalibration step:

about 24 hours


3. Bisulfite genotyping step:

In the default output mode (cpg.vcf and snp.vcf):

35 hours


4. VCF postprocessing to filter fake SNPs and convert to BED format:

about 2 hours
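
As a rough illustration only, the step times above can be added up to estimate the end-to-end cost on a data set of this size. This is a minimal Python sketch; the numbers are the approximate figures quoted in this post, and the variable names are made up for the example.

```python
# Approximate per-step running times (hours) as quoted in this post
# for ~1250M single-end reads on 12 CPU cores with 16G memory.
step_hours = {
    "indel realignment": 12,
    "base quality recalibration": 24,
    "bisulfite genotyping (default output mode)": 35,
    "VCF postprocessing": 2,
}

total_hours = sum(step_hours.values())
total_days = total_hours / 24

# Show how much each step contributes to the whole pipeline.
for step, hours in step_hours.items():
    share = 100 * hours / total_hours
    print(f"{step}: {hours} h ({share:.0f}% of total)")

print(f"total: {total_hours} h (about {total_days:.1f} days)")
```

The total comes to 73 hours, roughly three days, with the genotyping step accounting for almost half of it. This simple sum assumes the steps run one after another and says nothing about other data sizes.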



Notes: Running time is not strictly linear in the number of reads. The indel realignment step increases exponentially with the number of reads, while the bisulfite genotyping step depends more on the size of the interval to traverse and on the output mode selected. For example, 200M reads across the whole genome may still take 24 hours, but 100M reads on chr1 alone may take less than 8 hours.

ermelin

Apr 2, 2013, 11:49:15 AM
to bissn...@googlegroups.com
Dear Yaping,

I am looking for parameters to speed up the analysis a little, since our files are quite big (>150 GB). Could you recommend parameter settings? Currently I am using the easy usage perl script.

Many thanks!

Yaping Liu

Apr 2, 2013, 9:15:32 PM
to bissn...@googlegroups.com
Hi Ermelin,
It is better to use more memory; the default is 10G. We have some internal BAM files that are 106G after marking duplicate reads and 157G after the recalibration and realignment steps. You should be able to run yours in 16G with 12 CPUs (that is our internal pipeline setting): --mem 16 --nt 12
It is always better to do indel realignment and base quality recalibration before genotyping.
Also, you could set more stringent cutoffs, like --qual 30 --mbq 10 for genotyping (--qual is the main genotyping quality score, --mbq is the minimum base quality required for a base to be used) and --minCT 3 for more accurate methylation calling.
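
For reference, the options suggested above can be collected into one option string to append to an existing invocation of the easy usage perl script. This is only a sketch in Python: the flag names and values are the ones quoted in this reply, while the helper name `build_option_args` is made up for the example, and the script name and its required input arguments are deliberately left out because they are not given in this thread.

```python
import shlex

# Flags and values quoted in this reply; they are not checked here against
# any particular version of the easy usage script.
suggested_options = {
    "--mem": 16,    # memory in G (the reply says the default is 10G)
    "--nt": 12,     # number of CPUs used in the internal pipeline setting
    "--qual": 30,   # main genotyping quality score cutoff
    "--mbq": 10,    # minimum base quality required for a base to be used
    "--minCT": 3,   # suggested for more accurate methylation calling
}


def build_option_args(options):
    """Flatten a {flag: value} mapping into a list of command-line tokens."""
    args = []
    for flag, value in options.items():
        args.extend([flag, str(value)])
    return args


option_args = build_option_args(suggested_options)
option_string = shlex.join(option_args)  # requires Python 3.8+
print(option_string)
```

The printed string can be pasted after whatever command line is already in use; check the script's own usage message to confirm which options your version accepts.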

Thanks for your interest! Feel free to contact me with any problems.


---
Yaping Liu

PhD candidate 
in
USC Epigenome Center

University of Southern California





