Extremely Slow BisulfiteGenotyper


bruce moran

May 5, 2015, 10:46:24 AM
to bissn...@googlegroups.com
Hi,

I have been testing bisulfite conversion efficiency on a MiSeq and have multiple small samples, each with ~1 million paired-end 75 bp reads. I use the following pipeline:


  1. Trimmomatic, Bismark
  2. Sort, Reorder, Remove Duplicates
  3. BisulfiteCountCovariates
  4. BisulfiteTableRecalibration
  5. BisulfiteGenotyper

Up to step 5, everything is fine: my conversions work well and we are happy to go on to capture and run on HiSeq. However, I was interested in testing BisSNP for calling SNPs, so I included it in the pipeline. I have used GATK on exomes, and I have never seen it run so slowly. For example, I left one run going for 90 hours and it was only 1.3% complete for 200,000 reads. No errors are thrown. I am using a very large dbSNP VCF, but I see the same behaviour even when it is not included. I am using BisSNP-0.82.2.jar.
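To put that figure in perspective, here is a rough back-of-the-envelope extrapolation. It assumes progress is linear in time, which may not hold for a genotyper walking the genome, and the function name is only illustrative:

```python
def projected_total_hours(elapsed_hours: float, fraction_complete: float) -> float:
    """Linearly extrapolate total runtime from elapsed time and progress.

    Assumption: the job progresses at a constant rate, so
    total = elapsed / fraction_complete.
    """
    if not 0 < fraction_complete <= 1:
        raise ValueError("fraction_complete must be in (0, 1]")
    return elapsed_hours / fraction_complete


# Numbers reported in the post: 90 hours elapsed, 1.3% complete.
total_hours = projected_total_hours(90, 0.013)
total_days = total_hours / 24
print(f"Projected total: {total_hours:.0f} hours (~{total_days:.0f} days)")
```

Under that assumption the run would need roughly 6,900 hours, or about 290 days, to finish.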

Any ideas on what is going on would be appreciated.


Also, your manual shows a plot of the recalibration, but there is no script to generate the plots, only one to make the data. Do you have something publicly available for making the plots that you can share? Just to save me a little time =)


Thanks for all your work on BisSNP,


Bruce.

ping

May 6, 2015, 5:40:46 PM
to bissn...@googlegroups.com
Hi Bruce,
A large dbSNP file is usually not a problem, but a large sample size could be. Have you tested it with just one sample BAM file?


In the user manual, there is a section (4.4.4 Generate recalibration plot) on how to generate the plot from the output file.

Let me know if you have further questions.

yaping




--
Yaping Liu, Ph.D.

Postdoctoral Associate
Manolis Kellis Lab
Computer Science and Artificial Intelligence Lab (CSAIL)
Massachusetts Institute of Technology
Broad Institute of MIT and Harvard


bruce moran

May 6, 2015, 6:29:46 PM
to bissn...@googlegroups.com
Hi Yaping,

Yes, I have tested on a single sample and it still runs very, very slowly, so slowly as to be totally unusable. Have there been any recent updates that might cause this? Do you have a test sample I can try, to see if it is an issue with my installation?

Will take a look at the plot section,

Thanks,

Bruce.