GotCloud for non-Human variant calling


David Kainer

Feb 1, 2016, 8:12:53 PM
to GotCloud
Hi

I would like to test GotCloud for calling variants in a non-model plant population. I can't seem to configure it so that snpcall stops requiring various input VCFs, which of course I don't have.

Is there a way to configure it to call SNPs and INDELs for this sort of problem?

cheers
DK

Mary Kate Wing

Feb 5, 2016, 10:20:03 PM
to David Kainer, Adrian Tan, GotCloud
Sorry for the delayed response.

Unfortunately, some tools/scripts used by GotCloud assume a diploid genome with chromosomes 1-23, and some of the filtering applied during snpcall also relies on human genome assumptions.

GotCloud alignment can be made to work, but I don't believe there is an easy way to use the snpcall/indel pipelines for non-human.

If you already have a set of tools that you know work for performing snpcall, you can use GotCloud to manage the pipeline, split up the work by chromosome/region, and parallelize the jobs.
If this is something you are interested in, I could help you write your custom pipeline (it is done using the configuration file to define your tools/steps). Unfortunately, the tools that we use for snpcall are too focused on human genomes at this point.
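
As a rough illustration of the "split up the work by chromosome/region" idea, here is a minimal Python sketch that cuts each contig of a reference into fixed-size chunks and builds one FreeBayes command per chunk. It is not GotCloud configuration and not how GotCloud does its own splitting. The function names, the chunk size, and the file names are hypothetical, and the region string assumes FreeBayes accepts 0-based, end-exclusive `contig:start-end` coordinates via `-r`, so check `freebayes --help` for your version.

```python
import shlex


def regions_from_fai(fai_text, chunk_size):
    """Yield (contig, start, end) chunks from the text of a FASTA .fai index.

    Only the first two tab-separated columns (contig name, length) are used.
    Coordinates are 0-based and end-exclusive (an assumption, see lead-in).
    """
    for line in fai_text.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        contig, length = fields[0], int(fields[1])
        for start in range(0, length, chunk_size):
            yield contig, start, min(start + chunk_size, length)


def freebayes_command(ref, bam, region, out_vcf):
    """Build a shell command string that calls variants in a single region."""
    contig, start, end = region
    args = ["freebayes", "-f", ref, "-r", f"{contig}:{start}-{end}", bam]
    return " ".join(shlex.quote(a) for a in args) + " > " + shlex.quote(out_vcf)


if __name__ == "__main__":
    # Hypothetical two-scaffold index; a real one comes from `samtools faidx`.
    example_fai = "scaffold_1\t2500\t12\t60\t61\nscaffold_2\t900\t2600\t60\t61\n"
    for region in regions_from_fai(example_fai, 1000):
        out_name = "{}_{}_{}.vcf".format(*region)
        print(freebayes_command("ref.fa", "sample.bam", region, out_name))
```

Each printed command is independent of the others, so the list can be handed to whatever job runner manages the parallel work.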

I have added Adrian to the email since he wrote the indel caller in case he knows if the indel caller can be configured to support non-human.

Mary Kate


David Kainer

Feb 5, 2016, 10:45:23 PM
to Mary Kate Wing, Adrian Tan, GotCloud

Hi Mary, Adrian

 

Thanks for your reply!

 

Here is what I am hoping to do with my low-coverage (diploid) plant genomes:

1. Align them to a reference genome
2. Call SNPs (and hopefully indels)
3. Refine variants using LD

 

So GotCloud seems like a really good pipeline for this.

 

At this point I have done the QC and alignment. I can call SNPs using FreeBayes and produce an output VCF file. I can then filter the VCF using VCFLib, which can tag each variant with a PASS or FAIL. Would I then be able to use GotCloud’s SPLIT and LDREFINE functionality? If so, what is the best way to configure those steps?

 

Best regards

David

David Kainer

Feb 5, 2016, 10:57:33 PM
to Adrian Tan, Mary Kate Wing, GotCloud

So if I call indels, can they be refined with ldrefine as well?

 

 

From: Adrian Tan [mailto:at...@umich.edu]
Sent: Saturday, 6 February 2016 2:55 PM
To: David Kainer; Mary Kate Wing
Cc: GotCloud
Subject: Re: GotCloud for non-Human variant calling

 

For the indel part, no other data sets are required; only the reference sequence is needed. An issue with indels here is that the pipeline calls only biallelic indels. That might be a problem for plants, which tend to have many repeats.
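
To get a feel for how much the biallelic-only restriction might matter for a given data set, one option is to count multi-allelic indel records in a VCF that was called some other way (for example with FreeBayes). The Python sketch below does that from plain VCF text lines. The category names and functions are hypothetical, it looks only at the REF and ALT columns, and it says nothing about how the GotCloud indel caller itself treats such sites.

```python
from collections import Counter


def classify_record(vcf_line):
    """Classify one VCF data line by its REF and ALT columns (4th and 5th)."""
    fields = vcf_line.rstrip("\n").split("\t")
    ref, alts = fields[3], fields[4].split(",")
    # Symbolic or missing alleles ("<DEL>", "*", ".") are not classified here.
    if any(not set(a.upper()) <= set("ACGTN") for a in alts):
        return "other"
    is_indel = any(len(a) != len(ref) for a in alts)
    if len(alts) > 1:
        return "multiallelic_indel" if is_indel else "multiallelic_other"
    if is_indel:
        return "biallelic_indel"
    # Same-length substitutions longer than one base (MNPs) count as "other".
    return "snp" if len(ref) == 1 else "other"


def count_classes(vcf_lines):
    """Count record classes, skipping header lines and blank lines."""
    counts = Counter()
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue
        counts[classify_record(line)] += 1
    return counts
```

A high share of `multiallelic_indel` records would suggest that a biallelic-only caller drops or simplifies a noticeable part of the indel variation.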

Mary Kate Wing

Feb 6, 2016, 12:02:34 AM
to Adrian Tan, Hyun Min Kang, David Kainer, GotCloud
For ldrefine - we run Beagle, then Thunder.
I've added Hyun to the email as he is more familiar with those tools and might know if either/both of them support non-human diploid organisms.  

From a quick look, I believe the split step would work for you, as well as the scripts used in ldrefine. It is just a matter of whether or not the two programs (Beagle/Thunder) themselves support it.

The fact that your organism is diploid improves your chances of things working.
But I believe part of snpcall still expects chromosomes to be 1-23/X&Y and makes assumptions for human, so I believe the other tools you mentioned would work better for that. You could use the GotCloud framework to create your own custom pipeline to run those tools. If you are interested in that, let me know and I can help guide you.
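
A quick way to see whether a reference is likely to run into the chromosome-naming assumption is to list the contig names that do not look human-style. The Python sketch below reads the text of a FASTA `.fai` index and reports such names. The accepted set (1 to 22 plus X and Y, with an optional `chr` prefix) is only an assumed reading of the "1-23/X&Y" convention mentioned here, not a list taken from GotCloud's code, and the function name is hypothetical.

```python
# Assumed human-style names: autosomes 1..22 plus the sex chromosomes.
HUMAN_STYLE_NAMES = {str(i) for i in range(1, 23)} | {"X", "Y"}


def non_human_style_contigs(fai_text):
    """Return contig names from .fai text that fall outside the assumed set."""
    flagged = []
    for line in fai_text.splitlines():
        if not line.strip():
            continue
        name = line.split("\t")[0]
        # Ignore an optional "chr" prefix (any case) before comparing.
        bare = name[3:] if name.lower().startswith("chr") else name
        if bare not in HUMAN_STYLE_NAMES:
            flagged.append(name)
    return flagged
```

For a typical plant assembly with names such as `scaffold_12` or `Chr01`, most or all contigs would be flagged, which is a hint that a custom pipeline is the safer route.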


On Fri, Feb 5, 2016 at 11:08 PM, Adrian Tan <at...@umich.edu> wrote:
I'm not too sure about that part of the pipeline.  