VCF version specifications of CAVA


Sally Guthrie

Mar 30, 2015, 5:14:47 PM
to cava-us...@googlegroups.com
Trying to run CAVA on a v4.1 VCF can result in a KeyError when the file contains a monomorphic reference. Which version of VCF was CAVA built to interpret? Any suggestions for fixing this?

An example line that I believe would cause this kind of error is shown below:
#CHROM  POS      ID  REF  ALT  QUAL  FILTER  INFO             FORMAT       NA00001         NA00002         NA00003
20      1230237  .   T    .    47    PASS    NS=3;DP=13;AA=T  GT:GQ:DP:HQ  0|0:54:7:56,60  0|0:48:4:51,51  0/0:61:2

Thanks!
Sally

Elise Ruark

Mar 31, 2015, 6:37:34 AM
to cava-us...@googlegroups.com

Hi Sally,

CAVA is designed to annotate variation relative to the reference genome sequence, so it does not currently support monomorphic reference calls, as no variation exists at these positions. We are working on an update which will recognize these sites rather than generating an error.
In the meantime, these calls could be separated prior to running CAVA with the following bash commands (if using UNIX):
cat file.vcf | awk '($5==".") {print}' > monomorphic.vcf
cat file.vcf | awk '($5!=".") {print}' > forCava.vcf
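
For anyone not on UNIX, the same split can be sketched in Python. This is only an illustration, not part of CAVA: the function name split_monomorphic is made up for this example, and it assumes whitespace-delimited columns with ALT in the fifth column, like the awk commands above. Unlike the awk commands, it copies header lines (those starting with "#") into both outputs so that each one keeps its VCF header.

```python
def split_monomorphic(lines):
    """Split VCF lines into (monomorphic, variant) lists.

    Hypothetical helper for illustration; not a CAVA function.
    Header lines (starting with '#') are copied to both lists.
    """
    monomorphic, variant = [], []
    for line in lines:
        if line.startswith("#"):
            monomorphic.append(line)
            variant.append(line)
            continue
        # Only the first five columns are needed; ALT is the fifth.
        fields = line.split(None, 5)
        if len(fields) < 5:
            continue  # skip blank or truncated lines
        if fields[4] == ".":
            monomorphic.append(line)
        else:
            variant.append(line)
    return monomorphic, variant


if __name__ == "__main__":
    example = [
        "##fileformat=VCFv4.1\n",
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
        "20\t1230237\t.\tT\t.\t47\tPASS\tNS=3;DP=13;AA=T\n",
        "20\t14370\t.\tG\tA\t29\tPASS\tNS=3;DP=14\n",
    ]
    mono, var = split_monomorphic(example)
    print(len(mono), "lines for monomorphic.vcf")
    print(len(var), "lines for forCava.vcf")
```

To use it on a real file, read the lines with open("file.vcf") and write each returned list to its own output file.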

Thank you for bringing this to our attention,
Elise

Sally Guthrie

Mar 31, 2015, 1:18:01 PM
to cava-us...@googlegroups.com
Excellent! I was able to write a work-around by adding
if '.' in alt:
    logging.info("Variant ignored because it is monomorphic reference: "+self.chrom+':'+str(self.pos)+' '+self.ref+'>'+alt)
    continue
in basics.Record.__init__ right after the check for no-calls, but modifying the original file worked identically.

I would also like to point out that CAVA throws an error when breakends are present in the VCF file (example below), even if they fall outside the areas of interest defined by the filters.

CHROM  POS     ID     REF  ALT           QUAL  FILTER  INFO
2      321681  bnd_W  G    G]17:198982]  6     PASS    SVTYPE=BND

This is not presenting an immediate problem, since my population does not have breakends in my areas of interest, but I thought it would be helpful to know for future work!
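
In case it helps anyone pre-filtering their files, here is a rough Python sketch of how both kinds of record could be flagged before running CAVA. It is not CAVA code: the function name unsupported_reason is hypothetical, and it assumes that a breakend ALT can be recognized by the square brackets around the mate position, as in the example above. It compares each ALT allele to "." exactly rather than testing for a "." anywhere in the string.

```python
def unsupported_reason(alt_field):
    """Return why an ALT column value might be rejected, or None if it looks fine.

    Hypothetical helper for illustration; not a CAVA function.
    alt_field is the raw ALT column, possibly comma-separated.
    """
    for alt in alt_field.split(","):
        if alt == ".":
            return "monomorphic reference"
        # Assumption: paired breakends carry the mate position in [ ] brackets.
        if "[" in alt or "]" in alt:
            return "breakend"
    return None


if __name__ == "__main__":
    for alt in [".", "G]17:198982]", "A", "A,T"]:
        print(alt, "->", unsupported_reason(alt))
```

Records for which the function returns a reason could be written to a separate file, and the remainder passed to CAVA.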

Thanks for your help!
Sally
