I have set up CAVA and tested it with several VCFs, but CAVA fails on every one of them. The human reference FASTA I am using works fine with all the other variant callers. The log output is included below. Thank you for your assistance.
Input file contains 1555409 records to annotate.
Annotating variants ... 23.0%Traceback (most recent call last):
  File "./cava.py", line 184, in <module>
    record.annotate(ensembl,dbsnp,reference,impactdir)
  File "/home/cava-v1.0.0/basics.py", line 247, in annotate
    variant.annotate(ensembl,dbsnp,reference,impactdir)
  File "/home/cava-v1.0.0/basics.py", line 88, in annotate
    if not ensembl is None: self=ensembl.annotate(self,reference,impactdir)
  File "/home/cava-v1.0.0/data.py", line 199, in annotate
    variant_plus=variant.alignOnPlusStrand(reference)
  File "/home/cava-v1.0.0/basics.py", line 100, in alignOnPlusStrand
    seq1=reference.getReference(self.chrom,self.pos,self.pos+len(self.ref)-1+100)
  File "/home/cava-v1.0.0/data.py", line 500, in getReference
    seq = self.fastafile.fetch(goodchrom,start-1,end)
  File "pysam/cfaidx.pyx", line 182, in pysam.cfaidx.FastaFile.fetch (pysam/cfaidx.c:3371)
ValueError: invalid region: start (198043100) > end (198022430)
Here is another run (the log is slightly different):
Input file contains 1555409 records to annotate.
Annotating variants ... 74.6%Traceback (most recent call last):
  File "./cava.py", line 184, in <module>
    record.annotate(ensembl,dbsnp,reference,impactdir)
  File "/home/cava-v1.0.0/basics.py", line 247, in annotate
    variant.annotate(ensembl,dbsnp,reference,impactdir)
  File "/home/cava-v1.0.0/basics.py", line 88, in annotate
    if not ensembl is None: self=ensembl.annotate(self,reference,impactdir)
  File "/home/cava-v1.0.0/data.py", line 284, in annotate
    csn_plus=csn.getAnnotation(variant_plus,transcript,reference)
  File "/home/cava-v1.0.0/csn.py", line 68, in getAnnotation
    protein = makeProteinString(variant,transcript,reference)
  File "/home/cava-v1.0.0/csn.py", line 172, in makeProteinString
    prot=transcript.getProteinSequence(reference,None)
  File "/home/cava-v1.0.0/basics.py", line 443, in getProteinSequence
    ret=Sequence(self.getCodingSequence(reference,variant)).translate(1)
  File "/home/cava-v1.0.0/basics.py", line 414, in getCodingSequence
    ret+=reference.getReference(self.chrom,exon.start+1,exon.end)
  File "/home/cava-v1.0.0/data.py", line 500, in getReference
    seq = self.fastafile.fetch(goodchrom,start-1,end)
  File "pysam/cfaidx.pyx", line 182, in pysam.cfaidx.FastaFile.fetch (pysam/cfaidx.c:3371)
ValueError: invalid region: start (114426046) > end (114364328)
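In both tracebacks the requested start lies past the reported end, which might mean the coordinate is beyond the end of that contig in the FASTA. That reading is an assumption: it supposes the "end" value in the pysam message is the contig length taken from the FASTA index. A minimal sketch of how this could be checked without CAVA is below. It assumes a samtools-style `.fai` index (tab-separated, contig name in column 1 and length in column 2). The names `parse_fai` and `region_problem` and the contig name `example_contig` are hypothetical and are not part of CAVA or pysam.

```python
def parse_fai(text):
    """Parse the text of a samtools-style .fai index into {contig: length}."""
    lengths = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        lengths[fields[0]] = int(fields[1])
    return lengths


def region_problem(lengths, chrom, start, end):
    """Describe why a 1-based inclusive region cannot be fetched, or return None."""
    if chrom not in lengths:
        return "contig %r is not in the FASTA index" % chrom
    if start < 1 or end < start:
        return "malformed region %d-%d" % (start, end)
    if start > lengths[chrom]:
        return "start %d is beyond the end of %s (length %d)" % (
            start, chrom, lengths[chrom])
    return None


# Illustrative index text only: the contig name and the offset columns are
# made up; the length is the "end" value reported in the first traceback.
EXAMPLE_FAI = "example_contig\t198022430\t0\t60\t61\n"

lengths = parse_fai(EXAMPLE_FAI)
print(region_problem(lengths, "example_contig", 198043100, 198043200))
print(region_problem(lengths, "example_contig", 1000, 1100))
```

In practice the index text would come from the `.fai` file next to the reference FASTA, and the regions from the VCF records being annotated. If records are flagged, one possible explanation is that the VCF (or the transcript database) and the FASTA use different genome builds, but this sketch cannot confirm that by itself.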