Nicki
Apr 11, 2012, 12:05:36 PM
to igv-help
Hello,
I am trying to index a VCF file using igvtools from within IGV and am
getting an error:
"Error: The provided VCF file is malformed at line number 40:
Unparsable vcf record with allele M "
Here is the top of the VCF file, including line 40:
##fileformat=VCFv4.0
##fileDate=2012-01-30
##source=Platypus_Version_0.1.5
##INFO=<ID=func,Number=1,Type=String,Description="Functional category: exnoic, intergenic etc">
##INFO=<ID=gene,Number=1,Type=String,Description="RefSeq gene name">
##INFO=<ID=exon_func,Number=1,Type=String,Description="Exonic function of the variant">
##INFO=<ID=AAchange,Number=1,Type=String,Description="Amino Acid change">
##INFO=<ID=cons46,Number=1,Type=String,Description="UCSC 46 species conservation score">
##INFO=<ID=segdup,Number=1,Type=Float,Description="UCSC segment duplication score">
##INFO=<ID=1000g,Number=1,Type=Float,Description="1000 genomes allelic frequency">
##INFO=<ID=dbsnp,Number=1,Type=String,Description="dbSNP ID">
##INFO=<ID=sift,Number=1,Type=Float,Description="SIFT score">
##INFO=<ID=pp2,Number=1,Type=Float,Description="PolyPhen2 score">
##INFO=<ID=phylop,Number=1,Type=Float,Description="Phylop score">
##INFO=<ID=mutT,Number=1,Type=Float,Description="Mutation Taster score">
##INFO=<ID=LRT,Number=1,Type=Float,Description="LRT score">
##INFO=<ID=FR,Number=0,Type=Float,Description="Estimated population frequency">
##INFO=<ID=RPV,Number=0,Type=Float,Description="Median minimum base quality for bases around variant">
##INFO=<ID=RPV,Number=0,Type=Float,Description="Reverse strand p-value">
##INFO=<ID=TCR,Number=0,Type=Integer,Description="Total reverse strand coverage at this locus">
##INFO=<ID=HP,Number=1,Type=Integer,Description="Homopolmer run length">
##INFO=<ID=ABPV,Number=0,Type=Float,Description="Allele-bias p-value. Testing for low variant coverage">
##INFO=<ID=TR,Number=0,Type=Integer,Description="Total number of reads containing this variant">
##INFO=<ID=PP,Number=0,Type=Float,Description="Posterior probability (phred scaled) that this variant segregates">
##INFO=<ID=NF,Number=0,Type=Integer,Description="Total number of forward reads containing this variant">
##INFO=<ID=SC,Number=1,Type=String,Description="Genomic sequence 10 bases either side of variant position">
##INFO=<ID=FPV,Number=0,Type=Float,Description="Forward strand p-value">
##INFO=<ID=TCF,Number=0,Type=Integer,Description="Total forward strand coverage at this locus">
##INFO=<ID=NR,Number=0,Type=Integer,Description="Total number of reverse reads containing this variant">
##INFO=<ID=RMP,Number=0,Type=Float,Description="RMS Position in reads of Variant">
##INFO=<ID=TC,Number=0,Type=Integer,Description="Total coverage at this locus">
##FILTER=<ID=sb,Description="Variant fails strand-bias filter">
##FILTER=<ID=ab,Description="Variant fails allele-bias filter">
##FILTER=<ID=badReads,Description="Variant supported only by reads with low quality bases close to variant position, and not present on both strands.">
##FILTER=<ID=hp10,Description="Flanking sequence contains homopolymer of length 10 or greater">
##FORMAT=<ID=GL,Number=.,Type=Float,Description="Genotype log-likelihoods (log10) for AA,AB and BB genotypes, where A = ref and B = variant. Only applicable for bi-allelic sites">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Unphased genotypes">
##FORMAT=<ID=NR,Number=1,Type=Integer,Description="Number of reads covering variant in this sample">
##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype quality, as Phred score">
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT AW_SC_4654.bam
1 10146 . AC A 118 PASS ABPV=4.72e-01;FPV=3.72e-09;FR=0.5003;HP=4;NF=6;NR=2;PP=118;RMP=65.56;RPV=3.43e-01;SC=CCTAACCCTAACCCCTAACCC;TC=31;TCF=25;TCR=6;TR=8;func=intergenic;gene=NONE(dist=NONE) WASH7P(dist=4215);segdup=0.99 GT:GL:GQ:NR 0/1:-182.49,-146.41,-145.22:32:32
1 12783 . G A 110 PASS ABPV=1.00e+00;FPV=1.00e+00;FR=1.0000;HP=1;MMLQ=33;NF=6;NR=0;PP=110;RMP=35.03;RPV=1.00e+00;SC=CGGGGCCGGCGTCTCCTGTCT;TC=6;TCF=6;TCR=0;TR=6;func=intergenic;gene=NONE(dist=NONE) WASH7P(dist=1579);segdup=0.99;1000g=0.58;dbsnp=rs62635284 GT:GL:GQ:NR 1/1:-32.34,-4.13,0.0:50:6
1 14464 . A T 154 PASS ABPV=1.00e+00;FPV=1.00e+00;FR=1.0000;HP=1;MMLQ=35;NF=10;NR=0;PP=154;RMP=51.55;RPV=1.00e+00;SC=TTAAGAACACAGTGGCGCAGG;TC=10;TCF=10;TCR=0;TR=10;func=ncRNA_exonic;gene=WASH7P;segdup=0.99;1000g=0.17 GT:GL:GQ:NR 1/1:-52.06,-15.75,-9.98:72:10
1 14930 . A G 64 PASS ABPV=1.00e+00;FPV=8.59e-01;FR=0.5000;HP=1;MMLQ=33;NF=18;NR=8;PP=64;RMP=63.67;RPV=1.47e-03;SC=ACAGAATTACAAGGTGCTGGC;TC=65;TCF=31;TCR=34;TR=26;func=ncRNA_intronic;gene=WASH7P;segdup=0.99;1000g=0.50;dbsnp=rs6682385 GT:GL:GQ:NR 1/0:-220.74,-198.5,-398.09:99:65
1 15118 . A G 28 PASS ABPV=1.00e+00;FPV=1.00e+00;FR=0.5000;HP=2;MMLQ=30;NF=0;NR=7;PP=28;RMP=45.42;RPV=3.15e-01;SC=CCCCCATGACACTCCCCAGCC;TC=17;TCF=0;TCR=17;TR=7;func=ncRNA_intronic;gene=WASH7P;segdup=0.99;1000g=0.35;dbsnp=rs11580262 GT:GL:GQ:NR 0/1:-39.1,-25.04,-63.71:64:17
1 15211 . T G 200 PASS ABPV=1.00e+00;FPV=1.00e+00;FR=0.5000;HP=1;MMLQ=32;NF=13;NR=18;PP=200;RMP=56.34;RPV=1.00e+00;SC=AGACAGCGGCTGTTTGAGGAG;TC=35;TCF=13;TCR=22;TR=31;func=ncRNA_intronic;gene=WASH7P;segdup=0.99;1000g=0.63;dbsnp=rs11586607 GT:GL:GQ:NR 1/0:-209.81,-120.44,-129.85:43:35
How do I fix this?
Thanks, Nicki
MRC Molecular Haematology Unit
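
One way to narrow down the error quoted above, offered only as a rough sketch: scan the file on disk and list every record whose REF or ALT column contains something other than plain bases, together with its 1-based line number, so the hits can be compared with the line number igvtools reports. The function name `find_suspect_alleles` is made up for illustration. The set of accepted characters (A, C, G, T, N) is an assumption about what the parser allows, not something taken from igvtools itself. The script splits on tabs, so it should be run on the original file rather than on the excerpt pasted here, which has lost its tabs.

```python
import re

# Assumption: an allele is valid if it consists only of A, C, G, T or N.
# Missing alleles (".") and symbolic alleles such as <DEL> are skipped.
VALID_ALLELE = re.compile(r"^[ACGTNacgtn]+$")


def find_suspect_alleles(lines):
    """Return (line_number, column, value) tuples for suspicious records.

    `lines` is any iterable of VCF lines, for example an open file handle.
    Line numbers are 1-based and count header lines too.
    """
    suspects = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines, ## meta lines and the #CHROM line
        fields = line.split("\t")
        if len(fields) < 5:
            # Too few tab-separated columns to contain REF and ALT at all.
            suspects.append((number, "COLUMNS", line[:40]))
            continue
        for column, value in (("REF", fields[3]), ("ALT", fields[4])):
            for allele in value.split(","):
                if allele == "." or allele.startswith("<"):
                    continue
                if not VALID_ALLELE.match(allele):
                    suspects.append((number, column, allele))
    return suspects
```

To use it, open the file and print each hit, for example `with open("sample.vcf") as handle: print(find_suspect_alleles(handle))`, where `sample.vcf` is a placeholder name. A `COLUMNS` hit means the line did not split into at least five tab-separated fields, which would point at a whitespace problem rather than at the allele itself.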