Hello,
I have an issue with the output of elprep. I'm running elprep as follows:
elprep filter input/NA12877_GCA_mapped.bam NA12877_GCA_mapped.elprep.output.bam \
--mark-duplicates --mark-optical-duplicates NA12877_GCA.output.metrics \
--sorting-order coordinate \
--bqsr NA12877_GCA.output.recal \
--known-sites resources/Homo_sapiens_assembly38.dbsnp138.elsites,resources/Mills_and_1000G_gold_standard.indels.hg38.elsites \
--reference resources/hg38.elfasta \
--haplotypecaller NA12877_GCA.vcf.gz
The problem is that in the resulting VCF file every entry in the ALT column is <NON_REF> and QUAL is missing. Here is a sample line from the VCF:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT sample1
chr1 10353 . A <NON_REF> . . END=10358 GT:DP:GQ:MIN_DP:PL 0/0:1:3:1:0,3,35
...
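To check whether every record really has only <NON_REF> in ALT, the column can be tallied directly. The following is a minimal sketch, assuming the VCF is tab-delimited with ALT in the fifth field; the helper name count_alt is made up for illustration and is not part of elprep. In GVCF-style output, variant sites would normally list a called allele next to <NON_REF> (for example T,<NON_REF>), so any such entry would show up as its own row in the tally.

```shell
# Hypothetical helper (not part of elprep): tally the ALT column of a VCF
# read from stdin. Assumes tab-delimited records with ALT in field 5.
count_alt() {
    grep -v '^#' |   # drop header lines, which start with '#'
    cut -f5 |        # keep only the ALT column
    sort |           # group identical ALT values together
    uniq -c |        # count each distinct ALT value
    sort -rn         # most frequent value first
}
```

With the output file from the command above, this would be run as: gzip -dc NA12877_GCA.vcf.gz | count_alt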
Nothing in the log files catches my eye. The same happens in sfm mode. Do you have any idea what I am doing wrong?
And another, unrelated question. As far as I understand, GATK recommends running SortAndFixTags after marking duplicates and before BQSR. What is the status of this step? Is it really recommended? I see that the sample elprep command omits it. Or is it possible to incorporate it into elprep as well?
Best wishes,
Yegor