understanding ##INFO=<ID=PE and ##INFO=<ID=SR

Siamat De Teelveed

Dec 8, 2023, 7:12:12 AM
to delly-users
Hi there,

I'm working on consensus calling of SVs with several tools and am constructing a uniform header for the merged VCF. To do that, I would like to understand how the PE and SR INFO fields relate to the DR, DV, RR, and RV FORMAT fields. As the snippet below shows, the support for a variant is not easy to interpret simply by looking at the high-quality pairs or reads.

bcftools query -f '[%PE %DR %DV %SR %RR %RV\n]' sample.delly.vcf.gz | head -5
9 4 8 . 0 0
2 24 2 . 0 0
2 33 2 . 0 0
0 0 0 3 1 6
2 3 2 . 0 0

Best regards
Mattias

tr

Dec 12, 2023, 4:47:16 AM
to delly-users
Hi Mattias,

Delly has two phases: (i) SV site discovery and (ii) SV genotyping. During SV site discovery, the INFO fields (PE, SR) record how many initial paired-ends and split-reads were found. These counts are not exhaustive, because once Delly has enough evidence for an SV it stops searching to save runtime. The genotyping is exhaustive (up to parameter -a 250) and should be used to calculate the AF of an SV: use RR and RV for precise variants, and DR and DV for imprecise SVs.

Best, Tobias
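
For anyone landing on this thread later, here is a minimal Python sketch of the AF calculation described above, applied to the rows from the original question. Two things are assumptions on my part and not stated in the reply: AF is taken as variant support divided by total support (RV/(RR+RV) or DV/(DR+DV)), and a record is treated as precise when its SR value is not ".". In a real pipeline it would be safer to check the PRECISE/IMPRECISE INFO flag directly. The function names are made up for illustration.

```python
from typing import Optional


def allele_fraction(ref_count: int, alt_count: int) -> Optional[float]:
    """Return alt / (ref + alt), or None when there is no support at all."""
    total = ref_count + alt_count
    return alt_count / total if total > 0 else None


def af_from_query_line(line: str) -> Optional[float]:
    """Compute AF from one '%PE %DR %DV %SR %RR %RV' line of bcftools output.

    ASSUMPTION: SR != '.' is used as a stand-in for "precise" because the
    query in the question did not print the PRECISE/IMPRECISE flag.
    PE and SR themselves are ignored: they are discovery-phase counts.
    """
    pe, dr, dv, sr, rr, rv = line.split()
    if sr != ".":
        # Precise SV: use junction-read counts from genotyping.
        return allele_fraction(int(rr), int(rv))
    # Imprecise SV: use paired-end counts from genotyping.
    return allele_fraction(int(dr), int(dv))


if __name__ == "__main__":
    rows = [
        "9 4 8 . 0 0",
        "2 24 2 . 0 0",
        "2 33 2 . 0 0",
        "0 0 0 3 1 6",
        "2 3 2 . 0 0",
    ]
    for row in rows:
        af = af_from_query_line(row)
        print(row, "->", "NA" if af is None else f"{af:.3f}")
```

Note that the sketch expects numeric DR, DV, RR, and RV values; a missing value (".") in one of those columns would need extra handling.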