bndeval on liftedover vcf

Asher Bryant

Jun 14, 2023, 5:12:45 PM
to RTG Users
Hello again & thank you for your previous response.

I am trying to compare the breakends of a lifted-over VCF from a publication, truthset_somaticSVs_COLO829_hg38lifted.vcf, with this VCF, but bndeval isn't finding any common breakpoints (though I know most of them are common within a threshold of 1000 bp).

To test bndeval, I compared each of these VCFs to itself to see whether bndeval would work as expected, and it did (i.e. precision, sensitivity, and F-measure were all 1).
I also tried prefixing the numbers in the #CHROM column of the lifted-over VCF with "chr", but that did not help either.

Any help would be appreciated!

Best,
Asher

Sean Irvine

Jun 14, 2023, 6:11:31 PM
to Asher Bryant, RTG Users
Hi Asher,

The two problems are that your calls contain a "chr" prefix while the baseline does not, and that the calls are not properly sorted. After fixing both of those problems, I find:

sed 's/chr//g' <colo829_tumor_xy.haplotagged.sv_breakpoints.vcf >colo829_tumor_xy.haplotagged.sv_breakpoints_nochr.vcf
java -jar picard.jar SortVcf I=colo829_tumor_xy.haplotagged.sv_breakpoints_nochr.vcf O=colo829_tumor_xy.haplotagged.sv_breakpoints_nochr_srt.vcf
rtg bgzip colo829_tumor_xy.haplotagged.sv_breakpoints_nochr_srt.vcf
rtg index colo829_tumor_xy.haplotagged.sv_breakpoints_nochr_srt.vcf.gz

rtg bgzip truthset_somaticSVs_COLO829_hg38lifted.vcf
rtg index truthset_somaticSVs_COLO829_hg38lifted.vcf.gz
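
Before running the comparison, both problems can be checked for directly. The following is only a rough sketch: the helper names vcf_contigs and vcf_is_sorted are made up for illustration, and "sorted" here means only that records are grouped by chromosome with non-decreasing positions, which is a weaker condition than the sequence-dictionary order that Picard SortVcf produces. It also assumes a gzip that passes uncompressed input through unchanged when given -cdf.

```shell
# List the distinct chromosome names in a VCF (plain, gzipped or bgzipped),
# in order of first appearance.
vcf_contigs() {
  gzip -cdf "$1" | awk -F'\t' '!/^#/ && !seen[$1]++ { print $1 }'
}

# Succeed (exit 0) if records are grouped by chromosome and positions are
# non-decreasing within each chromosome; fail (exit 1) otherwise.
vcf_is_sorted() {
  gzip -cdf "$1" | awk -F'\t' '
    /^#/ { next }
    $1 != chrom {
      if (finished[$1]) exit 1   # chromosome reappears after another one
      finished[chrom] = 1
      chrom = $1
      pos = 0
    }
    $2 + 0 < pos { exit 1 }      # position went backwards
    { pos = $2 + 0 }'
}
```

For example, comparing the output of vcf_contigs on truthset_somaticSVs_COLO829_hg38lifted.vcf.gz with the output on the calls file shows whether one side uses "chr1" where the other uses "1", and vcf_is_sorted on the calls file shows whether it needs sorting before indexing.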

rtg bndeval -b truthset_somaticSVs_COLO829_hg38lifted.vcf.gz -c colo829_tumor_xy.haplotagged.sv_breakpoints_nochr_srt.vcf.gz -o output
Read baseline variant set containing 132 BND variants on 19 chromosomes
Read calls variant set containing 301586 BND variants on 24 chromosomes
There were 301586 variants not thresholded in ROC data files due to missing or invalid DP (INFO) values.
Could not select maximized F-measure threshold from ROC data, only un-thresholded statistics will be shown. Consider selecting a different scoring attribute with --vcf-score-field
Threshold  True-pos-baseline  True-pos-call  False-pos  False-neg  Precision  Sensitivity  F-measure
----------------------------------------------------------------------------------------------------
     None                110            110     301476         22     0.0004       0.8333     0.0007
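
The summary row can be reproduced from the counts using the standard definitions of precision, sensitivity and F-measure. This is a small sketch only; bnd_metrics is a made-up helper name, and it uses a single true-positive count for both ratios, which is fine here because True-pos-baseline and True-pos-call are both 110.

```shell
# Recompute precision, sensitivity and F-measure from true-positive,
# false-positive and false-negative counts.
bnd_metrics() {
  awk -v tp="$1" -v fp="$2" -v fn="$3" 'BEGIN {
    p = tp / (tp + fp)           # precision
    r = tp / (tp + fn)           # sensitivity (recall)
    printf "%.4f %.4f %.4f\n", p, r, 2 * p * r / (p + r)
  }'
}

bnd_metrics 110 301476 22   # prints: 0.0004 0.8333 0.0007
```

So 110 of the 132 baseline breakends are matched, and the very low precision simply reflects that the call set contains 301586 BND records against 132 in the baseline.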

Regards,
Sean.


Asher Bryant

Jun 14, 2023, 10:42:54 PM
to RTG Users, Sean Irvine, Asher Bryant
That's fantastic! Thank you very much!