Why does delly generate results where both the tumor and normal samples are annotated to be homozygous reference (0/0)?


Peter Waltman

Feb 19, 2026, 6:14:39 PM
to delly-users
I'm trying to understand the delly annotation and filtering step for somatic SVs. Specifically, why does delly generate results where both the tumor and normal samples are annotated as homozygous reference (0/0), even in the initial call step?

Below, I describe my testing approach, but the fundamental question is why delly returns fusions/breakpoints where both the tumor and normal are annotated to be homozygous reference.



Testing process:

My group has identified 9 clinically-validated samples that each contain a fusion. Initially, these were identified using RNA-seq tools, and then validated with DNA-seq and visualized in IGV.

For testing purposes, we extracted the DNA-seq reads associated with those 9 fusions and added them synthetically to 4 other, unrelated samples. To do this, we created fastq files for the "fusion reads" and added them to the set of fastqs that are aligned for each of those samples. In other words, if a sample was paired-end sequenced across 2 lanes, we added a "3rd" lane consisting of the fastqs that contain the "fusion reads." An example sample sheet is at the end.

Finally, each of these synthetic samples has an associated normal/control sample that is unique to each sample, and is provided to delly when it is generating the initial set of somatic SVs. 

This setup allows us to have a ground truth of fusion events/breakpoints that delly should find.
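A minimal sketch of how recovery of the ground-truth breakpoints could be checked against a delly VCF that has been converted to text. It assumes the partner breakpoint is stored in the INFO keys CHR2 and POS2 (falling back to END), and the 500 bp matching window is an arbitrary choice; the function names are hypothetical and the layout should be confirmed against the header of the actual VCF.

```python
from typing import Dict, Iterable, List, Tuple

# (chrom1, pos1, chrom2, pos2); a hypothetical representation for this sketch
Breakpoint = Tuple[str, int, str, int]


def parse_info(info: str) -> Dict[str, str]:
    """Split a VCF INFO column into a key -> value dict (flags map to '')."""
    out: Dict[str, str] = {}
    for item in info.split(";"):
        key, _, value = item.partition("=")
        out[key] = value
    return out


def record_breakpoint(vcf_line: str) -> Breakpoint:
    """Extract both breakpoint ends from one tab-separated VCF record.

    Assumption: the partner end is in INFO as CHR2 and POS2 (or END).
    """
    fields = vcf_line.rstrip("\n").split("\t")
    chrom, pos = fields[0], int(fields[1])
    info = parse_info(fields[7])
    chrom2 = info.get("CHR2", chrom)
    pos2 = int(info.get("POS2") or info.get("END") or pos)
    return (chrom, pos, chrom2, pos2)


def matches(truth: Breakpoint, call: Breakpoint, window: int = 500) -> bool:
    """True if both ends agree within `window` bp, in either orientation."""
    def close(a: Tuple[str, int], b: Tuple[str, int]) -> bool:
        return a[0] == b[0] and abs(a[1] - b[1]) <= window

    t1, t2 = truth[:2], truth[2:]
    c1, c2 = call[:2], call[2:]
    return (close(t1, c1) and close(t2, c2)) or (close(t1, c2) and close(t2, c1))


def recovered(truth_set: Iterable[Breakpoint],
              vcf_lines: Iterable[str],
              window: int = 500) -> Dict[Breakpoint, bool]:
    """Map each ground-truth breakpoint to whether any call matches it."""
    calls: List[Breakpoint] = [
        record_breakpoint(line)
        for line in vcf_lines
        if line.strip() and not line.startswith("#")
    ]
    return {t: any(matches(t, c, window) for c in calls) for t in truth_set}
```

Running `recovered` once on the output of the call step and once on the output of the filter step shows which of the 9 ground-truth fusions are lost at each stage.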

The good news is that delly does find these fusions - at least with the initial vcf that it generates (woohoo!! yay delly!). The bad news is multifaceted:
  1. the genotypes that delly assigns during the initial call and filter steps are usually homozygous reference (0/0) for the tumor sample
  2. the delly filter step usually removes half of these ground truth fusions 
  3. the delly filter step produces a vcf that includes numerous fusions that are marked as being "LowQual"
Because of issues 2 & 3, I'm reluctant to rely on the filter step to generate the geno.bcf file that is produced by the genotype step, especially as our pipeline doesn't allow multiple normals to be used when genotyping a sample.

My biggest concern is the homozygous reference genotypes that are assigned to the ground-truth fusions/breakpoints. Is there an explanation you can suggest that I could give to my supervisor for why we shouldn't be concerned about those?

Honestly, I don't understand why delly would return any result that it genotypes as being homozygous reference in both the tumor and normal.
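To make the question concrete, here is a minimal sketch of how the records in question could be listed from a VCF converted to text. It flags every record where both samples have GT 0/0 and prints the per-sample support counts next to the genotype. Assumptions: the tumor is the first sample column and the normal the second, and the FORMAT keys DR/DV and RR/RV are the reference/variant read-pair and junction-read counts (any key missing from a record is simply skipped). The function names are hypothetical.

```python
from typing import Dict, Iterable, List, Sequence


def sample_fields(vcf_line: str, sample_index: int) -> Dict[str, str]:
    """Return FORMAT key -> value for one sample column of a VCF record."""
    fields = vcf_line.rstrip("\n").split("\t")
    keys = fields[8].split(":")
    values = fields[9 + sample_index].split(":")
    return dict(zip(keys, values))


def summarize(vcf_lines: Iterable[str],
              tumor_index: int = 0,      # assumption: tumor is the first sample
              normal_index: int = 1,     # assumption: normal is the second
              keys: Sequence[str] = ("GT", "DR", "DV", "RR", "RV")) -> List[dict]:
    """One summary row per record, flagging tumor and normal both 0/0."""
    rows: List[dict] = []
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        tumor = sample_fields(line, tumor_index)
        normal = sample_fields(line, normal_index)
        rows.append({
            "id": fields[2],
            "filter": fields[6],
            "both_hom_ref": tumor.get("GT") == "0/0" and normal.get("GT") == "0/0",
            "tumor": {k: tumor[k] for k in keys if k in tumor},
            "normal": {k: normal[k] for k in keys if k in normal},
        })
    return rows


def print_hom_ref(rows: Iterable[dict]) -> None:
    """Print only the records where both samples are genotyped 0/0."""
    for row in rows:
        if row["both_hom_ref"]:
            print(row["id"], row["filter"], row["tumor"], row["normal"], sep="\t")
```

If a ground-truth fusion shows up here with a nonzero variant count in the tumor but still a 0/0 genotype, that would be a specific record to point to when asking why it was reported.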

sample, fq_r1, fq_r2
sample01, lane01_R1.fq.gz, lane01_R12.fq.gz
sample01, lane02_R1.fq.gz, lane02_R12.fq.gz
sample01, fusion_R1.fq.gz, fusion_R12.fq.gz


