The primary purpose of svprops is to create a feature matrix for machine-learning approaches that filter germline SVs. For very large cohorts, such as 1000 Genomes or some of the large clinical resequencing cohorts, this is a viable alternative to delly's default germline SV filter if and only if a good training set is available.
Thus, svprops mostly generates site-level statistics across all samples, and for such large cohorts we usually first recalibrate the GQs using:
I am still updating some of the output fields of svprops, but here is a brief explanation:
vac variant allele count (across all samples)
vaf variant allele frequency (across all samples)
singleton N/A if present in multiple samples or sample name if only present in one
missingrate how many samples have a missing genotype (useful after GQ calibration)
ct Delly's INFO:CT
precise 1 if INFO:PRECISE
ci Delly's INFO:CIPOS
inslen insertion length
homlen homology length
ce Delly's INFO:CE
refgq median reference genotype quality for all SV non-carriers (GT==0/0)
altgq median alternative genotype quality for heterozygous SV carriers
gqsum total GQ sum (useful to flag repetitive sites that are poorly genotyped)
rdratio read-depth ratio of SV carriers to non-carriers (useful for filtering CNVs)
medianrc median coverage
refratio median REF support for non-carriers
altratio median ALT support for carriers
maxaltratio max. ALT support for carriers
PEsupport total paired end support across all samples
SRsupport total split-read support across all samples
supportsum total depth across all samples
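
To make a few of the genotype-based fields above concrete, here is a minimal Python sketch of how vac, vaf, singleton, missingrate, refgq and altgq could be derived for a single site. This is not svprops' actual implementation: the function name `site_stats`, the dictionary inputs, the choice to compute vaf over called alleles only, the choice to report missingrate as a fraction, and the fallback value 0 for empty medians are all assumptions made for illustration.

```python
from statistics import median


def site_stats(genotypes, gqs):
    """Illustrative per-site statistics (hypothetical helper, not svprops code).

    genotypes: dict of sample name -> GT string, e.g. "0/0", "0/1", "1/1", "./."
    gqs:       dict of sample name -> genotype quality (GQ)
    """
    vac = 0               # variant allele count across all samples
    called_alleles = 0    # alleles with a non-missing call
    carriers = []         # samples carrying at least one ALT allele
    missing = 0           # samples with a missing genotype
    ref_gq, het_gq = [], []

    for sample, gt in genotypes.items():
        alleles = gt.replace("|", "/").split("/")
        if "." in alleles:
            missing += 1
            continue
        alt = sum(1 for a in alleles if a != "0")
        vac += alt
        called_alleles += len(alleles)
        if alt == 0:
            ref_gq.append(gqs[sample])      # non-carrier (GT==0/0)
        else:
            carriers.append(sample)
            if alt == 1:
                het_gq.append(gqs[sample])  # heterozygous carrier

    return {
        "vac": vac,
        # Assumption: frequency is taken over called alleles only.
        "vaf": vac / called_alleles if called_alleles else 0.0,
        # Sample name if exactly one carrier, otherwise "N/A".
        "singleton": carriers[0] if len(carriers) == 1 else "N/A",
        # Assumption: reported as a fraction of all samples.
        "missingrate": missing / len(genotypes) if genotypes else 0.0,
        "refgq": median(ref_gq) if ref_gq else 0,
        "altgq": median(het_gq) if het_gq else 0,
    }


# Toy example with four samples, one of them ungenotyped.
genotypes = {"s1": "0/0", "s2": "0/1", "s3": "./.", "s4": "0/0"}
gqs = {"s1": 40, "s2": 30, "s3": 0, "s4": 20}
stats = site_stats(genotypes, gqs)
print(stats)
```

In this toy site only s2 carries the SV, so the singleton field holds its name, and the ungenotyped sample s3 contributes to missingrate but not to the allele counts.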