A guideline to interpret output files

Daekwan Seo

Nov 29, 2017, 5:14:28 PM
to verifyBamID
First of all, many thanks for this nice tool for checking contamination in sequencing data.

The section on interpreting output files says:

"When genotype data is not available but allele-frequency-based estimates of [FREEMIX] >= 0.03 and [FREELK1]-[FREELK0] is large, then it is possible that the sample is contaminated with other sample. We recommend to use per-sample data rather than per-lane data for checking this for low coverage data, because the inference will be more confident when there are large number of bases with depth 2 or higher."

Could you suggest a threshold for FREELK1-FREELK0? That is, how large does the difference need to be before I should conclude that my sample is contaminated?

Best,

Daekwan

Hyun Min Kang

Nov 29, 2017, 5:32:21 PM
to verif...@googlegroups.com
I would use something like 12, which corresponds to a p-value of about 1e-6 when a chi-square distribution is used as the null distribution. This might still be too lenient, and a heuristic threshold such as 100 might be better in terms of specificity.
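
For reference, the arithmetic behind the figure of 12 can be sketched as below. This is only an illustration under an assumption that is not spelled out in the thread: the log-likelihood difference is doubled to form a likelihood-ratio statistic, which is then compared against a chi-square distribution with one degree of freedom. The function names are made up for the example and are not part of verifyBamID.

```python
import math


def chisq1_sf(stat):
    """Upper-tail probability of a chi-square distribution with 1 degree of freedom."""
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(stat / 2.0))


def llk_diff_pvalue(llk_diff):
    """P-value for a log-likelihood difference.

    Assumption (not from the thread): test statistic = 2 * difference, 1 df.
    Negative differences are clamped to zero, giving a p-value of 1.
    """
    return chisq1_sf(2.0 * max(llk_diff, 0.0))


if __name__ == "__main__":
    # The two thresholds mentioned above: 12 and the heuristic 100.
    for diff in (12, 100):
        print(f"difference {diff}: p ~ {llk_diff_pvalue(diff):.2e}")
```

Under that assumption a difference of 12 gives a p-value just under 1e-6, and 100 gives a value that is vanishingly small, which fits the description of 100 as the more specific cutoff.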

Daekwan Seo

Nov 30, 2017, 10:41:05 AM
to verifyBamID
Thanks

raqs...@gmail.com

Dec 12, 2018, 4:38:49 AM
to verifyBamID
Hi,

I have been getting negative values (over a wide range) for FREELK1 - FREELK0, regardless of the FREEMIX value.
The command line used is:

verifyBamID --vcf $1 --bam $2 --out $3 --verbose --ignoreRG --site --minQ 20

Two example outputs from different BAMs and VCFs:

#SEQ_ID RG CHIP_ID #SNPS #READS AVG_DP FREEMIX FREELK1 FREELK0 FREE_RH FREE_RA CHIPMIX CHIPLK1 CHIPLK0 CHIP_RH CHIP_RA DPREF RDPHET RDPALT
SI_2 ALL NA 11341430 9579343 0.84 0.49992 3276580.38 3362001.50 NA NA NA NA NA NA NA NA NA NA

or

#SEQ_ID RG CHIP_ID #SNPS #READS AVG_DP FREEMIX FREELK1 FREELK0 FREE_RH FREE_RA CHIPMIX CHIPLK1 CHIPLK0 CHIP_RH CHIP_RA DPREF RDPHET RDPALT
79_C4_CD4naive ALL NA 2615527 438963 0.17 0.00468 171452.74 171803.39 NA NA NA NA NA NA NA NA NA NA

Best regards,
Raquel
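
To make the sign of the difference in the two example rows above easy to check, here is a minimal Python sketch that parses whitespace-separated output of this shape and computes FREELK1 - FREELK0 per sample. The helper names `parse_selfsm` and `freelk_diff` are hypothetical, and the two rows are pasted under a single header purely for convenience; nothing here calls verifyBamID itself.

```python
EXAMPLE = """\
#SEQ_ID RG CHIP_ID #SNPS #READS AVG_DP FREEMIX FREELK1 FREELK0 FREE_RH FREE_RA CHIPMIX CHIPLK1 CHIPLK0 CHIP_RH CHIP_RA DPREF RDPHET RDPALT
SI_2 ALL NA 11341430 9579343 0.84 0.49992 3276580.38 3362001.50 NA NA NA NA NA NA NA NA NA NA
79_C4_CD4naive ALL NA 2615527 438963 0.17 0.00468 171452.74 171803.39 NA NA NA NA NA NA NA NA NA NA
"""


def parse_selfsm(text):
    """Parse whitespace-separated rows into dicts keyed by column name."""
    lines = [ln.split() for ln in text.strip().splitlines() if ln.strip()]
    # Drop the leading '#' from header names such as '#SEQ_ID' and '#SNPS'.
    header = [name.lstrip("#") for name in lines[0]]
    return [dict(zip(header, row)) for row in lines[1:]]


def freelk_diff(row):
    """Return FREELK1 - FREELK0 for one parsed row."""
    return float(row["FREELK1"]) - float(row["FREELK0"])


if __name__ == "__main__":
    for row in parse_selfsm(EXAMPLE):
        print(row["SEQ_ID"], row["FREEMIX"], f"{freelk_diff(row):.2f}")
```

For the two rows shown, the differences come out at about -85421.12 and -350.65, so both are negative as described.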