Duplicated reads (estimated) vs Duplication rate in bamqc report

Jason

Aug 29, 2016, 7:47:12 PM

to QualiMap

Hi all,

I'm new to using Qualimap for RNA-seq data QC. After running it, I'm confused by the difference between "Duplicated reads (estimated)" and "Duplication rate" in the report. How is each term defined and calculated?

In particular, how is the number of duplicated reads estimated? And why does the duplication rate differ from "Duplicated reads (estimated)" divided by the total number of reads?

Thank you.

Jason

Konstantin Okonechnikov

Aug 31, 2016, 3:53:49 PM

to qual...@googlegroups.com

Hi Jason,

Duplicated reads are detected by counting how many positions in the genome have exactly 1, 2, 3, etc. reads starting at them. An alignment is considered a duplicate if another alignment starting at the same position has already been seen. The "Duplication Rate Histogram" plot gives a general overview of this distribution.
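As a rough illustration of the counting described above, here is a minimal Python sketch. It is not Qualimap's actual code: the function names and the toy `starts` list are hypothetical, and keying a position on (chromosome, start) only, without strand, is an assumption.

```python
from collections import Counter

def start_position_histogram(starts):
    """Return {k: number of positions with exactly k reads starting there}."""
    per_position = Counter(starts)          # position -> reads starting there
    return Counter(per_position.values())   # k -> number of such positions

def estimated_duplicates(starts):
    """Count every alignment after the first one at a given start position."""
    per_position = Counter(starts)
    return sum(n - 1 for n in per_position.values())

# Toy data (hypothetical): one (chromosome, start) tuple per alignment.
starts = [
    ("chr1", 100), ("chr1", 100), ("chr1", 100),  # 3 reads, 2 duplicates
    ("chr1", 250),                                # 1 read, no duplicate
    ("chr2", 40), ("chr2", 40),                   # 2 reads, 1 duplicate
]

hist = start_position_histogram(starts)
print(sorted(hist.items()))          # [(1, 1), (2, 1), (3, 1)]
print(estimated_duplicates(starts))  # 3
```

In this toy case 3 of the 6 alignments would be counted as duplicates, since the first alignment seen at each position is not counted.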

The duplication rate value is estimated as

Dup Rate = 1 - USP,

where USP is the proportion of genomic positions that have exactly one read starting at them, relative to all genomic positions that have at least one read starting at them.
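To make the formula concrete, here is a small Python sketch under the same caveats: it is an illustration of the definition above, not Qualimap's implementation, and `duplication_rate` and the toy `starts` data are hypothetical names. It also shows why the rate need not equal duplicated reads divided by total reads: the rate is a ratio over start positions, not over reads.

```python
from collections import Counter

def duplication_rate(starts):
    """Dup Rate = 1 - USP, where USP = (positions with exactly one read
    starting there) / (positions with at least one read starting there)."""
    per_position = Counter(starts)
    if not per_position:
        return 0.0  # no alignments: treat the rate as 0 (assumption)
    unique_positions = sum(1 for n in per_position.values() if n == 1)
    usp = unique_positions / len(per_position)
    return 1.0 - usp

# Toy data (hypothetical): one (chromosome, start) tuple per alignment.
starts = [
    ("chr1", 100), ("chr1", 100), ("chr1", 100),
    ("chr1", 250),
    ("chr2", 40), ("chr2", 40),
]

# Position-based rate: 3 start positions, 1 of them unique.
rate = duplication_rate(starts)

# Read-based fraction, for comparison: 3 duplicated reads out of 6.
dup_reads = sum(n - 1 for n in Counter(starts).values())
read_fraction = dup_reads / len(starts)

print(round(rate, 3), read_fraction)  # 0.667 0.5
```

Here the position-based rate is about 0.667 while the read-based fraction is 0.5, so the two values in the report can legitimately differ.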

Konstantin
