Hi,
We are using RSeQC to assess total nucleic acid quality for a targeted RNA-Seq clinical assay. We would like to interpret the read distribution output well enough to say whether a sample is mostly DNA, mostly RNA, or a roughly equal mixture of the two. It looks like the read distribution analysis preferentially tags reads that overlap both an intron and a CDS exon as CDS_Exons, while reads that map only to introns are tagged as Introns. Is that right? How is a paired-end read handled when one mate maps to an intron and the other maps to an exon? Also, does RSeQC respect the duplicate flag that duplicate marking sets on read alignments?
Any suggestions on which RSeQC output would help us tell an RNA-rich sample from a DNA-rich one would be very helpful. Clinically, we would fail a case that is mostly DNA but pass one with mostly RNA or a good mix of RNA and DNA.
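To make the question concrete, here is a minimal sketch of the kind of summary we have in mind. It assumes the read_distribution.py report is a whitespace-separated table with the columns Group, Total_bases, Tag_count and Tags/Kb, and that exonic tag density divided by intronic tag density is a sensible proxy (near 1 for roughly uniform, DNA-like coverage; much higher for RNA). The function names are our own, no cutoff is implied, and on a targeted panel the capture design itself may skew the ratio, so any threshold would have to be calibrated on known samples.

```python
# Group names assumed to match the rows of the read_distribution.py table.
EXONIC_GROUPS = ("CDS_Exons", "5'UTR_Exons", "3'UTR_Exons")


def parse_read_distribution(text):
    """Return {group: (total_bases, tag_count)} for each table row."""
    rows = {}
    for line in text.splitlines():
        parts = line.split()
        # Data rows have four fields; the 2nd and 3rd are integers.
        # The header row and the summary lines fail this check.
        if len(parts) == 4 and parts[1].isdigit() and parts[2].isdigit():
            rows[parts[0]] = (int(parts[1]), int(parts[2]))
    return rows


def exon_intron_density_ratio(rows):
    """Exonic tags per base divided by intronic tags per base."""
    exon_bases = sum(rows[g][0] for g in EXONIC_GROUPS if g in rows)
    exon_tags = sum(rows[g][1] for g in EXONIC_GROUPS if g in rows)
    intron_bases, intron_tags = rows["Introns"]
    if exon_bases == 0 or intron_bases == 0:
        raise ValueError("exonic or intronic region has zero bases")
    if intron_tags == 0:
        # No intronic tags at all: the ratio is unbounded.
        return float("inf")
    return (exon_tags / exon_bases) / (intron_tags / intron_bases)
```

We would run this on the saved text output of read_distribution.py for each sample and compare the ratio against samples of known composition.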
Thanks,
Long