bamRemoveDuplicates

Olivia Fong

Oct 4, 2017, 11:25:44 PM
to rna-star
Hi Alex,
   How does --bamRemoveDuplicatesType work? For paired-end data, do both read 1 and read 2 have to match for the pair to count as a duplicate?

Thanks,
Olivia

Alexander Dobin

Oct 5, 2017, 11:34:04 AM
to rna-star
Hi Olivia,

Duplicates are called when the coordinates of both mates and their CIGARs coincide. Soft-clipped ends are extended for the comparison.
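
To illustrate the comparison described above, here is a minimal Python sketch, not STAR's actual code. The function names (`unclipped_span`, `duplicate_key`) are hypothetical, and the sketch assumes that "extended" means the soft-clipped bases are added back onto the alignment's start and end before the coordinates are compared. Strand and other details STAR may consider are left out.

```python
import re

# CIGAR operations as defined in the SAM specification.
_CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_cigar(cigar):
    """Split a CIGAR string such as '5S45M' into [(5, 'S'), (45, 'M')]."""
    return [(int(n), op) for n, op in _CIGAR_OP.findall(cigar)]


def _edge_soft_clip(ops):
    """Length of the soft clip at the front of `ops`, skipping hard clips."""
    for length, op in ops:
        if op == "H":
            continue
        return length if op == "S" else 0
    return 0


def unclipped_span(pos, cigar):
    """Return the 1-based (start, end) span with soft clips extended.

    `pos` is the SAM POS field (leftmost aligned base).
    """
    ops = parse_cigar(cigar)
    # Operations that consume reference bases.
    ref_len = sum(n for n, op in ops if op in "MDN=X")
    start = pos - _edge_soft_clip(ops)
    end = pos + ref_len - 1 + _edge_soft_clip(reversed(ops))
    return start, end


def duplicate_key(chrom, pos1, cigar1, pos2, cigar2):
    """Hypothetical key: two pairs are treated as duplicates when keys match.

    Both mates contribute their extended coordinates and their CIGAR.
    """
    return (
        chrom,
        unclipped_span(pos1, cigar1), cigar1,
        unclipped_span(pos2, cigar2), cigar2,
    )
```

With this key, two pairs only compare equal if both mates agree, so a match on read 1 alone would not be enough under these assumptions.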

Cheers
Alex

Olivia Fong

Oct 12, 2017, 5:48:30 PM
to rna-star
Thanks, Alex,
  How do you choose which read to keep? Is it by quality, or is it random?

Olivia

Alexander Dobin

Oct 13, 2017, 11:00:28 AM
to rna-star
Hi Olivia,

STAR keeps the read with the best alignment score (AS tag in SAM), i.e. the read with the minimum number of mismatches is recorded.
Quality scores are not taken into account, unlike in Picard and samtools. If you are calling variants, you may want to choose by quality score instead.
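
As a rough illustration of the selection rule, the following Python sketch keeps one record per duplicate group, choosing the highest AS value. It is not STAR's implementation: the record layout (dicts with `name`, `key`, and `AS` fields) and the function name `keep_best_by_as` are made up for the example, and keeping the first record on a tie is an assumption, since tie-breaking is not described here.

```python
def keep_best_by_as(records):
    """Keep the record with the highest alignment score in each group.

    Each record is a dict with hypothetical fields:
      'name' - read name
      'key'  - any hashable value identifying the duplicate group
      'AS'   - alignment score (the SAM AS tag)
    Base quality scores are deliberately ignored.
    Ties keep the first record seen (an assumption of this sketch).
    """
    best = {}
    for rec in records:
        current = best.get(rec["key"])
        if current is None or rec["AS"] > current["AS"]:
            best[rec["key"]] = rec
    # Dicts preserve insertion order, so groups come out in first-seen order.
    return list(best.values())


reads = [
    {"name": "r1", "key": "groupA", "AS": 96},
    {"name": "r2", "key": "groupA", "AS": 98},
    {"name": "r3", "key": "groupB", "AS": 90},
    {"name": "r4", "key": "groupA", "AS": 98},
]
kept = [rec["name"] for rec in keep_best_by_as(reads)]
```

To choose by quality score instead, as suggested for variant calling, the same loop could compare a per-read quality sum rather than `AS`.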

Cheers
Alex