read IDs in the Unmapped.out.mate1/2 files

Vasily A.

Mar 3, 2020, 9:23:29 PM
to rna-...@googlegroups.com
I run STAR with the --outReadsUnmapped Fastx parameter and can't find in the manual how the read header lines are constructed in the output files. (I know how the original Illumina read IDs are built; my question is only about the part added by STAR.)
I see something like the following:
@A00261:110:HH2HHDMXX:1:1108:30138:17425 1:N:0:AACGTGAT+AAACATCG - in the input mate1 file
@A00261:110:HH2HHDMXX:1:1108:30138:17425 2:N:0:AACGTGAT+AAACATCG - in the input mate2 file
@A00261:110:HH2HHDMXX:1:1108:30138:17425 0:N:  00 - Unmapped.out.mate1
@A00261:110:HH2HHDMXX:1:1108:30138:17425 1:N:  00 - Unmapped.out.mate2
That is, the original read ID is always followed by 0:N: for mate1 or 1:N: for mate2, then two spaces, then usually 00 but sometimes 01 or 10. What do these numbers mean?
Also, as a feature request for future versions: would it be possible to add the reason the read was not mapped here, like uT in the SAM output?

Alexander Dobin

Mar 9, 2020, 5:33:42 PM
to rna-star
Hi Vasily,

00: both mates unmapped
10: 1st mate mapped, 2nd unmapped
01: 1st unmapped, 2nd mapped
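
For anyone who wants to decode these headers programmatically, here is a minimal sketch in Python. It assumes the header layout is exactly what the examples in this thread show (read ID, then 0:N: or 1:N:, whitespace, and a two-digit status); the layout is inferred from those examples, not from the manual. The names HEADER_RE, STATUS_MEANING and parse_unmapped_header are hypothetical and not part of STAR.

```python
import re

# Assumed layout, inferred from the example headers in this thread:
#   @<read id> <0|1>:N:  <two-digit mate status>
HEADER_RE = re.compile(
    r"^@(?P<read_id>\S+)\s+(?P<mate>[01]):N:\s+(?P<status>[01]{2})\s*$"
)

# Meanings as given in the reply above.
STATUS_MEANING = {
    "00": "both mates unmapped",
    "10": "mate 1 mapped, mate 2 unmapped",
    "01": "mate 1 unmapped, mate 2 mapped",
}


def parse_unmapped_header(line):
    """Parse one header line from Unmapped.out.mate1/2.

    Returns a dict, or None if the line does not match the assumed layout.
    """
    m = HEADER_RE.match(line.rstrip("\n"))
    if m is None:
        return None
    status = m.group("status")
    return {
        "read_id": m.group("read_id"),
        # In the examples, 0:N: marks mate1 and 1:N: marks mate2.
        "mate": int(m.group("mate")) + 1,
        "status": status,
        "meaning": STATUS_MEANING.get(status, "unknown"),
    }


if __name__ == "__main__":
    example = "@A00261:110:HH2HHDMXX:1:1108:30138:17425 0:N:  00"
    print(parse_unmapped_header(example))
```

Lines that do not match the assumed pattern (for example, sequence or quality lines) return None, so the function can be applied to every fourth line of the file or used as a filter.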

Adding uT is a good suggestion, will add to my list.

Cheers
Alex

Vasily A.

Mar 9, 2020, 5:50:35 PM
to rna-star
Thank you for the explanation!
(Maybe it would also be worth adding this info to the manual.)