multi-mappers (non-ribosomal)


Joseph Mudd

Apr 11, 2019, 6:46:25 PM
to rna-star
Hi Alex,

In one of my runs I'm getting nearly 50% multi-mappers (attached log.final.out).

I thought these might be ribosomal. However, I aligned one of the FASTQ files to human 18S rRNA sequences with BBDuk and got surprisingly few contaminants:

Input is being processed as unpaired
Started output streams: 0.078 seconds.
Processing time:    6.266 seconds.

Input:                  6336220 reads 921191628 bases.
Contaminants:            164762 reads (2.60%) 22565483 bases (2.45%)
Total Removed:          164762 reads (2.60%) 22565483 bases (2.45%)
Result:                  6171458 reads (97.40%) 898626145 bases (97.55%)


If multi-mappers are not ribosomal contamination, could you guess what they may be?  This is assuming everything is right with my command.  I have attached the Log.out.

thanks

jc



Log.out.txt
log.final.out.txt

Vasily A.

Apr 20, 2019, 8:04:03 AM
to rna-star
(While waiting for Alex's reply:) you can check this directly in your output SAM/BAM file by filtering the reads on the `NH` tag. Multi-mapped reads have a value >1 (in a SAM file it looks like `NH:i:2`, etc.).
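As a rough illustration of the `NH` check, here is a minimal Python sketch that splits read names into unique and multi-mapped sets from SAM text. The function name `split_unique_multi` and the sample records are made up for the example; it assumes the aligner writes an `NH:i:` tag on every mapped record (STAR does by default) and that both mates of a pair share one read name.

```python
def split_unique_multi(sam_lines):
    """Return (unique, multi) sets of read names based on the NH tag.

    Hypothetical helper for illustration: it looks only at mapped
    records and trusts the NH:i:<n> tag written by the aligner.
    """
    nh_by_read = {}
    for line in sam_lines:
        if line.startswith("@"):          # header line, not an alignment
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:              # not a complete SAM record
            continue
        if int(fields[1]) & 0x4:          # FLAG bit 0x4: segment unmapped
            continue
        for tag in fields[11:]:           # optional tags start at column 12
            if tag.startswith("NH:i:"):
                nh_by_read[fields[0]] = int(tag[5:])
                break
    unique = {name for name, nh in nh_by_read.items() if nh == 1}
    multi = {name for name, nh in nh_by_read.items() if nh > 1}
    return unique, multi


# Made-up records: r1 maps once, r2 maps twice, r3 is unmapped.
sample = [
    "@HD\tVN:1.4",
    "r1\t0\tchr1\t100\t255\t50M\t*\t0\t0\t*\t*\tNH:i:1\tHI:i:1",
    "r2\t0\tchr1\t200\t3\t50M\t*\t0\t0\t*\t*\tNH:i:2\tHI:i:1",
    "r2\t256\tchr5\t900\t3\t50M\t*\t0\t0\t*\t*\tNH:i:2\tHI:i:2",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*\tuT:A:1",
]
unique, multi = split_unique_multi(sample)
print(len(unique), "unique,", len(multi), "multi-mapped")
```

For a BAM file, the same function can be fed the text output of `samtools view`, one line per record. Counting read names rather than records avoids counting a multi-mapper once per reported location.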

Alexander Dobin

Apr 22, 2019, 8:50:25 AM
to rna-star
Hi Joseph,

You are using `--outFilterScoreMinOverLread 0 --outFilterMatchNminOverLread 0`,
which removes any limit on the minimum alignment length. Because of that, you are getting a lot of short reads mapped as multi-mappers. With the default parameters they would not be mapped at all. I would recommend mapping with default parameters, and then:
1. Map just one pair of read1/read2 files first, to check that there is no problem with the list.
2. If the mapped % is still small, map read1 and read2 separately.
3. If the mapped % is still small, BLAST the unmapped reads from 1 to check for contamination.
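
To make the effect of the two parameters concrete, here is a simplified Python model of the length filter. It is a sketch, not STAR's actual code: the function name is invented, the 150-base read and its score are made-up numbers, and STAR's exact comparison and rounding may differ. It assumes the documented defaults of 0.66 for both options, with read length taken as the sum of the mates' lengths for paired-end reads.

```python
def passes_length_filters(score, matched_bases, read_length,
                          score_min_over_lread=0.66,
                          match_nmin_over_lread=0.66):
    """Simplified model of --outFilterScoreMinOverLread and
    --outFilterMatchNminOverLread (hypothetical helper, other STAR
    filters such as mismatch limits are ignored).

    An alignment is kept only if both its score and its number of
    matched bases reach the given fraction of the read length.
    """
    return (score >= score_min_over_lread * read_length
            and matched_bases >= match_nmin_over_lread * read_length)


# Made-up example: a 150-base read where only 40 bases align (score 38).
short_hit = dict(score=38, matched_bases=40, read_length=150)

# Defaults require roughly 99 of 150 bases, so the short hit is dropped.
print(passes_length_filters(**short_hit))

# With both fractions set to 0 the same short hit is kept, and a
# 40-base match is far more likely to occur at several genomic loci.
print(passes_length_filters(**short_hit,
                            score_min_over_lread=0,
                            match_nmin_over_lread=0))
```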

Cheers
Alex