Salmon: Large amount of Reads discarded but High Mapping Rate

Jag Lally

Aug 10, 2021, 12:30:11 PM
to Sailfish Users Group
Hi, 

We are seeing a very large "number of mappings discarded because of alignment score", nearly 98% of our reads in some cases. Yet our mapping rates are fairly high, in the range of 80-90%. Has anyone encountered this before, and is it something we should be concerned about?

See example below: 

Observed 41084625 total fragments (41084625 in most recent round)


[2021-08-04 11:58:50.908] [jointLog] [info] Computed 575,448 rich equivalence classes for further processing

[2021-08-04 11:58:50.908] [jointLog] [info] Counted 36,907,348 total reads in the equivalence classes 

[2021-08-04 11:58:50.918] [jointLog] [warning] 0.00482906% of fragments were shorter than the k used to build the index.

If this fraction is too large, consider re-building the index with a smaller k.

The minimum read size found was 8.



[2021-08-04 11:58:50.918] [jointLog] [info] Number of mappings discarded because of alignment score : 48,653,382

[2021-08-04 11:58:50.918] [jointLog] [info] Number of fragments entirely discarded because of alignment score : 2,493,069

[2021-08-04 11:58:50.918] [jointLog] [info] Number of fragments discarded because they are best-mapped to decoys : 2,026,863

[2021-08-04 11:58:50.918] [jointLog] [info] Number of fragments discarded because they have only dovetail (discordant) mappings to valid targets : 1,155,701

[2021-08-04 11:58:50.918] [jointLog] [info] Mapping rate = 89.8325%


Thank you!


Rob

Aug 10, 2021, 4:58:58 PM
to Sailfish Users Group
Hi,

This shouldn't be a matter of concern.  The `Number of mappings discarded because of alignment score` is in terms of the number of *alignments*.  This is simply a record of the number of potential mapping sites where there was a hit (enough exact matches to signal alignment), but the alignment didn't achieve the minimum required score.

The subsequent log line, `Number of fragments entirely discarded because of alignment score : 2,493,069`, shows the number of reads where *none* of the alignments were good enough to achieve the minimum score. So, your overall high mapping rate suggests that your reads map to your reference well (most reads can be well explained by the reference). The high number of discarded mappings is likely caused by a lot of shared sequence (e.g. from alternative splicing), where many transcripts share some short exact matches with a read, but only a few transcripts produce good, high-quality alignments for it.
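To make the distinction concrete, here is a rough sketch (not part of Salmon itself) that pulls the counts out of the log lines quoted above and puts them on a per-fragment basis. The `LOG` string, the `PATTERNS` table, and the `parse_counts` helper are hypothetical names used only for illustration, and the regular expressions assume the message wording shown in this thread. With these numbers, the discarded *mappings* work out to a bit more than one per fragment on average, while the fragments *entirely* discarded are only around 6% of the total, and counted reads divided by observed fragments matches the reported mapping rate.

```python
import re

# Log lines copied from the example in this thread (timestamps trimmed).
LOG = """\
Observed 41084625 total fragments (41084625 in most recent round)
[info] Counted 36,907,348 total reads in the equivalence classes
[info] Number of mappings discarded because of alignment score : 48,653,382
[info] Number of fragments entirely discarded because of alignment score : 2,493,069
"""

# Hypothetical patterns; they assume the exact message wording shown above.
PATTERNS = {
    "total_fragments": r"Observed ([\d,]+) total fragments",
    "counted_reads": r"Counted ([\d,]+) total reads in the equivalence classes",
    "mappings_discarded": (
        r"Number of mappings discarded because of alignment score\s*:\s*([\d,]+)"
    ),
    "fragments_discarded": (
        r"Number of fragments entirely discarded because of alignment score"
        r"\s*:\s*([\d,]+)"
    ),
}


def parse_counts(log_text):
    """Return a dict of integer counts for every pattern found in log_text."""
    counts = {}
    for name, pattern in PATTERNS.items():
        match = re.search(pattern, log_text)
        if match:
            counts[name] = int(match.group(1).replace(",", ""))
    return counts


counts = parse_counts(LOG)
total = counts["total_fragments"]

# Fraction of fragments that ended up counted in equivalence classes.
mapping_rate = 100 * counts["counted_reads"] / total
# Fraction of fragments for which no alignment reached the minimum score.
frag_discard_pct = 100 * counts["fragments_discarded"] / total
# Average number of rejected candidate alignments per fragment.
discarded_per_fragment = counts["mappings_discarded"] / total

print(f"mapping rate:                    {mapping_rate:.4f}%")
print(f"fragments entirely discarded:    {frag_discard_pct:.2f}%")
print(f"discarded mappings per fragment: {discarded_per_fragment:.2f}")
```

The key point is that the first count is in alignments and the second is in fragments, so only the second one can be compared directly against the total number of reads.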

Best,
Rob