Mapping rate for salmon using different k and coverage


Nick Bernstein

Apr 12, 2016, 4:27:59 PM
to Sailfish Users Group
Hi all,

I seem to be getting the same mapping rate regardless of the k and coverage inputs I use. I thought I would see a substantial difference in mapping rate depending on the inputs for the minLength of the MEM and the coverage required. Any ideas why I'm not seeing any difference in my mapping rates?


fctxFSPD/logs/salmon_quant.log:

[2016-04-12 15:33:13.344] [jointLog] [info] Mapping rate = 42.2917%


fctx.k10.c4/logs/salmon_quant.log:

[2016-04-12 16:18:19.449] [jointLog] [info] Mapping rate = 42.2917%


fctx.k25.c4/logs/salmon_quant.log:

[2016-04-12 14:29:44.643] [jointLog] [info] Mapping rate = 42.2917%


fctx.k25.c5/logs/salmon_quant.log:

[2016-04-12 14:12:16.526] [jointLog] [info] Mapping rate = 42.2917%


fctx.k30.c1/logs/salmon_quant.log:

[2016-04-12 16:13:41.557] [jointLog] [info] Mapping rate = 42.2917%


fctx.k30.c4/logs/salmon_quant.log:

[2016-04-12 14:46:53.282] [jointLog] [info] Mapping rate = 42.2917%


k30.c5/logs/salmon_quant.log:

[2016-04-12 14:53:22.427] [jointLog] [info] Mapping rate = 42.2917%


fctx/logs/salmon_quant.log:

[2016-04-12 13:17:00.045] [jointLog] [info] Mapping rate = 42.2917%
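
For comparing many runs like the ones above, the mapping rate can be pulled out of each log with a short script. This is only a minimal sketch: it assumes every run directory contains `logs/salmon_quant.log` with a "Mapping rate" line in the format shown above, and the function names are hypothetical helpers, not part of Salmon.

```python
import re
from pathlib import Path

# Matches log lines such as:
# [2016-04-12 15:33:13.344] [jointLog] [info] Mapping rate = 42.2917%
MAPPING_RATE_RE = re.compile(r"Mapping rate\s*=\s*([0-9]*\.?[0-9]+)%")


def mapping_rate_from_text(log_text):
    """Return the last mapping rate (percent, as a float) in log_text, or None."""
    matches = MAPPING_RATE_RE.findall(log_text)
    return float(matches[-1]) if matches else None


def collect_mapping_rates(root):
    """Map each run directory name under root to its mapping rate.

    Assumes the layout <root>/<run>/logs/salmon_quant.log, as in the
    listing above.
    """
    rates = {}
    for log_path in sorted(Path(root).glob("*/logs/salmon_quant.log")):
        run_name = log_path.parent.parent.name
        rates[run_name] = mapping_rate_from_text(log_path.read_text())
    return rates
```

Calling `collect_mapping_rates(".")` from the directory that holds the run folders would return a dictionary keyed by run name, which makes it easy to spot runs whose rate did not change.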


I confirmed that the argument inputs were different:

==> /data/bernsteinnj/projects/salmon/KEN1095fctx.salmon.k30.c5.log <==

Version Info: Could not resolve upgrade information in the alotted time.

Check for upgrades manually at https://combine-lab.github.io/salmon

# salmon (mapping-based) v0.6.0

# [ program ] => salmon 

# [ command ] => quant 

# [ index ] => { /data/bernsteinnj/projects/tools/SalmonBeta-0.6.5-pre_CentOS5/references/salmon.quasi }

# [ libType ] => { ISR }

# [ mates1 ] => { /data/bernsteinnj/projects/salmon/KEN1095fctx_R1.fastq }

# [ mates2 ] => { /data/bernsteinnj/projects/salmon/KEN1095fctx_R2.fastq }

# [ output ] => { /data/bernsteinnj/projects/salmon/KEN1095fctx.k30.c5 }

# [ useVBOpt ] => { }

# [ numBootstraps ] => { 30 }

# [ minLen ] => { 30 }

# [ coverage ] => { .5 }


==> /data/bernsteinnj/projects/salmon/KEN1095fctx.salmon.k30.c4.log <==

Version Info: Could not resolve upgrade information in the alotted time.

Check for upgrades manually at https://combine-lab.github.io/salmon

# salmon (mapping-based) v0.6.0

# [ program ] => salmon 

# [ command ] => quant 

# [ index ] => { /data/bernsteinnj/projects/tools/SalmonBeta-0.6.5-pre_CentOS5/references/salmon.quasi }

# [ libType ] => { ISR }

# [ mates1 ] => { /data/bernsteinnj/projects/salmon/KEN1095fctx_R1.fastq }

# [ mates2 ] => { /data/bernsteinnj/projects/salmon/KEN1095fctx_R2.fastq }

# [ output ] => { /data/bernsteinnj/projects/salmon/KEN1095fctx.k30.c4 }

# [ useVBOpt ] => { }

# [ numBootstraps ] => { 30 }

# [ minLen ] => { 25 }

# [ coverage ] => { .4 }


Not gonna post all of those here. 



Best,

Nick

Rob

Apr 12, 2016, 5:57:16 PM
to Sailfish Users Group
Hi Nick,

  What you're seeing is a result of the fact that you're using the quasi index (which is perfectly fine!).  Basically, the problem is that the software moves faster than the pre-prints and publications.  The parameters that you are changing (minLen and coverage) only matter when running Salmon with the FMD-index.  When running Salmon with the quasi index, you'd actually have to change the size of `k` during the index building step.  So, if you want to see how these things affect your mapping rate, there are two different approaches you can take (maybe testing both is a good idea):

1) Build the quasi-index with a few different values of k.  In general, a smaller k may allow more reads to map, but the potential for spurious alignments increases as well.  Out of curiosity, how long are your reads?  Is there anything (e.g. related to the protocol by which the data was generated) that would lead you to expect such a low mapping rate?

2) Build the fmd-index, and then test the effect of the parameters you're using above.  Here, the parameters should absolutely make a difference to the mapping rate.  However, the same caveat as above applies (a lower coverage requirement and lower minLen requirement will allow more reads to map, but may lead to more spurious mappings).
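
The two approaches above can be sketched as a small script that only assembles the command lines, so they can be reviewed before anything is run. This is a sketch under assumptions: the flag spellings (`--type quasi`, `--type fmd`, `-k`, `--minLen`, `--coverage`) are what I believe the Salmon 0.6-era command line accepted and should be checked against `salmon index --help` and `salmon quant --help` for the installed version. All file names, the k values, and the helper function names are hypothetical.

```python
import itertools
import shlex


def quasi_index_commands(transcripts_fasta, index_prefix, k_values):
    """Approach 1: one quasi index per value of k (k is fixed at index time)."""
    commands = []
    for k in k_values:
        commands.append([
            "salmon", "index",
            "-t", transcripts_fasta,
            "-i", f"{index_prefix}.k{k}",  # hypothetical naming scheme
            "--type", "quasi",             # assumed flag; verify with --help
            "-k", str(k),
        ])
    return commands


def fmd_quant_commands(index_dir, mates1, mates2, out_prefix,
                       lib_type, min_lens, coverages):
    """Approach 2: one quant run per (minLen, coverage) pair on an FMD index."""
    commands = []
    for min_len, coverage in itertools.product(min_lens, coverages):
        commands.append([
            "salmon", "quant",
            "-i", index_dir,
            "-l", lib_type,
            "-1", mates1,
            "-2", mates2,
            "-o", f"{out_prefix}.k{min_len}.c{coverage}",
            "--minLen", str(min_len),      # assumed flag; verify with --help
            "--coverage", str(coverage),   # assumed flag; verify with --help
        ])
    return commands


def format_commands(commands):
    """Render each argument list as a shell-quoted line for inspection."""
    return [shlex.join(cmd) for cmd in commands]


if __name__ == "__main__":
    # Illustrative values only; nothing is executed, the commands are printed.
    index_cmds = quasi_index_commands("transcripts.fa", "salmon.quasi", [19, 25, 31])
    quant_cmds = fmd_quant_commands("salmon.fmd", "reads_R1.fastq", "reads_R2.fastq",
                                    "sample", "ISR", [25, 30], [0.4, 0.5])
    print("\n".join(format_commands(index_cmds + quant_cmds)))
```

Printing rather than executing keeps the sketch safe to run anywhere; the resulting lines can be pasted into a shell or a job script once the flags have been confirmed.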

I'll add warnings for using these flags with the `quasi` index to my todo list for the upcoming Salmon release.

Best,
Rob

Nick Bernstein

Apr 13, 2016, 10:08:46 AM
to Sailfish Users Group
Ah. I see. 

My reads are 100bp (from Illumina TruSeq Stranded Total RNA). I built the index from the human transcriptome, UCSC hg19. There's nothing I'm aware of from QC or the protocol that would lead me to expect such a low mapping rate.

I'll see what happens with the different mapping and coverage arguments, along with different quasi indexes.

Best,
Nick

Rob Patro

Apr 13, 2016, 10:18:15 AM
to Nick Bernstein, Sailfish Users Group
Sure!  Things to be aware of that might affect the mapping rate are the protocol (e.g. rRNA depletion sometimes tends to yield low mapping rates), and perhaps the presence of unspliced transcripts (which could be included in the index if you have reason to believe they may be in the sample at nontrivial abundance).

Best,
Rob
