Hi,
I have a question about using Salmon in alignment-based mode to quantify transcripts that share exonic regions. I used STAR with the option --quantMode TranscriptomeSAM to map reads to the genome and then “project” the alignments to transcriptome coordinates. When a genomic alignment is compatible with several transcripts, it is projected to all of them, so one genomic alignment becomes as many transcriptomic alignments as needed. This raises the following questions:
1) How does the Salmon alignment-based mode treat these alignments for transcript quantification? Is there a reason why all alignments for the same read should appear consecutively in the input alignment file?
2) How does this affect the summarization of transcript-level estimates (TPM and estimated counts) to gene level when using tximport with the txOut=FALSE option?
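Regarding the ordering part of question 1, this is roughly how I check that each read's alignments stay together in my projected file. It is only a minimal sketch: `is_name_grouped` is a hypothetical helper of my own, not part of Salmon or STAR, and it assumes the read names have already been extracted in file order (for example from the first column of `samtools view` output).

```python
from typing import Iterable


def is_name_grouped(read_names: Iterable[str]) -> bool:
    """Return True if every read name forms a single contiguous run."""
    seen = set()      # names whose run has already started
    previous = None   # name of the record just before the current one
    for name in read_names:
        if name != previous:
            # A new run starts here; if we saw this name before,
            # its alignments are split across the file.
            if name in seen:
                return False
            seen.add(name)
            previous = name
    return True


# Hypothetical read names, in the order they appear in the file.
print(is_name_grouped(["r1", "r1", "r2", "r3", "r3"]))
print(is_name_grouped(["r1", "r2", "r1"]))
```

The first call reports a grouped file, while the second reports that the alignments for r1 are interrupted by another read.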
In short, does the combined use of STAR and Salmon (in alignment-based mode) inflate the estimated expression of genes whose transcripts share exons or genomic coordinates, because each read alignment is projected to several transcripts, or is there a way to control for this?
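To make the concern concrete, here is a toy sketch of the two behaviours I can imagine. It is not Salmon's algorithm: the read and transcript names are made up, `gene_totals` is a hypothetical helper, and the uniform split is only a stand-in for whatever per-read weighting a quantifier actually applies.

```python
from collections import defaultdict

# Hypothetical toy data: each read and the transcripts it was projected to.
# Transcript IDs are written as "<gene>.<transcript>" for easy grouping.
projected = {
    "read1": ["geneA.t1", "geneA.t2", "geneA.t3"],
    "read2": ["geneA.t1", "geneA.t2", "geneA.t3"],
    "read3": ["geneB.t1"],
    "read4": ["geneB.t1"],
}


def gene_totals(projected, split_reads):
    """Sum per-transcript read weights up to gene level.

    split_reads=False: every projected alignment counts as one full read.
    split_reads=True: each read contributes a total weight of 1,
    divided uniformly among its transcripts (an assumption for illustration).
    """
    totals = defaultdict(float)
    for transcripts in projected.values():
        weight = 1.0 / len(transcripts) if split_reads else 1.0
        for transcript in transcripts:
            gene = transcript.split(".")[0]
            totals[gene] += weight
    return dict(totals)


naive = gene_totals(projected, split_reads=False)
fractional = gene_totals(projected, split_reads=True)
print("naive:", naive)
print("fractional:", fractional)
```

Both genes are supported by two reads. With naive counting geneA ends up with 6 and geneB with 2, whereas splitting each read gives both genes a total of 2 (up to floating-point rounding). My question is essentially which of these two pictures is closer to what happens with the STAR output.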