Paralogs, Trinity components and abundance estimation

mafalda_sferreira

May 20, 2014, 7:39:44 PM
to rsem-...@googlegroups.com
Hello, 

I've done a de novo assembly of RNA-seq data using Trinity, and now I'm following the Trinity paper and using RSEM to estimate abundances. I understand that RSEM deals with reads mapping to multiple genes and isoforms, but I have a question.

How does RSEM deal with reads that map to multiple regions in my transcriptome? If two genes (Trinity components) are very similar because they are paralogs or duplicates, how will RSEM handle the reads that map to both? Shouldn't those reads be excluded? Or are they treated in the same way as reads mapping to different isoforms of the same gene?

Thanks,
Mafalda

Colin Dewey

May 20, 2014, 9:35:44 PM
to rsem-...@googlegroups.com
Hi Mafalda,

It is the latter: reads that map to multiple genes (components) are treated the same as reads that map to multiple isoforms of the same gene. Under the hood, RSEM doesn't use the concept of a gene at all; it explicitly estimates the abundances of all transcripts, using all reads. Gene-level abundances are computed only as a post-processing step, by simply summing the abundances of each gene's isoforms.
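
To make the idea concrete, here is a toy sketch in Python. It is not RSEM's actual code or model: it assumes every alignment is equally likely and ignores transcript length, read quality and fragment-size information, all of which a real estimator would take into account. The transcript names, the read counts and the `em_abundances` function are made up for illustration. The point is only that the gene label never enters the estimation loop, so a read shared between two paralogs is split in exactly the same way as a read shared between two isoforms.

```python
from collections import defaultdict

def em_abundances(read_hits, transcripts, n_iter=200):
    """Toy EM: split each read across its candidate transcripts in
    proportion to the current abundance estimates (theta)."""
    theta = {t: 1.0 / len(transcripts) for t in transcripts}
    for _ in range(n_iter):
        counts = defaultdict(float)
        # E-step: fractional assignment of every read, unique or multi-mapped.
        for hits in read_hits:
            total = sum(theta[t] for t in hits)
            for t in hits:
                counts[t] += theta[t] / total
        # M-step: renormalise the expected counts into abundances.
        n = sum(counts.values())
        theta = {t: counts[t] / n for t in transcripts}
    return theta

# Hypothetical Trinity-style names: two isoforms of compA, plus a paralog compB.
transcript_to_gene = {
    "compA_seq1": "compA",
    "compA_seq2": "compA",
    "compB_seq1": "compB",
}

# Each read is represented by the set of transcripts it aligns to (invented data).
read_hits = (
    [("compA_seq1",)] * 30                  # unique to compA_seq1
    + [("compA_seq2",)] * 5                 # unique to compA_seq2
    + [("compB_seq1",)] * 10                # unique to the paralog
    + [("compA_seq1", "compB_seq1")] * 20   # shared between paralogs (two genes)
    + [("compA_seq1", "compA_seq2")] * 10   # shared between isoforms (one gene)
)

# Estimation works on transcripts only; genes are never consulted here.
theta = em_abundances(read_hits, list(transcript_to_gene))

# Post-processing: a gene's abundance is the sum over its isoforms.
gene_abundance = defaultdict(float)
for transcript, gene in transcript_to_gene.items():
    gene_abundance[gene] += theta[transcript]

print(theta)
print(dict(gene_abundance))
```

In this sketch the 20 paralog-shared reads are not discarded. They are divided between compA_seq1 and compB_seq1 according to how much support each transcript has from its other reads, which is why excluding them is unnecessary.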

Best,
Colin

--
RSEM website: http://deweylab.biostat.wisc.edu/rsem/