--
You received this message because you are subscribed to the Google Groups "trinityrnaseq-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to trinityrnaseq-users+unsub...@googlegroups.com.
To post to this group, send email to trinityrnaseq-users@googlegroups.com.
Visit this group at https://groups.google.com/group/trinityrnaseq-users.
For more options, visit https://groups.google.com/d/optout.
We also once accidentally specified our non-strand-specific library as being strand-specific (--SS_lib_type RF). Remarkably, this assembly produced fewer truncated contigs (so longer 'genuine' contigs, confirmed by RACE-PCR) than a default Trinity run. More surprisingly, the assembly built with the RF setting scored better in Transrate (a higher percentage of good read mappings and higher overall Transrate scores) and in BUSCO (more complete BUSCOs). It also had more duplicated BUSCOs than our default Trinity run, which is a consequence of the redundancy you are talking about.
Does anyone have an idea why the assembler performs better in several ways (but not all, see the higher BUSCO duplication) when strand-specificity (RF) is specified, even though the library is not strand-specific? We are sure about this: the company that did the sequencing confirmed that they do not use a strand-specific library preparation kit.
We tried the RF setting with some additional libraries, and we consistently got far fewer truncated contigs. So when strand-specificity is specified, Trinity will construct both sense and antisense transcripts and not merge them. If you then merge the sense and antisense transcripts with a tool like dedupe (BBTools, since I read that CD-HIT does not merge reverse complements), will the assembly harbor additional artifacts because of specifying strand-specificity, other than the (theoretical) 50% reduction in sequencing depth?
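To make the merging step concrete, here is a minimal Python sketch of collapsing contigs that are exact reverse complements of each other. It is only an illustration under stated assumptions: the function names (reverse_complement, collapse_strands) and the toy sequences are made up for this example, and it is not how dedupe or any other tool is implemented. In particular, it only catches exact full-length matches, not contained or near-identical contigs.

```python
# Illustrative sketch: collapse contigs that are exact reverse complements.
# All names and sequences here are hypothetical, not taken from any tool.

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def canonical(seq: str) -> str:
    """Strand-independent key: the smaller of the sequence and its reverse complement."""
    seq = seq.upper()
    return min(seq, reverse_complement(seq))


def collapse_strands(contigs: dict[str, str]) -> dict[str, str]:
    """Keep the first contig seen for each strand-independent sequence."""
    seen = set()
    kept = {}
    for name, seq in contigs.items():
        key = canonical(seq)
        if key not in seen:
            seen.add(key)
            kept[name] = seq
    return kept


if __name__ == "__main__":
    toy = {
        "contig_sense": "ATGCCC",
        "contig_antisense": "GGGCAT",  # reverse complement of contig_sense
        "contig_other": "ATGAAA",
    }
    print(sorted(collapse_strands(toy)))
```

On the toy input, the antisense copy is dropped and the other two contigs are kept. Real assemblies would also need handling of partial overlaps and sequencing differences, which this sketch deliberately ignores.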
We are planning to make some real strand-specific libraries in the future, so we can actually benefit from the RF setting. But in the meantime, I am wondering whether anyone can get their head around these peculiar results.
Kind regards,
Margo