Using PacBio data with Trinity to improve de novo transcriptome assembly

Atul Kakrana

Oct 1, 2015, 11:38:42 AM
to trinityrnaseq-users
Hi List,

I just finished a de novo transcriptome assembly of Illumina strand-specific PE and SE reads using Trinity. First of all, Trinity (v2.0.6) works great now, with lots of improvements since last year. Very nice tool indeed.

Now, back to the topic: here is a summary of my assembly, which shows low N50 values:

Total trinity 'genes': 748144
Total trinity transcripts: 916206
Percent GC: 43.83

Contig N10: 1765
Contig N20: 1064
Contig N30: 677
Contig N40: 477
Contig N50: 369

Median contig length: 253
Average contig: 364.39
Total assembled bases: 333854406
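
For readers unfamiliar with the Nx statistics in the summary above: Nx is the contig length L such that contigs of length L or longer account for at least x% of all assembled bases. The following is a minimal Python sketch of that definition, not the script that produced the summary. The function name `nx` and the toy lengths are illustrative assumptions; only the totals in the last lines come from the summary.

```python
def nx(lengths, x):
    """Return the Nx contig length for a list of contig lengths.

    Nx is the length L such that contigs of length >= L together
    contain at least x percent of the total assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    # Walk contigs from longest to shortest, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 100 >= total * x:
            return length


# Toy data (hypothetical, not from the assembly above).
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(nx(toy_lengths, 50))  # 8: 10 + 9 + 8 = 27 is half of the 54 total bases

# Sanity check on the reported summary: average length = bases / transcripts.
mean_length = round(333854406 / 916206, 2)
print(mean_length)  # 364.39, matching "Average contig"
```

An N50 (369) close to the median length (253) indicates that most of the assembled bases sit in short contigs.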

In addition to the Illumina SE and PE data, I also have full-length PacBio transcript data. I was wondering whether I can use it, perhaps after proofreading/correction, to improve my Trinity assembly. Is some sort of hybrid assembly possible with Trinity? Or is there any other way Trinity can benefit from full-length (PacBio) transcript data?

Thanks for helping me with Trinity and de novo assembly.

Atul

Brian Haas

Oct 1, 2015, 11:43:18 AM
to Atul Kakrana, trinityrnaseq-users
Hi Atul,

You can incorporate corrected PacBio reads into Trinity using the --long_reads parameter.

You likely won't see much difference in the bulk statistics, but it should help for certain complex genes that benefit from the long-range connectivity provided by the PacBio reads.
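
As a rough illustration of the suggestion above, here is a minimal Python sketch that assembles a Trinity command line with --long_reads alongside paired-end Illumina reads. All file names, the memory and CPU values, and the `RF` strand orientation are assumptions for illustration; the right settings depend on your library prep and machine, so check `Trinity --help` for your installed version.

```python
import shlex


def trinity_long_reads_cmd(left_fq, right_fq, long_reads_fa,
                           out_dir="trinity_out_dir", cpu=8,
                           max_memory="50G", ss_lib_type=None):
    """Build (but do not run) a Trinity argument list with --long_reads."""
    cmd = [
        "Trinity",
        "--seqType", "fq",              # Illumina reads in FASTQ format
        "--left", left_fq,
        "--right", right_fq,
        "--long_reads", long_reads_fa,  # FASTA of corrected/CCS PacBio reads
        "--max_memory", max_memory,
        "--CPU", str(cpu),
        "--output", out_dir,
    ]
    if ss_lib_type:
        # Only for strand-specific libraries; the orientation is an assumption.
        cmd += ["--SS_lib_type", ss_lib_type]
    return cmd


# Hypothetical file names, for illustration only.
cmd = trinity_long_reads_cmd("reads_1.fq", "reads_2.fq",
                             "pacbio_corrected.fasta", ss_lib_type="RF")
print(shlex.join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` once the paths point at real files; printing it first makes it easy to review the exact command before starting a long assembly.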

best,

~brian


--
You received this message because you are subscribed to the Google Groups "trinityrnaseq-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to trinityrnaseq-u...@googlegroups.com.
To post to this group, send email to trinityrn...@googlegroups.com.
Visit this group at http://groups.google.com/group/trinityrnaseq-users.
For more options, visit https://groups.google.com/d/optout.



--
--
Brian J. Haas
The Broad Institute
http://broadinstitute.org/~bhaas

 

Atul Kakrana

Oct 1, 2015, 3:20:43 PM
to trinityrnaseq-users, atulk...@gmail.com, trinityrn...@googlegroups.com
Hi Brian,

Thanks for your reply. Is there any documentation on how long (PacBio) reads will be used by Trinity? The Trinity help is very brief:

--long_reads <string>           :fasta file containing error-corrected or circular consensus (CCS) pac bio reads


Is Trinity 1) going to assemble the Illumina reads first and then try to correct the isoforms/gene models using the longer PacBio reads, or 2) going to assemble reads using the PacBio reads as a reference first and then separately assemble those that can't be mapped to PacBio reads, i.e. a sort of hybrid assembly?

Atul

Brian Haas

Oct 1, 2015, 3:55:19 PM
to Atul Kakrana, trinityrnaseq-users
To be completely honest, it's still mostly an R&D project. I've demonstrated that it 'works' based on the Trinity v2 redesign, but there's still a lot of work to do to demonstrate just how much it helps. We should aim to make that clear in the documentation.

In short, it uses the long reads just like the short reads in the new Trinity v2 framework, and documentation is forthcoming, ideally with a corresponding publication with the finer details.

best,

~b
