translate trinity.fasta or use transdecoder.fasta.pep as protein database?


Cheryl Ames

Sep 17, 2015, 8:41:07 PM
to trinityrnaseq-users
Hello Trinityrnaseq-users:

I work on a non-model organism, and want to use my transcriptome (trinity.fasta) file to make a protein database for a downstream proteomics project on the same species.

I am exploring these two options:
Option 1: query my protein sequences against the trinity.transdecoder.fasta.pep
Option 2: translate the entire trinity.fasta file into amino acids using all 6 reading frames (3 forward; 3 reverse) in Geneious.

Which option would work best for generating a "protein database" for my non-model organism?

With option 2, I would need to relabel the forward and reverse translations so that they can be distinguished in a concatenated database.
With option 1, I am concerned that the transdecoder.pep file is much smaller than the translated and concatenated trinity.fasta database would be.

Does anyone have a better option?
Suggestions are appreciated.

Best,
Sheriru

Brian Haas

Sep 18, 2015, 8:59:56 AM
to Cheryl Ames, trinityrnaseq-users
Hi Sheriru,

For proteomics studies, folks usually do the 6-frame translation and search that. This way, you won't miss anything that was excluded by the TransDecoder search.

best,

~brian


Cheryl Ames

Sep 18, 2015, 9:12:08 AM
to trinityrnaseq-users
Dear Brian,
Thank you.
In order to search the six-frame translation effectively, can you recommend a good way of:
1. Changing the format of the reverse translation so that entries there can be distinguished from those from forward translation?
--> Just adding an F or R at the end of the contig?
2. Lining up the format so that the search engine can use one parsing rule for all entries?
--> Just concatenating all the files into one?
Any suggestions would be great.
Best,
Sheriru

Brian Haas

Sep 20, 2015, 9:20:30 AM
to Cheryl Ames, trinityrnaseq-users
Hi Cheryl,

Attached is a script that will do the 6-frame translation of a fasta file.  You can drop it into TRINITY/util/misc  and it should work.

It generates output like so, providing F${frame}_trans as a suffix to the accession:

>TR83|c0_g1_i1.F1_trans TR83|c0_g1_i1 len=140 path=[235:0-139] [-1, 235, -2]
RRHNQGTRLRGTDASRSGRTAAERKRRRRRCQRQPRAAAAPRGKGR
>TR83|c0_g1_i1.F2_trans TR83|c0_g1_i1 len=140 path=[235:0-139] [-1, 235, -2]
ADIIRAPG*EEPTRHAAAERLRRGSGGGGGASASPELPRRREARGG
>TR83|c0_g1_i1.F3_trans TR83|c0_g1_i1 len=140 path=[235:0-139] [-1, 235, -2]
PT*SGHQVKRNRRVTQRPNGCGEEAAAAAVPAPAQSCRGAERQGAE
>TR83|c0_g1_i1.F4_trans TR83|c0_g1_i1 len=140 path=[235:0-139] [-1, 235, -2]
LRPLPLGAAAALGWRWHRRRRRFLSAAVRPLRDASVPLNLVP*LCR
>TR83|c0_g1_i1.F5_trans TR83|c0_g1_i1 len=140 path=[235:0-139] [-1, 235, -2]
SAPCLSAPRQLWAGAGTAAAAASSPQPFGRCVTRRFLLTWCPDYVG
>TR83|c0_g1_i1.F6_trans TR83|c0_g1_i1 len=140 path=[235:0-139] [-1, 235, -2]
PPLASRRRGSSGLALAPPPPPLPLRSRSAAA*RVGSS*PGALIMSA


I hope this helps,


~brian
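
The attached Perl script is not reproduced in this thread. For readers who want to see the idea, here is a minimal Python sketch of a six-frame translation that names its records the same way as the output above (`<accession>.F<n>_trans` followed by the original header). It is not Brian's script. It assumes the standard genetic code (NCBI table 1), assumes that F1-F3 are the forward frames and F4-F6 the frames of the reverse complement, drops a trailing partial codon, and writes `X` for any codon containing a non-ACGT base. The function names are made up for illustration.

```python
from itertools import product

# Standard genetic code (NCBI table 1), codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate from position 0, ignoring a trailing partial codon.

    Stop codons become '*'; codons with ambiguous bases become 'X'.
    """
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3))


def six_frames(seq):
    """Return [(1, protein), ..., (6, protein)] for one nucleotide sequence.

    Assumption: frames 1-3 are forward, frames 4-6 are on the reverse complement.
    """
    seq = seq.upper()
    rev = reverse_complement(seq)
    frames = [seq, seq[1:], seq[2:], rev, rev[1:], rev[2:]]
    return [(n, translate(f)) for n, f in enumerate(frames, start=1)]


def read_fasta(lines):
    """Yield (header_without_'>', sequence) from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def six_frame_fasta(lines):
    """Yield one FASTA record (as a string) per frame per input sequence."""
    for header, seq in read_fasta(lines):
        accession = header.split()[0]
        for n, protein in six_frames(seq):
            yield f">{accession}.F{n}_trans {header}\n{protein}"
```

To use it, open the Trinity FASTA file, pass the file handle to `six_frame_fasta`, and write each yielded record to a single output file. Because every record carries the frame in a fixed suffix on the accession, forward and reverse translations stay distinguishable and one parsing rule covers the whole concatenated database.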




Attachment: sixFrameTranslation.pl