Inaccurate length recorded for best selected open reading frames


Molly Rivers

Oct 20, 2023, 5:45:18 AM
to TransDecoder-users

Hi Brian,

I am using TransDecoder on my de novo transcriptome, which was assembled from Illumina short reads and Iso-Seq long reads using rnaSPAdes.

Commands:
TransDecoder.LongOrfs -t assembled.Transcriptome.fasta -m 50
TransDecoder.Predict -t assembled.Transcriptome.fasta --single_best_only

When using the gff3 output file from TransDecoder, I am finding a problem with the annotated length of the selected open reading frame: the length annotated in the gff3 file does not match the actual sequence length. The sequence is labelled as being longer than the actual contig (which I have manually checked using the longest.orfs.cds file). It appears that the selected open reading frame is being annotated with the length of the longest open reading frame, even when that is not the one that was selected.
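A minimal sketch of how this could be checked systematically rather than by hand, written in Python. It assumes the gff3 file is standard 9-column, tab-delimited GFF3 and that the first column matches the first word of each FASTA header in the assembly. The helper names fasta_lengths and out_of_bounds_features are hypothetical and are not part of TransDecoder.

```python
import sys


def fasta_lengths(lines):
    """Return {sequence_id: length} from an iterable of FASTA lines."""
    lengths = {}
    current = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the ID.
            current = line[1:].split()[0]
            lengths[current] = 0
        elif current is not None:
            lengths[current] += len(line)
    return lengths


def out_of_bounds_features(gff_lines, lengths):
    """Return features whose end coordinate exceeds the length of their contig.

    Each result is (seqid, feature_type, start, end, contig_length).
    Features on contigs that are absent from `lengths` are skipped.
    """
    problems = []
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue  # not a well-formed GFF3 feature line
        seqid, ftype = cols[0], cols[2]
        start, end = int(cols[3]), int(cols[4])
        contig_len = lengths.get(seqid)
        if contig_len is not None and end > contig_len:
            problems.append((seqid, ftype, start, end, contig_len))
    return problems


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python check_gff3.py <transcripts.fasta> <transdecoder.gff3>
    with open(sys.argv[1]) as fasta, open(sys.argv[2]) as gff:
        contig_lengths = fasta_lengths(fasta)
        for row in out_of_bounds_features(gff, contig_lengths):
            print("\t".join(str(field) for field in row))
```

Run against the assembly FASTA and the gff3 output, this would list every feature whose end coordinate lies past the end of its contig, which should show how many records are affected.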

I am not sure whether this is related to the Iso-Seq long-read sequences in the transcriptome, but it is causing problems with another programme that I need to run on the TransDecoder gff3 output file. Any input on what might be causing this and how to fix it would be greatly appreciated.


Many thanks,

Molly Rivers
