Combine similar transcripts after TransAbySS-merge

Giuseppe Puglia

Apr 18, 2016, 1:04:48 PM
to Trans-ABySS
Hi,

I'm carrying out an RNA-seq project on a plant species, with 8 different conditions and 4 replicates each (36 samples in total). I have already assembled the transcriptome using TopHat with the annotation file, but the reference genome and the annotation file are still not very reliable (version 1.0 is not even available on Ensembl), so about 30% of the reads remained unmapped. I want to use this 30% of reads only for gene annotation with blastx.
My procedure was to run Velvet-Oases with different k values from 21 to 31 to make a de novo assembly. Then I merged all the transcripts.fa files with Trans-ABySS (version 1.5.1), using the options "--mink 21 --maxk 31 --SS --out path/to/the/final_merged.fa".
My problem is that, even though Trans-ABySS combined a lot of transcripts, my "final_merged.fa" file still has too many sequences (file size 900 MB), including a lot of very similar transcripts with different lengths and confidence values. My aim is to combine all the very similar transcripts without using the annotation file and obtain a "final_merged.fa" file that is easier to handle, but as far as I know Trans-ABySS analyze needs an annotation file.

Is there therefore a way to simplify the "final_merged.fa" file by combining the transcripts, selecting them by their confidence value or length?
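
To make the question concrete, here is a minimal Python sketch of one possible length-based simplification: keep the longest transcripts and drop any sequence that is an exact substring of a longer one that was kept. This is only an illustration of what "combining by length" could mean, not a Trans-ABySS feature. The function names `read_fasta` and `drop_contained` and the `min_len` parameter are hypothetical, only the forward strand is compared (the merge above was strand-specific), and near-identical transcripts that differ by even one base are not collapsed.

```python
from typing import Dict, Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


def drop_contained(records: Dict[str, str], min_len: int = 0) -> Dict[str, str]:
    """Keep transcripts that are not exact substrings of a longer kept one.

    Assumption: forward strand only; sequences shorter than min_len are dropped.
    """
    kept: Dict[str, str] = {}
    # Visit longest sequences first so a container is always seen
    # before anything it contains; ties are broken by name.
    ordered = sorted(records.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    for name, seq in ordered:
        if len(seq) < min_len:
            continue
        if any(seq in longer for longer in kept.values()):
            continue
        kept[name] = seq
    return kept


# Toy input: t2 is fully contained in t1, t3 is unrelated.
example = [">t1", "ACGTACGTAA", ">t2", "CGTACG", ">t3", "TTTTGGGG"]
print(sorted(drop_contained(dict(read_fasta(example)))))
```

The pairwise substring check is quadratic in the number of transcripts, so it would be far too slow for a 900 MB file as written; it only shows the selection rule being asked about.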

Thank you in advance. 