Hi,
There are several reasons why multiple Trinity 'genes' might share
the same top BLAST match. The first is biological: they may
represent paralogs. Other reasons are more technical. For example,
the Trinity 'genes' may be partial and represent different
non-overlapping parts of the same gene, which ended up fragmented in
the assembly due to insufficient read coverage or algorithmic
complications. Looking at which regions of the best-matching target
each one aligns to could give some clues here.
If it turns out that they're paralogs, you might want to keep them
separate instead of collapsing. If they're 'parts' of the same gene,
then collapsing could be better justified.
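
As a rough way to do that check, here is a minimal Python sketch. It assumes
the BLAST results are in the default 12-column tabular format (-outfmt 6),
and that the query IDs follow the usual Trinity naming, where stripping the
trailing _i<N> from a transcript ID gives the 'gene' ID. The function names
are made up for illustration and are not part of Trinity. For each pair of
Trinity 'genes' hitting the same target, it reports how many target
positions are covered by both (amino acids if the search was blastx).

```python
import re
from collections import defaultdict
from itertools import combinations


def gene_id(transcript_id):
    # Assumption: Trinity-style IDs such as TRINITY_DN10_c0_g1_i2;
    # dropping the isoform suffix leaves the 'gene' ID.
    return re.sub(r"_i\d+$", "", transcript_id)


def merge(intervals):
    # Merge overlapping or adjacent (start, end) intervals, 1-based inclusive.
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + 1:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(iv) for iv in merged]


def covered_regions(blast_lines):
    # Map (target, gene) -> merged target intervals covered by that gene's hits.
    hits = defaultdict(list)
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        f = line.split("\t")
        # outfmt 6 columns: qseqid sseqid pident length mismatch gapopen
        #                   qstart qend sstart send evalue bitscore
        sstart, send = int(f[8]), int(f[9])
        hits[(f[1], gene_id(f[0]))].append((min(sstart, send), max(sstart, send)))
    return {key: merge(ivs) for key, ivs in hits.items()}


def overlap_length(a, b):
    # Number of target positions covered by both interval lists.
    return sum(
        max(0, min(e1, e2) - max(s1, s2) + 1)
        for s1, e1 in a
        for s2, e2 in b
    )


def compare_genes(blast_lines):
    # Return sorted (target, gene_a, gene_b, shared_positions) tuples.
    by_target = defaultdict(dict)
    for (target, gene), ivs in covered_regions(blast_lines).items():
        by_target[target][gene] = ivs
    report = []
    for target, genes in by_target.items():
        for g1, g2 in combinations(sorted(genes), 2):
            report.append((target, g1, g2, overlap_length(genes[g1], genes[g2])))
    return sorted(report)
```

Little or no shared coverage is consistent with fragments of one gene; a
large shared region is more consistent with paralogs (or alleles), though
it is only a hint and worth confirming by looking at the alignments.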
Hope this helps,
~b
--
Brian J. Haas
The Broad Institute
http://broadinstitute.org/~bhaas