Dear Trinity/Trinotate creators,
Thank you very much for such great tools!
I similarly have genes annotated with the same blastx gene identity, which complicates interpretation of my downstream DESeq2 analysis: the same gene name either shows up multiple times or appears on opposite sides of a comparison.
I have two main questions, representing two ways of addressing this.
1) If I continue with this analysis as is, I need to explain why those genes/transcripts were not clustered together. Was there simply not enough evidence during assembly to combine them? (I used de novo Trinity assembly on pooled samples, since I am working with a non-model organism.) I am leaning towards this explanation since I trust the tools, but I am unsure how best to explain it honestly to others (i.e., reviewers).
2) If I decide to move forward with combining genes by annotation, is there a script in Trinity/Trinotate, or another tool, that could help with that? In a previous question, Dr. Haas put forth PASA as a potential solution to a similar issue (https://groups.google.com/g/trinityrnaseq-users/c/IXFYk9qc5aw/m/WThVjncNAAAJ), but I am unsure whether that is the best route to explore. Per the previous questions, I understand that paralogs complicate matters.
Please let me know if you need any other information.
Thank you very much for your time and help.
Sincerely,
David