Building a phylogeny of a protein superfamily: I have 15,000+ RefSeq sequences, many of them very similar to each other. Suggestions for reducing the number?
ramiro barrantes
Oct 7, 2015, 9:45:07 PM
to POY - Phylogenetic Analysis Software
I am revisiting a phylogeny I did years ago, but RefSeq in GenBank now has 15,000+ relevant protein sequences! I want to filter those down to a more manageable set. I am using T-Coffee (as I used to), but it is taking a long time, and I am wondering what people do these days. Are there other tools people use to automatically remove sequences that are very similar to each other? Since I am interested in the deep branches, I don't need all the sequences, just the few hundred most divergent ones (many of these are different strains of E. coli from the same subfamily, for example, where I could just use one). Any suggestions?
Thank you very much for any help,
Ramiro
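To make the kind of filtering asked about above concrete, here is a minimal sketch of greedy redundancy removal: process sequences from longest to shortest and keep one only if it is not too similar to any representative already kept. This is only an illustration of the general idea (dedicated clustering tools such as CD-HIT work along broadly similar greedy lines but are far faster and use real identity measures). The function names, the k-mer Jaccard similarity, and the default threshold are all assumptions made for this example, not anything from the thread.

```python
def kmers(seq, k=3):
    """Return the set of overlapping k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two k-mer sets (a crude stand-in for identity)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def greedy_representatives(seqs, threshold=0.9, k=3):
    """Pick representative sequence names from a dict of name -> sequence.

    Sequences are visited longest first (ties broken by name). A sequence
    is kept only if its similarity to every kept representative is below
    the (hypothetical) threshold.
    """
    reps = []  # list of (name, k-mer set) for kept representatives
    ordered = sorted(seqs.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    for name, seq in ordered:
        ks = kmers(seq, k)
        if all(jaccard(ks, rep_ks) < threshold for _, rep_ks in reps):
            reps.append((name, ks))
    return [name for name, _ in reps]
```

Lowering the threshold keeps fewer, more divergent representatives, which is the direction wanted when only the deep branches matter. The all-against-representatives comparison is quadratic in the worst case, so for 15,000+ sequences a purpose-built tool is likely the better choice.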
alizoh...@gmail.com
Jan 22, 2016, 11:46:28 AM
to POY - Phylogenetic Analysis Software
Hi,
You can upload your sequences to online servers for analysis. Just download your sequences in BioEdit and upload to