Ziheng
Aug 3, 2014, 4:41:49 PM
to pamlso...@googlegroups.com
135 species or sequences are not too many. The error has to do with the fact that I wrote the program to accommodate at most 256 (I think) distinct codons, including TTT, TTC, ..., GGG, TYY, NGG, etc... If you used lots of ambiguity codes, you may have reached the limit.
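To check whether an alignment is near that limit, you can count the distinct codon triplets it contains, ambiguous ones included. Below is a minimal sketch, not part of the program itself: the function name `distinct_codons` and the toy sequences are made up for illustration, and the limit of 256 is the approximate figure quoted above, not a verified constant.

```python
def distinct_codons(sequences):
    """Return the set of distinct codon triplets found across all sequences."""
    codons = set()
    for seq in sequences:
        seq = seq.upper()
        usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
        # Walk the sequence in non-overlapping triplets.
        for i in range(0, usable, 3):
            codons.add(seq[i:i + 3])
    return codons


ASSUMED_LIMIT = 256  # "256 (I think)" in the post; treat as approximate

# Toy in-frame sequences (hypothetical data).
example = ["TTTTTCGGG", "TTYNGGGGG", "TTTTYYGGG"]
found = distinct_codons(example)
print(len(found), "distinct codons:", sorted(found))
if len(found) > ASSUMED_LIMIT:
    print("More distinct codons than the assumed limit")
```

Every new ambiguous triplet such as TTY or NGG counts as one more distinct codon, which is how heavy use of ambiguity codes can push the total past the limit.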
One thing you can do is to sample one sequence from each species at random, rather than using ambiguity codes to represent polymorphisms. You could then analyze your polymorphism data separately and try to integrate the results.
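The random sampling could be sketched as follows. This assumes you have the individual sequences for each species available in a dictionary; the function name `sample_one_per_species`, the `seed` parameter, and the toy data are all hypothetical.

```python
import random


def sample_one_per_species(seqs_by_species, seed=None):
    """Pick one sequence at random for each species.

    seqs_by_species maps a species name to a list of its observed sequences.
    A seed can be passed to make the draw reproducible.
    """
    rng = random.Random(seed)
    return {species: rng.choice(seqs) for species, seqs in seqs_by_species.items()}


# Toy data (hypothetical): two alleles for sp1, one for sp2.
data = {
    "sp1": ["ATGTTTGGG", "ATGTTCGGG"],
    "sp2": ["ATGTTTGGA"],
}
picked = sample_one_per_species(data, seed=1)
for species, seq in picked.items():
    print(species, seq)
```

Each sampled sequence contains only unambiguous codons (provided the individual sequences do), so the alignment no longer adds ambiguous triplets to the codon count.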
The program is not designed to deal with polymorphism data, so even if you code the polymorphic codons TTT and TTC as TTY, the program will misinterpret the data. It will treat TTY as meaning that the species has a single codon TT?, where ? is either T or C but definitely not both, and we don't know which. In fact you know that both T and C are observed.
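To make the distinction concrete, here is a small sketch that expands an ambiguous codon into the unambiguous codons compatible with it, using the standard IUPAC nucleotide codes. The function name `expand_codon` is made up for illustration; this is not how the program is implemented internally.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_codon(codon):
    """Return the sorted list of unambiguous codons compatible with `codon`."""
    choices = [IUPAC[base] for base in codon.upper()]
    return sorted("".join(bases) for bases in product(*choices))


print(expand_codon("TTY"))  # ['TTC', 'TTT']
```

Under the ambiguity interpretation, exactly one of the listed codons is the true state and the others are absent. With polymorphism data, all of them are present in the species, which is a different statement.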
Ziheng