Hi,
I have a dataset of protein-coding sequences. There are three introns; two of them lie between codons (phase 0), but one lies between codon positions 1 and 2 (phase 1). I am also partitioning the data (codon positions 1 and 2 together, and position 3 as a separate partition).
My question: do I have to take the “phase 1” intron into account when assigning partitions, as I have done here? The “phase 1” intron is at 538-593. It looks logical, but in the end I became unsure how the program reads/executes the partition file.
Partition file:
DNA, EF1Codon1andCodon2 = 103-144\3, 104-144\3, 399-537\3, 400-537\3, 596-1202\3, 594-1202\3
DNA, EF1Codon3 = 105-144\3, 401-537\3, 595-1202\3
DNA, EF1intron = 1-102, 145-398, 538-593
Timo
But, well, there may be exceptions to the rule. How did you define the starts and ends of the exons? Have you used an amino acid sequence of the protein as a template to identify the 1st, 2nd and 3rd codon positions and the positions of the introns?
If not, you may just have misplaced the codon by one or two nucleotides. I notice that your third exon (594-1202), the one that according to your question should start with a 2nd codon position, has a length of 609 nucleotides. An exon that starts at a 2nd codon position and ends with a complete codon has a length of 3n + 2 (here 608 or 611), so it either has one nucleotide too many or is two short.
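The length check above can be reproduced with a few lines of Python. This is only a minimal sketch: the exon coordinates are copied from the partition file in the question, the function name `codon_positions` is made up for illustration, and it assumes that the first exon starts at a 1st codon position (as the partition definitions imply).

```python
# Exon coordinates (1-based, inclusive), copied from the partition file.
exons = [(103, 144), (399, 537), (594, 1202)]

def codon_positions(exons, start_pos=1):
    """Map every exon site to its codon position (1, 2 or 3).

    Positions are counted continuously across exons, i.e. introns are
    skipped. start_pos is an ASSUMPTION: the codon position of the very
    first exon site.
    """
    positions = {}
    pos = start_pos
    for start, end in exons:
        for site in range(start, end + 1):
            positions[site] = pos
            pos = pos % 3 + 1  # cycle 1 -> 2 -> 3 -> 1
    return positions

positions = codon_positions(exons)
for start, end in exons:
    length = end - start + 1
    # columns: start, end, length, codon pos. of first site, codon pos. of last site
    print(start, end, length, positions[start], positions[end])
```

Under the stated assumption, site 594 comes out as a 2nd codon position and the last site (1202) falls on a 1st codon position.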
Analysis-wise it doesn't matter, as long as all first/second and third codon positions are correctly defined, which is the case in your original definition: after the last intron, the exon starts with the 2nd codon position (594), then comes the 3rd (595), and then the 1st (596). So, technically, everything is in order for the analysis.
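If you want to convince yourself of how the ranges are expanded, you can do it outside RAxML. The following Python sketch is not part of RAxML; it simply assumes that `a-b\3` means "every third site from a to b" and that each partition is written on one line. The helper names `expand` and `parse_partitions` are hypothetical.

```python
import re

# Partition definitions from the question, one partition per line.
partition_text = r"""
DNA, EF1Codon1andCodon2 = 103-144\3, 104-144\3, 399-537\3, 400-537\3, 596-1202\3, 594-1202\3
DNA, EF1Codon3 = 105-144\3, 401-537\3, 595-1202\3
DNA, EF1intron = 1-102, 145-398, 538-593
"""

def expand(range_spec):
    """Expand 'a-b' or 'a-b\\step' into a list of 1-based site numbers."""
    m = re.fullmatch(r"(\d+)-(\d+)(?:\\(\d+))?", range_spec.strip())
    start, end = int(m.group(1)), int(m.group(2))
    step = int(m.group(3) or 1)
    return list(range(start, end + 1, step))

def parse_partitions(text):
    """Return {partition name: list of sites} for 'TYPE, name = ranges' lines."""
    partitions = {}
    for line in text.strip().splitlines():
        header, ranges = line.split("=")
        name = header.split(",")[1].strip()
        sites = []
        for spec in ranges.split(","):
            sites.extend(expand(spec))
        partitions[name] = sites
    return partitions

parts = parse_partitions(partition_text)
all_sites = [s for sites in parts.values() for s in sites]

# Every alignment column 1..1202 should be assigned exactly once.
print("no overlaps:", len(all_sites) == len(set(all_sites)))
print("full coverage:", set(all_sites) == set(range(1, 1203)))
for name, sites in parts.items():
    print(name, len(sites))
```

With these definitions, sites 594 and 596 end up in the 1st/2nd-position partition and 595 in the 3rd-position partition, as described above.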
Still, I would check whether you recognised the introns correctly. In case you don't have a reference protein sequence, you can just check the number of variable sites for your 1st/2nd and 3rd codon partitions. The 3rd-position partition should generally have more variable sites than the 1st/2nd-position partition. Also, the optimised model for the 3rd codon position usually approaches an HKY-like or similar situation (transversions with much lower probability than transitions) more decisively than the one for the 1st and 2nd codon positions. RAxML reports the corresponding per-partition information in the RAxML_info file.
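Counting variable sites per partition is also easy to do by hand. Below is a rough Python sketch; the function name `count_variable_sites` is hypothetical, the toy alignment is made up purely for illustration (it is not EF1 data), and gaps and ambiguity codes are simply ignored, which is an assumption you may want to change.

```python
def count_variable_sites(sequences, sites):
    """Count columns (1-based site numbers) showing more than one distinct
    unambiguous nucleotide. Gaps and ambiguity codes are ignored."""
    variable = 0
    for site in sites:
        states = {seq[site - 1].upper() for seq in sequences} & set("ACGT")
        if len(states) > 1:
            variable += 1
    return variable

# Toy alignment: three made-up sequences of three codons each.
toy_alignment = [
    "ATGGCAAAA",
    "ATGGCTAAG",
    "ATGGCCAAA",
]
# Site lists per partition (for real data, use the expanded partition ranges).
toy_partitions = {
    "codon1and2": [1, 2, 4, 5, 7, 8],
    "codon3": [3, 6, 9],
}

for name, sites in toy_partitions.items():
    print(name, count_variable_sites(toy_alignment, sites))
```

In this toy case only 3rd-position columns vary. If the 1st/2nd-position partition of a real dataset turned out to be clearly more variable than the 3rd, that would hint at a misplaced reading frame.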