Janani Hariharan

Jan 10, 2024, 4:29:10 PM
to PAML discussion group
Hi all, 

I am using pal2nal to create codon-aware alignments for several gene loci before calculating selection pressure using codeml. I'm hoping that others here have used pal2nal in the same way and can help me figure out next steps. 

I run pal2nal with the -nogap option, which removes all characters from the output file. When I remove this flag, I see alignments with gaps in the output file, but I can't use them for the codeml analysis. I assume this is because every column in my alignment has at least one gap? This only occurs for 2 genes out of 10 in the same operon.

Has anyone run into this issue before? How have you dealt with it? My goal is to calculate dN/dS ratios using codeml and infer what selective pressures different genes might be under. I've attached one of the nucleotide and protein files here. 



Sishuo Wang

Jan 11, 2024, 11:42:35 PM
to PAML discussion group
Dear Janani,

Thanks for the question! I don't think your AA sequences are aligned, but you're right that even after alignment, nothing is in the output if -nogap is specified. The reason lies in a seemingly problematic sequence (see screenshot, many_gap_seq.png), so I'd suggest double-checking that sequence or deleting it.
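
One quick way to spot a gap-heavy sequence like this without a screenshot is to rank the aligned sequences by their fraction of gap characters. This is only a minimal sketch: the helper names `read_fasta` and `gap_fractions` and the three toy sequences are made up for illustration, and it assumes an aligned FASTA file with "-" as the gap character.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a dict of name -> sequence."""
    seqs = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the sequence name.
            name = line[1:].split()[0]
            seqs[name] = ""
        elif name is not None:
            seqs[name] += line
    return seqs


def gap_fractions(seqs, gap_chars="-"):
    """Return (name, gap fraction) pairs, most gap-heavy sequence first."""
    out = []
    for name, seq in seqs.items():
        n_gaps = sum(seq.count(c) for c in gap_chars)
        frac = n_gaps / len(seq) if seq else 0.0
        out.append((name, frac))
    return sorted(out, key=lambda pair: pair[1], reverse=True)


# Toy aligned protein sequences (hypothetical, not the attached data).
example = """>seqA
MKT-LLV
>seqB
M-----V
>seqC
MKTALLV
"""

for name, frac in gap_fractions(read_fasta(example)):
    print(f"{name}\t{frac:.2f}")
```

A sequence that sits far above the others in this ranking is the one worth double-checking or dropping before rerunning pal2nal.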

I also attach an AA alignment made using mafft v7.520 with default params. Note also that cleandata=1 will automatically remove any site with >= 1 gap, while cleandata=0 will treat gaps as unknowns (i.e., that "taxon" is removed when calculating the likelihood of that site; for codon models I think "site" here means one codon, thus three nucleotides). See also Seo Tae-Kun's thread at [link missing]. You might also wish to have a look at my tentative solution to Problem 2.1 of Yang 2006 at [link missing], which uses [tool name missing] and codeml.
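
To get a feel for how many codon sites would survive gap removal (with either -nogap in pal2nal or cleandata=1 in codeml), something like the rough sketch below may help. It is not how codeml itself is implemented: the function name `complete_codon_sites` and the toy alignment are hypothetical, it assumes an in-frame codon alignment, and it only looks at "-" and "?" whereas codeml with cleandata=1 also drops sites with other ambiguity characters.

```python
def complete_codon_sites(seqs, gap_chars="-?"):
    """Count codon columns in which no sequence has a gap or '?'.

    Returns (number of gap-free codon sites, total codon sites).
    """
    lengths = {len(s) for s in seqs.values()}
    if len(lengths) != 1:
        raise ValueError("sequences differ in length; is the file aligned?")
    length = lengths.pop()
    if length % 3 != 0:
        raise ValueError("alignment length is not a multiple of 3")

    kept = 0
    for i in range(0, length, 3):
        # One codon "site" is three nucleotide columns taken together.
        codons = [s[i:i + 3] for s in seqs.values()]
        if all(ch not in gap_chars for codon in codons for ch in codon):
            kept += 1
    return kept, length // 3


# Toy codon alignment (hypothetical, not the attached data).
codon_aln = {
    "seqA": "ATGAAA---CTG",
    "seqB": "ATG------CTG",
    "seqC": "ATGAAAACCCTG",
}

kept, total = complete_codon_sites(codon_aln)
print(f"{kept} of {total} codon sites have no gaps")
```

If the first number comes out as 0 for a real alignment, that would be consistent with -nogap producing an empty output file.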


