What is the meaning of the number of patterns in an alignment?


Andy Garcia

Jan 14, 2021, 2:56:30 PM
to raxml
Hello,

I am trying to use RAxML-HPC2 on CIPRES, and the website is asking me to input the number of patterns in my alignment. This is what CIPRES says: "Knowing the number of characters in your dataset helps us determine the most efficient way to run raxml. We need to know the number of characters per row in the input data matrix,".

I saw on another thread here that the number of patterns is the number of unique columns in the multiple sequence alignment (identical columns counted once). Is this the same thing that CIPRES is describing above? If so, how do I determine the number of unique columns in my alignment? I know how to find the total number of columns; I'm just not sure how to determine how many of them are unique, i.e. how many patterns there are. Thank you for your time.

Best Regards,
Andy

Mark Miller

Jan 14, 2021, 4:10:22 PM
to raxml
Hi Andy,

First, let me apologize for the text in the CIPRES interface, which is incorrect. The correct definition of patterns is the number of unique columns in the input multiple sequence alignment.
If you don't have that information already, you can make a quick test run, entering either 1000 or the number of characters in your matrix as a starting value for the number of patterns.
Start the run and look at the intermediate results; RAxML8 and RAxML-NG both print the number of patterns. If that number is very different from what you entered, you can kill the job, then clone it and adjust the number of patterns.
It is OK to enter 1000, but if the actual number of patterns is much larger than that, the run may be very slow compared to what is possible.
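
For a rough estimate before submitting, the unique columns can also be counted locally. The following is a minimal Python sketch, not RAxML's own code: the helper names read_fasta and count_patterns are made up for illustration, the input is assumed to be an aligned FASTA file, and upper- and lower-case letters are assumed to be equivalent. RAxML may count slightly differently (for example with partitioned data), so the number it prints in the log remains the one to trust.

```python
def read_fasta(path):
    """Read an aligned FASTA file and return its sequences as strings.

    Hypothetical helper for illustration; assumes a plain FASTA layout.
    """
    seqs, current = [], []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                # A new header closes the previous record, if any.
                if current:
                    seqs.append("".join(current))
                current = []
            elif line:
                current.append(line)
    if current:
        seqs.append("".join(current))
    return seqs


def count_patterns(sequences):
    """Count the distinct columns in a list of aligned sequences."""
    lengths = {len(s) for s in sequences}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same length")
    # zip(*rows) transposes the alignment: one tuple per column.
    # Upper-casing is an assumption (treats 'a' and 'A' as the same state).
    columns = zip(*(s.upper() for s in sequences))
    return len(set(columns))


# Toy alignment: columns are AAA, CCC, GGG, TAT, all different.
print(count_patterns(["ACGT", "ACGA", "ACGT"]))  # → 4
```

With a real file, something like count_patterns(read_fasta("my_alignment.fasta")) gives a starting value to enter in the form; the file name here is only a placeholder.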

Let me know if you have further questions. I will correct the text in the interface.

Best,
Mark