What are "indel groups" in "alignment-info"?

Benjamin Redelings

Feb 7, 2022, 5:57:44 PM
to bali-ph...@googlegroups.com

Hi,

I was asked what "indel groups" means in the "alignment-info" program.  Here is my answer:

1. For the purposes of this program, an indel is a sequence of "-" characters in one row with non-gaps on both ends.  It is determined by the location of the first "-" and the last "-" in the indel, and also the row.

2. The total number of indels in an alignment is computed by determining the number of indels in each row, and then adding the numbers for the rows.  This is the number of "separate" indels.

3. Two gaps in different rows are considered to be in the same "indel group" if the two gaps start in the same column and end in the same column.  An indel group is determined by just the location of the first and last "-", ignoring the row.

4. An indel group is "unique" if it occurs in only one sequence.  It is "informative" if it occurs in two or more sequences.  The term "informative" is borrowed from maximum parsimony.

5. However, the definition of gaps from alignment-info has a problem.  It depends on the order of the columns, but that order is not always determined by the homology information.  For example, consider:

A-A-
-A-A

and

AA--
--AA

These two alignments have the same homology information (in both, no column contains more than one letter), but they have different indel groups, according to the definition above.  This is a problem if you want to use this program for research.
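
The definitions in points 1-4 can be sketched in a few lines of Python. This is only an illustration, not the code that alignment-info actually runs. All function names here (row_indels, indel_groups, summarize, homology_pairs) are hypothetical. It assumes that point 1 is read strictly, so that leading and trailing gaps do not count as indels, and it uses 0-based column numbers.

```python
import re
from collections import defaultdict
from itertools import combinations


def row_indels(row):
    """Return (first, last) columns, 0-based and inclusive, of each indel in a row.

    Assumption: a run of '-' counts only if it has a non-gap on both sides,
    so leading and trailing gaps are skipped.
    """
    pattern = r"(?<=[^-])-+(?=[^-])"
    return [(m.start(), m.end() - 1) for m in re.finditer(pattern, row)]


def indel_groups(alignment):
    """Map each (first, last) span to the list of rows that contain it."""
    groups = defaultdict(list)
    for i, row in enumerate(alignment):
        for span in row_indels(row):
            groups[span].append(i)
    return dict(groups)


def summarize(alignment):
    """Count separate indels, indel groups, and unique/informative groups."""
    groups = indel_groups(alignment)
    return {
        "indels": sum(len(rows) for rows in groups.values()),
        "groups": len(groups),
        "unique": sum(1 for rows in groups.values() if len(rows) == 1),
        "informative": sum(1 for rows in groups.values() if len(rows) >= 2),
    }


def homology_pairs(alignment):
    """Return the set of residue pairs that share a column.

    Each residue is named by (row, index of the residue within its row),
    so the result does not depend on the order of the columns.
    """
    seen = [0] * len(alignment)  # residues consumed so far in each row
    pairs = set()
    for column in zip(*alignment):
        present = []
        for r, ch in enumerate(column):
            if ch != "-":
                present.append((r, seen[r]))
                seen[r] += 1
        pairs.update(combinations(present, 2))
    return pairs


first = ["A-A-", "-A-A"]
second = ["AA--", "--AA"]

print(homology_pairs(first) == homology_pairs(second))  # True
print(indel_groups(first))   # {(1, 1): [0], (2, 2): [1]}
print(indel_groups(second))  # {}

shared = ["AC--GT", "AC--GT", "ACTTGT"]
print(summarize(shared))
# {'indels': 2, 'groups': 1, 'unique': 0, 'informative': 1}
```

Under these assumptions the two example alignments have identical homology pairs (none at all), yet the first has two unique indel groups and the second has none. If terminal gaps were counted as well, the counts would change, but the two alignments would still disagree.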

-BenRI
