Question: Rate matrix Q with GTR+F+G4

Eve

Jul 30, 2019, 11:11:47 AM
to IQ-TREE
We are trying to reconstruct a tree within a bacterial CC group (expected to be very closely related). We removed expected recombinant sites and worked with a core alignment.
We chose a standard substitution model, GTR+F+G4 (we have not done model testing yet, we are just learning...).
We are not so surprised to observe that G4 might actually be overfitting: if I understand correctly, there is very little variation between the rate heterogeneity classes, which we would expect when working within a single lineage/CC group.

But we were wondering why the rate matrix Q is not symmetric.
Is it because of rounding/model approximation or scaling?  I see all diagonal values are negative. Could you explain? 

Please find below the extract of the output corresponding to this question.
Best regards

Eve

```
SEQUENCE ALIGNMENT
------------------

Input data: 25 sequences with 1441 nucleotide sites
Number of constant sites: 0 (= 0% of all sites)
Number of invariant (constant or ambiguous constant) sites: 0 (= 0% of all sites)
Number of parsimony informative sites: 481
Number of distinct site patterns: 226

SUBSTITUTION PROCESS
--------------------

Model of substitution: GTR+F+G4

Rate parameter R:

  A-C: 0.7237
  A-G: 5.0504
  A-T: 1.3350
  C-G: 0.2022
  C-T: 5.0167
  G-T: 1.0000

State frequencies: (empirical counts from alignment)

  pi(A) = 0.2086
  pi(C) = 0.2908
  pi(G) = 0.2811
  pi(T) = 0.2195

Rate matrix Q:

  A    -1.203    0.1316    0.8877    0.1832
  C    0.0944   -0.8185   0.03554    0.6886
  G    0.6588   0.03676   -0.8328    0.1373
  T    0.1741    0.9121    0.1758    -1.262

Model of rate heterogeneity: Gamma with 4 categories
Gamma shape alpha: 998.4

 Category  Relative_rate  Proportion
  1         0.9601         0.25
  2         0.9894         0.25
  3         1.01           0.25
  4         1.041          0.25
Relative rates are computed as MEAN of the portion of the Gamma distribution falling in the category.
```
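The near-identical relative rates in the table above follow from the very large shape parameter: a Gamma distribution with mean 1 and shape alpha has variance 1/alpha, so alpha = 998.4 leaves almost no rate variation across sites. A minimal sketch of this, assuming only the Python standard library, is given here. It approximates the four category means by Monte Carlo sampling, which is not how IQ-TREE computes them; the function name `mean_rates_by_quartile`, the sample size and the seed are illustrative choices.

```python
import random

def mean_rates_by_quartile(alpha, n_samples=200_000, seed=1):
    """Approximate the mean rate in each of 4 equal-probability categories
    of a Gamma(shape=alpha, scale=1/alpha) distribution (mean 1)."""
    rng = random.Random(seed)
    # Draw rates, sort them, and cut the sorted list into four equal parts.
    draws = sorted(rng.gammavariate(alpha, 1.0 / alpha) for _ in range(n_samples))
    size = n_samples // 4
    return [sum(draws[k * size:(k + 1) * size]) / size for k in range(4)]

# Shape parameter taken from the output above.
rates = mean_rates_by_quartile(998.4)
print([round(r, 3) for r in rates])
```

The four values should land close to the reported 0.9601, 0.9894, 1.01 and 1.041, up to sampling noise. A small alpha (for example 0.5) gives widely separated categories instead, which is the situation where +G4 matters.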

Minh Bui

Aug 11, 2019, 4:05:44 AM
to IQ-TREE, Eve
Hi Eve,

Sorry for the slow response.

Q is not symmetric because q_ij = r_ij * pi_j (for i ≠ j, up to a constant that rescales the whole matrix),

where r_ij is an entry of the symmetric matrix R (the “Rate parameter R” in the IQ-TREE output), and pi_j is the frequency of state j.

Q is symmetric only if the model has equal state frequencies (e.g. the SYM model). See http://www.iqtree.org/doc/Substitution-Models#dna-models 

Note that this model is still reversible, even though Q is not symmetric.
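
The relation q_ij = r_ij * pi_j can be checked against the numbers in the output quoted in the question. The following is a minimal sketch in plain Python, not IQ-TREE code. It assumes the usual normalisation in which Q is rescaled so that the expected substitution rate at equilibrium is 1; that assumption is what makes the values match the reported matrix. The helper names `exchangeability` and `build_q` are made up for this illustration.

```python
# Values copied from the IQ-TREE output quoted in the question.
STATES = "ACGT"
R = {("A", "C"): 0.7237, ("A", "G"): 5.0504, ("A", "T"): 1.3350,
     ("C", "G"): 0.2022, ("C", "T"): 5.0167, ("G", "T"): 1.0000}
PI = {"A": 0.2086, "C": 0.2908, "G": 0.2811, "T": 0.2195}

def exchangeability(i, j):
    """Look up r_ij in the symmetric matrix R (r_ij == r_ji)."""
    return R[(i, j)] if (i, j) in R else R[(j, i)]

def build_q():
    # Off-diagonal entries: q_ij = r_ij * pi_j.
    q = {i: {j: exchangeability(i, j) * PI[j] for j in STATES if j != i}
         for i in STATES}
    # Assumed normalisation: expected rate sum_i pi_i * sum_{j != i} q_ij = 1.
    mu = sum(PI[i] * sum(q[i].values()) for i in STATES)
    for i in STATES:
        for j in q[i]:
            q[i][j] /= mu
        # Diagonal entry makes the row sum to zero.
        q[i][i] = -sum(q[i].values())
    return q

Q = build_q()
for i in STATES:
    print(i, " ".join(f"{Q[i][j]:9.4f}" for j in STATES))

# Q is not symmetric (q_AC != q_CA), but detailed balance
# pi_i * q_ij == pi_j * q_ji holds, which is what reversibility means.
for i in STATES:
    for j in STATES:
        assert abs(PI[i] * Q[i][j] - PI[j] * Q[j][i]) < 1e-12
```

The printed rows should agree with the reported "Rate matrix Q" up to rounding of the inputs. Setting all four frequencies to 0.25 in `PI` makes the result symmetric, as in the SYM model.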

Hope that helps,
Minh

--
You received this message because you are subscribed to the Google Groups "IQ-TREE" group.
To unsubscribe from this group and stop receiving emails from it, send an email to iqtree+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/iqtree/ed0343c7-f670-42af-b9e7-89ec09a8e381%40googlegroups.com.

Eve

Aug 12, 2019, 2:58:30 AM
to IQ-TREE
Thanks a lot for your answer and for your time.
All the best
Eve

Minh Bui

Aug 16, 2019, 3:22:50 AM
to IQ-TREE, Eve
Hi again Eve,

Regarding your other question: the diagonal elements are negative because each row of the Q matrix must sum to 0. This is standard for a continuous-time Markov process. I suggest you have a look at a textbook such as The Phylogenetic Handbook, which has a nice chapter on the derivation of the Q matrix.
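
The row-sum condition can be written out briefly. This is a sketch of the standard argument, where P(t) denotes the matrix of transition probabilities over time t (a symbol introduced here, not used elsewhere in the thread); the last line plugs in row A of the matrix from the question.

```latex
P(t) = e^{Qt}, \qquad \sum_j P_{ij}(t) = 1 \quad \text{for all } t \ge 0
\;\Longrightarrow\;
\frac{d}{dt} \sum_j P_{ij}(t) \Big|_{t=0} = \sum_j q_{ij} = 0
\;\Longrightarrow\;
q_{ii} = -\sum_{j \ne i} q_{ij} < 0 .

% Row A of the reported matrix:
q_{AA} = -(0.1316 + 0.8877 + 0.1832) = -1.2025 \approx -1.203
```

So each diagonal entry is minus the total rate of leaving that state, which is why it is negative whenever any substitution away from the state is possible.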

Cheers,
Minh
