Model specified gets ignored


Émilie Tremblay

Jan 16, 2025, 10:48:47 PM
to raxml
Hello, I know this has been reported before, but I tried what has been suggested before without any luck.

My computer has 56 CPUs and 128 GB of memory.

I ran modeltest-ng to determine which model to use in raxml-ng, and it recommended the GTR+G4 model.

I have version 1.2.2 of raxml-ng, and this is the command I ran:

raxml-ng --msa /home/tremblayemi/nfcore/rawData/results/fungi-and-controls_v2/dada2/ASV_seqs_MSA.fasta --msa-format FASTA --data-type DNA --seed 1234 -model GTR+G4 --threads 48

This outputs the following warning:
WARNING: The model you specified on the command line (GTR+G4) will be ignored
         since the binary MSA file already contains a model definition.
         If you want to change the model, please re-run RAxML-NG
         with the original PHYLIP/FASTA alignment and --redo option.

Can you please help me sort this out?
My alignment is built from a set of very similar FASTA sequences (fungal ITS1), about 230 bp long on average.

By the way, the commands that the modeltest-ng run suggested seemed pretty limited in options. Is this concerning? I would have liked to know more about bootstrapping, for instance.

modeltest-ng -i /home/tremblayemi/nfcore/rawData/results/fungi-and-controls_v2/dada2/ASV_seqs_MSA.fasta -d nt -p 48 -T raxml -r 1234 -o /home/tremblayemi/nfcore/rawData/results/fungi-and-controls_v2/dada2/model-test-output_fungi

output:

Input data:
  MSA:        /home/tremblayemi/nfcore/rawData/results/fungi-and-controls_v2/dada2/ASV_seqs_MSA.fasta
  Tree:       Maximum parsimony
    file:           -
  #taxa:            3152
  #sites:           2283
  #patterns:        1917
  Max. thread mem:  2995 MB

Output:
  Log:           /home/tremblayemi/nfcore/rawData/results/fungi-and-controls_v2/dada2/model-test-output_fungi.log
  Starting tree: /home/tremblayemi/nfcore/rawData/results/fungi-and-controls_v2/dada2/model-test-output_fungi.tree
  Results:       /home/tremblayemi/nfcore/rawData/results/fungi-and-controls_v2/dada2/model-test-output_fungi.out

Selection options:
  # dna schemes:      3
  # dna models:       12
  include model parameters:
    Uniform:         true
    p-inv (+I):      true
    gamma (+G):      true
    both (+I+G):     true
    free rates (+R): false
    fixed freqs:     false
    estimated freqs: true
    #categories:     4
  gamma rates mode:   mean
  asc bias:           none
  epsilon (opt):      0.01
  epsilon (par):      0.05
  keep branches:      false

Additional options:
  verbosity:        very low
  threads:          48/28
  RNG seed:         1234
  subtree repeats:  enabled
--------------------------------------------------------------------------------

BIC       model              K            lnL          score          delta    weight
--------------------------------------------------------------------------------
       1  GTR+G4             9   -360262.0260    769320.8321         0.0000    1.0000
       2  GTR+I+G4          10   -360286.8997    769378.3126        57.4805    0.0000
       3  GTR                8   -385567.0516    819923.1500     50602.3179    0.0000
       4  GTR+I              9   -385575.1867    819947.1535     50626.3214    0.0000
       5  HKY                4   -386346.6207    821451.3552     52130.5231    0.0000
       6  HKY+I              5   -386351.7237    821469.2944     52148.4623    0.0000
       7  F81                3   -387996.7704    824743.9214     55423.0892    0.0000
       8  F81+I              4   -388005.0147    824768.1432     55447.3111    0.0000
       9  HKY+G4             5   -429866.2668    908498.3807    139177.5486    0.0000
      10  HKY+I+G4           6   -429976.1962    908725.9726    139405.1405    0.0000

Best model according to BIC
---------------------------
Model:              GTR+G4
lnL:                -360262.0260
Frequencies:        0.2585 0.2369 0.2170 0.2877
Subst. Rates:       1.2536 1.7666 1.6104 0.9823 2.4136 1.0000
Inv. sites prop:    -
Gamma shape:        0.6464
Score:              769320.8321
Weight:             1.0000
---------------------------
Parameter importances
---------------------------
P.Inv:              -
Gamma:              1.0000
Gamma-Inv:          0.0000
Frequencies:        1.0000
---------------------------
Model averaged estimates
---------------------------
P.Inv:              -
Alpha:              0.6464
Alpha-P.Inv:        0.6472
P.Inv-Alpha:        0.0221
Frequencies:        0.2585 0.2369 0.2170 0.2877

Commands:
  > phyml  -i /home/tremblayemi/nfcore/rawData/results/fungi-and-controls_v2/dada2/ASV_seqs_MSA.fasta -m 012345 -f m -v 0 -a e -c 4 -o tlr
  > raxmlHPC-SSE3 -s /home/tremblayemi/nfcore/rawData/results/fungi-and-controls_v2/dada2/ASV_seqs_MSA.fasta -m GTRGAMMAX -n EXEC_NAME -p PA>
  > raxml-ng --msa /home/tremblayemi/nfcore/rawData/results/fungi-and-controls_v2/dada2/ASV_seqs_MSA.fasta --model GTR+G4
  > paup -s /home/tremblayemi/nfcore/rawData/results/fungi-and-controls_v2/dada2/ASV_seqs_MSA.fasta
  > iqtree -s /home/tremblayemi/nfcore/rawData/results/fungi-and-controls_v2/dada2/ASV_seqs_MSA.fasta -m GTR+G4

AIC       model              K            lnL          score          delta    weight
--------------------------------------------------------------------------------
       1  GTR+G4             9   -360262.0260    733144.0521         0.0000    1.0000
       2  GTR+I+G4          10   -360286.8997    733195.7993        51.7473    0.0000
       3  GTR                8   -385567.0516    783752.1032     50608.0511    0.0000
       4  GTR+I              9   -385575.1867    783770.3735     50626.3214    0.0000
       5  HKY                4   -386346.6207    785303.2414     52159.1893    0.0000
       6  HKY+I              5   -386351.7237    785315.4474     52171.3953    0.0000
       7  F81                3   -387996.7704    788601.5408     55457.4887    0.0000
       8  F81+I              4   -388005.0147    788620.0294     55475.9773    0.0000
       9  HKY+G4             5   -429866.2668    872344.5337    139200.4816    0.0000
      10  HKY+I+G4           6   -429976.1962    872566.3923    139422.3402    0.0000
--------------------------------------------------------------------------------
Best model according to AIC
---------------------------
Model:              GTR+G4
lnL:                -360262.0260
Frequencies:        0.2585 0.2369 0.2170 0.2877
Subst. Rates:       1.2536 1.7666 1.6104 0.9823 2.4136 1.0000
Inv. sites prop:    -
Gamma shape:        0.6464
Score:              733144.0521
Weight:             1.0000
---------------------------
Parameter importances
---------------------------
P.Inv:              -
Gamma:              1.0000
Gamma-Inv:          0.0000
Frequencies:        1.0000
---------------------------
Model averaged estimates
---------------------------
P.Inv:              -
Alpha:              0.6464
Alpha-P.Inv:        0.6472
P.Inv-Alpha:        0.0221
Frequencies:        0.2585 0.2369 0.2170 0.2877

Commands:
  > phyml  -i /home/tremblayemi/nfcore/rawData/results/fungi-and-controls_v2/dada2/ASV_seqs_MSA.fasta -m 012345 -f m -v 0 -a e -c 4 -o tlr
  > raxmlHPC-SSE3 -s /home/tremblayemi/nfcore/rawData/results/fungi-and-controls_v2/dada2/ASV_seqs_MSA.fasta -m GTRGAMMAX -n EXEC_NAME -p PA>
  > raxml-ng --msa /home/tremblayemi/nfcore/rawData/results/fungi-and-controls_v2/dada2/ASV_seqs_MSA.fasta --model GTR+G4
  > paup -s /home/tremblayemi/nfcore/rawData/results/fungi-and-controls_v2/dada2/ASV_seqs_MSA.fasta
  > iqtree -s /home/tremblayemi/nfcore/rawData/results/fungi-and-controls_v2/dada2/ASV_seqs_MSA.fasta -m GTR+G4

AICc      model              K            lnL          score          delta    weight
--------------------------------------------------------------------------------
       1  F81                3   -387996.7704  80282041.5408         0.0000    1.0000
       2  HKY                4   -386346.6207  80303963.2414     21921.7006    0.0000
       3  F81+I              4   -388005.0147  80307280.0294     25238.4886    0.0000
       4  HKY+I              5   -386351.7237  80329199.4474     47157.9066    0.0000
       5  GTR+G4             9   -360262.0260  80377964.0521     95922.5113    0.0000
       6  F81+G4             4   -431215.1159  80393700.2318    111658.6910    0.0000
       7  GTR+I+G4          10   -360286.8997  80403259.7993    121218.2585    0.0000
       8  GTR                8   -385567.0516  80403332.1032    121290.5624    0.0000
       9  HKY+G4             5   -429866.2668  80416228.5337    134186.9929    0.0000
      10  F81+I+G4           5   -431325.7187  80419147.4374    137105.8966    0.0000

Best model according to AICc
---------------------------
Model:              F81
lnL:                -387996.7704
Frequencies:        0.2702 0.2471 0.1822 0.3005
Subst. Rates:       1.0000 1.0000 1.0000 1.0000 1.0000 1.0000
Inv. sites prop:    -
Gamma shape:        -
Score:              80282041.5408
Weight:             1.0000
---------------------------
Parameter importances
---------------------------
P.Inv:              -
Gamma:              -
Gamma-Inv:          -
Frequencies:        1.0000
---------------------------
Model averaged estimates
---------------------------
P.Inv:              -
Alpha:              -
Alpha-P.Inv:        -
P.Inv-Alpha:        -
Frequencies:        0.2702 0.2471 0.1822 0.3005

Commands:
  > phyml  -i /home/tremblayemi/nfcore/rawData/results/fungi-and-controls_v2/dada2/ASV_seqs_MSA.fasta -m 000000 -f m -v 0 -a 0 -c 1 -o tlr
  > raxmlHPC-SSE3 -s /home/tremblayemi/nfcore/rawData/results/fungi-and-controls_v2/dada2/ASV_seqs_MSA.fasta -c 1 -m GTRCATX --JC69 -n EXEC_>
  > raxml-ng --msa /home/tremblayemi/nfcore/rawData/results/fungi-and-controls_v2/dada2/ASV_seqs_MSA.fasta --model F81
  > paup -s /home/tremblayemi/nfcore/rawData/results/fungi-and-controls_v2/dada2/ASV_seqs_MSA.fasta
  > iqtree -s /home/tremblayemi/nfcore/rawData/results/fungi-and-controls_v2/dada2/ASV_seqs_MSA.fasta -m F81

Done


Thanks


Grimm

Jan 17, 2025, 7:54:04 AM
to raxml
Hi,

The warning message is quite clear: ASV_seqs_MSA.fasta either already contains a model definition or may be in the wrong data format (e.g. binary instead of a nucleotide sequence alignment).
Ideally, it should look like this:
>ASV1 
GCATTCCCTTTTRG....
>ASV2
GCATCCCCTATARG....

or
ASV1 GCATTCCCTTTTRG....
ASV2 GCATCCCCTATARG....

and nothing else. Just labels and sequences.
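
A quick way to confirm that a FASTA file really holds only labels and sequences is a small sanity check like the sketch below. It is an illustration, not part of RAxML-NG: the function name `check_fasta` and the allowed alphabet (IUPAC nucleotide codes plus `-`, `?` and `.`) are assumptions made for this example.

```python
import io

# Assumed alphabet: IUPAC nucleotide codes plus gap/missing symbols.
ALLOWED = set("ACGTURYSWKMBDHVN-?.")

def check_fasta(handle):
    """Return (number_of_records, list_of_problems) for a FASTA alignment."""
    problems = []
    lengths = {}          # label -> accumulated sequence length
    label = None
    for lineno, raw in enumerate(handle, start=1):
        line = raw.strip()
        if not line:
            continue      # blank lines are tolerated
        if line.startswith(">"):
            label = line[1:].strip()
            if not label:
                problems.append(f"line {lineno}: empty label")
            if label in lengths:
                problems.append(f"line {lineno}: duplicate label {label!r}")
            lengths[label] = 0
        elif label is None:
            problems.append(f"line {lineno}: sequence data before first '>' header")
        else:
            bad = set(line.upper()) - ALLOWED
            if bad:
                problems.append(f"line {lineno}: unexpected characters {sorted(bad)}")
            lengths[label] += len(line)
    # In an alignment, every sequence must have the same length.
    if len(set(lengths.values())) > 1:
        problems.append("sequences differ in length, so this is not an alignment")
    return len(lengths), problems

# Tiny in-memory example using the two records shown above.
sample = ">ASV1\nGCATTCCCTTTTRG\n>ASV2\nGCATCCCCTATARG\n"
print(check_fasta(io.StringIO(sample)))
```

For a real file, pass `open("ASV_seqs_MSA.fasta")` instead of the in-memory sample; an empty problem list means the file contains nothing but labels and equal-length sequences.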

Re. second question: Nah, that's fine. ITS1 is a non-coding but functional spacer that is subject to few structural constraints (in contrast to ITS2, ITS1 is not folded during rRNA maturation) yet is not 100% free to evolve neutrally. For such spacers, in samples that comprise a high diversity (1900 DAP for 2200 AS; I guess it's an environmental bulk sample), model testing usually converges to a single optimal option. Whether that is the max-parametrised GTR (the typical result for most organisms' ITS1) or the classic F81 (next most common, together with HKY) depends on the sequence structure (mutation patterns in globally aligned sites) and on things like overall GC content.
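
As a side note on why the AICc table ranks F81 first while BIC and AIC both pick GTR+G4, the posted scores can be reproduced with a little arithmetic. The sketch below rests on two assumptions inferred from the numbers in the output, not taken from the modeltest-ng documentation: the free-parameter count is the K column plus the 2n - 3 branch lengths of an unrooted binary tree, and the AICc correction uses a denominator of 1 when n_sites - k - 1 is not positive (with 3152 taxa and 2283 sites it is negative).

```python
import math

def information_criteria(lnl, model_params, n_taxa, n_sites):
    """Return (AIC, BIC, AICc) for one model."""
    # Assumption: free parameters = model parameters (K column)
    # plus the 2n - 3 branch lengths of an unrooted binary tree.
    k = model_params + 2 * n_taxa - 3
    aic = -2.0 * lnl + 2.0 * k
    bic = -2.0 * lnl + k * math.log(n_sites)
    # Textbook AICc divides by (n_sites - k - 1). That is negative here,
    # and the posted AICc column matches a denominator of 1 (assumption).
    denom = max(n_sites - k - 1, 1)
    aicc = aic + 2.0 * k * (k + 1) / denom
    return aic, bic, aicc

# lnL and K values copied from the tables in the first message.
for name, lnl, K in [("GTR+G4", -360262.0260, 9), ("F81", -387996.7704, 3)]:
    aic, bic, aicc = information_criteria(lnl, K, n_taxa=3152, n_sites=2283)
    print(f"{name:8s} AIC={aic:.4f}  BIC={bic:.4f}  AICc={aicc:.4f}")
```

Under these assumptions the computed values agree with the posted tables to within rounding. The AICc penalty grows with the square of the parameter count, so with more free parameters than sites it swamps the likelihood difference and favours the model with the fewest parameters. That would explain the F81 result, and suggests the AICc ranking should be read with caution for this data set.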

Cheers, G.

Oleksiy Kozlov

Jan 17, 2025, 7:55:05 AM
to ra...@googlegroups.com
Hi Émilie,



> raxml-ng --msa /home/tremblayemi/nfcore/rawData/results/fungi-and-controls_v2/dada2/
> ASV_seqs_MSA.fasta --msa-format FASTA --data-type DNA --seed 1234 -model GTR+G4 --threads 48
>
> this outputs the following warning:
> WARNING: The model you specified on the command line (GTR+G4) will be ignored
>          since the binary MSA file already contains a model definition.
>          If you want to change the model, please re-run RAxML-NG
>          with the original PHYLIP/FASTA alignment and --redo option.
>
> Can you please help me sort this out?

Please ignore this warning, since the model is the same (GTR+G4) in both cases.


> By the way, the commands that the modeltest-ng run suggested seemed pretty limited in options. Is
> this concerning? I would have liked to know more about bootstrapping, for instance.

I am not quite sure what you mean by "limited" here.
Bootstrapping is typically used during tree search with raxml-ng, and not for model selection.
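
To make that concrete, here is a minimal sketch of how a raxml-ng run with bootstrapping could be assembled. It only builds and prints the command line and does not run anything. The flags used (`--all`, `--msa`, `--model`, `--bs-trees`, `--seed`, `--threads`, `--prefix`, `--redo`) are raxml-ng options; the helper name `raxml_ng_command`, the prefix `fungi_gtr_g4` and the choice of 100 replicates are assumptions made for this illustration.

```python
import shlex

def raxml_ng_command(msa, model, prefix, threads=48, seed=1234,
                     bs_trees=100, redo=False):
    """Build a raxml-ng command: ML search plus bootstrapping (--all)."""
    cmd = [
        "raxml-ng", "--all",           # tree search + bootstrap + support mapping
        "--msa", msa,
        "--model", model,
        "--bs-trees", str(bs_trees),   # number of bootstrap replicates (assumed value)
        "--seed", str(seed),
        "--threads", str(threads),
        "--prefix", prefix,            # hypothetical output prefix
    ]
    if redo:
        # Overwrite results of an earlier run, as the warning text suggests.
        cmd.append("--redo")
    return cmd

cmd = raxml_ng_command("ASV_seqs_MSA.fasta", "GTR+G4",
                       prefix="fungi_gtr_g4", redo=True)
print(shlex.join(cmd))
```

With more than 3000 taxa, each bootstrap replicate is expensive, so the replicate count is worth choosing deliberately rather than copying from this sketch.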


Best,
Oleksiy