Hello, I know this has been reported before, but I tried the fixes suggested in those earlier reports without any luck.
My computer has 56 CPUs and 128 GB of RAM.
I used modeltest-ng to determine which model to use in raxml-ng, and it recommended the GTR+G4 model.
I am using raxml-ng version 1.2.2, and this is the command I ran:
raxml-ng --msa /home/tremblayemi/nfcore/rawData/results/fungi-and-controls_v2/dada2/ASV_seqs_MSA.fasta --msa-format FASTA --data-type DNA --seed 1234 -model GTR+G4 --threads 48
This outputs the following warning:
WARNING: The model you specified on the command line (GTR+G4) will be ignored
since the binary MSA file already contains a model definition.
If you want to change the model, please re-run RAxML-NG
with the original PHYLIP/FASTA alignment and --redo option.
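A minimal sketch of the re-run that the warning describes, written as a Python dry run so that nothing is launched by accident. The flags are taken from the command and the warning above, with `--model` spelled with two dashes as in the raxml-ng command that modeltest-ng prints further down. The helper name `redo_command` and the `RUN` switch are hypothetical, and whether this clears the warning in this particular setup is untested.

```python
import shlex
import shutil
import subprocess

# Alignment path copied from the command above.
MSA = "/home/tremblayemi/nfcore/rawData/results/fungi-and-controls_v2/dada2/ASV_seqs_MSA.fasta"


def redo_command(msa, model="GTR+G4", threads=48, seed=1234):
    """Build the raxml-ng argument list for a re-run with --redo (hypothetical helper)."""
    return [
        "raxml-ng",
        "--msa", msa,
        "--msa-format", "FASTA",
        "--data-type", "DNA",
        "--model", model,
        "--seed", str(seed),
        "--threads", str(threads),
        "--redo",  # the option named in the warning text
    ]


cmd = redo_command(MSA)
print(shlex.join(cmd))  # dry run: only print the command line

RUN = False  # hypothetical switch; set to True to actually launch raxml-ng
if RUN and shutil.which("raxml-ng"):
    subprocess.run(cmd, check=True)
```

Printing the command first makes it easy to confirm that the model flag and the alignment path are the intended ones before starting a long run.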
Can you please help me sort this out?
My alignment is built from a set of very similar FASTA sequences (fungal ITS1) of about 230 bp on average.
By the way, the raxml-ng command that modeltest-ng suggested seems quite limited in its options. Is this a concern? I would have liked to know more about bootstrapping, for instance. For reference, this is the modeltest-ng command I ran:
modeltest-ng -i /home/tremblayemi/nfcore/rawData/results/fungi-and-controls_v2/dada2/ASV_seqs_MSA.fasta -d nt -p 48 -T raxml -r 1234 -o /home/tremblayemi/nfcore/rawData/results/fungi-and-controls_v2/dada2/model-test-output_fungi
output:
Input data:
MSA: /home/tremblayemi/nfcore/rawData/results/fungi-and-controls_v2/dada2/ASV_seqs_MSA.fasta
Tree: Maximum parsimony
file: -
#taxa: 3152
#sites: 2283
#patterns: 1917
Max. thread mem: 2995 MB
Output:
Log: /home/tremblayemi/nfcore/rawData/results/fungi-and-controls_v2/dada2/model-test-output_fungi.log
Starting tree: /home/tremblayemi/nfcore/rawData/results/fungi-and-controls_v2/dada2/model-test-output_fungi.tree
Results: /home/tremblayemi/nfcore/rawData/results/fungi-and-controls_v2/dada2/model-test-output_fungi.out
Selection options:
# dna schemes: 3
# dna models: 12
include model parameters:
Uniform: true
p-inv (+I): true
gamma (+G): true
both (+I+G): true
free rates (+R): false
fixed freqs: false
estimated freqs: true
#categories: 4
gamma rates mode: mean
asc bias: none
epsilon (opt): 0.01
epsilon (par): 0.05
keep branches: false
Additional options:
verbosity: very low
threads: 48/28
RNG seed: 1234
subtree repeats: enabled
--------------------------------------------------------------------------------
BIC   model      K            lnL           score         delta   weight
--------------------------------------------------------------------------------
   1  GTR+G4     9   -360262.0260     769320.8321        0.0000   1.0000
   2  GTR+I+G4  10   -360286.8997     769378.3126       57.4805   0.0000
   3  GTR        8   -385567.0516     819923.1500    50602.3179   0.0000
   4  GTR+I      9   -385575.1867     819947.1535    50626.3214   0.0000
   5  HKY        4   -386346.6207     821451.3552    52130.5231   0.0000
   6  HKY+I      5   -386351.7237     821469.2944    52148.4623   0.0000
   7  F81        3   -387996.7704     824743.9214    55423.0892   0.0000
   8  F81+I      4   -388005.0147     824768.1432    55447.3111   0.0000
   9  HKY+G4     5   -429866.2668     908498.3807   139177.5486   0.0000
  10  HKY+I+G4   6   -429976.1962     908725.9726   139405.1405   0.0000
Best model according to BIC
---------------------------
Model: GTR+G4
lnL: -360262.0260
Frequencies: 0.2585 0.2369 0.2170 0.2877
Subst. Rates: 1.2536 1.7666 1.6104 0.9823 2.4136 1.0000
Inv. sites prop: -
Gamma shape: 0.6464
Score: 769320.8321
Weight: 1.0000
---------------------------
Parameter importances
---------------------------
P.Inv: -
Gamma: 1.0000
Gamma-Inv: 0.0000
Frequencies: 1.0000
---------------------------
Model averaged estimates
---------------------------
P.Inv: -
Alpha: 0.6464
Alpha-P.Inv: 0.6472
P.Inv-Alpha: 0.0221
Frequencies: 0.2585 0.2369 0.2170 0.2877
Commands:
> phyml -i /home/tremblayemi/nfcore/rawData/results/fungi-and-controls_v2/dada2/ASV_seqs_MSA.fasta -m 012345 -f m -v 0 -a e -c 4 -o tlr
> raxmlHPC-SSE3 -s /home/tremblayemi/nfcore/rawData/results/fungi-and-controls_v2/dada2/ASV_seqs_MSA.fasta -m GTRGAMMAX -n EXEC_NAME -p PA>
> raxml-ng --msa /home/tremblayemi/nfcore/rawData/results/fungi-and-controls_v2/dada2/ASV_seqs_MSA.fasta --model GTR+G4
> paup -s /home/tremblayemi/nfcore/rawData/results/fungi-and-controls_v2/dada2/ASV_seqs_MSA.fasta
> iqtree -s /home/tremblayemi/nfcore/rawData/results/fungi-and-controls_v2/dada2/ASV_seqs_MSA.fasta -m GTR+G4
AIC   model      K            lnL           score         delta   weight
--------------------------------------------------------------------------------
   1  GTR+G4     9   -360262.0260     733144.0521        0.0000   1.0000
   2  GTR+I+G4  10   -360286.8997     733195.7993       51.7473   0.0000
   3  GTR        8   -385567.0516     783752.1032    50608.0511   0.0000
   4  GTR+I      9   -385575.1867     783770.3735    50626.3214   0.0000
   5  HKY        4   -386346.6207     785303.2414    52159.1893   0.0000
   6  HKY+I      5   -386351.7237     785315.4474    52171.3953   0.0000
   7  F81        3   -387996.7704     788601.5408    55457.4887   0.0000
   8  F81+I      4   -388005.0147     788620.0294    55475.9773   0.0000
   9  HKY+G4     5   -429866.2668     872344.5337   139200.4816   0.0000
  10  HKY+I+G4   6   -429976.1962     872566.3923   139422.3402   0.0000
--------------------------------------------------------------------------------
Best model according to AIC
---------------------------
Model: GTR+G4
lnL: -360262.0260
Frequencies: 0.2585 0.2369 0.2170 0.2877
Subst. Rates: 1.2536 1.7666 1.6104 0.9823 2.4136 1.0000
Inv. sites prop: -
Gamma shape: 0.6464
Score: 733144.0521
Weight: 1.0000
---------------------------
Parameter importances
---------------------------
P.Inv: -
Gamma: 1.0000
Gamma-Inv: 0.0000
Frequencies: 1.0000
---------------------------
Model averaged estimates
---------------------------
P.Inv: -
Alpha: 0.6464
Alpha-P.Inv: 0.6472
P.Inv-Alpha: 0.0221
Frequencies: 0.2585 0.2369 0.2170 0.2877
Commands:
> phyml -i /home/tremblayemi/nfcore/rawData/results/fungi-and-controls_v2/dada2/ASV_seqs_MSA.fasta -m 012345 -f m -v 0 -a e -c 4 -o tlr
> raxmlHPC-SSE3 -s /home/tremblayemi/nfcore/rawData/results/fungi-and-controls_v2/dada2/ASV_seqs_MSA.fasta -m GTRGAMMAX -n EXEC_NAME -p PA>
> raxml-ng --msa /home/tremblayemi/nfcore/rawData/results/fungi-and-controls_v2/dada2/ASV_seqs_MSA.fasta --model GTR+G4
> paup -s /home/tremblayemi/nfcore/rawData/results/fungi-and-controls_v2/dada2/ASV_seqs_MSA.fasta
> iqtree -s /home/tremblayemi/nfcore/rawData/results/fungi-and-controls_v2/dada2/ASV_seqs_MSA.fasta -m GTR+G4
AICc  model      K            lnL           score         delta   weight
--------------------------------------------------------------------------------
   1  F81        3   -387996.7704   80282041.5408        0.0000   1.0000
   2  HKY        4   -386346.6207   80303963.2414    21921.7006   0.0000
   3  F81+I      4   -388005.0147   80307280.0294    25238.4886   0.0000
   4  HKY+I      5   -386351.7237   80329199.4474    47157.9066   0.0000
   5  GTR+G4     9   -360262.0260   80377964.0521    95922.5113   0.0000
   6  F81+G4     4   -431215.1159   80393700.2318   111658.6910   0.0000
   7  GTR+I+G4  10   -360286.8997   80403259.7993   121218.2585   0.0000
   8  GTR        8   -385567.0516   80403332.1032   121290.5624   0.0000
   9  HKY+G4     5   -429866.2668   80416228.5337   134186.9929   0.0000
  10  F81+I+G4   5   -431325.7187   80419147.4374   137105.8966   0.0000
Best model according to AICc
---------------------------
Model: F81
lnL: -387996.7704
Frequencies: 0.2702 0.2471 0.1822 0.3005
Subst. Rates: 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000
Inv. sites prop: -
Gamma shape: -
Score: 80282041.5408
Weight: 1.0000
---------------------------
Parameter importances
---------------------------
P.Inv: -
Gamma: -
Gamma-Inv: -
Frequencies: 1.0000
---------------------------
Model averaged estimates
---------------------------
P.Inv: -
Alpha: -
Alpha-P.Inv: -
P.Inv-Alpha: -
Frequencies: 0.2702 0.2471 0.1822 0.3005
Commands:
> phyml -i /home/tremblayemi/nfcore/rawData/results/fungi-and-controls_v2/dada2/ASV_seqs_MSA.fasta -m 000000 -f m -v 0 -a 0 -c 1 -o tlr
> raxmlHPC-SSE3 -s /home/tremblayemi/nfcore/rawData/results/fungi-and-controls_v2/dada2/ASV_seqs_MSA.fasta -c 1 -m GTRCATX --JC69 -n EXEC_>
> raxml-ng --msa /home/tremblayemi/nfcore/rawData/results/fungi-and-controls_v2/dada2/ASV_seqs_MSA.fasta --model F81
> paup -s /home/tremblayemi/nfcore/rawData/results/fungi-and-controls_v2/dada2/ASV_seqs_MSA.fasta
> iqtree -s /home/tremblayemi/nfcore/rawData/results/fungi-and-controls_v2/dada2/ASV_seqs_MSA.fasta -m F81
Done
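The AIC and BIC scores in the tables above can be cross-checked with a short calculation, which may also help explain why AICc ranks the models so differently. This is only a sketch: it assumes that the penalty counts the K model parameters plus one branch length per branch of an unrooted binary tree (2 * #taxa - 3), which the log does not state. The function names are hypothetical.

```python
import math

# Values copied from the modeltest-ng log above.
TAXA = 3152
SITES = 2283

# Assumption (not stated in the log): besides the K model parameters, the
# penalty also counts one branch length per branch of an unrooted binary tree.
BRANCH_LENGTHS = 2 * TAXA - 3


def free_params(model_k):
    """Total number of free parameters under the assumption above."""
    return model_k + BRANCH_LENGTHS


def aic(lnl, model_k):
    """Akaike information criterion: 2k - 2 lnL."""
    return 2 * free_params(model_k) - 2 * lnl


def bic(lnl, model_k):
    """Bayesian information criterion: k ln(n) - 2 lnL, with n = #sites."""
    return free_params(model_k) * math.log(SITES) - 2 * lnl


def aicc_denominator(model_k):
    """Denominator n - k - 1 of the textbook AICc correction 2k(k+1)/(n-k-1)."""
    return SITES - free_params(model_k) - 1


# (model, K, lnL) rows copied from the tables.
rows = [
    ("GTR+G4", 9, -360262.0260),
    ("GTR", 8, -385567.0516),
    ("F81", 3, -387996.7704),
]

for name, k, lnl in rows:
    print(f"{name:8s} AIC={aic(lnl, k):.4f} BIC={bic(lnl, k):.4f} "
          f"n-k-1={aicc_denominator(k)}")
```

Under this assumption the AIC and BIC columns are reproduced, and n - k - 1 is negative for every model because there are far more free parameters than sites, so the textbook AICc correction is not well defined here. The reported AICc scores equal AIC + 2k(k+1), which would correspond to a denominator of 1; whether modeltest-ng applies such a fallback is a guess, but it would explain why the model with the fewest parameters (F81) ranks first under AICc.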
Thanks