SNAPP XML Error 110: Could Not Identify Newick Tree Labels – Need Help


Emanuele Berrilli

Aug 17, 2025, 3:29:34 PM
to beast-users

Hi everyone,

I am generating an XML file for SNAPP using the snapp_prep.rb script and following the tutorial: https://github.com/ForBioPhylogenomics/tutorials/tree/main/divergence_time_estimation_with_snp_data. The only difference from the tutorial is that I am using the -s option to specify a starting tree that I already have in .nwk format (generated with RAxML). Everything seems to work fine, and the XML file is generated without issues.

However, when I run it in beast, I keep getting the following error:

Error 110 parsing the xml input file

validate and initialize error: Label 'CM-VM-01' in Newick beast.tree could not be identified. Perhaps taxa or taxonset is not specified?

Error detected about here:
  <beast>
      <run id='mcmc' spec='MCMC'>
          <state id='state'>
              <stateNode id='tree' spec='beast.base.evolution.tree.TreeParser'>

I have double-checked all the IDs, and they are all correct, so I believe the problem is with reading the tree itself. Unfortunately, I am not sure how to fix this, so I am asking for your help in trying to resolve the issue.

Thank you very much,
Emanuele

Omar Mejía

Aug 17, 2025, 10:43:43 PM
to beast...@googlegroups.com
Maybe the trouble is related to the use of - in the name; try using _ instead.

regards


--
Omar Mejía G
Laboratorio de Variación Biológica y Evolución
Departamento de Zoología
Escuela Nacional de Ciencias Biológicas-IPN

michaelm

Aug 18, 2025, 2:25:26 PM
to beast-users
Hi Emanuele,

did you compare the names in the XML itself – whether they are identical between the Newick string and the sequence IDs? Is there a sequence with the taxon ID "CM-VM-01"?
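
One way to make this comparison less error-prone is to script it. The sketch below is only an illustration and is not part of snapp_prep.rb or BEAST: it pulls the tip labels out of a Newick string with a simple regular expression and reports which XML elements, if any, carry an `id` equal to each label. The function names `tip_labels` and `where_declared` and the example XML are hypothetical, and the regular expression assumes unquoted labels without spaces.

```python
import re
import xml.etree.ElementTree as ET


def tip_labels(newick):
    """Return tip labels: names that directly follow '(' or ','.

    Internal node labels follow ')' and are therefore skipped.
    """
    return re.findall(r"[(,]\s*([^(),:;\s]+)", newick)


def where_declared(newick, xml_text):
    """Map each tip label to the tags of the XML elements whose id equals it."""
    declared = {}
    for element in ET.fromstring(xml_text).iter():
        element_id = element.get("id")
        if element_id is not None:
            declared.setdefault(element_id, []).append(element.tag)
    return {label: declared.get(label, []) for label in tip_labels(newick)}


# Hypothetical, heavily shortened XML; a real file has many more elements.
example_xml = """
<beast>
  <data id="snps">
    <taxonset id="speciesA"><taxon id="CM-VM-01"/></taxonset>
  </data>
</beast>
"""

for label, tags in where_declared("(CM-VM-01:1,speciesA:1,typo:1);", example_xml).items():
    print(label, tags if tags else "NOT DECLARED")
```

A label that maps to an empty list is not declared anywhere in the XML. A label that is declared shows, through the element tag, at which level of the file the tree is labeled.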

Angelica Pulido

Sep 3, 2025, 3:32:03 PM
to beast-users
Hello,
I got the same error when trying to run SNAPP. After generating the *.xml file with the snapp_prep.rb script from the tutorial, BEAST gives me the following error:

Error 110 parsing the xml input file

validate and intialize error: Label 'T.lon.lon.AU_F289_UNL_174002_P003_WA11_i5-512_i7-107_S2550' in Newick beast.tree could not be identified. Perhaps taxa or taxonset is not specified?

Error detected about here:

  <beast>

      <run id='mcmc' spec='MCMC'>

          <state id='state'>

              <stateNode id='tree' spec='beast.base.evolution.tree.TreeParser'>

I have checked whether the issue is just different names for the taxon in tree and taxon id or whether I have extra " or ' in all specified taxon names, but I can't find any name problem in the snapp.xml file. 

I noticed that this taxon id 'T.lon.lon.AU_F289_UNL_174002_P003_WA11_i5-512_i7-107_S2550' is the first taxon specified in the tree, and that if I just swap it with another taxon, without changing the structure of the tree, the error message then points to the new first taxon in the tree. So I don't think this is an issue with the labelling or the Newick tree.

I get the same error using snapper.

I would greatly appreciate any help.

best regards,

Angelica

Emanuele Berrilli

Sep 3, 2025, 3:32:03 PM
to beast-users

Dear Omar and Michael,

thank you very much for your help. I tried removing the dashes from all the IDs in the files, but unfortunately the error still persists. It really looks as if BEAST gets stuck exactly when it tries to read the nwk file.

I also double-checked the IDs and they are consistent throughout the XML file, so the error must be caused by something else. Unfortunately I haven’t been able to figure out what yet.

Best regards,
Emanuele

Emanuele Berrilli

Sep 9, 2025, 6:19:47 PM
to beast-users

Dear all,

I contacted Michael Matschiner, the editor of the tutorial: https://github.com/ForBioPhylogenomics/tutorials/tree/main/divergence_time_estimation_with_snp_data.
It turns out that the problem is related to the tree tips, which need to be labeled with the species ID rather than the voucher/sample ID. I will therefore re-run the analysis accordingly and hopefully this will resolve the issue.


beastgig

Sep 18, 2025, 2:08:33 PM
to beast-users
Hi. 

Were you able to run the analysis successfully after changing the tree tip labels to species names? I had the same Error 110 as you. Then I saw this post, but when I change the labels to species names I get another error, because all taxa have the same species name (see below). How should the tree tips be formatted? Thanks.

Error 110 parsing the xml input file

validate and intialize error: Duplicate taxon found

Emanuele Berrilli

Sep 18, 2025, 2:37:29 PM
to beast...@googlegroups.com

Hi,

you need a tree in which each species is represented only once. So, you can include multiple individuals assigned to the same species in your XML file, but the newick tree you use must contain only a single sequence per species.
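
As a rough illustration of that reduction (a sketch, not the tutorial's method), the Python below parses a sample-level Newick tree, keeps the first tip it encounters for each species, relabels it with the species ID, and merges the branch lengths of nodes left with a single child. The names `parse_newick`, `prune`, `to_newick`, `one_tip_per_species`, and the sample-to-species mapping are hypothetical, the choice of which sample to keep is arbitrary, and the parser handles only plain unquoted Newick.

```python
import re

PUNCT = {"(", ")", ",", ":", ";"}


def parse_newick(text):
    """Parse plain Newick into nested dicts with name, length, children."""
    tokens = re.findall(r"[(),:;]|[^(),:;\s]+", text)
    pos = 0

    def node():
        nonlocal pos
        children = []
        if tokens[pos] == "(":
            pos += 1  # skip "("
            children.append(node())
            while tokens[pos] == ",":
                pos += 1
                children.append(node())
            pos += 1  # skip ")"
        name = ""
        if pos < len(tokens) and tokens[pos] not in PUNCT:
            name = tokens[pos]
            pos += 1
        length = None
        if pos < len(tokens) and tokens[pos] == ":":
            length = float(tokens[pos + 1])
            pos += 2
        return {"name": name, "length": length, "children": children}

    return node()


def prune(node, sample_to_species, seen):
    """Return a copy keeping the first tip per species, or None if nothing is left."""
    if not node["children"]:
        species = sample_to_species[node["name"]]  # KeyError if a tip is unmapped
        if species in seen:
            return None
        seen.add(species)
        return {"name": species, "length": node["length"], "children": []}
    kept = [child for child in
            (prune(c, sample_to_species, seen) for c in node["children"])
            if child is not None]
    if not kept:
        return None
    if len(kept) == 1:
        # Node with a single remaining child: drop it and merge branch lengths.
        only = kept[0]
        if only["length"] is not None and node["length"] is not None:
            only["length"] += node["length"]
        return only
    return {"name": node["name"], "length": node["length"], "children": kept}


def to_newick(node):
    """Serialize the nested dicts back to Newick (without the final ';')."""
    text = ""
    if node["children"]:
        text = "(" + ",".join(to_newick(c) for c in node["children"]) + ")"
    text += node["name"]
    if node["length"] is not None:
        text += ":" + format(node["length"], "g")
    return text


def one_tip_per_species(newick, sample_to_species):
    root = prune(parse_newick(newick), sample_to_species, set())
    root["length"] = None  # the root carries no branch
    return to_newick(root) + ";"


# Hypothetical sample IDs and species assignments.
mapping = {"CM-VM-01": "speciesA", "CM-VM-02": "speciesA",
           "XY-01": "speciesB", "ZZ-01": "speciesC"}
print(one_tip_per_species(
    "((CM-VM-01:1,CM-VM-02:1):1,(XY-01:1.5,ZZ-01:1.5):0.5);", mapping))
# prints (speciesA:2,(speciesB:1.5,speciesC:1.5):0.5);
```

Whether the pruned tree is suitable as a starting tree still depends on the other requirements of the analysis, so it is worth checking the result against the tutorial before running BEAST.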

Hope that helps

Ema




--
Emanuele Berrilli, Phd 
PostDoc at Department of Life, Health and Environmental Sciences
University of L'Aquila
Via Vetoio, 40
67100 Coppito AQ

beastgig

Sep 19, 2025, 11:00:27 PM
to beast-users
Hi Emanuele,

Thanks for the response. I am actually working with SNP data, so there are no sequences as such... Do you have a suggestion for how to format the tree for SNPs? Or do you have an example Newick tree you can share with me? I have been trying to run BEAST for a long time but keep running into errors :(