Hi everyone,
I am generating an XML file for SNAPP using the snapp_prep.rb script and following the tutorial: https://github.com/ForBioPhylogenomics/tutorials/tree/main/divergence_time_estimation_with_snp_data. The only difference from the tutorial is that I am using the -s option to specify a starting tree that I already have in .nwk format (generated with RAxML). Everything seems to work fine, and the XML file is generated without issues.
However, when I run it in BEAST, I keep getting the following error:
I have double-checked all the IDs, and they are all correct, so I believe the problem lies in reading the tree itself. I am not sure how to fix this, so I would be grateful for any help in resolving the issue.
Thank you very much,
Emanuele
--
You received this message because you are subscribed to the Google Groups "beast-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to beast-users...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/beast-users/a81f39f0-020c-43e2-8751-3cc7fc00682en%40googlegroups.com.
Error 110 parsing the xml input file
validate and intialize error: Label 'T.lon.lon.AU_F289_UNL_174002_P003_WA11_i5-512_i7-107_S2550' in Newick beast.tree could not be identified. Perhaps taxa or taxonset is not specified?
Error detected about here:
<beast>
<run id='mcmc' spec='MCMC'>
<state id='state'>
<stateNode id='tree' spec='beast.base.evolution.tree.TreeParser'>
I have checked whether the taxon names in the tree differ from the taxon IDs, and whether any of the specified taxon names contain stray " or ' characters, but I can't find any naming problem in the snapp.xml file.
I noticed that the taxon ID 'T.lon.lon.AU_F289_UNL_174002_P003_WA11_i5-512_i7-107_S2550' is the first taxon specified in the tree, and that if I swap it with another taxon without changing the structure of the tree, the error message points to whichever taxon is now first in the tree. So I don't think this is an issue with the labelling or with the Newick tree.
I get the same error using snapper.
I would greatly appreciate any help.
Best regards,
Angelica
Dear Omar and Michael,
thank you very much for your help. I tried removing the dashes from all the IDs in the files, but unfortunately the error still persists. It really looks as if BEAST gets stuck exactly when it tries to read the nwk file.
I also double-checked the IDs and they are consistent throughout the XML file, so the error must be caused by something else. Unfortunately I haven’t been able to figure out what yet.
Best regards,
Emanuele
Dear all,
I contacted Michael Matschiner, the editor of the tutorial: https://github.com/ForBioPhylogenomics/tutorials/tree/main/divergence_time_estimation_with_snp_data.
It turns out that the problem is related to the tree tips, which need to be labeled with the species ID rather than the voucher/sample ID. I will therefore re-run the analysis accordingly and hopefully this will resolve the issue.
Hi,
you need a tree in which each species is represented only once. So, you can include multiple individuals assigned to the same species in your XML file, but the Newick tree you use must contain only a single tip per species.
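In case it is useful, here is a minimal Python sketch for checking a starting tree before passing it to snapp_prep.rb with -s. It is not part of snapp_prep.rb or BEAST; the function names and the specimen-to-species dictionary are hypothetical, and it assumes plain, unquoted Newick labels. It reports tips that are specimen IDs rather than species IDs, species that occur more than once, and species that are missing from the tree.

```python
import re
from collections import Counter


def tip_labels(newick):
    """Return the tip labels of a Newick string.

    Assumption: labels are unquoted and contain no whitespace.
    A tip label directly follows '(' or ','; internal node labels
    follow ')' and are therefore not matched.
    """
    return re.findall(r"[(,]\s*([^():,;\s]+)", newick)


def check_starting_tree(newick, specimen_to_species):
    """Compare tree tips against a (hypothetical) specimen -> species mapping."""
    counts = Counter(tip_labels(newick))
    species = set(specimen_to_species.values())
    return {
        # Tips labelled with a sample/voucher ID instead of a species ID.
        "specimen_ids_as_tips": sorted(
            t for t in counts if t in specimen_to_species and t not in species
        ),
        # Tips that match neither a species ID nor a specimen ID.
        "unknown_tips": sorted(
            t for t in counts if t not in species and t not in specimen_to_species
        ),
        # Labels that occur more than once in the tree.
        "duplicated_tips": sorted(t for t, n in counts.items() if n > 1),
        # Species from the mapping that have no tip in the tree.
        "missing_species": sorted(species - set(counts)),
    }


if __name__ == "__main__":
    # Toy data, purely illustrative.
    mapping = {"ind1": "spA", "ind2": "spA", "ind3": "spB"}
    print(check_starting_tree("((ind1:1,ind2:1):1,ind3:2);", mapping))
    print(check_starting_tree("(spA:1,spB:1);", mapping))
```

If all four lists come back empty, the tree has exactly one tip per species and the tips carry species IDs. A tree built from individual samples (for example straight from RAxML) would show up under "specimen_ids_as_tips" and would need to be reduced to one tip per species first.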
Hope that helps
Ema