Help with TreeAnnotator


Gabriela Cruz Luna

Nov 13, 2022, 1:34:07 PM
to beast-users
Hi everyone,
I'm running some divergence time analyses with BEAST and got some great results with my statistics.
Now I'm trying to get my trees using TreeAnnotator, but there are two things I really don't understand. As you can see in the image, I don't know what 'Burnin percentage' and 'Posterior probability limit' mean and, of course, I don't know what values I have to enter there.
I would be very thankful if anyone could help me and explain what these mean.
Thanks in advance.
In case anyone is wondering, I'm using BEAST version 2.6.6. If you need more information about my analysis, feel free to ask for it.
[Attached screenshot: Captura de pantalla 2022-11-02 a la(s) 8.07.22 a. m..png]

To view the privacy notice click here, and for the use of personal data click here

Alexei Drummond

Nov 13, 2022, 11:36:33 PM
to beast...@googlegroups.com
Burnin percentage is the percentage of trees at the start of the chain that you want to discard as unrepresentative of the posterior. 10% is the default in Tracer. You should choose this number based on visual inspection of the trace in Tracer.
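
To make the burnin idea concrete, here is a minimal Python sketch of discarding a leading percentage of sampled trees. It is not TreeAnnotator's code: the function name `discard_burnin`, the rounding-down rule, and the 10,000-sample chain are assumptions made for illustration only.

```python
def discard_burnin(samples, burnin_percentage):
    """Return the samples that remain after dropping the leading burnin portion."""
    if not 0 <= burnin_percentage < 100:
        raise ValueError("burnin_percentage must be in [0, 100)")
    # Number of leading samples to discard. Rounding down is an assumption;
    # the exact rounding used by TreeAnnotator may differ.
    n_discard = int(len(samples) * burnin_percentage / 100)
    return samples[n_discard:]


# Hypothetical chain: 10,000 sampled trees, represented here by their indices.
sampled_trees = list(range(10_000))
kept = discard_burnin(sampled_trees, 10)
print(len(kept), kept[0])
```

With a 10% burnin, the first tenth of the samples is dropped and only the remainder would be summarised.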

You can leave posterior probability limit at 0.0. If I remember correctly, setting this to a positive number would mean that nodes with less than that probability would not be summarised.
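
As a rough illustration of what a posterior probability limit does, the Python sketch below computes how often each clade appears across a set of sampled trees and keeps only clades at or above the limit. Representing a tree as a set of clades, and the names `clade_posteriors` and `clades_to_summarise`, are assumptions for this sketch; TreeAnnotator's internal handling is not shown in this thread.

```python
from collections import Counter


def clade_posteriors(trees):
    """Fraction of sampled trees that contain each clade.

    Each tree is given as a set of clades, and each clade is a frozenset
    of taxon names (a simplified stand-in for a real tree structure).
    """
    counts = Counter(clade for tree in trees for clade in tree)
    return {clade: n / len(trees) for clade, n in counts.items()}


def clades_to_summarise(trees, limit=0.0):
    """Keep only clades whose posterior probability is at least `limit`."""
    return {c: p for c, p in clade_posteriors(trees).items() if p >= limit}


# Hypothetical sample of four trees on taxa A, B, C.
trees = [
    {frozenset("AB"), frozenset("ABC")},
    {frozenset("AB"), frozenset("ABC")},
    {frozenset("AB"), frozenset("ABC")},
    {frozenset("BC"), frozenset("ABC")},
]

for clade, p in sorted(clades_to_summarise(trees, limit=0.5).items(), key=lambda kv: kv[1]):
    print(sorted(clade), p)
```

With the limit left at 0.0 every clade passes the filter, which matches the advice to leave the setting at its default.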

Cheers
Alexei


Omar Idris

Nov 14, 2022, 1:47:32 PM
to beast...@googlegroups.com
If you do not mind, I have a related question. I ran BEAST for 500 million MCMC steps; I know I changed settings in BEAUti to make the run longer. I think I put 10 or 13 on the uniform prior, and for the mutation rate I used an estimate from the literature for 16S rRNA, something on the order of e-9, or the average vertebrate mitochondrial mutation rate. I am not sure what ucld mean does, and I do not recall changing it. Anyway, the posterior support is 88 and, for most nodes, 99 for the major African frog family Hyperoliidae. When I check the branch lengths, they are not fractions or small numbers; they are in the millions for tip branches, or one or two levels down. I am not specifically interested in the numbers right now, since the topology support is great, but I wondered whether the branch length calculations may have changed the relationships or topology. 2056789 is one example of a branch length. When I divide it by the rate, the rate is a small, reasonable value on the order of e-7. I know that branch length can be time and rate combined if a relaxed clock is used, which is my case. As I was using my personal computer, it took me 10 days to complete the run for 1300 taxa and 270 nucleotides. I similarly ran MrBayes for 30 million MCMC steps, but the support is low for deeper branches, just FYI.

All ESS are extremely high. 
Do I need to rerun it, given that I am not interested in branch lengths or age estimation right now? What did I miss?

--
ONI

Alexei Drummond

Nov 15, 2022, 3:13:24 PM
to beast...@googlegroups.com
Hi,

If you don’t care about the absolute divergence times then: 

(1) you should fix the mutation rate to 1.0 so that branch lengths are in units of substitutions per site and
(2) you should not add any tmrca priors on divergence times.
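
Point (1) rests on the relation that expected substitutions per site equal clock rate multiplied by time. A small Python sketch, assuming a single constant rate along the branch and using made-up numbers (the function name is hypothetical too):

```python
def expected_substitutions_per_site(branch_time, clock_rate):
    """Expected substitutions per site along a branch: rate multiplied by time."""
    return branch_time * clock_rate


# With the rate fixed to 1.0, the branch length in "time" units is already
# the branch length in substitutions per site.
print(expected_substitutions_per_site(0.05, 1.0))

# With a calibrated rate (hypothetical: 1.7e-7 substitutions per site per year),
# a branch of 2,000,000 years corresponds to a much smaller genetic distance.
print(expected_substitutions_per_site(2.0e6, 1.7e-7))
```

This is why branch lengths in the millions are not alarming by themselves when the tree is scaled in time units: they shrink to ordinary genetic distances once multiplied by a small rate.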

What did you put the uniform prior on? A uniform prior is often a bad idea for a prior since it says you have absolute knowledge, which is almost never the case.

Short answer: yes you might distort the topology estimation if you have bad/incompatible priors on rates and dates.

Cheers
Alexei

Omar Idris

Nov 15, 2022, 6:41:21 PM
to beast...@googlegroups.com
Hi all,
Thank you.
I would like to see absolute times just for my own satisfaction, as my sequence length is 278 with 274 patterns, most of them informative.
I looked at my prior for the Yule tree, for which I set the upper limit to 13, following suggestions from discussions here. Then for the clock rate, on ucld mean, I set 1.7e-7, from the 16S rRNA clock rate published in the frog literature. I have several genera but not many, 10 ingroup and 8 outgroup, represented by many species, close to 1363, with several frog outgroups. I am worried about my branch lengths affecting the topology, although I agree with the placement of most taxa. My MCMC chain is very long, 500000000 steps, sampling every 50000, run on my personal 8-thread/core HP Envy from 2012.
Repeating this would take me another 10 days. Anyway, the point is: if I want to do it again, what number should I set it to? I mean, how do I input or incorporate this fixed clock rate? I suppose I also need to learn whether this value (the rate I indicated above) stands for per year or per generation?
Also important: I discovered that even the outgroups are now nested inside the ingroup and the root is not on the outgroups. After checking the alignment, it seems that the sequences input after trimming with BMGE on the NGPhylogeny.fr website may be highly conserved, so even though the genera are diverse, this short segment could possibly be assumed to follow a strict clock, and a relaxed clock may not be appropriate?
So if I decide to rerun it,
then:
1. set clock to strict
2. uniform from 0 to 13?
3. clock rate ucld to ? based on above fixed value?
I appreciate your help!

--
ONI
image_6487327.JPG

Alexei Drummond

Nov 15, 2022, 9:11:30 PM
to beast...@googlegroups.com
If you set an informative prior on the root age then you will already get a substitution rate estimate and absolute divergence times, since rate is just genetic distance / time. The genetic distance comes from your data and the time comes from your prior on the root age. So you should either set a prior on the rate or set an informative prior on the root age. If you do both, then there is a risk that one piece of calibration information is incompatible with the other. My general advice is to start simple and add model complexity one step at a time. If you want to see whether the two sources of calibration information (root age prior and rate) are consistent with each other then you can easily run one analysis with only the rate set and another analysis with a root age prior but the rate estimated. Then you can see if they give the same results or not.
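
The relation rate = genetic distance / time can be turned into a quick consistency check between the two sources of calibration. The Python sketch below is only a back-of-the-envelope illustration, not a substitute for running the two analyses: the function names, the root-to-tip distance, the root age interval and the rate intervals are all hypothetical.

```python
def implied_rate(genetic_distance, root_age):
    """Substitution rate implied by a genetic distance and a root age."""
    return genetic_distance / root_age


def calibrations_compatible(genetic_distance, root_age_interval, rate_interval):
    """Check whether a root age interval and a rate interval can both hold.

    The root age interval implies a range of rates (distance / age); the two
    calibrations are compatible if that range overlaps the rate interval.
    """
    youngest, oldest = root_age_interval
    implied_low = implied_rate(genetic_distance, oldest)
    implied_high = implied_rate(genetic_distance, youngest)
    rate_low, rate_high = rate_interval
    return implied_low <= rate_high and rate_low <= implied_high


# Hypothetical numbers: root-to-tip distance of 0.5 substitutions per site and
# a root age between 40 and 60 million years, with rates per site per million years.
distance = 0.5
root_age = (40.0, 60.0)

print(implied_rate(distance, 50.0))
print(calibrations_compatible(distance, root_age, (0.008, 0.012)))
print(calibrations_compatible(distance, root_age, (0.1, 0.2)))
```

When the check fails, as with the second rate interval here, the root age prior and the rate prior pull the analysis in different directions, which is the incompatibility described above.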

Cheers
Alexei

Omar Idris

Nov 15, 2022, 10:45:31 PM
to beast...@googlegroups.com
Thank you for your email Alexei.
Omar

--
ONI