Question about beast :)

300 views
Skip to first unread message

Cécile Emeraud

unread,
Aug 23, 2022, 2:06:56 PM8/23/22
to beast-users
Hello every one, 

I am a new user of Beast and I have some problems using it...

I would like to study the evolution rate of a population of carbapenemase producing Klebsiella pneumoniae ST147 (n=400). I saw on several publications that they all used the GTR model with these parameters : lognormal relaxed clock, constant size polulation, 10^9 lengh, sampling every 5x10^3.

My questions are :

- which king of alignement do I have to use? Core genome alignement (very heavy) or SNPs alignment?

- how can i know which model is the best?

- using the SNP alignment and the previously cited model I have the following results: Meanrate = 0.162. does that mean that I have 0.162 substitutions per year and per site? it's huge given that a genome is 6 MpB...

Many thanks +++++


Cécile

Pavel Rinkman

unread,
Aug 23, 2022, 4:13:47 PM8/23/22
to beast-users
Hello Cécile,

You can compare different models via Bayesian factor calculation (there are a lot of methods: path sampling, stepping-stone sampling, nested sampling). You can check taming the beast for tutorials.
Before the rate calculation you should verify the presence of temporal signal in your data with TempEst & marginal likelihood comparison. Although even its presence does not guarantee a precise estimation of the rate.
This is hard enough to say anything about your results. They look improbable but I am blind to advise anything without at least glimpsing at logs, model, & those articles.

Best,
Pavel
вторник, 23 августа 2022 г. в 21:06:56 UTC+3, cecile.e...@gmail.com:

Farman, Mark L.

unread,
Aug 23, 2022, 5:16:52 PM8/23/22
to beast...@googlegroups.com
Cecile,

My guess is that you are only using variant sites for your input data, as this potentially could explain the unexpectedly high substitution rate. If this is the case, you should add a constantSitesWeights parameter to your .xml file:


Mark



On Aug 23, 2022, at 4:13 PM, Pavel Rinkman <pavel....@gmail.com> wrote:

CAUTION: External Sender

--
You received this message because you are subscribed to the Google Groups "beast-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to beast-users...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/beast-users/133d804f-3e7e-42e8-bf51-c858f56e7798n%40googlegroups.com.



Mark L. Farman 
Professor, Department of Plant Pathology
225 Plant Science Building
1405 Veteran's Dr.
University of Kentucky
Lexington, KY 40546 USA 
tel:  (859) 218-0728 
fax:  (859) 323-1961

Cécile Emeraud

unread,
Aug 24, 2022, 3:33:33 PM8/24/22
to beast-users

Thanks a lot for your answers!

@Pavel Rinkman:  I will use tempest to evaluate the presence of a temporal signal and calculate Bayes factors to choose my model, is there a cut-off of the R2 to say that it is ok?

@Mark: I think you're right, it's this only variant sites that makes my substitution rate so high. I tried to modify my xml file but it is doesn't work.. I don't know if my fasta file have a problem. I have this error message (picture), could you please tell where the problem comes from?pastedImage.png

Thanks+++

Cécile

Farman, Mark L.

unread,
Aug 24, 2022, 10:48:45 PM8/24/22
to beast...@googlegroups.com
Let me rewrite the constantSitesWeight instructions on the beast site for you. It’s not exactly clear from that documents all the changes that need to be made. Will respond again soon. 

Mark L. Farman 
Professor, Department of Plant Pathology
Sent from my iPhone

On Aug 24, 2022, at 3:35 PM, Cécile Emeraud <cecile.e...@gmail.com> wrote:


CAUTION: External Sender

Pavel Rinkman

unread,
Aug 25, 2022, 3:31:45 AM8/25/22
to beast-users
There is a cut-off only for Bayes factor. TempEst is used mostly for subjective view.

среда, 24 августа 2022 г. в 22:33:33 UTC+3, cecile.e...@gmail.com:

Cécile Emeraud

unread,
Aug 27, 2022, 2:12:55 PM8/27/22
to beast-users

Thank you very much, I remain attentive to your messages :)

Farman, Mark L.

unread,
Aug 28, 2022, 9:13:31 PM8/28/22
to beast...@googlegroups.com
Cecile,

Follow the instructions at this link (https://groups.google.com/g/beast-users/c/QfBHMOqImFE) and I have included an annotated snippet of a file that shows where to make the necessary changes.

Good luck,

Mark



On Aug 26, 2022, at 8:18 AM, Cécile Emeraud <cecile.e...@gmail.com> wrote:

CAUTION: External Sender


Thank you very much, I remain attentive to your messages :)

Le jeudi 25 août 2022 à 09:31:45 UTC+2, Pavel Rinkman a écrit :
There is a cut-off only for Bayes factor. TempEst is used mostly for subjective view.

среда, 24 августа 2022 г. в 22:33:33 UTC+3, cecile.e...@gmail.com:

Thanks a lot for your answers!

@Pavel Rinkman:  I will use tempest to evaluate the presence of a temporal signal and calculate Bayes factors to choose my model, is there a cut-off of the R2 to say that it is ok?

@Mark: I think you're right, it's this only variant sites that makes my substitution rate so high. I tried to modify my xml file but it is doesn't work.. I don't know if my fasta file have a problem. I have this error message (picture), could you please tell where the problem comes from?Thanks+++

Cécile Emeraud

unread,
Aug 29, 2022, 2:26:36 PM8/29/22
to beast-users

Dear Mark!

thanks again! 
 I modified my xml file as you told me, can you verify that this is correct? Once this file is modified, I can directly put it in beast or I still have to do the manipulation with "b2constsite"? 

Cécile
Capture d’écran 2022-08-29 à 15.45.16.png

Farman, Mark L.

unread,
Aug 29, 2022, 2:33:37 PM8/29/22
to beast...@googlegroups.com
Cecile,

This looks good, Beast should now read this and make adjustments for the constant sites. 

Mark L. Farman 
Professor, Department of Plant Pathology
Sent from my iPhone

CAUTION: External Sender

Cécile Emeraud

unread,
Aug 30, 2022, 8:39:06 PM8/30/22
to beast-users

Hello, it works! 

I get substitution rates close to what is found in the literature for klebsiella with exactly the same model used (around 2.10^-6) but with a very wide 95% HPD '1.10^-7-1;10^- 9) whereas in the literature it is very low (1.1.10-6 - 2.10^-6) do you have an explanation for these results?
 I guess I must be close to the results found in several papers on Klebsiella 
 Thanks! 
 Cecile

Farman, Mark L.

unread,
Aug 30, 2022, 9:26:30 PM8/30/22
to beast...@googlegroups.com
Cecile,

Glad to hear that you are getting reasonable results.  I’m not sure the HPD range you stated is correct. However, if it is wider than in other studies, I’m guessing this is your first run and you explored a lot of parameter space using a fairly small MCMC chain. In this case, your chain may not have converged early enough and this will also show up with marginal ESS values. After arriving at an estimated rate of 2.1E-6, for your current run, you can now set this as a prior and limit the space explored by setting upper and lower rate bounds. Of course you will want to check the mixing in the chain when you do this, to make sure that you are not constraining the bounds by too much.

Good luck,

Mark




CAUTION: External Sender

Cécile Emeraud

unread,
Sep 1, 2022, 2:14:51 PM9/1/22
to beast-users
Thanks, 

I increased the length of the MCM chain and I have these results attached (photo). 
the interval of 95% HPD values is much smaller and corresponds more to what I saw, however I can't find the age of the common ancestor and the 95% HPD on Tarcer... and my file .tree is too heavy for TreeAnntator, maybe I missed a step..

Cécile
Capture d’écran 2022-09-01 à 11.26.26.png

Farman, Mark L.

unread,
Sep 1, 2022, 4:04:51 PM9/1/22
to beast...@googlegroups.com
Cecile,

Run this command on your tree file, then run treeAnnotator on the output file:

awk 'NR % 10 == 0' tree_file.trees > every10thTree.trees

This code will sample every tenth tree from your tree file.

Mark

p.s. how often did you sample from the chain?


On Sep 1, 2022, at 5:33 AM, Cécile Emeraud <cecile.e...@gmail.com> wrote:

CAUTION: External Sender

Thanks, 

I increased the length of the MCM chain and I have these results attached (photo). 
the interval of 95% HPD values is much smaller and corresponds more to what I saw, however I can't find the age of the common ancestor and the 95% HPD on Tarcer... and my file .tree is too heavy for TreeAnntator, maybe I missed a step..

Cécile

Pavel Rinkman

unread,
Sep 2, 2022, 3:17:57 AM9/2/22
to beast-users

By the way, you can use logcombiner for changing the trees frequency logcombiner -trees -resample required_frequency original.trees sparse.trees. The age may be calculated as the height from the most recent sample age.

четверг, 1 сентября 2022 г. в 23:04:51 UTC+3, mark....@uky.edu:

Cécile Emeraud

unread,
Sep 7, 2022, 12:20:23 PM9/7/22
to beast-users
Thanks you!

Sorry but I don't understand this question .. : p.s. how often did you sample from the chain?

Farman, Mark L.

unread,
Sep 7, 2022, 1:19:22 PM9/7/22
to beast...@googlegroups.com
Cecile,

How often did you log data, every 1,000th iteration, every 10,000…? If you increase your MCM chain length by 10, you may want to sample 10-times less often. 

Mark L. Farman 
Professor, Department of Plant Pathology

CAUTION: External Sender

Cécile Emeraud

unread,
Sep 13, 2022, 3:03:35 PM9/13/22
to beast-users
Mark,

Markov chain Monte Carlo length of 1×109, sampling every 5×10^3 steps. 

Furthermore, my range of HPD 95 is still quite large compared to the publications I have seen, do you know how to reduce it? 

Also, I don't know how to add my constant sites for Tempest because it must cause the same problems as for Beast, any idea?
Have a nice day

Cécile

Cécile Emeraud

unread,
Sep 19, 2022, 2:04:08 PM9/19/22
to beast-users
Hi everyone, 


I have a question regarding the analysis of results on Tracer. 
I compared the mutation rates of two klebsiella populations with the same conditions and I get the attached results (see photo).
 Is it possible to look if these two results are statistically different on Tracers? 

 Thanks a lot

CécileCapture d’écran 2022-09-19 à 11.27.51.png

Reply all
Reply to author
Forward
0 new messages