ESS value is low


Badr

unread,
Oct 17, 2018, 2:54:11 PM
to beast-users
Dear all,

I'm trying to build a phylogenetic tree. My dataset consists of 226 full genomes, each around 30,000 bp, in a single partition.


 I used the following parameters:

Sites section: GTR + Gamma + Invariant Sites
Substitution Model: GTR + Gamma + Invariant Sites
Clock Type: Lognormal uncorrelated relaxed clock
Tree Prior: GMRF Bayesian SkyGrid
Length of chain: 12,000,000
Echo state to screen every: 35,000
Log parameters every: 35,000


I got ESS values below 200, specifically for the posterior, prior, and likelihood. I also tried different chain lengths, starting from 50,000,000, but without improvement.

Any suggestion would be appreciated.

Thanks,
Badr

Artem B

unread,
Oct 18, 2018, 1:30:29 AM
to beast-users
Dear Badr,

Logging every 35,000 states of a 12,000,000-state chain gives you only about 340 trees, which is a really small sample, especially for your HUGE dataset and difficult model (30,000 x 226).

As a rule of thumb, about 10,000 sampled trees is the minimum for getting independent and reliable results, and that is why your ESS values are so low.

I advise you to reduce the number of full genomes if possible and to set the following MCMC parameters (a quick sample-count check is sketched below):

Length: 300,000,000
Log every: 6,000 states (50,000 total trees)
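
As a quick back-of-the-envelope check, here is a minimal Python sketch (the numbers are the ones quoted in this thread) of how many samples each setting actually writes to the log and tree files:

    def n_samples(chain_length: int, log_every: int) -> int:
        # Number of states actually written to the log/tree files.
        return chain_length // log_every

    print(n_samples(12_000_000, 35_000))    # original run: ~343 samples
    print(n_samples(300_000_000, 6_000))    # suggested run: 50,000 samples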

If the ESS is still low (e.g., around 100), increase your chain length twofold.

Which version of BEAST do you use? In my experience, BEAST 1.x cannot handle such a huge and difficult dataset at any chain length.

Artem B

unread,
Oct 18, 2018, 1:32:07 AM
to beast-users
If the ESS is still low (e.g., around 100), increase your chain length twofold

and keep the total number of trees at 50,000 (log every 12,000 states).

Badr

unread,
Oct 18, 2018, 3:58:59 AM
to beast-users
Dear Artem,

Many thanks for your reply. I may be able to reduce the dataset somewhat, but I can't reduce the number of full genomes.

I use BEAST 1.8.4 for this analysis.

Cheers,
Badr

Artem B

unread,
Oct 19, 2018, 2:03:57 AM
to beast-users
I use BEAST 1.8.4 for this analysis.

Dear Badr, 
 
I also used that version on a similarly sized dataset (150 full genomes, 10,242 nt each) with a similar model (GTR + UCLD + BSP),
and BEAST 1.8.4 couldn't get the prior/posterior trace to converge because of the rootHeight parameter: the trace kept falling to improbable values.
BEAST 2.5, however, could. And 2.5 is much faster than 1.x.

Good luck with your research!

Badr

unread,
Oct 21, 2018, 3:23:25 PM
to beast-users
Your comments are much appreciated. I will upgrade BEAST.

Andrew Rambaut

unread,
Oct 22, 2018, 2:24:01 AM
to beast...@googlegroups.com
Dear Artem,

I am surprised by this. Both programs use exactly the same underlying machinery. I regularly run much larger data sets than this with no issue. I would be happy to investigate this further. Perhaps you would be able to email me the problematic data set.

I would also strongly recommend using the Skyride or Skygrid models over BSP to give you better mixing overall.

On 19 Oct 2018, at 07:03, Artem B <ui.ar...@gmail.com> wrote:

I also used that version on a similarly sized dataset (150 full genomes, 10,242 nt each) with a similar model (GTR + UCLD + BSP),
and BEAST 1.8.4 couldn't get the prior/posterior trace to converge because of the rootHeight parameter: the trace kept falling to improbable values.
BEAST 2.5, however, could.



I am also curious why you think this is the case:

And 2.5 is much faster than 1.x.

Do you have some benchmarks? There is no reason why 1.x should be slower (again, it is essentially the same algorithm), so I would like to see if there is a bug causing this.

Best,
Andrew

Artem B

unread,
Oct 22, 2018, 3:19:06 AM
to beast-users
Dear Andrew,

I couldn't find your e-mail address; is it this one? a.rambaut@ed.ac.uk

About the benchmark: I used Ubuntu 16.04 and ran another, smaller dataset (45 sequences, 1,308 nt) in both BEAST 1.8.4 and 2.5.0 (+ BEAGLE 3.0).
In 1.8.4 I got roughly 30 minutes per million states; in 2.5.0 I got 11 minutes per million. Unfortunately that was two weeks ago and I may be remembering inaccurately. I'll try to run the benchmark again next week and send you the results.

And about BSP:
I chose BSP over other models based on PS/SS methods, and the ΔML is >10 for all my data. However, if the Skyride gives the same results, I'll use it instead.

I also have other bugs related to the tree prior and to using the GTR model with BSP; the small dataset is still the same (45 sequences, 1,308 nt).
I hope I found your e-mail address correctly; I will write more details to that address.



Andrew Rambaut

unread,
Oct 22, 2018, 4:30:30 AM
to beast...@googlegroups.com
Thanks,

Yes, that is my address. There really shouldn't be a 3-fold difference in speed. Are you sure 1.8.4 was using BEAGLE? I suggest you use the 'time' command on Linux and look at the total CPU time, which avoids including slowdowns due to other processes running, etc. Also, if you want to benchmark, it would be good to include 1.10.2.
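
If you prefer to script that measurement, here is a minimal sketch in Python (Linux only; the 'beast' executable on the PATH and the 'benchmark.xml' input file are placeholders for your own setup) that reports the same user + system CPU time that the 'time' command would:

    import resource
    import subprocess
    import time

    xml_file = "benchmark.xml"   # placeholder: your own BEAST input file

    wall_start = time.perf_counter()
    subprocess.run(["beast", xml_file], check=True)   # assumes 'beast' is on the PATH
    wall_clock = time.perf_counter() - wall_start

    # CPU time accumulated by child processes (i.e. the BEAST run itself),
    # equivalent to the user + sys figures reported by the 'time' command.
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    cpu_time = usage.ru_utime + usage.ru_stime

    print(f"wall-clock time: {wall_clock / 60:.1f} min")
    print(f"total CPU time:  {cpu_time / 60:.1f} min")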

Andrew



Gilles Vergnaud

unread,
Oct 22, 2018, 9:04:30 AM
to beast...@googlegroups.com
Hi,

About the speed difference/benchmarking of BEAST vs BEAST2: in our modest experience with BEAST 1.10.2 and BEAST2 2.5, we also see a 5- to 8-fold speed difference in favor of BEAST2.

We started using BEAST 1.10 with BEAGLE under Windows 10 or Windows Server 2016 a few weeks ago, with a dataset of 250 taxa and 40,000 concatenated SNPs (GTR + Gamma, lognormal relaxed clock, Bayesian skyline).
Then we tested BEAST2 on the same machines with the same dataset.

Gilles

Andrew Rambaut

unread,
Oct 27, 2018, 7:59:27 PM
to beast...@googlegroups.com
There was an issue with BEAST 1.10.2 running with BEAGLE 3 which was slowing it down (when not using GPUs). This issue was not present in BEAST2 or when using BEAGLE 2. 

We have just released v1.10.3 which fixes this specific issue. 

One other thing is that BEAST 1 and 2 use different sets of transition kernels (operators) and priors by default, so caution should be used when comparing two runs simply by the runtime for a set number of states (because those steps could be radically different). Some operators have a large computational cost, others a small one. What matters is the amount of statistical information that is sampled.

A much better metric is ESS/hour (for a range of parameters). This lets you assess the sampling efficiency.
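
For anyone who wants to compute that metric themselves, here is a rough Python sketch (a simple autocorrelation-based ESS estimate, not Tracer's exact algorithm; the log file name, column name, and runtime value are placeholders for your own run):

    import numpy as np

    def ess(samples) -> float:
        # Crude autocorrelation-based ESS estimate (not Tracer's exact algorithm).
        x = np.asarray(samples, dtype=float)
        x = x - x.mean()
        n = len(x)
        var = np.dot(x, x) / n
        if var == 0.0:
            return float(n)
        tau = 1.0                     # integrated autocorrelation time
        for lag in range(1, n):
            rho = np.dot(x[:-lag], x[lag:]) / ((n - lag) * var)
            if rho < 0.05:            # truncate once the autocorrelation has died out
                break
            tau += 2.0 * rho
        return n / tau

    # Load one column of a BEAST .log file (tab-separated, '#' comment lines, header row),
    # discard 10% burn-in, and report ESS per hour of wall-clock runtime.
    trace = np.genfromtxt("trace.log", names=True, comments="#")   # placeholder file name
    posterior = trace["posterior"][len(trace) // 10:]
    runtime_hours = 5.0                                            # placeholder runtime
    print(f"ESS = {ess(posterior):.0f}, ESS/hour = {ess(posterior) / runtime_hours:.1f}")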

Andrew

李兴广

unread,
Nov 10, 2018, 1:52:59 AM
to beast-users


On Thursday, October 18, 2018 at 2:54:11 AM UTC+8, Badr wrote:

Andrew Rambaut

unread,
Nov 11, 2018, 12:43:42 PM
to beast...@googlegroups.com
Dear Artem,

Did you follow up on this benchmark? Basically, if BEAST and BEAST2 are exhibiting widely different reported speeds, then it is because they are doing radically different things. They both use exactly the same computational engine (BEAGLE), so they should exhibit the same performance.

When you see very different time/step values on the same hardware, it most likely means that the moves being made are different. Some moves take more time to compute than others: for example, changing the evolutionary rate is expensive, while changing the coalescent parameters is cheap. If you increase the weight of the coalescent parameters’ operators, you will see a large reduction in time/step, but you will be doing a much worse job of sampling the other parameters and trees.

Basically, don’t use time/step as a measure of performance unless you are comparing hardware or parallelization options. You could be getting a much faster run but a much worse statistical result.

Andrew

On 22 Oct 2018, at 08:19, Artem B <ui.ar...@gmail.com> wrote:

Artem B

unread,
Nov 11, 2018, 8:30:35 PM
to beast-users
Thank you, Andrew!

I got your email, but I have no opportunity to run the benchmark test right now. The ESS-per-hour advice is very useful information, thanks a lot.
But I have some other bugs related to BEAST 1; I'll email you soon.