Optimizing BEAST Runs for Large Datasets (1000-1500 Sequences, >15kb)


Sikandar Azam

Mar 30, 2025, 2:40:09 PM
to beast...@googlegroups.com
Hi BEAST users,

I am working with large datasets (1,000-1,500 sequences, each 15,000-20,000 bp) and running Bayesian phylogeographic analyses in BEAST. However, my analyses are running very slowly even with BEAGLE enabled: a 400 million-state MCMC chain takes 3-4 weeks (sometimes more than a month) on our current Linux system.

My current system specifications:

Processor: Intel Xeon(R) Gold 6242 CPU @ 2.80 GHz × 32

Graphics: NVIDIA Corporation TU102GL [Quadro RTX 6000/8000] (Quadro RTX 6000)

I am considering two options to optimize performance:

1. Upgrading our system (e.g., adding more CPUs and RAM).

2. Purchasing another system with similar specs and splitting MCMC chains across multiple machines, then combining them using LogCombiner.

For those who have worked with similar large datasets, what setup do you recommend for faster convergence? Are there specific computational strategies that significantly improve speed?

Thank you in advance for your insights.


Artem B

Mar 31, 2025, 9:50:06 PM
to beast-users
Hi Sikandar,

Our solution is fairly naive and not perfect.

We ran into a similar problem when we first analyzed nearly identical SARS-CoV-2 genomes: 10.1016/j.virusres.2021.198551. We describe the performance in detail there (see the Materials and Methods section).
We quickly found that the optimal number of threads for our biggest dataset was 4. Our system consisted of three clusters with two 18-core/36-thread Intel Xeon E5-2695 v4 "Broadwell" processors each. So we ran multiple parallel MCMC chains for one dataset and combined them afterwards, as you described.
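A minimal sketch of this split-and-combine setup. The file names, seeds, and chain count below are placeholders, and it assumes the BEAST 1.x command-line wrappers (beast, logcombiner) are on PATH; the script only prints the commands so you can inspect them before running:

```shell
#!/bin/sh
# Dry-run sketch: print the commands for four independent chains rather
# than executing them, so they can be reviewed (or piped to sh).
XML=analysis.xml   # hypothetical BEAUti-generated input file
CHAINS=4
LOGS=""

for i in $(seq 1 "$CHAINS"); do
  # A distinct seed and -prefix per chain keeps the runs independent
  # and stops their .log/.trees files from overwriting each other.
  echo "beast -seed $((1000 + i)) -threads 4 -prefix chain${i}. $XML &"
  LOGS="$LOGS chain${i}.analysis.log"
done
echo "wait"

# LogCombiner's -burnin is given in states: discard each chain's
# pre-convergence samples before pooling, then inspect combined.log
# in Tracer as usual.
echo "logcombiner -burnin 40000000$LOGS combined.log"
```

The same pattern works across two machines: run half the chains on each box and copy the log/tree files to one place before combining.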

Also, to skip a redundant burn-in phase, you can use a tree from a chain at its optimum (i.e., after the chain has passed burn-in) as the starting tree for each parallel run. This usually reduces burn-in significantly.
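One way to get such a tree (file names are hypothetical; assumes the BEAST 1.x treeannotator wrapper) is to summarise a pilot chain and paste the resulting newick into the XML as the starting tree of each parallel run. Printed as a dry-run command:

```shell
#!/bin/sh
# Dry-run: print the command instead of executing it. -burnin is in
# states; the summary tree from the pilot chain then replaces the
# random/UPGMA starting tree in the XML of each parallel run.
PILOT=chain1.analysis.trees   # hypothetical pilot-chain tree file
CMD="treeannotator -burnin 40000000 -heights median $PILOT start.tree"
echo "$CMD"
```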

The last option is tuning the operators using the report printed at the end of a run, but I haven't managed to do this successfully :(

Best,
Artem 


On Monday, March 31, 2025 at 02:40:09 UTC+8, Sikandar Azam wrote:

Sikandar Azam

Apr 1, 2025, 1:35:12 AM
to beast...@googlegroups.com
Hi Artem,
Thank you for your response and suggestions.
I have also tested different thread configurations (4, 8, 16, 24, and 32) and found that using 4 threads gives the best performance.
Since then, I have consistently used this setting.
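For anyone repeating this comparison, a short pilot run at each thread count (e.g. a reduced chainLength, so each run finishes in minutes) is enough; "pilot.xml" below is a placeholder. A sketch that prints the timing commands:

```shell
#!/bin/sh
# Dry-run: print a timing command for each thread count. Run these on a
# short pilot XML rather than the full 400M-state chain.
LAST=""
for t in 4 8 16 24 32; do
  CMD="time beast -threads $t -overwrite -prefix t${t}. pilot.xml"
  echo "$CMD"
  LAST=$CMD
done
```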

However, I have not yet tried running multiple parallel MCMC chains for one dataset or tuning the operators, but I will try this approach in the future and share the results with the group.

Thank you




Pfeiffer, Wayne

Apr 1, 2025, 1:53:05 PM
to rawalg...@gmail.com, beast...@googlegroups.com
Hi Sikandar,

You should try using your GPU with BEAGLE.
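For instance (the XML name is a placeholder, and this assumes your BEAGLE build was compiled with GPU support), printed as dry-run commands:

```shell
#!/bin/sh
# Dry-run: print the two commands. -beagle_info lists the BEAGLE
# resources BEAST can see (0 is usually the CPU, 1+ the GPUs);
# -beagle_order then pins the likelihood computation to a resource.
INFO="beast -beagle_info"
RUN="beast -beagle_GPU -beagle_order 1 analysis.xml"
echo "$INFO"
echo "$RUN"
```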

Best regards,
Wayne
