Beagle Library Beast

Kenneth Melniczek

Aug 5, 2024, 5:05:19 AM

to katrasurli

With BEAST 1.5.2 I get:

Failed to load BEAGLE library: no hmsbeagle-jni in java.library.path

I can't find a file called hmsbeagle-jni on my computer.

With BEAST 1.5.3:

Exception in thread "main" java.lang.NoClassDefFoundError: beagle/BeagleFlag
at dr.app.beast.BeastMain.main(Unknown Source), etc.

I can't find a file called BeagleFlag on my computer. Making sure CLASSPATHs were defined for the BEAST and BEAGLE libraries hasn't helped.

If anyone has any solutions I'm all ears (or eyes).

Sincerely,

Will

I was able to compile and run an earlier build of BEAGLE on Windows Vista using a combination of Visual Studio and the build_installer.bat batch file in /trunk/project/beagle-vs of the SVN download. I seem to remember that I ran into some of the same problems that you mentioned, since the Java syntax for Windows is slightly different from what other folks have recommended for UNIX-based operating systems.

If you're confident that you've compiled successfully without any errors (warnings should be OK), try the following command line after changing into the appropriate BEAST 1.5.3 directory, which should include sub-directories for "bin", "lib", etc.:

java -Xms10000m -Xmx10000m -classpath .\lib\beast.jar;.\lib\beast-beagle.jar dr.app.beast.BeastMain -beagle -beagle_GPU -beagle_order 1,1,2,2,0,0 name.of.your.input.file.xml

You will probably need to tweak the amount of memory, depending on your available RAM. The beagle_order flag allocates each partition to the GPU or CPU. In the example above, I've split six partitions between two GPUs and the CPU.

There could also be issues with setting paths for environment variables, but I think that was all handled during the compile process. The developers noted in an earlier post that you should run BEAGLE with concurrent builds of BEAST; not sure if that is part of your problem: -users/browse_thread/thread/9db2020fcddcb3a1#

One last thought... I spent a lot of time working on getting BEAGLE compiled and running on Vista, only to find that there was no performance increase for my data. In fact, BEAGLE ran about 10-15% slower than the standard CPU version of BEAST. It seems that the computational overhead of calling the GPU for multiple partitions can be relatively expensive, especially when those partitions have fewer than (say) 500-1000 sites and/or comprise nucleotide data (rather than amino acid models).

Best,

Chris
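
To make the -beagle_order example in the post above easier to read, here is a minimal Python sketch that groups partitions by the resource number they were assigned. It assumes the list is read left to right, one entry per partition, and that resource 0 is the CPU while 1 and 2 are the two GPUs, which matches the description in the post. The function name is hypothetical, and the actual resource numbers on a given machine are whatever the -beagle_info flag reports.

```python
from collections import defaultdict
from typing import Dict, List


def partitions_by_resource(beagle_order: str) -> Dict[int, List[int]]:
    """Group 0-based partition indices by the resource number assigned to each."""
    groups: Dict[int, List[int]] = defaultdict(list)
    for partition, resource in enumerate(beagle_order.split(",")):
        groups[int(resource)].append(partition)
    return dict(groups)


# Example from the post: six partitions spread over resources 1, 2 and 0.
for resource, partitions in sorted(partitions_by_resource("1,1,2,2,0,0").items()):
    print(f"resource {resource}: partitions {partitions}")
```

With the order string from the post, each of the three resources ends up with two partitions.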

A quick follow-up after looking at some of my old files...

The build_installer.bat file uses Visual Studio to compile the code and create a Windows setup executable in /trunk/project/beagle-vs/beagleinstaller/Release. You need this to install BEAGLE before using the command line described above. I'm not sure how compiling with Apache Ant works in comparison.

You can download a trial version of Visual Studio 9.0 from the Microsoft web site, then modify the build_installer.bat file to set the correct paths for devenv.exe and vcvarsall.bat (part of Visual Studio). For example:

SET devenv="C:\Program Files (x86)\Microsoft Visual Studio 9.0\Common7\IDE\devenv.exe"
SET vcvars="C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\vcvarsall.bat"

Hope this helps... but I would carefully consider the potential benefits of computational speed in BEAGLE versus the inevitable time and frustration spent trying to compile source for Windows. Depending on the size and structure of your data set, CPU/GPU/RAM, etc., it might be more efficient to run simultaneous analyses on different CPU cores in BEAST, rather than trying to obtain marginal improvement for a single run with BEAGLE, since you'll need to replicate your results to verify MCMC convergence anyway.

Best,

Chris
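
The suggestion above to run simultaneous analyses on different CPU cores could be sketched roughly as follows in Python. This is only an illustration under stated assumptions: it assumes BEAST accepts a -seed option for the random number seed, that plain CPU BEAST can be started with the classpath and main class shown elsewhere in this thread, and that giving every replicate its own directory and its own copy of the XML file is enough to keep the output files apart. The function names, directory naming and memory setting are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List


def replicate_command(xml_name: str, seed: int, beast_jar: str) -> List[str]:
    """Build one BEAST command line; only the seed differs between replicates."""
    return [
        "java", "-Xmx1000m",
        "-classpath", beast_jar,
        "dr.app.beast.BeastMain",
        "-seed", str(seed),  # assumption: -seed sets the random number seed
        xml_name,
    ]


def launch_replicates(xml_file: str, seeds: List[int], beast_jar: str) -> List[int]:
    """Start one BEAST process per seed, each in its own directory, then wait for all."""
    xml_path = Path(xml_file).resolve()
    jar_path = Path(beast_jar).resolve()
    processes = []
    for seed in seeds:
        run_dir = Path(f"replicate_seed_{seed}")  # hypothetical directory naming
        run_dir.mkdir(exist_ok=True)
        # A private copy of the XML keeps each replicate's output files separate.
        shutil.copy(str(xml_path), str(run_dir / xml_path.name))
        command = replicate_command(xml_path.name, seed, str(jar_path))
        processes.append(subprocess.Popen(command, cwd=str(run_dir)))
    # The operating system schedules the processes across the available cores.
    return [process.wait() for process in processes]
```

Calling launch_replicates("name.of.your.input.file.xml", [101, 202, 303], "lib/beast.jar") would start three independent runs and return their exit codes once all of them have finished. The resulting logs can then be compared to check convergence.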

Aaron - Just tried your latest installer for Windows, and it returns a null pointer exception when trying to execute with BEAST 1.5.3. I remember encountering this type of error for my homebrew compile using BEAGLE builds after Czech2010, so maybe the problem is cross-compatibility between various revisions of the source code for each application?

Will - If you've been able to compile the latest build of BEAST (i.e., matching Aaron's installer) using Apache Ant, this might not be a problem for you. Give it a shot and see what happens?

Best,

Chris

Aaron is right that a path needs to be set for the hmsbeagle library, either in the Java command line or as a system environment variable in the Windows control panel. I usually do the latter, but verified that the -Djava.library.path option works after deleting the system setting and running this instead:

java -Xms1000m -Xmx1000m -Djava.library.path="C:\Program Files (x86)\Common Files\libhmsbeagle-1.0" -classpath .\lib\beast.jar;.\lib

Obviously, you need to change the name of the path so it corresponds to the actual directory on your hard drive that contains the BEAGLE driver (.dll) and library (.lib).

Can you run BEAST (without BEAGLE) from the command line? That would be a good way to check whether everything is OK on that end:

java -Xms1000m -Xmx1000m -classpath .\lib\beast.jar dr.app.beast.BeastMain name.of.your.input.file.xml

Not sure what the jni problem is... The only way I've been able to replicate this type of issue is by attempting to run BEAGLE on a much earlier build of BEAST (e.g., release 1.5.2), which throws the following error when checking resources with the -beagle_info flag:

When I run Aaron's version of BEAGLE with BEAST 1.5.3, the null pointer exception is almost identical to what Bena reported in the thread I linked to earlier:

java.lang.NullPointerException
at beagle.BeagleJNIWrapper.createInstance(Native Method)
at beagle.BeagleJNIImpl.<init>(Unknown Source)

Note that this error also refers to the jni, and the solution was to download and compile the latest builds for both applications. So maybe your versions just aren't playing nice with each other.

Here are the compatible builds that I was able to compile and run on Vista 64-bit: BEAST (r2671) and BEAGLE (r657). There have been a ton of revisions and bug fixes since then, but that might be another option. I'm pretty sure the r2671 version of BEAST is just the regular 1.5.3 release from last fall, so you wouldn't need to recompile that.

Good luck!

Best,

Chris
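
The "no hmsbeagle-jni in java.library.path" message from the first post means that Java searched every directory on java.library.path without finding the native library. On Windows, Java maps the library name hmsbeagle-jni to the file name hmsbeagle-jni.dll, so a quick way to see which directories actually contain it is a small check like the Python sketch below. The function name is hypothetical, and the installer directory in the usage note is simply the one quoted in the post above, which may differ on other machines.

```python
import os
from pathlib import Path
from typing import List


def find_jni_library(library_path: str,
                     filename: str = "hmsbeagle-jni.dll",
                     sep: str = os.pathsep) -> List[Path]:
    """Return every copy of `filename` found in the directories listed in `library_path`."""
    hits = []
    for entry in library_path.split(sep):
        entry = entry.strip().strip('"')  # tolerate quoted entries copied from a command line
        if not entry:
            continue
        candidate = Path(entry) / filename
        if candidate.is_file():
            hits.append(candidate)
    return hits
```

For example, find_jni_library(r"C:\Program Files (x86)\Common Files\libhmsbeagle-1.0") checks the directory used in the command above, and find_jni_library(os.environ.get("PATH", "")) checks the system setting. An empty result for both suggests the library path is pointing at the wrong place.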

If you're still hanging in there, I was able to get the most recent revisions of BEAGLE and BEAST working together on Windows. Basically, it just involved compiling BEAST (r3076) with Ant and running Aaron's installer executable. Before compiling, make sure to edit the build.xml file so that the default on line 2 is "build_jar_all_BEAST", otherwise it won't create the necessary jar files.

Once everything is compiled and installed, try using the following command after changing into the main directory for the new compiled version of BEAST (remember to modify paths and settings as appropriate):

java -Djava.library.path="C:\Program Files (x86)\Common Files\libhmsbeagle-1.0" -Xms1000m -Xmx1000m -classpath .\build\dist\beast.jar;.\build\dist\beast.beagle.jar dr.app.beast.BeastMain

The key issue seems to be synching the revisions for both applications. Since this is all pre-release code, I think it would be a reasonable courtesy to check with the developers prior to submitting results for publication, and to follow the updates on Google code to look for bug fixes that might affect your results.

Best,

Chris
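
The NoClassDefFoundError for beagle/BeagleFlag in the first post means that none of the jar files on the classpath supplied that class. Classes live inside jar archives rather than as separate files, which is why searching the disk for "BeagleFlag" finds nothing. Since a jar is a zip archive, the jars produced by the build can be checked directly. The following Python sketch shows one way to do that. The function name is hypothetical, and which jar is supposed to contain the class is not stated in this thread.

```python
import zipfile
from typing import List


def jars_containing(class_name: str, jar_paths: List[str]) -> List[str]:
    """Return the jars from `jar_paths` that contain the compiled class `class_name`."""
    # A class such as beagle.BeagleFlag is stored in the jar as beagle/BeagleFlag.class.
    entry = class_name.replace(".", "/") + ".class"
    found = []
    for jar in jar_paths:
        with zipfile.ZipFile(jar) as archive:
            if entry in archive.namelist():
                found.append(jar)
    return found
```

Calling jars_containing("beagle.BeagleFlag", [...]) with the jar paths from the -classpath argument shows whether any of them provides the class. An empty list would point to mismatched or incomplete builds rather than a path problem.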

BEAST[1] is a cross-platform program for Bayesian MCMC analysis of molecular sequences. It is entirely orientated towards rooted, time-measured phylogenies inferred using strict or relaxed molecular clock models. It can be used as a method of reconstructing phylogenies but is also a framework for testing evolutionary hypotheses without conditioning on a single tree topology. BEAST uses MCMC to average over tree space, so that each tree is weighted proportional to its posterior probability.

Loading the BEAST module with module load beast will automatically load its dependencies, namely the beagle-lib and java modules, and set the environment variable EBROOTBEAST to point to the directory where BEAST's program files are located.

BEAST has been installed without any packages (add-ons). You can use the packagemanager command (for BEAST v2.5.1 and newer; in older versions of BEAST, the command is addonmanager) to install the desired packages within your home directory.

I installed both BEAST 2 & BEAGLE onto them. This gives me a 3.6-fold speed-up. I have also read that using OpenCL with BEAGLE gives a further speed-up. Nonetheless, my calculations take from twelve hours to 3 days, so I got really interested in this. Installing it on Windows was very simple. When I ran beast -beagle_info, I saw:

Any heavy-duty MCMC is very amenable to traditional CPU parallelisation; if you think about it, it would have to be, and that's why GPU (CUDA) acceleration works. Personally, looking at your resources, I would parallelise on your cluster using a MINIMUM of 8 cores per calculation, outside CUDA/OpenCL. If that has not already been done, that is the solution in this context, in my opinion. Unless you're a Google developer, few will have access to an NVIDIA cluster; in any case, sufficient CPU parallelisation should be able to achieve a runtime close to that of a good GPU (it's going to take a lot of CPUs, though).

Conclusion

Honestly, my personal advice is that phylogenetics swamps clusters no matter which calculation is being used, and that has always been the case. The best solution is to max out on available cores. When eukaryote genome guys complain about 1000-hour runtimes (obviously that's the total across parallelised CPUs), that's a routine calculation in phylogenetics. Buying NVIDIA hardware doesn't necessarily solve this, because whilst MCMC is massively accelerated, maximum likelihood isn't accelerated to that extent by comparison.

Extra information

In case the OP is not aware, the enormous advantage of BEAST 2 is the ability to resume an MCMC at the point it ended (or at some checkpoint therein), so you don't need to run the whole thing again. You should seek to exploit that, so that a calculation started on a laptop can be finished on a cluster. I can't state sufficiently how powerful that is. It stands to reason, because the random number stream is declared, and so is every sample thereafter. Keeping all calculations is standard practice in any case, but especially with a BEAST 2 run: don't ever discard it (hard disk is cheap, right?). You might find you have to return to it, and that is a whole lot better than needing to do the whole thing again.
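
As a rough illustration of the resume workflow described above, the Python sketch below builds a BEAST 2 command line that adds a resume flag only when a saved state exists. It rests on two assumptions that should be checked against the installed version: that BEAST 2 accepts a -resume option, and that by default it writes its state to a file named after the input XML with ".state" appended. The function name is hypothetical, and the beast command name is taken from the beast -beagle_info call quoted earlier.

```python
from pathlib import Path
from typing import List


def beast2_command(xml_file: str, beast_cmd: str = "beast") -> List[str]:
    """Build a BEAST 2 command that resumes when a saved state file is present."""
    state_file = Path(xml_file + ".state")  # assumption: default state file name
    command = [beast_cmd]
    if state_file.is_file():
        command.append("-resume")  # assumption: continue from the saved state
    command.append(xml_file)
    return command
```

When moving a run from a laptop to a cluster, the XML file, the state file and the existing log and tree files would presumably all need to be copied together so that the resumed run can append to them.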