SEVERE: State was not correctly restored after reject step in BEAST with BEAGLE on CUDA

James Cotton

Jun 24, 2011, 12:15:05 PM
to beast-users
Hi - I have this error in a chain I'm running using BEAST with the
BEAGLE library. It's occurred twice in independent runs of the same XML
file on this machine, but the same XML file run without BEAGLE is
fine. Is this a known problem?

I'm happy to share the XML etc. if it would help diagnosis.
It's a big dataset.

James



______________________________________________________

SEVERE: State was not correctly restored after reject step.
Likelihood before: -929784.6945448748 Likelihood after:
-929784.6798964373
Operator: scaleOperator(pInv [0.75, 1.3333333333333333] scale(pInv)

Details
Before:
CompoundLikelihood(compoundModel)=(
DistributionLikelihood=-3.0709,
DistributionLikelihood=-3.1687,
DistributionLikelihood=-2.7344,
DistributionLikelihood=-3.184,
DistributionLikelihood=-2.8383,
SpeciationLikelihood(speciationLikelihood)=673.6484
),
CompoundLikelihood(compoundModel)=(
BeagleTreeLikelihood(treeLikelihood)=-930443.3467
)
After:
CompoundLikelihood(compoundModel)=(
DistributionLikelihood=-3.0709,
DistributionLikelihood=-3.1687,
DistributionLikelihood=-2.7344,
DistributionLikelihood=-3.184,
DistributionLikelihood=-2.8383,
SpeciationLikelihood(speciationLikelihood)=673.6484
),
CompoundLikelihood(compoundModel)=(
BeagleTreeLikelihood(treeLikelihood)=-930443.332
)
24-Jun-2011 17:06:36 dr.inference.markovchain.MarkovChain runChain
SEVERE: State was not correctly restored after reject step.
Likelihood before: -929784.6945448748 Likelihood after:
-929784.6798964373
Operator: dr.evomodel.operators.WilsonBalding@6f9ec4ed
wilsonBalding(treeModel)

Details
Before:
CompoundLikelihood(compoundModel)=(
DistributionLikelihood=-3.0709,
DistributionLikelihood=-3.1687,
DistributionLikelihood=-2.7344,
DistributionLikelihood=-3.184,
DistributionLikelihood=-2.8383,
SpeciationLikelihood(speciationLikelihood)=673.6484
),
CompoundLikelihood(compoundModel)=(
BeagleTreeLikelihood(treeLikelihood)=-930443.332
)
After:
CompoundLikelihood(compoundModel)=(
DistributionLikelihood=-3.0709,
DistributionLikelihood=-3.1687,
DistributionLikelihood=-2.7344,
DistributionLikelihood=-3.184,
DistributionLikelihood=-2.8383,
SpeciationLikelihood(speciationLikelihood)=673.6484
),
CompoundLikelihood(compoundModel)=(
BeagleTreeLikelihood(treeLikelihood)=-930443.332
)
Exception in thread "Thread-1" java.lang.RuntimeException: One or more
evaluation errors occured during the test phase of this
run. These errors imply critical errors which may produce incorrect
results.
at dr.inference.markovchain.MarkovChain.runChain(Unknown
Source)
at dr.inference.mcmc.MCMC.chain(Unknown Source)
at dr.inference.mcmc.MCMC.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)


1.2529333333333335 minutes

alexei

Jun 24, 2011, 5:39:38 PM
to beast-users
Hi James,

As far as I am aware, BEAGLE can struggle to calculate the likelihood
accurately with large trees, especially in 32-bit mode. The error you
are reporting seems to be caused by the discrepancy between two
successive likelihood calculations on the same state being too great.
I'm not sure if there is a known fix besides falling back to the slower
but more accurate Java likelihood calculator. Marc or Andrew may be able to
suggest ways to tweak the BEAGLE likelihood.

Alexei

Andrew Rambaut

Jun 24, 2011, 6:21:37 PM
to beast...@googlegroups.com
Further to Alexei's point, this error is likely due to loss of precision in the likelihood calculations. If you are using single precision mode, then try switching to double. The other thing to try is the '-beagle_scaling always' option, which attempts to rescale the calculations to maintain precision. Normally BEAGLE only rescales when the calculations completely fail, but it is possible to lose enough precision that the results become inconsistent between calculations that should return the same value (which is what this test is comparing).

The '-beagle_scaling always' option is equivalent to the scaling used in the built-in BEAST calculator but should be considerably faster.

Andrew
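For reference, the flags discussed above would be combined on the BEAST command line roughly like this (a sketch: 'analysis.xml' is a placeholder for your own input file, and '-beagle_GPU' is the BEAST 1.x flag for selecting a GPU resource; check 'beast -help' on your version):

```shell
# Run BEAST with BEAGLE on the GPU, forcing double precision and
# rescaling on every likelihood evaluation to guard against precision loss.
# 'analysis.xml' is a placeholder; substitute your own XML file.
beast -beagle -beagle_GPU -beagle_double -beagle_scaling always analysis.xml
```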


___________________________________________________________________
Andrew Rambaut
Institute of Evolutionary Biology University of Edinburgh
Ashworth Laboratories Edinburgh EH9 3JT
EMAIL - a.ra...@ed.ac.uk TEL - +44 131 6508624

--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

James Cotton

Jun 27, 2011, 5:24:14 AM
to beast-users
Thanks both -

The '-beagle_scaling always' option seems to do the trick, and the chain is
running happily now - about 40x faster on a GeForce GTX 480 than on the CPU alone.

Yours
James

strobi

Sep 10, 2012, 7:12:46 AM
to beast...@googlegroups.com
I have the same issue, except that in my case it was not solved by changing the scaling options in Beagle. I am running out of ideas. Any help?

Alejandra

Anthony Weaver

Jun 12, 2013, 9:25:16 AM
to beast...@googlegroups.com
Hello all,

  I am also experiencing this problem, likely a loss of precision. I used double precision and BEAGLE scaling and still have the problem. Even when I don't use BEAGLE I still have the problem. From some googling, this issue seems to have cropped up on and off over the last couple of years. Is there anything that can be done? I have attached the file I am trying to run; it uses RandomLocalClock. I am trying to help another researcher with this, so I can't answer any questions about the data itself.

Thanks

Tony

On Thursday, May 9, 2013 9:33:39 PM UTC-4, Yan wrote:
Hi Alejandra,

My situation is the same as yours. The "beagle_scaling always" option did not solve the problem. Did you later find a way through? I would appreciate it a lot if you could share it. Thanks.

Best,
Yan
RandomClockGamma_50mil.xml

Anthony J. Weaver Jr.

Jun 12, 2013, 11:32:53 AM
to beast...@googlegroups.com
Update. I was trying to run this on a Linux box and got the error (with or without BEAGLE). I switched over to Windows 7, and the same file runs fine in BEAST without BEAGLE. My problem on Windows 7 is that I can't get BEAST/Java to recognize BEAGLE. I ran the Windows installer and checked that the files and PATH environment variables all match up, but it still tells me that there is no hmsbeagle in java.library.path. Yes, I tried to google this and saw others with the same problem but no apparent solution.





--
Visiting Instructor
Dept. Of Computer Science

Anthony J. Weaver Jr.

Jun 12, 2013, 1:13:28 PM
to beast...@googlegroups.com
Update #2: There seem to be two issues here.
   Issue #1: There appears to be a difference between running on Linux and Windows. On Linux I get errors which appear to be related to a loss of precision in the calculations. This problem arises whether or not I use BEAGLE, and using double precision with 'scaling always' does not help. When I run the same file on the same version (1.7.5) of BEAST on Windows, I do not get the errors.

   Issue #2: BEAGLE does not seem to work on Windows (Windows 7). I downloaded, removed, and reinstalled the libraries and still get the error. The installer (which is from 2011) does not create a file hmsbeagle-jni or libhmsbeagle-jni. It does create the following files:

hmsbeagle-cuda64.dll
hmsbeagle64.dll
hmsbeagle64.lib
hmsbeagle-cpu-openmp64.dll
hmsbeagle-cpu64.dll
hmsbeagle-cpu-sse64.dll
hmsbeagle-cuda32.dll
hmsbeagle32.dll
hmsbeagle32.lib
hmsbeagle-cpu-openmp32.dll
hmsbeagle-cpu32.dll
hmsbeagle-cpu-sse32.dll

I cannot build the libraries from source because I do not have a full version of Visual Studio.

Guy Baele

Jun 12, 2013, 3:11:06 PM
to beast...@googlegroups.com
Beagle does work on Windows 7 in 64-bit; it runs just fine on my PC. It's been a while since I installed it (using the installer on the website), so I don't remember whether I had any problems getting it running (and I'm currently not near that PC to check specific settings). I would not recommend trying to build Beagle using Visual Studio at this point. Additional features have been added since the release, and I have not been successful compiling a new version (due to some SSE check) on Windows 7.

I have heard of (and experienced) problems with installing Beagle on a 32-bit system, so maybe you could let us know if you have the 32-bit or 64-bit version installed?

Best regards,
Guy


On Wednesday, June 12, 2013 10:13:28 UTC-7, Anthony Weaver wrote:

Anthony J. Weaver Jr.

Jun 12, 2013, 3:19:54 PM
to beast...@googlegroups.com
My Windows 7 is definitely 64-bit. The default Java that ships with it is not. I am currently trying to get 64-bit Java installed. I also found a posting about a problem with Beagle, see:

I believe that my arch does not match, so it reports hmsbeagle-jni missing. I went into the directory where BEAGLE installs. First I tried making copies of hmsbeagle64.dll and .lib and renaming them hmsbeagle-jni.dll and -jni.lib. I got an error about trying to load AMD64 on IA32, so I did the same thing with the hmsbeagle32.dll and .lib files, and then BEAGLE does work - but I can't use it because of the precision-related problems I detailed above. I am hoping that getting 64-bit Java will help, but I don't know yet.
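If the JVM cannot find the native library, one workaround is to point java.library.path at the BEAGLE directory explicitly. This is only a sketch: both the install path and the jar location below are assumptions, so substitute wherever the installer actually placed the hmsbeagle DLLs and wherever beast.jar lives on your system.

```shell
# Point the JVM directly at the directory containing the hmsbeagle DLLs.
# Both paths below are assumptions - adjust them to your actual install.
java -Djava.library.path="C:\Program Files\BEAGLE" -jar lib\beast.jar analysis.xml
```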

Andrew Rambaut

Jun 12, 2013, 3:22:35 PM
to beast...@googlegroups.com
Hi Tony,


On 12 Jun 2013, at 18:13, Anthony J. Weaver Jr. <anthony...@fandm.edu> wrote:

   Issue #1: There appears to be a difference between running on Linux and Windows.  On Linux I get errors which appear to be related to a loss of precision in the calculations.  This problem arises whether or not I use Beagle and using double precision and always scale does not help.  When I run the same file on the same version(1.7.5) of Beast in Windows I do not get the errors.

I suspect this is not an issue of Linux vs Windows but rather of running BEAGLE vs not using BEAGLE. The built-in BEAST likelihood calculator does differ from BEAGLE in some respects, in particular scaling - BEAST should be most like BEAGLE with '-beagle_scaling always -beagle_double', but may still be slightly different.

On linux, if you don't use BEAGLE do you get the same result as Windows?

A.

Anthony J. Weaver Jr.

Jun 12, 2013, 3:48:49 PM
to beast...@googlegroups.com
Andrew,

  On Linux I was getting the same problem whether I used Beagle or not, which is not what happened on Windows. On Windows it was running on the CPU without errors. However, now it is not - but I have changed a bunch of stuff, including removing JRE 7 and installing 64-bit JRE 6. At this point, I get the SEVERE operator errors on Linux and Windows, with or without Beagle.

Tony



Andrew Rambaut

Jun 12, 2013, 3:51:21 PM
to beast...@googlegroups.com
When not running BEAGLE, were the Linux and/or Windows versions running the 'NativeLikelihoodCore' or the 'JavaLikelihoodCore'? It should say somewhere in the startup text.

Andrew


Anthony J. Weaver Jr.

Jun 12, 2013, 3:56:01 PM
to beast...@googlegroups.com
It looks like it is using the Java one.

Anthony J. Weaver Jr.

Jun 12, 2013, 3:56:34 PM
to beast...@googlegroups.com
Sorry - on Windows it is using the Java one. Not sure about Linux, but probably the same.

Andrew Rambaut

Jun 12, 2013, 4:00:09 PM
to beast...@googlegroups.com
If they are both using the Java version, then they should give exactly the same results (the code is the same). What versions of Java are running on the two machines? Is it OpenJDK on Linux?

Andrew

Anthony J. Weaver Jr.

Jun 12, 2013, 4:05:45 PM
to beast...@googlegroups.com
Now both Windows and Linux run the same, that is to say, they both error out on the likelihood calculations. Windows is 64-bit JDK 6. If BEAST can't find Java, does it use the native core?

Tony Weaver

Jun 13, 2013, 9:07:56 AM
to beast...@googlegroups.com
After wrestling with this "SEVERE: State was not correctly restored" error, here is what I know.

    1. The benchmark files provided with BEAST work fine on CPU/GPU.
    2. The file the researcher is using does not work; it gets these errors. Using double precision with 'scaling always', as mentioned earlier in this thread, does not help.
    3. My computer has hyperthreading. I disabled hyperthreading and still got the problem.
    4. Finally, when I run BEAST with threads set to 0 or 1 (not automatic, the default), it runs fine. The second I change it to 2 or more threads, it errors out.

From what I can tell, the error relates to a loss of precision in the calculations, and running in a threaded environment exacerbates the problem. This really hurts my chances of helping the researcher here, since this run is expected to take 45 days. If she can only use single threading, I can't speed up her calculations much, as I was hoping to use my GPU or at least my hyperthreaded CPU. Hope this helps someone.

Tony
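The single-threaded workaround Tony describes would be passed on the command line like this (a sketch; 'analysis.xml' is a placeholder for the actual input file):

```shell
# Force all likelihood computations to run serially, avoiding the
# apparent thread-synchronisation problem described in this thread.
beast -threads 0 analysis.xml
```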

Andrew Rambaut

Jun 13, 2013, 9:15:36 AM
to beast...@googlegroups.com
This is the interesting fact... Perhaps it is a thread synchronisation error? Setting threads to 1 or 0 will run all the likelihood computations in serial.

Would you be able to send the XML to me (off the list)? I will see if I can replicate the problem.

Andrew

griffinia

Jun 13, 2013, 10:19:23 AM
to beast...@googlegroups.com
I was just having the exact same problem with BEAST 1.7.5 on an OpenSUSE Linux cluster, using BEAGLE with 8 CPU instances and the "scaling always" option. The problem disappeared when I dropped "scaling always" from the command line. My analysis is now running and using 600-700% CPU. This is weird because in the past, with other input files, it has usually been the opposite (i.e., adding the "scaling always" option stopped the issue).

Alan Meerow


Tony Weaver

Jun 13, 2013, 10:25:14 AM
to beast...@googlegroups.com
I just tested this with scaling set to 'never' and it still causes the error.

Tony

Andrew Rambaut

Jun 13, 2013, 10:30:21 AM
to beast...@googlegroups.com
The error being reported, "State was not correctly restored", is a very general test of correctness that can be caused by different things. If turning on scalingAlways fixes it, then it is due to loss of precision (particularly when a GPU is being used that may be computing in single precision). There also seem to be occasional multithreading issues causing it (apparently when there are lots of partitions, possibly involving the Random Local Clock). For this, setting -threads 0 should fix it (with a loss of multi-core parallelization). The latter problem is a bug and will be fixed once we pin down exactly what is causing it.

A.

griffinia

Jun 13, 2013, 10:31:08 AM
to beast...@googlegroups.com
I did not explicitly have "scaling never" in the command line for the successful (running) analysis.  I just deleted any reference to scaling.

Alan

Andrew Rambaut

Jun 13, 2013, 10:34:03 AM
to beast...@googlegroups.com
The default is 'scaling delayed', which applies no scaling until the first underflow, at which point it switches to 'always'. So this should be the same as using 'scaling always' when needed. However, it is conceivable that there is loss of precision without an actual underflow, which then causes the 'not correctly restored' error.

A.
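The scaling behaviours discussed in this thread, expressed as command lines (a sketch: 'always' is the token confirmed earlier in the thread; the default requires no flag at all, and exact option tokens may vary between BEAST versions, so check 'beast -help' for yours):

```shell
# Default behaviour: delayed rescaling, switching on at the first underflow.
beast -beagle analysis.xml

# Rescale on every likelihood evaluation, trading speed for precision.
beast -beagle -beagle_scaling always analysis.xml
```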