Problem Running SNAPP v1.1.6 in BEAST v2.1.3


daniel.m...@yale.edu

Sep 15, 2014, 5:06:04 PM
to beast...@googlegroups.com
Hi all,

I'm attempting to use SNAPP on a very large AFLP dataset of 105 individuals with 4803 loci. I've been performing these calculations on a computing cluster using the following command:

~/BEAST/bin/beast -working myData.xml

However, even after allowing the analysis to run for approximately 2 full days, the MCMC chain does not progress beyond generation 0 and I encounter the following error:

java.lang.ArrayIndexOutOfBoundsException: 27
        at snap.likelihood.SiteProbabilityCalculator.doInternalLikelihood(Unknown Source)
        at snap.likelihood.FCache.getBottomOfBrancheF(Unknown Source)
        at snap.likelihood.SiteProbabilityCalculator.doCachedInternalLikelihood(Unknown Source)
        at snap.likelihood.SiteProbabilityCalculator.computeCachedSiteLikelihood2(Unknown Source)
        at snap.likelihood.SiteProbabilityCalculator.computeCachedSiteLikelihood2(Unknown Source)
        at snap.likelihood.SiteProbabilityCalculator.computeCachedSiteLikelihood2(Unknown Source)
        at snap.likelihood.SiteProbabilityCalculator.computeCachedSiteLikelihood2(Unknown Source)
        at snap.likelihood.SiteProbabilityCalculator.computeCachedSiteLikelihood2(Unknown Source)
        at snap.likelihood.SiteProbabilityCalculator.computeCachedSiteLikelihood2(Unknown Source)
        at snap.likelihood.SiteProbabilityCalculator.computeCachedSiteLikelihood2(Unknown Source)
        at snap.likelihood.SiteProbabilityCalculator.computeCachedSiteLikelihood2(Unknown Source)
        at snap.likelihood.SiteProbabilityCalculator.computeCachedSiteLikelihood2(Unknown Source)
        at snap.likelihood.SiteProbabilityCalculator.computeCachedSiteLikelihood2(Unknown Source)
        at snap.likelihood.SiteProbabilityCalculator.computeCachedSiteLikelihood2(Unknown Source)
        at snap.likelihood.SiteProbabilityCalculator.computeCachedSiteLikelihood2(Unknown Source)
        at snap.likelihood.SiteProbabilityCalculator.computeCachedSiteLikelihood2(Unknown Source)
        at snap.likelihood.SiteProbabilityCalculator.computeCachedSiteLikelihood2(Unknown Source)
        at snap.likelihood.SiteProbabilityCalculator.computeCachedSiteLikelihood2(Unknown Source)
        at snap.likelihood.SiteProbabilityCalculator.computeCachedSiteLikelihood2(Unknown Source)
        at snap.likelihood.SiteProbabilityCalculator.computeCachedSiteLikelihood2(Unknown Source)
        at snap.likelihood.SiteProbabilityCalculator.computeCachedSiteLikelihood2(Unknown Source)
        at snap.likelihood.SiteProbabilityCalculator.computeCachedSiteLikelihood2(Unknown Source)
        at snap.likelihood.SiteProbabilityCalculator.computeSiteLikelihood(Unknown Source)
        at snap.likelihood.SnAPLikelihoodCore.computeLogLikelihood(Unknown Source)
        at snap.likelihood.SnAPTreeLikelihood.calculateLogP(Unknown Source)
        at beast.core.util.CompoundDistribution.calculateLogP(Unknown Source)
        at beast.core.util.CompoundDistribution.calculateLogP(Unknown Source)
        at beast.core.State.robustlyCalcPosterior(Unknown Source)
        at beast.core.MCMC.run(Unknown Source)
        at beast.app.BeastMCMC.run(Unknown Source)

I also encounter similar java.lang.ArrayIndexOutOfBoundsException errors when running the example dataset (test1.xml) provided with the SNAPP package. However, unlike with my dataset, the MCMC chain does progress for the test dataset.

I have attached the standard out and standard error files along with my AFLP dataset. Please let me know if you have any idea what might be the cause of these issues.

Thanks in advance for the help,
 
myData.xml
myData.system.out
myData.error.out

Pip

Sep 15, 2014, 9:14:23 PM
to beast...@googlegroups.com
Hi Daniel,

This looks similar to the java errors I've been getting (see here https://groups.google.com/forum/#!topic/beast-users/2MvKMlsfYP0)

Did you see the likelihood stuck at 0 for your run of the test dataset?

I'm also looking forward to any help anybody can provide...

Pip

daniel.m...@yale.edu

Sep 16, 2014, 3:02:52 PM
to beast...@googlegroups.com
Hi Pip,

Thanks for commenting. I actually saw your post earlier, but I figured if I posted separately it might help draw more attention to this issue.

When I run the test1.xml dataset it doesn't get stuck at generation 0. In fact, it runs to completion, but with tons of errors. To give you an idea, the standard error file for the test run was actually too large to attach to my original post.

-Dan 

Pip Griffin

Sep 16, 2014, 6:18:04 PM
to beast-users
Hi Dan,

Just to clarify: the test1.xml dataset ran for me too, and also raises lots of errors, but the likelihood value stays at 0 for the whole run.

Pip


daniel.m...@yale.edu

Sep 17, 2014, 2:44:12 PM
to beast...@googlegroups.com
Hi Pip,

Ah sorry, I misunderstood. You're right, I actually do encounter the same problem as you when running the test1.xml file. Although the MCMC progresses, the likelihood value (the "posterior" column in the output file) always remains at 0.0. So it would seem we're both having the same issue.
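[Editor's note: a static posterior can be spotted programmatically. The sketch below is a minimal check, assuming a standard tab-delimited BEAST trace log with a header row that names a "posterior" column and comment lines starting with '#'.]

```python
# Report whether the 'posterior' column of a BEAST trace log is stuck at a
# single value (e.g. 0.0) across all logged samples.
def posterior_is_static(log_lines, column="posterior"):
    rows = [ln.rstrip("\n").split("\t") for ln in log_lines
            if ln.strip() and not ln.startswith("#")]
    header, data = rows[0], rows[1:]
    idx = header.index(column)
    values = {float(r[idx]) for r in data}
    return len(values) == 1

# Toy log reproducing the symptom described in this thread:
log = ["# BEAST log\n",
       "Sample\tposterior\tlikelihood\n",
       "0\t0.0\t-123.4\n",
       "1000\t0.0\t-120.1\n"]
print(posterior_is_static(log))  # True -- the posterior never moves
```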

How are you running BEAST? From a GUI or from the command line?

-Dan

David Bryant

Sep 19, 2014, 5:29:21 PM
to beast...@googlegroups.com
Hi guys,

Remco is away on leave, and I'm drowning in teaching at the moment, so I haven't had a chance to check this out properly (and wasn't able to get it running on my installation anyway).

My hunch is that there is a bug in the missing sites code in the way the caching works.

I expect to have some time next Tuesday and I'll fire up the debugger then!


-David.

Pip Griffin

Sep 21, 2014, 6:44:08 PM
to beast-users
Thanks a lot David, will check back again on Tuesday

Pip

daniel.m...@yale.edu

Sep 22, 2014, 7:56:41 PM
to beast...@googlegroups.com
Hi all,

I've been in email contact with Remco about this issue. It seems there's some issue with the current version of SNAPP. Here's what he had to say.

"Hi Daniel,

Your file [referring to my data file, not the test file] runs fine (though very slowly) with the latest development code, but won't start with SNAPP v1.1.6. Not sure why this happens, but it looks like a new release is required to get your file running.

Will let you know as soon as this will be done."

Remco Bouckaert

Sep 23, 2014, 5:11:46 PM
to beast...@googlegroups.com
Hi Daniel,

Just to let you know about a quick fix: the file you sent runs with BEAST pre-release 2.2.0 -- available from https://github.com/CompEvol/beast2/releases/tag/v2.2.0-pre for Linux/command line only. Note that BEAST v2.2.0-pre requires Java 8.

I prepared the SNAPP package for this pre-release, so you can install SNAPP v1.1.7 (this package is not compatible with v2.1.x). I will keep investigating what is going wrong with v2.1.x and let you know once I've made progress.

Cheers,

Remco

Pip Griffin

Oct 16, 2014, 9:17:42 PM
to beast-users
Hi Remco, Daniel and others,

I have been trying the 'quick fix' Remco suggested, but am still running into problems.

We have a SNP alignment (0,1,2 format) with a lot of missing data (indicated by '?').

With the 'non-polymorphic' box unchecked (indicating the data excludes non-polymorphic sites), the program appears to run okay, but still shows a static likelihood of 0.0.

With the 'non-polymorphic' box checked (which is inappropriate for our data), the program runs and the likelihood trace looks okay, but it appears to get stuck somewhere in parameter space: each run produces an MCC tree with one highly supported node (pp = 1) and very low support (pp < 0.01) for all the rest. The supported node varies among runs and doesn't reflect the major structure we do see in our data.

Otherwise I have kept all the default settings as per the example AFLP data file - which runs fine, by the way.

Can you shed any light on this?

thanks for your help
Pip

Remco Bouckaert

Oct 19, 2014, 5:12:45 PM
to beast...@googlegroups.com
Hi Pip,

The default settings in the example file are probably not appropriate for your data -- a Bayesian analysis unfortunately requires you to think about these priors. It is quite possible that the default priors lead both to trees you do not expect and to the convergence issues you describe.

In the Rough Guide (http://beast2.cs.auckland.ac.nz/SNAPPv1.2.pdf) there are some hints on how to set up these priors and calculate default values for the u and v parameters.
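[Editor's note: one common parameterisation for these defaults -- an assumption here, so check the guide's own derivation -- takes p as the empirical frequency of the 1 allele, requires the stationary frequency of 1s to equal p, and scales the rates so the expected mutation rate at stationarity is 1. That yields u = 1/(2(1-p)) and v = 1/(2p).]

```python
# Back-of-envelope default mutation rates for SNAPP, assuming u is the
# 0->1 rate and v the 1->0 rate under the scaling described above.
# (Sketch only -- verify against the Rough Guide's derivation.)
def snapp_rates(p):
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    u = 1.0 / (2.0 * (1.0 - p))   # 0 -> 1
    v = 1.0 / (2.0 * p)           # 1 -> 0
    return u, v

print(snapp_rates(0.5))   # (1.0, 1.0) -- symmetric data
print(snapp_rates(0.25))  # roughly (0.667, 2.0) -- 1 allele is rarer
```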

It worries me a bit that you get static likelihoods of 0. If you can send me that file I'd be happy to have a look at it.

Hope this helps,

Remco

Pip Griffin

Oct 19, 2014, 7:54:36 PM
to beast-users
Hi Remco,

Yes, I expect to have to adjust the priors somewhat before a final analysis - currently I'm just trying to check that the program runs as it should. I'm attaching the .xml and .log files from the run with likelihood 0 - would be fantastic if you can have a look.

Thanks very much
Pip
ast_test.xml
ast_test.log.zip

Pip Griffin

Nov 9, 2014, 8:22:58 PM
to beast-users, higg...@gmail.com, daniel.m...@yale.edu
Hi Daniel and Remco,

I was just wondering whether you've had any luck figuring out the SNAPP issue Daniel and I both encountered and posted to the mailing list. I am still stuck without a working solution.

I'm still trying the combination of pre-release BEAST v2.2.0, SNAPP 1.1.7 and Java 1.8 that you recommended, Remco. I've noticed a few patterns:

- Runs produce a static likelihood of 0 if the 'invariant sites remain in data' flag is set to 'false'.
- If the 'invariant sites remain in data' flag is set to 'true' (which is inappropriate for my data!), the likelihood varies in a way that looks normal. However, the run produces an MCC tree that is totally unexpected: it has full support for the basal node and near-zero support for all other nodes, and it differs completely between runs.

I would love to be able to use this software. Can you let me know if you're any closer to finding a solution?

thanks again
Pip