model selection

546 views
Skip to first unread message

EvoClive

unread,
May 11, 2012, 11:56:33 AM5/11/12
to beast-users
Sorry about this, I know it's obviously straightforward but I can't
work this out.

In the Model Selection tutorial, the bit of code for running a "series
of 100 power posteriors along the path between prior en posterior".
What do I do with that bit of code? When I run it I get an error:
"Object with idref=mcmc has not been previously declared".

Is it supposed to be attached at the bottom of another XML file?
Obviously the the mcmc id has to be declared earlier in the XML.

Clive

Guy Baele

unread,
May 12, 2012, 5:40:08 AM5/12/12
to beast...@googlegroups.com
Yes, it is supposed to be attached at the bottom of your standard MCMC analysis file, where the mcmc block needs to have id=mcmc.
This doesn't mean that you have to run the whole analysis first, you could set the chain length in the mcmc block to zero (although I wouldn't recommend it).
But you do have to run the standard MCMC analysis past its burnin, to achieve high accuracy for your marginal likelihood.

Best regards,
Guy


Op vrijdag 11 mei 2012 17:56:33 UTC+2 schreef EvoClive het volgende:

EvoClive

unread,
May 15, 2012, 6:58:59 AM5/15/12
to beast-users
OK, so if I attach it at the bottom of my MCMC I still get the "Object
with idref=mcmc has not been previously declared" error message.

But as far as I can see id="mcmc" is defined at the top of the mcmc
section. XML follows, id="mcmc" is in first line, additional code is
last part of block:

<!-- Define
MCMC -->
<mcmc id="mcmc" [SURELY MCMC IS DEFINED HERE?]
chainLength="120000000" autoOptimize="true"
operatorAnalysis="C1.ops.txt">
<posterior id="posterior">
<prior id="prior">
<logNormalPrior mean="1.0" stdev="1.25" offset="0.0"
meanInRealSpace="false">
<parameter idref="kappa"/>
</logNormalPrior>
<uniformPrior lower="0.0" upper="1.0">
<parameter idref="frequencies"/>
</uniformPrior>
<uniformPrior lower="0.0" upper="1.0E100">
<parameter idref="skyline.popSize"/>
</uniformPrior>
<generalizedSkyLineLikelihood idref="skyline"/>
<exponentialMarkovLikelihood idref="eml1"/>
</prior>
<likelihood id="likelihood">
<treeLikelihood idref="treeLikelihood"/>
</likelihood>
</posterior>
<operators idref="operators"/>

<!-- write log to
screen -->
<log id="screenLog" logEvery="1000">
<column label="Posterior" dp="4" width="12">
<posterior idref="posterior"/>
</column>
<column label="Prior" dp="4" width="12">
<prior idref="prior"/>
</column>
<column label="Likelihood" dp="4" width="12">
<likelihood idref="likelihood"/>
</column>
<column label="rootHeight" sf="6" width="12">
<parameter idref="treeModel.rootHeight"/>
</column>
<column label="clock.rate" sf="6" width="12">
<parameter idref="clock.rate"/>
</column>
</log>

<!-- write log to
file -->
<log id="fileLog" logEvery="6000" fileName="C1.log.txt"
overwrite="false">
<posterior idref="posterior"/>
<prior idref="prior"/>
<likelihood idref="likelihood"/>
<parameter idref="treeModel.rootHeight"/>
<parameter idref="skyline.popSize"/>
<parameter idref="skyline.groupSize"/>
<parameter idref="kappa"/>
<parameter idref="frequencies"/>
<parameter idref="clock.rate"/>
<treeLikelihood idref="treeLikelihood"/>
<generalizedSkyLineLikelihood idref="skyline"/>
</log>

<!-- write tree log to
file -->
<logTree id="treeFileLog" logEvery="6000" nexusFormat="true"
fileName="C1.(time).trees.txt" sortTranslationTable="true">
<treeModel idref="treeModel"/>
<strictClockBranchRates idref="branchRates"/>
<posterior idref="posterior"/>
</logTree>
<logTree id="substTreeFileLog" logEvery="6000" nexusFormat="true"
fileName="C1.(subst).trees" branchLengths="substitutions">
<treeModel idref="treeModel"/>
<strictClockBranchRates idref="branchRates"/>
</logTree>

[ADDITIONAL CODE]
<marginalLikelihoodEstimator chainLength="100000" pathSteps="100"
pathScheme="betaquantile" alpha="0.30">
<samplers>
<mcmc idref="mcmc"/>
</samplers>
<pathLikelihood id="pathLikelihood">
<source>
<posterior idref="posterior"/>
</source>
<destination>
<prior idref="prior"/>
</destination>
</pathLikelihood>
<log id="MLE.bsp" logEvery="1000" fileName="MLE.bsp.log">
<pathLikelihood idref="pathLikelihood"/>
</log>
</marginalLikelihoodEstimator>
</mcmc>
<report>
<property name="timer">
<mcmc idref="mcmc"/>
</property>
</report>
</beast>

Guy Baele

unread,
May 15, 2012, 10:40:22 AM5/15/12
to beast...@googlegroups.com
You've put the marginalLikelihoodEstimator block within the mcmc block, which is incorrect.
First close the mcmc block and then paste the marginalLikelihoodEstimator XML-code there, like this:


<beast>

       <mcmc>
         ...
        </mcmc> 

<marginalLikelihoodEstimator chainLength="100000" pathSteps="100" pathScheme="betaquantile" alpha="0.30"> 
            <samplers> 
            <mcmc idref="mcmc"/> 
        </samplers> 
        <pathLikelihood id="pathLikelihood"> 
            <source> 
                <posterior idref="posterior"/> 
            </source> 
            <destination> 
                <prior idref="prior"/> 
            </destination> 
        </pathLikelihood> 
        <log id="MLE.bsp" logEvery="1000" fileName="MLE.bsp.log"> 
            <pathLikelihood idref="pathLikelihood"/> 
        </log> 
    </marginalLikelihoodEstimator> 
        <report> 
                <property name="timer"> 
                        <mcmc idref="mcmc"/> 
                </property> 
        </report> 

</beast> 



Cheers,
Guy



Op vrijdag 11 mei 2012 16:56:33 UTC+1 schreef EvoClive het volgende:

Op vrijdag 11 mei 2012 16:56:33 UTC+1 schreef EvoClive het volgende:

EvoClive

unread,
May 15, 2012, 3:05:08 PM5/15/12
to beast...@googlegroups.com
Thanks again

Clive

EvoClive

unread,
May 24, 2012, 2:24:25 PM5/24/12
to beast...@googlegroups.com
Hi

So, I ran the analysis with the above code included. There is no file called "MLE.bsp.log" that has been outputted.

Any ideas

Clive

C.-j. Mei

unread,
May 24, 2012, 2:57:38 PM5/24/12
to beast...@googlegroups.com
For Model selection, there are 80+ different models, there is a tool called jModelTest which can be used to check a set of alligned sequences and see what's the best model to use with the corresponding set of parameters. We, scoentists in Arizona State University (ASU), developed a workflow which can align the sequence set you have, run the jModelTest to your set of sequences and determine the best model to use and generate the XML input for you.
 
C.-J.
 


 

--
You received this message because you are subscribed to the Google Groups "beast-users" group.
To view this discussion on the web visit https://groups.google.com/d/msg/beast-users/-/qmGDLtVQQbMJ.
To post to this group, send email to beast...@googlegroups.com.
To unsubscribe from this group, send email to beast-users...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/beast-users?hl=en.

EvoClive

unread,
May 24, 2012, 4:17:17 PM5/24/12
to beast...@googlegroups.com
Hi guys

I ran it on my on PC and I get the file, seems it's some servers offering BEAST that don't generate the MLE.bsp.log file

To unsubscribe from this group, send email to beast-users+unsubscribe@googlegroups.com.
To unsubscribe from this group, send email to beast-users+unsubscribe@googlegroups.com.
To unsubscribe from this group, send email to beast-users+unsubscribe@googlegroups.com.
To unsubscribe from this group, send email to beast-users+unsubscribe@googlegroups.com.
To unsubscribe from this group, send email to beast-users+unsubscribe@googlegroups.com.

C.-j. Mei

unread,
May 24, 2012, 4:29:37 PM5/24/12
to beast...@googlegroups.com
It's all depending on the output specification yourself specified. Look at the end of your input XML.
 
C.-J. Mei
 
To view this discussion on the web visit https://groups.google.com/d/msg/beast-users/-/slyYNGaI_ycJ.

To post to this group, send email to beast...@googlegroups.com.
To unsubscribe from this group, send email to beast-users...@googlegroups.com.

EvoClive

unread,
May 25, 2012, 5:50:14 AM5/25/12
to beast...@googlegroups.com
Hi

But exactly the same .XML generated a log file on my own PC but not on the Oslo Bioportal??

C

C.-j. Mei

unread,
May 25, 2012, 9:20:49 AM5/25/12
to beast...@googlegroups.com
Oh, a portal. That's why. It'll generate only links/connection for certain output files, not all of them. Email the support and see if you can access the additional file(s).
 
C.-J.
 
To view this discussion on the web visit https://groups.google.com/d/msg/beast-users/-/XD3O2rjZ0_AJ.

To post to this group, send email to beast...@googlegroups.com.
To unsubscribe from this group, send email to beast-users...@googlegroups.com.

Giap Nguyen

unread,
May 25, 2012, 11:54:02 PM5/25/12
to beast...@googlegroups.com

As I understant, model selection just applied from BEAST v1.7.1 or later, but not in Bioportal server (v.1.6.1)

Jacob Berv

unread,
Apr 26, 2013, 1:26:33 PM4/26/13
to beast...@googlegroups.com
What if you've already run your MCMC analysis - how would you analyze a log file from a normal BEAST run to estimate the marginal likelihood using SS/PS? 

mountainmanjared

unread,
Apr 27, 2013, 7:10:33 PM4/27/13
to beast...@googlegroups.com
You can't generate PS and SS MLEs from your normal .log file. You need to re-run the analysis with the code provided in the Baele et al. (2012) supplement that generates a separate .log file that is used in the PS/SS marginal likelihood calculations. You can only run the HME, sHME, and AICM with your "normal" MCMC .log output that you already have (by commenting out the MCMC section of your .xml, or setting your chainlength to 0, I can't remember which way it is).

Jared

Jacob Berv

unread,
Apr 27, 2013, 8:46:41 PM4/27/13
to beast...@googlegroups.com
I've been reading through these forums and it seems to be the consensus that these methods are much better for estimating the likelihood - are there circumstances where it would be acceptable to sue the HME, sHME  or AICM? For example, if you are comparing partitioning schemes (not demographic or clock models) and the difference in likelihood between two schemes as estimated by the HME, sHME, or AICM (100s or 1000s of units), is it possible that the PS/SS methods might give me a different result? In some cases the BF comparison as estimated by Tracer is >1000. 

Jacob Berv

unread,
Apr 27, 2013, 8:49:54 PM4/27/13
to beast...@googlegroups.com
Sorry - that didn't make sense - I meant to say - 

if you are comparing partitioning schemes (not demographic or clock models) and the difference in likelihood between two schemes as estimated by the HME, sHME, or AICM is very large, (100s or 1000s of units), is it possible that the PS/SS methods might give me a different result?

mountainmanjared

unread,
Apr 29, 2013, 12:24:46 PM4/29/13
to beast...@googlegroups.com
It is possible, just probably unlikely. The sHME and HME have been shown to be inferior ML estimators compared to PS and SS estimators, so there's no reason to be doing Bayes factors testing with the harmonic mean estimators when we have the means to estimate MLs with better estimators, even if it means re-doing analyses...

Jared

Jacob Berv

unread,
Apr 29, 2013, 1:52:03 PM4/29/13
to beast...@googlegroups.com
Unfortunately the analyses I'm doing takes about a month (large species tree) to run on our cluster, so re-doing the analyses is not a practical option. 


--
You received this message because you are subscribed to a topic in the Google Groups "beast-users" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/beast-users/KuDpxg-liPc/unsubscribe?hl=en.
To unsubscribe from this group and all its topics, send an email to beast-users...@googlegroups.com.

To post to this group, send email to beast...@googlegroups.com.
Reply all
Reply to author
Forward
0 new messages