Analysis of variable sites only in BEAST (Bis)


adrien...@gmail.com

Feb 27, 2014, 1:06:36 PM
to beast...@googlegroups.com
Hi,

I'm trying to run BEAST using variant sites only, as previously described by Andrew: https://groups.google.com/forum/#!searchin/beast-users/SNPs/beast-users/V5vRghILMfw/jMtC_DwS5EYJ

I have started by replacing the following part of the XML:

       <patterns id="patterns" from="1" every="1" >
               <alignment idref="alignment"/>
       </patterns>


by:
       <mergePatterns id="patterns">
               <patterns from="1" every="1">
                       <alignment idref="alignment"/>
               </patterns>

               <constantPatterns>
                       <alignment idref="alignment">
                       <counts>
                               <parameter value="698995 1320628 1316480 697848"/>
                       </counts>
               </constantPatterns>
       </mergePatterns>


The numbers are my counts of the constant A, C, G and T sites.

But I keep getting this fatal error:

    Line number: 213
    Column number: 19
    Error message: The element type "alignment" must be terminated by the matching end-tag "</alignment>".


BEAST has terminated with an error. Please select QUIT from the menu.


I've tried manually adding the required end-tag in different places, without success.

Can someone give me a hand? I've attached my XML.

Also, I don't think this is related, but in my XML file, instead of
  <patterns from="1" every="1">
I find:
<patterns id="patterns" from="1" strip="false">

BEAST runs when I don't modify the XML at all, but I guess the divergence time and rate estimates will be biased if I don't specify how many invariant sites I have. Isn't there an a posteriori way of correcting the estimates to account for the invariant sites?

Thanks
Adrien





M_only.fastab.xml

Andrew Rambaut

Feb 27, 2014, 3:45:36 PM
to beast...@googlegroups.com
Dear Adrien,

Change this line:
                       <alignment idref="alignment"> 

to:
                       <alignment idref="alignment" /> 
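
The reported error comes from the unclosed <alignment> reference inside <constantPatterns>. As a minimal sketch (not part of BEAST), the fragment can be checked for well-formedness with Python's standard library before launching a run. The helper names `is_well_formed` and `constant_counts` are hypothetical, and the fragment below is copied from the question rather than from the attached file:

```python
import xml.etree.ElementTree as ET

# Fragment as posted in the question: the second <alignment> is never closed.
BROKEN = """<mergePatterns id="patterns">
    <patterns from="1" every="1">
        <alignment idref="alignment"/>
    </patterns>
    <constantPatterns>
        <alignment idref="alignment">
        <counts>
            <parameter value="698995 1320628 1316480 697848"/>
        </counts>
    </constantPatterns>
</mergePatterns>"""

# Same fragment with the reference written as a self-closing element.
# Only the unclosed tag matches: the first one already ends in "/>".
FIXED = BROKEN.replace('<alignment idref="alignment">',
                       '<alignment idref="alignment"/>')


def is_well_formed(xml_text):
    """Return (True, None) if the text parses, else (False, parser message)."""
    try:
        ET.fromstring(xml_text)
        return True, None
    except ET.ParseError as err:
        return False, str(err)


def constant_counts(xml_text):
    """Read the four constant-site counts from the <constantPatterns> block."""
    root = ET.fromstring(xml_text)
    param = root.find("constantPatterns/counts/parameter")
    return [int(v) for v in param.get("value").split()]


if __name__ == "__main__":
    print(is_well_formed(BROKEN))
    print(is_well_formed(FIXED))
    print(constant_counts(FIXED))
```

The exact wording of the parser message differs from the one BEAST prints, but both point at the same unclosed element.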

Andrew


Andrew Rambaut 
Institute for Evolutionary Biology | Centre for Infection, Immunity & Evolution 
Ashworth Laboratories, University of Edinburgh, Edinburgh, EH3 9JT, UK



adrien...@gmail.com

Feb 28, 2014, 4:45:34 AM
to beast...@googlegroups.com
Thanks Andrew, it's working.
Could you briefly explain how and when BEAST takes the number of invariant sites into account?
Will it change the estimated rates and divergence times?
Also, I'll post a message in the original thread to correct the typo.
Best,
Adrien

Andrew Rambaut

Feb 28, 2014, 4:54:49 AM
to beast...@googlegroups.com
Normally, when BEAST loads an alignment, it compresses it into a set of unique site patterns (columns in the alignment with the same pattern of nucleotides). Because the same pattern will have the same likelihood under any model, it calculates the likelihood once and then multiplies it by the number of times that pattern occurs. In most alignments the 4 constant patterns (A, C, G & T) are the most frequent, so for low-diversity alignments this compression is substantial.

If you have an alignment with all the constant sites removed then this will just put them back in with the appropriate weights. You end up with exactly the same result as loading the very large alignment and then allowing BEAST to compress it. 
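
The equivalence described above can be illustrated with a toy example. This is a sketch of the idea only, not BEAST's implementation: the function names and the six-site alignment are made up, and the counts are assumed to be given in A, C, G, T order as in the question.

```python
from collections import Counter


def site_patterns(seqs):
    """Compress an alignment (equal-length strings, one per taxon)
    into a mapping of column pattern -> number of occurrences."""
    return Counter(zip(*seqs))


def add_constant_patterns(patterns, counts, n_taxa):
    """Add the constant A, C, G, T patterns back with the given weights.
    Assumes `counts` is ordered A, C, G, T."""
    merged = Counter(patterns)
    for base, count in zip("ACGT", counts):
        merged[(base,) * n_taxa] += count
    return merged


# Toy full alignment: 3 taxa, 6 sites, only the last site is variable.
full = ["AACGTA",
        "AACGTC",
        "AACGTA"]

# The same data with constant sites stripped out, plus their counts.
variable_only = ["A",
                 "C",
                 "A"]
constant_counts = [2, 1, 1, 1]  # A, C, G, T

compressed_full = site_patterns(full)
rebuilt = add_constant_patterns(site_patterns(variable_only),
                                constant_counts, n_taxa=3)

# Both routes give the same weighted set of patterns, so any likelihood
# computed as sum(weight * per-pattern log-likelihood) is identical.
print(compressed_full == rebuilt)
```

Because the likelihood depends only on the patterns and their weights, the two inputs are interchangeable as far as the tree likelihood is concerned.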

Andrew

