optimize TNT tree


alicia alvarez

Aug 15, 2014, 8:50:58 AM
to garli...@googlegroups.com
Dear all,
I want to optimize a tree obtained with TNT.
I don't understand how to set up the config file. Which settings should I use?
I wrote the following, but I don't know whether it is ready to run.
Any help would be very welcome.
[general]
datafname = none
constraintfile = none
streefname = TreeNex
attachmentspertaxon = 50
ofprefix = TreeNex
randseed = -1
availablememory = 512
logevery = 10
saveevery = 100
refinestart = 1
outputeachbettertopology = 0
outputcurrentbesttopology = 0
enforcetermconditions = 1
genthreshfortopoterm = 20000
scorethreshforterm = 0.05
significanttopochange = 0.01
outputphyliptree = 0
outputmostlyuselessfiles = 0
writecheckpoints = 0
restart = 0
outgroup = 1
resampleproportion = 1.0
inferinternalstateprobs = 0
outputsitelikelihoods = 0
optimizeinputonly = 1
collapsebranches = 1

searchreps = 2
bootstrapreps = 0

Thanks in advance,
Alicia Álvarez

Derrick Zwickl

Aug 18, 2014, 5:10:24 PM
to garli...@googlegroups.com

Hi Alicia,

Sorry for my slow reply, I was on vacation.

The setting that triggers the optimization of a fixed tree is optimizeinputonly, which you have set below.  The other thing that must be done is specifying the Nexus formatted file that contains the tree(s) to be optimized.  That comes from the streefname option, which it looks like you've also set.

So, it looks like you have things set properly.  Let me know if you have problems/errors while running it.

Best,
Derrick

alicia alvarez

Aug 20, 2014, 8:20:30 AM
to garli...@googlegroups.com
Dear Derrick,
Thanks for answering!
Unfortunately, I got this error message:
"no model descriptions found in config file...."

I'm totally lost... sorry.

Best
Alicia

Derrick Zwickl

Aug 20, 2014, 1:09:50 PM
to garli...@googlegroups.com
Hi Alicia,

Sorry, I thought that the configuration you had pasted in your email was only part of your configuration file, not the whole thing. You do need more sections that specify things like the substitution model that you want to use. I also did not notice earlier that you hadn't specified an alignment file on the datafname line. I have pasted the default configuration file below, with a few changes to make it do what you want. Note that many of the settings relate to tree search details and have no effect when optimizeinputonly is set. You'll need to specify your alignment file as datafname, and the name of your tree file will need to match the name specified as streefname (TreeNex below). The default substitution model is GTR+I+G (the most complicated model), but it can be changed by editing the [model1] section.

One final note, from what you've said I've been assuming that what you want to do is to find the optimal branch lengths (and model parameters) under the ML criterion for the tree returned by TNT.  If I've misinterpreted that, let me know.

Best,
Derrick

[general]
datafname = your_alignment_filename_here

constraintfile = none
streefname = TreeNex
attachmentspertaxon = 50
ofprefix = TreeNex.out

randseed = -1
availablememory = 512
logevery = 10
saveevery = 100
refinestart = 1
outputeachbettertopology = 0
outputcurrentbesttopology = 0
enforcetermconditions = 1
genthreshfortopoterm = 20000
scorethreshforterm = 0.05
significanttopochange = 0.01
outputphyliptree = 0
outputmostlyuselessfiles = 0
writecheckpoints = 0
restart = 0
outgroup = 1
resampleproportion = 1.0
inferinternalstateprobs = 0
outputsitelikelihoods = 0
optimizeinputonly = 1
collapsebranches = 1

searchreps = 2
bootstrapreps = 0

[model1]
datatype = nucleotide
ratematrix = 6rate
statefrequencies = estimate
ratehetmodel = gamma
numratecats = 4
invariantsites = estimate

[master]
nindivs = 4
holdover = 1
selectionintensity = 0.5
holdoverpenalty = 0
stopgen = 5000000
stoptime = 5000000

startoptprec = 0.5
minoptprec = 0.01
numberofprecreductions = 10
treerejectionthreshold = 50.0
topoweight = 1.0
modweight = 0.05
brlenweight = 0.2
randnniweight = 0.1
randsprweight = 0.3
limsprweight =  0.6
intervallength = 100
intervalstostore = 5
limsprrange = 6
meanbrlenmuts = 5
gammashapebrlen = 1000
gammashapemodel = 1000
uniqueswapbias = 0.1
distanceswapbias = 1.0
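
The two problems with the first attempt (no alignment file on the datafname line and no model section) can be caught before launching GARLI. The following is only a minimal Python sketch and is not part of GARLI: the function name check_garli_config and the particular checks are assumptions made for illustration. It also assumes that "random" and "stepwise" are the special streefname values meaning "no user tree file", which is my understanding of the GARLI settings but should be checked against the manual.

```python
import configparser


def check_garli_config(text):
    """Return a list of likely problems in GARLI configuration text.

    Hypothetical helper for illustration; GARLI does not ship this.
    """
    parser = configparser.ConfigParser()
    parser.read_string(text)
    problems = []

    if not parser.has_section("general"):
        problems.append("missing [general] section")
    else:
        general = parser["general"]
        # The alignment must be named on the datafname line.
        if general.get("datafname", "none").strip().lower() in ("", "none"):
            problems.append("datafname does not name an alignment file")
        # Optimizing a fixed tree needs a tree file named in streefname.
        # Assumption: "random" and "stepwise" mean no user tree file.
        if general.get("optimizeinputonly", "0").strip() == "1":
            tree = general.get("streefname", "").strip().lower()
            if tree in ("", "none", "random", "stepwise"):
                problems.append("optimizeinputonly = 1 but streefname "
                                "does not name a tree file")

    # At least one [model...] section is needed for the substitution model.
    if not any(s.lower().startswith("model") for s in parser.sections()):
        problems.append("no [model...] section found")

    return problems


if __name__ == "__main__":
    first_attempt = (
        "[general]\n"
        "datafname = none\n"
        "streefname = TreeNex\n"
        "optimizeinputonly = 1\n"
    )
    for problem in check_garli_config(first_attempt):
        print("Problem:", problem)
```

Run on a config like the first attempt in this thread, the sketch reports the missing alignment file and the missing model section. It does not check anything else that GARLI itself validates.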

alicia alvarez

Aug 20, 2014, 1:33:16 PM
to garli...@googlegroups.com
Hi Derrick
I figured it was something like that: the config file was incomplete! Thanks for the clarification!

And yes, you are interpreting correctly: I want to optimize the branch lengths of a tree obtained with TNT. But I don't have a standalone alignment file. Instead I have a matrix with molecular data plus morphological data (the matrix contains phylogenetic data for extant and extinct species of my group). Is that a problem?

Thanks again,
best
Alicia


Derrick Zwickl

Aug 21, 2014, 3:14:03 PM
to garli...@googlegroups.com

Hi Alicia,

Were you planning on including the morphological data during the tree optimization?  If you want to include those extinct species in the tree then you will need to.  Things become a bit more complicated in that case.  You'll want to read about configuring partitioned models and configuring the morphology models here:

https://www.nescent.org/wg_garli/Mkv_morphology_model

GARLI will unfortunately require that the two types of data are in separate NEXUS characters blocks (one nucleotide, one "standard" data).  So, having the two datatypes in a single matrix will be a problem.  There is some information about this on the above page, and if you download any version of the program you'll find example files in the example/partition/exampleRuns/mixedDnaMkv/ directory.
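
To illustrate what "separate NEXUS characters blocks" means, here is a rough Python sketch that writes one TAXA block followed by two CHARACTERS blocks, one DNA and one standard. It is not a GARLI tool: the function name nexus_two_blocks, the block titles and the toy taxa are all made up for illustration. It assumes single-character state symbols with no polymorphism codes, and it pads taxa that lack one data type (for example fossils without sequences) with missing data.

```python
def nexus_two_blocks(dna, morph):
    """Build NEXUS text with separate DNA and standard CHARACTERS blocks.

    dna and morph map taxon name -> character string. A taxon missing
    from one dictionary is filled with '?' in that block.
    Hypothetical helper for illustration only.
    """
    taxa = sorted(set(dna) | set(morph))
    n_dna = len(next(iter(dna.values())))
    n_morph = len(next(iter(morph.values())))
    # State symbols actually used in the morphological matrix.
    symbols = "".join(sorted(set("".join(morph.values())) - set("?-")))

    def block(title, nchar, fmt, rows):
        lines = [
            "BEGIN CHARACTERS;",
            f"  TITLE {title};",
            f"  DIMENSIONS NCHAR={nchar};",
            f"  FORMAT {fmt};",
            "  MATRIX",
        ]
        for taxon in taxa:
            lines.append(f"    {taxon}  {rows.get(taxon, '?' * nchar)}")
        lines += ["  ;", "END;"]
        return lines

    out = [
        "#NEXUS",
        "BEGIN TAXA;",
        f"  DIMENSIONS NTAX={len(taxa)};",
        "  TAXLABELS " + " ".join(taxa) + ";",
        "END;",
    ]
    out += block("dna", n_dna, "DATATYPE=DNA MISSING=? GAP=-", dna)
    out += block(
        "morphology",
        n_morph,
        f'DATATYPE=STANDARD MISSING=? GAP=- SYMBOLS="{symbols}"',
        morph,
    )
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Toy data only; the taxon names are placeholders.
    dna = {"extantA": "ACGT", "extantB": "ACGA"}
    morph = {"extantA": "012", "extantB": "110", "fossilC": "0?1"}
    print(nexus_two_blocks(dna, morph))
```

The example files in the directory mentioned above remain the authoritative reference for how the blocks and the matching configuration should look.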

Let me know how it goes.
Best,
Derrick

alicia alvarez

Sep 4, 2014, 10:36:33 AM
to garli...@googlegroups.com
Hi Derrick
Sorry about the delay!!

It worked fine!!!! (I think!)
part of the output says:

GARLI data subset 5
    CHARACTERS block #5 ("morphology")
    Data read as Standard k-state data,
    modeled as Standard k-state data
    ****
    WARNING - Constant characters found in standard data matrix (sites  1)
    Currently these will be ignored because including them in the
    likelihood calculations would require knowledge of how many states
    were possible for those columns (i.e., 1 state was observed, but
    was that out of 2 possible, or 3 or 4, etc)
    ****
    Part ambig. char's of taxon Acaremysgroup converted to full ambiguity:
      char  57
    Part ambig. char's of taxon Spalacopus converted to full ambiguity:
      char  23
    Part ambig. char's of taxon Aconaemys converted to full ambiguity:
      char  23  25
    Part ambig. char's of taxon Pithanotomys converted to full ambiguity:
      char  23
    Part ambig. char's of taxon Eucelophorus converted to full ambiguity:
      char  51
    Part ambig. char's of taxon Actenomys converted to full ambiguity:
      char  11
    Part ambig. char's of taxon Ctenomys converted to full ambiguity:
      char  11
    Part ambig. char's of taxon Thrichomys converted to full ambiguity:
      char  35
    Part ambig. char's of taxon Isothrix converted to full ambiguity:
      char  19
    Part ambig. char's of taxon Stichomys converted to full ambiguity:
      char  51  59
    Part ambig. char's of taxon Prospaniomys converted to full ambiguity:
      char  57
    Subset of data with 2 states:
      chars 2 5-8 10-27 29-45 48-51 53 55-60 68-70 72 73 75-77
    Summary of data:
      69 sequences.
      0 constant characters.
      59 parsimony-informative characters.
      0 uninformative variable characters.
      59 total characters.
      58 unique patterns in compressed data matrix.
    Pattern processing required < 1 second

    Part ambig. char's of taxon Acaremysgroup converted to full ambiguity:
      char  3  66
    Part ambig. char's of taxon Chasichimys converted to full ambiguity:
      char  66
    Part ambig. char's of taxon Ctenomys converted to full ambiguity:
      char  9
    Subset of data with 3 states:
      chars 3 4 9 28 46 47 52 54 64 66 67 74
    Summary of data:
      69 sequences.
      0 constant characters.
      12 parsimony-informative characters.
      0 uninformative variable characters.
      12 total characters.
      12 unique patterns in compressed data matrix.
    Pattern processing required < 1 second

    Subset of data with 4 states:
      chars 61 62 65
    Summary of data:
      69 sequences.
      0 constant characters.
      0 parsimony-informative characters.
      3 uninformative variable characters.
      3 total characters.
      3 unique patterns in compressed data matrix.
    Pattern processing required < 1 second

    Part ambig. char's of taxon Proechimys converted to full ambiguity:
      char  71
    Subset of data with 5 states:
      chars 63 71
    Summary of data:
      69 sequences.
      0 constant characters.
      2 parsimony-informative characters.
      0 uninformative variable characters.
      2 total characters.
      2 unique patterns in compressed data matrix.
    Pattern processing required < 1 second


And then it looks like it treated my morphological dataset as 4 separate datasets:
Model 5
  Number of states = 2 (standard data)
  Character change matrix:
    One rate (symmetric one rate Mk model)
  Equilibrium State Frequencies: equal (0.50, fixed)
  Rate Heterogeneity Model:
    no rate heterogeneity

Model 6
  Number of states = 3 (standard data)
  Character change matrix:
    One rate (symmetric one rate Mk model)
  Equilibrium State Frequencies: equal (0.33, fixed)
  Rate Heterogeneity Model:
    no rate heterogeneity

Model 7
  Number of states = 4 (standard data)
  Character change matrix:
    One rate (symmetric one rate Mk model)
  Equilibrium State Frequencies: equal (0.25, fixed)
  Rate Heterogeneity Model:
    no rate heterogeneity

Model 8
  Number of states = 5 (standard data)
  Character change matrix:
    One rate (symmetric one rate Mk model)
  Equilibrium State Frequencies: equal (0.20, fixed)
  Rate Heterogeneity Model:
    no rate heterogeneity

This part confused me a little...

Thanks for all your help. This is a whole new field for me!

Best
Alicia

Derrick Zwickl

Sep 10, 2014, 4:43:16 PM
to garli...@googlegroups.com

Hi Alicia,

Yes, that all looks correct.  Sorry, the configuration and output when using those models can be a bit confusing!
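
For readers puzzled by the same output: the log shows the morphological characters being grouped by how many states they have (2, 3, 4 or 5), with each group getting its own one-rate Mk model and equal state frequencies of 1/k (0.50, 0.33, 0.25, 0.20). Constant characters are dropped, as the warning says. The Python sketch below mimics that grouping on a toy matrix. It is an illustration only, not GARLI's code: the function names are hypothetical, it counts observed states per column, and it ignores partial ambiguity.

```python
from collections import defaultdict


def group_by_state_count(matrix, missing="?-"):
    """Group 1-based column numbers by their number of observed states.

    matrix maps taxon name -> string of single-character states.
    Columns with fewer than two observed states are skipped, mirroring
    the "constant characters ... will be ignored" warning in the log.
    """
    ncols = len(next(iter(matrix.values())))
    groups = defaultdict(list)
    for col in range(ncols):
        states = {row[col] for row in matrix.values()} - set(missing)
        if len(states) >= 2:
            groups[len(states)].append(col + 1)
    return dict(groups)


def equal_frequency(k):
    """Equilibrium frequency of each state under an equal-frequency Mk model."""
    return 1.0 / k


if __name__ == "__main__":
    # Toy matrix: column 1 is constant, columns 2-3 have 2 states,
    # column 4 has 3 states.
    toy = {"t1": "0010", "t2": "0121", "t3": "01?2"}
    for k, cols in sorted(group_by_state_count(toy).items()):
        print(f"{k} states: chars {cols}, "
              f"frequency per state {equal_frequency(k):.2f}")
```

So seeing four models for one morphology block is expected: it is one subset per state count, not four separately supplied datasets.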

Best,
Derrick

alicia alvarez

Sep 11, 2014, 2:08:30 PM
to garli...@googlegroups.com
This is good news!
Thanks a lot for all your help!

Best,
Alicia