NCL in Garli 2.01 doesn't like NEXUS tree file for optimizeinputonly


goo...@d-dub.org.uk

Oct 24, 2014, 10:38:54 AM
to garli...@googlegroups.com

Hi

I'm trying to calculate site-wise log likelihoods for a pre-calculated topology on an alignment.

Among other Garli settings I have:

optimizeinputonly = 1
outputsitelikelihoods = 1
streefname = response_to_reviewers/isolates_for_fig_variant_positions_synon_tree.nex

When I save the tree to a NEXUS file from ape or DendroPy, I get this error on launching Garli:

Multiple TAXA Blocks have been read (or implied using NEWTAXA in other blocks) and a TREE command (which requires a TAXA block) has been encountered in a TREES block..
This can be caused by reading multiple files. It is possible that
each file is readable separately, but cannot be read unambiguously when read in sequence.
One way to correct this is to use the
    TITLE some-unique-name-here ;
command in the TAXA block and an accompanying
    LINK TAXA=the-unique-title-goes here;
command to specify which TAXA block is needed.
Line:   71
Column: 10

Following the suggestion to link the TREES block to the TAXA block, I get this error message:

Reading TAXA block...storing read block: TAXA
 successful
Reading TREES block...storing read block: TREES
 successful

ERROR: No nexus trees block or Garli block was found in file response_to_reviewers/isolates_for_fig_variant_positions_synon_tree.nex,
     which was specified as the source of starting model and/or tree


Finally, using a nexus tree file that Garli itself produced (for the best tree) gives the same error as the latter.

How can I prepare a tree to NCL's satisfaction to calculate site-wise log likelihoods on an alignment?

thanks!

Dave




goo...@d-dub.org.uk

Oct 24, 2014, 10:42:58 AM
to garli...@googlegroups.com, goo...@d-dub.org.uk

Correction: using Garli's own tree returns this error:

Reading TREES block...
Unknown taxon CF01 i01 in TRANSLATE command.
The translate key 1 has NOT been added to the translation table!
Line:   5
Column: 15

ERROR: NCL encountered a problem reading the dataset.



Derrick Zwickl

Oct 24, 2014, 2:26:55 PM
to garli...@googlegroups.com

Hi Dave,

I haven't seen that before.  It may be obvious, but first verify that the taxon names are exactly the same in all cases.  Note that the NEXUS standard treats spaces and underscores in an odd way, i.e.

my_taxon (underscore only)
and
'my taxon' (space in quotes)

are equivalent but

'my_taxon' (underscore in quotes)

is NOT.  It is possible that one of the conversions between programs fouled up the taxon names. 
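
The equivalence rule above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in this message, not NCL's actual code; the function name `nexus_label_key` is hypothetical. It would also be consistent with the earlier error reporting the taxon as `CF01 i01` rather than `CF01_i01`.

```python
def nexus_label_key(token: str) -> str:
    """Return the name a NEXUS reader would see for a label token.

    Assumption (from the rule described above): in an unquoted token,
    underscores stand for spaces; in a single-quoted token the contents
    are taken literally, with '' standing for one quote character.
    """
    token = token.strip()
    if len(token) >= 2 and token[0] == "'" and token[-1] == "'":
        return token[1:-1].replace("''", "'")
    return token.replace("_", " ")


# my_taxon and 'my taxon' name the same taxon; 'my_taxon' does not.
print(nexus_label_key("my_taxon") == nexus_label_key("'my taxon'"))
print(nexus_label_key("my_taxon") == nexus_label_key("'my_taxon'"))
```

Comparing labels through a key like this, rather than as raw strings, is one way to spot names that differ only in quoting or underscore handling.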

If that doesn't help, let me know exactly which blocks appear in which files and we can figure it out.

Best,
Derrick

goo...@d-dub.org.uk

Oct 28, 2014, 8:21:03 AM
to garli...@googlegroups.com

Hi Derrick, thanks for your response.

The underscores seem consistent between the TAXA block and the TREES block. Using the nexus file below, specified as 'streefname' in the configuration file, alongside a fasta file containing a nucleotide multiple alignment with the same labels, specified as 'datafname', the error returned is:

Loading starting model and/or tree from file response_to_reviewers/isolates_for_fig_variant_positions_synon_tree.nex

Reading TAXA block...storing read block: TAXA
 successful
Reading TREES block...storing read block: TREES
 successful

ERROR: No nexus trees block or Garli block was found in file response_to_reviewers/isolates_for_fig_variant_positions_synon_tree.nex,
     which was specified as the source of starting model and/or tree


#NEXUS

BEGIN TAXA;

    DIMENSIONS NTAX=57;
    TITLE the_taxa ;
    TAXLABELS
        LESB58
        CF06_i02
        CF06_i01
        CF04_i02
        CF04_i01
        CF01_i02
        CF01_i01
        CF07_i01
        CF09_i01
        CF05_i02
        CF10_i02
        CF10_i01
        CF09_i02
        CF07_i02
        CF08_i01
        CF08_i02
        CF03_i25
        CF03_i23
        CF03_i02
        CF03_i04
        CF03_i19
        CF03_i15
        CF03_i18
        CF03_i09
        CF03_i16
        CF03_i12
        CF03_i03
        CF03_i10
        CF03_i05
        CF05_i01
        CF03_i31
        CF03_i20
        CF03_i14
        CF03_i11
        CF03_i39
        CF03_i28
        CF03_i01
        CF03_i35
        CF03_i24
        CF03_i13
        CF03_i40
        CF03_i38
        CF03_i37
        CF03_i36
        CF03_i30
        CF03_i29
        CF03_i17
        CF03_i34
        CF03_i33
        CF03_i32
        CF03_i27
        CF03_i26
        CF03_i22
        CF03_i21
        CF03_i08
        CF03_i07
        CF03_i06
  ;
END;

BEGIN TREES;
    LINK TAXA=the_taxa;
    TREE 0 = [&U] (((((((((((((LESB58:0.01821748726,((CF06_i02:0.00925390888,CF06_i01:0.00438245479):0.0106492294,(((CF04_i02:0.00696734339,CF04_i01:0.01121447422):0.00217929529,CF01_i02:0.01596768014):0.004022303037,CF01_i01:0.01976327226):0.004386682063):0.0007242802531):0.00226250384,CF07_i01:0.0227541551):0.001664218027,(CF09_i01:0.02270564437,CF05_i02:0.1341125369):0.001664624782):0.0006223932141,((CF10_i02:0.002272729296,CF10_i01:0):0.03181818873,(CF09_i02:0.4681818783,CF07_i02:0.02499994636):4.842877388e-08):4.091707524e-07):0.002238203073,(CF08_i01:0.02492424473,CF08_i02:0.03189393878):4.397518933e-05):0.001571107656,(((((((CF03_i25:0.0,CF03_i23:0.0):0.00227272138,CF03_i02:5.820766091e-09):0.00110260956,CF03_i04:0.00117011182):0.00168180163,(CF03_i19:0.002272729762,CF03_i15:0.004545452073):0.0005581711885):0.001504082815,(CF03_i18:0.001304903533,CF03_i09:0.003240550868):0.00312713068):0.0004847465316,CF03_i16:0.002579291817):0.002027134411,(((CF03_i12:0.0,CF03_i03:0.0):0.0001715308172,CF03_i10:0.002101196442):0.0001471133437,CF03_i05:0.001979974564):0.01345750317):0.02682054788):0.001203621738,CF05_i01:0.02573202178):0.007261394989,(CF03_i31:0.002272728365,(CF03_i20:0.0,CF03_i14:0.0):0):0.005731797311):0.001448570052,CF03_i11:0.002667837543):0.008688115515,(CF03_i39:0.0,(CF03_i28:0.0,CF03_i01:0.0):0.0):0.0002705892548):0.002006688621,CF03_i35:5.086418241e-06):0.002267526928,(CF03_i24:0.0,CF03_i13:0.0):0.00454532681):1.279016431e-07,CF03_i40:0.0,(CF03_i38:0.002272726968,(CF03_i37:0,((CF03_i36:0.0,(CF03_i30:0.0,(CF03_i29:0.0,CF03_i17:0.0):0.0):0.0):0.002272726968,(CF03_i34:0.0,(CF03_i33:0.0,(CF03_i32:0.0,(CF03_i27:0.0,(CF03_i26:0.0,(CF03_i22:0.0,(CF03_i21:0.0,(CF03_i08:0.0,(CF03_i07:0.0,CF03_i06:0.0):0.0):0.0):0.0):0.0):0.0):0.0):0.0):0.0):1.164153218e-10):0):0.0):0.0);

END;

The configuration file, which I realise may include some irrelevant settings from when I was actually searching for trees (I just want the site-wise log-likelihood scores for this tree) is:

[general]
datafname = response_to_reviewers/isolates_for_fig_variant_positions_synon.fna
ofprefix = response_to_reviewers/isolates_for_fig_variant_positions_synon
datatype = nucleotide
outputsitelikelihoods = 1
ratematrix = (a b a a b a)
statefrequencies = estimate
numratecats = 1
ratehetmodel = none
invariantsites = none
streefname = response_to_reviewers/isolates_for_fig_variant_positions_synon_tree.nex
attachmentspertaxon = 5
availablememory = 12000
logevery = 10
saveevery = 100
treerejectionthreshold = 50
outputphyliptree = 1
searchreps = 5
topoweight = 1
modweight = 0.05
brlenweight = 0.2
randnniweight = 0.1
randsprweight = 0.3
limsprweight = 0.6
linkmodels = 0
subsetspecificrates = 1
randseed = -1
refinestart = 1
outputeachbettertopology = 0
outputcurrentbesttopology = 0
enforcetermconditions = 1
scorethreshforterm = 0.05
genthreshfortopoterm = 20000
significanttopochange = 0.01
outputmostlyuselessfiles = 0
writecheckpoints = 0
restart = 0
resampleproportion = 1.0
inferinternalstateprobs = 0
optimizeinputonly = 1
collapsebranches = 1
[master]
nindivs = 4
holdover = 1
selectionintensity = 0.5
holdoverpenalty = 0
stopgen = 5000000
stoptime = 5000000
startoptprec = 0.5
minoptprec = 0.01
numberofprecreductions = 10
treerejectionthreshold = 50.0
topoweight = 1.0
modweight = 0.05
brlenweight = 0.2
randnniweight = 0.1
randsprweight = 0.3
limsprweight =  0.6
intervallength = 100
intervalstostore = 5
limsprrange = 6
meanbrlenmuts = 5
gammashapebrlen = 1000
gammashapemodel = 1000
uniqueswapbias = 0.1
distanceswapbias = 1.0



goo...@d-dub.org.uk

Oct 28, 2014, 11:01:55 AM
to garli...@googlegroups.com, goo...@d-dub.org.uk

Derrick

I edited the starting tree I wanted to use into a constraints string and ran a conventional tree search with those constraints, obtaining site-wise log-likelihoods that way.

I then tried using Garli's 'best' tree as a starting tree for the original analysis I was attempting and Garli was happy to calculate site-wise log-likelihoods. (I thought I had tried with a Garli-produced nexus tree file but I guess not.) The difference between the nexus tree files written by R's ape and Python's DendroPy and those written by Garli is that the former listed a separate TAXA block (ape translated labels, dendropy didn't) while Garli included a 'translate' instruction for the labels inside a TREES block and had no TAXA block.
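
To make the second layout concrete, here is a minimal Python sketch that turns a plain Newick string into a TREES block with a TRANSLATE command and no TAXA block. The function name `to_translate_block` and the exact whitespace are assumptions for illustration, not a copy of Garli's real output, and the sketch assumes unquoted leaf labels with no whitespace inside the Newick string. Labels are written unquoted, so any underscores in them are still subject to the NEXUS underscore rule.

```python
import re


def to_translate_block(newick: str, tree_name: str = "tree1") -> str:
    """Write a TREES block that numbers the leaves via TRANSLATE."""
    keys = {}  # label -> numeric key, in order of first appearance

    def number(match):
        label = match.group(1)
        return str(keys.setdefault(label, len(keys) + 1))

    # Leaf labels directly follow "(" or "," and end at ":", "," or ")".
    numbered = re.sub(r"(?<=[(,])([^(),:;\s]+)", number, newick)
    table = ",\n".join(f"        {k} {name}" for name, k in keys.items())
    return (
        "#NEXUS\n\nBEGIN TREES;\n    TRANSLATE\n"
        f"{table}\n    ;\n"
        f"    TREE {tree_name} = [&U] {numbered}\nEND;\n"
    )


print(to_translate_block("((A_1:0.1,B_2:0.2):0.05,C_3:0.3);"))
```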

Nexus is apparently a complex and difficult to implement standard!

Once I got the tree through NCL, Garli seems to have behaved as expected though.

thanks for all your hard work on Garli.


Dave



goo...@d-dub.org.uk

Oct 28, 2014, 11:27:32 AM
to garli...@googlegroups.com, goo...@d-dub.org.uk

On Tuesday, October 28, 2014 3:01:55 PM UTC, goo...@d-dub.org.uk wrote:

Nexus is apparently a complex and difficult to implement standard!


But he was wrong! His labels were incorrect all the time!

Sorry for unnecessary bandwidth use :-)


goo...@d-dub.org.uk

Nov 19, 2014, 11:20:22 AM
to garli...@googlegroups.com, goo...@d-dub.org.uk

For the record, I'm sure now this was a problem with underscores in labels and probably a bug in NCL:

Making a tree using Garli from an alignment containing labels with underscores, then feeding that same tree (no taxa block, trees block with translate command) into Garli to calculate site-wise log-likelihoods returns this error:

Reading TREES block...
Unknown taxon CF01 i01 in TRANSLATE command.
The translate key 1 has NOT been added to the translation table!
Line:   5
Column: 13


ERROR: NCL encountered a problem reading the dataset.



Adding a taxa block and linking them via a block title returns:

Reading TAXA block...storing read block: TAXA
 successful
Reading TREES block...storing read block: TREES
 successful
Reading PAUP block...storing read block: PAUP
 successful

ERROR: No nexus trees block or Garli block was found in file alignments_isolates_coinfection_paper/01b_isolates_for_fig_variant_positions_CF05_singlepop_constrained_non_trivial_1SNP.best.tre,
     which was specified as the source of starting model and/or tree

Loading and saving the tree in dendropy in nexus format creates a file with a taxa block and a trees block but no translate command and this error:

Reading TAXA block...storing read block: TAXA
 successful
Reading TREES block...
Multiple TAXA Blocks have been read (or implied using NEWTAXA in other blocks) and a TREE command (which requires a TAXA block) has been encountered in a TREES block..
This can be caused by reading multiple files. It is possible that
each file is readable separately, but cannot be read unambiguously when read in sequence.
One way to correct this is to use the
    TITLE some-unique-name-here ;
command in the TAXA block and an accompanying
    LINK TAXA=the-unique-title-goes here;
command to specify which TAXA block is needed.
Line:   71
Column: 10

ERROR: NCL encountered a problem reading the dataset.


Linking the blocks as suggested returns this error:

Reading TAXA block...storing read block: TAXA
 successful
Reading TREES block...storing read block: TREES
 successful

ERROR: No nexus trees block or Garli block was found in file alignments_isolates_coinfection_paper/01b_isolates_for_fig_variant_positions_CF05_singlepop_constrained_non_trivial_1SNP.best.nex,
     which was specified as the source of starting model and/or tree


Removing all the underscores from the labels resolves all these errors and Garli works as expected, even if the tree was loaded and saved via dendropy. Garli cannot read its own nexus tree files if the labels contain underscores, none of the error messages gave a correct diagnosis (i.e., they were misleading), and underscores should be handled by the nexus format/schema, so it seems to me this is a bug. Probably in NCL rather than Garli?
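
The workaround described above (removing underscores from every label) can be sketched in plain Python. The function names are hypothetical, and the sketch assumes labels are the first word of each FASTA header and that the tree is a plain Newick string; the same replacement has to be applied to both the alignment and the tree so the names still match.

```python
def strip_underscores_fasta(fasta_text: str) -> str:
    """Remove underscores from FASTA header lines only."""
    out = []
    for line in fasta_text.splitlines():
        # Sequence lines are left untouched; only ">" headers carry labels.
        out.append(line.replace("_", "") if line.startswith(">") else line)
    return "\n".join(out) + "\n"


def strip_underscores_newick(newick: str) -> str:
    """Remove underscores from a Newick string.

    Branch lengths never contain "_", so this only alters labels.
    """
    return newick.replace("_", "")


def check_unique_labels(fasta_text: str) -> None:
    """Raise if two headers now share a label (e.g. A_B and AB)."""
    names = [
        line[1:].split()[0]
        for line in fasta_text.splitlines()
        if line.startswith(">") and line[1:].strip()
    ]
    dupes = sorted({n for n in names if names.count(n) > 1})
    if dupes:
        raise ValueError(f"labels collide after stripping: {dupes}")


fasta = ">CF01_i01\nACGT\n>CF01_i02\nACGA\n"
newick = "(CF01_i01:0.1,CF01_i02:0.2);"
clean_fasta = strip_underscores_fasta(fasta)
check_unique_labels(clean_fasta)
print(clean_fasta)
print(strip_underscores_newick(newick))
```

The uniqueness check is there because stripping characters can make two previously distinct labels identical.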

Derrick Zwickl

Dec 2, 2014, 11:33:19 AM
to garli...@googlegroups.com

Hello,

Thanks for all your efforts in testing out all of the possibilities in this situation, and sorry for my slow reply to this!  It is essentially an issue of how the taxon names in the Fasta alignment are interpreted and translated to the Nexus name format by NCL, and then how GARLI takes those names and tries to match them up to the taxon names in the already-Nexus tree file.  Clearly, if GARLI can't read its own output tree files in this context then something needs to be done.  I'll put it on my queue of issues to take care of.  There's no fundamental reason that the tree scoring option should require a Nexus formatted tree in the first place; that was essentially laziness on my part.  So, allowing a raw newick input tree would be one option.
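
Until that is fixed, a quick pre-flight check can show which names would fail to match. The Python sketch below is not how NCL or GARLI do the matching: it assumes, purely for illustration, that FASTA names are taken literally while unquoted tree labels are read with underscores as spaces. All function names are hypothetical, and it expects a plain Newick string with no whitespace.

```python
import re


def fasta_names(fasta_text: str) -> set:
    """Names as they literally appear in FASTA headers (first word)."""
    return {
        line[1:].split()[0]
        for line in fasta_text.splitlines()
        if line.startswith(">") and line[1:].strip()
    }


def newick_leaf_tokens(newick: str) -> set:
    """Leaf label tokens: text directly after "(" or "," up to ":", "," or ")"."""
    return set(re.findall(r"(?<=[(,])([^(),:;\s]+)", newick))


def nexus_reading(token: str) -> str:
    """Assumed NEXUS reading: quoted is literal, unquoted "_" means space."""
    if len(token) >= 2 and token[0] == token[-1] == "'":
        return token[1:-1].replace("''", "'")
    return token.replace("_", " ")


def mismatches(fasta_text: str, newick: str):
    """Return (names only in the alignment, names only in the tree)."""
    literal = fasta_names(fasta_text)
    as_nexus = {nexus_reading(t) for t in newick_leaf_tokens(newick)}
    return sorted(literal - as_nexus), sorted(as_nexus - literal)


fasta = ">CF01_i01\nACGT\n>LESB58\nACGT\n"
tree = "(CF01_i01:0.1,LESB58:0.2);"
print(mismatches(fasta, tree))
```

Under these assumptions, only the label containing an underscore is reported, which mirrors the behaviour seen in this thread.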

Best,
Derrick