CHANGES BETWEEN 5.0 beta2 and 5.0 Official Release
New Features:
- Status messages for progress when using Model Selection and Fixed States
transformation commands.
- report(diagnosis) produces columns of only relevant information.
- Improved error messages.
- Updated Documentation.
- Updated Test Suite and Tutorials.
- New option for optimization level of dynamic likelihood (see below).
Bugs Fixed:
- Script analysis would drop graphsupports command in finalized script.
- Loading trees with taxa missing from the loaded data caused a number of
issues, so we no longer allow these trees to be loaded. The application
will report the missing trees and skip loading the information. This can
be used to generate a select command to filter terminals, like
select(terminals, not files:("missing_terminals_file"))
- General NonAdditive Characters caused an error under Exhaustive DO. The
appropriate functions were not being called from previous testing.
- Logic backwards in processing the level and orientation arguments for
reading BreakInv and chromosome characters.
- Loading data with unequal number of fragments (separated by #) did not
cause an error, instead filled sections with "missing" data.
- Issues with synonym files and processing trees (Ron Clouse)
- Build under dynamic likelihood joined two nodes with different models.
- Order of nexus blocks had Assumption block first, not last.
New Commands (see manual for full explanations) :
- set( opt:exhaustive_dyn) will optimize the dynamic likelihood model
directly, instead of via an implied alignment.
CHANGES BETWEEN 5.0 beta1 and 5.0 beta2
New Features:
- report trees with branches worked for likelihood characters, but did not
report parsimony branch lengths. The new command allows printing the
branch lengths from three different methods (see below). These options are
ignored under likelihood, as we always report the parameter value that
maximizes the log-likelihood.
- report graphtrees, asciitrees, and trees with collapsible property have
been extended to use the branch lengths mentioned above (see command
description below).
- Custom Alphabet transformation to static character via static approx. This
was an issue with the characters that can be represented as prealigned and
how to properly transform between them.
- Added support for ocaml PARMAP for generating fixed-state cost matrices.
- Added command report(robinson_foulds) (see command description below)
Bugs Fixed:
- Selecting terminals caused an error in ncurses display when writing to an
incorrect window (Louise Crowley).
- Transforming to likelihood under a model selection criterion (aic, aicc,
bic) from a tcm under parsimony with a gap cost different from the
substitution cost failed because of added data that represents the cost of
a gap. These data should be filtered from the transformation, as in other
likelihood transformations on those characters.
- Custom-alphabet implied alignment produced an alignment with extra indels.
This was due to an encoding issue that became relevant when mixing custom
alphabet characters and levels.
- Issue with missing data in custom-alphabet prealigned characters resolved.
- Generation of a cost matrix with an all-elements row/col from a non-zero
diagonal cost matrix: the values were being replaced with zeros, but should
have been the minimum of the row.
- No reported error when reading in sequence data with unequal fragments.
This forced the computation to proceed as if those fragments were
missing. Now (as in POY4) we report an error to the user.
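The fragment check described above can be illustrated with a minimal Python
sketch. It is not POY's implementation: the function names are hypothetical,
and it assumes each terminal's sequence is available as a plain string whose
fragments are separated by '#'.

```python
def fragment_counts(sequences):
    """Map each terminal name to its number of '#'-separated fragments."""
    return {name: len(seq.split("#")) for name, seq in sequences.items()}


def check_equal_fragments(sequences):
    """Return the shared fragment count, or raise ValueError on a mismatch."""
    counts = fragment_counts(sequences)
    if len(set(counts.values())) > 1:
        raise ValueError(f"unequal number of fragments: {counts}")
    # An empty input has no fragments to compare.
    return next(iter(counts.values()), 0)


# Hypothetical input: taxon_b is missing its third fragment.
example = {"taxon_a": "ACGT#GGA#TT", "taxon_b": "ACG#GGA"}
try:
    check_equal_fragments(example)
except ValueError as err:
    print(err)
```

Raising an error here, rather than padding the short terminal with missing
data, mirrors the behavior change described in the entry above.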
New Commands (see manual for full explanations) :
- report command for parsimony branch lengths
report(trees:(branches:min))
- minimum number of changes is reported on branches
report(trees:(branches:max))
- maximum number of changes is reported on branches
report(trees:(branches:single))
- number of changes reported based on single assignment of dynamic
characters
- report command for tree distances using Robinson Foulds distance metric
report(robinson_foulds)
- will print matrix to terminal window
report("OUTFILE", robinson_foulds)
- will print matrix to file OUTFILE
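For readers unfamiliar with the metric, the following Python sketch shows one
common way to compute a Robinson-Foulds distance matrix. It is an
illustration only, not POY's code: trees are assumed to be nested tuples of
taxon names over the same taxon set, the function names are hypothetical, and
the distance is the plain count of bipartitions found in exactly one of the
two trees (whether POY rescales this value is not stated here).

```python
def leaves(tree):
    """Return the set of taxon names in a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def splits(tree):
    """Return the non-trivial bipartitions of a nested-tuple tree."""
    all_taxa = leaves(tree)
    anchor = min(all_taxa)  # fixed taxon used to pick one side of each split
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        clade = frozenset().union(*(visit(child) for child in node))
        # Store the side that lacks the anchor so each split has one form.
        side = all_taxa - clade if anchor in clade else clade
        if 1 < len(side) < len(all_taxa) - 1:
            found.add(side)
        return clade

    visit(tree)
    return found


def rf_distance(tree_a, tree_b):
    """Count the bipartitions present in exactly one of the two trees."""
    return len(splits(tree_a) ^ splits(tree_b))


def rf_matrix(trees):
    """Pairwise distance matrix, analogous to the printed report."""
    return [[rf_distance(a, b) for b in trees] for a in trees]


# Hypothetical five-taxon trees.
t1 = ((("A", "B"), "C"), ("D", "E"))
t2 = ((("A", "C"), "B"), ("D", "E"))
for row in rf_matrix([t1, t2]):
    print(row)
```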
Changed Commands (see manual for full explanations) :
- report command for collapsed branches has been changed. Instead of
collapse:true or collapse:false, we've extended the command like the
branch lengths above to collapse:min, collapse:max, collapse:single. The
branch is collapsed when the length as defined above is equal to 0.0.
report(trees:(collapse:single))
report("tree.pdf", graphtrees:collapse:min)
report( asciitrees:collapse:max )
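The collapse rule above (a branch is collapsed when its length under the
chosen method equals 0.0) can be sketched in Python as follows. The tree
layout, field names, and the collapse function are assumptions made for this
illustration and do not reflect POY's internal data structures.

```python
def collapse(node, mode):
    """Return a copy of node with zero-length internal branches removed.

    mode is "min", "max", or "single" and selects which length is tested.
    """
    kept = []
    for child in node.get("children", []):
        child = collapse(child, mode)
        if child["children"] and child["lengths"][mode] == 0.0:
            # Zero-length internal branch: attach its children to the parent.
            kept.extend(child["children"])
        else:
            # Terminals are always kept, even with a zero-length branch.
            kept.append(child)
    return {"name": node.get("name"), "lengths": node.get("lengths"),
            "children": kept}


def leaf(name):
    """Hypothetical terminal with a non-zero branch under every method."""
    return {"name": name,
            "lengths": {"min": 1.0, "max": 1.0, "single": 1.0},
            "children": []}


# The branch above n1 has length 0 under min but not under max or single.
tree = {
    "name": "root",
    "lengths": None,
    "children": [
        {"name": "n1",
         "lengths": {"min": 0.0, "max": 2.0, "single": 1.0},
         "children": [leaf("A"), leaf("B")]},
        leaf("C"),
    ],
}
print(len(collapse(tree, "min")["children"]))
print(len(collapse(tree, "max")["children"]))
```

In this sketch the same tree yields a polytomy under collapse:min but stays
resolved under collapse:max, which is why the choice of method matters.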
CHANGES BETWEEN 5.0 alpha3 and 5.0 beta
New Features:
- Consistency in alignment procedures across the application. The trace-back
procedures produce the same result in affine (with gap opening equal to 0)
as normal alignment procedures, as well as alignment procedures with speed
increases (newkkonen), and the space saving algorithm.
- Continuous characters are now fully supported beyond the range of 0-255,
from 0 to the maximum size of integers on the machine. Although this
change is slightly slower, characters that do fit in the 255 range are
still vectorized as previously implemented.
- build(N,random) does not do a random Wagner build, but generates and
diagnoses a random topology.
- Sankoff and Sequence characters with matrices of non-0 diagonal elements
have been implemented.
- Information theoretic model selection procedures have been designed under
static and dynamic likelihood (see new command below). This command can be
used on multiple character sets and types of data, in which case the
models selected for each data-set are combined on the final tree(s). A
tree must be in memory when the command is executed. The command analyzes
all the models possible for each tree and selects the best based on the
information criterion selected.
- Updated build/compile procedures for different environments.
- Implemented No Common Mechanism likelihood model (see new command below).
- Bootstrap Probabilities under likelihood have been implemented. We use the
same command as BP for other characters previously.
- Better memory usage when loading multiple files of the same TCM.
- Speed increases in the diagnosis of normal and affine sequence alignment.
- Changed most (all found/possible) functions to tail-recursion to avoid
stack-overflows; this should also increase speed.
Bugs Fixed:
- Character Selection procedures (through the IDENTIFIERS in the command
structure) have been verified and now implement special cases of each
other when necessary, to lower the chances of future bugs. (Thanks to
Fernando Marques).
- Bug-fix with partitioned dynamic likelihood characters of multiple models
or in combination with static characters causing optimization failures.
- Error in transform(prealigned) on static characters --command only works
on dynamic characters. These characters should have been ignored.
- Likelihood model optimization routine returning matrix of NAN when branch
lengths were sub-normal; minimum value has been used to avoid this.
- Error in report(seq_stats) when missing data is present.
- Proper usage of missing data in iterative:exact and iterative:approx.
Previously missing data could be assigned in the median nodes of
characters in certain situations, resulting in 0-cost assignments in
subtrees, as well as errors in the median assignment functions. (Thanks to
Denis Jacob Machado)
- elikelihood for static likelihood characters had a bug in counting
'uninformative' data in estimating transition probabilities.
- Single assignment functions for the newkkonen alignment procedure were not
calling the proper median function.
- Static likelihood takes '?' into account correctly under fifth state (gap
as an additional state) models. Previously it was interpreted as a gap,
now it is interpreted as missing, like gaps in four-state models.
- Static likelihood was not taking into account missing data correctly.
- Correction for tree diagnosis in non-0 diagonal tcm matrix.
- Selecting unique topologies could choose suboptimal likelihood trees if
the models/branches are different; we now select the lower of the two.
- transform(likelihood:(...)) -> transform(parsimony) resulted in an error
state of the application and incorrect costs from before the likelihood
transform command. Now the transform can be done to recover the parsimony
costs or vice versa.
- Join caused a failure in fuse, causing failure in diagnosis of tree.
New Commands (see manual for full explanations) :
- Command to set the optimization thoroughness for the likelihood procedures
has been added. The command,
set(opt:coarse)
set(opt:exhaustive)
set(opt:no_opt)
determines the number of passes for the optimization algorithm, and the
convergence factors for the numerical routines.
- Information theoretic model selection for likelihood uses the same
transform command as (e)likelihood, but replaces the model (e.g. jc69, gtr)
with an information theoretic criterion: aic, aicc, or bic. For example,
transform(likelihood:(aic,rates:gamma:(4)))
- No Common Mechanism (ncm) has been added as an additional model under
likelihood. This is for static characters only. For example,
transform(likelihood:(ncm))
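As background for the aic, aicc, and bic options, the Python sketch below
applies the standard textbook definitions of the three criteria and picks the
model with the lowest score. It is not POY's implementation. The candidate
log-likelihoods, the parameter counts k, and the sample size n are made-up
values, and how POY counts parameters or defines n is not specified here.

```python
import math


def aic(log_lik, k, n):
    """Akaike information criterion; n is unused but keeps one signature."""
    return 2 * k - 2 * log_lik


def aicc(log_lik, k, n):
    """AIC with the small-sample correction term."""
    if n - k - 1 <= 0:
        raise ValueError("AICc needs n > k + 1")
    return aic(log_lik, k, n) + (2 * k * (k + 1)) / (n - k - 1)


def bic(log_lik, k, n):
    """Bayesian information criterion."""
    return k * math.log(n) - 2 * log_lik


def select_model(candidates, criterion, n):
    """Return the candidate name with the lowest criterion value."""
    return min(candidates, key=lambda name: criterion(*candidates[name], n))


# Hypothetical candidates: name -> (log-likelihood, free parameters k).
# The k values are illustrative assumptions, not POY's parameter counts.
candidates = {"jc69": (-1050.0, 0), "gtr": (-1040.0, 8)}
n = 500  # assumed sample size

for criterion in (aic, aicc, bic):
    print(criterion.__name__, select_model(candidates, criterion, n))
```

With these made-up numbers the criteria disagree: bic penalizes the extra
parameters more heavily than aic, so the choice of criterion can change which
model is selected.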
Changed Commands (see manual for full explanations) :
- Reading prealigned characters (outside of nucleotides) has been unified
with the normal command structure. For example,
read( custom_alphabet:("DATAFILE", "MATRIXFILE"; init3D:true) )
now is,
read( prealigned:( custom_alphabet:("DATAFILE"), tcm:("MATRIXFILE")) )
The previous command structure was only briefly implemented in an alpha.
Known Issues :
- Pre-aligned affine data reports an incorrect cost. This option for
analysis has been turned off and an error is reported.
- Diagnosis on a level over 5 creates a seg-fault. This is probably a memory
issue, as the function would exceed many computers' limits.
- Newkkonen space saving (set(space_saving_alignment)) command has been
de-activated due to segfault.
- 'help' commands generated from latex docs are a mess.
CHANGES BETWEEN 5.0 alpha2 and 5.0 alpha3
Bugs Fixed:
- Continuous characters are fully supported from Hennig86/Nona files. The
previous format that POY reads is the same (integers separated by spaces,
missing data represented as '?', and ranges defined in square brackets
separated by spaces). The data limit for the continuous characters
requires a maximum range of 255. (Thanks to Edmundo Gonzalez)
- Costs displayed on trees were incorrect for partitions/sets of characters.
This is fixed to represent the overall tree cost.
CHANGES BETWEEN 5.0 alpha1 and 5.0 alpha2
New Features:
- orientation set to true by default for breakinversion data-type.
- faster alignment under low-mem settings
Bugs Fixed:
- Fixed Makefile in src directory to perform the install, and removed the
Makefile/configuration in the root directory. These files were synonyms
for the ones in the src directory and added no value to the compilation
process.
- Default for configuring with --enable-mpi is to set interface to flat.
This is a requirement that is oft forgotten and there is no reason why we
cannot facilitate that requirement. Setting interface to anything else
will over-write that choice and report a warning.
- Added error message for input sequences with different numbers of
fragments (fragments are divided by '#'). (bug report by Torsten Dikow).
- Fixed minor errors in the Makefile and Configure scripts. Also removed
Makefile and configure script from the root directory to avoid confusion
and ease maintenance. (bug report by Jan De Laet).
- report(lkmodel) (to report the likelihood model) was not working properly
in certain situations (without identifiers). (bug report by John Denton).
- Transform likelihood with multiple types (for example a combination of
static and dynamic) would fail in the transform due to alphabet size
issues. We partition the data now between static and dynamic and then
apply the transform to the characters. (bug report by Fernando Marques).
- Used non-affine alignment for affine models under parsimony; this has
been reverted correctly, and also includes affine low-mem ukkonen.
- The dynamic linking rule for compiling the supramap and cmxs libraries
was missing in our myocamlbuild file. (bug report by Travis Treseder).
- Diagnosing Static and Dynamic Likelihood mixed models caused errors due to
demarcation of sets of data. Resolved so we group by the type of model
being applied to the characters as well as pre-defined sets and data
classes.
- Diagnosis for Dynamic Likelihood characters did not work on leaf nodes.
- Backtrace works the same in POY4 and POY5 for normal alignment; the issue
is in regard to the preference in inserting indels in which sequence.
- Affine alignment bug with aligning two sequences at a point each having
gap polymorphisms. This is a very rare instance.
New Commands:
- set(space_saving_alignment)
- set(normal_alignment)
- commands turn on/off low-memory alignment procedure. Default off.
Changed Commands:
- transform(chromosome:(newkkonen,...))
- transform(genome:(newkkonen,... ))
- this option is specified in the low-memory/space-saving alignment
procedure mentioned above.
CHANGES BETWEEN 4.1.2.1 and 5.0 alpha1
New Features:
- Added likelihood criterion for diagnosing trees.
- Added methods of optimization for likelihood during build/swap/fuse
- Support for dynamic and static characters under a variety of models.
- Most Parsimonious and Maximum Average Likelihood cost models.
- Static and Dynamic character support for likelihood, including any
alphabet size (e.g. discrete morphological characters, amino acid, ...)
- Added a variety of median solvers for rearrangements.
- Support for Genome and Chromosome characters with annotator Mauve.
- New selection method for polymorphic data in fixed state characters.
- Added level support on all alphabet sizes.
- Changed default TCM to 1,1.
- Updated configure scripts for newer versions of gcc and ocaml.
- Low memory option for alignment of sequences.
- Changed command for transform for dealing with identifiers.
- Require one type of delimiters in data files.
- Choice of equally costly medians can be user specified.
- pre-aligned for custom-alphabet and amino-acid characters.
- Allow additional medians by search-based command for fixed state
characters.
- Internal assignment in diagnosis output of fixed state characters prints
the taxon name.
- Default for amino-acid to not use 3D alignment in up-pass
- Better support for manipulating and calling character sets in data
- Graphic output for mauve outlining alignment of blocks and rearrangements
Bugs Fixed:
- Memory leaks in grappa interface.
- Building random trees does not use a modified Wagner build.
- Missing data is presented as a '?' in output from implied alignments.
- Avoid rediagnosing trees before certain operations.
- Issues in reading prealigned phylip files.
- Better support for all features of NEXUS files.
- Added POY block to nexus files for our specific needs, including
likelihood, chromosome, genome, and dynamic character information.
- Detecting file types is more accurate.
- Parsing file types has better support.
- Can transform Break Inversion and Custom alphabet with cost matrix file.
- Replaced command dynamic_pam to deal with new datatypes.
- now chromosome, genome, breakinv, etc.
- Priority for backtrace in the alignments standardized between floating
point alignment (dynamic likelihood), affine, and sequence characters.
- Nexus output is fully produced when called in the report command
--includes trees, set information, data, etc.
- Support for scientific notation of floating point numbers in parsers.
- Custom Alphabet and Break Inversion data characters case sensitive.
- Fixed and improved POY help documentation
- Initial cost for downpass in fixed state characters was incorrect (up-pass
and final costs were correct).
- Status messages during branch and bound build (after every 1% complete).
- 3D option set to false is observed after re-diagnosis.
- all element code (X) in amino acid is treated as a polymorphism
- Improved max_time behavior in searching
- Fixed cost issue in rearrangement for annotated characters
Features eliminated:
- dynamic_pam command has been replaced by the commands chromosome, genome,
breakinv, and custom_alphabet.
New Commands:
- transform( likelihood:( ... ) )
- transform( genome:( ... ) )
- transform( chromosome:( ... ) )
- transform( breakinv:( ... ) )
- transform( custom_alphabet:( ... ) )
- transform( parsimony )
- transform( level:INT )
- set( partition:( ... ))
- set( codon_partition:( ... ))
- swap/fuse/build( optimize:(model:(...),branches:(...)) )
- report( trees:(branches) )
- report( lkmodel )
Changed Commands:
- transform( [IDS], (transformations,...) )
- read( custom_alphabet:([datafile],[costmatrix],[prealigned]))