Beast metadata import fails on package-supplied test tree


Michael G

May 11, 2014, 8:47:14 PM
to dendrop...@googlegroups.com
Hi all,
I understand that DendroPy leaves the parsing of the exact metadata format written by BEAST to the user. That is fine and logical, particularly since the BEAST metadata format seems to change with every other version.

My issue is with parsing BEAST metadata from the package-supplied test tree "pythonidae.beast.summary.tre". I cannot parse this file's metadata when following the tutorial "Working with Metadata Annotations".

>>> import dendropy
>>> ds = dendropy.DataSet.get_from_path("pythonidae.beast.summary.tre", "beast-summary-tree", extract_comment_metadata=True)
>>> ds.annotations
AnnotationSet([])

I traced through the code to locate the problem. The issue seems to lie in items 2, 5 and 6 below.

1. Function "AnnotatedDataObject._get_annotations" in module "base" correctly imports the metadata.
2. Function "tree_from_token_stream" in module "nexustokenizer" does NOT save the metadata.
3. Function "parse_comment_metadata" in module "nexustokenizer" correctly parses the metadata.
4. Function "NexusTokenizer.store_comment_metadata" in module "nexustokenizer" correctly stores the metadata.
5. Function "NexusTokenizer.store_comments" is NOT being called.
6. Function "NexusReader._parse_tree_statement" NO longer contains metadata.

I can provide more detailed results on where the above-mentioned functions break, if that would be helpful.

Has anyone had a similar issue?
Thank you, Michael G.

-- 
Michael Gruenstaeudl (Grünstäudl), Ph.D.

Jeet Sukumaran

May 11, 2014, 9:31:32 PM
to dendrop...@googlegroups.com
Hi Michael,

The issue here is some confusion as to how annotations associated with
particular elements in the source data get mapped to, or associated with,
elements in the data model, i.e., scoping.

Your code attempts to access the annotations of the DataSet instance.
Annotations for the DataSet as a whole would correspond to annotations
at the file-scope. The sample does not have any annotations at file scope.

Nor does it have annotations at tree block scope; if it did, these
annotations would be accessible as part of the annotations of the
corresponding TreeList (``ds.tree_lists[0].annotations`` in the code).

Nor does it have annotations at the tree scope; if it did the
annotations would be available via, e.g.
``ds.tree_lists[0][0].annotations``.

There *are*, however, annotations at *node* scope. To access these, you
would have to dereference the particular nodes.

All of this is illustrated by the following code:

```
import dendropy

ds = dendropy.DataSet.get_from_path(
    "pythonidae.beast.summary.tre",
    "beast-summary-tree",
    extract_comment_metadata=True)
print(ds.annotations)                 # file scope
for trees in ds.tree_lists:
    print(trees.annotations)          # tree block scope
    for tree in trees:
        print(tree.annotations)       # tree scope
        for nd in tree:
            print(nd.annotations)     # node scope
```

In the above code, all the annotations are empty till we hit the
innermost loop.
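Why only the nodes carry annotations can be pictured from the file itself: in a BEAST summary tree the `[&...]` comments sit next to individual nodes inside the tree statement. The following is a minimal pure-Python sketch of that layout, not DendroPy's parser. The tree string, the helper name `parse_beast_comment`, and the assumption that every value is numeric are all made up for illustration.

```python
import re

def parse_beast_comment(comment):
    """Parse one '[&key=value,...]' comment into a dict (illustrative only)."""
    body = comment.strip()[2:-1]          # drop the leading '[&' and the trailing ']'
    items, depth, current = [], 0, ""
    for ch in body:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
        if ch == "," and depth == 0:      # split only on commas outside '{...}'
            items.append(current)
            current = ""
        else:
            current += ch
    if current:
        items.append(current)
    result = {}
    for item in items:
        key, _, value = item.partition("=")
        if value.startswith("{") and value.endswith("}"):
            # '{a,b}' is treated as a list of floats (assumption: numeric values)
            result[key] = [float(v) for v in value[1:-1].split(",")]
        else:
            result[key] = float(value)
    return result

# Hypothetical tree statement: every comment is attached to a node,
# none to the file, the tree block, or the tree as a whole.
newick = ("(A[&height=0.0]:1.0,B[&height=0.0]:1.0)"
          "[&height=1.0,height_95%_HPD={0.8,1.2},posterior=1.0];")
comments = re.findall(r"\[&[^\]]*\]", newick)
node_metadata = [parse_beast_comment(c) for c in comments]
print(node_metadata)
```

Each dictionary here belongs to one node, which mirrors why the outer loops above print empty annotation sets.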

Note that the schema could equally be given as "nexus" rather than
"beast-summary-tree". Both work equally well for populating the
annotation set in this case. Specifying "beast-summary-tree" additionally
creates full attributes on the nodes from the annotations. So, for
example, the annotation 'length_median' becomes an attribute of the node
(``nd.length_median``) holding the annotation's value.

-- jeet

Michael G

May 12, 2014, 6:50:13 PM
to dendrop...@googlegroups.com
Hi Jeet,
thank you for your informative response. I have a follow-up question, which may be of interest to several other DendroPy users.

My aim is to import a beast summary tree including the associated metadata, modify some of the metadata attributes (e.g. height_95%_HPD, length_range, ...) and write the tree to file again with the updated metadata values. 

I know how to locate the metadata:

# Importing DendroPy
>>> import dendropy
# Loading the tree with metadata
>>> ds = dendropy.DataSet.get_from_path("pythonidae.beast.summary.tre", "beast-summary-tree", extract_comment_metadata=True)
# Getting to the node level
>>> nodes = ds.tree_lists[0][0].nodes()
# Accessing all metadata info per node
>>> nodes[0].beast_info
{'height_median': 138.84432406415, 'height_range': [74.3872519715, 239.3934470149], 'length_95hpd': None, 'posterior': 1.0, 'height': 140.03444901095395, 'length': 0.0, 'length_median': None, 'length_range': None, 'height_95hpd': [96.08130962980002, 188.13285870459998]}
# Accessing particular metadata info
>>> nodes[0].height
140.03444901095395


And I know how to modify particular attribute values:

# Displaying height_95hpd of the last node (i.e. node 64)
>>> nodes[64].height_95hpd
[0.0, 2.113599975928082e-06]
# Assigning a new value
>>> nodes[64].height_95hpd = [0.0, 1.0]
# Confirming the new value
>>> nodes[64].height_95hpd
[0.0, 1.0]


However, I don't know how to save the new values to a tree file, because the default way only saves the tree with the original metadata values:

# Saving the tree to file
>>> ds.write_to_path('outfile.nex', 'nexus')

I suspect that I need to dynamically update the metadata values associated with the tree_list, but I don't know how. Could you give some assistance?

Thank you, 
Michael

-- 
Michael Gruenstaeudl (Grünstäudl), Ph.D.

Jeet Sukumaran

May 13, 2014, 9:24:42 AM
to dendrop...@googlegroups.com
You will need to explicitly (re-)add the attribute to the annotation set
of the object as a dynamic (bound) annotation, using the
`<object>.annotations.add_bound_attribute()` method.

For example:

```
import dendropy

ds = dendropy.DataSet.get_from_path(
    "pythonidae.beast.summary.tre",
    "beast-summary-tree",
    extract_comment_metadata=True)
print(ds.annotations)
tree = ds.tree_lists[0][0]
for nd in tree:
    # change the attribute, then bind an annotation to it
    nd.height_range = [0, 1]
    nd.annotations.add_bound_attribute("height_range")
print(tree.as_string("nexus"))
```
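The difference between a parsed annotation and a bound one can be pictured with a toy model. This is a sketch under stated assumptions, not DendroPy's implementation: the classes `StaticAnnotation`, `BoundAnnotation` and `ToyNode` and the function `render` are hypothetical and exist only to illustrate the idea that a static annotation keeps the value it was created with, while a bound annotation reads the attribute each time it is written out.

```python
class StaticAnnotation:
    """Toy annotation holding a value copied once, e.g. at parse time."""
    def __init__(self, name, value):
        self.name = name
        self.value = value

class BoundAnnotation:
    """Toy annotation that looks the value up on its owner when read."""
    def __init__(self, owner, attr_name):
        self.name = attr_name
        self._owner = owner
        self._attr_name = attr_name

    @property
    def value(self):
        return getattr(self._owner, self._attr_name)

class ToyNode:
    """Hypothetical node: one attribute plus a snapshot annotation of it."""
    def __init__(self, height_range):
        self.height_range = height_range
        self.annotations = [StaticAnnotation("height_range", list(height_range))]

def render(node):
    """Serialize the annotations roughly the way a writer would."""
    return ",".join("{}={}".format(a.name, a.value) for a in node.annotations)

node = ToyNode([74.4, 239.4])
node.height_range = [0, 1]
before = render(node)    # the static annotation still holds the parsed values
node.annotations = [BoundAnnotation(node, "height_range")]
after = render(node)     # the bound annotation follows the attribute
print(before)
print(after)
```

In this toy, changing the attribute alone leaves the written output untouched, which matches the behaviour described in the question; only after the annotation is bound to the attribute does the new value appear.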

-- jeet

Jeet Sukumaran

May 13, 2014, 10:00:03 AM
to dendrop...@googlegroups.com
One note, you probably want to clear the existing annotation associated
with the attribute name before adding it, to avoid duplication:

```
import dendropy

ds = dendropy.DataSet.get_from_path(
    "pythonidae.beast.summary.tre",
    "beast-summary-tree",
    extract_comment_metadata=True)
tree = ds.tree_lists[0][0]
for nd in tree:
    nd.height_range = [0, 1]
    # clear existing annotation 'height_range'
    # (or 'nd.annotations.drop()' to clear all existing);
    # drop() matches on keyword criteria, hence 'name='
    nd.annotations.drop(name="height_range")
    nd.annotations.add_bound_attribute("height_range")
```
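If several attributes need the same treatment, the set/drop/re-bind steps could be wrapped in a small helper. This is only a sketch: `set_and_rebind` is a hypothetical name, and it assumes the object exposes `annotations.drop()` accepting a `name=` keyword and `annotations.add_bound_attribute()` as used in this thread.

```python
def set_and_rebind(obj, attr_name, new_value):
    """Set obj.<attr_name>, then make its annotation track the attribute.

    Assumes obj.annotations provides drop(name=...) and
    add_bound_attribute(attr_name), as discussed above.
    """
    setattr(obj, attr_name, new_value)
    # remove the stale annotation first to avoid a duplicate entry
    obj.annotations.drop(name=attr_name)
    # re-add it as a bound annotation so output reflects the new value
    obj.annotations.add_bound_attribute(attr_name)
```

Inside the loop over nodes, a call such as `set_and_rebind(nd, "height_range", [0, 1])` would then replace the three lines in the loop body.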
