load_tables function has been replaced by TableCollection.tree_sequence.


Richard Kerr

Sep 2, 2020, 6:37:58 AM
to msprime-users
Hello,

I am working hard to learn msprime (which isn't easy, as I have to learn Python at the same time).

I want to keep only non-rare variants using the function below, and found someone else's code that uses the load_tables function. I have since learnt that this function has been deprecated, and the changelog says it has been replaced by TableCollection.tree_sequence.

Can anyone help me understand how this new function works?


import msprime

def remove_rare_variants(ts, N, threshold=0.01):
    # ts: the tree sequence to filter
    # N: total number of samples (denominator of the frequency)
    num_common = 0  # counts the sites that pass the threshold and are kept
    sites = msprime.SiteTable()
    mutations = msprime.MutationTable()
    for tree in ts.trees():
        for site in tree.sites():
            # Only the first mutation at each site is considered.
            mut = site.mutations[0]
            freq = tree.num_samples(mut.node) / N
            if freq > threshold:
                num_common += 1
                site_id = sites.add_row(
                        position=site.position,
                        ancestral_state=site.ancestral_state)
                mutations.add_row(
                        site=site_id, node=mut.node, derived_state=mut.derived_state)
    tables = ts.dump_tables()
    # load_tables is the deprecated call I am asking about.
    new_ts = msprime.load_tables(
        nodes=tables.nodes, edges=tables.edges, sites=sites, mutations=mutations)
    return new_ts

Jerome Kelleher

Sep 2, 2020, 7:10:43 AM
to msprim...@googlegroups.com
Hi Richard,

Welcome to the msprime/tskit community!

It looks like the documentation that you're working from is a little out
of date, unfortunately; apologies for that. The example in the tskit
tutorial might be a helpful place to start:

https://tskit.readthedocs.io/en/latest/tutorial.html#editing-tree-sequences

What your code is doing is iterating over the sites in the tree
sequence one by one, and using the trees to figure out what the
frequency of each variant is. Basically, all you need to do is work with
the TableCollection object directly rather than with separate Site and
Mutation tables, and then call tables.tree_sequence() at the end to get
your filtered-down tree sequence.

Hope this helps!

Cheers,
Jerome