10 views

Skip to first unread message

Jul 8, 2021, 3:22:40 AM7/8/21

to slim-discuss

Hi list,

I was wondering if it was possible to calculate certain tree stats on just a subset of individuals. Specifically, I am trying to calculate the mean tree height for individuals in the last generation with a certain phenotype. Phenotype info is stored in a separate list, not the .trees file, with pedigree id serving as key. So basically I am hoping to create a new .trees file by filtering the original one down to the individuals that appear in the reference list.

Code to calculate mean tree heights from entire tree file:

ts = pyslim.load(file).simplify()

hfp = []

for tree in ts.trees():

mean_height = np.mean([tree.time(root) for root in tree.roots])

left, right = map(int, tree.interval)

height_for_pos[left: right] = mean_height

hfp.append(mean_height)

mh = np.mean(height_for_pos)

hfp = []

for tree in ts.trees():

mean_height = np.mean([tree.time(root) for root in tree.roots])

left, right = map(int, tree.interval)

height_for_pos[left: right] = mean_height

hfp.append(mean_height)

mh = np.mean(height_for_pos)

Many thanks in advance for any pointers. (Also feel free to let me know if the above code is totally wrong :) Python is not my first language...)

Cheers,

Greg

Jul 8, 2021, 5:41:08 PM7/8/21

to Greg Torda, slim-discuss

Hi, Greg! This is a job for the "samples" argument to simplify: see

the example here:

https://pyslim.readthedocs.io/en/latest/tutorial.html#simplification

Also see the next example, that discusses extracting individuals.

Your python code looks fine, but note that it's taking the mean height

of trees *not* weighted by how far along the genome they extend for.

-- peter

> --

> SLiM forward genetic simulation: http://messerlab.org/slim/

> ---

> You received this message because you are subscribed to the Google Groups "slim-discuss" group.

> To unsubscribe from this group and stop receiving emails from it, send an email to slim-discuss...@googlegroups.com.

> To view this discussion on the web visit https://groups.google.com/d/msgid/slim-discuss/8a929772-695e-4ba7-9f1d-3b0ed7d3e69cn%40googlegroups.com.

the example here:

https://pyslim.readthedocs.io/en/latest/tutorial.html#simplification

Also see the next example, that discusses extracting individuals.

Your python code looks fine, but note that it's taking the mean height

of trees *not* weighted by how far along the genome they extend for.

-- peter

> SLiM forward genetic simulation: http://messerlab.org/slim/

> ---

> You received this message because you are subscribed to the Google Groups "slim-discuss" group.

> To unsubscribe from this group and stop receiving emails from it, send an email to slim-discuss...@googlegroups.com.

> To view this discussion on the web visit https://groups.google.com/d/msgid/slim-discuss/8a929772-695e-4ba7-9f1d-3b0ed7d3e69cn%40googlegroups.com.

Jul 10, 2021, 1:21:39 AM7/10/21

to Peter Ralph, slim-discuss

This is it, thanks heaps for the prompt reply, Peter!

Here is what I did (in case anyone was interested):

import numpy as np

import msprime, pyslim

ts = pyslim.load(file).simplify()

# extra simplification step based on a list of individuals

keeper = np.genfromtxt(csvfile, delimiter=',')

# these are pedigree_ids which I need to convert to ids

keeper = keeper.astype(int).tolist()

keep_indivs = []

for i in ts.individuals():

pid = i.metadata["pedigree_id"]

if pid in keeper:

keep_indivs.append(i.id)

keep_nodes = []

for i in keep_indivs:

keep_nodes.extend(ts.individual(i).nodes)

sts = ts.simplify(keep_nodes)

height_for_pos = np.zeros(int(sts.sequence_length))

hfp = []

for tree in sts.trees():

Many thanks again, Peter!

Cheers,

Greg

Here is what I did (in case anyone was interested):

import numpy as np

import msprime, pyslim

ts = pyslim.load(file).simplify()

# extra simplification step based on a list of individuals

keeper = np.genfromtxt(csvfile, delimiter=',')

# these are pedigree_ids which I need to convert to ids

keeper = keeper.astype(int).tolist()

keep_indivs = []

for i in ts.individuals():

pid = i.metadata["pedigree_id"]

if pid in keeper:

keep_indivs.append(i.id)

keep_nodes = []

for i in keep_indivs:

keep_nodes.extend(ts.individual(i).nodes)

sts = ts.simplify(keep_nodes)

height_for_pos = np.zeros(int(sts.sequence_length))

hfp = []

for tree in sts.trees():

mean_height = np.mean([tree.time(root) for root in tree.roots])

left, right = map(int, tree.interval)

height_for_pos[left: right] = mean_height

hfp.append(mean_height)

mh = np.mean(height_for_pos)

In my slim simulation all basepairs are unlinked and so I calculate a tree for every single one and don't need to weight by span on genome (if I understood your comment right, Peter).
left, right = map(int, tree.interval)

height_for_pos[left: right] = mean_height

hfp.append(mean_height)

mh = np.mean(height_for_pos)

Many thanks again, Peter!

Cheers,

Greg

Reply all

Reply to author

Forward

0 new messages

Search

Clear search

Close search

Google apps

Main menu