Hello,
I am trying to fit exponential decay functions of r2 across physical distance of my simulated tree sequences. So far I can retrieve the M x M matrix of r2 of simulations using:
import msprime
tree_sequence=msprime.simulate(sample_size=50, Ne=600,
length=1e7, recombination_rate=1e-08,
mutation_rate=1e-8)
ld_calc = tskit.LdCalculator(tree_sequence)
A = ld_calc.r2_matrix()
However, I haven't found the best way to turn this matrix into a data structure where I can find the physical distance (in bp) between each SNP in each pairwise computation.
I can get the M SNP positions with:
variant_positions = [variant.site.position for variant in tree_sequence.variants()]
but after this I'm not sure what's next? I know I can write a VCF of my tree sequences and then compute LD, but I'm trying to keep this computation without producing many intermediate files.
Thanks,
Soul