Hi,
I am exploring your SymPy stats module and find it very useful. I am new to Python so I am struggling to use the stats functionality to setup my problems. I was able to do the genetics example shown below where I:
1) define two Bernoulli variables for each individual {(xi1, xi2), (xj1, xj2)},
2) compute the sum of the two variables (xi = xi1 + xi2 and xj = xj1 + xj2; this is simply Binomial with n=2), and
3) compute variance of a function of two Binomials, Var(y=f(xi, xj)).
I would like to expand the example in two ways:
1) Add correlation between xi1 & xi2 and between xj1 & xj2 and reevaluate variance of the function of two Binomials, Var(f(xi, xj)).
2) Add another correlated position, which will lead to this set of variables:
- individual i, position 1: xi11, xi21
- individual i, position 2: xi12, xi22
- individual j, position 1: xj11, xj21
- individual j, position 2: xj12, xj22
and then compute covariance of a function of two Binomials between two locations, Cov(y1=f(xi11 + xi21, xj11 + xj21), y2=f(xi12 + xi22, xj12 + xj22).
Does anyone have an idea how to introduce these correlations?
Thanks!
Gregor
from sympy import *
from sympy.stats import *
# Allele freq
p, q = symbols('p q', real=True, positive=True)
q = 1 - p
# Allele and genotypes of individual i
xi1 = Bernoulli('xi1', p)
xi2 = Bernoulli('xi2', p)
xi = xi1 + xi2
# xi = Binomial('xi', n=2, p=p)
# TODO: create distribution for xi that correlates xi1 & xi2
# Allele and genotypes of individual j
xj1 = Bernoulli('xj1', p)
xj2 = Bernoulli('xj2', p)
xj = xj1 + xj2
# xj = Binomial('xj', n=2, p=p)
# TODO: create distribution for xi that correlates xj1 & xj2
# Number of matching alleles between individuals i and j at one locus
yij = symbols('yij', integer=True)
yij = xi * xj - xi - xj + 2
# Variance of the number of matching alleles between individuals i and j
# (assume independence between xi and xj)
VarY = variance(yij)
factor(VarY) # -4*p*(p - 1)*(3*p**2 - 3*p + 1)
VarY.subs(p, 0.50) # 0.25
VarY.subs(p, 0.25) # 0.328125000000000
VarY.subs(p, 0.01) # 0.0384238800000000