Help with chain rule


Holi Rapanoel

Feb 14, 2018, 11:26:03 PM
to sympy
Dear All,

I would like to take the symbolic derivative of a function, but it doesn't seem to work. W and S are matrices, Z is a function of W and S, and the loglikelihood is a function of eta and Z. I would like the partial derivatives of the loglikelihood with respect to eta and W_ij. It works for eta but not for W_ij. Here is the code:

n = 3
T = 5
eta = sympy.Symbol('eta')
W = sympy.MatrixSymbol('W', n, n)
S = sympy.MatrixSymbol('S',n,T)
def detdyn(s,w):
    '''Z is a function of S and W'''
    f = w*s
    f_average = sympy.HadamardProduct(f, s)
    f_average = sympy.ones(1,f_average.shape[0]) * f_average #row sum
    #compute coefficients of Z
    c=f.as_mutable()
    for r in range(T):
        c.col_op(r, lambda i,j: i / f_average[0, r])
    z = sympy.HadamardProduct(c,s)
    return z
Tk = S.shape[1] #length of the data
Zk = detdyn(S,W)[:,:-1] #compute the deterministic dynamics
Sk = S[:,1:] #drop first observation
Sl = sympy.IndexedBase("Sk")
Zl = sympy.IndexedBase("Zk")
i, t = sympy.symbols('i t', cls=sympy.Idx)
loglikelihood=(Tk-1)*sympy.loggamma(eta)+sympy.Sum(sympy.Sum(-sympy.loggamma(eta*Zl[i,t])+(eta*Zl[i,t]-1)*sympy.ln(Sl[i,t]),(i, 0, 2)),(t, 0, Tk-1))

For example
sympy.diff(loglikelihood,W[0,0])
gives
Sum(0, (i, 0, 2), (t, 0, 4))
which is not correct.

What am I doing wrong?

Thank you,
Holi

Leonid Kovalev

Feb 15, 2018, 12:30:50 AM
to sympy
loglikelihood does not involve W anywhere, so the derivative is zero. The reason is that the objects Zl and Sl that you introduced have nothing to do with the Zk and Sk that were computed earlier: they share the names, but they are IndexedBase objects, a different class. There are other issues as well.
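A minimal sketch of the naming point, using toy stand-ins rather than the code from the question (w and Zk = w**2 are assumptions made up for illustration). An IndexedBase created with the string "Zk" is a fresh symbolic object and has no link to the Python variable Zk:

```python
import sympy

w = sympy.Symbol('w')

# The Python variable Zk holds an expression that really depends on w.
Zk = w**2

# IndexedBase("Zk") only reuses the *string* "Zk"; it knows nothing about
# the Python variable defined above.
Zl = sympy.IndexedBase("Zk")

d_indexed = sympy.diff(Zl[0], w)  # the indexed object does not depend on w
d_actual = sympy.diff(Zk, w)      # the actual expression does

print(d_indexed)  # 0
print(d_actual)   # 2*w
```

So a sum built from Zl[i, t] differentiates to zero with respect to anything in W, no matter how Zk was computed.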

Since you are using the passive (unevaluated) forms Sum and HadamardProduct (instead of summation or hadamard_product), you'll need loglikelihood.doit() to carry out the computation before taking the derivative.
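A small sketch of the difference between the passive and active forms, on a toy sum that is not from the question (k and x are illustrative symbols):

```python
import sympy

k, x = sympy.symbols('k x')

# Passive form: stays an unevaluated Sum object until .doit() is called.
passive = sympy.Sum(k * x, (k, 1, 3))

# Active form: evaluated immediately.
active = sympy.summation(k * x, (k, 1, 3))

evaluated = passive.doit()      # x + 2*x + 3*x
derivative = evaluated.diff(x)  # differentiate only after evaluating

print(evaluated)   # 6*x
print(derivative)  # 6
```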

The summation (t, 0, Tk-1) is out of bounds. Sum ranges include the end value of the index, unlike Python ranges. So you are summing over Tk values of the index, but the matrix has only Tk-1 columns because one column was dropped earlier.
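To make the off-by-one concrete, here is a quick count of terms with T = 5 (the variable names are illustrative):

```python
import sympy

T = 5
t = sympy.Symbol('t', integer=True)

# Sum limits are inclusive: t = 0, 1, ..., T-1 gives T terms.
n_sum_terms = sympy.Sum(1, (t, 0, T - 1)).doit()

# Python's range excludes the end value: t = 0, 1, ..., T-2 gives T-1 terms.
n_range_terms = len(range(0, T - 1))

# A matrix with T-1 columns therefore needs the upper limit T-2.
n_fixed_terms = sympy.Sum(1, (t, 0, T - 2)).doit()

print(n_sum_terms, n_range_terms, n_fixed_terms)  # 5 4 4
```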

Calculus with indexed symbols is still rough around the edges in SymPy. If the above issues are sorted, you'll hit another one, issue #14216: differentiating loggamma calls unpolarify, which doesn't understand indexed matrix elements. A workaround is to fill the matrix with non-indexed symbols, for example generating them with symarray. This is what I do below; the code is modified according to the above remarks.

import sympy

n = 3
T = 5
eta = sympy.Symbol('eta')

# Explicit matrices filled with plain (non-indexed) symbols W_0_0, W_0_1, ...
W = sympy.Matrix(sympy.symarray('W', (n, n)))
S = sympy.Matrix(sympy.symarray('S', (n, T)))

def detdyn(s, w):
    '''Z is a function of S and W'''
    f = w*s
    f_average = sympy.HadamardProduct(f, s)
    f_average = sympy.ones(1, f_average.shape[0]) * f_average #row sum
    #compute coefficients of Z
    c = f.as_mutable()
    for r in range(T):
        c.col_op(r, lambda i, j: i / f_average[0, r])
    z = sympy.HadamardProduct(c, s)
    return z

Tk = S.shape[1] #length of the data
Zk = detdyn(S, W)[:,:-1] #compute the deterministic dynamics
Sk = S[:,1:] #drop first observation

i, t = sympy.symbols('i t', cls=sympy.Idx)
# Use Zk and Sk directly; the t range stops at Tk-2 because Sum limits are inclusive
loglikelihood = (Tk-1)*sympy.loggamma(eta) + sympy.Sum(sympy.Sum(-sympy.loggamma(eta*Zk[i,t]) + (eta*Zk[i,t]-1)*sympy.ln(Sk[i,t]), (i, 0, 2)), (t, 0, Tk-2))
print(loglikelihood.doit().diff(W[0, 0]))
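As a smaller check of the non-indexed-symbols idea, here is a sketch reduced to a single loggamma term. Note that symarray returns a NumPy object array, so it needs NumPy installed; the lambda below is my own stand-in (not from the code above) that builds symbols with the same W_i_j naming pattern without NumPy:

```python
import sympy

n = 2
eta = sympy.Symbol('eta')

# Plain Symbol entries named W_0_0, W_0_1, ... (same pattern as symarray).
# The lambda-based construction is an illustrative assumption.
W = sympy.Matrix(n, n, lambda i, j: sympy.Symbol(f'W_{i}_{j}'))

# One term of the kind that appears in the loglikelihood.
expr = sympy.loggamma(eta * W[0, 0])

# Entries are ordinary Symbols, so the chain rule goes through and the
# result is eta times a polygamma term.
d = sympy.diff(expr, W[0, 0])
print(d)
```

With entries that are ordinary Symbol objects, diff has no indexed matrix elements to trip over.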


 

Holi Rapanoel

Feb 15, 2018, 11:39:13 AM
to sympy
Thanks very much for your help. There is still a lot I need to learn about SymPy.