"gradients" produces TensorFlow graphs, any part of that graph can be further differentiated.
Some gotchas:
- "gradients" takes derivatives with respect to single y, so need to call it several times for multi-dimensional y.
(technically gradients function can take a list of ys, but it sums them up before differentiating)
- "gradients" produces Python list instead of Tensor, so need to use "tf.pack" to convert to Tensor
- "gradients" can produce None in cases when gradient is 0, but that's an illegal input to "gradients" so you need to replace None's with 0's
Here's an example of getting the Hessian matrix of the loss by calling "gradients" twice:
import tensorflow as tf

def replace_none_with_zero(l):
    # "gradients" returns None for gradients that are identically zero;
    # substitute 0. so the result can be packed into a Tensor
    return [0. if i is None else i for i in l]

tf.reset_default_graph()
x = tf.Variable(1.)
y = tf.Variable(1.)
loss = tf.square(x) + tf.square(y)

grads = tf.gradients([loss], [x, y])   # [2x, 2y]
hess0 = replace_none_with_zero(tf.gradients([grads[0]], [x, y]))   # d(2x)/d[x,y]
hess1 = replace_none_with_zero(tf.gradients([grads[1]], [x, y]))   # d(2y)/d[x,y]
hessian = tf.pack([tf.pack(hess0), tf.pack(hess1)])

sess = tf.InteractiveSession()             # makes .eval() below use this session
sess.run(tf.initialize_all_variables())    # x and y must be initialized before eval
print(hessian.eval())
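Since the loss is x^2 + y^2, the off-diagonal second derivatives come back as None (replaced with 0s above) and the diagonal ones are the constant 2, so this should print something like

[[ 2.  0.]
 [ 0.  2.]]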