how to use derivative() correctly?

KK Sasa

unread,

Dec 6, 2014, 8:27:55 AM12/6/14

to casadi...@googlegroups.com

Hello,

Because I am new to Python and CasADi, the following questions should be quite simple. I wrote a syntax:

t = SX.sym("t")

d = SX.sym("d")

p = SX.zeros(2,1)

p[1,0] = 1/(1+exp(-(t-d)))

p[0,0] = 1 - p[1,0]

f = SXFunction([t,d],[p])

f.init()

print f([0,0]) # yes, returns 0.5 and 0.5

But I cannot obtain the derivatives...

df = f.derivative(1,0)

df.init() # must initiate

df.setInput(0,0) # (value, index)

df.setInput(0,1)

df.evaluate()

print df.getOutput() # NOT -0.25 and 0.25

My questions are

1. How to obtain derivatives correctly?

2. What is the second arg in f.derivative(1,0)? The document reads adjoint derivative. If I want to get hessian, just change the first arg.1 to 2?

3. In next stage, I want the lengths of t and d to vary because their lengths depend on the data structure. Everytime, the user may input different size of data. Is it possible to convert the code c code (going to be used by R program), but the length still can change flexibly?

Thanks.

Joel Andersson

unread,

Dec 6, 2014, 2:42:16 PM12/6/14

to casadi...@googlegroups.com

Hi!

In general, there are two ways if generating derivative information in CasADi:

1. From CasADi expressions (for you d, t, p), you can generate expressions for the derivatives. Syntax e.g. jacobian(p, d), hessian(p[0]+p[1],t)
2. From CasADi functions (for you f), you can generate new functions. Syntax e.g. f.jacobian(), f.derivative(), f.hessian() (if f is scalar-valued).

Since CasADi calculates derivatives by source code transformation, (2) will actually do (1) internally. But for you, it's probably easier to work with (1) directly.

The function f.derivative(n,m) is probably not what you're looking for. It generates a new function for calculating n Jacobian-times-vector products (with forward mode AD) and m Jacobian-transposed-times-vector products (using reverse mode AD). In addition it calculates the undifferentiated function, which is the first output.

Hope this helps,
Joel

KK Sasa

unread,

Dec 6, 2014, 11:59:05 PM12/6/14

to casadi...@googlegroups.com

Thanks for reply, Joel.

1. I always got stuck in how to numerically evaluate. The jacobian(p, d) returns symbolic, so how to evaluate numerically? Say, when t=0 and d=0. Why p[0] plus p[1] in your example: hessian(p[0]+p[1],t)?
2. Same question for f.jacobian(), f.derivative(), f.hessian(), how to numerically evaluate?

3. Above example seems doing symbolic manipulation. Is there any way to get jacobian and hessian by forward mode AD?

Many thanks.

KK Sasa

unread,

Dec 7, 2014, 1:55:01 AM12/7/14

to casadi...@googlegroups.com

Thanks for reply, Joel.

1. jacobian(p, d) and hessian(p[0]+p[1],t) are symbolic expression, so how to numerically evaluate them?
2. What is the meaning of p[0] plus p[1] in your example: hessian(p[0]+p[1],t)?
3. Above are symbolic way to get derivatives. Is there other way to do that by means of AD?

Joel Andersson

unread,

Dec 7, 2014, 6:19:37 AM12/7/14

to casadi...@googlegroups.com

Hi!

1. You created a function out of symbolic expressions and evaluated it numerically in your code above.

f = SXFunction([input expression(s)],[output expression(s)]) 
f.init()
[numerical results] = f([numerical argument(s)]

Or equivalently using f.setInput(...), f.evaluate(), f.getOutput(...). Please make sure you read the relevant sections in the user guide.

2. You can only calculate Hessians of scalar-valued expressions (and functions), so p[0]+p[1] is just an example of a scalar-valued expression. In your example, p is a 2-by-1 matrix so p[0]+p[1] is just the sum of the entries.

3. CasADi _always_ calculates derivatives via AD. CasADi uses a graph representation that allows it to calculates AD of symbolic expressions using a similar syntax to what you are used to from symbolic tools such as Symbolic Toolbox for MATLAB or SymPy. So what you do above, is AD and _not_ the "symbolic differentiation" you are used to from e.g. Symbolic Toolbox.

Best regards,

Joel

KK Sasa

unread,

Dec 7, 2014, 9:57:18 PM12/7/14

to casadi...@googlegroups.com

Thank you, Joel. Sorry to hassle again. So the logic seems been that I have to always generate SX expression and then use SXFunction to get new SX function. Here is the above example:

t = SX.sym("t")

d = SX.sym("d")

p = SX.zeros(2,1)

p[1,0] = 1/(1+exp(-(t-d)))

p[0,0] = 1 - p[1,0]

f = SXFunction([t,d],[p])

f.init()

df = jacobian(p,t)

dp = SX.zeros(2,1)

d1 = SXFunction([df],[dp])

XFunctionInternal::XFunctionInternal: Xfunction input arguments must be purely symbolic.
Argument #0 is not symbolic

I check the type of df, it is SX expression. No idea why it said df is not symbolic? Thanks.

Joel Andersson

unread,

Dec 8, 2014, 3:57:36 AM12/8/14

to casadi...@googlegroups.com

"SXFunction([df],[dp])" makes no sense since you are trying to create a function that takes "df" as input and calculates "dp". Your inputs should be t and d. Probably you wanted to do "d1 = SXFunction([t,d],[df])".

Best regards,

Joel

KK Sasa

unread,

Dec 9, 2014, 11:03:11 AM12/9/14

to casadi...@googlegroups.com

Thank you, Joel. At first, I always felt hard to understand the examples in the document, because too few explanations inside syntax. But now, I know more and more the logic behind CasADi after communicating with you. Here, I got a question about hessian matrix. I have three parameters, and am wondering how to get the FULL hessian matrix of them? Following is the syntax:

from casadi import *

import scipy # default abbrev. is sp

from scipy import stats as st

import numpy # default abbrev. is np

point = 2; dim0 = 2; people = 1; mean0 = [0,0]; cov0 = [[1,0],[0,1]]; seed([1])

t = st.multivariate_normal.rvs(mean0,cov0,people); t = t.reshape(people,dim0)

a = np.matrix('1;2')

###### use MX

theta = MX.sym("theta",people,dim0)

alpha = MX.sym("alpha",dim0,1)

delta = MX.sym("delta")

probability = MX.zeros(people,point)

probability[:,1] = 1/(1+exp(-(mul(theta,alpha)-delta)))

probability[:,0] = 1 - probability[:,1]

probability = [probability[:,0],probability[:,1]]

p = MXFunction([alpha,delta,theta],probability) # symbol into function

p.init() # initiate function

#print p([a,0,t])

dp1 = MXFunction.hessian(p,0,0)

dp1.init() # initiate function

print dp1([a,0,t])

My question is about dp1. It returns alpha's hessian matrix. I know I can also get delta's and theta's respectively. But how to get their hessian matrix simultaneously? The full hessian matrix? Thanks.

Joel Andersson

unread,

Dec 12, 2014, 2:30:27 AM12/12/14

to casadi...@googlegroups.com

Hi!

To get the full Hessian you have to reformulate the problem so that the input is vector-valued. That is instead of a function with n scalar-valued inputs you can provide one vector-valued argument.

Anyway, for you I think it's easier (and more efficient) to work with the SX datatype. And just print the expressions to see what you have at each step. Also, I think it's easier to work with the he "hessian" function for expressions. Finally, note that [a,b,c] gives a list in Python. To get a vertical concatenation, you need to do: vertcat([a,b,c]).