Slicing MX variables


jose santiago rodriguez

unread,
Mar 25, 2014, 3:42:57 PM3/25/14
to casadi...@googlegroups.com
Hi,

I am currently using CasADi to generate a C file for a fully transcribed optimal control problem. The goal is to keep the C files small regardless of the number of discretization points. So far I have gotten a significant reduction in file size by using a "checkpoint scheme" with two levels of functions. To get there I also avoided, as much as possible, unnecessary operations involving concatenation and slicing, and tried to use splitting operations instead.

It is clear to me why the unnecessary concatenations make the C file bigger, but I am not sure why slicing causes problems too. When exactly does slicing become a problem? Would something like the following be problematic?

x = msym("x", 10)
y = x[1:5]
z = x[[5, 6, 7]]  # is this slicing too?

If I generate code for a function using the variable "y", the slicing will show up in the generated code, right? But what about a function using "z"?

Thanks,
Santiago

Joel Andersson

unread,
Mar 26, 2014, 4:00:39 PM3/26/14
to casadi...@googlegroups.com
Hello!

Both of the examples you give above (y = x[1:5] and z = x[[5, 6, 7]]) essentially call the same function, so there is no real difference between them. As you have noticed, this can become a bottleneck in code generation.

While there is a good chance that this will be improved in the future, the best option for now is to avoid the slicing altogether and use the "vertsplit" operation instead.

This works if you want to divide an expression (for example, the one corresponding to the free variables of your NLP) into a number of blocks and later use all of the blocks. The operation is in some sense the "inverse" of a vertcat operation and works like this:
x_block1, x_block2, x_block3 = vertsplit(x, [0, n1, n1+n2, n1+n2+n3])

where n1, n2 and n3 are the number of rows in blocks 1, 2 and 3, respectively. I hope the idea is clear. The "vertsplit" operation has efficient derivative calculation rules and compact code generation.
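To make the offset semantics concrete, here is a plain-Python sketch (not the CasADi API itself; the helper name is illustrative): vertsplit takes a cumulative offset vector, starting at 0 and ending at the total number of rows, and returns the consecutive row blocks between each pair of offsets.

```python
def vertsplit_offsets(v, offsets):
    """Split the list v into consecutive blocks given cumulative
    offsets, mimicking the semantics of CasADi's vertsplit."""
    return [v[offsets[i]:offsets[i + 1]] for i in range(len(offsets) - 1)]

x = list(range(10))
n1, n2, n3 = 4, 3, 3
b1, b2, b3 = vertsplit_offsets(x, [0, n1, n1 + n2, n1 + n2 + n3])
print(b1, b2, b3)  # [0, 1, 2, 3] [4, 5, 6] [7, 8, 9]
```

Note that all blocks come out of a single operation, which is what keeps both the expression graph and the generated code compact compared to one slicing operation per block.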

Note that it may be hard to avoid "slicing operations" (a.k.a. getNonzeros, setNonzeros) completely, since they also arise when assembling complete (sparse) Jacobians and Hessians. The code needed for this is usually moderate in size.

I hope this helps you further!
Joel


jose santiago rodriguez

unread,
Mar 27, 2014, 4:57:01 AM3/27/14
to casadi...@googlegroups.com
Thanks Joel,

Yeah, as I mentioned, I improved the code generation for the collocation problem a lot. The file size still depends linearly on the number of elements used, but the slope of the line is small, so even with many elements the file remains compilable with gcc. Right now I have used only two levels of "packing", and multiple levels (packing two elements, four elements, and so on) would probably give the logarithmic dependency you mentioned. However, I am wondering how this will affect execution time. After some profiling I saw that memory usage and the time to evaluate the Jacobian of the constraints decreased significantly. Should I expect the same improvement in execution time (building the NLP and then solving it)? If so, would it be a good idea to use more levels of packing?

I avoided slicing in my code, but now that I am trying to merge it with the LocalDAECollocator I am facing some situations where I might have to use it; that was the reason for my previous post. Thanks a lot for your help!
Santiago

Joel Andersson

unread,
Mar 27, 2014, 8:11:08 AM3/27/14
to casadi...@googlegroups.com
Hello again Santiago!

Checkpointing, which is what you are doing when you create a hierarchy of functions, does indeed affect the execution time. In general, every "level" of checkpointing costs you an extra forward simulation. I think it is seldom necessary to have more than 2 levels, unless you have very long time horizons, in which case maybe 3 levels.
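To illustrate why the hierarchy keeps the generated code small (a schematic in plain Python, not the CasADi API; all names here are made up for the illustration): the inner "block" function covering k steps is generated once and reused n/k times by the outer level, so code size scales with k rather than with the full horizon n.

```python
def step(x):
    # stands in for the generated code of a single discretization element
    return x + 1

def make_block(k):
    # inner checkpoint level: k elements fused into one reusable function,
    # whose code is emitted only once regardless of the horizon length
    def block(x):
        for _ in range(k):
            x = step(x)
        return x
    return block

def simulate(x0, n, k):
    # outer level: call the single generated block n // k times
    block = make_block(k)
    for _ in range(n // k):
        x0 = block(x0)
    return x0

print(simulate(0, 100, 10))  # 100
```

The trade-off is exactly the extra forward pass per level mentioned above: the outer loop re-evaluates the inner block instead of reading stored intermediate results.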

Anyway, for direct collocation you should in general be able to get the execution time down far enough that all function and derivative calculations (including the exact Hessian) are small compared to the linear solver calls in the NLP solver.

If you need to compress the size of the generated code further, you need to look into the generated code and find out what exactly is taking up space.

Also note that there are other, much faster compilers than gcc, e.g. clang.

Greetings!
Joel

