Function.map() and parallel evaluation of gradient and Hessian


Mikhail Katliar

Jul 2, 2016, 1:44:53 PM
to CasADi
Dear All,

Let's say I have an NLP that looks like the following:

x = MX.sym('x')
y = MX.sym('y')
f = Function('f', [y, x], [some_function(y, x)])

N = data.shape[1]
f_map = f.map('f_map', N, 'openmp', [1], [])

nlp = {'x' : x, 'f' : sum2(f_map(data, x))}
solver = nlpsol('NLPSolver', 'ipopt', nlp)
solution = solver()

The question is: will I have parallel evaluation of the gradient and Hessian inside the solver?

Best regards,
Mikhail

Joris Gillis

Jul 2, 2016, 1:51:52 PM
to CasADi
Hi Mikhail,

In principle, yes.
However, reduced outputs are not yet implemented for the openmp map, so it falls back to a serial map whenever reduced outputs are needed. And the reverse mode of a map with reduced inputs has reduced outputs...

The workaround for now is to not use reduced inputs; just repmat the input instead.

Best,
  Joris
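In case the terminology is unclear, here is my paraphrase of the semantics (a plain-Python sketch, not CasADi internals; `f` is a toy stand-in for `some_function`): a "reduced" input is shared by all N mapped instances, and a "reduced" output is summed over the instances. That sum is exactly what the reverse mode must produce for the gradient with respect to a shared input, which is why a map with reduced inputs yields reduced outputs in reverse mode:

```python
def f(y, x):
    # Toy scalar function standing in for some_function (assumption).
    return y * x

def map_reduced(f, ys, x):
    # x is a "reduced" (shared) input: every instance sees the same x.
    outputs = [f(y, x) for y in ys]
    # A "reduced" output is the sum over instances -- the same reduction
    # the reverse mode needs to accumulate gradient contributions w.r.t. x.
    return sum(outputs)

print(map_reduced(f, [1.0, 2.0, 3.0], 2.0))  # 1*2 + 2*2 + 3*2 = 12.0
```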

Mikhail Katliar

Jul 2, 2016, 2:13:18 PM
to CasADi
I actually get this message during the solver construction:

CasADi warning: "OpenMP not yet supported for reduced outputs. Falling back to serial mode." issued  on line 99 of file "/home/kotlyar/software/casadi/casadi/core/function/map.cpp". 

How can I benefit from parallelization for objective functions like this?

On Saturday, July 2, 2016 at 7:44:53 PM UTC+2, Mikhail Katliar wrote:

Joris Gillis

Jul 2, 2016, 2:16:19 PM
to CasADi
Right, that's the warning for the problem I mentioned.

Try something like (untested):

f_map = f.map('f_map', N, 'openmp')
nlp = {'x' : x, 'f' : sum2(f_map(data, repmat(x,1,N) ))}


Best,
  Joris
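If repmat is unfamiliar: it tiles the column x horizontally so that each of the N mapped instances receives its own (identical) copy of the decision variables. A numpy analogue for illustration only (sizes are made up; CasADi's repmat does the same on MX matrices, with np.tile playing its role):

```python
import numpy as np

# Toy stand-in: x is one column of decision variables (sizes are an
# assumption for illustration, not taken from the thread).
x = np.array([[1.0], [2.0], [3.0]])  # shape (3, 1)
N = 4

# np.tile(x, (1, N)) mimics CasADi's repmat(x, 1, N): the column is
# repeated N times horizontally, one copy per mapped instance.
x_rep = np.tile(x, (1, N))           # shape (3, 4)
print(x_rep.shape)
```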

Mikhail Katliar

Jul 2, 2016, 4:34:30 PM
to CasADi
Hi Joris,

Your solution seems to parallelize the execution (tested on a small problem), but eats up all the memory (on the real-size problem). In my case, data has size 16384x636 and x is 30x4. When the solver starts, memory usage goes beyond 8 GB and the system runs out of memory. If I reformulate the problem without map(), memory usage is negligible, but there is no parallelization.

How can I make CasADi calculate my objective and derivatives in parallel without running out of memory in this case?

On Saturday, July 2, 2016 at 8:16:19 PM UTC+2, Joris Gillis wrote:

Joris Gillis

Jul 2, 2016, 6:10:13 PM
to CasADi
Hi Mikhail,

This is going to sound like déjà vu :-)
This is another known and open issue. Joel did some heavy lifting on new features for thread-safe memory management, but this hasn't been applied to the openmp map yet.
For now, the openmp map implementation allocates N times the memory of a single function evaluation.

You may try a simple workaround: split N into a large factor and a small factor, and stack a serial map and an openmp map together:

f_map = f.map('f_map_inner', 2048, 'serial').map('f_map_outer', 8, 'openmp')



Best,
  Joris
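To see why the nesting helps (my arithmetic, not a measurement from the thread): the openmp map keeps one working copy per parallel instance, so factoring N = 16384 into an inner serial map of 2048 and an outer openmp map of 8 leaves only 8 copies live at once instead of 16384. With a hypothetical per-evaluation footprint:

```python
# Illustrative arithmetic only; the per-evaluation memory figure is a
# made-up placeholder, not from the thread.
N = 16384
inner, outer = 2048, 8            # serial inner map, openmp outer map
assert inner * outer == N

mem_per_eval_mb = 0.5             # hypothetical footprint of one f evaluation

flat_openmp_mb = N * mem_per_eval_mb   # one copy per instance
nested_mb = outer * mem_per_eval_mb    # only the 8 openmp workers
print(flat_openmp_mb, nested_mb)       # 8192.0 4.0
```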