Multi threaded execution of CasADi functions from generated C-code


Paul Daum

Sep 4, 2018, 4:42:53 AM9/4/18
to CasADi
Dear all,

I'm trying to write a C++ program that loads a shared library (generated from Python code) and then executes the generated code in multiple threads. The loading of the library doesn't necessarily have to happen in parallel. However, the execution of the functions in that library should happen in parallel (without the use of mutex variables).

All my attempts have led to segmentation faults, forcing me to use synchronization schemes. However, in my code the evaluation of those functions takes the most resources, so mutexes would make my program very slow.

Since code says more than a thousand words, here is a minimal example. It would be optimal if I could also execute the same function in two different threads.

My question: How can I make this thread safe without mutexes? Is there a CasADi module that I don't know about that could do this for me? What changes would I have to do to the minimal example below to make it work?


model.py:
import sys
from subprocess import call
from casadi import *

# Create two dummy-functions
def f(x):
    return x[0] + x[1]


def g(x):
    return x[0] + 1


# Create casadi function objects
x = SX.sym('x', 2)
f = Function('f', [x], [f(x)])
g = Function('g', [x], [g(x)])

# Generate the code
c = CodeGenerator("model.c")
c.add(f)
c.add(g)
c.generate()

# Compile the generated code into a shared object
call("cc -fPIC -shared -o model.so model.c -lcasadi".split())


minimal.cpp:
#include <chrono>
#include <thread>
#include <string>
#include <iostream>
#include <casadi/casadi.hpp>

// Compile with: g++ minimal.cpp -lcasadi -pthread -o minimal.out

using std::cout;
using std::endl;
using namespace casadi;

// This function is called by two different threads
void casadi_thread_fun(std::string fname) {
    cout << endl << "I am thread " << fname << endl;

    // Load the function from the shared object
    Function f = external(fname, "./model.so");

    cout << endl << "Loaded function in thread " << fname << endl;

    // Create some dummy input and an output container
    DM x{0,1};
    DMVector y;

    // Evaluate the function over and over
    while (true) {
        f.call(DMVector{x}, y);
        std::this_thread::sleep_for(std::chrono::milliseconds(5));
    }
}

int main() {
    // Create two threads that call two different functions
    std::thread thread1(casadi_thread_fun, "f");
    std::this_thread::sleep_for(std::chrono::milliseconds(500));
    std::thread thread2(casadi_thread_fun, "g");

    // This parent thread shall idle forever
    while (true) {
        std::this_thread::sleep_for(std::chrono::milliseconds(5));
    }
}

Output when executing minimal.out:
I am thread f
Loaded function in thread f
fffff
fffffffffffffffffffffffffffffffffffffffffffffffff [...]
I am thread g
Loaded function in thread g
[Segmentation fault]


Sincerely,

Paul Daum

Joris Gillis

Sep 4, 2018, 4:52:38 PM9/4/18
to CasADi
Dear Paul,

There are several parts of CasADi here that are potentially not thread-safe:
 - the creation/deletion of CasADi objects (your code creates/deletes DM objects on the fly). This is a rather fundamental restriction.
 - the memory checkout/release process under the hood of call. This has been thread-safe since CasADi 3.4.1.
 - many of our interfaces are not thread-safe yet; if you just use SX/MX symbolics, it should be fine.

Below is a modification of your code that works in CasADi 3.4.1+:

 
// Note x{0,1} gives a 0-by-1 matrix
DM x = std::vector<double>{0,1};
DMVector y;

// Evaluate the function over and over
while (true) {
    DMVector arg;
    DMVector res;
    {
        std::lock_guard<std::mutex> lock(mtx_);
        arg = {x};
        res.push_back(1);
    }
    f.call(arg, res);
    std::this_thread::sleep_for(std::chrono::milliseconds(5));
}

joaosa...@gmail.com

Sep 15, 2021, 9:35:23 AM9/15/21
to CasADi
I had a similar issue. Mainly I would like to parallelize facets of my code, but that was not possible when using shared memory.
A solution to this could be using MPI (Message Passing Interface), where the memory is distributed.
Put simply, threads exploit multiple CPUs but share the same chunk of memory, while with MPI multiple CPUs are exploited and
the memory is also multiple (e.g. one chunk of memory per CPU).

In this case a segmentation fault occurs because casadi::Function or something in the CasADi libraries is not thread-safe, and even though f and g are two different functions,
they probably share some members (I guess).

Anyway I played a bit with OpenMPI  and here you can find a workarround to this issue:


#include <casadi/casadi.hpp>
#include <mpi.h>

#include <chrono>
#include <iostream>
#include <thread>

using namespace casadi;


void callFuncLoop(Function& func, const DM& x_0, int sleep_time) {
    std::cout << "Calling func: " << func.name() << std::endl;
    int call_count = 0;
    while (true) {
        auto result = func(std::vector<casadi::DM>{x_0})[0];
        std::this_thread::sleep_for(std::chrono::milliseconds(sleep_time));
        call_count++;
        if (call_count % 5 == 0)
            std::cout << func.name() << " - " << call_count << " calls" << std::endl;
    }
}

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    // Reading size and rank
    int process_id, size;
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &process_id);

    // Create CasADi function objects
    auto x = SX::sym("x", 2, 1);
    auto f = Function("f", {x}, {x(0) + x(1)});
    auto g = Function("g", {x}, {x(0)});

    if (process_id == 0) {
        auto x_0 = DM({0, 1});
        callFuncLoop(f, x_0, 5);  // process 0 evaluates f
    }
    if (process_id == 1) {
        auto x_0 = DM({0, 1});
        callFuncLoop(g, x_0, 5);  // process 1 evaluates g
    }

    // Finalisation
    MPI_Finalize();
}

Note: you need to install MPI and run the executable with

mpirun -np 2 ./executable

which runs 2 processes of the executable.



If you are interested in this avenue, this tutorial was very helpful:

joaosa...@gmail.com

Sep 15, 2021, 9:37:04 AM9/15/21
to CasADi
I don't know how to format the code (sorry in advance) :)