Hello Aizada,
Yes, you will need some form of CPU multithreading (e.g., Pthreads, OpenMP, or std::thread) to launch the solvers without blocking on the host. When using CVODE with GPUs, the integrator logic still runs on the CPU, and the data operations (e.g., vector operations) are done on the GPU. CUDA streams apply only to execution on the device, so if you want fully asynchronous behavior on both the CPU and the GPU, you will need separate CPU threads of execution.
Regards,
Cody