Hi everyone,
I'm reaching out to get some insights on an issue I'm facing. I recently read Cody J. Balos's paper titled:
"SUNDIALS time integrators for exascale applications with many independent systems of ordinary differential equations"
I'm working on a project with a similar flavor to what was done for Pele. In our CFD solver, we decouple chemistry and flow using a time-splitting operator. Each cell is treated as an independent reactor, with a system of ODEs describing the species' rate of change. We compute the right-hand side and Jacobian of these ODEs using PyJac, which is seamlessly integrated with the CPU CVODES dense direct solver.
Now, I'm attempting to port this approach to GPUs. Since PyJac routines work in batches, I send a group of cells to the PyJac kernels, which return the Jacobians and rates of change for the entire batch. I've noticed that there is a MAGMA CVODES dense solver available, which I might be able to use. My current strategy is as follows:
I'm not entirely sure if this is the most efficient or correct approach, as my intuition tells me that solving the system might not scale linearly with the batch size (though I admit my understanding of advanced ODE techniques is somewhat limited). I also looked for a batch dense linear solver that could handle each cell’s system independently on different threads, but I couldn't find any such function.
Could anyone clarify what procedure was used in Balos's paper? Specifically, which SUNDIALS solver did they employ, and how did they handle the independent systems on the GPU?
Thank you very much in advance for your help!
Best regards, Borja
Hello Borja,
Your strategy seems to be on the right track. The MAGMA SUNMatrix has a constructor for batched systems (see https://sundials.readthedocs.io/en/latest/sunmatrix/SUNMatrix_links.html#c.SUNMatrix_MagmaDenseBlock). When you create a block MAGMA SUNMatrix, the MAGMA SUNLinearSolver uses the batched routines from MAGMA to solve your linear systems. These are the matrix and solver used in the SUNDIALS+Pele paper.
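To make the block idea concrete, here is a minimal CPU-only sketch of what a batched dense solve does conceptually. It is not the MAGMA or SUNDIALS implementation: the function names solve_block and solve_batched are hypothetical, and the contiguous column-major storage of the blocks is an assumption made for illustration. Each cell's n x n system is factored and solved on its own, so in this sketch the work grows linearly with the number of blocks (roughly nblocks * n^3) rather than with the cube of the total system size.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: solve one dense n x n system A x = b in place using
// Gaussian elimination with partial pivoting. A is column-major and is
// overwritten; on return b holds x. Returns false on an exactly zero pivot.
bool solve_block(double* A, double* b, std::size_t n) {
  for (std::size_t k = 0; k < n; ++k) {
    // Choose the row with the largest entry in column k as the pivot row.
    std::size_t p = k;
    for (std::size_t i = k + 1; i < n; ++i)
      if (std::fabs(A[i + k * n]) > std::fabs(A[p + k * n])) p = i;
    if (A[p + k * n] == 0.0) return false;
    if (p != k) {
      for (std::size_t j = 0; j < n; ++j) std::swap(A[k + j * n], A[p + j * n]);
      std::swap(b[k], b[p]);
    }
    // Eliminate column k from the rows below the pivot.
    for (std::size_t i = k + 1; i < n; ++i) {
      const double m = A[i + k * n] / A[k + k * n];
      for (std::size_t j = k; j < n; ++j) A[i + j * n] -= m * A[k + j * n];
      b[i] -= m * b[k];
    }
  }
  // Back substitution on the resulting upper-triangular system.
  for (std::size_t k = n; k-- > 0;) {
    for (std::size_t j = k + 1; j < n; ++j) b[k] -= A[k + j * n] * b[j];
    b[k] /= A[k + k * n];
  }
  return true;
}

// Hypothetical helper: solve nblocks independent n x n systems stored back to
// back (block 0 first). No block reads or writes another block's data, which
// is why a GPU batched routine can process the blocks concurrently.
bool solve_batched(std::vector<double>& A, std::vector<double>& b,
                   std::size_t nblocks, std::size_t n) {
  assert(A.size() == nblocks * n * n);
  assert(b.size() == nblocks * n);
  bool ok = true;
  for (std::size_t blk = 0; blk < nblocks; ++blk)
    ok = solve_block(&A[blk * n * n], &b[blk * n], n) && ok;
  return ok;
}
```

Calling solve_batched(A, b, nblocks, n) with one block per cell overwrites b with the per-cell solutions. The loop over blocks is the part that a batched GPU solver would parallelize.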
Regards,
Cody
Cody Balos, 7000 East Avenue, Livermore, CA 94550
Since this topic (GPU chemistry) has come up on the list a few times recently, I thought I would point to our work in this area. We've implemented GPU-parallelized chemical kinetics solvers using CVODE for time integration, with both sparse and dense Jacobians (semi-analytic or divided difference), for both CUDA [1] and HIP [2] platforms. Our focus is on combustion, but much of the implementation should be transferable to other contexts. Feel free to reach out to me if you have questions about our implementation.
Kind regards,
Russell Whitesides, Ph.D.
Energy Conversion & Group Leader
Lawrence Livermore National Laboratory
[1] https://github.com/LLNL/zero-rk/blob/master/applications/cfd_plugin/reactor_nvector_serial_cuda.cu
[2] https://github.com/LLNL/zero-rk/blob/hip/applications/cfd_plugin/reactor_nvector_serial_gpu.hip