Forwarding this to the mailing list as it never went through due to the IT outage on Friday.
From: utpal kiran <utpal...@gmail.com>
Date: Thursday, July 18, 2024 at 8:25 AM
To: sundial...@llnl.gov <sundial...@llnl.gov>
Subject: Unable to solve using SUNLinSol_cuSolverSp_batchQR( )
Hi,
I have an application (part of a larger problem) consisting of many independent ODE systems that share the same sparsity pattern. I can solve this problem on the CPU by calling CVODE functions inside a loop, using a dense matrix (SUNDenseMatrix) and the dense linear solver (SUNLinSol_Dense). Now I want to solve it on a GPU using CUDA, for which I have identified the cuSOLVER sparse batched QR linear solver. I set up the problem exactly as in the 'cvRoberts_block_cusolversp_batchqr.cu' example, with my own RHS and Jacobian functions. However, CVODE reports the following error.
[ERROR][rank 0][/src/cvode/cvode.c:3698][cvHandleFailure] At t = 0 and h = 6.90801e-310, the corrector convergence test failed repeatedly or with |h| = hmin
SUNDIALS_ERROR: CVode() failed with retval = -4
What could be wrong?
I note that the Jacobian (size: 53x53) for my problem is singular (one entire row consists of zeros). Is the failure arising because my Jacobian is singular? If so, how is SUNLinSol_Dense able to solve it, and how can I solve this problem on the GPU?
Thank you
Utpal Kiran
To unsubscribe from the SUNDIALS-USERS list: write to: mailto:SUNDIALS-USERS-...@LISTSERV.LLNL.GOV
Hi Utpal,
The cuSolver sparse QR requires that your matrices have full rank (see https://docs.nvidia.com/cuda/cusolver/index.html#cusolverspxcsrqrbatched). Since your Jacobian matrix is singular (and therefore not full rank) this solver will not work for you. I suggest trying the MAGMA linear solver interface we have in SUNDIALS.
Cody
Hi Utpal,
Your Jacobian is only one part of the linear system that CVODE solves within its Newton iteration. In fact, CVODE solves linear systems of the form (I – gamma*J(t,y))*x = b, where J(t,y) is your singular Jacobian, gamma is a scalar that is proportional to the time step size, and I is the identity matrix. Thus although your Jacobian is singular, the linear system is not.
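To make the point above concrete, here is a minimal sketch in plain Python, not SUNDIALS code. The 3x3 matrix J and the value of gamma are hypothetical, chosen only so that J has an all-zero row like the one described in the question. The helper names det3 and newton_matrix are made up for this illustration.

```python
from fractions import Fraction


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def newton_matrix(J, gamma):
    """Form I - gamma*J for a square matrix J given as a list of rows."""
    n = len(J)
    return [[(1 if r == c else 0) - gamma * J[r][c] for c in range(n)]
            for r in range(n)]


# Hypothetical Jacobian whose last row is entirely zero, so J is singular.
J = [[Fraction(-2), Fraction(1), Fraction(0)],
     [Fraction(1), Fraction(-3), Fraction(1)],
     [Fraction(0), Fraction(0), Fraction(0)]]

gamma = Fraction(1, 10)  # assumed value; in CVODE gamma scales with the step size

A = newton_matrix(J, gamma)
det_J = det3(J)  # zero, because of the zero row
det_A = det3(A)  # nonzero, so I - gamma*J is invertible for this gamma

print("det(J) =", det_J)
print("det(I - gamma*J) =", det_A)
print("last row of I - gamma*J =", A[2])
```

The zero row of J becomes the corresponding row of the identity in I - gamma*J, so that equation simply reads x_i = b_i. In general I - gamma*J is nonsingular whenever 1/gamma is not an eigenvalue of J, which always holds for sufficiently small gamma.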
That said, the SUNDIALS dense linear solver uses partial pivoting (as does MAGMA), while the cuSolver sparse QR does not. In our experience with ill-conditioned linear systems, partial pivoting can be critical for numerical stability, and we have seen the same failure you describe when using cuSolver.
I will thus echo Cody’s recommendation to try MAGMA.
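As a rough illustration of why pivoting can matter, here is a toy sketch in plain Python. It uses 2x2 Gaussian elimination, which is a swapped-in technique and not the QR factorization that cuSolver performs, and the matrix and right-hand side are hypothetical textbook values rather than anything from this thread. It only shows the general effect of choosing a tiny pivot in floating point.

```python
def solve2_no_pivot(A, b):
    """Gaussian elimination on a 2x2 system, always using A[0][0] as the pivot."""
    (a11, a12), (a21, a22) = A
    m = a21 / a11          # elimination multiplier
    u22 = a22 - m * a12    # second pivot after eliminating a21
    y2 = b[1] - m * b[0]
    x2 = y2 / u22
    x1 = (b[0] - a12 * x2) / a11
    return [x1, x2]


def solve2_partial_pivot(A, b):
    """Same elimination, after swapping rows so the larger first-column entry is the pivot."""
    if abs(A[1][0]) > abs(A[0][0]):
        A = [A[1], A[0]]
        b = [b[1], b[0]]
    return solve2_no_pivot(A, b)


# Hypothetical system with a tiny leading entry; the exact solution is close to [1, 1].
A = [[1e-20, 1.0], [1.0, 1.0]]
b = [1.0, 2.0]

x_naive = solve2_no_pivot(A, b)       # first component is lost to rounding
x_pivot = solve2_partial_pivot(A, b)  # both components are accurate

print("no pivoting:     ", x_naive)
print("partial pivoting:", x_pivot)
```

Without the row swap, the huge multiplier swamps the other entries and the first solution component comes out as 0 instead of roughly 1. With the swap, both components are recovered.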
Daniel R. Reynolds (he/him)
Professor, SMU Mathematics
Hi Utpal,
Try turning on the advanced CMake options SUNDIALS_DEBUG, SUNDIALS_DEBUG_ASSERT, and SUNDIALS_DEBUG_CUDA_LASTERROR when you build SUNDIALS. This may help debug the second case (with cuSolver) in particular.
Cody