Hi Yingmo,
Glad to hear of your interest in Ceres+CUDA!
I recommend first clarifying what you want to optimize and what hardware you have available. Depending on whether your goal is raw speed or offloading work from the CPU, and on the platform you are running on, you may want to choose between the CUDA Dense QR solver and a platform-optimized LAPACK, such as Intel MKL. I suggest running some experiments to compare them and see which fits your needs.
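As a rough sketch of such an experiment, a small timing helper like the one below can be wrapped around one complete solve per backend. The helper name `MedianSeconds` and the `solve` callable are placeholders of mine, not part of Ceres. The comments assume a recent Ceres build with CUDA enabled, where the backend is selected through `Solver::Options`.

```cpp
#include <algorithm>
#include <chrono>
#include <functional>
#include <vector>

// Hypothetical helper (not part of Ceres): runs `solve` `repeats` times and
// returns the median wall-clock time in seconds. Returns 0.0 if repeats <= 0.
//
// `solve` stands in for one full solve with a given backend. With Ceres this
// would typically build the problem and call ceres::Solve() with, for example:
//   options.linear_solver_type = ceres::DENSE_QR;
//   options.dense_linear_algebra_library_type = ceres::CUDA;  // or ceres::LAPACK
// (assuming a Ceres version and build that provide these backends).
inline double MedianSeconds(const std::function<void()>& solve, int repeats) {
  if (repeats <= 0) return 0.0;
  std::vector<double> samples;
  samples.reserve(repeats);
  for (int i = 0; i < repeats; ++i) {
    const auto start = std::chrono::steady_clock::now();
    solve();
    const auto stop = std::chrono::steady_clock::now();
    samples.push_back(std::chrono::duration<double>(stop - start).count());
  }
  // The median is less sensitive than the mean to one-off costs such as
  // GPU context creation on the first run.
  std::sort(samples.begin(), samples.end());
  return samples[samples.size() / 2];
}
```

Calling `MedianSeconds(run_with_cuda, 10)` and `MedianSeconds(run_with_lapack, 10)` on the same problem (both callables being your own wrappers) gives a first comparison. If your goal is offload rather than speed, CPU utilization during the runs is worth recording as well.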
We are working on adding sparse CUDA support, but there is no clear timeline for it, since cuSPARSE support in CUDA is still preliminary and it's unclear how much of an advantage it would provide.
If you have a specific problem in mind (ideally with source code) that would help us plan future Ceres + CUDA developments, please do share!
Regards,
Joydeep