GSoC 2025: Dynamic CUDA Support in OpenCV DNN

Ambika Sharan

Apr 3, 2025, 2:30:18 AM
to opencv-gsoc-202x

Hello OpenCV,

I hope this email finds you well. I am writing to share my proposal for implementing Dynamic CUDA Support in OpenCV’s DNN Module, which I believe would greatly enhance OpenCV’s flexibility and usability for NVIDIA GPU acceleration.


Overview

OpenCV’s DNN module currently supports CUDA as a backend for deep learning inference, but enabling it requires the CUDA SDK at build time and ties the resulting OpenCV binaries to the CUDA libraries. My goal is to decouple CUDA support through a dynamic loading mechanism, similar to the one used for the OpenVINO backend. OpenCV could then be built without CUDA at compile time and still enable GPU acceleration at runtime through a separate CUDA plugin.


Key Objectives & Approach

  • Develop a CUDA Plugin: The CUDA execution engine will be compiled as a separate shared library (opencv_cuda_dnn.so/.dll), which OpenCV can load dynamically.

  • Implement Dynamic Loading: Use dlopen() (Linux/macOS) or LoadLibrary() (Windows) to detect and load the CUDA plugin at runtime, so that the core OpenCV library never links against CUDA directly.

  • Automatic Memory Management: Ensure seamless GPU memory transfers by handling host-to-device and device-to-host memory copying within the plugin.

  • Modify OpenCV Build System: Introduce a CMake option (WITH_CUDA_PLUGIN) to enable building the CUDA plugin separately, ensuring OpenCV itself remains CUDA-independent.

  • Graceful Fallback & Performance Considerations: If CUDA is unavailable, OpenCV should safely fall back to another backend, ensuring robust error handling and minimal performance overhead.
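
To make the dynamic loading and fallback objectives concrete, here is a minimal sketch of what I have in mind. It is not OpenCV code: the Backend enum, the PluginLibrary class, the selectBackend() helper, and the exported symbol name opencv_cuda_dnn_init are all hypothetical names chosen for illustration. Only dlopen()/dlsym()/dlclose() and LoadLibraryA()/GetProcAddress()/FreeLibrary() are real platform calls.

```cpp
#include <string>

#if defined(_WIN32)
#  include <windows.h>
#else
#  include <dlfcn.h>
#endif

// Hypothetical backend identifiers for this sketch (not OpenCV's own enums).
enum class Backend { CPU, CUDA };

// Hypothetical signature of the entry point a CUDA plugin might export.
// Assumption: it returns 0 when a usable CUDA device was found.
using PluginInitFn = int (*)();

// RAII wrapper around a platform shared-library handle.
class PluginLibrary {
public:
    explicit PluginLibrary(const std::string& path) {
#if defined(_WIN32)
        handle_ = reinterpret_cast<void*>(LoadLibraryA(path.c_str()));
#else
        handle_ = dlopen(path.c_str(), RTLD_NOW | RTLD_LOCAL);
#endif
    }

    ~PluginLibrary() {
        if (!handle_) return;
#if defined(_WIN32)
        FreeLibrary(reinterpret_cast<HMODULE>(handle_));
#else
        dlclose(handle_);
#endif
    }

    PluginLibrary(const PluginLibrary&) = delete;
    PluginLibrary& operator=(const PluginLibrary&) = delete;

    // True if the shared library was found and loaded.
    bool loaded() const { return handle_ != nullptr; }

    // Looks up an exported symbol; returns nullptr if the library
    // is not loaded or the symbol is missing.
    void* symbol(const char* name) const {
        if (!handle_) return nullptr;
#if defined(_WIN32)
        return reinterpret_cast<void*>(
            GetProcAddress(reinterpret_cast<HMODULE>(handle_), name));
#else
        return dlsym(handle_, name);
#endif
    }

private:
    void* handle_ = nullptr;
};

// Tries the CUDA plugin and falls back to the CPU backend if the library
// is missing, lacks the entry point, or reports an initialization failure.
// Note: the library is unloaded when this function returns; a real
// implementation would keep the handle alive for as long as the plugin is used.
Backend selectBackend(const std::string& pluginPath) {
    PluginLibrary lib(pluginPath);
    if (!lib.loaded()) return Backend::CPU;

    auto init = reinterpret_cast<PluginInitFn>(lib.symbol("opencv_cuda_dnn_init"));
    if (!init || init() != 0) return Backend::CPU;

    return Backend::CUDA;
}
```

On a machine without the plugin, a call such as selectBackend("opencv_cuda_dnn.so") would return Backend::CPU without raising an error, which is the fallback behavior described above. On older glibc versions the sketch may need to be linked with -ldl.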


Expected Deliverables

  • CUDA Plugin Implementation – A separate dynamically loaded library for CUDA inference.

  • Integration with OpenCV DNN – Modifications to cv::dnn::Net for detecting and using the CUDA plugin dynamically.

  • Build System Enhancements – Updates to CMake to support plugin-based CUDA loading.

  • Testing & Benchmarking – Ensuring functionality, correctness, and measuring performance impact.

  • Comprehensive Documentation – Clear usage guides and API references.


Timeline & Next Steps

I plan to engage with the OpenCV community to refine the implementation details, validate feasibility through a prototype, and submit my work as a series of patches. I would appreciate any feedback or guidance on this approach to ensure alignment with OpenCV’s long-term vision.


Qualifications

Education: B.S. in Computer Science and Data Science, University of Wisconsin-Madison

Relevant Experience:

  • AMD GPU Intern: Experience with HIP, CUDA and OpenMP.

  • HPC & Compiler Optimization: Research in High-Performance Computing (HPC) and performance engineering at AMD.

  • C++ Development: Strong background in C++ for system programming and software performance tuning.

Would you be available for a brief discussion on this proposal? I am happy to incorporate any suggestions and further refine the plan before formally submitting it.


Looking forward to your thoughts.


Best regards,

Ambika Sharan

