CUDA NVIDIA Engineer
Location: Waukesha, WI
We are seeking a skilled CUDA Programmer to design, develop, and optimize high-performance applications on NVIDIA GPUs. The role focuses on accelerating compute-intensive workloads, optimizing memory usage, and collaborating with system and application teams to maximize GPU performance.
Key Responsibilities
- Profile and tune GPU applications for performance, memory efficiency, and scalability.
- Work with heterogeneous CPU–GPU programming models and optimize host-device data transfers.
- Leverage the CUDA toolkit and NVIDIA libraries (cuBLAS, cuDNN, NCCL) as applicable.
- Collaborate with system, compute, or AI/ML teams to integrate GPU-accelerated components.
- Debug GPU kernels and address performance bottlenecks using NVIDIA profiling tools.
- Ensure portability and performance across different NVIDIA GPU architectures.
Required Skills
- Strong experience in CUDA programming and parallel computing concepts.
- In-depth understanding of NVIDIA GPU architecture (threads, warps, SMs, memory hierarchy).
- Proficiency in C/C++ for high-performance computing.
- Experience with CUDA profiling and debugging tools (Nsight Systems, Nsight Compute, legacy nvprof).
- Solid understanding of multi-threading, memory optimization, and performance tuning.
Preferred Skills
- Experience with AI/ML, HPC, or graphics workloads on GPUs.
- Familiarity with multi-GPU programming and communication frameworks (NCCL, MPI).
- Exposure to Python bindings (CUDA Python, PyTorch extensions).
- Experience with Linux-based development environments.