Doubling of GPU Memory using CUDA Toolkit 7.5 RC


WixAuto

Aug 2, 2015, 12:06:37 PM
to Caffe Users
According to the CUDA Toolkit 7.5 RC notes:

16-bit floating point (FP16) data format
  • Store up to 2x larger datasets in GPU memory
  • Reduce memory bandwidth requirements by up to 2x
  • New mixed-precision cublasSgemmEx() routine supports 2x larger matrices

 What does one have to do in Caffe to take advantage of this excellent enhancement?


I am new to Caffe but have C++ skills. I have been struggling to fit my large network into a 4 GB GPU, and this would help greatly with that problem. Could someone provide a step-by-step guide on how to do this? Any advice would be helpful.


Many Thanks



Gerzain Mata

Aug 2, 2015, 7:34:08 PM
to Caffe Users
I believe NVIDIA makes this statement because of the introduction of the half float, which takes up 16 bits instead of the usual 32. The catch is that native arithmetic on half values only works on a Tegra X1 board (compute capability 5.3); on other GPUs, half is essentially a storage format. What it does let you do is keep your data in half precision and convert it to single precision for the GPU to work on with your usual matrix/library function calls (Thrust, cuBLAS, etc.). See http://devblogs.nvidia.com/parallelforall/new-features-cuda-7-5/ for more info.
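
In case a concrete example helps: below is a minimal sketch of that store-in-FP16, compute-in-FP32 pattern in plain CUDA. This is not Caffe code; it only assumes CUDA 7.5's cuda_fp16.h, and the kernel and variable names are my own.

#include <cuda_fp16.h>
#include <cstdio>

// Pack a float array into half precision (storage only).
__global__ void float_to_half(const float* src, half* dst, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) dst[i] = __float2half(src[i]);
}

// Example compute kernel: unpack to float, do the math in FP32, repack.
__global__ void scale_half(half* data, float alpha, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float x = __half2float(data[i]);    // promote to FP32 for the arithmetic
        data[i] = __float2half(alpha * x);  // store the result back as FP16
    }
}

int main() {
    const int n = 1 << 20;
    float* d_f;
    half*  d_h;
    cudaMalloc(&d_f, n * sizeof(float));  // 4 MB as FP32
    cudaMalloc(&d_h, n * sizeof(half));   // 2 MB as FP16 -- half the footprint

    // ... fill d_f with your data here ...

    float_to_half<<<(n + 255) / 256, 256>>>(d_f, d_h, n);
    cudaFree(d_f);  // the FP32 copy is no longer needed

    scale_half<<<(n + 255) / 256, 256>>>(d_h, 0.5f, n);
    cudaDeviceSynchronize();
    printf("done: %s\n", cudaGetErrorString(cudaGetLastError()));

    cudaFree(d_h);
    return 0;
}

As far as I know, stock Caffe only instantiates its blobs for float and double, so wiring half-precision storage into its layers would be up to you.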