Doubling of GPU Memory using CUDA Toolkit 7.5 RC


WixAuto

Aug 2, 2015, 12:06:37 PM
to Caffe Users
According to the CUDA Toolkit 7.5 RC notes:

16-bit floating point (FP16) data format
  • Store up to 2x larger datasets in GPU memory
  • Reduce memory bandwidth requirements by up to 2x
  • New mixed-precision cublasSgemmEx() routine supports 2x larger matrices

 What does one have to do in Caffe to take advantage of this excellent enhancement?


I am new to Caffe but have C++ skills. I have been struggling to fit my large network into a 4 GB GPU, and this would help greatly with that problem. Could someone provide a step-by-step guide on how to do this? Any advice would be helpful.


Many Thanks



Gerzain Mata

Aug 2, 2015, 7:34:08 PM
to Caffe Users
I believe NVIDIA makes this statement because of the introduction of the half float, which takes up 16 bits instead of the usual 32. The catch is that native arithmetic on half values only works on a Tegra X1 board (compute capability 5.3); on other GPUs, half is essentially a storage format. What it does let you do is keep your data in half precision and convert it to single precision for the GPU to work on with your usual matrix/library function calls (Thrust, cuBLAS, etc.). See http://devblogs.nvidia.com/parallelforall/new-features-cuda-7-5/ for more info.
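
In case a concrete example helps: below is a minimal sketch of that store-in-FP16, compute-in-FP32 pattern in plain CUDA. This is not Caffe code; it only assumes CUDA 7.5's cuda_fp16.h, and the kernel and variable names are my own.

#include <cuda_fp16.h>
#include <cstdio>

// Pack a float array into half precision (storage only).
__global__ void float_to_half(const float* src, half* dst, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) dst[i] = __float2half(src[i]);
}

// Example compute kernel: unpack to float, do the math in FP32, repack.
__global__ void scale_half(half* data, float alpha, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float x = __half2float(data[i]);    // promote to FP32 for the arithmetic
        data[i] = __float2half(alpha * x);  // store the result back as FP16
    }
}

int main() {
    const int n = 1 << 20;
    float* d_f;
    half*  d_h;
    cudaMalloc(&d_f, n * sizeof(float));  // 4 MB as FP32
    cudaMalloc(&d_h, n * sizeof(half));   // 2 MB as FP16 -- half the footprint

    // ... fill d_f with your data here ...

    float_to_half<<<(n + 255) / 256, 256>>>(d_f, d_h, n);
    cudaFree(d_f);  // the FP32 copy is no longer needed

    scale_half<<<(n + 255) / 256, 256>>>(d_h, 0.5f, n);
    cudaDeviceSynchronize();
    printf("done: %s\n", cudaGetErrorString(cudaGetLastError()));

    cudaFree(d_h);
    return 0;
}

As far as I know, stock Caffe only instantiates its blobs for float and double, so wiring half-precision storage into its layers would be up to you.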