Dear community,
I'm following this tutorial, in which a pre-trained model is loaded to perform a simple forward pass of an image.
The blob to be processed is defined as:
net.blobs['data'].reshape(50,3,227,227)
In CPU mode everything works fine, but when I switch to GPU mode I get this error:
F0218 17:51:32.118782 2115146496 syncedmem.cpp:64] Check failed: error == cudaSuccess (2 vs. 0) out of memory
*** Check failure stack trace: ***
I know this is a common error, and the usual fix is to reduce the batch size. But in my case I get the same error even with batch_size=1.
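To put the batch size in perspective, here is a rough sketch of how much memory the input blob from the reshape call above would need, assuming 4-byte single-precision values. The helper name `blob_bytes` is made up for illustration. It counts the input blob only, not the intermediate activations or the model weights, and the weights do not shrink with the batch size.

```python
from functools import reduce
from operator import mul

# Assumption: each blob element is a 4-byte single-precision float.
BYTES_PER_FLOAT32 = 4


def blob_bytes(shape):
    """Return the bytes needed to hold one blob of the given shape (hypothetical helper)."""
    count = reduce(mul, shape, 1)  # total number of elements: N * C * H * W
    return count * BYTES_PER_FLOAT32


# Compare the batch of 50 from the reshape call with a batch of 1.
for batch in (50, 1):
    shape = (batch, 3, 227, 227)
    size = blob_bytes(shape)
    print("batch=%d: %d bytes (%.2f MiB)" % (batch, size, size / 2.0 ** 20))
```

Either way the input blob is small compared with the card's memory, so the input alone is unlikely to explain the error.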
I'm running OS X 10.10.5 with CUDA 7.5 and the latest CUDA driver installed. These are my GPU properties according to Caffe:
lobelia:~ user$ /path_to_caffe/build/tools/caffe device_query -gpu 0
I0218 17:56:36.361524 2115146496 caffe.cpp:112] Querying GPUs 0
I0218 17:56:36.480191 2115146496 common.cpp:168] Device id: 0
I0218 17:56:36.480242 2115146496 common.cpp:169] Major revision number: 3
I0218 17:56:36.480260 2115146496 common.cpp:170] Minor revision number: 0
I0218 17:56:36.480271 2115146496 common.cpp:171] Name: GeForce GT 650M
I0218 17:56:36.480284 2115146496 common.cpp:172] Total global memory: 1073414144
I0218 17:56:36.480293 2115146496 common.cpp:173] Total shared memory per block: 49152
I0218 17:56:36.480300 2115146496 common.cpp:174] Total registers per block: 65536
I0218 17:56:36.480307 2115146496 common.cpp:175] Warp size: 32
I0218 17:56:36.480314 2115146496 common.cpp:176] Maximum memory pitch: 2147483647
I0218 17:56:36.480320 2115146496 common.cpp:177] Maximum threads per block: 1024
I0218 17:56:36.480337 2115146496 common.cpp:178] Maximum dimension of block: 1024, 1024, 64
I0218 17:56:36.480345 2115146496 common.cpp:181] Maximum dimension of grid: 2147483647, 65535, 65535
I0218 17:56:36.480353 2115146496 common.cpp:184] Clock rate: 900000
I0218 17:56:36.480388 2115146496 common.cpp:185] Total constant memory: 65536
I0218 17:56:36.480397 2115146496 common.cpp:186] Texture alignment: 512
I0218 17:56:36.480403 2115146496 common.cpp:187] Concurrent copy and execution: Yes
I0218 17:56:36.480409 2115146496 common.cpp:189] Number of multiprocessors: 2
I0218 17:56:36.480415 2115146496 common.cpp:190] Kernel execution timeout: Yes
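For readability, the "Total global memory" value from the log above can be converted from bytes to MiB with a small sketch like the one below. The constant and function names are my own. Note that this is the card's total memory, not the amount that is actually free when Caffe starts.

```python
# Value copied from the "Total global memory" line of the device_query output.
TOTAL_GLOBAL_MEMORY_BYTES = 1073414144


def to_mib(num_bytes):
    """Convert a byte count to mebibytes (1 MiB = 2**20 bytes)."""
    return num_bytes / 2.0 ** 20


print("Total global memory: %.2f MiB" % to_mib(TOTAL_GLOBAL_MEMORY_BYTES))
```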
Does anyone know what is going on?