> You might be able to train, if your batch size is very small (increase
> iter_size to offset this reduction). The fully connected layers usually
> take up the most memory.
For images, a 256x256 image is 64K pixels and takes 192 kilobytes raw
(one byte per color channel); Caffe stores blobs as 32-bit floats,
though, so figure about four times that on the GPU. A batch of size 256
would then require about 50 megabytes raw, or roughly 200 megabytes as
floats, which is still small compared to any modern GPU's memory, I think.
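
To make that arithmetic explicit, a quick Python sketch (the numbers
and names are just for illustration; the factor of four is the
32-bit-float assumption above):

    # Back-of-envelope memory for a batch of 3-channel 256x256 images.
    batch_size = 256
    height = width = 256
    channels = 3
    raw_bytes = height * width * channels      # 196608 bytes = 192 KB/image
    f32_bytes = raw_bytes * 4                  # stored as 32-bit floats
    print(raw_bytes / 2**10, "KB per image, raw")                 # 192.0
    print(batch_size * f32_bytes / 2**20, "MB per batch, float")  # 192.0
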
On the other hand, the connection between two 4096-neuron FC (IP)
layers requires 4096² = ~16.8M weights, which I guess will be 32-bit
floats on the GPU. That's 64 megabytes for that single layer. So unless
you have very large input data and/or a very simple network, the bulk
of the memory is likely to go into the network itself.
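
Same sort of sketch for the weights:

    # Weights connecting two 4096-neuron InnerProduct (FC) layers.
    fan_in = fan_out = 4096
    n_weights = fan_in * fan_out          # 16,777,216 parameters
    print(n_weights * 4 / 2**20, "MB")    # 4 bytes each -> 64.0 MB
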
Oh wait: the batch is forward-fed as a tensor, isn't it? So
intermediate values are generated for the whole batch at once. While
you only need to store two copies of the weights (the extra one for the
update), you also need batch-size times layer-size of activation data
for each layer, here 4096x256 floats, about 4 MB. And since those
activations have to be kept around for the backward pass, they pile up
across all the layers rather than one layer at a time; still, with FC
layers this size the weights dominate in most cases.
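
Sketching the activations the same way (and note that Caffe also keeps
a same-sized diff buffer per blob during training, which roughly
doubles this):

    # Activations of one 4096-wide FC layer for a batch of 256 (float32).
    neurons = 4096
    batch_size = 256
    print(neurons * batch_size * 4 / 2**20, "MB per layer per copy")  # 4.0
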
These are all very back-of-envelope guesstimates, so I'd be curious to
hear about practical experiences. I currently use a Titan X with 12GB
and haven't run into any problems (with AlexNet). What kind of network
and data gave you (previous poster) an out-of-memory error on 2GB? Did
anybody run out with 6GB or 8GB cards?
-k
--
If I haven't seen further, it is by standing in the footprints of giants