Is there any way to optimize memory usage in Keras on TensorFlow? It uses a lot more memory compared to Torch.


Eren Gölge
Oct 20, 2016, 11:54:23 AM
to Keras-users
I compared the model below with Facebook's 18-layer ResNet implementation, and the memory overhead on the Keras side is much higher.

Here is my Keras model:

from keras.models import Sequential
from keras.layers import Convolution2D, MaxPooling2D, Activation, Flatten, Dense, Dropout

model = Sequential()
model.add(Convolution2D(32, 7, 7, border_mode='same', subsample=(2, 2), init='glorot_uniform', input_shape=(224, 224, 3)))  # 112
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))  # 56

model.add(Convolution2D(64, 3, 3, border_mode='same', subsample=(1, 1), init='glorot_uniform'))  # 56
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))  # 28

model.add(Convolution2D(128, 3, 3, border_mode='same', subsample=(1, 1), init='glorot_uniform'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))  # 14

model.add(Convolution2D(256, 3, 3, border_mode='same', subsample=(1, 1), init='glorot_uniform'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))  # 7

model.add(Convolution2D(512, 3, 3, border_mode='same', subsample=(1, 1), init='glorot_uniform'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(7, 7)))  # 1

model.add(Flatten())  # this converts our 3D feature maps to 1D feature vectors
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(10))
model.add(Activation('softmax'))

model.summary()

model.compile(loss='categorical_crossentropy',
              optimizer='rmsprop')


It allocates 3306 MB of GPU memory, whereas the 18-layer ResNet allocates ~2400 MB. Given that, I have the following questions:

1. Is this just the cost of Keras being more generic, or is it the general case of TensorFlow versus Torch?
2. Is there any way to optimize memory use, e.g. by sharing forward- and backward-pass variables or any other technique? (A possible configuration tweak is sketched below.)
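
One commonly suggested knob for TensorFlow's memory behaviour is to stop it from reserving GPU memory up front and let the allocation grow on demand, so that tools like nvidia-smi report closer to what the model actually needs. A minimal sketch, assuming the TensorFlow backend and the same Keras 1.x-era API as above:

import tensorflow as tf
from keras.backend.tensorflow_backend import set_session

config = tf.ConfigProto()
config.gpu_options.allow_growth = True  # allocate GPU memory as needed instead of up front
# config.gpu_options.per_process_gpu_memory_fraction = 0.6  # alternatively, cap the fraction used
set_session(tf.Session(config=config))

This does not shrink what the model itself requires, but it keeps TensorFlow from holding memory it is not using.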

Thanks for any comments.

Oscar Serra
Nov 8, 2016, 3:44:56 PM
to Keras-users
Hi Eren, this seems to be a drawback of using TensorFlow. I have been reading a few TF support forums, and they are working on it. I am trying to run VGG16, and it only fits on my GPU (2 GB) with the Theano backend.
What I am trying to figure out now is how to change the batch size. Any help in that direction would be appreciated. Thanks.
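
In case it is useful, the batch size is just an argument to model.fit(). A minimal sketch with hypothetical X_train/y_train arrays, using the Keras 1.x-era API from this thread (where the epoch count is spelled nb_epoch):

import numpy as np

# toy data, only to show the call signature
X_train = np.random.rand(32, 224, 224, 3)
y_train = np.zeros((32, 10))

model.fit(X_train, y_train,
          batch_size=8,   # smaller batches lower peak GPU memory
          nb_epoch=10)

Reducing batch_size is usually the quickest way to squeeze a large model such as VGG16 into 2 GB of GPU memory.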