Resize Image and MemoryError


Gledson Melotti

Mar 9, 2018, 1:13:49 PM
to keras...@googlegroups.com
Hello, I'm using Python to transform my images. I have 50,000 images of various sizes to use with deep learning. I resized them all to 224x224 (three channels). After that, I need to convert them to float32. Unfortunately I can't: a 'MemoryError' appears. I don't understand this error, because the examples I am studying use the CIFAR-10 dataset, convert its images to float32, and do not produce a MemoryError.

Could you help me with this problem of MemoryError?

Why do my images produce this error while the CIFAR-10 dataset does not?

I am sending the code (algorithm) I use to resize my images to 224x224x3 and then convert them to float32.


import os
import numpy as np
from PIL import Image

xtrain = []

path = 'C:/...'
dirs = os.listdir(path)

# Resize every image to 224x224 (this overwrites the originals)
for item in dirs:
    if os.path.isfile(path + item):
        im = Image.open(path + item)
        imResize = im.resize((224, 224), Image.ANTIALIAS)
        imResize.save(path + item)

# Load the resized images into a list, then convert to a float32 array
for item in dirs:
    if os.path.isfile(path + item):
        im = Image.open(path + item)
        img = np.array(im)
        xtrain.append(img)

X_train = np.float32(xtrain)

Thanks,
Gledson Melotti.

Rohit Saha

Mar 11, 2018, 4:33:40 PM
to Keras-users
I see that in your first for loop you read images from a directory, resize them, and overwrite the originals with the resized versions. It's good practice to keep the original images as they are and resize them on the fly (while your model trains). In the second for loop you read the same resized images again, convert them to arrays, and append them to a list - which in my opinion is bad practice and is likely filling your RAM, leading to the memory problem. It works for CIFAR because those images are 32 px along each dimension; yours are 7x larger per side, i.e. roughly 49x the pixels.
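To see why the whole-dataset-in-RAM approach works for CIFAR but not here, a quick back-of-envelope estimate (using the image counts and sizes from the posts above):

```python
# 50,000 images at 224x224x3, stored as float32 (4 bytes per value)
n_images = 50_000
bytes_per_image = 224 * 224 * 3 * 4
total_gb = n_images * bytes_per_image / 1024**3
print(f"~{total_gb:.1f} GB")   # -> ~28.0 GB, far beyond typical RAM

# CIFAR-10 for comparison: 50,000 images at 32x32x3
cifar_gb = 50_000 * 32 * 32 * 3 * 4 / 1024**3
print(f"~{cifar_gb:.2f} GB")   # -> ~0.57 GB, fits comfortably
```

So the float32 conversion alone needs about 50x more memory for 224x224 images than for CIFAR-sized ones.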

I would suggest using cv2 for what you are doing. cv2 reads images as numpy arrays, so you can feed them directly to your model.

Suggested code:

import cv2
import numpy as np
import os

path = os.getcwd() + "/"
dirs = os.listdir(path)
X = np.empty((n_samples, 224, 224, 3))
index = 0

# Demo code - adapt the loop to your directory structure
for file in dirs:
    image = cv2.imread(path + file, 1)  # 1 for color (BGR), 0 for grayscale
    resized_image = cv2.resize(image, (224, 224))  # (width, height)
    X[index, :, :, :] = resized_image
    index += 1

X is a 4D tensor (numpy array) with dimensions [n_samples, height, width, channels].
You can feed X directly to a model in Keras using model.fit, or use model.train_on_batch if X can't fit into memory all at once (which is likely in your case).

Efficient code:

batch_size = 32
for iter in range(n_iterations):
    index = 0
    X = np.empty((batch_size, 224, 224, 3))
    Y = np.empty((batch_size, n_classes))
    # Demo code - adapt the loop conditions to your directory structure
    for file in dirs:
        image = cv2.imread(path + file, 1)  # 1 for color (BGR), 0 for grayscale
        resized_image = cv2.resize(image, (224, 224))  # (width, height)
        # Build a batch of 32 images - this shouldn't cause a memory error;
        # you can experiment with batch_size as well
        X[index, :, :, :] = resized_image
        Y[index, :] = label  # one-hot vector for this image
        index += 1

    # X and Y now hold the data for one batch
    model.train_on_batch(X, Y)
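If you'd rather stay with the plain-PIL approach from the original post, the same batching idea can be written as a generator. This is only a sketch: `file_paths`, `labels` (integer class indices), and the normalization are hypothetical placeholders to adapt to your data.

```python
import numpy as np
from PIL import Image

def batch_generator(file_paths, labels, batch_size=32, size=(224, 224)):
    """Yield (X, Y) batches, loading only batch_size images at a time."""
    n_classes = max(labels) + 1
    for start in range(0, len(file_paths), batch_size):
        paths = file_paths[start:start + batch_size]
        batch_labels = labels[start:start + batch_size]
        X = np.empty((len(paths), size[1], size[0], 3), dtype=np.float32)
        Y = np.zeros((len(paths), n_classes), dtype=np.float32)
        for i, (p, label) in enumerate(zip(paths, batch_labels)):
            im = Image.open(p).convert("RGB").resize(size)
            X[i] = np.asarray(im, dtype=np.float32) / 255.0  # scale to [0, 1]
            Y[i, label] = 1.0  # one-hot encoding
        yield X, Y

# Usage sketch:
# for X, Y in batch_generator(paths, labels):
#     model.train_on_batch(X, Y)
```

This way peak memory is one batch, not the whole dataset, and the originals on disk are never overwritten.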

Hope this helps.