Input shape to Conv2D for grayscale images


Colin Nordin Persson

unread,
Apr 27, 2017, 5:09:54 AM4/27/17
to Keras-users
I have a training set of the form X_train.shape = (1000, 420, 420), representing 1000 grayscale images (actually spectrograms) of size 420x420.

I think the Keras documentation is a bit confusing, because it gives two descriptions of what the input_shape argument to a Conv2D layer should be:

  • input_shape=(128, 128, 3) for 128x128 RGB pictures 
  • (samples, rows, cols, channels) if data_format='channels_last'
(my configuration is set to channels_last)


According to the first description I should use input_shape = (420, 420, 1); this does, however, give the error:

expected conv2d_1_input to have 4 dimensions, but got array with shape (1000, 420, 420)


When I instead try input_shape = (1000, 420, 420, 1), I get the error:

Input 0 is incompatible with layer conv2d_1: expected ndim=4, found ndim=5


So I'm clearly doing something weird. Does anyone have an idea?


Thanks!

Matias Valdenegro

unread,
Apr 27, 2017, 5:27:41 AM4/27/17
to Keras-users
input_shape = (420, 420, 1) is the correct one, but it seems you did not reshape your input data to match: your input data should have shape (1000, 420, 420, 1). Then it should work.
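In NumPy that reshape is one line. A minimal sketch (the array here is a placeholder with the shapes from the question):

```python
import numpy as np

# Placeholder standing in for the real X_train from the question
# (dtype kept small just for the sketch).
X_train = np.zeros((1000, 420, 420), dtype=np.uint8)

# Append a channel axis: (1000, 420, 420) -> (1000, 420, 420, 1).
X_train = X_train[..., np.newaxis]

print(X_train.shape)  # (1000, 420, 420, 1)
```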

--
You received this message because you are subscribed to the Google Groups "Keras-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to keras-users+unsubscribe@googlegroups.com.
To view this discussion on the web, visit https://groups.google.com/d/msgid/keras-users/a012c385-2cc6-4342-aa11-5339ea841a80%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Daπid

unread,
Apr 27, 2017, 7:15:59 AM4/27/17
to Colin Nordin Persson, Keras-users

On 27 April 2017 at 11:09, Colin Nordin Persson <colin....@gmail.com> wrote:
I have a training set on the form X_train.shape = (1000, 420, 420) representing 1000 grayscale images (actually spectrograms) with size 420x420. 

Spectrograms are time series, so you should treat them with 1D convolutions.

Regarding your doubts: the input_shape you pass to a constructor is defined per sample, but the actual input array has the batch size as its first dimension, hence the extra dimension in the input size.
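A NumPy-only sketch of that relationship, using the shapes from the question (the array is a placeholder):

```python
import numpy as np

# Per-sample shape given to the Conv2D constructor
# (420x420 grayscale, channels_last, as in the question).
input_shape = (420, 420, 1)

# The actual array fed to the model has the batch size in front.
batch = np.zeros((1000,) + input_shape, dtype=np.uint8)

# input_shape is everything except the leading batch axis.
print(batch.shape[1:] == input_shape)  # True
```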

Colin Nordin Persson

unread,
Apr 27, 2017, 7:55:17 AM4/27/17
to Keras-users
Why is that? A spectrogram is 2D: one dimension is time and the other is frequency. Do you mean I should consider just one dimension?


(By the way, it worked after reshaping as explained above.)

Daπid

unread,
Apr 27, 2017, 8:40:59 AM4/27/17
to Colin Nordin Persson, Keras-users
On 27 April 2017 at 13:55, Colin Nordin Persson <colin....@gmail.com> wrote:
Why is that? A spectrogram is 2D, one dimension is time and the other is frequency. You mean I should just consider one dimension?

A spectrogram has the time dimension, and the different frequencies are the channels.

The convolutional hypothesis says that your underlying truth is invariant to translation across the convolutional dimensions. In an image, that means a cat in the top right corner is the same as the same cat in the lower left or right in the middle. In a spectrogram, a peak at 200 Hz is not the same as a peak at 880 Hz.

The other consequence of convolutions is that things that are close along the convolutional dimensions are strongly related, and become more loosely related as they fall further apart: close pixels form a line, then a texture, and then an object. In a spectrogram, for a given time step, all frequencies are equally important, and you shouldn't process the high and low frequencies separately. Or, in other words, (t=0, f=20 Hz) is more related to (t=0, f=200 Hz) than to (t=1, f=20 Hz), but a 2D convolution would consider them equally separated (for a given kernel size).
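Concretely, under this framing the original (1000, 420, 420) array needs no channel axis at all: a Conv1D layer with channels_last reads the axes as (samples, time steps, channels), so the 420 frequency bins act as channels. A NumPy-only sketch (placeholder array; the Conv1D call itself is omitted):

```python
import numpy as np

# Placeholder: 1000 spectrograms, 420 time steps x 420 frequency bins.
X_train = np.zeros((1000, 420, 420), dtype=np.uint8)

# For Conv1D with channels_last, the axes are read as
# (samples, time steps, channels) -- frequencies act as channels.
samples, time_steps, channels = X_train.shape
print(time_steps, channels)  # 420 420

# One time step of one sample carries all 420 frequencies at once:
print(X_train[0, 0].shape)  # (420,)
```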

feng...@gmail.com

unread,
Aug 22, 2017, 5:49:33 PM8/22/17
to Keras-users
I met the same problem. I have 100 grayscale images, each 137x137 pixels, so the training data has shape (100, 137, 137). How can I convert this 3D array of shape (100, 137, 137) into a 4D array of shape (100, 137, 137, 1)?
Thanks!



Benjamin Vallet

unread,
Sep 19, 2017, 3:32:36 AM9/19/17
to Keras-users
You can just use numpy for that.
For example:

import numpy as np

x = np.zeros((10, 15))
print(x.shape)    # This gives (10, 15)

y = np.reshape(x, (10, 15, 1))
print(y.shape)    # This gives (10, 15, 1);
                  # the data itself is unchanged
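Equivalently, np.expand_dims (or indexing with np.newaxis) adds the axis without spelling out the full shape, which is handy when the image size is not hard-coded:

```python
import numpy as np

x = np.zeros((10, 15))

y = np.expand_dims(x, axis=-1)   # append a channel axis at the end
z = x[..., np.newaxis]           # same result via indexing

print(y.shape)  # (10, 15, 1)
print(z.shape)  # (10, 15, 1)
```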

byerian6...@gmail.com

unread,
Oct 19, 2018, 3:18:15 AM10/19/18
to Keras-users
Excellent answer. For those hunting down this error, I will write it large:


The cause of the error is that grayscale images are by default 2D (shape (720, 1280), say), whereas RGB images are (720, 1280, 3).

The fix for the error is to reshape your (720, 1280) array to (720, 1280, 1).

neha....@gmail.com

unread,
Feb 28, 2020, 10:37:21 AM2/28/20
to Keras-users
How do we save the "y" image?
I have had trouble saving it using scipy.misc.imsave.

misraay...@gmail.com

unread,
Feb 28, 2020, 3:09:29 PM2/28/20
to Keras-users
You could try cv2.imwrite.
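Whichever writer you use, note that most of them expect a 2D array (and integer pixels) for grayscale, so drop the channel axis and convert first. A NumPy-only sketch (y is a placeholder for the reshaped image; the actual cv2.imwrite("out.png", img) call is left out):

```python
import numpy as np

# Placeholder for a reshaped grayscale image, values in [0, 1].
y = np.random.rand(137, 137, 1)

# Drop the trailing channel axis and convert to 8-bit,
# ready for an image writer such as cv2.imwrite.
img = np.squeeze(y, axis=-1)
img = (img * 255).astype(np.uint8)

print(img.shape, img.dtype)  # (137, 137) uint8
```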