Convolution2D on images with different sizes

3,836 views

deco...@gmail.com

Feb 4, 2016, 12:03:48 PM
To: Keras-users
I would like to process the output of a Convolution2D layer with an RNN (a similar approach to the image caption generation with attention paper, http://arxiv.org/pdf/1502.03044v2.pdf), so I think I could use a Masking layer to deal with the different output sizes.

However, I can't feed inputs of different shapes to the network. What would be the best way to use Convolution2D on images of different sizes? So far the only approach I can think of is zero-padding the numpy arrays of the input images so they all have the same size. However, this can be very inefficient if there is considerable variation in image size (as can be the case).

Can anyone think of a way to work with images of different sizes other than padding them? I'm not an expert in Theano, but I suspect it can't be done. It would be great if it were possible, though, since it would avoid unnecessary memory usage and computation time.

If someone thinks it's possible and can give me some hints, I would really appreciate it. (Likewise if someone is certain it can't be done with current Theano and/or TensorFlow.)
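
A minimal numpy sketch of the zero-padding idea above (the target size is made up; images are (channels, height, width) arrays, Theano ordering):

import numpy as np

def pad_to(img, target_h, target_w):
    # Place the image in the top-left corner of a zero canvas.
    c, h, w = img.shape
    padded = np.zeros((c, target_h, target_w), dtype=img.dtype)
    padded[:, :h, :w] = img
    return padded

pad_to(np.random.rand(1, 10, 10), 20, 20).shape  # (1, 20, 20)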

Atlas

Feb 10, 2016, 5:34:22 PM
To: Keras-users, deco...@gmail.com
Could you pick a small input shape for your Convolution2D stack and downsample your images to it? If an image is smaller, use upsampling or zero-padding layers instead.
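
A minimal sketch of the downsampling half of that suggestion, using scipy.ndimage.zoom for the interpolation (the target size is made up; for images smaller than the target, the padding sketch above works too):

import numpy as np
from scipy.ndimage import zoom

def resize_to(img, target_h, target_w):
    # Interpolate a (channels, height, width) image to the target size.
    c, h, w = img.shape
    return zoom(img, (1.0, float(target_h) / h, float(target_w) / w))

resize_to(np.random.rand(1, 32, 48), 10, 10).shape  # (1, 10, 10)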

François Chollet

Feb 10, 2016, 5:37:46 PM
To: Atlas, Keras-users, deco...@gmail.com
> What would be the best way to use Convolution2D on images of different sizes?

I believe that is possible with Keras. Do you have a specific code example that doesn't work?

Klemen Grm

Feb 12, 2016, 3:40:45 AM
To: Keras-users, deco...@gmail.com
In what use case does this not work for you? I've just tried it with a fully convolutional network; different image sizes can be used in both prediction and training modes, provided the tensor shapes match.

>>> import numpy as np
>>> from keras.models import Sequential
>>> from keras.layers import Convolution2D
>>> m = Sequential()
>>> m.add(Convolution2D(8, 3, 3, input_shape=(1, 10, 10)))
>>> m.compile(loss="mae", optimizer="sgd")
>>> c = m.predict(np.random.rand(1, 1, 10, 10))
>>> c.shape
(1, 8, 8, 8)
>>> c = m.predict(np.random.rand(1, 1, 20, 20))
>>> c.shape
(1, 8, 18, 18)
>>> m.fit(np.random.rand(100, 1, 10, 10), np.random.rand(100, 8, 8, 8))
<keras.callbacks.History object at 0x7f349919fb10>
>>> m.fit(np.random.rand(100, 1, 12, 12), np.random.rand(100, 8, 10, 10))
<keras.callbacks.History object at 0x7f3489d07a90>

deco...@gmail.com

Feb 15, 2016, 5:08:38 AM
To: Keras-users, deco...@gmail.com
From the constructor of Convolution2D, since it requires the input_shape parameter:

m.add(Convolution2D(8, 3, 3, input_shape=(1, 10, 10)))

I assumed it had to work with 10x10 grayscale images. Now I see that this parameter is apparently ignored and that it actually works with images of different sizes. So I will be more specific in my question:

model = Sequential()
model.add(Convolution2D(8, 3, 3, input_shape=np.shape(X_train[np.newaxis, 0, :, :])))
model.add(Permute((3, 2, 1)))
model.add(Reshape((-1, np.prod(model.layers[-1].output_shape[-2:]))))
# <------- Masking ? ------->
model.add(SimpleRNN(output_dim))
model.add(TimeDistributedDense(nb_classes, activation='softmax'))

optimizer = RMSprop(lr=learning_rate)
model.compile(loss='categorical_crossentropy', optimizer=optimizer)


In my training set, each image has a different shape. Since the output of Convolution2D will have a different shape depending on the input image, after the Reshape I will have sequences of different lengths. Is it possible to deal with these variable-length sequences generated inside the neural network?
The only idea I could come up with was to pad the images with zeros, so that the Conv2D output always produces a sequence of the same length, but some elements of that sequence would contain no information at all, and I would be wasting computation time.

It would be great if I could put a layer between the Reshape layer and the SimpleRNN layer that pads/masks the sequences output by the Conv2D. Is that possible with Keras?

Maybe, since it seems to ignore the input_shape parameter, I can just work with batch_size=1 and it should work. Am I right?
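
For what it's worth, Keras does ship a Masking layer that tells downstream recurrent layers to skip timesteps whose features all equal mask_value. A hedged sketch of where it could slot in, with one big caveat: convolution outputs over zero-padded pixels are generally not exactly zero (bias terms, border effects), so this assumes the all-zero timesteps are appended to the reshaped sequence itself rather than produced by padding the images:

from keras.layers import Masking

# Hypothetical placement between the Reshape and the RNN; timesteps
# that are exactly zero in every feature are skipped by the RNN.
model.add(Masking(mask_value=0.0))
model.add(SimpleRNN(output_dim))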

Klemen Grm

Feb 15, 2016, 5:13:01 AM
To: Keras-users, deco...@gmail.com
No, that's not the case. The input_shape parameter is not ignored; it is used when the following layers depend on the shape of this layer's output. Therefore, different input sizes only work when that's not the case, i.e. when you're working with a fully convolutional network and the training output sizes match. If you have non-convolutional layers, they will be initialised for the specified input size, and the network will only work for inputs of that size, in which case you may consider scaling, cropping, or padding your images to match it.
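
A minimal sketch of that distinction, in the same Keras 1.x-style API as the examples above (shapes made up): as soon as a Flatten/Dense pair is added, the weights are built for one specific input size and other sizes stop working.

import numpy as np
from keras.models import Sequential
from keras.layers import Convolution2D, Flatten, Dense

m = Sequential()
m.add(Convolution2D(8, 3, 3, input_shape=(1, 10, 10)))
m.add(Flatten())  # output length now depends on the input image size
m.add(Dense(4))   # weight matrix sized for 10x10 inputs at build time
m.compile(loss="mae", optimizer="sgd")
m.predict(np.random.rand(1, 1, 10, 10))    # works: Flatten yields 8*8*8 = 512 features
# m.predict(np.random.rand(1, 1, 20, 20))  # would fail: Flatten would yield 8*18*18 features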

asphalt

Jul 21, 2017, 6:52:35 AM
To: Keras-users, deco...@gmail.com

Hey, 

I have a training set of images, each of which has a different size. I don't want to lose data by resizing the images. How can one feed the data to the model.fit() function? It accepts only arrays, and a single array made up of multiple arrays of different dimensions is not supported by numpy.

Thanks for all the help!!

Daπid

Jul 21, 2017, 8:34:59 AM
To: asphalt, Keras-users, deco...@gmail.com
The simplest thing is to use fit_generator and feed same-sized batches. The alternative is to implement some sort of masking and padding, but how to do that correctly depends on exactly what you are doing inside the network.
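
A minimal sketch of the first suggestion, assuming hypothetical images/labels lists of variable-sized (channels, height, width) arrays: group samples by shape so every batch stacks cleanly, then hand the generator to fit_generator.

import numpy as np

def same_size_batches(images, labels, batch_size=16):
    # Bucket samples by image shape so np.stack always succeeds.
    by_shape = {}
    for img, y in zip(images, labels):
        by_shape.setdefault(img.shape, []).append((img, y))
    while True:  # Keras generators are expected to loop forever
        for samples in by_shape.values():
            for i in range(0, len(samples), batch_size):
                chunk = samples[i:i + batch_size]
                yield (np.stack([img for img, _ in chunk]),
                       np.stack([y for _, y in chunk]))

# model.fit_generator(same_size_batches(images, labels),
#                     samples_per_epoch=len(images), nb_epoch=10)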

deco...@gmail.com

Jul 26, 2017, 7:18:37 AM
To: Keras-users, asfi...@gmail.com, deco...@gmail.com
If your dataset is small and performance is not a major concern, you can train with batch_size=1, so that all the images in a batch (there is only one) trivially have the same size.
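
A minimal sketch of that, with hypothetical model/images/labels already defined: train_on_batch sees one image at a time, so shapes never have to agree across samples.

import numpy as np

# Each "batch" is a single image, so every batch is trivially same-sized.
for _ in range(10):  # passes over the data
    for img, y in zip(images, labels):
        model.train_on_batch(img[np.newaxis], y[np.newaxis])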

stewart...@gmail.com

Oct 18, 2017, 10:41:26 AM
To: Keras-users
No matter what you do, all of the images in a batch must have the same dimensions. The frameworks are simply built that way, and I am aware of no exceptions.
One of the best things you can do performance-wise is to batch images in clusters of similar size (see the sketch below). However, this may have adverse consequences for the quality of your mini-batch gradients, particularly if the statistical properties of the smaller images do not match those of the larger ones, which may or may not be the case depending on the dataset.
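
A hedged sketch of that clustering idea, assuming a hypothetical list of (channels, height, width) arrays: sort by area, split into buckets, and zero-pad only up to each bucket's own maximum, so little computation is wasted.

import numpy as np

def pad_bucket(imgs):
    # Zero-pad every image in a bucket to the bucket's maximum size.
    h = max(im.shape[1] for im in imgs)
    w = max(im.shape[2] for im in imgs)
    out = np.zeros((len(imgs), imgs[0].shape[0], h, w), dtype=imgs[0].dtype)
    for k, im in enumerate(imgs):
        out[k, :, :im.shape[1], :im.shape[2]] = im
    return out

def size_buckets(images, n_buckets=4):
    # Group images of similar area so per-bucket padding stays small.
    order = sorted(range(len(images)),
                   key=lambda i: images[i].shape[1] * images[i].shape[2])
    return [pad_bucket([images[i] for i in chunk])
            for chunk in np.array_split(np.array(order), n_buckets)]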