Input shape of Conv1D in functional API

3,457 views

peter....@gmail.com

Feb 15, 2017, 4:08:24 PM2/15/17
to Keras-users
Hello,

I've been unable to apply a Convolutional Neural Network (CNN), adapted from the word-embedding example on the Keras blog, to a set of 24000 vectors/examples of 159 values/features each.
Namely, I have:

scanorset = scanorset.reshape((scanorset.shape[0], 1, scanorset.shape[1]))    # ie, (24000, 1, 159)
clf1 = KerasRegressor(build_fn=cnnWEmbed, nb_epoch=20, batch_size=5, verbose=0)
clf1.fit(scanorset, targetsjoin)

and

def cnnWEmbed():
    sequence_input = Input((1, 159))  # also tried (159, 1)
    x = Conv1D(128, 5, activation='relu')(sequence_input)
    x = MaxPooling1D(5)(x)
    x = Conv1D(128, 5, activation='relu')(x)
    x = MaxPooling1D(5)(x)
    x = Conv1D(128, 5, activation='relu')(x)
    x = MaxPooling1D(35)(x)
    x = Flatten()(x)
    x = Dense(128, activation='relu')(x)
    preds = Dense(1)(x)
    model = Model(sequence_input, preds)

    model.compile(loss='mean_squared_error', optimizer='adam')

    model.summary()
    plot(model, to_file='cnnWEmbed.png', show_shapes=True)

    return model



I couldn't find documentation for the functional Input class, so I don't know how to interpret what's happening, and every variation I've tried as an argument to Input leads to some error.
Could someone help?


Best,
Pedro


Daπid

Feb 16, 2017, 4:00:34 AM2/16/17
to peter....@gmail.com, Keras-users
Per the documentation, the Conv1D layer takes an input of shape
(samples, steps, input_dim). The first one is implied, so Input has
the shape (steps, input_dim). Here, "steps" is the length of your
input vector (the number of words in a sentence). If you want to keep
it variable, you can also set it to None, but then Flatten + Dense
wouldn't be legal layers.
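
As a concrete sketch of that layout (using a dummy NumPy array in place of the real data), treating the 159 values as 159 steps of 1 feature each would look like:

```python
import numpy as np

# Dummy stand-in for the poster's data: 24000 examples of 159 features.
scanorset = np.zeros((24000, 159))

# Conv1D wants (samples, steps, input_dim), and Input() then takes just
# (steps, input_dim). Put the 159 values on the "steps" axis:
scanorset = scanorset.reshape((scanorset.shape[0], scanorset.shape[1], 1))
print(scanorset.shape)  # (24000, 159, 1)
```

The matching declaration would then be Input((159, 1)), or Input((None, 1)) for a variable-length sequence.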
> --
> You received this message because you are subscribed to the Google Groups
> "Keras-users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to keras-users...@googlegroups.com.
> To view this discussion on the web, visit
> https://groups.google.com/d/msgid/keras-users/c333bab5-d44a-4fdb-b2b7-fe8aa04e0f6e%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

peter....@gmail.com

Feb 16, 2017, 6:28:09 AM2/16/17
to Keras-users
I've tried both (159, 1) and (1, 159), since I have one fixed-length vector of 159 floats (unrelated to each other) per example. All resulted in errors.

Is it a problem with my topology?
How can I use a CNN with this data shape?

What would be the most appropriate networks/topologies for this type/shape of data?

Best regards,
Pedro

Daπid

Feb 16, 2017, 6:48:24 AM2/16/17
to peter....@gmail.com, Keras-users
If you have only one time step, there is no point in using
convolutions. They are for data that have translational invariance in
one dimension (like audio over time or words in a text).

peter....@gmail.com

Feb 16, 2017, 7:07:53 AM2/16/17
to Keras-users
Then how is an image example, of height x width (or words of a sentence/example), different from my vector (of 1 x 159)?

Best,

Daπid

Feb 16, 2017, 7:11:54 AM2/16/17
to peter....@gmail.com, Keras-users
In an image you look at nearby pixels for patterns that are consistent
across the whole image. For example, a vertical edge can appear in the
centre or in the bottom-left corner, and it looks the same in both
places because it depends only on the pixels around it. Your vector of
159 numbers has no such local structure, so convolutions are meaningless.


peter....@gmail.com

Feb 16, 2017, 7:15:44 AM2/16/17
to Keras-users
My numbers are the outputs of various metrics for the same task/purpose, so some relation might exist between them; at least, that's what I'd like to find out. Assuming my numbers are somewhat related, how do I make my example work?

jpeg729

Feb 16, 2017, 10:08:58 AM2/16/17
to Keras-users
An image isn't just height x width, it is height x width x colour channels.
The convolution is applied to fixed-size sub-squares of the input and operates on the data in the colour channels of the pixels in each sub-square. The colour channels could be replaced by precomputed localised features, which is what AlphaGo did with ~98 features, if memory serves.

Input shape=(159,1) would mean the 1D space is 159 units long, and each unit contains 1 feature. This would work if your data were of the shape (24000, 159, 1) rather than (24000, 1, 159).

Input shape=(1,159) would mean the 1D space is 1 unit long, and each unit contains 159 features. As has been pointed out, applying a convolution to a space of size 1 doesn't make much sense. In your example, however, there is another problem: your filter size is 5, but the space is only 1 unit long. Unless you pad the input, the space is not big enough to fit the filter. Conv1D takes a border_mode parameter whose default value is 'valid', meaning the convolution is applied to the input with no padding. See http://datascience.stackexchange.com/questions/11840/border-mode-for-convolutional-layers-in-keras. You could add border_mode="same", but that would be subject to the first objection: a simpler net would be functionally equivalent.
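
A small helper (hypothetical, just to illustrate the arithmetic for stride 1) makes the 'valid' objection concrete:

```python
def conv1d_out_len(steps, filter_size, border_mode="valid"):
    # 'valid': no padding, so the filter must fit entirely inside the input.
    if border_mode == "valid":
        return steps - filter_size + 1
    # 'same': the input is padded so the output length equals the input length.
    return steps

print(conv1d_out_len(159, 5))        # 155: the filter fits
print(conv1d_out_len(1, 5))          # -3: impossible without padding, hence the error
print(conv1d_out_len(1, 5, "same"))  # 1: padding makes it fit, but trivially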

jpeg729

Feb 16, 2017, 10:18:38 AM2/16/17
to Keras-users
It makes sense to apply a convolution to pictures, because their pixels are locally related to each other. It makes sense to apply a convolution to words in a sentence, or to steps in a time series, because each word/step is most related to close neighbours.

It sounds like your data is composed of 24000 sets of similar but otherwise unrelated measurements: they are not part of a time series, nor are they spatially related. Each vector contains diverse metrics with neither a spatial nor a temporal relationship between neighbouring entries, so there is no reason to use convolutions and poolings here. Just use dense layers.
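
A minimal dense-only sketch along those lines (layer widths are placeholders, and the helper name dense_net is made up):

```python
from keras.layers import Input, Dense
from keras.models import Model

def dense_net(n_features=159):
    # No extra axis needed: each example is a flat vector of features,
    # so the data can stay as (24000, 159) with no reshape at all.
    inp = Input((n_features,))
    x = Dense(128, activation='relu')(inp)
    x = Dense(128, activation='relu')(x)
    preds = Dense(1)(x)  # single regression output
    model = Model(inp, preds)
    model.compile(loss='mean_squared_error', optimizer='adam')
    return model
```

With a model like this, the KerasRegressor from the original post could be pointed at dense_net and fit directly on the unreshaped (24000, 159) array.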