1D Convolution as 2D Convolution


Rafael Valle

Mar 1, 2017, 1:26:03 AM3/1/17
to lasagn...@googlegroups.com
Is a 1D convolution over an input of shape (batch_size, alphabet_size, length) the same as a 2D convolution over (batch_size, 1, alphabet_size, length)?

Jan Schlüter

Mar 1, 2017, 9:47:05 AM3/1/17
to lasagne-users
> Is a 1D convolution over an input of shape (batch_size, alphabet_size, length) the same as a 2D convolution over (batch_size, 1, alphabet_size, length)?

If the 1D convolution has a filter of size (foo,) and the 2D convolution has a filter of size (1, foo), then yes. In the 2D convolution, you can also put the "1" at the third or last position (the latter requires a filter size of (foo, 1) then). Those are three of the variants available in lasagne.theano_extensions.conv.
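A quick way to convince yourself of the equivalence numerically, outside of Lasagne (a sketch using scipy; the shapes and sizes are just for illustration):

```python
import numpy as np
from scipy.signal import convolve, convolve2d

x = np.random.randn(100)   # a single-channel 1D signal
w = np.random.randn(7)     # 1D filter of size (7,)

# 1D convolution, and the same computation as a 2D convolution with a (1, 7) filter
out_1d = convolve(x, w, mode='valid')                      # shape (94,)
out_2d = convolve2d(x[None, :], w[None, :], mode='valid')  # shape (1, 94)

assert np.allclose(out_1d, out_2d[0])
```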

Best, Jan

Rafael Valle

Mar 1, 2017, 11:29:23 AM3/1/17
to lasagne-users
Jan, maybe I'm misunderstanding you, but 1D filter size N and 2D filter size (1, N) give different W shapes. Using a 2D filter size of (alphabet_size, N) gives the same W shape as the 1D convolution.
For example:

import theano.tensor as T
from lasagne.layers import InputLayer, Conv1DLayer, Conv2DLayer

input_var = T.ftensor3('inputs')
layer = InputLayer(shape=(None, 32, 128), input_var=input_var)
layer = Conv1DLayer(layer, 64, (7,), stride=1, pad=0)
print layer.get_W_shape()
# (64, 32, 7)

input_var = T.ftensor4('inputs')
layer = InputLayer(shape=(None, 1, 32, 128), input_var=input_var)
layer = Conv2DLayer(layer, 64, (1, 7), stride=1, pad=0)
print layer.get_W_shape()
# (64, 1, 1, 7)

layer = InputLayer(shape=(None, 1, 32, 128), input_var=input_var)
layer = Conv2DLayer(layer, 64, (32, 7), stride=1, pad=0)
print layer.get_W_shape()
# (64, 1, 32, 7)

Jan Schlüter

Mar 2, 2017, 6:40:06 AM3/2/17
to lasagne-users
> Jan, maybe I'm misunderstanding you, but 1D filter size N and 2D filter size (1, N) give different W shapes. Using a 2D filter size of (alphabet_size, N) gives the same W shape as the 1D convolution.

Oh yes, sorry. It's 2D filter size (1, N) if you make your input (batchsize, channels, 1, length). For (batchsize, 1, channels, length) you'll need a 2D filter of (channels, N).
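The same bookkeeping can be checked in plain numpy/scipy (a sketch using cross-correlation for simplicity; whether the filters are flipped doesn't change the shape argument):

```python
import numpy as np
from scipy.signal import correlate2d

channels, length, n = 32, 128, 7
x = np.random.randn(channels, length)   # one sample, layout (channels, length)
w = np.random.randn(channels, n)        # one filter of size (channels, n)

# "1D" convolution over length: correlate each channel, then sum over channels
out_1d = sum(np.correlate(x[c], w[c], mode='valid') for c in range(channels))

# 2D convolution with a (channels, n) filter over the (channels, length) plane:
# the filter spans the full channel axis, so it can only slide along length
out_2d = correlate2d(x, w, mode='valid')   # shape (1, 122)

assert np.allclose(out_1d, out_2d[0])
```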

xavier...@gmail.com

Apr 2, 2017, 3:08:38 AM4/2/17
to lasagne-users
Hi Rafael,
Although in theory this should be exactly the same, I have recently had issues with it. 2D convolution was giving me terrible results when learning multidimensional time series. I would recommend using 1D convolution whenever there is no "spatial" structure in your second dimension. I'm not sure why there is a difference in practice; I have a feeling it has to do with weight initialization or updates, but my model was basically only learning the biases.
Plus, 1DConv seemed to be faster anyway.
Cheers,
Xavier
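If it is the initialization: the scale of a Glorot-style init depends on the weight shape, which differs between the three variants in Rafael's example. Here's a sketch of the standard Glorot uniform bound (assuming Lasagne's default GlorotUniform follows the usual formula; I haven't verified its exact code):

```python
import numpy as np

def glorot_uniform_bound(w_shape):
    # standard Glorot: weights drawn from U(-b, b) with
    # b = sqrt(6 / (fan_in + fan_out)), where for conv weights
    # fan_in = in_channels * receptive field size and
    # fan_out = num_filters * receptive field size
    receptive = int(np.prod(w_shape[2:]))
    fan_in = w_shape[1] * receptive
    fan_out = w_shape[0] * receptive
    return np.sqrt(6.0 / (fan_in + fan_out))

print(glorot_uniform_bound((64, 32, 7)))     # Conv1DLayer weights
print(glorot_uniform_bound((64, 1, 1, 7)))   # Conv2DLayer, filter (1, 7)
print(glorot_uniform_bound((64, 1, 32, 7)))  # Conv2DLayer, filter (32, 7)
```

The (32, 7)-filter 2D layer gets a much smaller initial scale than the 1D layer, because its fan_out counts the full 32x7 receptive field; that alone could plausibly slow down learning.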

Rafael Valle

Apr 2, 2017, 1:06:24 PM4/2/17
to lasagne-users
Can you create a small example confirming that the kernels are initialized with the same weights?
I'm very curious what the Lasagne devs have to say about it. The only differences in implementation I can think of are the epsilon added to the convolution and the weight initialization, as you mentioned...

Also, by multidimensional time series, do you mean the valuation over time of symbols in an alphabet, or of real-valued functions?
If it's the latter, I'd be very thankful if you could share references.

Jan Schlüter

Apr 3, 2017, 9:28:25 AM4/3/17
to lasagne-users
> I'm very curious what the Lasagne devs have to say about it.

The Conv1DLayer in Lasagne is implemented as a 2D convolution, since Theano doesn't have native 1D convolution support. So it's the same as a Conv2DLayer with some inserted singleton dimensions (dimensions of size 1) in the input and weights, and a 2D kernel size that matches the input size in one of the two dimensions, such that it can only move in the other dimension.
Of course, if there's no spatial structure in a particular dimension, you shouldn't convolve in that dimension, either by using a 1D convolution or by matching the filter size to the input size (which is equivalent).
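The shape bookkeeping behind this can be mimicked in plain numpy (a sketch of the idea, not Lasagne's actual implementation; sliding_window_view needs numpy >= 1.20):

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

batch, channels, length = 2, 3, 20
num_filters, n = 4, 5

x = np.random.randn(batch, channels, length)
W = np.random.randn(num_filters, channels, n)   # 1D conv weights

# direct 1D correlation: windows of shape (batch, channels, L, n)
win = sliding_window_view(x, n, axis=-1)
out_1d = np.einsum('bcLn,fcn->bfL', win, W)

# same thing via the 2D route: insert singleton dims, use a (1, n) kernel
x4 = x[:, :, None, :]            # (batch, channels, 1, length)
W4 = W[:, :, None, :]            # (num_filters, channels, 1, n)
win4 = sliding_window_view(x4, (1, n), axis=(2, 3))  # (b, c, 1, L, 1, n)
out_2d = np.einsum('bcxLyn,fcyn->bfL', win4, W4)

assert np.allclose(out_1d, out_2d)
```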

Best, Jan

Xavier Audier

Apr 4, 2017, 4:02:24 PM4/4/17
to lasagne-users
I haven't taken the time to check, unfortunately, but after switching from the 2D to the 1D convolution the weights seemed to be learned a lot more easily.
I've been using real-valued multi-channel time series, EEG in my case. No reference available yet, however, as this is still work in progress :/
But as Jan mentioned, the 1D conv is actually computed using 2D convolution functions, so everything should be the same.