Batch normalization and Weight normalization for 1d CNN


Obayogy Jaj

unread,
Jan 25, 2017, 2:12:25 AM1/25/17
to lasagne-users
Hi,

I want to implement a multi-layer 1d CNN with batch normalization [link] or weight normalization [1],
but I found that the authors' code does not run correctly for Conv1DLayer:

conv1 = Conv1DLayer(h, num_filters, filter_size, pad='same', nonlinearity=lasagne.nonlinearities.rectify)
conv1 = weight_norm(conv1)


  1. Could anyone give me some advice?
  2. Can batch_norm from Lasagne be used with Conv1DLayer?

thank you



[1] Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks

Jan Schlüter

unread,
Jan 26, 2017, 10:53:25 AM1/26/17
to lasagne-users
> I want to implement a multi-layer 1d CNN with batch normalization [link] or weight normalization [1],
> but I found that the authors' code does not run correctly for Conv1DLayer:
>
> conv1 = Conv1DLayer(h, num_filters, filter_size, pad='same', nonlinearity=lasagne.nonlinearities.rectify)
> conv1 = weight_norm(conv1)
>
> 1. Could anyone give me some advice?
The code of the authors does not handle 1d-convolution, but it's easy to extend. After https://github.com/openai/weightnorm/blob/55917c3/lasagne/nn.py#L240-L243, add:
elif incoming.W_param.ndim == 3:
    W_axes_to_sum = (1,2)
    W_dimshuffle_args = [0,'x','x']
Untested, but this will probably be enough.
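To make the broadcasting concrete, here is a plain-NumPy sketch of what that normalization computes for a 3-d conv1d weight of shape (num_filters, num_input_channels, filter_size); the variable names are illustrative, not taken from the weightnorm code:

```python
import numpy as np

# Hypothetical conv1d weight: (num_filters, num_input_channels, filter_size)
V = np.random.randn(8, 4, 3).astype(np.float32)
g = np.ones(8, dtype=np.float32)  # one learned scale per output filter

# W_axes_to_sum = (1, 2): one norm per output filter
norms = np.sqrt((V ** 2).sum(axis=(1, 2)))

# W_dimshuffle_args = [0, 'x', 'x']: reshape the per-filter scale to
# (num_filters, 1, 1) so it broadcasts over channels and filter taps
W = V * (g / norms)[:, np.newaxis, np.newaxis]

# Each filter of W now has norm g
print(np.allclose(np.sqrt((W ** 2).sum(axis=(1, 2))), g, atol=1e-5))  # True
```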

We should add weight normalization to Lasagne some time, but with a simpler implementation.
> 2. Can batch_norm from Lasagne be used with Conv1DLayer?
Yes. If you combine it with weight normalization, note that the authors recommend mean-only batch normalization, which you'll need to copy from their code as well.
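Mean-only batch normalization subtracts the per-channel batch mean but does not divide by the variance. A minimal NumPy sketch for conv1d activations of shape (batch, channels, length), with illustrative names:

```python
import numpy as np

# Activations from a conv1d layer: (batch, channels, length)
x = np.random.randn(16, 8, 20).astype(np.float32)
b = np.zeros(8, dtype=np.float32)  # learned bias, one per channel

# Per-channel mean, pooled over the batch and length axes;
# unlike full batch norm, no variance estimate is used
mu = x.mean(axis=(0, 2))

# Broadcast the subtraction and the bias over batch and length
y = x - mu[np.newaxis, :, np.newaxis] + b[np.newaxis, :, np.newaxis]

# After centering, each channel's batch mean equals its bias
print(np.allclose(y.mean(axis=(0, 2)), b, atol=1e-5))  # True
```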

Best, Jan

Obayogy Jaj

unread,
Jan 26, 2017, 6:26:47 PM1/26/17
to lasagne-users
hi, Jan

After adding that, it causes the error "ValueError: Input dimension mis-match" ...



On Friday, January 27, 2017 at 12:53:25 AM UTC+9, Jan Schlüter wrote:

Obayogy Jaj

unread,
Jan 26, 2017, 7:03:21 PM1/26/17
to lasagne-users
hi, Jan
I tested the code after adding what you said.

1) I found an error at https://github.com/openai/weightnorm/blob/55917c3/lasagne/nn.py#L259, in the addition of the input and the dimshuffled b.

2) It may be caused by https://github.com/openai/weightnorm/blob/55917c3/lasagne/nn.py#L225, where k = self.input_shape[1]. This k is the num_input_channels for a 2D convolution input (batch_size, num_input_channels, input_rows, input_columns),
but for conv1d our input shape is (n_batch, seq_len, n_dim), so there is no num_input_channels in the conv1d input.

3) After https://github.com/openai/weightnorm/blob/55917c3/lasagne/nn.py#L232, I added:

elif len(self.input_shape) == 3:
    self.axes_to_sum = (1,2,3)
    self.dimshuffle_args = ['x','x','x']

But this is still not correct ...
I don't know how to set and use dimshuffle correctly here.

thanks very much.

On Friday, January 27, 2017 at 12:53:25 AM UTC+9, Jan Schlüter wrote:
> I want to implement a multi-layer 1d CNN with batch normalization [link] or weight normalization [1]

Jan Schlüter

unread,
Jan 27, 2017, 5:01:25 AM1/27/17
to lasagne-users
Sorry, hadn't read the full code. Yes, this also needs to be adapted.

> 3) After https://github.com/openai/weightnorm/blob/55917c3/lasagne/nn.py#L232, I added:
> elif len(self.input_shape) == 3:
>     self.axes_to_sum = (1,2,3)
>     self.dimshuffle_args = ['x','x','x']

By comparison to 2D convolution, it should be (0, 2) and ['x', 0, 'x'].
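A quick NumPy check of why those values are the right ones (axes (1, 2, 3) would be out of range for a 3-d input, which only has axes 0, 1, 2):

```python
import numpy as np

# Conv1d input: (batch, channels, length)
x = np.random.randn(16, 8, 20)

# axes_to_sum = (0, 2): pool the statistics over batch and length,
# leaving one mean/std per channel
mean = x.std(axis=(0, 2)) * 0 + x.mean(axis=(0, 2))  # shape (8,)
std = x.std(axis=(0, 2))                             # shape (8,)

# dimshuffle_args = ['x', 0, 'x']: reshape to (1, channels, 1)
# so the per-channel statistics broadcast against x
y = (x - mean[None, :, None]) / std[None, :, None]

print(y.shape)  # (16, 8, 20)
```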


> for conv1d, our input shape is (n_batch, seq_len, n_dim), so there is no num_input_channels in the conv1d input

conv1d input can be interpreted as (n_batch, num_input_channels, input_rows). Note that if you have a dimension that may change in length (like seq_len), it should be the last one. The number of input channels should be kept constant; it's equivalent to the number of units in a dense layer (i.e., it indicates the number of features). The Conv1DLayer will learn a separate bias for each input channel, so it's important that the number of channels is fixed. And even if the sequence length is fixed, you will not want to learn a separate bias per time step, but a separate bias per feature. So if you need to combine Conv1DLayer and recurrent layers, you will need a DimshuffleLayer(..., (0, 2, 1)) in between.
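The layout conversion itself is just an axis transpose; sketched in plain NumPy (the shapes here are made up for illustration):

```python
import numpy as np

# Recurrent-layer layout: (batch, seq_len, features)
x_recurrent = np.zeros((16, 100, 8))

# DimshuffleLayer(..., (0, 2, 1)) is equivalent to this transpose,
# giving the (batch, channels, length) layout Conv1DLayer expects
x_conv = x_recurrent.transpose(0, 2, 1)

print(x_conv.shape)  # (16, 8, 100)
```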

Best, Jan