Change in weights when a Sequential model (containing batch-normalization layers) is used as a single layer


Pramod Bachhav

Aug 3, 2018, 6:07:28 AM
to Keras-users
Hi,

I am working on a fairly complex variational auto-encoder architecture in which I want to use a Sequential model as a layer.
While doing so, I observed that when a batch-normalization (BN) layer is included in a simple feedforward (FF) network, and that network is then used as a layer, the weight structure of the enclosing model changes.

To be specific, I wrote a small script (attached) to demonstrate the question. model2 is created by wrapping model1 (a simple FF network containing BN layers).
Both models perform the same operation, but their weight lists w1 and w2 are different.

Surprisingly, however, their outputs y1 and y2 are the same, which suggests (in my opinion) that Keras takes care of the change in weight ordering during prediction.

But for my use case I need the weights of model2 (w2) to have the same structure as those of model1 (w1).

Has anyone come across a similar problem, or am I missing something? Any opinions would be really useful.

- Pramod  


Check.py
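Check.py is not reproduced in the archive; based on the description above, the setup is roughly the following sketch (the layer sizes are illustrative assumptions, not necessarily those in the attachment):

import numpy as np
from keras.models import Sequential
from keras.layers import Dense, BatchNormalization, Activation

np.random.seed(0)

# model1: a plain feedforward network containing BN layers
model1 = Sequential([
    Dense(32, input_shape=(50,)),
    BatchNormalization(),
    Activation('relu'),
    Dense(10),
    Activation('softmax')])

# model2: model1 used as a single layer
model2 = Sequential([model1])

x = np.random.randn(4, 50)
y1 = model1.predict(x)
y2 = model2.predict(x)       # identical to y1

w1 = model1.get_weights()    # interleaved per layer: kernel, bias, gamma, beta, moving stats, ...
w2 = model2.get_weights()    # same arrays, but grouped differently (see the discussion below)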

Sergey O.

Aug 3, 2018, 8:35:52 AM
to Pramod Bachhav, Keras-users
Looking at your code:

"model1" is a series of ops and weights. "model2" is just a wrapper around model1 (including it's ops and weights). So a call to "model2" should give back identical results as call to "model1". 

If you do:
print(model1.trainable_weights)
print(model2.trainable_weights)

you can see the weights the two models share...
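For example (a quick check, not part of the original reply), you can confirm that both lists point at the same underlying variables:

# Sketch: confirm that model2 merely reuses model1's variables.
shared = all(a is b for a, b in zip(model1.trainable_weights,
                                    model2.trainable_weights))
print(shared)  # True when model2 is only a wrapper around model1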

I'm guessing you were trying to accomplish something along the lines of:
def mk_model():
  return Sequential([
      Dense(32, input_shape=(50,)),
      BatchNormalization(),
      Dense(32),
      BatchNormalization(),
      Activation('relu'),
      Dense(10),
      Activation('softmax')])

model1 = mk_model()
model2 = mk_model()

If you want the two models to share the same weights, you can simply give the layers names. For example:
def mk_model():
  return Sequential([
      Dense(32, input_shape=(50,),name="BOB"),
      BatchNormalization(),
      Dense(32),
      BatchNormalization(),
      Activation('relu'),
      Dense(10),
      Activation('softmax')])

model1 = mk_model()
model2 = mk_model()

Now model1 and model2 will share the same weights for the first layer named "BOB"


Sergey O.

Aug 3, 2018, 8:43:39 AM
to Pramod Bachhav, Keras-users
Oh sorry, I was wrong about the second example (ignore that part).
If you do:
print(model1.trainable_weights)
print(model2.trainable_weights)
model1 is using "BOB" and model2 is using "BOB2" weights

Pramod Bachhav

Aug 3, 2018, 9:20:40 AM
to kings...@gmail.com, keras...@googlegroups.com
Hi,
Thanks for your reply.

Maybe I wasn't clear before.

The weight values are the same in w1 and w2 (as I am using a fixed seed), but the structures of the two weight lists are different (please see the attached snapshot).
So it seems the BN layer's weight ordering gets changed,
i.e. if I wrap model1 to get model2, the predictions are the same but the weight structure changes.

There is no issue if the BN layer is absent.
I was trying to understand this strange behaviour; I want the structure of w2 to be the same as that of w1.
  

weights.JPG

Sergey O.

Aug 3, 2018, 10:18:08 AM
to Pramod Bachhav, Keras-users
The weights should be identical because they are shared across the two models.
You should get the same output from both models, before and after training, regardless of whether you fit "model1" or "model2".

But I see what you are saying: for some reason the get_weights() function returns the weights of the two "models" in a different order, even though the weights themselves are identical.
I tried to debug this!

for i in range(len(model1.trainable_weights)):
  print(i,model1.trainable_weights[i],model2.trainable_weights[i])

output identical

for i in range(len(model1.non_trainable_weights)):
  print(i,model1.non_trainable_weights[i],model2.non_trainable_weights[i])

output identical

for i in range(len(model1.get_weights())):
  print(i,model1.get_weights()[i].shape,model2.get_weights()[i].shape)

output different

So get_weights() is returning the trainable weights and the non-trainable weights (from the batch-norm layers) in a different order for the two "models"...

Here is what the get_weights() function does:
def get_weights(self):
  weights = []
  for layer in self.layers:
    weights += layer.weights
  return K.batch_get_value(weights)

Let's reproduce to see what happens:

for i in range(len(model1.layers)):
  for j in range(len(model1.layers[i].weights)):
    print("model1",i,j,model1.layers[i].weights[j])

for i in range(len(model2.layers)):
  for j in range(len(model2.layers[i].weights)):
    print("model2",i,j,model2.layers[i].weights[j])

It appears the trainable weights are printed first and the non-trainable weights are printed last for each layer!
Since the first "layer" of model2 is model1 itself (see print(model2.layers) or print(model2.summary(120))), it returns all the weights of model1 (regardless of which layer of model1 they belong to) in the order of trainable followed by non-trainable.
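This flattening happens because a nested model is itself a Layer, and a Layer's weights property concatenates its trainable and non-trainable weights. In Keras 2.x the base class does roughly the following (paraphrased from memory, for illustration only):

# Roughly what the base Layer class defines (paraphrased):
@property
def weights(self):
    return self.trainable_weights + self.non_trainable_weights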

This feels like a bug...
Not sure if we should try to fix the get_weights() function or hack together an alternative way to get the weights in the right order.

Sergey O.

Aug 3, 2018, 10:43:03 AM
to Pramod Bachhav, Keras-users
Assuming you actually want model1 and model2 to share the same weights and to return them in the same order when you call get_weights():

I guess one way would be to make a model0, and then make model1 as a wrapper around model0 and model2 as another wrapper around model0 (just like you are currently doing between model2 and model1).

Then model1 and model2 are equivalent and return the weights of model0 in the same order when you call model1.get_weights() or model2.get_weights().
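A quick sketch of that idea (illustrative only; mk_model is the helper from the earlier message):

# Both wrappers point at the same underlying model0, so they share weights
# and report them in the same (flattened) order.
model0 = mk_model()              # the actual network, including the BN layers
model1 = Sequential([model0])    # wrapper 1
model2 = Sequential([model0])    # wrapper 2

w1 = model1.get_weights()
w2 = model2.get_weights()
print(all((a == b).all() for a, b in zip(w1, w2)))  # True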


Daπid

Aug 3, 2018, 11:19:43 AM
to Sergey O., Pramod Bachhav, Keras-users
On 3 August 2018 at 16:18, Sergey O. <kings...@gmail.com> wrote:
It appears the trainable weights are printed first and the non-trainable weights are printed last for each layer!
Since the first "layer" of model2 is model1 itself (see print(model2.layers) or print(model2.summary(120))), it returns all the weights of model1 (regardless of which layer of model1 they belong to) in the order of trainable followed by non-trainable.

This feels like a bug...
Not sure if we should try to fix the get_weights() function or hack together an alternative way to get the weights in the right order.

I think you are on to something. It is possible that fixing that would also fix this problem:

https://github.com/keras-team/keras/issues/10784


/David.

Pramod Bachhav

Aug 3, 2018, 11:41:49 AM
to Keras-users
Hi,

Thanks for the explanation.
I will try that hack. Hopefully this weird behaviour of get_weights() gets fixed.

Pramod Bachhav

Aug 3, 2018, 11:43:46 AM
to Keras-users
Maybe, it seems like a similar issue. Tracking it...
Thanks

Ted Yu

Aug 4, 2018, 12:23:43 PM
to Keras-users
Would utilizing layers_by_depth help stabilize the order of the weights?
I was thinking of something like the following:

    def get_weights_by_depth(self):
        weights = []
        for depth, layers in self.layers_by_depth.items():
            for layer in layers:
                weights += layer.weights
        return K.batch_get_value(weights)


Ted Yu

Aug 4, 2018, 12:33:20 PM
to Keras-users
I tried what I suggested (after fixing typos) - it didn't help.

FYI

Sergey O.

Aug 4, 2018, 2:50:19 PM
to Ted Yu, Keras-users
The issue is that the entire nested model is treated as a single "layer". You need to iterate through the weights within that layer in the order of the ops that constitute the sequential-model-layer, getting the trainable and non-trainable weights for each op.

I think the issue is more in how the references are assembled within "layer.weights" and less in the get_weights() function itself.

Ted Yu

Aug 5, 2018, 7:28:42 PM
to kings...@gmail.com, keras...@googlegroups.com
I noticed that the layers under the Sequential layer of model2, when printed, appear in reverse order compared to the layers of model1.

After some experimentation, the following method returns identical weights for model1 and model2.

    def get_normalized_weights(self):
        """Retrieves the weights of the model.

        # Returns
            A flat list of Numpy arrays.
        """
        weights = []
        for depth, layers in self.layers_by_depth.items():
            for layer in layers:
                if layer.__class__.__name__ == 'Sequential':
                    # Use a reversed copy so the nested model's own layer
                    # list is not mutated in place.
                    sorted_layers1 = list(reversed(layer.layers))
                    for layer1 in sorted_layers1:
                        weights += layer1.trainable_weights
                        weights += layer1.non_trainable_weights
                else:
                    weights += layer.trainable_weights
                    weights += layer.non_trainable_weights
        return K.batch_get_value(weights)

Comments are welcome.
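For reference, one way to try this out without patching Keras itself (a sketch; it assumes the Keras version in use exposes layers_by_depth, as the method above does) is to bind the helper onto the existing model instances:

import types

# Bind the helper onto the existing models (sketch only).
model1.get_normalized_weights = types.MethodType(get_normalized_weights, model1)
model2.get_normalized_weights = types.MethodType(get_normalized_weights, model2)

w1 = model1.get_normalized_weights()
w2 = model2.get_normalized_weights()
print(all(a.shape == b.shape for a, b in zip(w1, w2)))  # the orders now line up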