Strange "An operation has `None` for gradient." error

asp...@cs.unibo.it

Sep 18, 2018, 11:02:35 AM
to Keras-users
Hello. I was playing with the variational autoencoder from the Keras blog
at https://github.com/keras-team/keras/blob/master/examples/variational_autoencoder.py

At line 186 the loss function is defined as the sum of the reconstruction error and the KL divergence:

vae_loss = K.mean(reconstruction_loss + kl_loss)

The weird fact is that if I replace it with, say

vae_loss = K.mean(kl_loss)

I get the following compilation error:

ValueError: An operation has `None` for gradient. Please make sure that all of your ops have a gradient defined (i.e. are differentiable). Common ops without gradient: K.argmax, K.round, K.eval.

Any idea?

Thanks.

Ted Yu

Sep 18, 2018, 11:25:53 AM
to asp...@cs.unibo.it, keras...@googlegroups.com
I think this is because the gradient comes from the reconstruction_loss term.


Andrea Asperti

Sep 18, 2018, 11:31:35 AM
to Ted Yu, asp...@cs.unibo.it, keras...@googlegroups.com

You mean I need some explicit mention of the output layers in order to compute the gradient for the parameters involved in those layers?

That's silly.

vae_loss = K.mean(0*reconstruction_loss + kl_loss)

compiles perfectly, and the compiler could do that transformation by itself.


Ted Yu

Sep 18, 2018, 11:41:09 AM
to andrea....@unibo.it, asp...@cs.unibo.it, keras...@googlegroups.com
I am interested to know how the compiler handles the '0*' case.

My assumption is that:
gradient of (0*reconstruction_loss + kl_loss) != gradient of (kl_loss)
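Ted's assumption can be checked directly. The sketch below uses today's `tf.GradientTape` API rather than the symbolic `K.gradients` backend the thread was written against, and the variable names are stand-ins for the encoder and decoder weights: multiplying by `0` keeps the decoder in the computation graph, so its gradient becomes a concrete `0.0` instead of `None`.

```python
import tensorflow as tf

# Hypothetical stand-ins for one encoder weight and one decoder weight.
enc_w = tf.Variable(2.0)
dec_w = tf.Variable(3.0)

with tf.GradientTape(persistent=True) as tape:
    kl_like = enc_w ** 2         # touches only the "encoder"
    recon_like = dec_w ** 2      # touches only the "decoder"
    zeroed = 0.0 * recon_like + kl_like

# kl_like alone: dec_w is disconnected from the loss, so its gradient is None.
print(tape.gradient(kl_like, dec_w))    # None
# 0*recon + kl: dec_w is connected, so its gradient is a real tensor (0.0).
print(tape.gradient(zeroed, dec_w))     # a zero-valued tf.Tensor, not None
```

So the two gradients do differ, but only in kind (`None` vs. a defined zero), not in the update they would produce.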

mycal....@gmail.com

Dec 4, 2018, 5:23:34 PM
to Keras-users
Oh! I think I may know why this error is happening. And yes, let me say up front that it's dumb that the 0 multiplication works when just leaving the term out doesn't.

My thinking is that we need to include reconstruction_loss so that there's a gradient defined for the weights in the decoder. If you use only kl_loss, there's no gradient for the second half of the VAE. Once you add in reconstruction_loss, you have symbolic gradients defined for those weights, so TensorFlow will be happy.
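This explanation is easy to reproduce in miniature. A sketch (using the current `tf.GradientTape` API rather than the 2018 symbolic backend; `encoder_w`/`decoder_w` are made-up names): a loss that never touches a variable yields `None` for that variable's gradient, which is exactly what Keras surfaces as the error above.

```python
import tensorflow as tf

encoder_w = tf.Variable(2.0)   # hypothetical encoder weight
decoder_w = tf.Variable(1.0)   # hypothetical decoder weight

with tf.GradientTape() as tape:
    kl_loss = encoder_w ** 2   # depends on the encoder only

grads = tape.gradient(kl_loss, [encoder_w, decoder_w])
print(grads)  # second entry is None: decoder_w never appears in the loss
```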

In fact, I've run into a similar sort of issue in my own development. If I'm debugging with a couple of differently-weighted loss functions and want to set the weight of one of them to 0, I still need to define a loss function for it. I think that's for the same reason I laid out above.

Mihai Mehedint

Jul 2, 2019, 10:25:17 PM
to Keras-users
I had a similar issue (with custom layers and models), and the solution for me was to build the layers in the __init__() method before using them in call():

self.lstm_custom_1 = keras.layers.LSTM(128, batch_input_shape=batch_input_shape,
                                       return_sequences=False, stateful=True)
self.lstm_custom_1.build(batch_input_shape)

self.dense_custom_1 = keras.layers.Dense(32, activation='relu')
self.dense_custom_1.build(input_shape=(batch_size, 128))

Hope it helps
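For completeness, here is a stripped-down sketch of the pattern Mihai describes: sublayers created once in __init__() and only applied in call(). Dense layers stand in for his stateful LSTM to keep it minimal, and the class name is made up.

```python
import tensorflow as tf
from tensorflow import keras

class SmallModel(keras.Model):
    def __init__(self):
        super().__init__()
        # Create the sublayers here, not inside call()
        self.dense_custom_1 = keras.layers.Dense(32, activation='relu')
        self.head = keras.layers.Dense(2)

    def call(self, inputs):
        return self.head(self.dense_custom_1(inputs))

model = SmallModel()
out = model(tf.zeros((4, 8)))
print(out.shape)  # (4, 2)
```

Because every layer exists (and owns its weights) before the first call, all trainable variables are connected to the loss and the `None`-gradient error cannot arise from layers being re-created on each call.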