Gradient for tf.multinomial

Sicheng Wang

Jul 24, 2018, 3:01:10 AM
to Discuss
I am a user of Keras with the TF backend. When implementing a custom layer in Keras, I did something like this:

def build(self, input_shape):
    self.weight = self.add_weight(shape=(1, self.num_classes), trainable=True, name='prob')

def call(self, x):
    x_shape = K.shape(x)[0]                  # batch size
    weight = K.softmax(self.weight)          # (1, num_classes) probabilities
    samples = tf.multinomial(tf.log(weight), x_shape, output_dtype='int32')
    return K.cast(K.transpose(samples), dtype='float32')   # (batch, 1)

def compute_output_shape(self, input_shape):
    return (input_shape[0], 1)

The gist is to draw x_shape samples. However, after compiling and trying to train the layer, it raised the following error:

    raise ValueError('An operation has `None` for gradient. '

ValueError: An operation has `None` for gradient. Please make sure that all of your ops have a gradient defined (i.e. are differentiable). Common ops without gradient: K.argmax, K.round, K.eval.


Does that mean the gradient is not implemented for tf.multinomial?

Thank you.

Martin Wicke

Jul 24, 2018, 8:02:41 AM
to sichengwa...@gmail.com, Discuss
Can you remove the cast? Cast may not have a gradient.
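For reference, a minimal sketch of call() with the trailing cast removed (hypothetical, just to rule the cast out; tf.multinomial still returns integer sample indices, so anything downstream has to accept an int tensor):

def call(self, x):
    batch_size = K.shape(x)[0]
    weight = K.softmax(self.weight)                       # (1, num_classes) probabilities
    samples = tf.multinomial(tf.log(weight), batch_size)  # (1, batch_size) int64 indices
    return K.transpose(samples)                           # (batch_size, 1), integer dtype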


Sicheng Wang

Jul 25, 2018, 1:34:43 AM
to Discuss
Actually, if I do

x = tf.Variable([[5, 3, 2]])
y = tf.cast(x, dtype=tf.float32)
z = tf.multinomial(x, 10, output_dtype=tf.int32)


Then both gradients yield None:

tf.gradients(y, x)
[None]

tf.gradients(z, x)
[None]


So it seems there is no way around this?

Sicheng
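For reference, a self-contained TF 1.x sketch of that check (hypothetical; the float tensor y is used as the logits, and the gradients are inspected at graph-construction time, so no session is needed):

import tensorflow as tf

x = tf.Variable([[5, 3, 2]])                       # int32 variable
y = tf.cast(x, dtype=tf.float32)                   # cast to float
z = tf.multinomial(y, 10, output_dtype=tf.int32)   # integer sample indices

print(tf.gradients(y, x))   # [None] -- no gradient flows into an integer variable
print(tf.gradients(z, x))   # [None]
print(tf.gradients(z, y))   # [None] -- Multinomial has no registered gradient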

Martin Wicke

Jul 25, 2018, 1:58:20 AM
to Sicheng Wang, Discuss
If the output type is int, the gradient will be None. Ints are not typically differentiable.
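A tiny illustration of that point (a hypothetical TF 1.x sketch, not from the thread): a float-valued op like softmax has a gradient, while the int-valued sampling op does not.

import tensorflow as tf

logits = tf.Variable([[5.0, 3.0, 2.0]])

probs = tf.nn.softmax(logits)                                 # float-valued, differentiable
print(tf.gradients(probs, logits))    # [<tf.Tensor ...>]

samples = tf.multinomial(logits, 10, output_dtype=tf.int32)   # int-valued sampling
print(tf.gradients(samples, logits))  # [None]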
