tf.keras Custom Dense Layer with Custom Matrix Multiplication has None for Gradient

saleem ullah

Mar 8, 2019, 10:07:40 AM

to Discuss

I am trying to implement my own custom Dense layer in TensorFlow 1.12.0, following the instructions for defining custom layers in writing-your-own-keras-layers. The implementation that uses tf.matmul(inputs, self.kernel) works fine. The layer is defined as follows:

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers


class MyLayer(layers.Layer):

  def __init__(self, output_dim, **kwargs):
    self.output_dim = output_dim
    super(MyLayer, self).__init__(**kwargs)

  def build(self, input_shape):
    # Kernel shape is (input features, output features).
    shape = tf.TensorShape((input_shape[1], self.output_dim))

    # Create a trainable weight variable for this layer.
    self.kernel = self.add_weight(name='kernel',
                                  shape=shape,
                                  initializer='uniform',
                                  trainable=True)

    super(MyLayer, self).build(input_shape)

  def call(self, inputs):
    # Plain matrix product of the inputs and the kernel (no bias term).
    y = tf.matmul(inputs, self.kernel)
    return y

The model is as follows:


model = tf.keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),
    MyLayer(20, input_shape=(1, 784)),
    #MyLayer(input_shape=(10,)),
    layers.Activation('relu'),
    MyLayer(10,input_shape=(1, 20)),
    #MyLayer(input_shape=(10,)),
    layers.Activation('relu'),
    keras.layers.Dense(10, input_shape=(1, 10), activation='softmax')])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(data, labels, epochs=1, batch_size=1,
         validation_data=(val_data, val_labels))

However, when I replace tf.matmul() with my own Python-based matrix multiplication algorithm, I get the following error:

Traceback (most recent call last):
  File "custom_layer.py", line 156, in <module>
ValueError: An operation has `None` for gradient. Please make sure that all of your ops have a gradient defined (i.e. are differentiable). Common ops without gradient: K.argmax, K.round, K.eval.

My custom matrix multiplication algorithm uses three nested loops to compute the output.
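
For illustration, a three-loop multiplication of this kind can be sketched in plain Python on nested lists. This is only a minimal sketch of the general technique, not the code from custom_layer.py; the function name loop_matmul and the list-of-lists representation are assumptions made for the example.

```python
def loop_matmul(a, b):
    """Multiply an (n x k) matrix by a (k x m) matrix, both given as nested lists.

    Hypothetical helper for illustration; not the code from custom_layer.py.
    """
    n, k, m = len(a), len(b), len(b[0])
    if any(len(row) != k for row in a):
        raise ValueError("inner dimensions do not match")

    # Start from an (n x m) matrix of zeros and accumulate into it.
    out = [[0.0] * m for _ in range(n)]
    for i in range(n):          # rows of a
        for j in range(m):      # columns of b
            for p in range(k):  # shared inner dimension
                out[i][j] += a[i][p] * b[p][j]
    return out


print(loop_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

The sketch only shows the loop structure. It works on concrete Python numbers and makes no claim about how gradients are tracked when such loops are used inside a layer's call method.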

Can somebody please clarify why the custom matrix multiplication has `None` for its gradient, or point out what I am doing wrong here?