Applying a time distributed layer on a sequential model errors

Rodrigo Pimentel

Sep 4, 2018, 12:04:51 PM
to TensorFlow.js Discussion
Hi,

I'm trying to port a sequential attention layer from Keras to TF.js. So far I have noticed that TF.js has no `batch_dot` implementation, but I was able to work around that. The problem I have is when I pass a model to the time distributed layer. In Keras it works just fine, but in TF.js I get the error `TypeError: p.reshape is not a function`, which originates in the recurrent layer definition file. Here is a CodePen replicating the error: https://codepen.io/rodrigopivi/pen/WgOpEK?editors=1012

I'm not sure how to fix it, so any help is most welcome. Also, if somebody has already implemented the batch_dot operation or this kind of attention layer, please let me know.

Thank you very much. This is the attention implementation for Keras that I have working just fine:

import keras
from keras import backend as K
from keras.layers import Layer, InputSpec

class TimeSeriesAttention(Layer):
    def __init__(self, **kwargs):
        if 'input_shape' not in kwargs and 'input_dim' in kwargs:
            kwargs['input_shape'] = (kwargs.pop('input_dim'),)
        super(TimeSeriesAttention, self).__init__(**kwargs)
        self.input_spec = InputSpec(ndim=3)
        self.supports_masking = True

    def build(self, input_shape):
        dimensions = input_shape[2]
        timed = keras.models.Sequential(name='per_time_step')
        timed.add(keras.layers.Dense(dimensions, input_shape=(dimensions,), kernel_initializer='zeros'))
        timed.add(keras.layers.Activation('softmax'))
        # Xavier initialization is good for tanh activation
        timed.add(keras.layers.Dense(dimensions, kernel_initializer='glorot_normal'))
        timed.add(keras.layers.Activation('tanh'))
        self.timed = keras.layers.TimeDistributed(timed)
        self.trainable_weights = self.timed.trainable_weights
        self.non_trainable_weights = self.timed.non_trainable_weights
        self.built = True

    def call(self, inputs):
        encoded = self.timed(inputs)
        self_attended = K.batch_dot(inputs, K.permute_dimensions(encoded, (0, 2, 1)))
        attention = K.softmax(self_attended)
        attention = K.permute_dimensions(attention, (0, 2, 1))
        return K.batch_dot(attention, inputs)

    def compute_output_shape(self, input_shape):
        return input_shape
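
For anyone porting the `call` method above, it may help to see what the rank-3 `batch_dot` and `permute_dimensions(..., (0, 2, 1))` pair computes. The following is a plain-array sketch with no TF.js dependency; `batchDot` and `permuteLastTwo` are hypothetical helper names, and non-empty, rectangular inputs are assumed. In TF.js itself, `tf.matMul` with its transpose flags may cover the same case for rank-3 tensors, though that is worth checking against the version in use.

```typescript
// Reference semantics only: nested arrays stand in for rank-3 tensors.
type Tensor3 = number[][][];

// Swap the last two axes: [batch, rows, cols] -> [batch, cols, rows].
function permuteLastTwo(x: Tensor3): Tensor3 {
  return x.map(m => m[0].map((_, c) => m.map(row => row[c])));
}

// Per batch item, multiply x[i] ([n, k]) by y[i] ([k, m]) to get [n, m].
function batchDot(x: Tensor3, y: Tensor3): Tensor3 {
  if (x.length !== y.length) throw new Error('batch sizes differ');
  return x.map((xm, i) => {
    const ym = y[i];
    if (xm[0].length !== ym.length) throw new Error('inner dimensions differ');
    return xm.map(row =>
      ym[0].map((_, c) => row.reduce((sum, v, k) => sum + v * ym[k][c], 0)));
  });
}

const inputs: Tensor3 = [[[1, 2], [3, 4]]];   // shape [1, 2, 2]
const encoded: Tensor3 = [[[1, 1], [0, 1]]];  // shape [1, 2, 2]
const scores = batchDot(inputs, permuteLastTwo(encoded));
console.log(JSON.stringify(scores)); // [[[3,2],[7,4]]]
```

The result has shape [batch, time, time], which is what the softmax in `call` is then applied to.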



yass...@google.com

Sep 5, 2018, 10:27:23 AM
to TensorFlow.js Discussion
Looking at your codepen it looks like you are passing a tf.Model to the layer property of timeDistributed. From the docs (https://js.tensorflow.org/api/latest/#layers.timeDistributed) it looks like timeDistributed expects a layer and not a model. When I passed the dense layer directly to it, it worked without error.

Yannick

Rodrigo Pimentel Villasante

Sep 5, 2018, 10:33:56 AM
to yass...@google.com, TensorFlow.js Discussion
Thank you very much for your response. Yes, I want to port that Keras code, since the TF.js API is almost identical. Are there any plans to let the timeDistributed layer accept a model, as Keras does? And how much effort would it take to implement this? (Maybe I can try to write a custom RNN layer; I'm just not sure how much work that would require at the moment.)

Shanqing Cai

Sep 5, 2018, 1:51:20 PM
to TensorFlow.js Discussion, yass...@google.com
Rodrigo and Yannick, 

You are both right. The issue goes away when you use a plain Layer object, instead of a Model. But a Model is a Layer in Keras and TensorFlow.js, so what Rodrigo is trying to do is legitimate.

It fails due to a bug in how the TimeDistributed layer currently handles arrays of step outputs. I filed an issue to track this: https://github.com/tensorflow/tfjs/issues/681. We will try to fix it soon.

Shanqing

Rodrigo Pimentel Villasante

Sep 6, 2018, 3:54:24 PM
to Shanqing Cai, TensorFlow.js Discussion, yass...@google.com
Thank you very much Shanqing, just saw your fix got merged!

Shanqing Cai

Sep 6, 2018, 3:55:41 PM
to Rodrigo Pimentel, TensorFlow.js Discussion, Yannick Assogba
Sure thing. Beware that the fix won't take effect on npm or any of the CDNs until the next release of tensorflow.js happens.
--
---
Shanqing Cai
Software Engineer
Google

Rodrigo Pimentel Villasante

Sep 12, 2018, 9:34:35 AM
to Shanqing Cai, TensorFlow.js Discussion, yass...@google.com
Hi Shanqing,

Thank you again for your help fixing this issue. After this PR got merged (https://github.com/tensorflow/tfjs-layers/pull/315), I was able to manually build the tfjs-layers master and pass a model as a layer as expected, but I think I found another (possibly related) bug. When training a model that has a layer that is itself a model, and passing validationSplit as a training parameter, the code only works if the validation split is `0.5`, because then the training and validation tensors have exactly the same shape. If validationSplit has another value (e.g. 0.2), it throws the error `Error: Input 0 is incompatible with layer time_distributed_TimeDistributed1: expected shape=4,1,2, found shape=2,1,2.`

Here is example TypeScript code that reproduces the problem:

import * as tf from '@tensorflow/tfjs';
import { InputSpec } from '@tensorflow/tfjs-layers/dist/engine/topology';

class TT extends tf.layers.Layer {
  public static className = 'TT';
  public className = TT.className;
  public timed: tf.layers.Layer | null = null;

  constructor(config?: any) {
    super(config || {});
    this.inputSpec = [new InputSpec({ ndim: 3 })];
  }

  public build(inputShape: tf.Shape): void {
    const dimensions = inputShape[2];
    const timed = tf.sequential({ name: 'per_time_step' });
    timed.add(tf.layers.dense({ inputShape: [dimensions], kernelInitializer: 'zeros', units: dimensions }));
    timed.add(tf.layers.activation({ activation: 'softmax' }));
    this.timed = tf.layers.timeDistributed({ layer: timed });
    this.trainableWeights = this.timed.trainableWeights;
    this.nonTrainableWeights = this.timed.nonTrainableWeights;
    this.built = true;
  }

  public call(inputs: tf.Tensor[], kwargs: any) {
    if (!this.built || !this.timed) { throw new Error(); }
    return tf.tidy(() => {
      this.invokeCallHook(inputs, kwargs);
      return (this.timed as tf.layers.Layer).apply(inputs) as tf.Tensor;
    });
  }

  public computeOutputShape(inputShape: tf.Shape) { return inputShape; }
}
tf.serialization.SerializationMap.register(TT);

const inputs = tf.input({ dtype: 'float32', shape: [1, 2] });
const lstm = tf.layers.lstm({ units: 2, returnSequences: true }).apply(inputs) as tf.SymbolicTensor;
const timeAttention = new TT({}).apply(lstm) as tf.SymbolicTensor;
const model = tf.model({ inputs, outputs: timeAttention });
const optimize = tf.train.adam(0.0066, 0.0025, 0.1);
model.compile({ loss: 'categoricalCrossentropy', metrics: ['accuracy'], optimizer: optimize });
const inp = tf.tensor3d([[[1,1]],[[2,2]],[[3,3]],[[4,4]],[[5,5]],[[6,6]]], [6, 1, 2]);
const out = tf.tensor3d([[[1,0]],[[2,0]],[[3,0]],[[4,0]],[[5,0]],[[6,0]]], [6, 1, 2]);

(async () => {
  // NOTE: Here is the issue, if validationSplit is 0.5 this code works, else it fails
  await model.fit(inp, out, { validationSplit: 0.5 });
})();

For now I am just using 0.5 as the validation split as a workaround; otherwise I'm not sure where this problem can be fixed. Thank you very much!
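
The shapes in the error message are consistent with simple split arithmetic on the six samples used above. The sketch below assumes the trailing `validationSplit` fraction is held out and the training count is rounded down; `splitSizes` is a hypothetical helper for illustration, not a TF.js function.

```typescript
// Hypothetical helper: how many samples land on each side of the split,
// assuming the training count is floor(numSamples * (1 - validationSplit)).
function splitSizes(numSamples: number, validationSplit: number) {
  const train = Math.floor(numSamples * (1 - validationSplit));
  return { train, validation: numSamples - train };
}

console.log(splitSizes(6, 0.5)); // { train: 3, validation: 3 }
console.log(splitSizes(6, 0.2)); // { train: 4, validation: 2 }
```

Under that assumption, a split of 0.2 gives 4 training samples and 2 validation samples, matching `expected shape=4,1,2, found shape=2,1,2`, while 0.5 gives equal halves and so hides the mismatch.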

Shanqing Cai

Sep 18, 2018, 10:36:59 AM
to TensorFlow.js Discussion, ca...@google.com, yass...@google.com
Hi Rodrigo,

I looked into this issue a little. Apart from the workaround you mentioned (which is not satisfactory), you can add a line to your custom TT.build() function, so that it looks like:

```
  build(inputShape) {
    const dimensions = inputShape[2];
    const timed = tf.sequential({ name: 'per_time_step' });
    timed.add(tf.layers.dense({
      inputShape: [dimensions],
      kernelInitializer: 'zeros',
      units: dimensions
    }));
    timed.add(tf.layers.activation({ activation: 'softmax' }));
    this.timed = tf.layers.timeDistributed({layer: timed});
    this.timed.build(inputShape);  // <-- Added line.
    this.trainableWeights = this.timed.trainableWeights;
    this.nonTrainableWeights = this.timed.nonTrainableWeights;
    this.built = true;
  }
```

Then your model.fit() call will work with any valid validationSplit values.
This not only solves the problem, but also feels more correct than your original code, because it is strange to build a layer object, but not its constituent layer objects.
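
One plausible reading of why the added line helps, offered as an assumption rather than something confirmed in this thread: if the inner layer is first built lazily from a concrete training batch, it may record the batch size (4) as part of its expected input shape, whereas building it from the symbolic `inputShape` leaves the batch dimension unset. The sketch below illustrates that idea with a hypothetical `shapeMatchesSpec` check in which `null` means "any size"; it is not the library's actual code.

```typescript
// Hypothetical shape check: a null entry in the spec matches any size.
type Dim = number | null;

function shapeMatchesSpec(spec: Dim[], actual: number[]): boolean {
  return spec.length === actual.length &&
    spec.every((d, i) => d === null || d === actual[i]);
}

// Spec recorded from a concrete training batch of 4 samples.
console.log(shapeMatchesSpec([4, 1, 2], [2, 1, 2]));    // false
// Spec recorded from a symbolic shape with an open batch dimension.
console.log(shapeMatchesSpec([null, 1, 2], [2, 1, 2])); // true
```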

Let me know if this sounds reasonable to you.

Best,
Shanqing

Rodrigo Pimentel Villasante

Sep 19, 2018, 7:58:46 PM
to Shanqing Cai, TensorFlow.js Discussion, yass...@google.com
Thank you very much for your help Shanqing, it works great now!
