Keras bidirectional LSTM model converted to TensorFlow.js not producing correct inference


Jay M

unread,
Feb 11, 2019, 9:48:30 AM2/11/19
to TensorFlow.js Discussion
Hello,

I have created and trained a Keras-based bidirectional LSTM model to classify video. The model works well and classifies the videos with 90%+ accuracy. But when I converted this model to a TensorFlow.js model using the tensorflowjs_converter tool and ran it in the browser, the model always throws the same output (top 3 results) for any video input: BasketballDunk, prob. 0.860; BalanceBeam, prob. 0.088; BodyWeightSquats, prob. 0.024.

I have checked all the inputs, their shapes, etc. that are given to the bidirectional LSTM model and can't find any issues. But the inference from the bidirectional LSTM model is still always the same irrespective of the video input. Please help me fix this issue. All the required details are below.

(The entire model is based on the examples given in this GitHub repository by Xianshun Chen (chen0040): https://github.com/chen0040/keras-video-classifier)

Details of the model:
 - uses a MobileNet model to extract features
 - uses a bidirectional LSTM model to take in the extracted features and classify the video as one of 20 classes (rough inference sketch below)
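
(For context, the browser-side inference is a two-stage pipeline, roughly as in the sketch below; the model paths and the helper name are assumptions, not the actual code.)

```js
// Two-stage inference sketch (TF.js 0.x API; paths and helpers assumed):
const mobilenet = await tf.loadModel('model/mobilenet/model.json'); // feature extractor
const lstm = await tf.loadModel('model/lstm/model.json');           // sequence classifier

// buildSequenceTensor is a hypothetical helper that runs MobileNet on each
// sampled video frame and assembles a [1, numFrames, featureLen] tensor:
const probs = lstm.predict(buildSequenceTensor(frames, mobilenet));
```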

Dataset used:
 - UCF101 - Action Recognition Data Set (http://crcv.ucf.edu/data/UCF101.php)

Tensorflowjs converted model:
 - the converted TensorFlow.js model, sample videos and an HTML file to test are all in this drive location as a zip file: https://drive.google.com/open?id=1k_4xOPlTdbUJCBPFyT9zmdB3W5lYfuw0
 - to test the model, just unzip, build using 'yarn' and run using 'yarn watch'
 - index.html has the instructions to test

NOTE: I have tried the unidirectional LSTM model and the same issue occurs with that converted model as well. The only difference is that it produces 'Billiards' as the top prediction, with probability over 0.95.

Regards,
Jay

Shanqing Cai

unread,
Feb 11, 2019, 9:52:58 AM2/11/19
to Jay M, TensorFlow.js Discussion
Can you create an issue for this on GitHub? https://github.com/tensorflow/tfjs/issues





--
---
Shanqing Cai
Software Engineer
Google

Nikhil Thorat

unread,
Feb 11, 2019, 10:39:40 AM2/11/19
to Shanqing Cai, Jay M, TensorFlow.js Discussion
Also, when you file the issue, make sure you mention which browser you are using.

Jay M

unread,
Feb 11, 2019, 11:45:31 AM2/11/19
to TensorFlow.js Discussion, ca...@google.com, jaya...@googlemail.com
@Shanqing, here is the issue I created on GitHub. Hope this helps. Let me know if more details are required.

@Nikhil, here are the browsers I have tested on. Yes, I have updated the GitHub issue with these:
Firefox 65.0 (64-bit) on Windows 10
Microsoft Edge 42.17134.1.0 on Windows 10

Regards,
Jay

Jay M

unread,
Feb 11, 2019, 9:45:38 PM2/11/19
to TensorFlow.js Discussion, ca...@google.com, jaya...@googlemail.com
Just to add: in the same environment, I had tried converting my fine-tuned MobileNet Keras model to TensorFlow.js. That one works perfectly in the browser. So I am sure the environment and the TensorFlow versions are fine. Somehow, LSTM-based Keras models converted to TensorFlow.js are not working (returning the same predictions all the time).

I know LSTM models are not supported in TensorFlow Lite either (the tflite converter throws an error when trying to convert a Keras model with LSTM)...so I am wondering if TensorFlow.js also has similar issues specific to LSTM.

Jayasudan Munsamy

unread,
Feb 21, 2019, 7:43:38 AM2/21/19
to TensorFlow.js Discussion, ca...@google.com
Hi @caisq,
Found out the reasons for the tfjs-converted model not producing the correct inference...at last :)

Reasons:
1. The input to the LSTM model had NaNs in it! Though I was passing the extracted features from the MobileNet model to the LSTM, **dataSync() was not used** on the features tensor. Because of this, when I added the extracted features into a tf.buffer, they were added as NaN. (When I printed the values in the log just before adding them to the tf.buffer, they printed correctly!...this is strange.) When I used dataSync() on the extracted features, they got added into the tf.buffer correctly.
2. **Use of tf.buffer() to store the extracted features** (from MobileNet) and converting them to tensors before passing them to the LSTM model. Instead, I used tf.stack() to collect the extracted features and then passed the stacked tensor to the LSTM model. (I understand that tf.stack() does the equivalent of np.array().) A simplified sketch is below.
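
For reference, a simplified sketch of the working approach (variable and helper names here are illustrative, not the exact code):

```js
// Collect per-frame MobileNet features and stack them for the LSTM:
const frameFeatures = [];
for (const frame of frames) {
  const f = mobilenet.predict(preprocess(frame)); // tf.Tensor, shape [1, featureLen]
  frameFeatures.push(f.squeeze());                // shape [featureLen]
}
// tf.stack over a list of tensors is the rough equivalent of np.array():
const sequence = tf.stack(frameFeatures);                // [numFrames, featureLen]
const probs = lstmModel.predict(sequence.expandDims(0)); // [1, numClasses]
```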

Thanks anyway for your time.

BTW, the doc for dataSync() says **'This blocks the UI thread until the values are ready, which can cause performance issues'**. So, is it safe to use? In my case, the code did not work without dataSync()...so I am wondering if there is a replacement/equivalent method for dataSync() that will not create performance issues. Let me know.

Also, I noticed some minor differences in predictions between python model and tfjs model on few occasions. Is this supposed to happen? or am I missing something?

Regards,
Jay

Shanqing Cai

unread,
Feb 21, 2019, 9:25:00 AM2/21/19
to Jayasudan Munsamy, TensorFlow.js Discussion
(Since this is a little more involved than typical question answering, I'll continue the conversation here instead of on GitHub or Stack Overflow.)

See my inline comments below.

On Thu, Feb 21, 2019 at 7:43 AM Jayasudan Munsamy <jaya...@googlemail.com> wrote:
> Hi @caisq,
> Found out the reasons for the tfjs-converted model not producing the correct inference...at last :)

Glad you figured it out! Great job debugging! As I wrote in the GitHub issue, this doesn't appear to be a bug in LSTM or Bidirectional in TF.js.

> Reasons:
> 1. The input to the LSTM model had NaNs in it! Though I was passing the extracted features from the MobileNet model to the LSTM, **dataSync() was not used** on the features tensor. Because of this, when I added the extracted features into a tf.buffer, they were added as NaN. (When I printed the values in the log just before adding them to the tf.buffer, they printed correctly!...this is strange.) When I used dataSync() on the extracted features, they got added into the tf.buffer correctly.

Sorry, I don't fully understand. Isn't tf.buffer() supposed to give you an all-zero tensor buffer by default? How did you do the adding to cause the NaNs?
> 2. **Use of tf.buffer() to store the extracted features** (from MobileNet) and converting them to tensors before passing them to the LSTM model. Instead, I used tf.stack() to collect the extracted features and then passed the stacked tensor to the LSTM model. (I understand that tf.stack() does the equivalent of np.array().)

> Thanks anyway for your time.
>
> BTW, the doc for dataSync() says **'This blocks the UI thread until the values are ready, which can cause performance issues'**. So, is it safe to use? In my case, the code did not work without dataSync()...so I am wondering if there is a replacement/equivalent method for dataSync() that will not create performance issues. Let me know.

`await data()` is preferred over `dataSync()` whenever possible, precisely for the reasons related to UI-thread blocking.
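
For example (sketch):

```js
// Asynchronous download: resolves when the values are ready, without
// blocking the UI thread:
const vals = await features.data();    // Float32Array
// Synchronous download: blocks the UI thread until the values arrive:
const valsSync = features.dataSync();  // same values, but blocking
```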
 

> Also, I noticed some minor differences in predictions between the Python model and the tfjs model on a few occasions. Is this supposed to happen? Or am I missing something?

How minor are the differences? Have you debugged to find out whether the difference originates from the MobileNet stage or the LSTM stage?
 

> Regards,
> Jay

Nikhil Thorat

unread,
Feb 21, 2019, 9:33:33 AM2/21/19
to Shanqing Cai, Jayasudan Munsamy, TensorFlow.js Discussion
For the future: you can use debug mode, tf.ENV.set('DEBUG', true), to automatically find NaNs. TensorFlow.js 1.0, which will be released in a few weeks, will have a method tf.enableDebugMode().
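
For example (sketch, using the 0.x API mentioned above):

```js
// Turn on debug mode before running inference; TF.js then checks op
// outputs and reports NaNs as they appear:
tf.ENV.set('DEBUG', true);
// In TensorFlow.js 1.0 this is expected to become:
// tf.enableDebugMode();
```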


Jayasudan Munsamy

unread,
Feb 21, 2019, 12:58:18 PM2/21/19
to Shanqing Cai, TensorFlow.js Discussion
Hi Shanqing,
Thanks for checking. My comments below.

> this doesn't appear to be a bug in LSTM or Bidirectional in TF.js.
Yes, I agree with you now :)

> Sorry, I don't fully understand. Isn't tf.buffer() supposed to give you an all-zero tensor buffer by default? How did you do the adding to cause the NaNs?
I created an empty tf.buffer with the shape I wanted and then used the .set() method of the tf.buffer to set the extracted feature values in it. This strangely set NaNs in the tf.buffer. Hope this clarifies your doubt.
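
Roughly like this (sketch; variable names are illustrative):

```js
// Empty buffer with the desired shape (all zeros by default):
const buf = tf.buffer([numFrames, featureLen]);
frames.forEach((frame, t) => {
  const features = mobilenet.predict(preprocess(frame)); // tf.Tensor
  // Without first downloading the values via dataSync(), the writes
  // ended up as NaN; with dataSync() they were set correctly:
  const vals = features.dataSync();            // Float32Array
  for (let i = 0; i < featureLen; i++) {
    buf.set(vals[i], t, i);                    // set(value, row, col)
  }
});
const input = buf.toTensor().expandDims(0);    // [1, numFrames, featureLen]
```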

> `await data()` is preferred over `dataSync()` whenever possible, precisely for the reasons related to UI-thread blocking.
Ok, will give it a try. 

> How minor are the differences? Have you debugged to find out whether the difference originates from the MobileNet stage or the LSTM stage?
It's at the LSTM stage. I need to do a bit more testing to give some more details. Will do that tomorrow.

Regards,
Jay

Jayasudan Munsamy

unread,
Feb 21, 2019, 12:59:33 PM2/21/19
to Nikhil Thorat, Shanqing Cai, TensorFlow.js Discussion
Debug mode will be of great help. Thanks for the info, Nikhil.

Regards,
Jay

Daniel Smilkov

unread,
Feb 21, 2019, 2:03:06 PM2/21/19
to Jayasudan Munsamy, Nikhil Thorat, Shanqing Cai, TensorFlow.js Discussion
Hi Jay,

> I created an empty tf.buffer with the shape I wanted and then used the .set() method of the tf.buffer to set the extracted feature values in it. This strangely set NaNs in the tf.buffer. Hope this clarifies your doubt.
Can you share the few lines of code where you made the buffer and used set()? I'm very curious. Note the arguments of set(value, row, col, depth, ...), where the first argument is the value and the following args are the coordinates. Also, using `buffer.set()` to set the values one by one is very slow. Use tf.buffer(shape, dtype, values) to create a buffer immediately, where values is a TypedArray.
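
Something along these lines (sketch; variable names are illustrative):

```js
// One-shot construction from a TypedArray instead of per-element set():
const values = new Float32Array(numFrames * featureLen);
frames.forEach((frame, t) => {
  values.set(mobilenet.predict(preprocess(frame)).dataSync(), t * featureLen);
});
const buf = tf.buffer([numFrames, featureLen], 'float32', values);
const input = buf.toTensor().expandDims(0);    // [1, numFrames, featureLen]
```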

Daniel



Jayasudan Munsamy

unread,
Feb 21, 2019, 10:13:27 PM2/21/19
to Daniel Smilkov, Nikhil Thorat, Shanqing Cai, TensorFlow.js Discussion
Hi Daniel,
I think you missed the context in which I answered that question about tf.buffer.set() :) ...here it is...

Context: Though I was passing the extracted features from the MobileNet model to the LSTM, **I missed calling dataSync() on the features tensor**. Because of this, when I added the extracted features into a tf.buffer, they were added as NaN. (When I printed the values in the log just before adding them to the tf.buffer, they printed correctly!...this is strange.) When I used dataSync() on the extracted features, they got added into the tf.buffer correctly.

Yep, I understand how to use tf.buffer, and it worked fine. But it was not useful for my purpose, and I had to replace it with tf.stack() to pass the inputs to the LSTM. Anyway, I don't see any issues with the tf.buffer.set() method as such. So, we are good.

Regards,
Jay