How and where are model weights stored?


emmanuel chappat

Jul 20, 2018, 11:54:36 AM
to TensorFlow.js Discussion
Hi,

In my use case, I frequently delete and then rebuild a model using `getConfig`/`fromConfig`, then call `setWeights` and resume training. How are the weights stored? Would it be possible to pass them to the model by reference, so that deleting the model does not delete or reset the weights?

Thanks a lot

David Soergel

Jul 20, 2018, 12:42:45 PM
to emmanuel chappat, TensorFlow.js Discussion
The weights are stored as Tensors, which keep a bit of metadata in a regular JavaScript object but refer to the actual data in a block of GPU memory.  You can obtain the weight Tensors via getWeights() and then pass them to setWeights(); this is effectively passing a reference because the GPU memory is left alone.  If, on the other hand, you call data() on each Tensor returned by getWeights() to get a Float32Array and then create new Tensors from those to pass back to setWeights(), that would incur a round trip from GPU to main memory and back, which it sounds like you are trying to avoid.
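
A minimal sketch of the two paths described above, for illustration only. The function names `transferWeights` and `roundTripWeights` are hypothetical, and `tf` is assumed to be the TensorFlow.js namespace passed in by the caller. Both models are assumed to have the same architecture. Whether the new model's weights stay valid once the old model is disposed depends on the internal reference counting mentioned later in this message, so the sketch does not dispose anything.

```javascript
// Path 1: hand the existing weight Tensors straight to the new model.
// Nothing is downloaded from the GPU here.
function transferWeights(oldModel, newModel) {
  const weights = oldModel.getWeights(); // array of Tensors
  newModel.setWeights(weights);
  return weights.length;
}

// Path 2: download every weight to main memory, then build new Tensors
// from the raw values. This is the GPU -> CPU -> GPU round trip.
// `tf` is the TensorFlow.js namespace, supplied by the caller (assumption).
async function roundTripWeights(oldModel, newModel, tf) {
  const oldWeights = oldModel.getWeights();
  // data() returns a Promise that resolves to a typed array.
  const values = await Promise.all(oldWeights.map((t) => t.data()));
  const fresh = values.map((v, i) => tf.tensor(v, oldWeights[i].shape));
  newModel.setWeights(fresh);
  // Returned so the caller can decide when to dispose these Tensors.
  return fresh;
}
```

Timing both functions on a real model would be one way to check whether the round trip matters in practice.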

(I'd be surprised if avoiding that round trip is worth the effort, BTW: unless the model is very large and you're reloading it extremely frequently, the GPU upload/download time is probably negligible compared to the rest of the work.  In your shoes I'd measure whether this is actually a performance concern before trying too hard to optimize it.)

Of course, if you're calling tf.disposeVariables() as we discussed before, that blows away the GPU memory.  So if you really want to pursue this, I think you'd have to sort out which weights you want to keep, and manually dispose the rest to avoid any leakage.  There is internal plumbing about reference counting, Tensor.keep() blocking GPU memory collection, etc. that I'd have to review to be sure about exactly how to do this correctly.  So before we get into that, are you sure it will really help with your use case?

-ds


Shanqing Cai

Jul 20, 2018, 12:47:37 PM
to David Soergel, emmanuel chappat, TensorFlow.js Discussion
Emmanuel,

In addition to what David wrote, may I ask why you need to call `fromConfig` / `getConfig` and rebuild a model? Depending on your use case, there might be easier and more efficient ways.

Best,
Shanqing



--
---
Shanqing Cai
Software Engineer
Google

emmanuel chappat

Jul 20, 2018, 1:29:02 PM
to TensorFlow.js Discussion, soe...@google.com, emmanuel...@gmail.com
Thanks a ton for the input guys.

The reason I am using `fromConfig` is that I have a React/Redux app that draws a GUI of the model with which the user can interact. I can't store a reference to the actual model in the React state for performance reasons, so I am using the `getConfig` object as a "bridge" between React and TF. I then build the model and feed it weights only when needed (training or inference). The rest of the time the user interacts with the "representation of the model".

To illustrate the back and forth, imagine the following use case: the user builds a model, starts training, then decides to remove one hidden layer and resume training. To do so, I would update the model config object in React; then, once training is resumed, rebuild the model using `fromConfig` and `setWeights`, dropping only the weights that are no longer needed.

From your answers, I guess I could either:
- Grab the weight references using `getWeights`, then feed them to the model on rebuild. However, I'd then have to handle cases where layers have been added and I don't hold weight data for those layers.
- Duplicate the weight data and hold onto that, re-uploading it to the GPU when building the model. Maybe less efficient but, as David pointed out, it's not clear by how much.

Would you guys recommend another approach?
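
One possible way to handle the added or removed layers in the first option is to key the saved weights by layer name and restore only those whose name and shapes still match. This is a sketch rather than a recommended implementation: `snapshotWeightsByLayer`, `restoreMatchingWeights`, and `sameShape` are hypothetical helpers. It assumes that layer names survive the `getConfig`/`fromConfig` round trip and that the saved Tensors have not been disposed in the meantime.

```javascript
// Compare two shape arrays element by element.
function sameShape(a, b) {
  return a.length === b.length && a.every((d, i) => d === b[i]);
}

// Collect each layer's weight Tensors, keyed by layer name.
// Only references are stored; no data is copied or downloaded.
function snapshotWeightsByLayer(model) {
  const saved = new Map();
  for (const layer of model.layers) {
    saved.set(layer.name, layer.getWeights());
  }
  return saved;
}

// Restore saved weights onto a rebuilt model wherever the layer name and
// every weight shape still match. Returns the names of layers that have
// weights but were left at their fresh initialization.
function restoreMatchingWeights(model, saved) {
  const skipped = [];
  for (const layer of model.layers) {
    const current = layer.getWeights();
    const previous = saved.get(layer.name);
    const compatible =
      previous !== undefined &&
      previous.length === current.length &&
      previous.every((t, i) => sameShape(t.shape, current[i].shape));
    if (compatible) {
      layer.setWeights(previous);
    } else if (current.length > 0) {
      skipped.push(layer.name);
    }
  }
  return skipped;
}
```

The shape check matters because removing a hidden layer can change the input size of the layer that follows it, in which case that layer's old kernel no longer fits and it has to be retrained from its fresh initialization.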