The weights are stored as Tensors, which keep a bit of metadata in a regular JavaScript object but refer to the actual data in a block of GPU memory. You can obtain the weight Tensors via getWeights() and then pass them to setWeights(); this is effectively passing references, because the GPU memory is left alone. If, on the other hand, you call data() (or dataSync()) on those Tensors to get Float32Arrays and then create new Tensors from them to pass back to setWeights(), that incurs a round trip from GPU to main memory and back, which it sounds like you are trying to avoid.
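To make that concrete, here's a sketch of the two paths (the function names and the assumption that you're copying between two LayersModels are mine, just for illustration):

```ts
import * as tf from '@tensorflow/tfjs';

// Reference-style copy: the Tensors (and their GPU buffers) are handed
// over as-is; no data leaves the GPU.
function copyByReference(source: tf.LayersModel, target: tf.LayersModel) {
  target.setWeights(source.getWeights());
}

// Round-trip copy: each weight is downloaded to a Float32Array and
// re-uploaded as a fresh Tensor; this is the path you're trying to avoid.
async function copyViaCpu(source: tf.LayersModel, target: tf.LayersModel) {
  const weights = source.getWeights();
  const values = await Promise.all(weights.map(w => w.data()));
  const rebuilt = values.map((v, i) => tf.tensor(v, weights[i].shape));
  target.setWeights(rebuilt);
  // Releasing our handles should be safe here, since the target's
  // variables hold their own references to the underlying data.
  rebuilt.forEach(t => t.dispose());
}
```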
(I'd be surprised if avoiding that round trip is worth the effort, by the way; unless the model is very large and you're reloading it extremely frequently, the GPU upload/download time is probably negligible compared to the rest of the work. In your shoes I'd measure whether this is actually a performance concern before trying too hard to optimize it.)
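A crude measurement is easy enough. Something like the following (modelA and modelB are placeholders; run it in an async context, and note that WebGL work is queued asynchronously, so treat the numbers as rough and ignore the first warm-up iteration):

```ts
// Times a copy function end to end and logs the wall-clock cost.
async function timeIt(label: string, fn: () => Promise<void> | void) {
  const start = performance.now();
  await fn();
  console.log(`${label}: ${(performance.now() - start).toFixed(1)} ms`);
}

await timeIt('by reference', () => copyByReference(modelA, modelB));
await timeIt('via CPU round trip', () => copyViaCpu(modelA, modelB));
```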
Of course, if you're calling tf.disposeVariables() as we discussed before, that blows away the GPU memory for every registered variable. So if you really want to pursue this, I think you'd have to sort out which weights you want to keep, and manually dispose the rest to avoid any leakage. There is internal plumbing around reference counting, tf.keep() blocking disposal inside tf.tidy(), etc. that I'd have to review to be sure exactly how to do this correctly. So before we get into that: are you sure it will really help with your use case?
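If it does, one possible shape for it (a sketch only, with made-up helper names, and relying on my possibly-stale memory of the refcounting) would be to clone the Tensors you want to survive before wiping the variables:

```ts
// Snapshot the weights we care about. clone() gives us independent
// Tensor handles; assuming the refcounting works the way I remember,
// the underlying GPU buffers stay alive even after the variables that
// originally owned them are disposed.
const saved = model.getWeights().map(w => w.clone());

tf.disposeVariables();  // frees every registered variable

// buildModel() is a stand-in for however you recreate the
// architecture (e.g. reloading the topology).
const fresh = buildModel();
fresh.setWeights(saved);          // restored without a CPU round trip
saved.forEach(t => t.dispose());  // drop our extra references
```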
-ds