2GB message limit

Gareth Williams

Apr 9, 2022, 9:59:07 AM
to prot...@googlegroups.com
Hi there,

I'm creating a model in TensorFlow, and I think it's failing because it hits the protobuf message hard limit when compiling with `@tf.function`.

I printed the message length just before the error and it's 2217676732, so it makes sense that it's hitting the 2GB limit.
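As a quick sanity check on that number, here is a minimal sketch of the arithmetic. It assumes the printed length is in bytes and that the limit is the largest signed 32-bit integer (2^31 - 1); the variable names are just for illustration.

```python
# Assumption: the protobuf hard limit is the largest signed 32-bit integer.
PROTOBUF_MAX_BYTES = 2**31 - 1

# The length printed just before the error (assumed to be in bytes).
message_length = 2217676732

# How far past the limit the message is.
overshoot = message_length - PROTOBUF_MAX_BYTES

print(f"limit:     {PROTOBUF_MAX_BYTES}")
print(f"message:   {message_length}")
print(f"overshoot: {overshoot} bytes")
```

Under those assumptions the message is roughly 70 MB over the limit, so trimming a little would not be enough on its own if the graph keeps growing.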

I'm just wondering, is there any way to get around this that anyone knows of? I read about Cap'n Proto here https://stackoverflow.com/a/34186672 but am not sure how I'd be able to connect this to TensorFlow. 

Is there any way to split/recombine the messages? Perhaps this is more of a TensorFlow question, I'll perhaps cross-post on the TensorFlow group in case anyone there has ideas.

Many thanks,
Gareth

Deanna Garcia

May 2, 2022, 2:16:12 PM
to Protocol Buffers
In general, the best ways to get around the message limit are either to split the message into smaller ones and recombine them afterwards, or to stream the data. See this documentation for more information. Since this goes through TensorFlow, some of these techniques might not work, so reaching out to them directly is probably a good idea.
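For what split-and-recombine can look like at the byte level, here is a minimal sketch. It works on an already-serialized payload and is not tied to protobuf or TensorFlow; the helper names `split_into_chunks` and `recombine` are hypothetical, and in practice each chunk would usually be wrapped in its own small message or written to a stream.

```python
from typing import Iterable, Iterator


def split_into_chunks(payload: bytes, chunk_size: int) -> Iterator[bytes]:
    """Yield consecutive slices of payload, each at most chunk_size bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(payload), chunk_size):
        yield payload[start:start + chunk_size]


def recombine(chunks: Iterable[bytes]) -> bytes:
    """Concatenate chunks back into the original payload (order matters)."""
    return b"".join(chunks)


# Small stand-in payload; a real one would be the serialized message.
payload = bytes(range(256)) * 4
chunks = list(split_into_chunks(payload, chunk_size=300))
restored = recombine(chunks)
```

The receiver has to see the chunks in the original order, so a real protocol would normally carry a sequence number or rely on an ordered stream.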

Gareth Williams

Jun 14, 2022, 11:25:47 AM
to Protocol Buffers
Thanks Deanna,

I actually figured out what the problem was. 

The function `factored_joint_mvn` in `tensorflow_probability/python/sts/internal/util.py` uses `LinearOperatorBlockDiag` without passing a `name` argument.

If no name is provided, `LinearOperatorBlockDiag` in turn builds one by joining the names of all its input operators:

```
if name is None:
    # Using ds to mean direct sum.
    name = "_ds_".join(operator.name for operator in operators)
```

Because we have a rather large state space model, this produced enormous names in the graph (literally tens of thousands of characters each). Repeated across a large graph, those names made the protobuf messages ridiculously large.

This was the only thing sending the protobuf message lengths sky high. Using our own internal equivalent and assigning a name attribute reduced the protobuf message sizes to normal levels and reduced the memory of the process when running by several GB.
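To illustrate how the default naming can blow up, here is a small pure-Python simulation. It does not use TensorFlow: `FakeOperator`, `block_diag_name` and `build_nested` are hypothetical stand-ins that only mimic the name-joining rule quoted above, and the nesting of block-diagonal operators inside each other is an assumption made for the demo rather than a description of our actual model.

```python
class FakeOperator:
    """Hypothetical stand-in for a linear operator; it only carries a name."""

    def __init__(self, name):
        self.name = name


def block_diag_name(operators, name=None):
    """Mimic the default naming rule: join all input names with '_ds_'."""
    if name is None:
        name = "_ds_".join(op.name for op in operators)
    return name


def build_nested(depth, width, explicit_name=None):
    """Combine `width` operators into one, `depth` times, and return the final name."""
    level = [FakeOperator(f"op{i}") for i in range(width)]
    for _ in range(depth):
        combined = FakeOperator(block_diag_name(level, name=explicit_name))
        # Assumption for the demo: the combined operator is reused as an input
        # `width` times at the next level.
        level = [combined] * width
    return level[0].name


default_name = build_nested(depth=6, width=4)
explicit_name = build_nested(depth=6, width=4, explicit_name="block_diag")

print(len(default_name))   # grows roughly by a factor of `width` per level
print(len(explicit_name))  # stays the length of the explicit name
```

With a default name, each level multiplies the name length by about the number of inputs, whereas an explicit name keeps it constant. That matches what we saw once we assigned names ourselves.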

Hope that's helpful to someone else who hits the same problem one day!

Gareth
