Hmmm... Depending on what you are seeing, there are a lot of possible explanations.
1. HTTP/2 flow control prevents gRPC from sending. Note there is stream-level and connection-level flow control. By default grpc-java uses 1 MB as the window here, although your proxy is what is providing the window to the client. I only mention the 1 MB because the proxy will need to send data to a gRPC server and will be limited to 1 MB initially. Other implementations use 64 KB and auto-size the window.
- On the server side there is a delay before we let more than 1 MB be sent. We wake up the application and route the request, run interceptors, and eventually the application stub asks for the request message. Only at that point does the server allow more to be sent. If you suspect this is related to what you see, you can change the default window and check whether the behavior changes. Be aware that the proxy will have its own buffering/flow control.
2. More than one request is being sent, so they are interleaved. We currently interleave in 1 KB chunks.
3. The client is slow at protobuf encoding. We stream out data while doing protobuf encoding, so (with some other conditions I won't get into) it is possible to see the first part of a message before the end of the message has been serialized on the client side. This is purely CPU-limited, but could be noticed during a long GC, for example.
4. A laundry list of other things, like dropped packets.
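
To make point 1 a bit more concrete, here is a rough sketch of how the stream-level and connection-level windows together cap what a sender may put on the wire. This is a toy model, not grpc-java or Netty code: `FlowControlSketch`, `Window`, and `send` are names made up for illustration, and the 3 MB message size is an assumption. Only the 1 MB window comes from the default mentioned above.

```java
/** Toy model of HTTP/2 flow control. Hypothetical names; not grpc-java code. */
public class FlowControlSketch {

    /** Sender-side view of one flow-control window (stream or connection). */
    static final class Window {
        private int available;

        Window(int initialBytes) {
            this.available = initialBytes;
        }

        int available() {
            return available;
        }

        void consume(int bytes) {
            available -= bytes;
        }

        /** Models the receiver granting more credit (a WINDOW_UPDATE frame). */
        void update(int bytes) {
            available += bytes;
        }
    }

    /**
     * Sends as much of the pending data as BOTH windows allow and
     * returns the number of bytes actually sent.
     */
    static int send(int pendingBytes, Window stream, Window connection) {
        int allowed = Math.min(pendingBytes,
                Math.min(stream.available(), connection.available()));
        stream.consume(allowed);
        connection.consume(allowed);
        return allowed;
    }

    public static void main(String[] args) {
        int window = 1024 * 1024;       // 1 MB, the default window discussed above
        int message = 3 * 1024 * 1024;  // assumed 3 MB request, for illustration

        Window stream = new Window(window);
        Window connection = new Window(window);

        // The first burst is capped by the initial window.
        int sent = send(message, stream, connection);
        System.out.println("sent before stall: " + sent);

        // Until the receiver grants more credit, nothing else can go out.
        int whileStalled = send(message - sent, stream, connection);
        System.out.println("sent while stalled: " + whileStalled);

        // The application asks for the message, so the receiver opens the windows.
        stream.update(window);
        connection.update(window);
        int afterUpdate = send(message - sent, stream, connection);
        System.out.println("sent after update: " + afterUpdate);
    }
}
```

The gap between the first and third `send` is the stall described above: it lasts for as long as the receiving side takes to route the request, run interceptors, and have the stub ask for the message. In a real setup the window itself is a transport setting (in grpc-java the Netty builders have a `flowControlWindow` option), and the proxy's own windows apply on the client-facing hop.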