Sending a single huge message in one unary call vs sending chunks of messages with streaming


kevin...@gmail.com

May 21, 2020, 6:17:33 PM
to grpc.io
Hey All,

I have been testing and benchmarking my application with gRPC (C++). I have noticed a performance difference between the following two cases:

1. Sending a large payload (100 MB+) in a single unary RPC.
2. Breaking the payload into 1 MB pieces and sending them as messages over a client-streaming RPC.

In both cases, the server processes the data only after receiving all of it and then sends a response. I have found that case 2 has lower latency than case 1.

I don't quite understand why breaking the large message into smaller pieces outperforms the unary call here. I'm wondering if anyone has insight into this.
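For reference, the two cases look roughly like this with the C++ sync API (simplified sketch; the service, message, and field names here are made up, not my real ones):

// Hypothetical proto for the sketch:
//   service Transfer {
//     rpc SendUnary(Payload) returns (Ack);        // case 1
//     rpc SendChunks(stream Chunk) returns (Ack);  // case 2
//   }

#include <memory>
#include <string>
#include <grpcpp/grpcpp.h>
#include "transfer.grpc.pb.h"  // hypothetical generated header

const size_t kChunkSize = 1 << 20;  // 1 MB

// Case 1: the whole 100 MB+ payload in a single unary call.
grpc::Status SendUnary(Transfer::Stub& stub, const std::string& data) {
  grpc::ClientContext ctx;
  Payload request;
  request.set_data(data);  // 100 MB+ bytes field
  Ack response;
  return stub.SendUnary(&ctx, request, &response);
}

// Case 2: the same payload split into 1 MB messages on a client stream.
grpc::Status SendChunks(Transfer::Stub& stub, const std::string& data) {
  grpc::ClientContext ctx;
  Ack response;
  std::unique_ptr<grpc::ClientWriter<Chunk>> writer(
      stub.SendChunks(&ctx, &response));
  for (size_t off = 0; off < data.size(); off += kChunkSize) {
    Chunk chunk;
    chunk.set_total_size(data.size());  // lets the server preallocate
    chunk.set_data(data.substr(off, kChunkSize));
    if (!writer->Write(chunk)) break;  // stream broken by the peer
  }
  writer->WritesDone();
  return writer->Finish();
}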

I have searched online and found a related GitHub issue about the optimal message size for streaming large payloads: https://github.com/grpc/grpc.github.io/issues/371

I'd like to hear any ideas or suggestions.

Thx.

Best,
Kevin

Josh Humphries

May 21, 2020, 8:42:27 PM
to kevin...@gmail.com, grpc.io
Is the server doing anything? One reason streaming typically outperforms unary is that you can begin processing as soon as you receive the first chunk, whereas with a unary RPC, your handler cannot be called until the entire request has been received and unmarshalled.

If this is a load test, where you are sending significant load at the server and measuring the difference, then the memory-access pattern of streaming may be friendlier to your allocator/garbage collector, since you are allocating smaller, shorter-lived chunks of memory. (And there is, of course, the obvious memory-use advantage: you don't need to buffer the entire 100 MB when you stream.)

If this is a no-op server, I would not expect much difference in performance -- in fact, streaming may have a slight disadvantage due to the per-message envelope and less effective compression (if you are using compression). Depending on the runtime implementation, there could be an advantage just due to pipelining: it's possible that your handler thread is unmarshalling one message in parallel with a framework thread handling I/O and decoding the wire protocol, whereas with a unary call it's all handled sequentially.
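To make the first point concrete: with the C++ sync API, a streaming handler can consume each chunk as it arrives rather than waiting for one fully-buffered request. A rough sketch (hypothetical message types reused from your mail; Process() stands in for whatever work the server does):

class TransferImpl final : public Transfer::Service {
 public:
  grpc::Status SendChunks(grpc::ServerContext* ctx,
                          grpc::ServerReader<Chunk>* reader,
                          Ack* response) override {
    Chunk chunk;
    while (reader->Read(&chunk)) {
      // Work on each ~1 MB piece immediately; the transport keeps
      // receiving subsequent frames while this runs.
      Process(chunk.data());
    }
    return grpc::Status::OK;
  }
};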

----

Josh Humphries

FullStory  |  Atlanta, GA

Software Engineer

j...@fullstory.com




kevin...@gmail.com

May 21, 2020, 11:06:33 PM
to grpc.io
Thx for the reply, Josh.

Let me clarify:

1. I'm sending a 100 MB+ string as a protobuf bytes field. In the streaming case, the server preallocates a 100 MB+ buffer (the size is provided in the first streaming message) and keeps appending the incoming chunks to it; after collecting and appending all the bytes, it responds (sketch below, after point 2). In the unary case, it responds as soon as it receives the whole string. So I would say the server is closer to a no-op than to preprocessing the data.

2. It is not a load-test setup; both client and server are synchronous and single-threaded. No compression is used.
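Concretely, the streaming handler is roughly this (simplified; same made-up names as in my first mail):

class TransferImpl final : public Transfer::Service {
 public:
  grpc::Status SendChunks(grpc::ServerContext* ctx,
                          grpc::ServerReader<Chunk>* reader,
                          Ack* response) override {
    std::string buffer;
    Chunk chunk;
    bool first = true;
    while (reader->Read(&chunk)) {
      if (first) {
        buffer.reserve(chunk.total_size());  // preallocate 100 MB+ up front
        first = false;
      }
      buffer.append(chunk.data());  // append each 1 MB piece
    }
    // All bytes collected; now respond.
    return grpc::Status::OK;
  }
};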

"Depending on the runtime implementation, there could be an advantage just due to pipelining: it's possible that your handler thread is handling unmarshalling of a message in parallel with a framework thread handling I/O and decoding the wire protocol. Whereas with a unary call, it's all handled sequentially." 

This sounds very interesting to me. I would like to see where this pipelining actually happens; do you have any reference code? Thx, Josh!

Josh Humphries

May 21, 2020, 11:16:04 PM
to kevin...@gmail.com, grpc.io
On Thu, May 21, 2020 at 11:06 PM <kevin...@gmail.com> wrote:
This sounds very interesting to me. I would like to see where this pipelining actually happens; do you have any reference code? Thx, Josh!

There's no explicit pipelining. It's the fact that your handler code is started as soon as the headers are received, and when it asks for the next message in the stream, it may handle the unmarshalling. I don't think Java does it this way, but Go does. With the generated Go stream clients, the handler goroutine receiving a message is where the actual protobuf unmarshalling happens; that can run concurrently with the gRPC framework goroutines, which may be decoding subsequent frames of the HTTP/2 stream. But with a unary RPC, the request unmarshalling cannot begin until the last byte of the request has been received.
 