Hi gRPC dev team,
I have recently been working on a pub/sub project on a RHEL 7 system. We use gRPC as the messaging layer: clients issue unary RPC calls to publish to the gRPC server, and the server then forwards each message to the subscribers over a server-streaming RPC.
While running performance tests to measure the throughput of the gRPC server, we found some very strange behavior: when we increase the rate of unary RPC calls to the gRPC server (e.g., from 15k rps to 50k rps), throughput drops after reaching a peak.
After investigating, we found that many "default-executor" threads are occupying CPU whenever the performance drop occurs; when those threads are not busy, everything works fine. I have also posted the CPU utilization at a client rate of 50k rps, both with and without the performance drop.
Is there a reasonable explanation for this? Technically, when the system is saturated, the throughput should plateau rather than drop.
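To make that expectation concrete, here is a minimal sketch of the idealized saturation model we have in mind: throughput follows the offered load until it reaches the server's capacity, then stays flat. The function name and the `capacity_rps` parameter are hypothetical and only for illustration; they are not taken from gRPC or from our attached code, and we have not measured the actual capacity.

```cpp
#include <algorithm>
#include <cstdint>

// Idealized saturation model (an assumption, not a measurement):
// completed RPCs per second cannot exceed either the offered load
// or the server's capacity, so the curve plateaus instead of dropping.
// `capacity_rps` is a hypothetical value chosen by the caller.
std::uint64_t ExpectedThroughputRps(std::uint64_t offered_rps,
                                    std::uint64_t capacity_rps) {
  return std::min(offered_rps, capacity_rps);
}
```

Under this model, raising the offered load from 15k rps to 50k rps should never lower the completed rate, which is why the drop we observe looks wrong to us.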
P.S. We know that gRPC has moved from the "default-executor" to the "EventEngine" thread model, but we also see a performance drop when we adopt EventEngine in gRPC 1.62 for our CQ-based async server. Could you help us understand why it behaves this way?
Thanks in advance! I have also attached all the code we use to reproduce the issue. Please take a look.