```java
// Server setup: Netty transport with a fixed-size application executor
// sized to the number of available cores.
server = NettyServerBuilder.forPort(port)
    .addService(testService)
    .executor(Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors()))
    .build()
    .start();
```
On my client, I am using Guava's RateLimiter to send messages on a bi-di stream at 1000 per second (all using a shared channel and stub). I'm submitting each message from a Runnable just to parallelize the work.

The same behavior happens if I call `onNext` directly, without the task-submission step.
Code roughly looks like:

```java
final long startTime = System.currentTimeMillis();
final long oneMinute = TimeUnit.MINUTES.toMillis(1);
final RateLimiter rateLimiter = RateLimiter.create(1000);
final StreamObserver<TestMessageRequest> requestObserver =
    client.asyncStub.testMessageRpc(client.replyObserver);

while (System.currentTimeMillis() - startTime < oneMinute) {
    rateLimiter.acquire(1);
    threadPool.submit(() -> {
        TestMessageRequest request = TestMessageRequest.getDefaultInstance();
        requestObserver.onNext(request);
    });
}
```
So anywhere from the 20-60 second mark, my server throws:

```
SEVERE: Exception while executing runnable io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1@48544a03
java.lang.OutOfMemoryError: GC overhead limit exceeded
```
Am I doing something wrong? Is there any way to have the server support this high a load?
That doesn't sound right, unless the parallelism is across streams. `onNext` is not thread-safe; you'd need to hold a lock when calling it from multiple threads simultaneously. This isn't the cause of your problem, but it is a problem.
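For illustration, a minimal sketch of one way to serialize those calls: a wrapper that delegates to the real observer while holding its own monitor. The `SynchronizedStreamObserver` class here is hypothetical, not part of gRPC's API.

```java
import io.grpc.stub.StreamObserver;

// Hypothetical wrapper that serializes all calls into a delegate
// StreamObserver, since StreamObserver implementations are not
// thread-safe.
final class SynchronizedStreamObserver<T> implements StreamObserver<T> {
    private final StreamObserver<T> delegate;

    SynchronizedStreamObserver(StreamObserver<T> delegate) {
        this.delegate = delegate;
    }

    @Override
    public synchronized void onNext(T value) { delegate.onNext(value); }

    @Override
    public synchronized void onError(Throwable t) { delegate.onError(t); }

    @Override
    public synchronized void onCompleted() { delegate.onCompleted(); }
}
```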
Is that 1000 QPS total, and all over one stream? A single stream can't use much more than a single core of processing (excluding protobuf and the application), so you may want to use more streams. But 1k QPS is really low. We see 750k QPS ("Streaming secure throughput QPS (8 core client to 8 core server)") between a client and server with 8 cores each. Even with non-streaming RPCs we see 250k QPS.
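As a rough sketch of the multi-stream suggestion (reusing `asyncStub`, `replyObserver`, `rateLimiter`, `startTime`, and `oneMinute` from the snippet above; the stream count is an arbitrary illustration), the client could round-robin across several concurrent streams:

```java
import io.grpc.stub.StreamObserver;
import java.util.ArrayList;
import java.util.List;

// Open several independent streams on the same channel/stub and
// round-robin requests across them, so the load isn't pinned to a
// single stream (and thus roughly a single core). Note that responses
// from all streams will hit the shared replyObserver concurrently, so
// it needs its own synchronization.
final int numStreams = 4; // illustrative; tune for your hardware
final List<StreamObserver<TestMessageRequest>> streams = new ArrayList<>();
for (int i = 0; i < numStreams; i++) {
    streams.add(client.asyncStub.testMessageRpc(client.replyObserver));
}

long sent = 0;
while (System.currentTimeMillis() - startTime < oneMinute) {
    rateLimiter.acquire(1);
    // Each stream is only ever written from this single thread, so the
    // onNext calls on any given stream are naturally serialized.
    streams.get((int) (sent++ % numStreams))
           .onNext(TestMessageRequest.getDefaultInstance());
}
for (StreamObserver<TestMessageRequest> stream : streams) {
    stream.onCompleted();
}
```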
> Is that 1000 QPS total, and all over one stream? A single stream can't use much more than a single core of processing (excluding protobuf and the application), so you may want to use more streams. But 1k QPS is really low. We see 750k QPS ("Streaming secure throughput QPS (8 core client to 8 core server)") between a client and server with 8 cores each. Even with non-streaming RPCs we see 250k QPS.

I'm trying 1000 QPS continuously for a minute, all over one stream. FWIW, 500 QPS works fine, and I tried that for up to 5 minutes. For the charts you showed, are you able to share the server and client configs/code that produced those metrics? Also, I am still using gRPC 1.5; I know there have been performance improvements since then, but I think 1000 QPS is still very low even for that version.
Anything else you could recommend? Thanks again.