gRPC performance (Java)

Gordan Krešić

Dec 30, 2022, 11:35:44 AM
to grp...@googlegroups.com
Out of curiosity, I decided to compare performance of making a gRPC call vs. making a REST call. To my surprise, gRPC turned out to be several times slower. I'm hoping that I'm just missing something obvious.

Repo with tests:

https://github.com/gkresic/muddy-waters

Build it with (you'll need Java 17 somewhere on the path):

./gradlew build

In that repo there are several REST services implemented using different Java REST libraries and frameworks; the one that uses gRPC is named 'plankton'.

The general benchmark across all subprojects is to send multiple objects (called 'Payload' in the sources), each with one integer and one textual field, and to receive only one such object as the response (the method for calculating that response is deliberately trivial and not important here). REST services are tested with wrk (https://github.com/wg/wrk) and gRPC with ghz (https://ghz.sh/). Just to rule out ghz as the cause of the low performance, I've also implemented my own simple Java gRPC client benchmark.
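
The core of that client boils down to a client-streaming call, roughly like this (a minimal sketch, not the actual repo code; the generated EatServiceGrpc/Payload names and the setNumber/setText accessors are my assumptions based on the description above):

import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;
import io.grpc.stub.StreamObserver;
import java.util.concurrent.CountDownLatch;

public class EatStreamClient {
    public static void main(String[] args) throws InterruptedException {
        // Plaintext channel, matching the --insecure ghz flag used below.
        ManagedChannel channel = ManagedChannelBuilder
                .forAddress("localhost", 17001)
                .usePlaintext()
                .build();

        EatServiceGrpc.EatServiceStub stub = EatServiceGrpc.newStub(channel);
        CountDownLatch done = new CountDownLatch(1);

        // The response observer receives the single Payload returned for the whole stream.
        StreamObserver<Payload> request = stub.eatStream(new StreamObserver<Payload>() {
            @Override public void onNext(Payload response) { /* record the result */ }
            @Override public void onError(Throwable t) { done.countDown(); }
            @Override public void onCompleted() { done.countDown(); }
        });

        // Send ten Payloads, mirroring payload-10.json.
        for (int i = 0; i < 10; i++) {
            request.onNext(Payload.newBuilder()
                    .setNumber(i)               // assumed field names
                    .setText("payload-" + i)
                    .build());
        }
        request.onCompleted();

        done.await();
        channel.shutdownNow();
    }
}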

To run plankton:

cd plankton/build/install/plankton/
bin/plankton

Benchmark using ghz (from repo root):

ghz --insecure --proto=plankton/src/main/proto/payload.proto --call=muddywaters.plankton.EatService/EatStream --duration=10s --duration-stop=wait --data-file=payload-10.json localhost:17001

On my machine it gives me ~9k requests/sec.

Now compare this to the 'dolphin' subproject, which implements a REST endpoint using Vert.x (built on the same Netty that gRPC uses):

cd dolphin/build/install/dolphin/
bin/dolphin

Benchmark using wrk (from repo root):

wrk -t4 -c400 -d10s -s payload-10.lua http://localhost:16006/eat

It easily goes above 100k requests/sec.
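
For context, such a Vert.x endpoint has roughly this shape (an illustrative sketch only, not the actual dolphin code; port and path taken from the wrk command above):

import io.vertx.core.Vertx;

public class EatServer {
    public static void main(String[] args) {
        Vertx vertx = Vertx.vertx();
        vertx.createHttpServer()
                .requestHandler(req -> req.body().onSuccess(body -> {
                    // Decode the posted payloads, compute the trivial response, reply.
                    // Echoing the body stands in for that logic here.
                    req.response().end(body);
                }))
                .listen(16006);
    }
}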

To explore further, I wrote the simplest possible gRPC service: it accepts an empty message ('Void') and returns that same message as the response, just to minimize the effect of message encoding/decoding/processing. You can test it with:

ghz --insecure --proto=plankton/src/main/proto/ping.proto --call=muddywaters.plankton.PingService/Ping --duration=10s --duration-stop=wait localhost:17001

However, even that simplest of services maxes out at ~14k requests/sec.
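
And server-side there is almost nothing to it (a sketch; the class names are my assumptions, and 'Void' here is the message generated from ping.proto, not java.lang.Void):

import io.grpc.stub.StreamObserver;

public class PingServiceImpl extends PingServiceGrpc.PingServiceImplBase {
    @Override
    public void ping(Void request, StreamObserver<Void> responseObserver) {
        // Echo the empty message straight back.
        responseObserver.onNext(request);
        responseObserver.onCompleted();
    }
}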

Like I said, I wrote my own benchmark client that runs against three services:

* Ping: receives empty message and returns it in response
* EatOne: receives one Payload and returns one Payload
* EatStream: receives a stream of Payloads and returns one Payload - the gRPC implementation of my "standardized" test (see the server-side sketch after this list)
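
For reference, the server side of EatStream is a client-streaming handler along these lines (a sketch; field and class names are assumptions, and the summing stands in for the deliberately trivial response calculation):

import io.grpc.stub.StreamObserver;

public class EatServiceImpl extends EatServiceGrpc.EatServiceImplBase {
    @Override
    public StreamObserver<Payload> eatStream(StreamObserver<Payload> responseObserver) {
        return new StreamObserver<Payload>() {
            private int sum = 0;

            @Override
            public void onNext(Payload payload) {
                sum += payload.getNumber(); // assumed field name
            }

            @Override
            public void onError(Throwable t) {
                // Nothing to clean up in this sketch.
            }

            @Override
            public void onCompleted() {
                // One Payload back for the whole stream.
                responseObserver.onNext(Payload.newBuilder().setNumber(sum).build());
                responseObserver.onCompleted();
            }
        };
    }
}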

Run it with (from repo root):

./gradlew :plankton:benchmark

It will run all three tests *three* times, to keep JVM JIT warm-up out of the measurements. However, even this benchmark is not much faster:

Ping: 31k requests/sec
EatOne: 30k requests/sec
EatStream: 14k requests/sec (reminder: the REST implementation from the 'dolphin' subproject gives over 100k requests/sec for the same functionality)

What am I missing?

-gkresic.

Sergii Tkachenko

Jan 3, 2023, 5:57:21 PM
to grpc.io
Just an idea: did you try running `ghz` with the `--async` flag? It might make sense to play around with the `--skipFirst` flag as well, so that the first requests against a not-yet-warmed JVM don't bias the results.
Also, it looks like the REST wrk benchmark uses 400 connections while the gRPC ghz run uses just one. Consider trying the `--connections=400` ghz argument.
Docs: https://ghz.sh/docs/usage
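
For example, something like this (an untested variation on the command from the first post; the exact values would need tuning):

ghz --insecure --proto=plankton/src/main/proto/payload.proto --call=muddywaters.plankton.EatService/EatStream --duration=10s --duration-stop=wait --data-file=payload-10.json --async --concurrency=400 --connections=400 --skipFirst=100 localhost:17001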

Gordan Krešić

Jan 4, 2023, 11:24:55 AM
to grp...@googlegroups.com
On 03. 01. 2023. 23:57, 'Sergii Tkachenko' via grpc.io wrote:
> Just an idea: did you try running `ghz` with the `--async` flag? It might make sense to play around with the `--skipFirst` flag as well, so that the first requests against a not-yet-warmed JVM don't bias the results.
> Also, it looks like the REST wrk benchmark uses 400 connections while the gRPC ghz run uses just one. Consider trying the `--connections=400` ghz argument.
> Docs: https://ghz.sh/docs/usage

Already did all of that and more:

* played with `--async`, `--concurrency` and `--connections`

* invoked ghz multiple times to make sure the server is "warm"

Like I already mentioned, I even wrote my own benchmark, which gave *somewhat* better throughput than ghz, but still way lower than the REST tests.

If no one can spot anything I missed on the server side, I'll also write my own benchmarks for the REST endpoints - that way I can guarantee that the client logic is the same.

-gkresic.

Gordan Krešić

Jan 10, 2023, 8:51:15 AM
to grp...@googlegroups.com
On 30. 12. 2022. 17:35, Gordan Krešić wrote:
> Out of curiosity, I decided to compare performance of making a gRPC call vs. making a REST call. To my surprise, gRPC turned out to be several times slower. I'm hoping that I'm just missing something obvious.

No, nothing other than the fact that one should never mix benchmarking methods :)

I went on and wrote a full benchmark suite that includes the "official" gRPC server for Java, a Vert.x implementation of a gRPC server, and a Vert.x Web REST server.

The benchmarks are:

* gRPC Ping: simple call, with empty parameters and response
* gRPC EatOne: one Payload object as parameter and one as response
* gRPC EatStream: stream of Payload messages in the request (tested with 10 and 100 messages), with only one returned as response
* REST EatStream: same as the previous one, but over REST (content still encoded using Protobuf)

In all cases, the gRPC client is built using the "official" gRPC client for Java and the REST client is built using Retrofit.
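
For the Retrofit side, the interface amounts to something like the following sketch (the path, names and raw-bytes approach are assumptions, not taken from the repo):

import okhttp3.RequestBody;
import okhttp3.ResponseBody;
import retrofit2.Call;
import retrofit2.http.Body;
import retrofit2.http.POST;

public interface EatApi {
    // Protobuf-encoded payloads travel as raw bytes in both directions;
    // a protobuf Retrofit converter would be the alternative.
    @POST("/eat")
    Call<ResponseBody> eat(@Body RequestBody payloads);
}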

Results:

Benchmark                                        (payloadFileName)   Mode      Score  Units
c.s.g.j.grpc.GrpcOfficial_1_Ping.benchmark                     N/A  thrpt  16424.231  ops/s
c.s.g.j.grpc.GrpcOfficial_2_EatOne.benchmark                   N/A  thrpt  17131.613  ops/s
c.s.g.j.grpc.GrpcOfficial_3_EatStream.benchmark    payload-10.json  thrpt   8194.403  ops/s
c.s.g.j.grpc.GrpcOfficial_3_EatStream.benchmark   payload-100.json  thrpt   1705.228  ops/s

c.s.g.j.grpc.GrpcVertx_1_Ping.benchmark                        N/A  thrpt  18484.730  ops/s
c.s.g.j.grpc.GrpcVertx_2_EatOne.benchmark                      N/A  thrpt  18138.326  ops/s
c.s.g.j.grpc.GrpcVertx_3_EatStream.benchmark       payload-10.json  thrpt  15511.513  ops/s
c.s.g.j.grpc.GrpcVertx_3_EatStream.benchmark      payload-100.json  thrpt   7191.088  ops/s

c.s.g.j.rest.RestVertx_EatStream.benchmark         payload-10.json  thrpt   3588.211  ops/s
c.s.g.j.rest.RestVertx_EatStream.benchmark        payload-100.json  thrpt   3394.641  ops/s

The REST backend is still faster than the "official" gRPC server when streaming a large number of payloads, but slower when streaming only a few.

The Vert.x gRPC backend is the fastest of all, and its relative advantage grows with the number of payloads streamed.
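
Each benchmark class boils down to roughly the following (a sketch assuming the suite uses JMH in throughput mode with a blocking stub; the generated names and the port are assumptions):

import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;
import org.openjdk.jmh.annotations.*;

@State(Scope.Benchmark)
@BenchmarkMode(Mode.Throughput)
public class GrpcOfficial_2_EatOne {

    private ManagedChannel channel;
    private EatServiceGrpc.EatServiceBlockingStub stub;

    @Setup
    public void setup() {
        // Port assumed; the server runs in a separate process.
        channel = ManagedChannelBuilder.forAddress("localhost", 17001)
                .usePlaintext()
                .build();
        stub = EatServiceGrpc.newBlockingStub(channel);
    }

    @Benchmark
    public Payload benchmark() {
        // One unary round-trip per invocation; JMH reports ops/s.
        return stub.eatOne(Payload.newBuilder().setNumber(1).setText("x").build());
    }

    @TearDown
    public void tearDown() {
        channel.shutdownNow();
    }
}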

Repo: https://github.com/gkresic/grpc-bench

-gkresic.