Hi there,
Thanks for your question. We are somewhat short on documentation for this. The C++ version of the test is in test/cpp/qps; it measures several performance characteristics and can be tuned in different ways. You run 2 or more processes called QPS workers that act as the gRPC clients and servers under test, plus 1 "driver" process that sets those workers up to run a specific test scenario. The driver sends the configuration protos to the workers and reports the resulting statistics once the scenario is complete: latency at the 50th (median), 90th, 95th, 99th, and 99.9th percentiles; QPS for the given scenario; QPS per available server core (*); and client/server user/system time.
(*) The server can theoretically run on all available cores even if it actually doesn't in practice, so the default is to assume that all cores are available. The C++ worker on Linux can also be restricted to a limited set of cores using a configuration option. The QPS/core metric is really only meaningful for tests that limit the set of cores.
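To make the reported numbers a bit more concrete, here is a rough Python sketch of how statistics like these could be derived from raw per-request latencies. This is not the driver's actual code (the driver is C++ and may aggregate its data differently); the function names, the nearest-rank percentile method, and the sample values are all assumptions made up for illustration.

```python
import math


def percentile(sorted_latencies, pct):
    """Nearest-rank percentile of an ascending-sorted, non-empty list."""
    # round() guards against floating-point noise such as 950.0000000001
    # before taking the ceiling.
    rank = math.ceil(round(pct * len(sorted_latencies) / 100.0, 9))
    return sorted_latencies[max(rank, 1) - 1]


def summarize(latencies_us, elapsed_s, server_cores):
    """Summarize one (hypothetical) scenario run.

    latencies_us: per-request latencies in microseconds
    elapsed_s:    wall-clock duration of the scenario in seconds
    server_cores: cores assumed available to the server
    """
    ordered = sorted(latencies_us)
    qps = len(ordered) / elapsed_s
    return {
        "p50": percentile(ordered, 50),
        "p90": percentile(ordered, 90),
        "p95": percentile(ordered, 95),
        "p99": percentile(ordered, 99),
        "p99.9": percentile(ordered, 99.9),
        "qps": qps,
        # Only meaningful when the server is really limited to server_cores.
        "qps_per_core": qps / server_cores,
    }


if __name__ == "__main__":
    # Made-up sample: 1000 requests with latencies of 1..1000 us over 2 s.
    stats = summarize(list(range(1, 1001)), elapsed_s=2.0, server_cores=4)
    for name, value in stats.items():
        print(f"{name}: {value}")
```

Note how qps_per_core simply divides by whatever core count is assumed, which is why the figure only means something when the server is actually restricted to that many cores.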
You can get a good feel for the available configuration options from test/cpp/qps/qps-sweep.sh, which runs 12 different testing scenarios (and has comments describing them) in both secure and insecure versions. I've filed issue
https://github.com/grpc/grpc/issues/5232 to make sure that we add documentation to this directory. A sample execution of the sweep would be:
On machine A:
$ bins/opt/qps_worker --driver_port=10000 &
On machine B:
$ bins/opt/qps_worker --driver_port=10001 &
On the driver machine (can be machine A or B, or a different machine):
$ export QPS_WORKERS="A:10000,B:10001"
$ test/cpp/qps/qps-sweep.sh
(You can run everything on the same machine and just use localhost for both A and B, as long as the two workers listen on different driver ports.)
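For the single-machine case, the following Python sketch just prints the commands you would type, using only the --driver_port flag and QPS_WORKERS variable from the example above. The helper name local_worker_setup and its parameters are hypothetical, and it does not launch anything itself.

```python
def local_worker_setup(ports, worker_bin="bins/opt/qps_worker", host="localhost"):
    """Build worker command lines and the matching QPS_WORKERS value.

    Hypothetical helper for illustration: it only formats strings.
    """
    if len(set(ports)) != len(ports):
        # Two workers on one host cannot listen on the same port.
        raise ValueError("workers sharing a host need distinct ports")
    commands = [f"{worker_bin} --driver_port={port} &" for port in ports]
    qps_workers = ",".join(f"{host}:{port}" for port in ports)
    return commands, qps_workers


if __name__ == "__main__":
    commands, qps_workers = local_worker_setup([10000, 10001])
    for command in commands:
        print(f"$ {command}")
    print(f'$ export QPS_WORKERS="{qps_workers}"')
    print("$ test/cpp/qps/qps-sweep.sh")
```

The ports 10000 and 10001 are just the ones from the example; any two free ports should do.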
There are also several other programs in that directory that combine the worker and driver in 1 process. Those are all part of our test suite and run with every pull request, though their configurations are not always ideal for getting meaningful performance results.
There is also a Node.js benchmark in src/node/performance that implements a subset of these scenarios. In particular, Node has a worker implementation that can act as a client or server, and the Node and C++ workers can interoperate with each other. The driver is still the C++ driver. We will soon have interoperable performance tests in more languages.
Best regards,
Vijay