Hi there,
The number and variety of performance benchmarking tools included in the gRPC repo vary a lot depending on the language and stack, though one of our goals is to improve this across all platforms. In the C/C++ stack, most of the performance testing has focused on latency and scalability, but it is possible to derive the bandwidth measure that you're seeking. As with any network performance test in C++, you'll run qps_worker on two different machines and then start a qps_driver on a third. Set the QPS_WORKERS environment variable to the hosts and ports of the workers, then set the qps_driver flags so that payload_size (specified in bytes) is large enough that payload transfer dominates each call.
The script single_run_localhost.sh in test/cpp/qps is an example of how to run these, though you'll need to start your qps_worker jobs on different machines to get a sense of network bandwidth. Pass something like --payload_size=1048576 (1 MiB) as one of your driver arguments.
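To turn the driver's result into a bandwidth number, you can multiply the reported QPS by the payload size. Here's a rough sketch of that arithmetic in Python. The function name and the QPS value of 100 are made up for illustration (they are not output from a real run), and the estimate counts only payload bytes in one direction, so it ignores header and framing overhead.

```python
def approx_bandwidth_mbps(qps: float, payload_size_bytes: int) -> float:
    """Approximate one-direction payload bandwidth in megabits per second.

    Assumes every call carries exactly payload_size_bytes of payload and
    ignores header and framing overhead.
    """
    bytes_per_second = qps * payload_size_bytes
    return bytes_per_second * 8 / 1e6


if __name__ == "__main__":
    payload_size = 1048576  # same value as --payload_size=1048576
    measured_qps = 100.0    # hypothetical QPS; substitute the driver's reported value
    mbps = approx_bandwidth_mbps(measured_qps, payload_size)
    print(f"~{mbps:.1f} Mbit/s of payload")
```

If the number you get is close to the link speed between the two worker machines, the network is probably the bottleneck rather than gRPC itself.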