Our implementation is largely the same as Chromium's on the server side, with some OS-specific optimizations (which may be released at some point), but:
1) We're being conservative (I expect you could easily see 200 Mbit/s per core from the current client-server on relatively new desktop CPUs).
2) The QuicServer appears to be slower per-core than our server.
3) The QuicClient is slow enough that it's typically the limiting factor. One obvious optimization is to switch it from recvmsg, which it uses today, to recvmmsg.
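To make the recvmmsg point concrete, here is a rough sketch of a batched receive on Linux. This is not the QuicClient code; recv_batch, loopback_demo, BATCH and MAX_DGRAM are hypothetical names and sizes I picked for illustration. The idea is simply that one recvmmsg call can return many queued datagrams, where recvmsg would need one system call per datagram.

```c
#define _GNU_SOURCE          /* recvmmsg() is a Linux-specific GNU extension */
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

#define BATCH 16             /* hypothetical batch size, not taken from QuicClient */
#define MAX_DGRAM 1500       /* hypothetical per-datagram buffer size */

/* Receive up to BATCH already-queued datagrams from fd in one system call.
 * Returns the number of datagrams received, or -1 on error (including
 * EAGAIN when nothing is queued). lens[i] gets the length of datagram i. */
static int recv_batch(int fd, char bufs[BATCH][MAX_DGRAM], unsigned lens[BATCH]) {
    struct mmsghdr msgs[BATCH];
    struct iovec iov[BATCH];

    memset(msgs, 0, sizeof(msgs));
    for (int i = 0; i < BATCH; i++) {
        iov[i].iov_base = bufs[i];
        iov[i].iov_len = MAX_DGRAM;
        msgs[i].msg_hdr.msg_iov = &iov[i];   /* one buffer per datagram */
        msgs[i].msg_hdr.msg_iovlen = 1;
    }

    /* MSG_DONTWAIT: return whatever is queued right now instead of blocking. */
    int n = recvmmsg(fd, msgs, BATCH, MSG_DONTWAIT, NULL);
    for (int i = 0; i < n; i++) {
        lens[i] = msgs[i].msg_len;           /* bytes received in datagram i */
    }
    return n;
}

/* Small self-contained demo: queue three datagrams on a local datagram
 * socketpair, then drain them with a single recv_batch() call.
 * Returns the number of datagrams read back, or -1 on failure. */
static int loopback_demo(void) {
    static char bufs[BATCH][MAX_DGRAM];
    unsigned lens[BATCH];
    int sv[2];

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) != 0) return -1;
    if (send(sv[0], "a", 1, 0) != 1) return -1;
    if (send(sv[0], "bb", 2, 0) != 2) return -1;
    if (send(sv[0], "ccc", 3, 0) != 3) return -1;

    return recv_batch(sv[1], bufs, lens);
}
```

A real client would call something like recv_batch from its read loop on a UDP socket whenever the socket becomes readable, then hand each buffer to the packet parser. How much this helps depends on how many datagrams are typically queued per wakeup, which I haven't measured here.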
What we're hoping to avoid is a characterization of QUIC based on a 1Gbps egress test, which isn't an interesting test for most of our users anyway.