Broadly, I see two possibilities:
1. We create one TCP connection for each RPC we send.
2. We create one TCP connection per server and multiplex all RPC calls to that server on that single connection.
Option 1 makes it trivial to match RPC responses to their corresponding requests: whatever comes back on a connection is the reply to the one request sent on it. It also makes sense to me because, algorithmically, each RPC round trip is independent of all others. However, it incurs the TCP handshake overhead (at least one extra round trip) for each RPC. In practice, I don't know how significant that overhead is. I also don't know how well flow control works under this strategy.
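
To make option 1 concrete, here is a minimal sketch of what I have in mind. The function name `call_once` and the "read until the server closes" framing are my own assumptions for illustration, not part of any existing RPC library.

```python
import socket

def call_once(host: str, port: int, request: bytes, timeout: float = 5.0) -> bytes:
    """Hypothetical helper: open a fresh TCP connection, send one request,
    and read the reply until the server closes its side."""
    # A new connection per call means a new handshake per call.
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(request)
        # Half-close so the server knows the request is complete.
        sock.shutdown(socket.SHUT_WR)
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:  # empty read: the server closed the connection
                break
            chunks.append(chunk)
    # Everything received on this socket belongs to this one request,
    # so no request/response matching is needed.
    return b"".join(chunks)
```
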
Option 2 avoids the per-RPC handshake overhead. But I'm worried that TCP's in-order delivery guarantee would work against us: if one message to a server is dropped or delayed, every later message on that connection waits behind it (head-of-line blocking), even though the RPCs are independent of each other. It also complicates the RPC layer, since we have to add our own way to match responses to requests.
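
For option 2, the matching I'm picturing would look roughly like the sketch below. It assumes a made-up wire format (4-byte request ID, 4-byte payload length, then the payload) and a server that echoes the request ID in its reply. `encode_frame`, `decode_frames`, and `PendingCalls` are hypothetical names, and the sketch is not thread-safe.

```python
import itertools
import struct
from concurrent.futures import Future

# Assumed frame header: request ID and payload length, both unsigned 32-bit, network order.
HEADER = struct.Struct("!II")

def encode_frame(request_id: int, payload: bytes) -> bytes:
    return HEADER.pack(request_id, len(payload)) + payload

def decode_frames(buffer: bytearray):
    """Yield (request_id, payload) for each complete frame, consuming it from buffer."""
    while len(buffer) >= HEADER.size:
        request_id, length = HEADER.unpack_from(buffer)
        end = HEADER.size + length
        if len(buffer) < end:
            break  # partial frame: wait for more bytes from the socket
        payload = bytes(buffer[HEADER.size:end])
        del buffer[:end]
        yield request_id, payload

class PendingCalls:
    """Tracks in-flight RPCs on one shared connection, keyed by request ID."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._pending = {}

    def start(self, payload: bytes):
        """Return the bytes to write to the socket and a Future for the reply."""
        request_id = next(self._ids)
        future = Future()
        self._pending[request_id] = future
        return encode_frame(request_id, payload), future

    def on_bytes(self, buffer: bytearray):
        """Call after appending newly received bytes to buffer."""
        for request_id, payload in decode_frames(buffer):
            future = self._pending.pop(request_id, None)
            if future is not None:  # ignore replies to unknown IDs
                future.set_result(payload)
```

With this, replies can arrive in any order and still resolve the right caller, but it does nothing about the in-order delivery concern: a lost segment still stalls every frame queued behind it on the connection.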
What are your experiences? Is one method clearly better than the other in practice?