The communication overhead should not be too high: of the order of 10 ms if you set everything up nicely and use a Unix domain socket. This is mostly down to latency rather than the amount of data transferred, so the overhead shouldn't grow substantially if you go to a larger system.
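If you want to get a feel for the raw socket latency on your own machine, a minimal sketch like the one below can help. It does not use i-PI or its wire protocol; it just bounces a fixed-size message back and forth over a Unix domain socket pair and reports the mean round-trip time. The function names (`recv_exact`, `echo`, `mean_round_trip`) and the payload sizes are made up for illustration, and the numbers it prints say nothing about the extra work i-PI and the driver do around each exchange.

```python
import socket
import threading
import time


def recv_exact(conn, nbytes):
    """Read exactly nbytes from a stream socket."""
    buf = bytearray()
    while len(buf) < nbytes:
        chunk = conn.recv(nbytes - len(buf))
        if not chunk:
            raise ConnectionError("socket closed early")
        buf.extend(chunk)
    return bytes(buf)


def echo(conn, nbytes, nrounds):
    """Send every received message straight back (stands in for a driver)."""
    for _ in range(nrounds):
        conn.sendall(recv_exact(conn, nbytes))


def mean_round_trip(nbytes, nrounds=200):
    """Mean round-trip time in seconds for messages of nbytes bytes."""
    # A connected pair of Unix domain stream sockets (Unix-like systems only).
    a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    worker = threading.Thread(target=echo, args=(b, nbytes, nrounds))
    worker.start()
    payload = b"x" * nbytes
    start = time.perf_counter()
    for _ in range(nrounds):
        a.sendall(payload)
        recv_exact(a, nbytes)
    elapsed = time.perf_counter() - start
    worker.join()
    a.close()
    b.close()
    return elapsed / nrounds


if __name__ == "__main__":
    # Assumed sizes: 3 doubles per atom, for 10 and 1000 atoms.
    for natoms in (10, 1000):
        nbytes = natoms * 3 * 8
        print(f"{natoms:5d} atoms: {mean_round_trip(nbytes) * 1e6:.1f} us per round trip")
```

Comparing the two sizes gives a rough idea of how much of the cost is a fixed per-message latency and how much scales with the data volume on your setup.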
We are currently doing some profiling (https://github.com/i-pi/i-pi/pull/277), so we should also be able to remove a few bottlenecks and give better guidance on how to set up a "fast" calculation.