1. I don't have the code in front of me to check.
2. Yes, as long as the smallest channel latency is long compared to the time between packets. The only additional work is synchronizing across MPI ranks, which should be negligible with the small number of ranks you are using and reasonable channel latencies.
Peter