Hi,
This work consists of 2 parts:
Improved scheduler with polling support:
https://codereview.appspot.com/7314062/
https://codereview.appspot.com/7448048/
and the network poller itself:
https://codereview.appspot.com/7579044/
Together, these two changes improve network performance by roughly 4x to 10x on Linux:
benchmark                   old ns/op   new ns/op   delta
BenchmarkTCPPersistent          81670        7782   -90.47%
BenchmarkTCPPersistent-2        26598        4808   -81.92%
BenchmarkTCPPersistent-4        15633        3674   -76.50%
BenchmarkTCPPersistent-8        18093        2407   -86.70%
BenchmarkTCPPersistent-16       17472        1875   -89.27%
BenchmarkTCPPersistent-32        7679        1637   -78.68%
Why it is faster:
- global mutex for poller state is eliminated
- edge-triggered epoll is used instead of level-triggered
- it uses heap-based runtime timers instead of home-grown list-based timers
- moreover, timers are set only once as opposed to setting them on
each read/write
- the polling is done by runtime worker threads at appropriate
moments, as opposed to by a separate goroutine at inappropriate
moments
- poller injects batches of newly runnable goroutines directly into scheduler
- blocking/unblocking of goroutines is more efficient inside the runtime, and
there is no need to allocate additional channels for that
- other minor improvements
If you want to look at the current code:
src/pkg/net/fd_unix.go (Read and Write functions)
src/pkg/net/fd_poll_runtime.go (interface to runtime poller)
src/pkg/runtime/netpoll.c (poller code)
src/pkg/runtime/netpoll_epoll.c (linux part)
src/pkg/runtime/proc.c (scheduler integration, search for netpoll)