FWIW I also wondered what was going on. I doubt that
io.Copy has anything to do with it; I suspect the scheduler.
I wrote a little bit of code to test it (to play with it,
go get code.google.com/p/rog-go/exp/cmd/websocket...).
It seems there's a big difference between GOMAXPROCS=1
and GOMAXPROCS=2 (runtime.NumCPU reports 4 on my machine).
It's unusual that GOMAXPROCS>1 speeds things up so much.
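To check which values a given run is actually using, something like the following rough sketch can help. It is not part of the websocket tools above, and the schedulerInfo name is made up for illustration. It only reads the settings: runtime.GOMAXPROCS(0) returns the current value without changing it.

```go
package main

import (
	"fmt"
	"runtime"
)

// schedulerInfo is a hypothetical helper (not from the original tools).
// It reports the number of logical CPUs and the current GOMAXPROCS
// setting. Passing 0 to runtime.GOMAXPROCS queries the value without
// modifying it.
func schedulerInfo() (ncpu, procs int) {
	return runtime.NumCPU(), runtime.GOMAXPROCS(0)
}

func main() {
	ncpu, procs := schedulerInfo()
	fmt.Printf("NumCPU=%d GOMAXPROCS=%d\n", ncpu, procs)
}
```

Running it as, say, GOMAXPROCS=2 ./prog should show whether the environment variable was picked up.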
Some sample runs (I needed the long wait between runs to
prevent the network stack from running out of local addresses;
there's probably a better way):
# sysctl -w 'net.core.netdev_max_backlog=2500'
# ulimit -n 30000
# for i in 1 2 3 4; do
	echo ncpu $i
	GOMAXPROCS=$i websocket-stress | websocket-analyse
	sleep 300
done
ncpu 1
total 31.95649s
latency: min 47us; max 862.475ms; mean 268.299392ms; median 220.062ms
connect: min 246us; max 16.983571s; mean 1.987242486s; median 284.384ms
delay: min 1us; max 120.636ms; mean 2.2523ms; median 1.941ms
ncpu 2
total 16.191334s
latency: min 35us; max 602.721ms; mean 132.065983ms; median 40.62ms
connect: min 200us; max 1.028406s; mean 93.174134ms; median 4.979ms
delay: min 0; max 72.225ms; mean 1.497035ms; median 1.292ms
ncpu 3
total 14.619608s
latency: min 30us; max 357.473ms; mean 49.460422ms; median 5.347ms
connect: min 212us; max 369.663ms; mean 18.330433ms; median 1.314ms
delay: min 1us; max 57.4ms; mean 1.346355ms; median 1.145ms
ncpu 4
total 13.646085s
latency: min 32us; max 73.886ms; mean 6.128948ms; median 1.294ms
connect: min 200us; max 100.802ms; mean 4.921183ms; median 1.028ms
delay: min 0; max 45.42ms; mean 1.264279ms; median 1.123ms
#