Help me understand UDP vs TCP performance?

Jakob Borg

Nov 7, 2014, 8:14:31 AM
to golang-nuts
I'm writing a library for stream oriented connections on top of UDP; the main purpose being easier NAT busting and to be able to play with various congestion control mechanisms etc. I immediately noticed that I have abysmal performance compared to TCP in simple benchmarks over the loopback interface. Since I'm doing almost the simplest possible flow control, allocating a lot of memory all over the place, etc, this is no great surprise.

But when I dig into the benchmarks a bit deeper, I seem to be limited by other things. In particular, UDP writes seem to be very, very slow. The first three benchmarks each start one writer goroutine and one reader goroutine, which write and read data in a tight loop. "DST" is my protocol wrapping UDP. The fourth benchmark skips the reader goroutine and just runs WriteTo() in a tight loop; the fifth does the same but uses DialUDP() and Write() instead of WriteTo().

The MB/s number is the relevant one, not ns/op, since the amount of data read/written per op differs between the tests: DST and TCP use large writes, UDP uses packet-sized ones.

$ go test -run None -bench DST\|UDP\|TCP -benchmem -cpu 4
PASS
BenchmarkDST-4            2000  517116 ns/op   126.73 MB/s  46173 B/op  329 allocs/op
BenchmarkTCP-4           50000   30822 ns/op  2126.25 MB/s      9 B/op    0 allocs/op
BenchmarkUDP-4          500000    5733 ns/op   178.61 MB/s     33 B/op    1 allocs/op
BenchmarkUDPDevNull-4   500000    5867 ns/op   174.52 MB/s     33 B/op    1 allocs/op
BenchmarkUDPDialled-4   500000    4015 ns/op   254.99 MB/s      1 B/op    0 allocs/op
ok  github.com/calmh/dst 9.054s

(go version go1.3.3 darwin/amd64)
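
As a reading aid for the table above, here is a minimal sketch of how an MB/s figure relates to ns/op, using 1 MB = 1e6 bytes as Go's testing package does. The per-op byte counts passed in main (65536 for TCP, 1024 for UDP) are not stated in the post; they are assumptions back-computed from the reported numbers, and the helper name mbPerSec is made up for this illustration.

```go
package main

import "fmt"

// mbPerSec converts bytes moved per benchmark op and the op's duration in
// nanoseconds into MB/s (1 MB = 1e6 bytes).
// bytes/ns equals 1e9 bytes/s, so multiplying by 1e3 yields MB/s.
func mbPerSec(bytesPerOp, nsPerOp float64) float64 {
	return bytesPerOp / nsPerOp * 1e3
}

func main() {
	// Byte counts below are assumptions inferred from the table, not
	// values given in the original post.
	fmt.Printf("TCP: %.2f MB/s\n", mbPerSec(65536, 30822))
	fmt.Printf("UDP: %.2f MB/s\n", mbPerSec(1024, 5733))
}
```

Because each op moves a different number of bytes, a lower ns/op does not imply higher throughput, which is why only the MB/s column is comparable across rows.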

It's obvious that my stuff is *much* worse than TCP, but then not *that* much worse when comparing to raw UDP. And reading seems not to be the issue; just pumping packets at lo0 in a tight loop is equally slow. A part of the problem seems to be whatever WriteTo() does in addition to Write() (resolving the address somehow?), but even Write() is an order of magnitude slower than TCP.

Is this as designed, or expected because of some property of sending UDP that I don't understand, or not expected at all and I've screwed something up in my tests?

(As an aside on "-cpu 4", the UDP tests fail without it because I'm not handling retransmissions and it seems to overrun the read/write buffers before context switching goroutines on a single core. TCP is almost twice as fast with the default of one core...)

The benchmark code is in https://github.com/calmh/dst/blob/master/benchmark_test.go - the rest of the DST code is in the same repo, in full-hacking-proof-of-concept mode for the interested viewer, but just sorting out my UDP writes first would be nice. :)

//jb

Jakob Borg

Nov 7, 2014, 8:21:51 AM
to golang-nuts
2014-11-07 14:14 GMT+01:00 Jakob Borg <ja...@nym.se>:
> A part of the problem seems to be whatever WriteTo() does in addition to Write() (resolving the address somehow?), but even Write() is an order of magnitude slower than TCP.

This part of my test was flawed; there is no difference in performance between WriteTo() and Write(); it all runs at ~250 MB/s or 170 kpps. Perhaps that's all that can be expected from the UDP stack?
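
For readers unfamiliar with the two send paths being compared, here is a minimal sketch, not the benchmark code from the repo. It sends one datagram with WriteTo() on an unconnected socket and one with Write() on a socket created by DialUDP(), both to a loopback listener. The function name sendBothWays and the payload strings are hypothetical.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// sendBothWays sends one datagram via WriteTo on an unconnected socket and
// one via Write on a connected socket, then returns the payloads received.
func sendBothWays() ([]string, error) {
	loopback := &net.UDPAddr{IP: net.IPv4(127, 0, 0, 1)} // port 0: kernel picks one
	recv, err := net.ListenUDP("udp", loopback)
	if err != nil {
		return nil, err
	}
	defer recv.Close()
	dst := recv.LocalAddr().(*net.UDPAddr)

	// Unconnected socket: the destination is passed with every call.
	unconn, err := net.ListenUDP("udp", loopback)
	if err != nil {
		return nil, err
	}
	defer unconn.Close()
	if _, err := unconn.WriteTo([]byte("writeto"), dst); err != nil {
		return nil, err
	}

	// Connected socket: the destination is fixed once, at dial time.
	conn, err := net.DialUDP("udp", nil, dst)
	if err != nil {
		return nil, err
	}
	defer conn.Close()
	if _, err := conn.Write([]byte("write")); err != nil {
		return nil, err
	}

	var got []string
	buf := make([]byte, 1500)
	for i := 0; i < 2; i++ {
		// The deadline keeps the sketch from hanging if a datagram is lost.
		recv.SetReadDeadline(time.Now().Add(time.Second))
		n, err := recv.Read(buf)
		if err != nil {
			return got, err
		}
		got = append(got, string(buf[:n]))
	}
	return got, nil
}

func main() {
	got, err := sendBothWays()
	fmt.Println(got, err)
}
```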

//jb

Nick Craig-Wood

Nov 7, 2014, 9:49:14 AM
to Jakob Borg, golang-nuts
> ok  github.com/calmh/dst 9.054s
>
> (go version go1.3.3 darwin/amd64)
>
> It's obvious that my stuff is *much* worse than TCP, but then not *that*
> much worse when comparing to raw UDP. And reading seems not to be the
> issue; just pumping packets at lo0 in a tight loop is equally slow. A
> part of the problem seems to be whatever WriteTo() does in addition to
> Write() (resolving the address somehow?), but even Write() is an order
> of magnitude slower than TCP.
>
> Is this as designed, or expected because of some property of sending UDP
> that I don't understand, or not expected at all and I've screwed
> something up in my tests?
>
> (As an aside on "-cpu 4", the UDP tests fail without it because I'm not
> handling retransmissions and it seems to overrun the read/write buffers
> before context switching goroutines on a single core. TCP is almost
> twice as fast with the default of one core...)
>
> The benchmark code is
> in https://github.com/calmh/dst/blob/master/benchmark_test.go - the rest
> of the DST code in the same repo, in full-hacking-proof-of-concept mode
> for the interested viewer, but just sorting out my UDP writes first
> would be nice. :)
>
> //jb


I tried this on Linux and I didn't get the test to work once.

BenchmarkUDP-4 2014/11/07 14:18:50 Received 198457 packets out of 200000; 0.8% loss
--- FAIL: BenchmarkUDP-4
udp_test.go:49: read udp: i/o timeout
ok _/home/ncw/udptest 3.096s

That indicates about 89 MB/s if it was working.

The fact that lots of packets go missing indicates to me that this isn't
a good test. It is the usual problem with UDP and no flow control.

Adding a bit of flow control like this

@@ -15,8 +26,11 @@
     src := make([]byte, 1472)
     io.ReadFull(rand.Reader, src)

+    flow := make(chan struct{}, 16)
+
     go func(n int) {
         for i := 0; i < n; i++ {
+            flow <- struct{}{}
             _, err := bConn.WriteTo(src, aAddr)
             if err != nil {
                 b.Fatal(err)
@@ -32,6 +46,7 @@
         if i%1000 == 0 {
             aConn.SetReadDeadline(time.Now().Add(1 * time.Second))
         }
+        <-flow
         n, err := aConn.Read(buf)
         if err != nil {
             log.Printf("Received %d packets out of %d; %.1f%% loss", i, b.N, 100-float64(i*100)/float64(b.N))


makes the test run reliably at 120-150 MB/s.
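
The token-channel idea in the diff above can be sketched as a standalone program. This is an illustration under assumptions, not the actual benchmark: the function name transfer, its parameters, and the use of DialUDP() for the sender are all hypothetical choices. The buffered channel caps the number of unread datagrams at about the window size, so the sender cannot outrun the receiver's socket buffer.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// transfer sends n datagrams of size bytes over loopback UDP and returns how
// many were received. A buffered channel of tokens limits the sender to at
// most window+1 datagrams that the receiver has not yet read.
func transfer(n, window, size int) (int, error) {
	recv, err := net.ListenUDP("udp", &net.UDPAddr{IP: net.IPv4(127, 0, 0, 1)})
	if err != nil {
		return 0, err
	}
	defer recv.Close()

	send, err := net.DialUDP("udp", nil, recv.LocalAddr().(*net.UDPAddr))
	if err != nil {
		return 0, err
	}
	defer send.Close()

	flow := make(chan struct{}, window)
	errc := make(chan error, 1) // carries a write error, if any
	done := make(chan struct{}) // closed on return so the writer can exit
	defer close(done)

	go func() {
		src := make([]byte, size)
		for i := 0; i < n; i++ {
			select {
			case flow <- struct{}{}: // blocks while the window is full
			case <-done:
				return
			}
			if _, err := send.Write(src); err != nil {
				errc <- err
				return
			}
		}
	}()

	buf := make([]byte, size)
	for i := 0; i < n; i++ {
		select {
		case <-flow: // free one slot in the window
		case err := <-errc:
			return i, err
		}
		recv.SetReadDeadline(time.Now().Add(time.Second))
		if _, err := recv.Read(buf); err != nil {
			return i, err
		}
	}
	return n, nil
}

func main() {
	got, err := transfer(10000, 16, 1472)
	fmt.Println("received:", got, "err:", err)
}
```

This only throttles a sender and receiver living in the same process; a real protocol would need acknowledgements on the wire to get the same effect.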

--
Nick Craig-Wood <ni...@craig-wood.com> -- http://www.craig-wood.com/nick

James Bardin

Nov 7, 2014, 2:32:50 PM
to golan...@googlegroups.com

The difference in throughput you see is because the UDP test needs to make more syscalls to send the same amount of data. There is an overhead to each of those send calls, and when you hand the data to Syscall in chunks of 1472 bytes instead of 65536 bytes, it's going to be much slower.

Make the tests comparable by using the same size buffers and packets, and you'll see that UDP wins handily.
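
One way to see the per-call overhead described above is to push the same amount of data through a loopback TCP connection with different write sizes. The sketch below is a rough illustration, not a rigorous benchmark: the function name tcpTransfer and the 64 MiB total are arbitrary assumptions, chunk must be positive, and io.Discard requires Go 1.16 or later.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"time"
)

// tcpTransfer pushes total bytes through a loopback TCP connection using
// Write calls of at most chunk bytes. It returns the number of Write calls
// made and the elapsed wall-clock time.
func tcpTransfer(total, chunk int) (int, time.Duration, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return 0, 0, err
	}
	defer ln.Close()

	done := make(chan error, 1)
	go func() {
		c, err := ln.Accept()
		if err != nil {
			done <- err
			return
		}
		defer c.Close()
		_, err = io.Copy(io.Discard, c) // drain until the writer closes
		done <- err
	}()

	c, err := net.Dial("tcp", ln.Addr().String())
	if err != nil {
		return 0, 0, err
	}

	buf := make([]byte, chunk)
	writes := 0
	start := time.Now()
	for sent := 0; sent < total; {
		n := chunk
		if total-sent < n {
			n = total - sent // final, shorter write
		}
		if _, err := c.Write(buf[:n]); err != nil {
			c.Close()
			return writes, 0, err
		}
		sent += n
		writes++
	}
	c.Close()
	err = <-done // wait for the reader to drain everything
	return writes, time.Since(start), err
}

func main() {
	for _, chunk := range []int{1472, 65536} {
		writes, d, err := tcpTransfer(64<<20, chunk)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("chunk=%d writes=%d elapsed=%v\n", chunk, writes, d)
	}
}
```

Sending 65536 bytes takes 45 calls at 1472 bytes per write versus a single call at 65536, so running both sizes over the same transport separates the cost of the call count from the cost of the protocol.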

