performance of UDPConn.WriteToUDP


Donald Hamel

Feb 10, 2015, 8:48:34 PM
to golan...@googlegroups.com
Hello,

I am prototyping an RTSP server in golang on Windows 64 bits. Right now I am using github.com/wernerd/GoRTP/src/net/rtp to manage the RTP stream. I have one stream sending to multiple remote destinations (100 destinations and I want 1000). I am having a performance issue with UDPConn.WriteToUDP. I have profiled the application and it seems the problem is not with the UDP WSASendTo but with runtime.cgocall_errno. 

Am I interpreting the SVG profile result properly? 

Has anyone seen this problem?

It seems to me that reading errno should not take long, no?

If someone has a suggestion to get better performance, please let me know.


(pprof) top 20
86.59s of 92.05s total (94.07%)
Dropped 221 nodes (cum <= 0.46s)
Showing top 20 nodes out of 39 (cum >= 0.62s)
      flat  flat%   sum%        cum   cum%
    72.98s 79.28% 79.28%     74.22s 80.63%  runtime.cgocall_errno
     8.33s  9.05% 88.33%      8.33s  9.05%  runtime.osyield
     1.99s  2.16% 90.49%      1.99s  2.16%  runtime.stdcall6
     0.49s  0.53% 91.03%      0.58s  0.63%  runtime.mallocgc
     0.48s  0.52% 91.55%      0.48s  0.52%  runtime.stdcall5
     0.31s  0.34% 91.88%     77.57s 84.27%  github.com/dhx71/GoRTP/net/rtp.(*TransportUDP).WriteDataTo
     0.24s  0.26% 92.15%     75.81s 82.36%  net.(*ioSrv).ExecIO
     0.23s  0.25% 92.40%      0.77s  0.84%  net.ipToSockaddr
     0.18s   0.2% 92.59%      2.55s  2.77%  runtime.netpoll
     0.16s  0.17% 92.76%     77.96s 84.69%  github.com/dhx71/GoRTP/net/rtp.(*Session).WriteData
     0.15s  0.16% 92.93%      0.49s  0.53%  runtime.deferreturn
     0.15s  0.16% 93.09%      8.10s  8.80%  schedule
     0.14s  0.15% 93.24%     74.30s 80.72%  net.func·027
     0.13s  0.14% 93.38%     75.94s 82.50%  net.(*netFD).writeTo
     0.12s  0.13% 93.51%      0.51s  0.55%  runtime.netpollWait
     0.12s  0.13% 93.64%     74.16s 80.56%  syscall.WSASendto
     0.11s  0.12% 93.76%      0.67s  0.73%  runtime.newobject
     0.10s  0.11% 93.87%     73.87s 80.25%  syscall.WSASendTo
     0.09s 0.098% 93.97%      8.04s  8.73%  findrunnable
     0.09s 0.098% 94.07%      0.62s  0.67%  net.(*pollDesc).Wait

Thanks,
Donald

brainman

Feb 11, 2015, 12:03:16 AM
to golan...@googlegroups.com
On Wednesday, 11 February 2015 12:48:34 UTC+11, Donald Hamel wrote:

> ... I am having a performance issue with UDPConn.WriteToUDP.

You are not telling us what your issue is.

> ... I have profiled the application and it seems the problem is not with the UDP WSASendTo but with runtime.cgocall_errno. 
> Am I interpreting the SVG profile result properly? 

I don't think so. runtime.cgocall_errno is what makes the call into the Windows kernel. So all your SVG shows is that your process spends most of its time waiting for Windows to complete the syscall (WSASendTo in particular). Is that not what you would expect here?

Alex

Donald Hamel

Feb 11, 2015, 6:20:34 AM
to brainman, golan...@googlegroups.com
Sorry for the lack of description. My problem is that sending a ~250 bytes payload to 100 destinations every 30ms takes 10% of the CPU on my PC. I would like to send to 1000 destinations which I doubt I will be able to do. It seems to me 10% is too much but I haven't tested the equivalent code in C. I will do that and let you know the result. 

Thanks for your reply.
Donald

--
You received this message because you are subscribed to a topic in the Google Groups "golang-nuts" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/golang-nuts/ERV8ROd12tQ/unsubscribe.
To unsubscribe from this group and all its topics, send an email to golang-nuts...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Dmitry Vyukov

Feb 11, 2015, 7:07:11 AM
to Donald Hamel, golang-nuts
On Wed, Feb 11, 2015 at 4:48 AM, Donald Hamel <donald...@gmail.com> wrote:
> Hello,
>
> I am prototyping an RTSP server in golang on Windows 64 bits. Right now I am
> using github.com/wernerd/GoRTP/src/net/rtp to manage the RTP stream. I have
> one stream sending to multiple remote destinations (100 destinations and I
> want 1000). I am having a performance issue with UDPConn.WriteToUDP. I have
> profiled the application and it seems the problem is not with the UDP
> WSASendTo but with runtime.cgocall_errno.

runtime.cgocall_errno samples include the actual WSASendTo syscall. So
it may be that the time is consumed by the syscall.

Jesper Louis Andersen

Feb 11, 2015, 9:11:25 AM
to Donald Hamel, brainman, golang-nuts

On Wed, Feb 11, 2015 at 12:19 PM, Donald Hamel <donald...@gmail.com> wrote:
> Sorry for the lack of description. My problem is that sending a ~250 bytes payload to 100 destinations every 30ms takes 10% of the CPU on my PC. I would like to send to 1000 destinations which I doubt I will be able to do. It seems to me 10% is too much but I haven't tested the equivalent code in C. I will do that and let you know the result.

Immediate hunch: 100 destinations every 30ms is about 3333 syscalls per second. That doesn't sound very high to me, so I would look at system tuning and configuration before doing anything drastic. How much time is spent in the kernel?


--
J.

Donald Hamel

Feb 11, 2015, 9:40:20 AM
to Jesper Louis Andersen, brainman, golang-nuts
I just wrote a simple application in C++ to compare with. It uses 6-7% CPU while the golang version uses 8-10%. So it is not so bad after all, and I cannot blame golang syscalls...

Jesper, I am on Windows and when I look with ProcessExplorer I see that 80% of the time is spent in the kernel. 

I don't see what I could do to improve this. I have tried sending bigger packets every 90ms but it causes other issues with the RTSP clients.

Thanks for your help. I think I will try this on Linux and see if it is better.

Dmitry Vyukov

Feb 11, 2015, 9:46:46 AM
to Donald Hamel, Jesper Louis Andersen, brainman, golang-nuts
Are you sending in parallel or sequentially?
If sequentially and you want to reduce CPU consumption, set GOMAXPROCS to 1.

Donald Hamel

Feb 11, 2015, 12:36:02 PM
to Dmitry Vyukov, Jesper Louis Andersen, brainman, golang-nuts
There is going to be one goroutine per channel. Each channel will then periodically send to all of its destinations in a for loop. I guess I could have one process per channel and set GOMAXPROCS to 1. Thanks for the suggestion, I will try that.

Donald


Donald Hamel

Feb 11, 2015, 1:59:51 PM
to Dmitry Vyukov, Jesper Louis Andersen, brainman, golang-nuts
My coworker has tested the prototype on a real server with 1000 simultaneous clients and CPU usage is around 5-6%.

So I think I won't have to find an optimization after all.

Sorry for the noise and thank you for your help.
Donald


Florian Weimer

Feb 12, 2015, 3:41:56 PM
to golan...@googlegroups.com
* Donald Hamel:

> My coworker has tested the prototype on a real server with 1000
> simultaneous clients and CPU usage is around 5-6%.
>
> So I think I won't have to find an optimization after all.

If that changes, keep in mind that UDP performance varies greatly
between operating systems, even more so than TCP performance.