How fast can gopacket handle packets?


Chun Zhang

May 26, 2017, 12:01:20 PM
to golang-nuts
Hi, All,

I am trying to write a small program to handle packets coming from a GigE wire. The test code snippet is below.

The problem I am facing is that this is extremely slow: it can only handle up to 250Mbps-ish traffic with normal IPv4-sized packets, and anything above that results in significant packet drops. Note that I am doing nothing with the packets at this moment. If I try to do any packet processing, it gets even slower.

Has anybody measured the efficiency of the gopacket package? Are there any faster alternatives?

PS: the host machine is an Ubuntu VM with 12 cores and 12G of memory, but it looks like only 2 cores are used by this program.

Thanks,
Chun



// Open device
handle, err = pcap.OpenLive(device, snapshot_len, promiscuous, timeout)
if err != nil {
    log.Fatal(err) // stop here: handle is not usable when OpenLive fails
}
Info.Println("Open interface", device, "successfully")
defer handle.Close()

//fmt.Println("In the default reading case ", time.Now())
// Use the handle as a packet source to process all packets
packetSource := gopacket.NewPacketSource(handle, handle.LinkType())
Info.Println("packetSource is", packetSource, time.Now())
for packet := range packetSource.Packets() {
    Debug.Println("-------------------------------------------------------------------")
    count++
    Warning.Println("packet count ", count)
    _ = packet // unused while the write below is commented out

    // write to a pcap for testing
    /*err = w.WritePacket(packet.Metadata().CaptureInfo, packet.Data())
    if err != nil {
        fmt.Println(err)
    }*/

    continue
}

Egon

May 26, 2017, 12:37:55 PM
to golang-nuts
As a baseline measurement I suggest writing the same code in C; this shows how much your VM / config / machine can handle.

With gopacket -- use src.NextPacket instead of Packets.

There are also: https://github.com/akrennmair/gopcap and https://github.com/miekg/pcap

+ Egon

Chun Zhang

May 26, 2017, 1:51:55 PM
to golang-nuts
Good point.
As a comparison: tcpdump -w /dev/null can handle up to 750Mbps, at which point the sending machine's speed limit is reached. I think it should be able to handle line rate.

Are those two packages lighter/faster than gopacket?


Thanks,
Chun

Egon

May 26, 2017, 4:59:18 PM
to golang-nuts
On Friday, 26 May 2017 20:51:55 UTC+3, Chun Zhang wrote:
Good point.
As a comparison: tcpdump -w /dev/null can handle up to 750Mbps, at which point the sending machine's speed limit is reached. I think it should be able to handle line rate.

Are those two packages lighter/faster than gopacket?

Nevermind, just noticed... gopacket/pcap is a fork of akrennmair/gopcap

Anyways, to get more information on what is taking time in your program see https://blog.golang.org/profiling-go-programs
 
Maybe try something like this:

handle, err := pcap.OpenLive(device, snapshot_len, promiscuous, timeout)
// ...
for {
    // data is only valid until the next ZeroCopyReadPacketData call
    data, ci, err := handle.ZeroCopyReadPacketData()
    // ...
}

This should remove allocations from the critical path.

PS: code untested and may contain typos :P

Kevin Conway

May 27, 2017, 5:24:13 AM
to Egon, golang-nuts
can only handle up to 250Mbps-ish traffic

I'm not familiar with gopacket, but I have seen multiple occasions where logging to a file or stdout became a bottleneck. Your code snippet is logging on every packet which seems excessive. Try logging with less frequency and, if using stdout, consider using a log destination with different buffering characteristics like a file or syslog over UDP. 
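One way to log less often is sketched below. This is not code from the thread: packetCounter and reportEvery are hypothetical names, and the loop in main only stands in for the capture loop. The idea is to keep a cheap counter in the hot path and emit a summary line every N packets.

```go
package main

import "log"

// packetCounter is a hypothetical helper that counts packets and calls
// report once every reportEvery packets instead of logging each packet.
// reportEvery must be greater than zero.
type packetCounter struct {
	count       uint64
	reportEvery uint64
	report      func(total uint64)
}

// observe records one packet and triggers a report when one is due.
func (c *packetCounter) observe() {
	c.count++
	if c.count%c.reportEvery == 0 {
		c.report(c.count)
	}
}

func main() {
	c := &packetCounter{
		reportEvery: 100000,
		report:      func(total uint64) { log.Println("packet count", total) },
	}
	for i := 0; i < 250000; i++ { // stands in for the capture loop
		c.observe()
	}
}
```

With reportEvery set to 100000, the loop above logs twice instead of 250000 times.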


Chun Zhang

May 27, 2017, 7:05:11 AM
to golang-nuts, egon...@gmail.com
Thanks Kevin and Egon!

After a few experiments, I found that logging, even to a file, is quite time-consuming, so turning off logging helps, resulting in a 500Mbps-ish no-drop rate; however, that is still not even close to Gbps.

Then I turned on both the lazy and nocopy options in the decoding options; the lazy option seems to help. I got something close to 700Mbps, where the sender's limit is reached.

That said, the program does nothing but receive packets at this moment. Any actual processing of the packet in the same thread significantly hurts the rate. Besides spinning up multiple threads to handle the actual work, is there anything else in gopacket land that can be done?

Thanks again!
Chun

Rajanikanth Jammalamadaka

May 27, 2017, 10:15:22 AM
to golang-nuts
Can you offload the actual packet processing to a different goroutine?

Kevin Conway

May 27, 2017, 12:04:55 PM
to Rajanikanth Jammalamadaka, golang-nuts
 Any actual processing of the packet in the same thread significantly hurt the rate
 offload the actual packet processing to a different goroutine

As Rajanikanth points out, you'll need to put your work in other goroutines to make use of your other cores for processing. One goroutine per packet is likely going to cause its own issues. I'd suggest adding a configurable batch size to let you iterate and find the ideal number of packets to spin off for processing in a goroutine. Maybe experiment with a few different patterns. For example, you might start with a naive goroutine creation on each batch of a significant size (https://play.golang.org/p/GH16HEJgiy) or implement something like a worker pool model where you send segments of work to available workers (https://play.golang.org/p/3D_JuWdA4a).
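A minimal sketch of the batching plus worker pool idea follows. It is not the code behind the playground links above; processBatches, batchSize, workers, and the handle callback are hypothetical names, and packets are plain byte slices rather than gopacket values.

```go
package main

import (
	"fmt"
	"sync"
)

// processBatches reads packets from in, groups them into batches of
// batchSize, and hands each batch to one of `workers` goroutines.
// handle is called concurrently from several goroutines.
// It returns the number of packets handled.
func processBatches(in <-chan []byte, batchSize, workers int, handle func([]byte)) int {
	batches := make(chan [][]byte, workers)
	var wg sync.WaitGroup
	var mu sync.Mutex
	total := 0

	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for batch := range batches {
				for _, pkt := range batch {
					handle(pkt)
				}
				mu.Lock()
				total += len(batch)
				mu.Unlock()
			}
		}()
	}

	batch := make([][]byte, 0, batchSize)
	for pkt := range in {
		batch = append(batch, pkt)
		if len(batch) == batchSize {
			batches <- batch
			// Start a new slice: the worker now owns the old one.
			batch = make([][]byte, 0, batchSize)
		}
	}
	if len(batch) > 0 {
		batches <- batch // flush the final partial batch
	}
	close(batches)
	wg.Wait()
	return total
}

func main() {
	in := make(chan []byte, 1024)
	go func() {
		for i := 0; i < 1000; i++ {
			in <- []byte{byte(i)}
		}
		close(in)
	}()
	n := processBatches(in, 64, 4, func(pkt []byte) { _ = len(pkt) })
	fmt.Println("processed", n, "packets")
}
```

Making batchSize and workers parameters keeps it easy to try several values and measure which combination drops the fewest packets.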

Also, given that you are attempting to give the packet collector as much active time as possible, it might be worthwhile to investigate https://golang.org/pkg/runtime/#LockOSThread, which allows you to isolate your consumer goroutine to an OS thread and force all other goroutines to operate in other OS threads.
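A sketch of a receiver goroutine pinned with runtime.LockOSThread might look like this. The capture function and its read callback are hypothetical; read stands in for whatever call actually pulls a packet off the wire. Note that this ties the goroutine to one OS thread, not to a specific CPU core.

```go
package main

import (
	"fmt"
	"runtime"
)

// capture runs read in a goroutine locked to its own OS thread and forwards
// each packet to the returned buffered channel. read is a hypothetical
// stand-in for the capture call; it returns ok=false when the source ends.
func capture(read func() (pkt []byte, ok bool), queueLen int) <-chan []byte {
	out := make(chan []byte, queueLen)
	go func() {
		runtime.LockOSThread() // no other goroutine runs on this thread
		defer runtime.UnlockOSThread()
		defer close(out)
		for {
			pkt, ok := read()
			if !ok {
				return
			}
			out <- pkt
		}
	}()
	return out
}

func main() {
	remaining := 5
	read := func() ([]byte, bool) {
		if remaining == 0 {
			return nil, false
		}
		remaining--
		return []byte{0xde, 0xad}, true
	}
	n := 0
	for range capture(read, 1024) {
		n++
	}
	fmt.Println("received", n, "packets")
}
```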


Egon

May 28, 2017, 2:25:03 AM
to golang-nuts, egon...@gmail.com


On Saturday, 27 May 2017 14:05:11 UTC+3, Chun Zhang wrote:
Thanks Kevin and Egon!

After a few experiments, I found that logging, even to a file, is quite time-consuming, so turning off logging helps, resulting in a 500Mbps-ish no-drop rate; however, that is still not even close to Gbps.

Then I turned on both the lazy and nocopy options in the decoding options; the lazy option seems to help. I got something close to 700Mbps, where the sender's limit is reached.

That said, the program does nothing but receive packets at this moment. Any actual processing of the packet in the same thread significantly hurts the rate. Besides spinning up multiple threads to handle the actual work, is there anything else in gopacket land that can be done?

Profile your code. :)

Chun Zhang

May 30, 2017, 10:50:50 AM
to golang-nuts, egon...@gmail.com
Thank you Rajanikanth, Kevin and Egon! I will explore the ideas you guys provided and keep you updated.

Best Regards,
Chun

Chun Zhang

Jun 9, 2017, 10:47:46 AM
to golang-nuts, egon...@gmail.com
Hi, All, 

Update on this issue. Based on the suggestions I got earlier, I dedicated one thread, locked to an OS thread, to handle packet receiving; this thread then puts each received packet on a buffered channel/queue. Without doing any extra work, this thread is able to take packets up to 800Mbps-ish, which is the limit of the sender.

12 goroutines are then kicked off to take items from this queue and distribute them to other work queues for further processing. The distributing thread parses the packet into a gopacket.Packet interface and hashes based on the IP address etc. Even though I have used the faster version, the parse routine takes a LOT of CPU power. Profiling the program shows that parsing takes 1/3 of the whole time. The whole app is then limited to roughly 120Mbps, aka 50kpps.
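The hash-and-distribute step described above could be sketched as follows. This is an assumption about the general shape, not the poster's code: shardFor is a hypothetical name, the hash is an inline FNV-1a over the two IPv4 addresses, and the two directions of a flow will usually land on different queues unless the caller orders the addresses first.

```go
package main

import "fmt"

// shardFor maps a (source, destination) IPv4 address pair to one of n
// worker queues using an inline FNV-1a hash, so the hot path does not
// allocate. n must be greater than zero.
func shardFor(src, dst [4]byte, n int) int {
	const (
		offset32 = 2166136261
		prime32  = 16777619
	)
	h := uint32(offset32)
	for _, b := range src {
		h ^= uint32(b)
		h *= prime32
	}
	for _, b := range dst {
		h ^= uint32(b)
		h *= prime32
	}
	return int(h % uint32(n))
}

func main() {
	const workers = 12
	counts := make([]int, workers)
	for i := 0; i < 1000; i++ {
		src := [4]byte{10, 0, byte(i >> 8), byte(i)}
		dst := [4]byte{192, 168, 1, 1}
		counts[shardFor(src, dst, workers)]++
	}
	fmt.Println("packets per worker queue:", counts)
}
```

Because the same address pair always maps to the same queue, per-flow state can stay local to one worker without locking.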

The decoding routine I am using is pretty much like the example in "Decoding Packets Faster".


I am wondering what further optimization I can do to speed this up?

Thanks,
Chun

Kevin Conway

Jun 9, 2017, 1:31:18 PM
to Chun Zhang, golang-nuts, egon...@gmail.com

At first glance, the article you linked to uses some patterns that differ from the equivalent documentation in gopacket: https://godoc.org/github.com/google/gopacket#hdr-Fast_Decoding_With_DecodingLayerParser . Namely, the official docs suggest reusing the parser and result allocations, while the third-party docs demonstrate reallocating these components on each packet. It's worth double-checking that you aren't performing unnecessary allocations due to bad example code.
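The difference between reusing and reallocating the result slice can be illustrated without gopacket. The sketch below is a standard-library stand-in, not the gopacket API: decodeInto and the kind constants are hypothetical, and testing.AllocsPerRun is used only to count allocations per call.

```go
package main

import (
	"fmt"
	"testing"
)

// Hypothetical tags for decoded headers.
const (
	kindEthernet = iota
	kindIPv4
)

var sink []int // keeps the per-packet slice reachable in the measurement

// decodeInto is a toy decoder: it clears decoded, then appends one tag per
// header it recognises. It reuses the slice's backing array between calls.
func decodeInto(data []byte, decoded *[]int) {
	*decoded = (*decoded)[:0] // drop old contents, keep capacity
	if len(data) < 14 {
		return
	}
	*decoded = append(*decoded, kindEthernet)
	if data[12] == 0x08 && data[13] == 0x00 { // EtherType IPv4
		*decoded = append(*decoded, kindIPv4)
	}
}

func main() {
	frame := make([]byte, 64)
	frame[12] = 0x08

	perPacket := testing.AllocsPerRun(1000, func() {
		fresh := make([]int, 0, 4) // reallocated for every packet
		decodeInto(frame, &fresh)
		sink = fresh
	})

	reused := make([]int, 0, 4) // allocated once, outside the loop
	shared := testing.AllocsPerRun(1000, func() {
		decodeInto(frame, &reused)
	})

	fmt.Printf("allocs per packet: fresh slice %.0f, reused slice %.0f\n", perPacket, shared)
}
```

The same principle applies to the parser and layer values in the gopacket docs linked above: create them once outside the receive loop and pass them in on every packet.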

Also, I recommend reading through  https://blog.golang.org/profiling-go-programs if you haven't already. It gives some good examples of drilling down into smaller components to find more granular code bottlenecks.

Egon

Jun 9, 2017, 1:35:56 PM
to golang-nuts, egon...@gmail.com
The usual 
1. do less work,
2. don't make copies,
3. produce less garbage.

e.g. 
1./2. maybe you can avoid parsing the whole data and just extract the necessary bits directly.
2./3. maybe you can create a pool of packets so that the internal structures for them can be reused
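As an illustration of extracting only the necessary bits, the sketch below reads the IPv4 source and destination addresses straight out of a raw frame. It assumes untagged Ethernet II framing (no VLAN tag) and ignores everything else in the packet; ipv4Addrs and errNotIPv4 are hypothetical names.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

var errNotIPv4 = errors.New("not an untagged IPv4 frame")

// ipv4Addrs returns the source and destination addresses of an untagged
// Ethernet II frame carrying IPv4, touching only the bytes it needs.
func ipv4Addrs(frame []byte) (src, dst [4]byte, err error) {
	const (
		ethHeaderLen = 14
		minIPv4Len   = 20
	)
	if len(frame) < ethHeaderLen+minIPv4Len {
		return src, dst, errNotIPv4
	}
	if binary.BigEndian.Uint16(frame[12:14]) != 0x0800 { // EtherType
		return src, dst, errNotIPv4
	}
	ip := frame[ethHeaderLen:]
	if ip[0]>>4 != 4 { // IP version nibble
		return src, dst, errNotIPv4
	}
	copy(src[:], ip[12:16]) // source address, IPv4 header offset 12
	copy(dst[:], ip[16:20]) // destination address, IPv4 header offset 16
	return src, dst, nil
}

func main() {
	frame := make([]byte, 34)
	frame[12], frame[13] = 0x08, 0x00 // EtherType IPv4
	frame[14] = 0x45                  // version 4, header length 5 words
	copy(frame[26:30], []byte{192, 168, 0, 1})
	copy(frame[30:34], []byte{10, 0, 0, 2})

	src, dst, err := ipv4Addrs(frame)
	fmt.Println(src, dst, err)
}
```

The returned arrays are copies, so they stay valid even if the frame buffer is reused for the next packet.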

Essentially, try to understand exactly why the parsing takes that much time. Write a benchmark that stresses that point of the code. Optimize it.
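A benchmark for one decoding step can be as small as the sketch below. The parse function is a hypothetical stand-in for the real decoding call; testing.Benchmark lets it run from a normal program, although in a real project this would more commonly live in a _test.go file and run with go test -bench.

```go
package main

import (
	"fmt"
	"testing"
)

// parse is a hypothetical stand-in for the decoding step being measured.
// It returns the EtherType of an Ethernet frame.
func parse(frame []byte) (etherType uint16, ok bool) {
	if len(frame) < 14 {
		return 0, false
	}
	return uint16(frame[12])<<8 | uint16(frame[13]), true
}

// benchmarkParse stresses only the parse step on a fixed 1500-byte frame.
func benchmarkParse(b *testing.B) {
	frame := make([]byte, 1500)
	frame[12] = 0x08
	b.ReportAllocs()
	b.SetBytes(int64(len(frame))) // lets the result report throughput
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		if _, ok := parse(frame); !ok {
			b.Fatal("short frame")
		}
	}
}

func main() {
	res := testing.Benchmark(benchmarkParse)
	fmt.Println(res.String(), res.MemString())
}
```

Swapping the body of parse for the real decode call gives ns/op and allocs/op figures to compare before and after each optimization.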

This might also require improving gopacket itself.

Chun Zhang

Jun 9, 2017, 2:51:52 PM
to golang-nuts, chun...@gmail.com, egon...@gmail.com
Thank you Kevin for spotting the unnecessary memory allocation! I took the example code naively and didn't think it through :) That does make an impact on the app: throughput increased by another 30Mbps to 150Mbps-ish. As a benchmark, a similar C++ version of the app is able to handle 250Mbps.

I am digging into the profiling and trying to optimize further.

Best Regards,
Chun