
How to achieve super low TCP latency?


Ignoramus446

Mar 27, 2011, 9:36:47 PM

I have an application for my Linux powered milling machine.

http://igor.chudov.com/projects/Bridgeport-Series-II-Interact-2-CNC-Mill/

It requires, or rather benefits from, super-low TCP/IP latency.

To avoid any network or router issues, I am only talking to 127.0.0.1,
so that network throughput is not even a part of my equation.

And yet, I see latencies that I would prefer to lower. To wit:

ping localhost
PING localhost (127.0.0.1) 56(84) bytes of data.
64 bytes from localhost (127.0.0.1): icmp_seq=1 ttl=64 time=0.010 ms
64 bytes from localhost (127.0.0.1): icmp_seq=2 ttl=64 time=0.013 ms
64 bytes from localhost (127.0.0.1): icmp_seq=3 ttl=64 time=0.013 ms
64 bytes from localhost (127.0.0.1): icmp_seq=4 ttl=64 time=0.016 ms
64 bytes from localhost (127.0.0.1): icmp_seq=5 ttl=64 time=0.012 ms
64 bytes from localhost (127.0.0.1): icmp_seq=6 ttl=64 time=0.014 ms
64 bytes from localhost (127.0.0.1): icmp_seq=7 ttl=64 time=0.013 ms
^C
--- localhost ping statistics ---
7 packets transmitted, 7 received, 0% packet loss, time 5999ms
rtt min/avg/max/mdev = 0.010/0.013/0.016/0.001 ms

How can I reduce that 13 microsecond latency?

Also, in TCP-based comms, I see delays of about 60 microseconds
between a request being sent and the reply being received.

Of course, part of this is application-related overhead, but I
estimate that to be under one to two microseconds.

How can I improve this?
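
For reference, the kind of timing loop I mean is roughly the sketch
below. The port number (5555) and the fixed 32-byte echo message are
placeholders, not my real protocol. (On older glibc, clock_gettime
needs -lrt.)

/* rtt.c - time small TCP request/reply exchanges over loopback.
 * Sketch only: port 5555 and the 32-byte messages are placeholders,
 * and error checks inside the loop are omitted for brevity. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <stdio.h>
#include <sys/socket.h>
#include <time.h>
#include <unistd.h>

int main(void)
{
    int s = socket(AF_INET, SOCK_STREAM, 0);
    if (s < 0) {
        perror("socket");
        return 1;
    }
    int one = 1;
    setsockopt(s, IPPROTO_TCP, TCP_NODELAY, &one, sizeof one);

    struct sockaddr_in sa = { 0 };
    sa.sin_family = AF_INET;
    sa.sin_port = htons(5555);                 /* placeholder port */
    inet_pton(AF_INET, "127.0.0.1", &sa.sin_addr);
    if (connect(s, (struct sockaddr *)&sa, sizeof sa) < 0) {
        perror("connect");
        return 1;
    }

    char req[32] = "request";                  /* placeholder message */
    char rep[32];
    for (int i = 0; i < 1000; i++) {
        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        send(s, req, sizeof req, 0);
        recv(s, rep, sizeof rep, MSG_WAITALL); /* assume same-size reply */
        clock_gettime(CLOCK_MONOTONIC, &t1);
        long us = (t1.tv_sec - t0.tv_sec) * 1000000L
                + (t1.tv_nsec - t0.tv_nsec) / 1000L;
        printf("%ld us\n", us);
    }
    close(s);
    return 0;
}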

i

Ignoramus446

Mar 28, 2011, 12:39:54 AM

To further clarify: in this milling application, the available
bandwidth FAR exceeds any bandwidth that will ever be needed.

Thus, I would prefer to minimize delays pertaining to buffering.

The controller I talk to supports TCP only, so I cannot switch to UDP
or anything of the sort.
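
For what it is worth, the one buffering knob I already turn off is
Nagle's algorithm, roughly like this:

#include <netinet/in.h>     /* IPPROTO_TCP */
#include <netinet/tcp.h>    /* TCP_NODELAY */
#include <sys/socket.h>

/* Disable Nagle's algorithm on a connected TCP socket so that small
 * writes go out immediately instead of being held back while earlier
 * data is still unacknowledged. */
int disable_nagle(int s)
{
    int one = 1;
    return setsockopt(s, IPPROTO_TCP, TCP_NODELAY, &one, sizeof one);
}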

i

default

Mar 28, 2011, 1:14:00 AM

Probably by not using TCP at all.

This is an interesting topic and I'm curious to see solid information.
I've done a lot of real-time applications, and if I needed
sub-microsecond latencies I'd be inclined to steer clear of big opaque
libraries like TCP/IP. The protocol is designed for reliability over
long-distance, diverse networks, not for low latency on a single
computer. You must have a powerful reason for wanting to use it in an
application that doesn't require long-distance communication.

You didn't say which OS, but you are asking on general Linux groups.
This is hard real-time stuff, and you'd be better off asking on forums
dedicated to real-time programming. I don't know of any,
unfortunately. I know that real-time work is done using specialized
versions of Linux, and I'm curious about their possibilities for this
kind of work.

Jorgen Grahn

Mar 28, 2011, 6:34:40 AM

["Followup-To:" header set to comp.os.linux.networking.]

Don't estimate -- measure. You can get good numbers from low down in
the IP stack with tcpdump(1), and numbers at the process--kernel
border with strace(1).

/Jorgen

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .

Rick Jones

Mar 28, 2011, 12:20:34 PM

Ping != TCP, but you may already know that. How do you write data to
the socket? Is it ever more than one send call per "message?" Is
your message size fixed? How does the controller software at the
other end pull data from the socket? (strace may be helpful if you
don't have source).

Ultimately, over loopback, your minimum latency will be a function of
the stack path length and whether you tickle unpleasant things (like
say doing multiple sends for a single message). There will also be
the question of whether or not your client or the controller software
gets timesliced.
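
If timeslicing does turn out to matter, one common remedy is to pin
the process to a CPU and give it a real-time priority. A sketch, not
a recommendation for your specific setup - SCHED_FIFO needs root or
CAP_SYS_NICE, and CPU 0 and priority 50 are just example values:

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

/* Pin the calling process to one CPU and switch it to SCHED_FIFO so
 * the scheduler will not timeslice it against normal-priority tasks. */
int go_realtime(void)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(0, &set);                        /* example: CPU 0 */
    if (sched_setaffinity(0, sizeof set, &set) < 0) {
        perror("sched_setaffinity");
        return -1;
    }
    struct sched_param sp = { .sched_priority = 50 };  /* example */
    if (sched_setscheduler(0, SCHED_FIFO, &sp) < 0) {
        perror("sched_setscheduler");
        return -1;
    }
    return 0;
}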

For some idea of the path length, you can profile. You might also try
running a netperf TCP_RR test with CPU utilization enabled, and little
to nothing else running on the system at the time. (Of course, I have
a natural bias towards suggesting folks run netperf :)

happy benchmarking,

rick jones
--
It is not a question of half full or empty - the glass has a leak.
The real question is "Can it be patched?"
these opinions are mine, all mine; HP might not want them anyway... :)
feel free to post, OR email to rick.jones2 in hp.com but NOT BOTH...

Ignoramus7104

Mar 28, 2011, 2:10:50 PM

On 2011-03-28, Rick Jones <rick....@hp.com> wrote:
> Ping != TCP, but you may already know that.

Right.

> How do you write data to the socket?

By saying "send" and using the TCP_NODELAY option.

The amount of sending is minuscule, compared to the available
bandwidth. So, the "pipe" will never be full, or even close to full.

> Is it ever more than one send call per "message?"

Never

> Is your message size fixed?

Not fixed, but usually small.

> How does the controller software at the other end pull data from the
> socket? (strace may be helpful if you don't have source).

I have the source. I wrote both parts.

Sadly, I am basically forced to use TCP, as opposed to say UDP, due to
some stupid requirement, which I prefer not to debate.

So, within the limitation of having to use TCP, I want to find the
best way of sending.

> Ultimately, over loopback, your minimum latency will be a function of
> the stack path length and whether you tickle unpleasant things (like
> say doing multiple sends for a single message). There will also be
> the question of whether or not your client or the controller software
> gets timesliced.

This is where I am, indeed, concerned. I did time some things, and
timing "send( ... )" alone gives something like 20 microseconds. (!!!)

So, just the send function seems to trigger pretty huge work,
amounting to 20 microseconds.
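
For reference, the timing is essentially the sketch below ("s" is the
already-connected socket; CLOCK_MONOTONIC so that wall-clock
adjustments cannot skew the numbers):

#include <stddef.h>
#include <sys/socket.h>
#include <time.h>

/* Return how many microseconds a single send() call takes on an
 * already-connected socket. */
long time_send_us(int s, const void *buf, size_t len)
{
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    send(s, buf, len, 0);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    return (t1.tv_sec - t0.tv_sec) * 1000000L
         + (t1.tv_nsec - t0.tv_nsec) / 1000L;
}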

> For some idea of the path length, you can profile. You might also try
> running a netperf TCP_RR test with CPU utilization enabled, and little
> to nothing else running on the system at the time. (Of course, I have
> a natural bias towards suggesting folks run netperf :)

I will read about this netperf. Sounds exciting!

i

Rick Jones

Mar 28, 2011, 3:07:16 PM

Ignoramus7104 <ignora...@nospam.7104.invalid> wrote:
> On 2011-03-28, Rick Jones <rick....@hp.com> wrote:

> > How do you write data to the socket?

> By saying "send" and using the TCP_NODELAY option.

> The amount of sending is minuscule, compared to the available
> bandwidth. So, the "pipe" will never be full, or even close to full.

> > Is it ever more than one send call per "message?"

> Never

Then, unless you have multiple messages outstanding at one time,
TCP_NODELAY is extraneous - Nagle's algorithm only delays a small
send while earlier data is still unacknowledged.

> > Is your message size fixed?

> Not fixed, but usually small.

> > How does the controller software at the other end pull data from
> > the socket? (strace may be helpful if you don't have source).

> I have the source. I wrote both parts.

> Sadly, I am basically forced to use TCP, as opposed to say UDP, due
> to some stupid requirement, which I prefer not to debate.

Well, many of us here are fond of the "Paul Harvey" (as in, learning
"the rest of the story"). :)

> So, within the limitation of having to use TCP, I want to find the
> best way of sending.

And receiving. Frankly, it sounds like you are already there for
sending - a single send per message is about as good as you can get
unless you have multiple messages pending.

How many receive calls do you make per message?
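
Ideally one, or two at most. If your messages carry a length prefix,
MSG_WAITALL keeps it to two blocking reads. A sketch, assuming a
hypothetical four-byte big-endian length header - which your protocol
may or may not have:

#include <arpa/inet.h>      /* ntohl */
#include <stdint.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Read one length-prefixed message in at most two recv() calls.
 * Hypothetical wire format: 4-byte big-endian length, then payload.
 * Returns the payload length, or -1 on error or short read. */
long recv_message(int s, char *buf, size_t buflen)
{
    uint32_t netlen;
    if (recv(s, &netlen, sizeof netlen, MSG_WAITALL)
        != (ssize_t)sizeof netlen)
        return -1;
    uint32_t len = ntohl(netlen);
    if (len > buflen)
        return -1;
    if (recv(s, buf, len, MSG_WAITALL) != (ssize_t)len)
        return -1;
    return (long)len;
}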

> > Ultimately, over loopback, your minimum latency will be a function of
> > the stack path length and whether you tickle unpleasant things (like
> > say doing multiple sends for a single message). There will also be
> > the question of whether or not your client or the controller software
> gets timesliced.

> This is where I am, indeed, concerned. I did time some things, and
> timing "send( ... )" alone gives something like 20 microseconds. (!!!)

> So, just the send function seems to trigger pretty huge work,
> amounting to 20 microseconds.

Well, since this is loopback, it may not always be just send. Much of
the receive-side processing can end up happening on "your" stack as
well. I believe that, if conditions are right, the processing can go
all the way to queuing the buffer to the receiver's socket.

rick jones
--
portable adj, code that compiles under more than one compiler
