Strange socket delays...

Andreas Leitgeb

Dec 7, 2017, 9:35:13 AM
I've got a client-server application; the client is a Tcl script.
(The server is likely irrelevant; anyway, it's a Java program.)

I've also got two machines, a Linux machine and an AIX machine,
each of which can run both client and server, and the clients
can connect either to the local server or to the respective
other machine's server. (testing environment; in production,
both sides would run on various AIX servers)

Most of the communication is pretty fast (less than 1ms for a
request-reply pair), but for certain requests there is a delay
that happens outside the actual processing, and this delay
depends on the client: if the (Tcl) client runs on AIX, these
delays are 200ms; if the client is on Linux, it's "only" 40ms
(but still too much!). In both cases the delay is independent(!)
of the server (whether localhost or the other machine). These
delays also seem independent of how much data is actually
transferred at these steps of the dialog.

The dialog runs essentially like this:
1) server->client: "give me a line of input from file1"
(client reads local file1)
2) client->server: "here's the line: ..."
3) server->client: "write this line to file2: ..."
(client writes to local file2, no reply to server)
(delay)
4) server->client: "write this line to file2: ..."
(client writes to local file2, no reply to server)
5) server->client: "write this line to file2: ..."
(client writes to local file2, no reply to server)
6) server->client: "give me a line of input from file1"
(client reads local file1)
7) client->server: "here's the line: ..."
8) server->client: "write this line to file2: ..."
(client writes to local file2, no reply to server)
(delay)
9) server->client: "give me a line of input from file1"
(client reads local file1)
...

The delay occurs after each *first* "write this line", thus
after 3,8 but not after immediately subsequent further writes,
as those in 4 or 5. The delays are fully deterministic in that
pattern of occurring!

If the communication goes via a pair of named pipes (that was the
old mode of local-only communication, before adding sockets),
then there is no noticeable delay (always less than 1ms).

Does such a socket-related delay ring any bells?

Rich

Dec 7, 2017, 12:05:01 PM
Andreas Leitgeb <a...@logic.at> wrote:
> [long, detailed, explanation clipped, see original posting for the
> details]
>
> The delay occurs after each *first* "write this line", thus
> after 3,8 but not after immediately subsequent further writes,
> as those in 4 or 5. The delays are fully deterministic in that
> pattern of occurring!
>
> If the communication goes via a pair of named pipes (that was the
> old mode of local-only communication, before adding sockets),
> then there is no noticeable delay (always less than 1ms).
>
> Does such a socket-related delay ring any bells?

The only bell it rings for me is the Nagle algorithm interaction with
TCP delayed ACK's:

https://en.wikipedia.org/wiki/Nagle's_algorithm

Scroll down to the paragraph starting: "This algorithm interacts badly
with TCP delayed acknowledgments".
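
For illustration, here is a minimal Java sketch of one way to sidestep that interaction on the sending side: collect several small "write this line" messages and hand them to the socket in a single write, so that no small segment sits waiting for the ACK of the previous one. The class name, the `sendBatch` helper and the `CountingStream` stand-in for a socket stream are hypothetical, not taken from the application discussed here, and it assumes the sender already knows all the lines it wants to send.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

final class CoalescedWriteSketch {
    /** Stand-in for a socket stream; counts writes that reach the "wire". */
    static final class CountingStream extends ByteArrayOutputStream {
        int writeCalls = 0;

        @Override
        public synchronized void write(byte[] b, int off, int len) {
            writeCalls++;
            super.write(b, off, len);
        }
    }

    /** Buffers all lines and passes them on in one write on flush. */
    static void sendBatch(OutputStream rawOut, List<String> lines) throws IOException {
        BufferedOutputStream out = new BufferedOutputStream(rawOut, 8192);
        for (String line : lines) {
            out.write((line + "\n").getBytes(StandardCharsets.US_ASCII));
        }
        out.flush(); // single write to the underlying stream (batch fits the buffer)
    }

    public static void main(String[] args) throws IOException {
        CountingStream wire = new CountingStream();
        sendBatch(wire, Arrays.asList(
                "write this line to file2: a",
                "write this line to file2: b",
                "write this line to file2: c"));
        System.out.println("underlying writes: " + wire.writeCalls);
    }
}
```

With a real connection, `rawOut` would be `socket.getOutputStream()`. This only helps when consecutive messages are available at the same time; otherwise disabling Nagle's algorithm or acknowledging every message at the application level are the usual alternatives.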

Andreas Leitgeb

Dec 7, 2017, 2:36:30 PM
Rich <ri...@example.invalid> wrote:
> Andreas Leitgeb <a...@logic.at> wrote:
>> [long, detailed, explanation clipped, see original posting for the
>> details]
>> The delay occurs after each *first* "write this line", ...
>> Does such a socket-related delay ring any bells?
>
> The only bell it rings for me is the Nagle algorithm interaction with
> TCP delayed ACK's:
>
> https://en.wikipedia.org/wiki/Nagle's_algorithm
>
> Scroll down to the paragraph starting: "This algorithm interacts badly
> with TCP delayed acknowledgments".
>

Yes, that seems to hit the nail on the head!
Thanks a lot! I wasn't aware of it.

In the meantime, based on experimental results, I changed the
protocol to always send a reply.

If I need better performance, I will remove the dummy replies
and set TCP_NODELAY on the Java side. Luckily, that's the side
I'd need it on; if it were the Tcl side, it seems I'd have to
wait for TIP 344, which has been pending since the end of 2008.
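
A minimal sketch of what setting TCP_NODELAY on the Java side could look like, assuming the server accepts connections through a plain `java.net.ServerSocket`. The class and the `acceptWithNoDelay` helper are hypothetical names for illustration, and `main` connects a loopback client to itself only so that the example runs standalone.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

final class NoDelayServerSketch {
    /** Accepts one client and disables Nagle's algorithm on that connection. */
    static Socket acceptWithNoDelay(ServerSocket listener) throws IOException {
        Socket client = listener.accept();
        client.setTcpNoDelay(true); // small writes are sent without waiting for ACKs
        return client;
    }

    public static void main(String[] args) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket listener = new ServerSocket(0, 1, loopback);
             // Loopback peer standing in for the Tcl client (illustration only).
             Socket peer = new Socket(loopback, listener.getLocalPort());
             Socket client = acceptWithNoDelay(listener)) {
            OutputStream out = client.getOutputStream();
            out.write("write this line to file2: ...\n".getBytes(StandardCharsets.US_ASCII));
            out.flush();
            System.out.println("TCP_NODELAY enabled: " + client.getTcpNoDelay());
        }
    }
}
```

The option has to be set on the side that sends the back-to-back small messages, which in this dialog is the server. With Nagle's algorithm off, each small write may go out as its own segment, so batching related writes is still worth considering.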
